pgsql-4821

Version:

8.3.*

Bug Link:

http://archives.postgresql.org/pgsql-bugs/2009-05/msg00230.php

Patch Link:

http://archives.postgresql.org/pgsql-committers/2009-05/msg00310.php

Symptom:

When using LIKE with '%_' as wildcard, it gives wrong result. Here ‘%_’ is a regular expression pattern in postgres, where ‘%’ is like ‘*’ in perl-style regex (0 or more of any characters), while ‘_’ is ‘.’ (any single character) in perl style regex.

So for query like: select ‘foo’ like ‘%_’;

is to test whether string ‘foo’ matches the pattern ‘%_’. It should return ‘true’ since the pattern matches.

Failure type:

Wrong result

Is there any log message?

No

How it is diagnosed:

We reproduced the failure.

How to reproduce:

postgres=# select * from table where 'foo' like '%_';

 ?column?

----------

 f        // f stands for “false” should be t, “true”, since foo can be matched using %_

(1 row)

Root Cause:

Brief:  The code for handling ‘%’ followed by ‘_’ only matches the first char ‘f’, and then since pattern comes to the end while text does not, its logic for this situation is returns false. The correct logic is, however, return true.

Detail:

src/backend/utils/adt/like_match.c

/* t = ‘foo’, p = ‘%_’, plen = 2 (pattern length), tlen = 3 (string length). */

static int MatchText(char *t, int tlen, char *p, int plen)

{

        /* Fast path for match-everything pattern */

        if ((plen == 1) && (*p == '%'))

                return LIKE_TRUE;

        while ((tlen > 0) && (plen > 0))

        {

                if (*p == '\\')

                {

                        NextByte(p, plen);

                        if ((plen <= 0) || TCHAR(*p) != TCHAR(*t))

                                return LIKE_FALSE;

                }

                        /* The first iteration, since *p == ‘%’, it should fall into the branch below. */

                else if (*p == '%')

                {

                        /* %% is the same as % according to the SQL standard */

                        /* Advance past all %'s, so after this loop, p will point to ‘_’. */

                        while ((plen > 0) && (*p == '%'))

                                NextByte(p, plen);

                        /* Trailing percent matches everything. */

                        if (plen <= 0)

                                return LIKE_TRUE;

                                    /* Now *p == ‘_’. */

                        if (*p == '_')

                        {

                                                /* Exhaust all the ‘_’s to match against the input. */

                                do

                                {

                                NextChar(t, tlen);

                                NextByte(p, plen);

                                } while (tlen > 0 && plen > 0 && *p == '_');

-                                if (tlen <= 0)

-                                {

-                                        return (plen <= 0) ? LIKE_TRUE : LIKE_ABORT;

-                                }

                                                /* now we exhausted all the pattern ‘p’, so plen == 0. However

                                              * it should have returned LIKE_TRUE instead of LIKE_FALSE.*/

-                                else if (plen <= 0)

-                                {

                                                            mem[xxx] = xx

                                                            /* According to input check pattern, developers should

                                                             * log here since it indicates a failed pattern match! */

-                                        return LIKE_FALSE; // should return true !

-                                }

+                                /*

+                         * If we are at the end of the pattern, succeed: % followed by

+                        * n _'s matches any string of at least n characters, and we

+                        * have now found there are at least n characters.

+                                 */

+                                if (plen <= 0)

+                                {

+                                        return LIKE_TRUE;

+                                }

                                     … ...

                        }

                …

        }

        ...

}

Pattern to handle

input check

Information sufficient to diagnose?

The difference the patch made to the bug-triggering execution: tlen and plen at that return LIKE_FALSE point.

If we put a log message at ‘return LIKE_FALSE’, it would be sufficient to diagnose the failure!