pgsql-5302

Version:

8.4.2

Bug Link:

http://archives.postgresql.org/pgsql-bugs/2010-01/msg00270.php

Symptom:

When client_encoding is set to WIN1252, the backend leaks memory quickly when savepoints and cursors are used together. The leak does not occur without the WIN1252 encoding setting.

The following script reproduces the bug:

-- Causes leak:
SET client_encoding TO 'WIN1252';

BEGIN;
CREATE TEMP TABLE t1(pk INT PRIMARY KEY);

-- Repeat 1000 times
DECLARE mycur CURSOR WITH HOLD FOR SELECT * FROM t1;
FETCH 100 IN mycur;
SAVEPOINT mysp;
CLOSE mycur;
RELEASE mysp;
-- End repeat

COMMIT;

How it is diagnosed:

Reproducing the bug and analyzing the source code.

Root Cause:

EndCommand performs an encoding conversion on the command tag string. As a result, every SQL command in the script above triggers an encoding conversion, which allocates memory that is not immediately freed. The conversion is unnecessary and should be avoided: all current and foreseeable command tags are pure ASCII, so they need no conversion. (Only the query results need to be converted.)

As the developer put it, avoiding the conversion “saves a few cycles and also avoids polluting otherwise-pristine subtransaction memory contexts, which is the cause of the backend memory leak exhibited in bug #5302”.
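The fix rests on the command tag being pure ASCII; WIN1252 is an ASCII superset, so such bytes are identical before and after conversion. A minimal sketch of that precondition as a standalone check follows. The helper name is_pure_ascii is hypothetical and is not part of PostgreSQL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of PostgreSQL: returns true when every
 * byte of the NUL-terminated string is 7-bit ASCII, i.e. when an
 * ASCII-compatible encoding conversion would leave it unchanged.
 */
bool
is_pure_ascii(const char *s)
{
    assert(s != NULL);

    for (; *s != '\0'; s++)
    {
        /* Any byte with the high bit set is outside 7-bit ASCII. */
        if ((unsigned char) *s >= 0x80)
            return false;
    }
    return true;
}
```

A tag such as "DECLARE CURSOR" passes this check, which is why it can be sent as-is without going through the conversion routine.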

src/backend/tcop/dest.c

EndCommand(const char *commandTag, CommandDest dest)
{
        switch (dest)
        {
                case DestRemote:
                case DestRemoteExecute:
                        /*
                         * We assume the commandTag is plain ASCII and therefore
                         * requires no encoding conversion.
                         */
-                        pq_puttextmessage('C', commandTag);
+                        pq_putmessage('C', commandTag, strlen(commandTag) + 1);
                        break;
                case DestNone:
        …
        }


NullCommand(CommandDest dest)
{
        switch (dest)
        {
                case DestRemote:
                case DestRemoteExecute:
                        if (PG_PROTOCOL_MAJOR(FrontendProtocol) >= 3)
                                pq_putemptymessage('I');
                        else
-                                pq_puttextmessage('I', "");
+                                pq_putmessage('I', "", 1);
                        break;
                case DestNone:
                        ...
}

---------------------------------------------------------------------

/* --------------------------------
 *        pq_puttextmessage - generate a character set-converted message in one step
 *
 *        This is the same as the pqcomm.c routine pq_putmessage, except that
 *        the message body is a null-terminated string to which encoding
 *        conversion applies.
 * --------------------------------
 */
void
pq_puttextmessage(char msgtype, const char *str)
{
        int         slen = strlen(str);
        char       *p;

        p = pg_server_to_client(str, slen);
        if (p != str)           /* actual conversion has been done? */
        {
                (void) pq_putmessage(msgtype, p, strlen(p) + 1);
                pfree(p);
                return;
        }
        (void) pq_putmessage(msgtype, str, slen + 1);
}

/*
 * convert server encoding to client encoding.
 */
char *
pg_server_to_client(const char *s, int len)
{
        Assert(DatabaseEncoding);
        Assert(ClientEncoding);

        if (len <= 0)
                return (char *) s;
        ...
        return perform_default_encoding_conversion(s, len, false);
}

static char *
perform_default_encoding_conversion(const char *src, int len, bool is_client_to_server)
{
        char       *result;
        int         src_encoding,
                    dest_encoding;
        ...
        /*
         * Allocate space for conversion result, being wary of integer overflow
         */
        if ((Size) len >= (MaxAllocSize / (Size) MAX_CONVERSION_GROWTH))
                ereport(ERROR,
                                (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                                 errmsg("out of memory"),
                                 errdetail("String of %d bytes is too long for encoding conversion.",
                                                   len)));

        result = palloc(len * MAX_CONVERSION_GROWTH + 1);
        … ...
        return result;
}
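The quoted commit message attributes the leak to subtransaction memory contexts that are no longer pristine. The following is a toy model of that effect. It assumes (this is not shown in the excerpts above) that a subtransaction's context is retained until top-level commit once anything has been allocated in it, even if the chunk was freed again with pfree. All names and the 8192-byte block size are hypothetical; this is not PostgreSQL code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed size of the first block a context grabs; illustrative only. */
#define TOY_BLOCK_SIZE 8192

/* Toy bookkeeping for one top-level transaction. */
typedef struct ToyTransaction
{
    size_t retained_contexts;   /* subtransaction contexts kept until COMMIT */
    size_t retained_bytes;      /* memory held by those contexts */
} ToyTransaction;

/*
 * Models one SAVEPOINT ... RELEASE cycle. If the command tag goes through
 * encoding conversion, the context is touched by an allocation and, under
 * the stated assumption, is kept rather than discarded at RELEASE.
 */
void
toy_release_savepoint(ToyTransaction *xact, bool converts_command_tag)
{
    bool touched = false;

    assert(xact != NULL);

    if (converts_command_tag)
        touched = true;         /* palloc then pfree: the block stays allocated */

    if (touched)
    {
        xact->retained_contexts++;
        xact->retained_bytes += TOY_BLOCK_SIZE;
    }
    /* An untouched context is thrown away and costs nothing. */
}
```

In this model, running the repro loop 1000 times with conversion enabled retains 1000 contexts until COMMIT, while the patched path retains none.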

Is there any log message?

No.

Notably, no error message appears even when memory is exhausted, although PostgreSQL does contain the following out-of-memory logic:

static void *
AllocSetAlloc(MemoryContext context, Size size)
{
        … …
        block = (AllocBlock) malloc(blksize);
        while (block == NULL && blksize > 1024 * 1024)
        {
                blksize >>= 1;
                if (blksize < required_size)
                        break;
                block = (AllocBlock) malloc(blksize);
        }

        if (block == NULL)
        {
                MemoryContextStats(TopMemoryContext);
                ereport(ERROR,
                                (errcode(ERRCODE_OUT_OF_MEMORY),
                                 errmsg("out of memory"),
                                 errdetail("Failed on request of size %lu.",
                                                   (unsigned long) size)));
        }
}

In practice, malloc returns NULL only when the virtual address space is exhausted. On 64-bit machines this is unlikely to happen, since the virtual address space is 256 TB. Linux will very likely kill the process long before that point: it starts killing processes once it detects that both physical memory and swap space are exhausted.

In some cases, when Linux kills a PostgreSQL server process because of memory exhaustion, the server can restart itself and print the following messages at startup:

LOG:  background writer process (PID 12116) was terminated by signal 9: Killed

LOG:  terminating any other active server processes

WARNING:  terminating connection because of crash of another server process

DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.

HINT:  In a moment you should be able to reconnect to the database and repeat your command.

But sometimes the entire Linux box simply freezes and crashes. Moreover, the messages above are of little use in diagnosing memory exhaustion: they do not even indicate that the process was killed because of memory exhaustion, let alone which malloc call caused the leak.

Can we automatically anticipate?

Yes, by adaptive sampling of the resource allocation function malloc.
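The notes do not specify a sampling policy, so the following is only a sketch of one possibility: a malloc wrapper that logs a call site on its 1st, 2nd, 4th, 8th, ... allocation, so that hot sites are sampled ever more sparsely while still leaving a trail in the log. AllocSite, should_sample and sampled_malloc are hypothetical names, and the power-of-two policy is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical per-call-site statistics; not a PostgreSQL structure. */
typedef struct AllocSite
{
    const char    *name;            /* label identifying the call site */
    unsigned long  calls;           /* allocations seen at this site */
    unsigned long  sampled;         /* allocations that were logged */
    size_t         bytes_requested; /* total bytes requested so far */
} AllocSite;

/* True on call counts 1, 2, 4, 8, ...: the sampling rate drops as a site gets hot. */
bool
should_sample(unsigned long calls)
{
    return calls != 0 && (calls & (calls - 1)) == 0;
}

/* Wraps malloc, updating the site's counters and logging sampled calls to stderr. */
void *
sampled_malloc(AllocSite *site, size_t size)
{
    assert(site != NULL);

    site->calls++;
    site->bytes_requested += size;

    if (should_sample(site->calls))
    {
        site->sampled++;
        fprintf(stderr, "alloc site %s: %lu calls, %zu bytes requested so far\n",
                site->name, site->calls, site->bytes_requested);
    }
    return malloc(size);
}
```

With a policy like this, 1000 allocations from one site produce only 10 log lines, yet the last line before the process is killed still names the site that was allocating heavily.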