pgsql-4961

Version:

8.4.0

Bug Link:

http://postgresql.1045698.n5.nabble.com/BUG-4961-pg-standby-exe-crashes-with-no-args-td2131822.html

Patch Link:

http://archives.postgresql.org/pgsql-committers/2009-11/msg00019.php

Symptom:

pg_standby.exe crashes when run with no arguments on Windows. To be precise, it only works when run with --help or --version; in all other cases it crashes.

How it is diagnosed:

Reproduced!

How to reproduce:

Note1: A prebuilt 8.4 binary does not seem to be available, so you need to compile it yourself on Windows.

How to build from source on windows:

http://www.postgresql.org/docs/8.4/interactive/install-win32-full.html

Note2: The manual above uses Visual Studio 2005. If you are using Visual Studio 2008, a few changes are needed:

a. in src/tools/msvc:

    replace 8.00 in Project.pm with 9.00

b. in src/include/port:

    replace 0x0500 in win32.h with 0x0501

c. in contrib/fuzzystrmatch/dmetaphone.c:

    replace the non-ASCII character literal (displayed as '?') on line 464 with '\xc7'

    replace the non-ASCII character literal (displayed as '?') on line 1040 with '\xd1'

Note3: Make sure to remove unnecessary options in src/tools/msvc/config.pl, and install bison and flex to a path with no spaces (c:\gnuwin32 for instance, but not c:\program files\...).

Root Cause:

Brief:

The signals used to trigger fail-over don’t exist in Windows.

 

Detail:

There is no way to trigger fail-over via a signal on Windows, because Windows doesn't do signal processing like other platforms do. So even though pg_standby tries to use a signal for this, it never really worked (nor was it harmful in previous versions), but recent changes to the signal handling made it crash.

So the patch simply disables signal-triggered fail-over on Windows.

contrib/pg_standby/pg_standby.c

...

/*------------ MAIN ----------------------------------------*/

int

main(int argc, char **argv)

{

        int                        c;

        progname = get_progname(argv[0]);

        if (argc > 1)

        {

        /** This is why pg_standby only works with --help and --version on Windows: **/

        /** both are handled here, before the signal handlers are installed. **/

                if (strcmp(argv[1], "--help") == 0 || strcmp(argv[1], "-?") == 0)

                {

                        usage();

                        exit(0);

                }

                if (strcmp(argv[1], "--version") == 0 || strcmp(argv[1], "-V") == 0)

                {

                        puts("pg_standby (PostgreSQL) " PG_VERSION);

                        exit(0);

                }

        }

// Adding the #ifndef WIN32 guard below (the lines marked with +) stops the crash,

// but it also means signal-triggered fail-over is not supported on Windows.

+#ifndef WIN32

        /*

         * You can send SIGUSR1 to trigger failover.

         *

         * Postmaster uses SIGQUIT to request immediate shutdown. The default

         * action is to core dump, but we don't want that, so trap it and commit

         * suicide without core dump.

         *

         * We used to use SIGINT and SIGQUIT to trigger failover, but that turned

         * out to be a bad idea because postmaster uses SIGQUIT to request

         * immediate shutdown. We still trap SIGINT, but that may change in a

         * future release.

         *

         * There's no way to trigger failover via signal on Windows.

         */

        /** SIGUSR1 does not exist on Windows, which is why it crashes. **/

        /** SIGINT and SIGQUIT are not the cause; only SIGUSR1 is.

            See the demo code below for verification. **/

   

        (void) signal(SIGUSR1, sighandler);

        (void) signal(SIGINT, sighandler);        /* deprecated, use SIGUSR1 */

        (void) signal(SIGQUIT, sigquit_handler);

+#endif

        while ((c = getopt(argc, argv, "cdk:lr:s:t:w:")) != -1)

        {

                switch (c)

                {

}

        }

        …

}

The following code, extracted from pg_standby, determines which of the three signals is responsible for the crash.

#include <signal.h>

/* Some extra signals */

#define SIGHUP                                1

#define SIGQUIT                                3

#ifndef __BORLANDC__

#define SIGUSR1                                30

#define SIGUSR2                                31

#endif

static volatile sig_atomic_t signaled = 0;

static void sighandler(int sig)

{

        signaled = 1;

}

static void sigquit_handler(int sig)

{

}

int main(void)

{

        /** Enable one signal() call at a time (controlled-variable method) to isolate the culprit. **/

        //(void) signal(SIGINT, sighandler);

        (void) signal(SIGUSR1, sighandler);

        //(void) signal(SIGQUIT, sigquit_handler);

        return 0;

}

With only the SIGUSR1 call enabled, build the program in the Release configuration and it crashes.

In the Debug configuration, the crash turns out to be an assertion failure inside the C runtime, because that signal number does not exist on Windows.
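One way to observe this failure without taking the process down is to replace the C runtime's invalid-parameter handler while probing a signal number. The following is only a minimal sketch, assuming the Visual C++ runtime (Visual Studio 2005 or later). The names probe_signal, quiet_invalid_parameter and noop_handler are hypothetical and do not appear in pg_standby. On other compilers the sketch falls back to a plain signal() call.

```c
#include <signal.h>
#include <stddef.h>
#ifdef _MSC_VER
#include <stdlib.h>		/* _set_invalid_parameter_handler */
#include <crtdbg.h>		/* _CrtSetReportMode */

/*
 * Hypothetical replacement handler: the CRT calls this instead of
 * terminating the process when it rejects a parameter.
 */
static void
quiet_invalid_parameter(const wchar_t *expression, const wchar_t *function,
						const wchar_t *file, unsigned int line,
						uintptr_t reserved)
{
	(void) expression;
	(void) function;
	(void) file;
	(void) line;
	(void) reserved;
}
#endif

/* Hypothetical do-nothing handler used only for the probe. */
static void
noop_handler(int sig)
{
	(void) sig;
}

/*
 * Returns 1 if signal() accepts signum, 0 if it reports SIG_ERR.
 * The previous disposition is restored after a successful probe.
 */
static int
probe_signal(int signum)
{
	void		(*prev) (int);
#ifdef _MSC_VER
	_invalid_parameter_handler old_handler;
	int			old_mode;

	old_handler = _set_invalid_parameter_handler(quiet_invalid_parameter);
	/* Assumption: this is intended to silence the debug-build assertion dialog. */
	old_mode = _CrtSetReportMode(_CRT_ASSERT, 0);
#endif

	prev = signal(signum, noop_handler);

#ifdef _MSC_VER
	_CrtSetReportMode(_CRT_ASSERT, old_mode);
	_set_invalid_parameter_handler(old_handler);
#endif

	if (prev == SIG_ERR)
		return 0;

	signal(signum, prev);		/* undo the probe */
	return 1;
}
```

Under these assumptions, probe_signal(SIGINT) should report success, while on Windows probe_signal(30) (the value used for SIGUSR1 in the demo above) would be expected to report failure rather than crash. This has not been verified against the 8.4 build described here.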

Is there any log message?

No. The log message in the bug report is not generated by pg_standby.exe.

Can we anticipate the error?

The failure happens inside the C runtime library, and the process crashes before there is any return code for us to check, so I guess it is hard to anticipate.

However, since the code treats SIGINT as deprecated and switches to the newer SIGUSR1, it could log a message before using it.
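The idea of logging before registering a handler could look roughly like the sketch below. It is an illustration only: install_signal_logged and demo_handler are hypothetical helpers, not functions from pg_standby, and the message wording is made up. The log line is flushed before signal() is called so that it survives even if the runtime aborts inside the call.

```c
#include <signal.h>
#include <stdio.h>

/* Hypothetical handler standing in for pg_standby's sighandler. */
static void
demo_handler(int sig)
{
	(void) sig;
}

/*
 * Hypothetical helper: log the attempt, then register the handler.
 * Returns 0 on success, -1 if signal() reports SIG_ERR.
 */
static int
install_signal_logged(int signum, const char *name, void (*handler) (int))
{
	fprintf(stderr, "registering handler for %s (%d)\n", name, signum);
	fflush(stderr);				/* get the line out before a possible crash */

	if (signal(signum, handler) == SIG_ERR)
	{
		fprintf(stderr, "could not register handler for %s (%d)\n",
				name, signum);
		return -1;
	}
	return 0;
}
```

A call such as install_signal_logged(SIGINT, "SIGINT", demo_handler) leaves a trace on stderr of which registration was attempted last, which would make a crash like this one easier to attribute to a specific signal.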