httpd-22030

Version:

2.0.49

Bug Link:

https://issues.apache.org/bugzilla/show_bug.cgi?id=22030

Symptom:

When CGI output more than 4096 bytes (64K in newer kernels) to STDERR, if there is further following output to stdout, it won’t be produced, the CGI script just hangs.

How to reproduce this failure?

We reproduced this failure.

Just install the 2.0.49 server, and run the following cgi script:

#!/usr/bin/perl

foreach my $x (1..24000) {

 print STDERR 'X' x 169 . "\n";

}

# + 17 more bytes, putting us at 4097

# Delete one char from the print below to make

# it work again

print STDERR "0123456789ABCDEF\n";

 # Our actual script output, which never comes

print "Content-type: text/plain\n\nASDF\n";

And used gdb to follow the bug. In fact, the duration of the time-out is the same as ‘time-out’ setting in httpd.conf (by default 5 minutes).

Root Cause:

A deadlock between httpd and the script(perl) subprocess.

httpd forks a subprocess to execute the script and establish three pipes, script_in, script_out, script_err between perl and httpd.

script_in(which is actually script’s output!), and script_err pipe’s read side is httpd, write side perl.

script_out’s read side is perl, write side httpd.

For the script above, when perl starts executing, it first write to script_err pipe. Since it tries to write more than 4096 bytes, which is larger than the pipe’s capacity, it will block on the write to wait for httpd to read it away.

But httpd’s handling cgi flow is(can refer to the code below for further details):

  1. Write input data(such as POST/PUT sent by client) to script_in.
  2. Read script’s output( First anticipate header part from output) and send to client
  3. Read script_err and write to log file.

...

Since perl is blocked on write() for httpd to read script_err, which is full now  and cannot proceed to write to script_out. But httpd cannot go to step 3 to unlock perl before completing step 2, where it’s blocked on read() since perl cannot proceed to write script_out. A dead lock!

But Apache’s developers kind of anticipate(smart!)  lock in using pipes, and use some timeout mechanism to avoid deadlock. Since ‘read’ or ‘write’ itself does not support timeout, the developers use non-blocking+polling to wrap a timeout read and write.

Non-blocking changes pipe’s flag to non-blocking, which means read or write operations will never be suspended. If the pipe’s too full to write or too empty to read, write or read will just fail.

But only using non-blocking may cause data loss if the full will soon be read away or empty will soon be filled.

To mitigate the loss, it uses polling  when the read or write fails. And poll system call supports timeout parameter. Upon a read/write failure caused by blocking, it will poll within a certain time. If the fd is unblocked during polling, the operation will be redone. If it’s still blocked after timeout, it gives up.

See the code blow for more concrete idea of how they implement this:

mod_cgi.c

int cgi_handler(request_rec *r)

{

               ….

   /* run the script in its own process */

   if ((rv = run_cgi_child(&script_out, &script_in, &script_err,

                           command, argv, r, p, &e_info)) != APR_SUCCESS) {

       ap_log_rerror(APLOG_MARK, APLOG_ERR, rv, r,

                     "couldn't spawn child process: %s", r->filename);

       return HTTP_INTERNAL_SERVER_ERROR;

   }

   /* Transfer any put/post args, CERN style...

    * Note that we already ignore SIGPIPE in the core server.

    */

  ...

   do {

        ...

           /* Keep writing data to the child until done or too much time

            * elapses with no progress or an error occurs.

            */

           rv = apr_file_write_full(script_out, data, len, NULL);(script’s input data)

           

   }

   while (!seen_eos);

   ...

   /* Handle script return... */

   if (script_in && !nph) {

          

                 /**

* Read headers output from a script, ensuring that the output is valid.  If

* the output is valid, then the headers are added to the headers out of the

* current request

        */

           //(It’s expecting “Content-type” in the script about...)

               if ((ret = ap_scan_script_header_err_brigade(r, bb, sbuf))) {

                           return log_script(r, conf, ret, dbuf, sbuf, bb, script_err);

               }

        ..

               if (location && location[0] == '/' && r->status == 200) {

                           discard_script_output(bb);

           apr_brigade_destroy(bb);

           log_script_err(r, script_err);

        

        return OK;

}

        ...

  }

           ...

}

mod_cgi.c

apr_status_t run_cgi_child(apr_file_t **script_out, apr_file_t **script_in,apr_file_t                  

**script_err,  const char *command, const char * const argv[],

request_rec *r,apr_pool_t *p, cgi_exec_info_t *e_info)

{

           ….

               rc = ap_os_create_privileged_process(r, procnew, command, argv, env,

                                            procattr, p);

        ...

           *script_in = procnew->out;

           if (!*script_in)

               return APR_EBADF;

           apr_file_pipe_timeout_set(*script_in, r->server->timeout);

           if (e_info->prog_type == RUN_AS_CGI) {

               *script_out = procnew->in;

               if (!*script_out)

                   return APR_EBADF;

               apr_file_pipe_timeout_set(*script_out, r->server->timeout);

               *script_err = procnew->err;

               if (!*script_err)

                   return APR_EBADF;

               apr_file_pipe_timeout_set(*script_err, r->server->timeout);

           }

        ...

}

APR_DECLARE(apr_status_t) apr_file_pipe_timeout_set(apr_file_t *thepipe, apr_interval_time_t timeout)

{

   if (thepipe->is_pipe == 1) {

       thepipe->timeout = timeout;

       if (timeout >= 0) {

           if (thepipe->blocking != BLK_OFF) { /* blocking or unknown state */

               return pipenonblock(thepipe);

           }

       }

       else {

           if (thepipe->blocking != BLK_ON) { /* non-blocking or unknown state */

               return pipeblock(thepipe);

           }

       }

       return APR_SUCCESS;

   }

   return APR_EINVAL;

}

static apr_status_t pipenonblock(apr_file_t *thepipe)

{

#if !BEOS_BLOCKING

     int fd_flags = fcntl(thepipe->filedes, F_GETFL, 0);//system call to get fd’s flags

//add non_block flag to fd

#  if defined(O_NONBLOCK)

     fd_flags |= O_NONBLOCK;

#  elif defined(O_NDELAY)

     fd_flags |= O_NDELAY;

#  elif defined(FNDELAY)

     fd_flags |= O_FNDELAY;

#  else

     /* XXXX: this breaks things, but an alternative isn't obvious...*/

     return APR_ENOTIMPL;

#  endif

     if (fcntl(thepipe->filedes, F_SETFL, fd_flags) == -1) {//system call to set fd’s flags

         return errno;

     }

thepipe->blocking = BLK_OFF;

return APR_SUCCESS;

}

Now for pipe’s read:

readwrite.c

APR_DECLARE(apr_status_t) apr_file_read(apr_file_t *thefile, void *buf, apr_size_t *nbytes)

{

       … ...

       do {

           rv = read(thefile->filedes, buf, *nbytes); // won’t block for pipe now!

       } while (rv == -1 && errno == EINTR);

#ifdef USE_WAIT_FOR_IO

       if (rv == -1 &&

           (errno == EAGAIN || errno == EWOULDBLOCK) &&

           thefile->timeout != 0) {

         // If we enter here, then it is very likely it is going to hang

         // since the previous ‘read’ didn’t succeed.

         // We can put a log message HERE!

           apr_status_t arv = apr_wait_for_io_or_timeout(thefile, NULL, 1);

           if (arv != APR_SUCCESS) {

               *nbytes = bytes_read;

               return arv;

           }

           else {

               do {

                   rv = read(thefile->filedes, buf, *nbytes);//try again when event comes

               } while (rv == -1 && errno == EINTR);

           }

       }  

#endif

… ...

}

waitio.c

apr_status_t apr_wait_for_io_or_timeout(apr_file_t *f, apr_socket_t *s,

                                          int for_read)

{

   …

   do {

       srv = apr_poll(&pollset, 1, &n, timeout);

       if (n == 1 && pollset.rtnevents & type) {

           return APR_SUCCESS;

       }

   } while (APR_STATUS_IS_EINTR(srv));

   return srv;

}

apr_poll.c

APR_DECLARE(apr_status_t) apr_poll(apr_pollfd_t *aprset, apr_int32_t num,

                     apr_int32_t *nsds, apr_interval_time_t timeout)

{

  ...

   if (timeout > 0) {

       timeout /= 1000; /* convert microseconds to milliseconds */

   }

   rv = poll(pollset->pollset, pollset->nelts, timeout);

   (*num) = rv;

   if (rv < 0) {

       return apr_get_netos_error();

   }

   if (rv == 0) {

     // We can also put a message here -- indeed, it timed out!!!

       return APR_TIMEUP;

   }

}

The new mechanism is to use a bucket instead.

/* A CGI bucket type is needed to catch any output to stderr from the

* script; see PR 22030. */

static const apr_bucket_type_t bucket_type_cgi;

struct cgi_bucket_data {

   apr_pollset_t *pollset;

   request_rec *r;

};

Read method of CGI bucket: polls on stderr and stdout of the child, sending any stderr output immediately away to the error log.

Is there any Error message:

No

Can Errlog anticipate with an error message:

Yes!

Errlog will place error msgs in two possible places, both of the pattern ‘log at failed libc/system call’:

1) When a non-blocking read or write fails(because of an actual blocking happens), we give some warning messages, since this failure is probably caused by a potential deadlock.

2) When the polling times out, the anticipated read or write event still doesn’t come.