httpd-46949

Version:

2.2.11

Bug Link:

https://issues.apache.org/bugzilla/show_bug.cgi?id=46449

Symptom(Failure):

In short, apache sometimes serves wrong content that seems to belong to other threads. The environment producing the bug is: apache works as reverse proxy and also balancer for two backend tomcat servers with AJP protocol. A test case is: when a simple jsp file test.jsp intends to print only word “hello”, the client sometimes see words other than “hello”.

This only occurs when there are some error with the request header.

How it is diagnosed:

It only appear occasionally, hard to reproduce. We relied on source code to understand.

Root Cause:

When the client request has a “Content-Length”(>0) field in the header, but with no request body (i.e., the client closes the connection without sending the body), apache will still try to send the empty request message to the backend server. But the connection is already in a wrong state (back-end does not know). Client side closes the connection, next time when it establish a new connection, the wrong-state connection may be reused by the client, the data is sent to this new request, which is not expected by the client.

(mod_proxy_ajp.c)

/*

* process the request and write the response.

*/

static int ap_proxy_ajp_request(... ...)

{

   tenc = apr_table_get(r->headers_in, "Transfer-Encoding");

   // if there’s “Transfer-Encoding” field and with “chunked” method.

   // indicating that the client’s request body is chuncked

   if (tenc && (strcasecmp(tenc, "chunked") == 0)) {

       /* The AJP protocol does not want body data yet */

       ap_log_error(APLOG_MARK, APLOG_DEBUG, 0, r->server,

                    "proxy: request is chunked");

   } else {

       /* Get client provided Content-Length header */

       content_length = get_content_length(r);

        … ..

      /* apr_brigade_flatten: Take a bucket brigade and store

         the data in a flat char*

           this will transfer the request body data from input

         bridage to msg’s buf.

      */

       status = apr_brigade_flatten(input_brigade, buff, &bufsiz);

       if (status != APR_SUCCESS) {

        ...

       }

       apr_brigade_cleanup(input_brigade);

      if (bufsiz > 0) {

        status = ajp_send_data_msg(conn->sock, msg, bufsiz);

        ...

      }

      else if (content_length > 0) {

        // the client indicate his request body has content_length contents

// we only read 0 bytes(the client close the connection without sending          

// request body.

           ap_log_error(APLOG_MARK, APLOG_ERR, status, r->server,

                        "proxy: read zero bytes, expecting"

                        " %" APR_OFF_T_FMT " bytes",

                        content_length);

        // The developer already anticipate this wrong state but the fact he’s trying

        // some additional efforts is the very error part of this failure.

         // This is unnecessary and would result in

         // wrong result to be sent!

-            status = ajp_send_data_msg(conn->sock, msg, 0);
-            if (status != APR_SUCCESS) {
-                /* We had a failure: Close connection to backend */
-                conn->close++;
-                ap_log_error(APLOG_MARK, APLOG_ERR, status, r->server,
-                            "proxy: send failed to %pI (%s)",
-                            conn->worker->cp->addr,
-                            conn->worker->hostname);
-                return HTTP_INTERNAL_SERVER_ERROR;
-            }
-            else {
-                /* Client send zero bytes with C-L > 0
-                 */
-                return HTTP_BAD_REQUEST;
-            }
+            /*
+             * We can only get here if the client closed the connection
+             * to us without sending the body.
+             * Now the connection is in the wrong state on the backend.
+             * Sending an empty data msg doesn't help either as it does
+             * not move this connection to the correct state on the backend
+             * for later resusage by the next request again.
+             * Close it to clean things up.
+             */
+            conn->close++;
+            return HTTP_BAD_REQUEST;
        }

      }

   }

The deleted control logic in the code above. When it’s bufsiz == 0 and content-length > 0 are triggered, the program will go into different logic control in this point. The error and fault point is near.

What is the error message pattern:

This is input checking: Developers knew it is an error state when content-length > 0 and bufsiz == 0!

But it is quite hard for Errlog to identify this automatically (as it’s domain specific).

Is there any log message?:

Yes! Quite helpful for diagnosing the bug.