httpd-39647

Version:

2.2.2

Failure report link:

https://issues.apache.org/bugzilla/show_bug.cgi?id=39647

How is it diagnosed (reproduced or source analysis)?

We could not reproduce the failure, so we studied the source code.

Symptom:

When Apache is used as a proxy cache in front of a backend server that is connected to a Tomcat application server (which serves the image files), clients intermittently cannot display the images correctly in their browsers.

Root cause:

The bug is in the proxy cache server. In some corner cases, it sends a wrong ‘Content-Type: text/html’ header in its response to the client even though the actual content is an image.

The triggering condition is that the backend server sends the response 304 (‘HTTP_NOT_MODIFIED’) together with a ‘Content-Type’ field set to ‘text/html’. The Apache proxy cache finds the file in its cache, but it also takes the ‘Content-Type’ from the backend server's response, so the response it sends to the client contains ‘Content-Type: text/html’. The correct behavior is that, on a cache hit, the Apache proxy server should not modify any HTTP header stored in the cache.
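The sequence described above can be modeled in a few lines of plain C. The sketch below is only an illustration under stated assumptions: `hdr_table` and the `hdr_*` helpers are hypothetical stand-ins for the APR table calls (they are not APR's API and, unlike APR, compare header names case-sensitively), and the `image/jpeg` and `ETag` values are made up. It shows that a ‘Content-Type’ already present in the outgoing headers survives the merge once the cached one has been unset, and that also unsetting it in the outgoing headers removes the stale value.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical miniature header table; this is NOT apr_table_t. */
#define MAX_HDRS 8

typedef struct {
    const char *key[MAX_HDRS];
    const char *val[MAX_HDRS];
    int n;
} hdr_table;

/* Return the index of `key`, or -1 if absent (case-sensitive match). */
static int hdr_find(const hdr_table *t, const char *key)
{
    for (int i = 0; i < t->n; i++)
        if (strcmp(t->key[i], key) == 0)
            return i;
    return -1;
}

static const char *hdr_get(const hdr_table *t, const char *key)
{
    int i = hdr_find(t, key);
    return i >= 0 ? t->val[i] : NULL;
}

/* Set `key` to `val`, replacing an existing entry of the same name. */
static void hdr_set(hdr_table *t, const char *key, const char *val)
{
    int i = hdr_find(t, key);
    if (i >= 0) {
        t->val[i] = val;
        return;
    }
    assert(t->n < MAX_HDRS);
    t->key[t->n] = key;
    t->val[t->n] = val;
    t->n++;
}

/* Remove `key` by moving the last entry into its slot. */
static void hdr_unset(hdr_table *t, const char *key)
{
    int i = hdr_find(t, key);
    if (i < 0)
        return;
    t->n--;
    t->key[i] = t->key[t->n];
    t->val[i] = t->val[t->n];
}

/* Overwrite-style merge: entries of src replace same-named entries of dst;
 * entries that exist only in dst are kept. */
static void hdr_overlap_set(hdr_table *dst, const hdr_table *src)
{
    for (int i = 0; i < src->n; i++)
        hdr_set(dst, src->key[i], src->val[i]);
}

/* Replays the described flow and returns the Content-Type left in the
 * outgoing headers (NULL if none). apply_fix != 0 also unsets the
 * outgoing Content-Type, as the patch does. */
static const char *merged_content_type(int apply_fix)
{
    hdr_table cached = {0};  /* stands in for h->resp_hdrs  */
    hdr_table out = {0};     /* stands in for r->headers_out */
    const char *v;

    hdr_set(&cached, "Content-Type", "image/jpeg"); /* assumed cached type */
    hdr_set(&cached, "ETag", "\"abc\"");            /* assumed extra field */
    hdr_set(&out, "Content-Type", "text/html");     /* from the 304 reply  */

    v = hdr_get(&cached, "Content-Type");
    if (v) {
        /* The real code would set r->content_type to v at this point. */
        hdr_unset(&cached, "Content-Type");
        if (apply_fix)
            hdr_unset(&out, "Content-Type");
    }
    hdr_overlap_set(&out, &cached);
    return hdr_get(&out, "Content-Type");
}
```

Under these assumptions, `merged_content_type(0)` returns the stale backend value `text/html`, while `merged_content_type(1)` returns NULL, so no conflicting header is left in the outgoing table.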

Now here is the code:

Comments written by us are in blue color.

Index: modules/cache/cache_storage.c

===================================================================

--- modules/cache/cache_storage.c        (Revision 409457)

+++ modules/cache/cache_storage.c        (Arbeitskopie)

CACHE_DECLARE(void) ap_cache_accept_headers(... ...) {

   apr_table_t *cookie_table, *hdr_copy;

   const char *v;

   v = apr_table_get(h->resp_hdrs, "Content-Type");

  // The cached object has a ‘content-type’ field.

@@ -118,6 +118,17 @@

    if (v) {

      /* If the cached object has a “Content-Type” field, then this sets the

         response’s “content_type” to the value of h->resp_hdrs’s

         Content-Type. It also unsets the cached object’s

         “Content-Type” field. */

      /* This branch decision is not tested at all!!! */

      ap_set_content_type(r, v); // this will set “r->content_type”

                                 // to h->resp_hdrs’s content-type,

                                 // (the cached)

      apr_table_unset(h->resp_hdrs, "Content-Type");

+     /*

+      * Also unset possible Content-Type headers in r->headers_out and

+      * r->err_headers_out as they may be different to what we have received

+      * from the cache.

+      * Actually they are not needed as r->content_type set by

+      * ap_set_content_type above will be used in the store_headers functions

+      * of the storage providers as a fallback and the HTTP_HEADER filter

+      * does overwrite the Content-Type header with r->content_type anyway.

+      */

+     apr_table_unset(r->headers_out, "Content-Type");

+     apr_table_unset(r->err_headers_out, "Content-Type");

    }

    /* If the cache gave us a Last-Modified header, we can't just

   … …

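  // This call copies every field of h->resp_hdrs into r->headers_out, overwriting any field of the same name (APR_OVERLAP_TABLES_SET). Fields that exist only in r->headers_out are kept, so a Content-Type already present in r->headers_out survives, because the cached one was unset above.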

   apr_table_overlap(r->headers_out, h->resp_hdrs, APR_OVERLAP_TABLES_SET);

}

Is there a log message?

No.

Can developers anticipate the error?

Yes, by using the “log for untested branch decision” pattern. We can insert an error message as shown below, which would be very useful for diagnosis:

   if (v) {

+      Elog("untested region"...);

       ap_set_content_type(r, v); // this will set r->content_type

                                  // to h->resp_hdrs's Content-Type

                                  // (the cached one)

       .. .. ..

   }
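The pattern can be tried outside httpd with a small stand-alone sketch. Everything in it is an assumption made for illustration: `LOG_UNTESTED_BRANCH`, `untested_hits`, and `pick_content_type` are hypothetical names, and the message goes to stderr, whereas inside httpd it would presumably be written through the server's own logging call (such as `ap_log_rerror`).

```c
#include <stdio.h>

/* Hypothetical counter and macro; neither is part of httpd or APR. */
static int untested_hits = 0;

/* Record and report that a branch without test coverage was entered. */
#define LOG_UNTESTED_BRANCH(what)                                       \
    do {                                                                \
        untested_hits++;                                                \
        fprintf(stderr, "[error] %s:%d: entered untested branch: %s\n", \
                __FILE__, __LINE__, (what));                            \
    } while (0)

/* Simplified stand-in for the branch in question: prefer the cached
 * Content-Type when there is one, and log that this path was taken. */
static const char *pick_content_type(const char *cached_type,
                                     const char *backend_type)
{
    if (cached_type) {
        LOG_UNTESTED_BRANCH("cached Content-Type replaces response type");
        return cached_type;
    }
    return backend_type;
}
```

Calling `pick_content_type("image/jpeg", "text/html")` takes the logged branch and increments `untested_hits`; calling it with a NULL cached type does not. A message of this kind in the error log would point a developer directly at the code path involved when a wrong ‘Content-Type’ is reported.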