httpd-45023
Version:
2.2.8
Bug report link:
https://issues.apache.org/bugzilla/show_bug.cgi?id=45023
Symptom:
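When the DEFLATE output filter is enabled on the server and the client accepts gzip encoding, “304 Not Modified” is effectively disabled: a conditional request that carries the ETag the server handed out always receives “200 OK” with the full content. Clients therefore re-download resources they already have cached, which hurts performance.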
*Background: ETag field
Resource caching in HTTP needs a validation mechanism, and the ETag header provides one. An ETag (entity tag) is an identifier that a web server assigns to a specific version of the resource at a URL. If the resource content changes, a new and different ETag is assigned, so the ETag acts like a fingerprint of the resource. When a URL is retrieved, the web server returns the resource along with its ETag (e.g. ETag: “123981ds7f3”). The client may then cache the resource together with the ETag. Later, if the client wants to retrieve the same URL again, it sends its saved ETag value in an “If-None-Match” field (e.g. If-None-Match: “123981ds7f3”). The server compares the client’s ETag value with the current ETag value of the resource on the server side. If they match, the resource has not changed and the server returns the status “304 Not Modified”. If they do not match, a full response including the resource content is returned.
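The validation step described above can be sketched in C as follows. This is a minimal illustration and not Apache code: the function name status_for_conditional_get is hypothetical, only a single tag in If-None-Match is handled, and the tags are assumed to be compared byte for byte.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of httpd): decide the status code for a
 * conditional GET.
 *   current_etag  - the entity tag the server currently assigns to the resource
 *   if_none_match - the single entity tag sent by the client, or NULL if absent
 * Returns 304 when both tags are identical, otherwise 200. */
int status_for_conditional_get(const char *current_etag,
                               const char *if_none_match)
{
    if (current_etag != NULL && if_none_match != NULL
        && strcmp(current_etag, if_none_match) == 0) {
        return 304; /* Not Modified: the client's cached copy is still valid */
    }
    return 200;     /* OK: the full resource has to be sent again */
}
```

Called with the same tag on both sides the sketch yields 304; any difference in the tag text, even a trailing suffix, yields 200.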
*DEFLATE filter
The DEFLATE filter (mod_deflate) compresses the response content to reduce the amount of data sent over the network. In this report, conditional requests do not work correctly when the response is gzip-encoded.
How is it diagnosed?
We reproduced the failure.
How to reproduce?
1. Proof of concept of the normal case (“304 Not Modified”) without the DEFLATE filter
*Request packet (can be sent with telnet):
GET /test.js HTTP/1.1
Host: xxxxxxxxxxxxx
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.14)
Accept: */*
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
If-Modified-Since: Mon, 18 Oct 2010 23:08:30 GMT
If-None-Match: "a488bc1-1d-492ec41293780"
Cache-Control: max-age=0
*Response packet from the web server:
HTTP/1.1 304 Not Modified
Date: Tue, 19 Oct 2010 00:43:49 GMT
Server: Apache/2.2.8 (Unix)
Connection: Keep-Alive
Keep-Alive: timeout=5, max=95
ETag: "a488bc1-1d-492ec41293780"
2. How to apply the DEFLATE filter by enabling the ‘mod_deflate’ module
Compile mod_deflate.c and install it as a run-time loadable module:
./apxs -i -c -Wl,--rpath -Wl,-lz ../httpd-2.2.8/modules/filters/mod_deflate.c
3. Reproducing the normal case (“304 Not Modified”) vs. the error case (“200 OK”) when using the DEFLATE filter
* The normal case: the server answers the following request with “304 Not Modified”.
GET /test.js HTTP/1.1
Host: xxxxxxxxxxxxx
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.14)
Accept: */*
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
If-Modified-Since: Mon, 18 Oct 2010 23:08:30 GMT
If-None-Match: "a488bc1-1d-492ec41293780"
Cache-Control: max-age=0
* The error case: the server answers “200 OK” at the end of the following exchange.
3-1. [Client Request] Request ‘/test.js’ with gzip encoding.
GET /test.js HTTP/1.1
Host: xxxxxxxxxxxxx
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.14)
Accept: */*
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
3-2. [Server Response] The full content is returned together with an ETag value
HTTP/1.1 200 OK
Date: Mon, 07 Feb 2011 02:49:27 GMT
Server: Apache/2.2.8 (Unix)
Last-Modified: Mon, 18 Oct 2010 23:08:30 GMT
ETag: "a488bc1-1d-492ec41293780"-gzip
Accept-Ranges: bytes
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 49
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Content-Type: application/javascript
3-3. [Client Request] Later, the client requests the same resource again with the ETag value it was given
GET /test.js HTTP/1.1
Host: xxxxxxxxxxxxx
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.14)
Accept: */*
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
If-Modified-Since: Mon, 18 Oct 2010 23:08:30 GMT
If-None-Match: "a488bc1-1d-492ec41293780"-gzip ( * the ‘-gzip’ suffix is the only difference from the non-DEFLATE request)
Cache-Control: max-age=0
3-4. [Server Response] We expect “304 Not Modified”, but the server still returns the full content
HTTP/1.1 200 OK
Date: Mon, 07 Feb 2011 02:55:16 GMT
Server: Apache/2.2.8 (Unix)
Last-Modified: Mon, 18 Oct 2010 23:08:30 GMT
ETag: "a488bc1-1d-492ec41293780"-gzip
Accept-Ranges: bytes
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 49
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Content-Type: application/javascript
* If the client instead sends the request with the ETag value "a488bc1-1d-492ec41293780" (without the suffix), the server returns “304 Not Modified”. That is, the “-gzip” suffix is what prevents the match.
Root cause:
The server returns a modified ETag value, but when it validates a later request it compares the client’s value against the original, unmodified ETag. The two values never match, so the server never returns “304 Not Modified”.
In response to the first request, which carries ‘Accept-Encoding: gzip,deflate’, the server returns the ETag value “5954c6-10f4-449d11713aac0”-gzip.
/* The ETag value for a gzip-encoded response is modified in the “deflate_check_etag” function. */
/* This function is called by ‘deflate_out_filter’. */
static void deflate_check_etag(request_rec *r, const char *transform)
{
    /* Retrieve the ETag value from the response header table. It is “5954c6-10f4-449d11713aac0”. */
    const char *etag = apr_table_get(r->headers_out, "ETag");
    if (etag && (((etag[0] != 'W') && (etag[0] !='w')) || (etag[1] != '/'))) {
        /* Modify the ETag value to “5954c6-10f4-449d11713aac0”-gzip to indicate that this response is gzip-encoded. */
        apr_table_set(r->headers_out, "ETag",
                      apr_pstrcat(r->pool, etag, "-", transform, NULL));
    }
}
In its next request, the client sends the ETag value “5954c6-10f4-449d11713aac0”-gzip in the If-None-Match field (not “5954c6-10f4-449d11713aac0”), because that is the value the server returned.
The only place where the status code is set to HTTP_NOT_MODIFIED (304) is the function “ap_meets_conditions”.
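To make the effect of the suffix concrete, the following standalone sketch mimics what deflate_check_etag does to the header value. It is an approximation under stated assumptions: append_transform_suffix and tags_equal are hypothetical names, and snprintf into a caller-supplied buffer stands in for apr_pstrcat and the APR pool.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for deflate_check_etag: writes into `out` the tag
 * that would be sent to the client. Weak tags (starting with W/ or w/) are
 * copied unchanged, all other tags get "-<transform>" appended. */
void append_transform_suffix(const char *etag, const char *transform,
                             char *out, size_t out_size)
{
    int is_weak = (etag[0] == 'W' || etag[0] == 'w') && etag[1] == '/';

    if (is_weak) {
        snprintf(out, out_size, "%s", etag);
    }
    else {
        /* The suffix lands after the closing quote of the tag. */
        snprintf(out, out_size, "%s-%s", etag, transform);
    }
}

/* Byte-for-byte comparison, used here as a simplified model of tag matching. */
int tags_equal(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}
```

For the tag from this report, the value written to `out` ends in -gzip and no longer compares equal to the tag stored in the response header table before the filter ran.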
AP_DECLARE(int) ap_meets_conditions(request_rec *r)
{
    const char *etag;
    const char *if_match, *if_modified_since, *if_unmodified, *if_nonematch;
    /* Retrieve the ETag value on the server side. It is “5954c6-10f4-449d11713aac0”. */
    etag = apr_table_get(r->headers_out, "ETag");
    …
    /* Retrieve the “If-None-Match” field, which contains the ETag value from the client side. It is “5954c6-10f4-449d11713aac0”-gzip, i.e. it differs from the value on the server side. */
    if_nonematch = apr_table_get(r->headers_in, "If-None-Match");
    if (if_nonematch != NULL) {
        ...
        if (r->method_number == M_GET) {
            if (if_nonematch[0] == '*') {
                not_modified = 1;
            }
            else if (etag != NULL) {
                /* Compare ‘if_nonematch’ with ‘etag’. They always differ because of the “-gzip” suffix. */
                if (apr_table_get(r->headers_in, "Range")) {
                    not_modified = etag[0] != 'W'
                                   && ap_find_list_item(r->pool,
                                                        if_nonematch, etag);
                }
                else {
                    not_modified = ap_find_list_item(r->pool,
                                                     if_nonematch, etag);
                }
            }
        }
        … ...
    }
    … ….
    if (not_modified) {
        return HTTP_NOT_MODIFIED;
    }
    return OK;
}
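The mismatch in the If-None-Match branch can be reproduced outside httpd with a small model. This is a sketch under stated assumptions, not the real implementation: list_contains_tag is a hypothetical, much simplified stand-in for ap_find_list_item (it splits on commas and compares items exactly, ignoring commas inside quoted strings), and conditional_get_status is a hypothetical name that covers only a GET without a Range header and returns 200 where the real function returns OK.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for ap_find_list_item: returns 1 if `tag` is exactly
 * equal to one of the comma-separated items in `list`, ignoring blanks
 * around each item. */
int list_contains_tag(const char *list, const char *tag)
{
    size_t tag_len = strlen(tag);
    const char *p = list;

    while (*p != '\0') {
        const char *end;
        size_t item_len;

        while (*p == ' ' || *p == '\t' || *p == ',') {
            p++;                          /* skip separators and blanks */
        }
        end = p;
        while (*end != '\0' && *end != ',') {
            end++;                        /* find the end of this item */
        }
        item_len = (size_t)(end - p);
        while (item_len > 0
               && (p[item_len - 1] == ' ' || p[item_len - 1] == '\t')) {
            item_len--;                   /* trim trailing blanks */
        }
        if (item_len > 0 && item_len == tag_len
            && strncmp(p, tag, tag_len) == 0) {
            return 1;
        }
        p = end;
    }
    return 0;
}

/* Models the If-None-Match branch for a GET request without a Range header.
 * `server_etag` is the unmodified tag found in the response header table. */
int conditional_get_status(const char *server_etag, const char *if_none_match)
{
    if (if_none_match == NULL) {
        return 200;
    }
    if (if_none_match[0] == '*') {
        return 304;
    }
    if (server_etag != NULL && list_contains_tag(if_none_match, server_etag)) {
        return 304;
    }
    return 200;
}
```

With the server-side tag from this report, a client value carrying the -gzip suffix yields 200 in this model, while the unsuffixed value yields 304, which matches the behaviour observed in the reproduction.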
Patch:
Remove the function “deflate_check_etag” so that the ETag value is no longer modified.
Is there an error message?
No.
Can Errlog anticipate the error and insert an error message?
No.