squid-1908

Version:

2.6 (2.5.STABLE9)

Bug link:

http://bugs.squid-cache.org/show_bug.cgi?id=1908

How it is diagnosed (reproduced or source analysis)?

We partially reproduced the failure: we could not observe the exact symptom, but we could observe the faulty code logic.

Symptom:

In some rare circumstances, Squid does not delete a cache object that it is supposed to delete.

Root cause:

In the simple case where an object is added (SWAP_LOG_ADD) and then removed (SWAP_LOG_DEL) without being referenced in between, s.lastref is equal to e->lastref. The rebuild code releases the object only when s.lastref > e->lastref, so in this case the SWAP_LOG_DEL is ignored and the object, which should have been deleted, remains in the cache. The fix, shown in the patch below, changes the comparison from > to >=.

diff -u -3 -p -r1.64 store_dir_ufs.c

--- src/fs/ufs/store_dir_ufs.c        21 Jan 2007 12:54:06 -0000        1.64

+++ src/fs/ufs/store_dir_ufs.c        1 Mar 2007 07:17:48 -0000

@@ -603,7 +603,7 @@ storeUfsDirRebuildFromSwapLog(void *data

            (void) 0;

562:         } else if (s.op == SWAP_LOG_DEL) {

            /* Delete unless we already have a newer copy */

-            if ((e = storeGet(s.key)) != NULL && s.lastref > e->lastref) {

+            if ((e = storeGet(s.key)) != NULL && s.lastref >= e->lastref) {

                /* Release (e) */

                storeRelease(e);

storeRelease(StoreEntry * e), which releases an object from the cache, ends up calling:

destroy_StoreEntry(void *data)

{

   StoreEntry *e = data;

   debug(20, 3) ("destroy_StoreEntry: destroying %p\n", e);

   assert(e != NULL);

   if (e->mem_obj)

       destroy_MemObject(e);

   storeHashDelete(e);

   assert(e->hash.key == NULL);

   memFree(e, MEM_STOREENTRY);

}

Note that under normal operation Squid writes clean swap.state files, so this bug should not occur. It is only a problem when Squid exits without writing clean swap.state files (after a hard crash, or with the -C command line option).
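To make the comparison problem in the root cause concrete, the following is a minimal sketch of the replay logic, not Squid's actual code. The struct names, the single-slot "cache", and the helpers replay() and survives_add_then_del() are hypothetical stand-ins introduced only for illustration; only the two comparisons (> and >=) are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical, simplified stand-ins for a swap log record and a StoreEntry. */
enum swap_op { SWAP_LOG_ADD, SWAP_LOG_DEL };

struct log_record {
    enum swap_op op;
    time_t lastref;
};

struct entry {
    bool present;       /* is the object currently in the (one-slot) cache? */
    time_t lastref;
};

/*
 * Replay one log record against the entry.
 * strict == true  models the original check:  s.lastref >  e->lastref
 * strict == false models the patched check:   s.lastref >= e->lastref
 */
static void replay(struct entry *e, const struct log_record *s, bool strict)
{
    assert(e != NULL && s != NULL);
    if (s->op == SWAP_LOG_ADD) {
        e->present = true;
        e->lastref = s->lastref;
    } else if (s->op == SWAP_LOG_DEL && e->present) {
        /* Delete unless we already have a newer copy. */
        bool should_delete = strict ? (s->lastref > e->lastref)
                                    : (s->lastref >= e->lastref);
        if (should_delete)
            e->present = false;
    }
}

/* Replay ADD followed by DEL with the same lastref (object never referenced).
 * Returns true if the object wrongly survives the rebuild. */
static bool survives_add_then_del(time_t lastref, bool strict)
{
    struct entry e = { false, 0 };
    struct log_record add = { SWAP_LOG_ADD, lastref };
    struct log_record del = { SWAP_LOG_DEL, lastref };

    replay(&e, &add, strict);
    replay(&e, &del, strict);
    return e.present;
}
```

Calling survives_add_then_del(t, true) models the original code, where the entry stays in the cache after the delete record; survives_add_then_del(t, false) models the patched comparison, where the entry is released.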

How to reproduce?

Start the Squid server from:

squid-2.5.STABLE9/bin/sbin

Since this failure is hard to observe directly, we can use GDB to inspect the condition under which the object delete function is called.

$ gdb ./squid

(gdb) b  store_dir_ufs.c:562

(gdb) run -N

562             } else if (s.op == SWAP_LOG_DEL) {

(gdb) n

564                 if ((e = storeGet(s.key)) != NULL && s.lastref > e->lastref) {

(gdb) p s.lastref

$1 = 1288771773

(gdb) p e->lastref

$2 = 1288771772

The various store*DirRebuildFromSwapLog() functions have this

check for the SWAP_LOG_DEL case:

  /* Delete unless we already have a newer copy */

  if ((e = storeGet(s.key)) != NULL && s.lastref > e->lastref) {

Is there Error Message?

No.

Can Errlog anticipate error and put a magic error message?

No. The error condition is domain-specific. If we could anticipate this error, we could also fix the bug.