Apparent bug in exp_prefetch()
Adrian Otto
aotto at mosso.com
Sat Jan 26 10:11:34 CET 2008
Hello,
I'm new to varnish but I think I've found a bug and I'm seeking help
to confirm it. This is varnish 1.1.2 on RHEL4. I'm trying to use this
in my VCL:
sub vcl_timeout {
fetch;
}
This is in contrast to the default setting:
sub vcl_timeout {
discard;
}
According to the man page, this is supposed to fetch a fresh copy of
the document as it's about to expire:
vcl_timeout
Called by the reaper thread shortly before a cached
document reaches its expiry time.
The vcl_timeout subroutine may terminate with one of
the following keywords:
fetch Request a fresh copy of the object from the
backend.
discard
Discard the object.
What I want is a way to always have a fresh copy of a given document
cached, so that no client connections actually have to fall through
to "pass" to fetch the document from the backend.
Now, I looked into the source, and tracked that capability back to
the file cache_expire.c and function exp_prefetch. This is where I
found it using GDB:
(gdb) info threads
6 Thread -1217016912 (LWP 17397) 0x002777a2 in _dl_sysinfo_int80
() from /lib/ld-linux.so.2
5 Thread -1227506768 (LWP 17398) 0x002777a2 in _dl_sysinfo_int80
() from /lib/ld-linux.so.2
4 Thread -1237996624 (LWP 17399) 0x002777a2 in _dl_sysinfo_int80
() from /lib/ld-linux.so.2
3 Thread 1972149168 (LWP 17400) 0x002777a2 in _dl_sysinfo_int80
() from /lib/ld-linux.so.2
2 Thread 1961659312 (LWP 17401) 0x002777a2 in _dl_sysinfo_int80
() from /lib/ld-linux.so.2
1 Thread -1208621376 (LWP 17396) 0x002777a2 in _dl_sysinfo_int80
() from /lib/ld-linux.so.2
(gdb) thread 5
[Switching to thread 5 (Thread -1227506768 (LWP 17398))]#0
0x002777a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
(gdb) bt
#0 0x002777a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1 0x0031bce6 in __nanosleep_nocancel () from /lib/tls/libc.so.6
#2 0x0031baac in sleep () from /lib/tls/libc.so.6
#3 0x080516d2 in exp_prefetch (arg=0x0) at cache_expire.c:178
#4 0x004f33b8 in start_thread () from /lib/tls/libpthread.so.0
#5 0x0035a2fe in clone () from /lib/tls/libc.so.6
It's always running in thread 5. Looking carefully at the source, I
only see how an object could be removed from the heap. I don't see
how it could actually get replaced. It only seems to know how to
expire something. It seems that line 183 runs, and takes it out of
the binheap, and then the lock is released. After that happens, it
follows what seemed to be a rather confusing sequence of code under
VCL_timeout_method(sp). I have not yet learned how the VCL actually
controls the internals of varnish, so I'm not pretending to fathom
it. I think what might work nicely is a code sequence like this:
A) Find an object that's expired.
B) Fetch a new copy if it was modified since the last fetch, and
verify it.
C) Lock the binheap.
D) Drop the old object, and swap in the new one.
E) Unlock the binheap.
It might also be nice if the 'varnishlog' process could have
visibility into the progress of the above code sequence as well. Am I
missing something, or is there a bug here that makes the 'fetch'
method in 'vcl_timeout' impossible to use?
Here are steps to reproduce the problem:
1) Redefine your vcl_timeout like this:
sub vcl_timeout {
fetch;
}
2) Start varnish, and varnishlog.
3) Browse a page that has a short TTL, like 120 seconds.
4) Wait for the TTL to expire.
At this point I expect to see the varnishlog output indicate that
it's fetching a fresh copy, but I don't.
5) Browse to the same page again, and you'll notice that it re-
fetches the document from the origin at the time your browser
requests the file via 'miss', not before the miss occured.
I look forward to any guidance you can provide.
Thanks,
Adrian
More information about the varnish-dev
mailing list