child process restarts frequently

Håvard Futsæter havard.futseter at met.no
Wed Jul 4 12:46:50 CEST 2007


We have a problem with the child process regularly dying and restarting.

We use Varnish in front of three backends, each serving different types 
of content. One of these backends is quite slow, and initially we 
thought the problem with the child process was related to this slow 
backend. We hoped that upgrading from 1.0.3 to 1.0.4 would solve this, 
but unfortunately is has not.

We have a 64-bits machine with a 400G disk cache.

Below I have attached excerpts from what I think is relevant parts of 
the log( with some description in between) for one of these restart events:

-------------------------------------

First request with id 138012248 is fetched and inserted in the cache:

76 ReqStart     c xxx.xx.xxx.xx 4589 138012248
    76 RxRequest    c GET
    76 RxURL        c 
/weatherapi/locationforecast/1.1/?lat=59.2082531856325;lon=
9.60892344715165
    76 RxProtocol   c HTTP/1.1
    76 VCL_call     c recv
    76 VCL_return   c lookup
    76 VCL_call     c hash
    76 VCL_return   c hash
    76 VCL_call     c miss
    76 VCL_return   c fetch
    55 BackendXID   b 138012248
    76 Backend      c 55 weatherapi
    55 TxRequest    b GET
    55 TxURL        b 
/weatherapi/locationforecast/1.1/?lat=59.2082531856325;lon=
9.60892344715165
    55 TxProtocol   b HTTP/1.1
    55 TxHeader     b X-Varnish: 138012248
    55 TxHeader     b X-Forwarded-for:
    55 RxProtocol   b HTTP/1.1
    55 RxStatus     b 203
    55 RxResponse   b Non-Authoritative Information
    55 RxHeader     b Date: Tue, 03 Jul 2007 14:07:08 GMT
    55 RxHeader     b Server: Apache/1.3.34 (Debian) mod_perl/1.29
    55 RxHeader     b Expires: Tue, 03 Jul 2007 15:00:00 GMT
    55 RxHeader     b Content-length: 115322
    55 RxHeader     b Content-Type: text/xml; charset=utf-8
    76 ObjProtocol  c HTTP/1.1
    76 ObjStatus    c 203
    76 ObjResponse  c Non-Authoritative Information
    76 ObjHeader    c Date: Tue, 03 Jul 2007 14:07:08 GMT
    76 ObjHeader    c Server: Apache/1.3.34 (Debian) mod_perl/1.29
    76 ObjHeader    c Expires: Tue, 03 Jul 2007 15:00:00 GMT
    76 ObjHeader    c Content-Type: text/xml; charset=utf-8
    76 ObjHeader    c Content-Length: 115322
    55 BackendReuse b weatherapi
    76 TTL          c 138012248 RFC 3172 1183471628 1183471628 
1183474800 0 0
    76 VCL_call     c fetch
    76 VCL_return   c insert
    76 Length       c 115322
    76 TxProtocol   c HTTP/1.1
    76 TxStatus     c 203
    76 TxResponse   c Non-Authoritative Information
76 ObjHeader    c Date: Tue, 03 Jul 2007 14:07:08 GMT
    76 ObjHeader    c Server: Apache/1.3.34 (Debian) mod_perl/1.29
    76 ObjHeader    c Expires: Tue, 03 Jul 2007 15:00:00 GMT
    76 ObjHeader    c Content-Type: text/xml; charset=utf-8
    76 ObjHeader    c Content-Length: 115322
    55 BackendReuse b weatherapi
    76 TTL          c 138012248 RFC 3172 1183471628 1183471628 
1183474800 0 0
    76 VCL_call     c fetch
    76 VCL_return   c insert
    76 Length       c 115322
    76 TxProtocol   c HTTP/1.1
    76 TxStatus     c 203
    76 TxResponse   c Non-Authoritative Information
    76 TxHeader     c Date: Tue, 03 Jul 2007 14:07:08 GMT
    76 TxHeader     c Server: Apache/1.3.34 (Debian) mod_perl/1.29
    76 TxHeader     c Expires: Tue, 03 Jul 2007 15:00:00 GMT
    76 TxHeader     c Content-Type: text/xml; charset=utf-8
    76 TxHeader     c Content-Length: 115322
    76 TxHeader     c X-Varnish: 138012248
    76 TxHeader     c Age: 0
    76 TxHeader     c Via: 1.1 varnish
    76 ReqEnd       c 138012248 1183471628.619347000 
1183471628.781244000 0.05796
5000 0.161765000 0.000132000
     0 StatAddr       xxx.xx.xxx.xxx 0 22038 3349 127106 0 0 18810 
37429990 95687
09644


This object is then requested and delivered from cache several times. 
Then we see the following in the log, at a time immediately after 14:59:27:


0 ExpPick        138012248
0 VCL_call       timeout
0 VCL_return     discard

...immediately following this( ca.14:59:33 ) we get:

0 ExpBan         138012248 was banned
0 CLI            Rd vcl.load boot /tmp/vcl.XXfdES7y
0 CLI            Wr 0 200 Loaded "/tmp/vcl.XXfdES7y" as "boot"

0 CLI            Rd vcl.use boot
0 CLI            Wr 0 200
0 CLI            Rd start
0 CLI            Wr 0 200

This pattern repeats itself regularly every 5 - 15 hours. I am not sure 
  if the request and the ExpBan tag is related to the child process 
beeing restarted, but since a restart is always preceded by such a tag 
it does seem likely.

The weird thing is that it seems to happen at the exact same time every 
day. At 07:59 and 14:59 the child processs is restarted in the manner 
shown above. We have a lot of requests which expires at each whole hour, 
but I have no idea if this is related in any way.

---------------------------------------------

It would be great if anyone could explain this behaviour? What does the 
ExpBan tag really mean? It seems kind of counterintuitive that a request 
can be banned after beeing discarded....

Also, please let me know if you need any more information to be able to 
look into the problem.


Best regards,
Håvard Futsæter



More information about the varnish-misc mailing list