Fwd: varnish 2.0.4 backend errors

Lazy lazy404 at gmail.com
Mon Jul 13 23:31:16 CEST 2009


2009/7/13 Ken Brownfield <kb+varnish at slide.com>:
> I would try correlating these 503's with actual Apache log lines.  That way,
> you'll see what Apache told Varnish (if it made it to Apache at all).
there are no sign of thiese requests in apache logs

>
> Also, try tcpdumping the backend port to a file, and then use wireshark to
> go back to the traffic at the time you see another 503.

it happens with or without keep-alive (it was set before and I
disabled it recently)

it's likelly the same issue as in "tcp reset problem with varnish
2.0.4 o n Solaris 10 (SPARC)" thread

it happens on 0,09% of request's, mostly POST'S but not only

now I rechecked the logs and I found that varnish closed the
connection just after sending the request
tcpdump follows, there are no backend connection errors
01:03:50.181412 IP x.x.x.x.48563 > x.x.x.x.88: S
3999574413:3999574413(0) win 32792 <mss 16396,sackOK,timestamp
458935307 0,nop,wscale 7>
       0x0000:  4500 003c 6586 4000 4006 3b56 566f f680  E..<e. at .@.;VVo..
       0x0010:  566f f680 bdb3 0058 ee64 a98d 0000 0000  Vo.....X.d......
       0x0020:  a002 8018 b64b 0000 0204 400c 0402 080a  .....K.... at .....
       0x0030:  1b5a cc0b 0000 0000 0103 0307            .Z..........
01:03:50.181422 IP x.x.x.x.88 > x.x.x.x.48563: S
4006066350:4006066350(0) ack 3999574414 win 32768 <mss
16396,sackOK,timestamp 458935307 458935307,nop,wscale 7>
       0x0000:  4500 003c 0000 4000 4006 a0dc 566f f680  E..<.. at .@...Vo..
       0x0010:  566f f680 0058 bdb3 eec7 b8ae ee64 a98e  Vo...X.......d..
       0x0020:  a012 8000 2776 0000 0204 400c 0402 080a  ....'v.... at .....
       0x0030:  1b5a cc0b 1b5a cc0b 0103 0307            .Z...Z......
01:03:50.181433 IP x.x.x.x.48563 > x.x.x.x.88: . ack 1 win 257
<nop,nop,timestamp 458935307 458935307>
       0x0000:  4500 0034 6587 4000 4006 3b5d 566f f680  E..4e. at .@.;]Vo..
       0x0010:  566f f680 bdb3 0058 ee64 a98e eec7 b8af  Vo.....X.d......
       0x0020:  8010 0101 0f9a 0000 0101 080a 1b5a cc0b  .............Z..
       0x0030:  1b5a cc0b                                .Z..
01:03:54.581224 IP x.x.x.x.88 > x.x.x.x.48563: S
4006066350:4006066350(0) ack 3999574414 win 32768 <mss
16396,sackOK,timestamp 458936407 458935307,nop,wscale 7>
       0x0000:  4500 003c 0000 4000 4006 a0dc 566f f680  E..<.. at .@...Vo..
       0x0010:  566f f680 0058 bdb3 eec7 b8ae ee64 a98e  Vo...X.......d..
       0x0020:  a012 8000 232a 0000 0204 400c 0402 080a  ....#*.... at .....
       0x0030:  1b5a d057 1b5a cc0b 0103 0307            .Z.W.Z......
01:03:54.581245 IP x.x.x.x.48563 > x.x.x.x.88: . ack 1 win 257
<nop,nop,timestamp 458936407 458936407,nop,nop,sack 1 {0:1}>
       0x0000:  4500 0040 6588 4000 4006 3b50 566f f680  E.. at e.@. at .;PVo..
       0x0010:  566f f680 bdb3 0058 ee64 a98e eec7 b8af  Vo.....X.d......
       0x0020:  b010 0101 81fc 0000 0101 080a 1b5a d057  .............Z.W
       0x0030:  1b5a d057 0101 050a eec7 b8ae eec7 b8af  .Z.W............
01:03:55.181239 IP x.x.x.x.48563 > x.x.x.x.88: P 1:896(895) ack 1 win
257 <nop,nop,timestamp 458936557 458936407>
       0x0000:  4500 03b3 6589 4000 4006 37dc 566f f680  E...e. at .@.7.Vo..
       0x0010:  566f f680 bdb3 0058 ee64 a98e eec7 b8af  Vo.....X.d......
       0x0020:  8018 0101 9d85 0000 0101 080a 1b5a d0ed  .............Z..
       0x0030:  1b5a d057 504f 5354 202f 7175 697a 2f71  .Z.WPOST./quiz/q
       0x0040:  7565 7374 696f 6e20 4854 5450 2f31 2e31  uestion.HTTP/1.1
       0x0050:  0d0a 782d 7265 7175 6573 7465 642d 7769  ..x-requested-wi
       0x0060:  7468 3a20 584d 4c48 7474 7052 6571 7565  th:.XMLHttpReque
...................... request data

01:03:55.181256 IP x.x.x.x.88 > x.x.x.x.48563: . ack 896 win 270
<nop,nop,timestamp 458936557 458936557>
                       0x0000:  4500 0034 6842 4000 4006 38a2 566f
f680  E..4hB at .@.8.Vo..
                       0x0010:  566f f680 0058 bdb3 eec7 b8af ee64
ad0d  Vo...X.......d..
                       0x0020:  8010 010e 024a 0000 0101 080a 1b5a
d0ed  .....J.......Z..
                       0x0030:  1b5a d0ed                                .Z..
........... acked
01:03:55.181302 IP x.x.x.x.48563 > x.x.x.x.88: F 896:896(0) ack 1 win
257 <nop,nop,timestamp 458936557 458936557>
                       0x0000:  4500 0034 658a 4000 4006 3b5a 566f
f680  E..4e. at .@.;ZVo..
                       0x0010:  566f f680 bdb3 0058 ee64 ad0d eec7
b8af  Vo.....X.d......
                       0x0020:  8011 0101 0256 0000 0101 080a 1b5a
d0ed  .....V.......Z..
                       0x0030:  1b5a d0ed                                .Z..
........... connection closed ???
01:03:55.185068 IP x.x.x.x.88 > x.x.x.x.48563: P 1:2485(2484) ack 897
win 270 <nop,nop,timestamp 458936557 458936557>
                       0x0000:  4500 09e8 6843 4000 4006 2eed 566f
f680  E...hC at .@...Vo..
                       0x0010:  566f f680 0058 bdb3 eec7 b8af ee64
ad0e  Vo...X.......d..
                       0x0020:  8018 010e a3ba 0000 0101 080a 1b5a
d0ed  .............Z..
                       0x0030:  1b5a d0ed 4854 5450 2f31 2e31 2032
3030  .Z..HTTP/1.1.200
                       0x0040:  204f 4b0d 0a44 6174 653a 2046 7269
2c20  .OK..Date:.Fri,.
                       0x0050:  3130 204a 756c 2032 3030 3920 3233
3a30  10.Jul.2009.23:0
                       0x0060:  333a 3535 2047 4d54 0d0a 5365 7276
6572  3:55.GMT..Server
                       0x0070:  3a20 4170 6163 6865 0d0a 4578 7069
7265  :.Apache..Expire
                       0x0080:  733a 2054 6875 2c20 3139 204e 6f76
2031  s:.Thu,.19.Nov.1
                       0x0090:  3938 3120 3038 3a35 323a 3030 2047
4d54  981.08:52:00.GMT
                       0x00a0:  0d0a 4361 6368 652d 436f 6e74 726f
6c3a  ..Cache-Control:
                       0x00b0:  206e 6f2d 7374 6f72 652c 206e 6f2d
6361  .no-store,.no-ca
                       0x00c0:  6368 652c 206d 7573 742d 7265 7661
6c69  che,.must-revali
                       0x00d0:  6461 7465 2c20 706f 7374 2d63 6865
636b  date,.post-check
                       0x00e0:  3d30 2c20 7072 652d 6368 6563 6b3d
300d  =0,.pre-check=0.
                       0x00f0:  0a50 7261 676d 613a 206e 6f2d 6361
6368  .Pragma:.no-cach
                       0x0100:  650d 0a43 6f6e 7465 6e74 2d4c 656e
6774  e..Content-Lengt
                       0x0110:  683a 2032 3230 390d 0a43 6f6e 6e65
6374  h:.2209..Connect
.......... reset
01:03:55.185091 IP x.x.x.x.48563 > x.x.x.x.88: R 3999575310:3999575310(0) win 0
       0x0000:  4500 0028 0000 4000 4006 a0f0 566f f680  E..(.. at .@...Vo..
       0x0010:  566f f680 bdb3 0058 ee64 ad0e 0000 0000  Vo.....X.d......
       0x0020:  5004 0000 bc81 0000                      P.......

>
> Varnish should be logging failed health checks, which could explain periodic
> 503s.  If the health checks are succeeding, then Apache is likely
> periodically closing a connection or sending bad data.
>
> You could also try turning off HTTP 1.1 to Apache from Varnish
> (backend_http11=off) to see if that does anything.  With Varnish fronting
> Apache, doing 1.1 to the backend isn't necessarily a huge win.
>
> In the 503 case specifically, I'm not sure if restarts are respected.  Does
> this happen if you relax the connect_timeout, or if you remove the restart?
I tried to fix the issue by adding a restart, it works as expected
except that there is no automatic max restarts limit (it whould be
nice if there was some just like in vcl_retr)

>  Does this happen only with POST?  Do you have any custom settings in
> vcl_pipe(), or do you bail out of vcl_recv() without falling through to the
> default VCL?
no, vcl_recv() was empty before, now it only has a restart (not
trigerd by the closing connection issue)

>
> Finally, did you edit the Cookie header?  It's malformed, which in theory
> could cause either Apache or Varnish to dump it.
no



More information about the varnish-misc mailing list