varnish 2.0.4 backend errors

Lazy lazy404 at gmail.com
Mon Jul 13 23:51:26 CEST 2009


One more thing
I set the processor affinity for all varnish threads so it will run on
a single cpu, to check if it's not some race condition, so far I
failed to trigger any errors by using ab, I will see tommorow when
some real traffic will show up.

2009/7/13 Ken Brownfield <kb+varnish at slide.com>:
> I would try correlating these 503's with actual Apache log lines.  That way,
> you'll see what Apache told Varnish (if it made it to Apache at all).
>
> Also, try tcpdumping the backend port to a file, and then use wireshark to
> go back to the traffic at the time you see another 503.
>
> Varnish should be logging failed health checks, which could explain periodic
> 503s.  If the health checks are succeeding, then Apache is likely
> periodically closing a connection or sending bad data.
>
> You could also try turning off HTTP 1.1 to Apache from Varnish
> (backend_http11=off) to see if that does anything.  With Varnish fronting
> Apache, doing 1.1 to the backend isn't necessarily a huge win.
>
> In the 503 case specifically, I'm not sure if restarts are respected.  Does
> this happen if you relax the connect_timeout, or if you remove the restart?
>  Does this happen only with POST?  Do you have any custom settings in
> vcl_pipe(), or do you bail out of vcl_recv() without falling through to the
> default VCL?
>
> Finally, did you edit the Cookie header?  It's malformed, which in theory
> could cause either Apache or Varnish to dump it.
> --
> Ken.
>
> On Jul 10, 2009, at 3:21 PM, Lazy wrote:
>
>> Hi,
>>
>> We are having hard time figuring out what's cosing varnish 503 error,
>> our backend is apache is debian 5 default, os is linux x86_64 2.6.26,
>> everything is running on a single machine
>>
>> /usr/local/sbin/varnishd -a 0.0.0.0:80 -f
>> /usr/local/etc/varnish/default.vcl -s malloc -T localhost:9999 -w
>> 10,6000,300 -u nobody
>>
>> running with a single backend
>> .connect_timeout = 1s; added to the backend definition
>>
>> I added
>>
>> sub vcl_error {
>>   if (req.restarts < 10) {
>>       restart;
>>   }
>> }
>>
>> (is it possible to add a pause before doing restart ?)
>>
>> which helps in some cases but not always
>>
>>
>> In about 0.1% of request we get
>>
>>  10 TxRequest    b POST
>>  10 TxURL        b /php
>>  10 TxProtocol   b HTTP/1.1
>>  10 TxHeader     b x-requested-with: XMLHttpRequest
>>  10 TxHeader     b Accept-Language: pl
>>  10 TxHeader     b Referer: http://www.xxxxx/php
>>  10 TxHeader     b Accept: text/html, */*
>>  10 TxHeader     b Content-Type: application/x-www-form-urlencoded
>>  10 TxHeader     b UA-CPU: x86
>>  10 TxHeader     b Accept-Encoding: gzip, deflate
>>  10 TxHeader     b User-Agent: Mozilla/4.0 (compatible; MSIE 7.0;
>> Windows NT 5.1)
>>  10 TxHeader     b Content-Length: 8
>>  10 TxHeader     b Cookie: _.1
>>  10 TxHeader     b X-NovINet: v1.2
>>  10 TxHeader     b Host: www.kinograj.cinema-city.pl
>>  10 TxHeader     b X-Varnish: 603437812
>>  10 TxHeader     b X-Forwarded-For: 79.162.xxx
>>  10 BackendClose b default
>>  31 VCL_call     c error
>>  31 VCL_return   c deliver
>>  31 Length       c 465
>>  31 VCL_call     c deliver
>>  31 VCL_return   c deliver
>>  31 TxProtocol   c HTTP/1.1
>>  31 TxStatus     c 503
>>
>> machine is not overloaded, there are 150 apache running 80% of them is
>> idle
>>
>> what does
>> 31 VCL_call     c error mean , a connection error, apache returned
>> invalid response ?
>>
>> can I get some more information about this error using some syslog in
>> vcl_error or mayby in some other way ?
>>
>> my param.show
>> accept_fd_holdoff          50 [ms]
>> acceptor                   default (epoll, poll)
>> auto_restart               on [bool]
>> backend_http11             on [bool]
>> between_bytes_timeout      60.000000 [s]
>> cache_vbe_conns            off [bool]
>> cc_command                 "exec cc -fpic -shared -Wl,-x -o %o %s"
>> cli_buffer                 8192 [bytes]
>> cli_timeout                5 [seconds]
>> client_http11              off [bool]
>> clock_skew                 10 [s]
>> connect_timeout            1.000000 [s]
>> default_grace              10
>> default_ttl                120 [seconds]
>> diag_bitmap                0x0 [bitmap]
>> err_ttl                    0 [seconds]
>> esi_syntax                 0 [bitmap]
>> fetch_chunksize            128 [kilobytes]
>> first_byte_timeout         60.000000 [s]
>> group                      nogroup (65534)
>> listen_address             0.0.0.0:80
>> listen_depth               1024 [connections]
>> log_hashstring             off [bool]
>> log_local_address          off [bool]
>> lru_interval               2 [seconds]
>> max_esi_includes           5 [includes]
>> max_restarts               4 [restarts]
>> obj_workspace              8192 [bytes]
>> overflow_max               100 [%]
>> ping_interval              3 [seconds]
>> pipe_timeout               60 [seconds]
>> prefer_ipv6                off [bool]
>> purge_dups                 off [bool]
>> purge_hash                 on [bool]
>> rush_exponent              3 [requests per request]
>> send_timeout               600 [seconds]
>> sess_timeout               5 [seconds]
>> sess_workspace             16384 [bytes]
>> session_linger             0 [ms]
>> shm_reclen                 255 [bytes]
>> shm_workspace              8192 [bytes]
>> srcaddr_hash               1049 [buckets]
>> srcaddr_ttl                30 [seconds]
>> thread_pool_add_delay      20 [milliseconds]
>> thread_pool_add_threshold  2 [requests]
>> thread_pool_fail_delay     200 [milliseconds]
>> thread_pool_max            6000 [threads]
>> thread_pool_min            10 [threads]
>> thread_pool_purge_delay    1000 [milliseconds]
>> thread_pool_timeout        300 [seconds]
>> thread_pools               2 [pools]
>> user                       nobody (65534)
>> vcl_trace                  off [bool]
>>
>> stats
>>    4571680  Client connections accepted
>>   13074671  Client requests received
>>    9246516  Cache hits
>>       8084  Cache hits for pass
>>     159743  Cache misses
>>    3768909  Backend connections success
>>          0  Backend connections not attempted
>>          0  Backend connections too many
>>      60064  Backend connections failures
>> this is old and it's not changing now
>>
>>    2610440  Backend connections reuses
>>    3471493  Backend connections recycles
>>          0  Backend connections unused
>>       1053  N struct srcaddr
>>          6  N active struct srcaddr
>>       1017  N struct sess_mem
>>         60  N struct sess
>>        852  N struct object
>>       4183  N struct objecthead
>>          0  N struct smf
>>          0  N small free smf
>>          0  N large free smf
>>         22  N struct vbe_conn
>>        494  N struct bereq
>>         20  N worker threads
>>       4152  N worker threads created
>>          0  N worker threads not created
>>          0  N worker threads limited
>>          0  N queued work requests
>>     226847  N overflowed work requests
>>          0  N dropped work requests
>>          2  N backends
>>     159680  N expired objects
>>          0  N LRU nuked objects
>>          0  N LRU saved objects
>>    4008654  N LRU moved objects
>>          0  N objects on deathrow
>>          0  HTTP header overflows
>>          0  Objects sent with sendfile
>>    5482793  Objects sent with write
>>          0  Objects overflowing workspace
>>    4564281  Total Sessions
>>   13075788  Total Requests
>>         25  Total pipe
>>    3669512  Total pass
>>    3766737  Total fetch
>>  3167362228  Total header bytes
>> 129249988603  Total body bytes
>>     446282  Session Closed
>>      22017  Session Pipeline
>>      78623  Session Read Ahead
>>          0  Session Linger
>>   12628704  Session herd
>>  650794078  SHM records
>>   44487143  SHM writes
>>        627  SHM flushes due to overflow
>>      81635  SHM MTX contention
>>        283  SHM cycles through buffer
>>          0  allocator requests
>>          0  outstanding allocations
>>          0  bytes allocated
>>          0  bytes free
>>    7558668  SMA allocator requests
>>        122  SMA outstanding allocations
>>    1781820  SMA outstanding bytes
>> 198293321655  SMA bytes allocated
>> 198291539835  SMA bytes free
>>      62427  SMS allocator requests
>>          0  SMS outstanding allocations
>> 18446744073709546036  SMS outstanding bytes
>>   29026230  SMS bytes allocated
>>   29028555  SMS bytes freed
>>    3766947  Backend requests made
>>          4  N vcl total
>>          4  N vcl available
>>          0  N vcl discarded
>>          1  N total active purges
>>          1  N new purges added
>>          0  N old purges deleted
>>          0  N objects tested
>>          0  N regexps tested against
>>          0  N duplicate purges removed
>>          0  HCB Lookups without lock
>>          0  HCB Lookups with lock
>>          0  HCB Inserts
>>          0  Objects ESI parsed (unlock)
>>          0  ESI parse errors (unlock)
>>
>>
>> I would be grateful if anyone gave me some pointers about where to look
>> next.
>> _______________________________________________
>> varnish-misc mailing list
>> varnish-misc at projects.linpro.no
>> http://projects.linpro.no/mailman/listinfo/varnish-misc
>
>



More information about the varnish-misc mailing list