Varnish is serving an incomplete response

Igor Minar iiminar at gmail.com
Thu Jul 22 03:03:21 CEST 2010


I'm experiencing something similar. Varnish (2.1.2 on OpenSolaris)
occasionally passes incomplete requests to the backend or delivers
incomplete responses to the client.

That said, I don't see any "Write error" in the logs. Even inspecting
the log for a request that I know was truncated shows nothing
unusual.

I'm not able to reproduce the issue at will, but on two occasions
(days apart) when Varnish was in front of my app, users reported that
their POSTs were truncated or that they received incomplete
responses from the server. Once I removed Varnish from the stack, the
problem never reappeared.

I wonder if this has anything to do with the EBADF issue discussed
here: http://www.mail-archive.com/varnish-misc@projects.linpro.no/msg03626.html

I'm compiling varnish with -mt, but I still see errno=9 (EBADF) in
TCP_Assert called from TCP_blocking in tcp.c. This might not be
related, but I just wanted to mention it in case it is significant.
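
For readers unfamiliar with the code: errno 9 is EBADF ("bad file
descriptor"), which is what a system call reports when it is handed a
descriptor that is not open, for example one that has already been
closed. A minimal Python sketch of that failure mode follows. It is not
Varnish code; os.set_blocking is used here only as a stand-in for the
kind of blocking-mode switch that TCP_blocking appears to perform, and
the helper names are made up for illustration.

```python
import errno
import os


def blocking_switch_errno(fd):
    """Try to put fd into blocking mode; return the errno on failure, else None."""
    try:
        os.set_blocking(fd, True)
    except OSError as exc:
        return exc.errno
    return None


def demo():
    """Return the errno seen on an open descriptor and on a closed one."""
    read_end, write_end = os.pipe()
    on_open = blocking_switch_errno(write_end)    # descriptor is still valid
    os.close(write_end)
    on_closed = blocking_switch_errno(write_end)  # descriptor is already closed
    os.close(read_end)
    return on_open, on_closed


if __name__ == "__main__":
    on_open, on_closed = demo()
    print("open fd:", on_open)
    print("closed fd:", on_closed, errno.errorcode.get(on_closed))
```

The point of the sketch is only that EBADF says the descriptor itself
was invalid at the time of the call. It says nothing about what
invalidated it.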

/i


On Wed, Mar 3, 2010 at 5:15 PM, Bayron Guevara <bayron.guevara at gmail.com> wrote:
> Hello!
>
> I'm using Varnish 2.0.5 on a server with the following specification:
>
>  2 Quadcore Intel Xeon 2.00Ghz 64bits
>  OS: RHEL 5 (64 bits)
>  8MB RAM
>  1GB Ethernet
>
> I've configured my network infrastructure with a load balancer, a dedicated
> Varnish server, and five web servers plus database servers. We have the
> following network configuration:
> external client ---> Load Balancer (public VIP) ---> Varnish Proxy --> Load
> Balancer (private VIP) --> Web Servers
>
> In this configuration, the load balancer is responsible for sending each
> request to the appropriate server according to the domain. The Varnish
> server has the load balancer's private VIP configured as its only backend.
>
> Now, let me explain the issue. Under low traffic the websites are usually
> served correctly, but sometimes a page comes back blank or partially loaded.
> In both cases a 200 OK response code is received along with the response
> body, but the body is incomplete. I then checked the varnishstat and
> varnishlog output and have some observations: Varnish restarted frequently,
> and when I ran  varnishlog -i Debug -I  I got the following output:
> 400 Debug        c "Write error, len = 34500/55022, errno = Success"
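
A small sketch for pulling the numbers out of lines like the one above.
It assumes the len field means "bytes written / bytes that should have
been written", which is an interpretation and not something confirmed
in this thread. The function name and regular expression are
illustrative only.

```python
import re

# Matches the Debug record quoted above; the field meanings are an assumption.
WRITE_ERROR = re.compile(
    r'Write error, len = (?P<written>-?\d+)/(?P<expected>\d+), '
    r'errno = (?P<errno>[^"]+)'
)


def parse_write_error(line):
    """Return (written, expected, errno_text), or None if the line doesn't match."""
    match = WRITE_ERROR.search(line)
    if match is None:
        return None
    return (
        int(match.group("written")),
        int(match.group("expected")),
        match.group("errno").strip(),
    )


if __name__ == "__main__":
    sample = '400 Debug        c "Write error, len = 34500/55022, errno = Success"'
    written, expected, errno_text = parse_write_error(sample)
    print(f"{written} of {expected} bytes written, "
          f"{expected - written} missing, errno text: {errno_text}")
```

Under that reading, the sample line describes a response that was cut
off 20522 bytes short.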
>
> I don't know exactly what it means, but some Google searching gave me a
> clue: it may be caused by an interruption during communication with the
> client. So this error could point to the cause of the problem. I don't know
> what triggers it, but I suspect a network buffer overflow, so here are some
> OS-related values:
>
> /proc/sys/net/ipv4/ip_local_port_range = 32768   61000
> /proc/sys/net/core/rmem_max = 131071
> /proc/sys/net/core/wmem_max = 131071
> /proc/sys/net/ipv4/tcp_mem = 196608  262144  393216
> /proc/sys/net/ipv4/tcp_wmem = 4096    16384   4194304
> /proc/sys/net/ipv4/tcp_fin_timeout = 60
> /proc/sys/net/core/netdev_max_backlog = 1000
> /proc/sys/net/core/somaxconn = 128
> /proc/sys/net/ipv4/tcp_syncookies = 1
> /proc/sys/net/ipv4/tcp_max_orphans = 65536
> /proc/sys/net/ipv4/tcp_max_syn_backlog = 1024
> /proc/sys/net/ipv4/tcp_synack_retries = 5
> /proc/sys/net/ipv4/tcp_syn_retries = 5
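
To compare settings like these across machines, or before and after
tuning, the same files can be read programmatically. The sketch below
assumes a Linux-style /proc/sys tree and uses the usual mapping from a
dotted sysctl name to a path. The key list and function name are
illustrative, and no recommended values are implied.

```python
import os

# A few of the keys from the list above, written in sysctl dotted form.
KEYS = [
    "net.core.rmem_max",
    "net.core.wmem_max",
    "net.ipv4.tcp_mem",
    "net.ipv4.tcp_wmem",
    "net.core.somaxconn",
    "net.ipv4.tcp_max_syn_backlog",
]


def read_sysctls(keys, root="/proc/sys"):
    """Map each dotted key to its whitespace-separated fields, or None if unreadable."""
    values = {}
    for key in keys:
        path = os.path.join(root, *key.split("."))
        try:
            with open(path) as handle:
                values[key] = handle.read().split()
        except OSError:
            # Missing file or no permission: record the gap instead of failing.
            values[key] = None
    return values


if __name__ == "__main__":
    for key, fields in read_sysctls(KEYS).items():
        print(key, "=", " ".join(fields) if fields else "(not available)")
```

Multi-valued entries such as tcp_wmem come back as a list of three
strings, in the same order as they appear in the file.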
>
> These same settings are covered in this Varnish performance article:
> http://varnish-cache.org/wiki/Performance. Mine seem very low, and
> maybe that is one of the causes. With average traffic (around 500
> concurrent users across all sites), the Varnish service stops responding
> and the server load rises to 612. As for the website response, a Connection
> refused error (Code 503) is returned. On that occasion I wasn't able to
> review the Varnish statistics.
>
> Here are my Varnish params, in case they help:
> 200 2224
> accept_fd_holdoff          50 [ms]
> acceptor                   default (epoll, poll)
> auto_restart               on [bool]
> backend_http11             on [bool]
> between_bytes_timeout      60.000000 [s]
> cache_vbe_conns            off [bool]
> cc_command                 "exec cc -fpic -shared -Wl,-x -o %o %s"
> cli_buffer                 8192 [bytes]
> cli_timeout                5 [seconds]
> client_http11              off [bool]
> clock_skew                 10 [s]
> connect_timeout            0.400000 [s]
> default_grace              10
> default_ttl                180 [seconds]
> diag_bitmap                0x0 [bitmap]
> err_ttl                    0 [seconds]
> esi_syntax                 0 [bitmap]
> fetch_chunksize            128 [kilobytes]
> first_byte_timeout         60.000000 [s]
> group                      varnish (103)
> listen_address             :80
> listen_depth               1024 [connections]
> log_hashstring             off [bool]
> log_local_address          off [bool]
> lru_interval               360 [seconds]
> max_esi_includes           5 [includes]
> max_restarts               4 [restarts]
> obj_workspace              8192 [bytes]
> overflow_max               100 [%]
> ping_interval              3 [seconds]
> pipe_timeout               60 [seconds]
> prefer_ipv6                off [bool]
> purge_dups                 on [bool]
> purge_hash                 on [bool]
> rush_exponent              3 [requests per request]
> send_timeout               600 [seconds]
> sess_timeout               5 [seconds]
> sess_workspace             65536 [bytes]
> session_linger             100 [ms]
> session_max                100000 [sessions]
> shm_reclen                 255 [bytes]
> shm_workspace              8192 [bytes]
> srcaddr_hash               1049 [buckets]
> srcaddr_ttl                0 [seconds]
> thread_pool_add_delay      2 [milliseconds]
> thread_pool_add_threshold  2 [requests]
> thread_pool_fail_delay     200 [milliseconds]
> thread_pool_max            5000 [threads]
> thread_pool_min            150 [threads]
> thread_pool_purge_delay    1000 [milliseconds]
> thread_pool_stack          unlimited [bytes]
> thread_pool_timeout        120 [seconds]
> thread_pools               8 [pools]
> user                       varnish (100)
> vcl_trace                  off [bool]
>
> What are your suggestions?
> Is this a Varnish or Operating System configuration problem?