Hello,<div><br><div>After several days of effort I still have not fixed my problem. I have exhausted every approach I could think of, so I am asking the community for help.</div><div><br></div><div>I use Varnish as a web cache and load balancer in front of 3 web nodes, and recently I have been getting frequent 503 errors.</div>
<div><br></div><div>My varnish configuration file:</div><div>=======================================================</div><div><div>backend nanjing {</div><div> .host = "10.80.125.66";</div><div> .port = "80";</div>
<div> .connect_timeout = 1800s;</div><div> .first_byte_timeout = 1800s;</div><div> .between_bytes_timeout = 1800s;</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>.probe = {</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>.url = "/live.html";</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span>.interval = 1s;</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>.timeout = 3s;</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>.window = 10;</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span>.threshold = 2;</div><div><span class="Apple-tab-span" style="white-space:pre"> </span> }</div><div> }</div><div><br></div><div>backend hangzhou {</div><div>
.host = "10.80.125.68";</div><div> #.host = "10.36.146.202";</div><div> .port = "80";</div><div> .connect_timeout = 1800s;</div><div> .first_byte_timeout = 1800s;</div><div>
.between_bytes_timeout = 1800s;</div><div><br></div><div><span class="Apple-tab-span" style="white-space:pre"> </span>.probe = {</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>.url = "/live.html";</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span>.interval = 1s;</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>.timeout = 3s;</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>.window = 10;</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span>.threshold = 2;</div><div><span class="Apple-tab-span" style="white-space:pre"> </span> }</div><div> }</div><div>backend chongqing {</div><div> .host = "10.80.125.76";</div>
<div> .port = "80";</div><div> .connect_timeout = 1800s;</div><div> .first_byte_timeout = 1800s;</div><div> .between_bytes_timeout = 1800s;</div><div><br></div><div><span class="Apple-tab-span" style="white-space:pre"> </span>.probe = {</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span>.url = "/live.html";</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>.interval = 1s;</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>.timeout = 3s;</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span>.window = 10;</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>.threshold = 2;</div><div><span class="Apple-tab-span" style="white-space:pre"> </span> }</div>
<div> }</div><div><br></div><div><br></div><div><br></div><div>director proxy random {</div><div> {</div><div> .backend = chongqing;</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>.weight = 2;</div>
<div> }</div><div> {</div><div> .backend = nanjing;</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>.weight = 4;</div><div> }</div><div> {</div><div> .backend = hangzhou;</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span>.weight = 4;</div><div> }</div><div>}</div><div><br></div><div>acl purge {</div><div> "localhost";</div><div> "10.80.125.0"/24;</div>
<div>}</div><div><br></div><div>sub vcl_recv {</div><div> set req.backend = proxy;</div><div><br></div><div> if (req.request != "GET" && req.request != "HEAD") {</div><div><br></div>
<div> # POST - Logins and edits</div><div> if (req.request == "POST") {</div><div> return(pass);</div><div> }</div><div> </div><div>
# PURGE - The CacheFu product can invalidate updated URLs</div><div> if (req.request == "PURGE") {</div><div> if (!client.ip ~ purge) {</div><div> error 405 "Not allowed.";</div>
<div> }</div><div> return(lookup);</div><div> }</div><div> }</div><div><br></div><div> # Don't cache authenticated requests</div><div> if (req.http.Cookie && req.http.Cookie ~ "__ac(|_(name|password|persistent))=") {</div>
<div><br></div><div><span class="Apple-tab-span" style="white-space:pre"> </span># Force lookup of specific urls unlikely to need protection</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>if (req.url ~ "\.(js|css)") {</div>
<div> remove req.http.cookie;</div><div> return(lookup);</div><div> }</div><div> return(pass);</div><div> }</div><div><br></div><div> # The default vcl_recv is used from here.</div>
<div> }</div><div><br></div><div>sub vcl_hit {</div><div> # if (req.request == "PURGE") {</div><div> # purge('');</div><div> # error 200 "Purged";</div><div> # }</div>
<div>}</div><div>sub vcl_miss {</div><div> # if (req.request == "PURGE") {</div><div> # purge('');</div><div> # error 200 "Purged";</div><div> # }</div><div>
}</div><div><br></div><div># Enforce a minimum TTL, since we can PURGE changed objects actively</div><div># from Zope by using the CacheFu product</div><div><br></div><div>sub vcl_fetch {</div><div> if (beresp.ttl < 3600s) {</div>
<div> set beresp.ttl = 3600s;</div><div> }</div><div>}</div></div><div><br></div><div><div><br></div><div>Varnish startup script</div><div>==========================================</div><div><div><span class="Apple-tab-span" style="white-space:pre"> </span>varnishd -f /etc/varnish/my.vcl -s malloc,8192M -a $ip:80 \</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span>-T $ip:2048 \</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>-n vcache-my\</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>-p thread_pools=2 \</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span>-p thread_pool_max=15000\</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>-p thread_pool_min=500\</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>-p listen_depth=2048 \</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span>-p lru_interval=1800 \</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>-h classic,169313 \</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>-p connect_timeout=1800 \</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span>-p http_max_hdr=8192\</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>-p http_resp_hdr_len=18192\</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>-p max_restarts=6 </div>
</div><div><br></div><div>I checked the backend status:</div><div><div>[root@hongkong varnish]# varnishadm -n vcache-my backend.list</div><div>==============================================</div><div>Backend name Refs Admin Probe</div>
<div>nanjing(10.80.125.66,,80) 68 probe Healthy 8/10</div><div>hangzhou(10.80.125.68,,80) 66 probe Healthy 7/10</div><div>chongqing(10.80.125.76,,80) 23 probe Healthy 9/10</div></div>
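<div>To make the failed probes easier to track over time, here is a minimal sketch that turns the backend.list output above into a per-backend failure count. It assumes every backend line ends in a "good/window" field such as 8/10, as in the output shown. The function name probe_failures is only illustrative.</div>

```shell
#!/bin/bash
# Summarize failed probes per backend from `varnishadm backend.list` output.
# Assumption: each backend line ends in a "good/window" field such as "8/10".
# Header and separator lines do not match that pattern and are skipped.
probe_failures() {
    awk '$NF ~ /^[0-9]+\/[0-9]+$/ {
        split($NF, r, "/")           # r[1] = good probes, r[2] = window size
        sub(/\(.*/, "", $1)          # strip "(ip,,port)" from the backend name
        printf "%s %d/%d failed\n", $1, r[2] - r[1], r[2]
    }'
}

# Usage (reads the listing on stdin):
#   varnishadm -n vcache-my backend.list | probe_failures
```

<div>For the listing above this reports 2, 3 and 1 failed probes out of 10 for nanjing, hangzhou and chongqing respectively.</div>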
<div><br></div><div><br></div><div>I already lowered .threshold from 8 to 2 so that all the nodes stay in Healthy status; with .threshold set to 8,</div><div>most of the nodes are marked Sick.</div><div>
<br></div><div>I also ran a script that fetches the probe page with wget every 2 seconds and saw no failures, yet 'backend.list' always shows failed probes.</div><div><br></div><div>I have a script to watch the status of my website:</div>
<div>----------------------------------------------------------------------------------</div><div><div>#!/bin/bash</div><div>pass=0</div><div>fail=0</div><div><br></div><div>while [ 1 ]</div><div>do</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>wget <a href="http://mysite/live.html">http://mysite/live.html</a> -O /dev/null</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span>if [ $? -eq 0 ];then</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>pass=$(expr $pass + 1)</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>else</div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span>fail=$(expr $fail + 1)</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>fi</div><div><br></div><div><span class="Apple-tab-span" style="white-space:pre"> </span>echo -e "pass: $pass\n fail: $fail" </div>
<div><span class="Apple-tab-span" style="white-space:pre"> </span>sleep 5</div><div>done</div></div><div><br></div><div>25% of the requests failed, which is very strange, and I have no clue why.</div><div><br></div><div>Sample output from the Varnish log:</div>
<div>=======================================</div><div>varnishlog -n vcache-my| tee -a /var/log/varnish.log</div><div><br></div><div><div> 977 RxHeader c Connection: keep-alive</div><div> 977 RxHeader c User-Agent: Mozilla/5.0 (iPad; CPU OS 6_0_1 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A523 Safari/8536.25</div>
<div> 977 VCL_call c recv pass</div><div> 977 VCL_call c hash</div><div> 977 Hash c /</div><div> 977 Hash c <a href="http://www.mywebsite.com">www.mywebsite.com</a></div><div> 977 VCL_return c hash</div>
<div> 977 VCL_call c pass pass</div><div> 977 FetchError c no backend connection</div><div> 977 VCL_call c error deliver</div><div> 977 VCL_call c deliver deliver</div><div> 977 TxProtocol c HTTP/1.1</div>
<div> 977 TxStatus c 503</div><div> 977 TxResponse c Service Unavailable</div><div> 977 TxHeader c Server: Varnish</div><div> 977 TxHeader c Content-Type: text/html; charset=utf-8</div><div> 977 TxHeader c Retry-After: 5</div>
<div> 977 TxHeader c Content-Length: 419</div><div> 977 TxHeader c Accept-Ranges: bytes</div><div> 977 TxHeader c Date: Mon, 07 Jan 2013 18:03:02 GMT</div><div> 977 TxHeader c X-Varnish: 2122413499</div>
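<div>To see which failure reason dominates in a longer capture, here is a rough sketch that tallies the FetchError lines. It assumes the log format shown above, where the second column is the tag and the text after the c/b column is the reason. The function name fetch_error_summary is only illustrative.</div>

```shell
#!/bin/bash
# Tally FetchError reasons from varnishlog output read on stdin.
# Assumption: lines look like "  977 FetchError   c no backend connection".
fetch_error_summary() {
    awk '$2 == "FetchError" {
        line = $0
        # Drop the transaction id, the tag and the c/b column; keep the reason.
        sub(/^[ \t]*[0-9]+[ \t]+FetchError[ \t]+[^ \t]+[ \t]+/, "", line)
        sub(/[ \t]+$/, "", line)     # trim trailing whitespace
        count[line]++
    }
    END { for (r in count) printf "%d %s\n", count[r], r }' | sort -rn
}

# Usage on the saved log (the function reads until end of input):
#   fetch_error_summary < /var/log/varnish.log
```

<div>Each output line is a count followed by the reason, most frequent first, so a log dominated by "no backend connection" points at the backend connections rather than at slow responses.</div>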
</div><div><br></div><div>More of the Varnish log:</div><div>shaohui dot org/downloads/varnish.tgz</div><div><br></div><div>These 503 errors are causing serious trouble for my website: my customers cannot access it, and I have no idea what is wrong. Could somebody offer some advice? Thanks so much.</div>
<div><br></div>-- <br>Best regards<br>Shaohui
</div></div>