<div dir="ltr"><br>I have a fairly busy Varnish 2.1 box that has stopped serving traffic a couple of times recently. Since it's actively serving our sites, it got bounced each time before we could look at Varnish to see what was going on. Restarting Varnish did get things working again.<br>
<br>While troubleshooting this, I came across something I don't understand. If I run 'sar -n ALL', I get this for open sockets in the period leading up to the crisis:<br><div><br><br> totsck tcpsck udpsck rawsck ip-frag tcp-tw<br>
<br>
11:45:01 AM 14921 13657 0 0 0 118<br>
11:55:01 AM 23930 20664 0 0 0 152<br>
12:05:02 PM 32092 25969 0 0 0 113<br>
12:15:01 PM 39461 30043 0 0 0 113<br>
12:25:02 PM 46668 33715 0 0 0 94<br>
12:35:01 PM 54069 36689 0 0 0 107<br>
12:45:01 PM 61508 39127 0 0 0 90<br>
12:55:03 PM 68697 40981 0 0 0 101<br>
01:05:01 PM 75922 42634 0 0 0 93<br>
01:15:02 PM 82843 43848 0 0 0 98<br>
01:25:01 PM 89889 45174 0 0 0 103<br>
01:35:01 PM 97185 46296 0 0 0 94<br>
01:45:01 PM 100184 39404 0 0 0 85<br>
<br>Varnish became unresponsive at around 100,000 total sockets.<br><br>My question is this: if totsck is the total number of open sockets, why don't the other columns sum to totsck? For example, at 01:35:01 PM totsck is 97185, but tcpsck is only 46296 and udpsck and rawsck are both 0, so roughly 50,000 sockets are unaccounted for. What kind of socket might be missing from this output? Does this output tell me anything useful at all?<br>
</div></div>