<div class="moz-cite-prefix">On 09/28/2012 12:02 AM, Johan Olsson
wrote:<br>
</div>
<p class="MsoNormal"><span lang="EN-US">Hi<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">I’ve been looking on how
to monitor varnish. I’ve found that there exists a snmp for
varnish which gives some info that is good to have. I’ve
found it and looked at it (</span><a moz-do-not-send="true"
href="http://sourceforge.net/projects/varnishsnmp/"><span
lang="EN-US">http://sourceforge.net/projects/varnishsnmp/</span></a><span
lang="EN-US">), but it dosen’t give all that I need (I
think). What I’m missing is to be able to monitor how much
traffic one site is using. So if I have two sites like
<a moz-do-not-send="true" href="http://www.example1.com">www.example1.com</a>
and <a moz-do-not-send="true"
href="http://www.example2.com">
www.example2.com</a>, I would like to be able to get how
many connections each one gets and how much Mbps each one is
using.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">Is this possible to do?
</span></p>
</div>
</blockquote>

Hi Johan,

Maybe I can help. I've got about 35 sites running on a 4-node Varnish cluster here and monitor throughput, request rate and HTTP status codes per site using Cacti and Nagios via SNMP. The way it works is like this:

Each server runs the exact same Varnish config. In this config there's a VCL chunk that defines a bunch of macros for gathering site info:

STATS_NODE - Defines a new node. This generates a structure at compile time, where the statistics will be stored.
STATS_INIT - Initializes a node. Unfortunately this gets called each time a site is accessed, but the code is only a few lines and very lightweight.
STATS_SET_BACKEND - Defines the current backend to use. This is called each time a site is accessed.
STATS_UPDATE - Updates the site's statistics. This is called each time a site is accessed.
STATS_DUMP - Gets called periodically to dump the entire statistics linked list to syslog.

The flow is roughly as follows:

- Each site references the macros at certain points to generate the statistics;
- The main configuration calls STATS_DUMP periodically, which sends the statistics to syslog;
- Syslog then writes them to a dedicated FIFO;
- A script (called varnish-snmp-stats-prep-backends.sh) listens on the FIFO and parses the stats (a rough sketch of such a parser follows below);
- The parsed stats are written to a per-site text file;
- SNMPD is configured to expose the per-site stats files.
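
To make the parsing step concrete, here is a minimal Python sketch of what such a FIFO-reading script could look like. The input line format, file names and paths are only assumptions for illustration; the real script works with whatever format the STATS_DUMP macro emits.

#!/usr/bin/env python
# Rough sketch of a FIFO-fed per-site stats writer.
# Assumed input lines (what syslog writes into the FIFO when STATS_DUMP fires):
#   ... <sitename> <request_count> <bytes_sent>
# The field layout, paths and file names here are placeholders.

import os

FIFO_PATH = "/var/run/varnish-stats.fifo"   # assumed FIFO that syslog writes to
STATS_DIR = "/var/lib/varnish-stats"        # assumed per-site output directory

def write_site_stats(site, requests, bytes_sent):
    # One small text file per site; snmpd can then expose its contents.
    path = os.path.join(STATS_DIR, site + ".txt")
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write("requests %d\nbytes %d\n" % (requests, bytes_sent))
    os.rename(tmp, path)                    # atomic replace; readers never see a partial file

def main():
    while True:
        # Opening the FIFO blocks until syslog opens it for writing.
        with open(FIFO_PATH) as fifo:
            for line in fifo:
                parts = line.split()
                if len(parts) < 3:
                    continue                # skip lines we don't understand
                site = parts[-3]
                try:
                    requests, bytes_sent = int(parts[-2]), int(parts[-1])
                except ValueError:
                    continue
                write_site_stats(site, requests, bytes_sent)

if __name__ == "__main__":
    main()

The important property is simply that each site ends up with its own small, trivially parseable text file.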

Each server also generates varnishd-specific data every 5 minutes, using a script (varnish-snmp-stats-prep-srv.sh) that calls varnishstat. The parsed varnishstat output is dumped to a text file and made available to SNMPD.
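
That per-server collector is simple enough to sketch as well. Something like the following (in Python rather than shell, with an assumed output path) covers the varnishstat part; "varnishstat -1" prints every counter once, one per line, with the counter name and value as the first two fields.

#!/usr/bin/env python
# Sketch: dump varnishd counters to a text file that snmpd can serve.
# The output path is an assumption; use whatever snmpd is configured to read.

import subprocess

OUT_FILE = "/var/lib/varnish-stats/varnishd.txt"

def collect():
    # "varnishstat -1" prints all counters once and exits.
    output = subprocess.check_output(["varnishstat", "-1"])
    lines = []
    for line in output.decode("utf-8", "replace").splitlines():
        fields = line.split()
        if len(fields) < 2 or not fields[1].isdigit():
            continue                      # skip headers and non-numeric lines
        lines.append("%s %s\n" % (fields[0], fields[1]))
    with open(OUT_FILE, "w") as f:
        f.writelines(lines)

if __name__ == "__main__":
    collect()

On the snmpd side, net-snmp's "exec" or "extend" directives are one way to make a file like that answerable over SNMP.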

One Varnish server is appointed the main statistics server. On that server a cron job calls "varnish-snmp-summarize-backends.py" every 5 minutes, which gathers and summarizes the statistics of all 4 servers over SNMP. This data is then dumped to per-site text files again, but now containing the aggregate per-site counts. Cacti can then query this one server for the combined per-site and varnishd statistics.
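
Conceptually the summarizing job just fetches the same counter from every node and adds them up. Here is a rough Python sketch using net-snmp's snmpget command line tool; the host names, community string and OIDs below are purely illustrative, not the ones my setup uses.

#!/usr/bin/env python
# Sketch: sum a per-site counter across all cache nodes via SNMP and write
# the aggregate to per-site text files. Hosts, community and OIDs are placeholders.

import subprocess

NODES = ["cache1", "cache2", "cache3", "cache4"]   # the four Varnish servers (assumed names)
COMMUNITY = "public"                               # illustrative community string
SITES = {                                          # site -> illustrative per-site OID
    "www.example1.com": ".1.3.6.1.4.1.99999.1.1.1",
    "www.example2.com": ".1.3.6.1.4.1.99999.1.1.2",
}
OUT_DIR = "/var/lib/varnish-stats/aggregate"       # assumed directory Cacti queries

def snmp_get(host, oid):
    # -Oqv makes snmpget print only the value, which keeps parsing trivial.
    out = subprocess.check_output(
        ["snmpget", "-v2c", "-c", COMMUNITY, "-Oqv", host, oid])
    return int(out.decode("ascii").strip())

def main():
    for site, oid in SITES.items():
        total = sum(snmp_get(node, oid) for node in NODES)
        with open("%s/%s.txt" % (OUT_DIR, site), "w") as f:
            f.write("requests %d\n" % total)

if __name__ == "__main__":
    main()

Run from cron every 5 minutes ("*/5 * * * *"), which matches the interval described above.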

Another approach to generating the per-site statistics would be to pipe the varnishlog output into a script that parses it. I do fear this method might cause quite a heavy load on the machine doing the parsing, so it may have to be offloaded to another machine. But this is not the path I chose.
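
For completeness, that varnishlog-based variant could look roughly like the sketch below. It assumes Varnish 2/3 log tags, where the client's Host header shows up under RxHeader; the counting and flushing here is deliberately naive.

#!/usr/bin/env python
# Sketch of the varnishlog approach: count client requests per Host header.
# Assumes Varnish 2/3 shared memory log tags (RxHeader for received headers).

import subprocess
from collections import defaultdict

def main():
    counts = defaultdict(int)
    # -c: client-side transactions only; -i RxHeader: only received-header records.
    proc = subprocess.Popen(["varnishlog", "-c", "-i", "RxHeader"],
                            stdout=subprocess.PIPE)
    for raw in proc.stdout:
        line = raw.decode("utf-8", "replace")
        if "Host:" not in line:
            continue
        host = line.rsplit("Host:", 1)[1].strip().lower()
        counts[host] += 1
        # A real script would periodically flush the counts (and byte totals,
        # e.g. from the Length tag) to per-site files instead of printing them.
        print("%s %d" % (host, counts[host]))

if __name__ == "__main__":
    main()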

Note: we're still running Varnish 2 in production; version 3 is in test, but the conversion is trivial. I've prepared a tarball of this setup for sharing, but I have to get permission to release this (anonymized) configuration to the public. I'll get back to you on this tomorrow or the day after, hopefully with the entire setup (varnish, cron, support scripts, syslog). Just let me know the best way to share it on this list.

And somewhat unrelated to your question, but interesting nonetheless: another bit of VCL code dumps each request to syslog in a modified NCSA format, for debugging, traceability and the like. Because the machines sometimes generate more than 2 MB of log data per second per server, and I like to keep the logs for a few weeks, the logs need to be rotated fairly often to prevent gigantic files, and they need to be compressed to minimize storage requirements. There are two separate scripts to handle log rotation:

- varnish-log-rotate.sh - Checks the size of the log and rotates it if it exceeds 2GB.
- varnish-log-compress.sh - Waits for rotated logs, then compresses and archives them at idle priority to minimize CPU impact.

This lets you store 2.5TB of logs on 250GB of storage while keeping the log compression load on the servers minimal.
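
The rotation and compression logic itself is nothing special. Here is a Python sketch of the two steps (the real scripts are shell; the paths, the 2GB threshold and the HUP-to-syslog step are assumptions for illustration):

#!/usr/bin/env python
# Sketch of the rotate + compress steps. Paths and daemon names are placeholders.

import os
import subprocess
import time

LOG_FILE = "/var/log/varnish/requests.log"   # assumed request log written via syslog
ARCHIVE_DIR = "/var/log/varnish/archive"     # assumed archive directory
MAX_SIZE = 2 * 1024 ** 3                     # rotate once the log exceeds 2 GB

def rotate_if_needed():
    if os.path.getsize(LOG_FILE) < MAX_SIZE:
        return None
    rotated = os.path.join(ARCHIVE_DIR,
                           "requests-%s.log" % time.strftime("%Y%m%d-%H%M%S"))
    os.rename(LOG_FILE, rotated)
    # Make the syslog daemon reopen its files; the exact daemon name and signal
    # depend on which syslogger you run.
    subprocess.call(["pkill", "-HUP", "syslogd"])
    return rotated

def compress(path):
    # nice 19 plus the ionice idle class keeps gzip from competing with
    # varnishd for CPU and disk bandwidth.
    subprocess.call(["nice", "-n", "19", "ionice", "-c", "3", "gzip", path])

if __name__ == "__main__":
    rotated = rotate_if_needed()
    if rotated:
        compress(rotated)

The 2.5TB-on-250GB figure above works out to roughly a 10:1 compression ratio, which is plausible for this kind of text log.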

Cheers,

Johnny