varnishncsa logs split per domain

Mon Nov 14 18:33:46 CET 2016

Hello Andrei,

No offense taken ;) That's why we talk: to find the best solution!

And I think you are right. It's better to run a single instance of
varnishncsa and move the log writing out of it.

I tried piping varnishncsa to the Apache split-logfile Perl script:

/usr/sbin/varnishncsa -f /etc/varnish/varnishncsa-log-format-string \
    -P /var/run/varnishlog.pid \
  | sed -e 's#^www\.##g' | /usr/bin/split-logfile &

split-logfile then started creating separate log files for the vhosts.

But I still have an issue with this approach: most of the data only
gets written to the logs once I kill the background process.
The per-vhost log files are created but stay mostly empty, and only
fill up when I kill the main varnishncsa process.
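
My guess at the cause (an assumption, not verified against either tool):
neither sed nor a Perl script flushes after every line when its output
is a pipe or a regular file, so entries sit in stdio buffers until the
process exits. With GNU sed, -u (--unbuffered) should take care of the
sed stage. For the writer stage, a minimal sketch of a splitter that
flushes after each line could look like the one below. The function
name split_by_host and the DIR/<vhost>.log layout are made up for
illustration, and it assumes the vhost is the first field of every line:

```shell
# split_by_host DIR (hypothetical helper): read log lines on stdin,
# take the first field as the vhost, and append the rest of the line
# to DIR/<vhost>.log, flushing after every line.
split_by_host() {
    awk -v dir="$1" '
    NF == 0 { next }                      # skip blank lines
    {
        host = tolower($1)
        sub(/^www\./, "", host)
        # anything that is not a plain hostname goes to a catch-all file
        if (host !~ /^[a-z0-9._-]+$/ || host ~ /\.\./) host = "access"
        file = dir "/" host ".log"
        sub(/^[^ \t]+[ \t]+/, "")         # drop the vhost field from the line
        print >> file
        fflush(file)                      # write through immediately
    }'
}
```

It would take the place of split-logfile at the end of the pipe, for
example "... | sed -u -e 's#^www\.##' | split_by_host /var/log/varnish &"
(the directory is only an example). As far as I know mawk buffers its
input from a pipe unless started with -W interactive, so gawk is
probably the safer choice here.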

Perhaps the Apache split-logfile is not the same as the splitlogs from
cPanel?
Where could I download a version to test?

Another thing is the naming of the vhosts. You never actually know what
clients will request.
The Baidu spider, for example, requests hosts like 333.domain.com,
ww3.domain.com or wwq.domain.com.

In your pipe command I see sed removing the www. subdomain, but you can
never know what else comes in.

In Apache I can easily filter all those subdomains and redirect to the
proper domain.

Even in Varnish VCL I can normalize the host.

I'm not sure how to do this in varnishncsa. Is there a way to do that?
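
One option that does not depend on varnishncsa at all would be to
normalize the host in the pipe itself, against a list of the domains
actually hosted. Below is a rough sketch of such a filter. The function
name normalize_vhost, the domains file and the "unknown" catch-all name
are assumptions for illustration. It also assumes the vhost is the first
field and strips a trailing :port, which would not fit a format that
needs the port kept:

```shell
# normalize_vhost DOMAINS_FILE (hypothetical helper): rewrite the first
# field of each stdin line to the listed domain it belongs to, or to
# "unknown" if no listed domain matches. DOMAINS_FILE has one domain per line.
normalize_vhost() {
    awk -v list="$1" '
    BEGIN {
        while ((getline d < list) > 0)
            if (d != "") known[tolower(d)] = 1
    }
    NF == 0 { next }                       # skip blank lines
    {
        host = tolower($1)
        sub(/:[0-9]+$/, "", host)          # drop an optional :port
        target = "unknown"
        # strip leading labels until the remainder is a listed domain
        while (host != "") {
            if (host in known) { target = host; break }
            if (index(host, ".") == 0) break
            sub(/^[^.]*\./, "", host)
        }
        rest = $0
        sub(/^[^ \t]+/, "", rest)          # keep the rest of the line untouched
        print target rest
        fflush()                           # do not hold lines back in the pipe
    }'
}
```

With domain.com in the list, 333.domain.com, ww3.domain.com and
wwq.domain.com would all end up logged as domain.com, and anything
unlisted would land in a single catch-all log instead of creating a new
file per bogus hostname.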

Anyway, I think a single varnishncsa process with the log writing
shifted away is the way to go!
But it still needs a bit of tweaking if you ask me ;)

thanks & greetings
becki

On 13.11.2016 at 16:01, Andrei wrote:
> Hello,
>
> By not using splitlogs or an intermediary script, you will be forced 
> to run multiple instances of varnishncsa, which isn't optimal if you 
> host multiple domains. The more traffic/domains you have, the more 
> resources you will consume on parsing the same data across multiple 
> channels. Yes, varnishncsa supports VSL, however I think your approach 
> is a bit off (no offense). As in, you need to shift the log writing 
> process away from varnishncsa. By doing so, you only have one instance 
> of varnishncsa using resources to gather the data, which is then fed 
> to a parser that handles the per domain log splits and writes. That's 
> where 'splitlogs' came into play.
>
> As your question does raise some interest in the cPanel community 
> (myself and Miguel González on this list for example), I threw 
> together a quick Perl script that, in short, pipes and parses data 
> between varnishncsa and the splitlogs binary for cache hits. This lets 
> splitlogs handle the queued log writes, which are later parsed for 
> cPanel bandwidth usage and graphs, webalizer, awstats, logaholic, etc.: 
> https://github.com/AndreiG6/vscp
>
>
>
> On Sun, Nov 13, 2016 at 12:18 PM, Admin Beckspaced 
> <admin at beckspaced.com> wrote:
>
>     Same Hello here ;)
>
>     I had a more in-depth look at the manual and figured out that
>     varnishncsa does support VSL queries,
>     so one could filter on the Host request header:
>
>     varnishncsa -q "ReqHeader ~ '^Host: .*example.com'"
>
>     which would produce a log for a specific domain only
>
>     it then would need multiple varnishncsa instances for logging per
>     domain, which I found here:
>
>     https://kevops.com/2015/11/varnish-logging-per-host-with-init-script/
>
>     I use varnish version 5 and then there would be no need for
>     splitlog and the logs would be created directly.
>
>     please correct me if I'm wrong?
>
>     thanks for your time & help
>     Becki
>
>
>     On 12.11.2016 at 17:05, Andrei wrote:
>
>         Hello again,
>
>         My apologies for not explaining my thoughts better earlier
>         then. Afaik, varnishncsa does not have a native method to
>         split output based on different parameters. The method I was
>         thinking of was based on piping varnishncsa output through
>         splitlogs (or similar) for the log processing and writeouts.
>         Since replying earlier, I've got this working on a cPanel
>         server with piped logging enabled for Apache using the
>         following two for example (X-Port is a custom header set in
>         vcl_recv related to SSL offloading, but you can use a static
>         value or similar custom header):
>
>         varnishncsa -F "%{HOST}i:%{X-Port}i %h %l %u %t \"%m %U%q %H\" %s %b \"%{Referer}i\" \"%{User-agent}i\"" \
>           | sed -e 's#^www\.##g' \
>           | /usr/local/cpanel/bin/splitlogs --main=`hostname` --mainout=/usr/local/apache/logs/access_log
>
>         varnishncsa -F "%{HOST}i %{%s}t %b ." \
>           | sed -e 's#^www\.##g' \
>           | /usr/local/cpanel/bin/splitlogs --main=`hostname` --suffix=-bytes_log
>
>         The above pipes the requests to the splitlogs binary which
>         queues then writes to separate logs per domain, that are
>         later processed by the cPanel log stats apps. Either way, I
>         believe you need an intermediary script to queue and write
>         the log entries per domain. While looking into this process,
>         I ran across this little tidbit which you may find of use
>         https://gist.github.com/garlandkr/4954272 for logstash style
>         output.
>
>
>
>
>     _______________________________________________
>     varnish-misc mailing list
>     varnish-misc at varnish-cache.org
>     https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc
>
>