Centralizing varnish logs
Guillaume Quintard
guillaume.quintard at gmail.com
Wed Jan 11 07:51:49 UTC 2023
Hi Justin, happy new year!
Without getting too much into the details, it should look like a basic shell
command with a few pipes. Splunk, for example, has the Universal Forwarder,
which pushes logs to the server, where you can then review and search the
ingested logs.
The main issue is pushing something meaningful to the log collector, and
this is where things are a bit lacking: it's better to push structured
info, and varnish isn't great at that yet.
For example, for logs, you have about three choices:
- varnishncsa: treat each line as a string and be done with it. It's not
great, as you'll be forced to use regexes to filter requests, since you
just logged a string.
- varnishncsa -j: better. You can carefully craft a format line to look
like an LDJSON object, and the log analyzer (I know Splunk does it, at
least) will then let you look for "resp.status == 200 && req.url ~ /foo/".
The annoyance is that you need to explicitly decide which headers you want
to log, and the format line/file is going to be disgustingly verbose and
painful to maintain.
- Varnish Enterprise's varnishlog has LDJSON output support, which is great
and as comprehensive as you can get. But it can be too verbose (i.e.
storage-heavy), it's only in Varnish Enterprise, and it logs everything,
including the headers that got deleted/modified.
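To illustrate the second option: assuming you've crafted a varnishncsa -j
format line that emits one flat LDJSON record per request (the field names
below are illustrative, not anything varnishncsa emits out of the box),
filtering downstream is a one-liner. A minimal Python sketch:

```python
import json
import re

# Sample LDJSON lines, as a hand-crafted "varnishncsa -j" format line
# could produce. The "status"/"url"/"handling" names are assumptions.
lines = [
    '{"status": 200, "url": "/foo.html", "handling": "miss"}',
    '{"status": 404, "url": "/bar.html", "handling": "pass"}',
    '{"status": 200, "url": "/baz.css", "handling": "hit"}',
]

# The equivalent of the analyzer query 'status == 200 && url ~ /foo/':
hits = [
    rec for rec in map(json.loads, lines)
    if rec["status"] == 200 and re.search(r"foo", rec["url"])
]
```

The point is that once the line is structured, the filter operates on typed
fields instead of a regex over the whole log line.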
I believe that what we need is a JSON logger that just logs a synthetic
view of the transaction; something like this, for example:
{
  "req": {
    "method": "GET",
    "url": "/foo.html",
    "headers": [
      { "name": "host", "value": "example.com" },
      { "name": "accept-encoding", "value": "gzip" }
    ],
    "start_time": 123456789,
    "end_time": 123456790,
    "bytes": { "headers": 67, "body": 500, "total": 567 },
    "processing": "miss"
  },
  "resp": {...},
  "bereq": {...}
}
We already have all the information in varnishlog; it's just a matter of
formatting it correctly. With that, you'd have something that's easily
filtered and more natural and comprehensive than what we currently have.
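As a sketch of how such synthetic records could be consumed (the nested
req/resp schema here is the proposal above, not the output of any existing
tool), a query like "resp.status == 200 && req.url ~ /foo/" becomes a plain
traversal of the structure:

```python
import json
import re

# One synthetic transaction per LDJSON line, shaped like the proposed
# schema: nested "req" and "resp" objects (fields abridged).
transactions = [json.loads(line) for line in [
    '{"req": {"method": "GET", "url": "/foo.html", "processing": "miss"},'
    ' "resp": {"status": 200}}',
    '{"req": {"method": "GET", "url": "/bar.html", "processing": "hit"},'
    ' "resp": {"status": 304}}',
]]

# resp.status == 200 && req.url ~ /foo/ as a comprehension:
matches = [
    tx for tx in transactions
    if tx["resp"]["status"] == 200
    and re.search(r"foo", tx["req"]["url"])
]
```

Because each transaction is one self-contained object, no correlation or
regex reconstruction is needed on the collector side.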
It turns out it's been on my mind for a while, and I intend to get to it,
but for now I'm having way too much fun with rust, vmods and backends to
promise any commitment.
HOWEVER, if somebody wants to code some C/rust to scratch that itch, I'll
be happy to lend a hand!
Does this make sense?
--
Guillaume Quintard
On Tue, Jan 10, 2023 at 2:22 PM Justin Lloyd <justinl at arena.net> wrote:
> Hi all,
>
>
>
> I need to centralize logs of multiple Varnish servers in my web server
> environments, generally just 4 or 6 servers depending on the environment.
> I’d like to be able to do this either with Splunk or an Amazon OpenSearch
> cluster, i.e., a managed ELK stack. However, not having worked with either
> tool for such a purpose, I’m not clear on how I could then review, replay,
> etc. the centralized logs similar to the output from tools like
> *varnishlog* and *varnishtop*. Are there existing tools for handling
> Varnish logs in these kinds of centralized log management systems, or would
> I be somewhat constrained on what I could do with the stored logs? Aside
> from the benefit of unifying the logs across all of my web servers, I am
> trying to reduce how much I need to log in to the individual log servers to
> monitor ongoing issues, etc.
>
>
>
> FWIW, I haven’t checked how much log data our production web servers
> generate in a day, but when I checked several years ago (before moving into
> AWS and when the sites were much smaller), it was on the order of like 1 GB
> per day per server.
>
>
>
> Thanks,
>
> Justin
>
>
> _______________________________________________
> varnish-misc mailing list
> varnish-misc at varnish-cache.org
> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc
>