Hit ratio dropped significantly after recent upgrades

Justin Lloyd justinl at arena.net
Fri Dec 9 18:08:27 CET 2016


We may be seeing this issue, though I can't confirm since all of my servers are running 4.1.1 now. I calculate my hit ratio per server from the total number of Varnish connections and how many requests are sent to Apache. Again, for reference from my first email, here's my Graphite dashboard function for generating my hit ratio graph:

asPercent(
    diffSeries(
        linux.hostname.varnish-default-connections.connections-received,
        linux.hostname.varnish-default-backend.http_requests-requests
    ),
    linux.hostname.varnish-default-connections.connections-received
)

I also look at varnishstat to confirm. Here's about a 10-minute sample:

NAME                     CURRENT        CHANGE       AVERAGE        AVG_10       AVG_100      AVG_1000
MAIN.client_req          7937622          0.00         47.00          5.27         83.94         60.92
MAIN.backend_req         6341302         51.91         37.00         38.97         53.59         43.80
MAIN.cache_hit           2267216          0.00         13.00          1.08         25.91         18.19
MAIN.cache_miss          4906402          0.00         29.00          1.50         43.93         37.26

So I don't think our problem is a cache_hit/miss value calculation issue, since the number of backend requests is close to the number of client requests, which means the cache really is underperforming.
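
As a rough cross-check, the arithmetic behind both ratios can be sketched in Python using the CURRENT column of the varnishstat sample. This is only an illustration: the function names are hypothetical, and it assumes MAIN.client_req can stand in for the connections-received metric used in the Graphite formula, which may not be exactly equivalent.

```python
# Counter values copied from the CURRENT column of the varnishstat sample.
counters = {
    "client_req": 7937622,
    "backend_req": 6341302,
    "cache_hit": 2267216,
    "cache_miss": 4906402,
}


def request_based_ratio(client_req, backend_req):
    """Share of client requests that did not lead to a backend request.

    Mirrors the Graphite formula: (received - backend) / received.
    Assumption: client_req approximates connections-received.
    """
    if client_req == 0:
        return 0.0
    return (client_req - backend_req) / client_req


def counter_based_ratio(cache_hit, cache_miss):
    """Hit ratio taken from Varnish's own hit and miss counters."""
    lookups = cache_hit + cache_miss
    if lookups == 0:
        return 0.0
    return cache_hit / lookups


by_requests = request_based_ratio(counters["client_req"], counters["backend_req"])
by_counters = counter_based_ratio(counters["cache_hit"], counters["cache_miss"])

print(f"request-based ratio: {by_requests:.1%}")
print(f"counter-based ratio: {by_counters:.1%}")
```

With the sample above, the request-based ratio comes out at roughly 20% and the counter-based ratio at roughly 32%, so both methods are far below the 86%-ish figure from before the upgrade.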

To reiterate a point from another of my responses in this thread, I think it may be something about MediaWiki thumbnail images not being cached properly. Our current VCL for thumbnails has not changed from how it worked prior to the upgrade, when we were seeing a very high (86%-ish) hit ratio from the same formula.

-----Original Message-----
From: varnish-misc-bounces+justinl=arena.net at varnish-cache.org [mailto:varnish-misc-bounces+justinl=arena.net at varnish-cache.org] On Behalf Of Justin Lloyd
Sent: Friday, December 9, 2016 5:44 AM
To: Dag Haavi Finstad <daghf at varnish-software.com>
Cc: varnish-misc at varnish-cache.org
Subject: RE: Hit ratio dropped significantly after recent upgrades

Hello! Yes, it is Varnish 4.1.1-1 from the Ubuntu 16.04 repo. I'll look at the issue you've linked and see if I can match it to our situation. Thanks!

Justin

-----Original Message-----
From: Dag Haavi Finstad [mailto:daghf at varnish-software.com]
Sent: Friday, December 9, 2016 4:47 AM
To: Justin Lloyd <justinl at arena.net>
Cc: Dridi Boukelmoune <dridi at varni.sh>; Jason Price <japrice at gmail.com>; varnish-misc at varnish-cache.org
Subject: Re: Hit ratio dropped significantly after recent upgrades

Hi

Is this Varnish 4.1?

We have an unsolved bug open describing something very similar,
https://github.com/varnishcache/varnish-cache/issues/1859

On Thu, Dec 8, 2016 at 9:26 PM, Justin Lloyd <justinl at arena.net> wrote:
> I have been doing a lot of digging with varnishtop and varnishlog, and our VCL really didn’t change from this upgrade except as needed to migrate from Varnish 3 to 4. As I mentioned, our web app is MediaWiki, so we don't control its caching requirements and recommendations. What I'm trying to understand is whether the drop in the hit rate is due to some change(s) in MediaWiki's cookie and/or cache handling (e.g. via Cache-Control and Set-Cookie headers) or if something in Varnish changed that affects how it determines things. For example, a while back I had been using the Varnish hit and miss metrics in Collectd to calculate the ratio, but apparently how those values are calculated with respect to purges changed, so the hit ratio dropped, causing me to change the ratio calculation to use incoming connections and backend requests instead.
>
> That said, based on my varnishlog and varnishtop testing, I have a strong feeling that the biggest part of the problem is thumbnail images. If you look again at my VCL code (https://gist.github.com/Calygos/105957a997ea3bde6b8257a1f34bbd20), you can see I strip cookies from thumbnails so they should get cached, but I seem to get a lot more misses than hits when watching for thumbnail URL requests through varnishtop. I give 8 GB to Varnish and its process is typically only around 1 to 2 GB, whereas previously it would be at 8 GB with frequent nukes and the occasional spike of expires that would temporarily eliminate nukes while memory filled up again. For what it's worth, I added the thumbnail stripping a couple of years ago due to a performance issue and it helped tremendously, so I don't know why it would become problematic with these latest upgrades.
>
> Justin
>
> -----Original Message-----
> From: Dridi Boukelmoune [mailto:dridi at varni.sh]
> Sent: Thursday, December 8, 2016 6:49 AM
> To: Jason Price <japrice at gmail.com>
> Cc: Justin Lloyd <justinl at arena.net>; varnish-misc at varnish-cache.org
> Subject: Re: Hit ratio dropped significantly after recent upgrades
>
> On Thu, Dec 8, 2016 at 2:35 PM, Jason Price <japrice at gmail.com> wrote:
>> I think we're going to need something a little more specific to go on.
>> That is a mile of changes all at once.
>
> Yes: varnishlog, coffee, and a lot of patience.
>
>> Finding a single request that should be cached but isn't, and
>> producing the varnish log for that request, will probably help illuminate what's going on.
>
> There's currently no way to query the transaction log of a specific request:
> https://github.com/varnishcache/varnish-cache/issues/2154
>
> I'm just saying...
>
> Dridi
> _______________________________________________
> varnish-misc mailing list
> varnish-misc at varnish-cache.org
> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc



--
Dag Haavi Finstad
Software Developer | Varnish Software
Mobile: +47 476 64 134
We Make Websites Fly!

