Best practice for caching scenario with different backend servers but same content

Guillaume Quintard guillaume at varnish-software.com
Sun Aug 1 16:30:23 UTC 2021


Hi,

There are a lot of things to unpack here.

> if a varnish for specific file requests to one backend server and for the
> same file but to another backend server it would cache that file again
> because of different Host headers ! so my solution is using fallback
> director instead of round-robin

The two aren't related: if you have a hashing problem causing you to cache
the same object twice, changing directors isn't going to save you.
Ideally, the requests get normalized (Host header and path) in vcl_recv{}
so that they are hashed consistently in vcl_hash{}.
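
In practice that can look like this (a minimal sketch; "origin.example.com"
is a placeholder for whatever canonical name your origins accept):

```
sub vcl_recv {
    # All origins serve the same content, so force one canonical Host
    # header; otherwise each origin name creates a duplicate cache object.
    set req.http.Host = "origin.example.com";

    # Drop a dangling "?" so "/a.ts?" and "/a.ts" hash identically.
    if (req.url ~ "\?$") {
        set req.url = regsub(req.url, "\?$", "");
    }
}
```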

The backend resolution only happens after you have exited
vcl_backend_fetch{}, long after you have (or haven't) found the object in
the cache, and the best solution for video is usually consistent hashing.
In open-source Varnish this means vmod_shard (
https://varnish-cache.org/docs/trunk/reference/vmod_directors.html#directors-shard);
in Enterprise, it'll be udo (
https://docs.varnish-software.com/varnish-cache-plus/vmods/udo/#set-hash).
They behave roughly the same, except that udo makes it easier to set the
hash, which may be important for live (more info below).

With consistent hashing, you can configure all Varnish servers the same,
and they will determine which backend to use based on the request. As long
as the backend list is stable, the same request will always go to the same
backend. This provides pretty good load balancing over time, and
additionally it leverages the internal caching that most video origins
have.
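
As a rough sketch (the backend names and addresses are made up, and only
two of your ten origins are shown), the open-source setup with vmod_shard
looks something like this:

```
vcl 4.1;

import directors;

backend b1 { .host = "192.0.2.1"; .port = "80"; }
backend b2 { .host = "192.0.2.2"; .port = "80"; }

sub vcl_init {
    new hls_cluster = directors.shard();
    hls_cluster.add_backend(b1);
    hls_cluster.add_backend(b2);
    # Rebuild the hash ring once the backend list is complete.
    hls_cluster.reconfigure();
}

sub vcl_backend_fetch {
    # Hash the request URL so every Varnish server maps the same URL
    # to the same origin, with no coordination between the caches.
    set bereq.backend = hls_cluster.backend(by=URL);
}
```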

If you are serving VOD, that is all you need, but if you are serving live,
you need to care about one other thing: you want consistent hashing not per
request, but per stream. Because the origins may be slightly out of sync,
you may get a manifest from origin A that advertises a chunk which isn't
available anywhere else yet; if you then don't fetch that new chunk from
origin A, you'll get a 404 or a 412.
So, for live, you will need to use shard's key() (
https://varnish-cache.org/docs/trunk/reference/vmod_directors.html#int-xshard-key-string)
or udo's set_hash() (
https://docs.varnish-software.com/varnish-cache-plus/vmods/udo/#set-hash)
to create a hash based on the stream path (see the sketch after the
examples below).

For example, consider these paths:
- /live/australia/Channel5/480p/manifest.m3u8 and
  /live/australia/Channel5/480p/chunk_43212123.ts: the stream path is
  /live/australia/Channel5/480p/
- /video/live/52342645323/manifest.dash and
  /video/live/52342645323/manifest.dash?time=4216432432&bitrate=80000000&audio=523453:
  the stream path is /video/live/52342645323/manifest.dash
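
Here is a minimal sketch for the first (HLS) layout using shard's key();
the regex assumes manifests and chunks live in the same directory, so
adjust it to your own URL scheme:

```
sub vcl_backend_fetch {
    # Key on the stream path (everything up to and including the last
    # "/"), so a manifest and all of its chunks hash to the same origin.
    set bereq.backend = hls_cluster.backend(
        by=KEY,
        key=hls_cluster.key(regsub(bereq.url, "[^/]+(\?.*)?$", "")));
}
```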

On top of all this, if you start having more than 5 Varnish servers, you
might want to consider adding an extra layer of caching between the
client-facing Varnish nodes and the origins (origin shields) to reduce
the load on the origins. In that case, the shields would be the ones
handling the consistent hashing.
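
For illustration (the shield hosts below are hypothetical), the
client-facing tier then only needs plain load balancing, and the shard/udo
logic shown above moves to the shields:

```
import directors;

# Hypothetical shield tier; each shield runs the shard setup shown above.
backend shield1 { .host = "198.51.100.1"; .port = "80"; }
backend shield2 { .host = "198.51.100.2"; .port = "80"; }

sub vcl_init {
    new shields = directors.round_robin();
    shields.add_backend(shield1);
    shields.add_backend(shield2);
}

sub vcl_backend_fetch {
    set bereq.backend = shields.backend();
}
```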

Hope this helps

-- 
Guillaume Quintard


On Sun, Aug 1, 2021 at 4:18 AM Hamidreza Hosseini <hrhosseini at hotmail.com>
wrote:

> Hi,
> I want to use varnish in my scenario as cache service, I have about 10
> http servers that serve Hls fragments as the backend servers and about 5
> varnish servers for caching purpose, the problem comes in when I use
> round-robin director for backend servers in varnish,
> if a varnish for specific file requests to one backend server and for the
> same file but to another backend server it would cache that file again
> because of different Host headers ! so my solution is using fallback
> director instead of round-robin as follow:
>
> ```
> In varnish-1:
>     new hls_cluster = directors.fallback();
>     hls_cluster.add_backend(b1());
>     hls_cluster.add_backend(b2());
>     hls_cluster.add_backend(b3());
>     hls_cluster.add_backend(b4());
>     hls_cluster.add_backend(b5());
>     hls_cluster.add_backend(b6());
>     hls_cluster.add_backend(b7());
>     hls_cluster.add_backend(b8());
>     hls_cluster.add_backend(b9());
>     hls_cluster.add_backend(b10());
>
>
>
> In varnish-2:
>     new hls_cluster = directors.fallback();
>     hls_cluster.add_backend(b10());
>     hls_cluster.add_backend(b1());
>     hls_cluster.add_backend(b2());
>     hls_cluster.add_backend(b3());
>     hls_cluster.add_backend(b4());
>     hls_cluster.add_backend(b5());
>     hls_cluster.add_backend(b6());
>     hls_cluster.add_backend(b7());
>     hls_cluster.add_backend(b8());
>     hls_cluster.add_backend(b9());
>
>
> In varnish-3:
>     new hls_cluster = directors.fallback();
>     hls_cluster.add_backend(b9());
>     hls_cluster.add_backend(b1());
>     hls_cluster.add_backend(b2());
>     hls_cluster.add_backend(b3());
>     hls_cluster.add_backend(b4());
>     hls_cluster.add_backend(b5());
>     hls_cluster.add_backend(b6());
>     hls_cluster.add_backend(b7());
>     hls_cluster.add_backend(b8());
>     hls_cluster.add_backend(b10());
>
> ```
> But I think this is not the best solution, because there is no load
> balancing, despite using a different backend as the first argument of
> the fallback director.
> What is the Varnish recommendation for this scenario?
>
>
>
> _______________________________________________
> varnish-misc mailing list
> varnish-misc at varnish-cache.org
> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc
>