From slink at schokola.de Fri Dec 1 11:40:43 2017 From: slink at schokola.de (Nils Goroll) Date: Fri, 1 Dec 2017 12:40:43 +0100 Subject: Varnish Lurker is getting slower / Ban lists keeps increasing In-Reply-To: <996d26d4-29ba-6c75-cc11-682d99bc08c2@schokola.de> References: <12ca81a6-44fa-bfa0-046d-b051cabc92b6@schokola.de> <996d26d4-29ba-6c75-cc11-682d99bc08c2@schokola.de> Message-ID: <48fcdb1c-da05-a2a0-f89b-364502f9c970@schokola.de> On 30/11/17 20:40, Nils Goroll wrote: > FTR: I'll continue working on this with Olivier 1:1 and will report the outcome > here. Summary: Olivier's varnish instance's ban lurker is evaluating ~1M bans/s, which is a normal rate. The instance at hand was not configured to use the cutoff. The ban regular expressions can be optimized, but that will most likely not avoid the problem. Other options I proposed: - reduce the number of bans - reduce the number of objects - use the cutoff - sponsor more research and possibly development to further improve ban lurker performance Nils From lagged at gmail.com Fri Dec 1 11:59:25 2017 From: lagged at gmail.com (Andrei) Date: Fri, 1 Dec 2017 05:59:25 -0600 Subject: Upstream connection between nginx to varnish - HTTP2 In-Reply-To: <36bf3412c60d472fbb777daaffbc7e53@netmatch.nl> References: <36bf3412c60d472fbb777daaffbc7e53@netmatch.nl> Message-ID: How do you guys deal with large request rates when offloading SSL? For example, Hitch/Nginx do 5k requests/second, which then get sent to Varnish and that's another 5k requests/second. This causes a tremendous spike in internal connections which ultimately increases resource consumption two-fold, and reduces the maximum number of requests per second which can be handled via HTTPS. This is honestly one of the main reasons I'm considering moving over to Nginx which can handle both, to cut down on the tremendous amount of internal connections/requests, especially considering UNIX sockets cannot be used yet. 
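For reference, the usual shape of that offloading hop is Hitch terminating TLS and forwarding over loopback TCP with the PROXY protocol, so Varnish still sees the real client addresses. A minimal hitch.conf sketch (the port, worker count and certificate path are illustrative, not taken from this thread):

```
# hitch.conf -- terminate TLS in front of Varnish
frontend = "[*]:443"
backend  = "[127.0.0.1]:6086"    # loopback hop to Varnish
write-proxy-v2 = on              # send PROXY v2 so Varnish sees client IPs
workers = 4
pem-file = "/etc/hitch/site.pem"
```

Varnish then needs a matching PROXY listener, e.g. varnishd -a :80 -a 127.0.0.1:6086,PROXY. The hop is cheap over loopback, but as described above it still doubles the connection count.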
Over HTTP we can easily serve 10k+ reqs/s; however, 10k+ reqs/s over HTTPS causes some massive issues, even on a tuned 32-core server with 128GB RAM + NVMe disks.

On Thu, Nov 30, 2017 at 1:09 PM, Angelo Höngens wrote:
> Yup, hitch works fine.
>
> *From:* varnish-misc [mailto:varnish-misc-bounces+a.hongens=netmatch.nl at varnish-cache.org] *On Behalf Of *Prem Kumar
> *Sent:* Wednesday, 29 November, 2017 09:24
> *To:* varnish-misc at varnish-cache.org
> *Subject:* Upstream connection between nginx to varnish - HTTP2
>
> Hi,
>
> Trying to use the nginx to varnish connection as http2 instead of http1.1.
>
> https://www.nginx.com/blog/http2-module-nginx/#QandA
>
> Currently nginx does not support upstream http2.
>
> Does anyone know whether the hitch SSL proxy will do an upstream connection to varnish through http2? Or would any other SSL proxy do an http2 connection?
>
> Thanks for the help.
>
> Thanks,
> Prem
>
> _______________________________________________
> varnish-misc mailing list
> varnish-misc at varnish-cache.org
> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From olivier.hanesse at gmail.com Fri Dec 1 14:34:33 2017
From: olivier.hanesse at gmail.com (Olivier Hanesse)
Date: Fri, 1 Dec 2017 15:34:33 +0100
Subject: Varnish Lurker is getting slower / Ban lists keeps increasing
In-Reply-To: <48fcdb1c-da05-a2a0-f89b-364502f9c970@schokola.de>
References: <12ca81a6-44fa-bfa0-046d-b051cabc92b6@schokola.de> <996d26d4-29ba-6c75-cc11-682d99bc08c2@schokola.de> <48fcdb1c-da05-a2a0-f89b-364502f9c970@schokola.de>
Message-ID:

Thanks a lot for your analysis Nils.

2017-12-01 12:40 GMT+01:00 Nils Goroll :
> On 30/11/17 20:40, Nils Goroll wrote:
> > FTR: I'll continue working on this with Olivier 1:1 and will report the outcome
> > here.
> > Summary:
>
> Olivier's varnish instance's ban lurker is evaluating ~1M bans/s, which is a normal rate. The instance at hand was not configured to use the cutoff. The ban regular expressions can be optimized, but that will most likely not avoid the problem. Other options I proposed:
>
> - reduce the number of bans
> - reduce the number of objects
> - use the cutoff
> - sponsor more research and possibly development to further improve ban lurker performance
>
> Nils

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rmoisa at yahoo.com Sat Dec 9 09:05:37 2017
From: rmoisa at yahoo.com (Radu Moisa)
Date: Sat, 9 Dec 2017 09:05:37 +0000 (UTC)
Subject: Varnish sending incomplete responses when nuking objects
References: <260343594.1670959.1512810337124.ref@mail.yahoo.com>
Message-ID: <260343594.1670959.1512810337124@mail.yahoo.com>

Hi,

I have an issue with varnish (v5.2.1) returning incomplete responses when the cache gets full and it starts nuking objects. The request that triggered the object nuke is returned incomplete (tested with curl) and the python requests library complains with "ChunkedEncodingError: ('Connection broken: IncompleteRead(0 bytes read)', IncompleteRead(0 bytes read))".

Do you see anything wrong in the VCL file? Should there be a mandatory return statement in the vcl_recv function?

vcl 4.0;

backend pub1 {
    .host = "pub1.example.com";
    .port = "80";
    .probe = {
        .url = "/";
        .timeout = 5s;
        .interval = 10s;
        .window = 5;
        .threshold = 3;
    }
    .connect_timeout = 10s;
    .first_byte_timeout = 900s;
    .between_bytes_timeout = 900s;
}

backend pub2 {
    .host = "pub2.example.com";
    .port = "80";
    .probe = {
        .url = "/";
        .timeout = 5s;
        .interval = 10s;
        .window = 5;
        .threshold = 3;
    }
    .connect_timeout = 10s;
    .first_byte_timeout = 900s;
    .between_bytes_timeout = 900s;
}

# Enables use of "Cache-Control: no-cache"
sub vcl_recv {
    if (req.http.Cache-Control ~ "no-cache") {
        set req.hash_always_miss = true;
    }

    set req.backend_hint = pub1;

    if (req.http.host == "pub1-cache.example.com") {
        set req.backend_hint = pub1;
    }

    if (req.http.host == "pub2-cache.example.com") {
        set req.backend_hint = pub2;
    }
}

# Use smaller TTL for non 200 backend response codes
sub vcl_backend_response {
    if (beresp.status != 200) {
        set beresp.ttl = 60s;
    }
}

Thanks a lot!
Radu

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From carlos.abalde at gmail.com Sat Dec 9 14:32:46 2017
From: carlos.abalde at gmail.com (Carlos Abalde)
Date: Sat, 9 Dec 2017 15:32:46 +0100
Subject: Varnish sending incomplete responses when nuking objects
In-Reply-To: <260343594.1670959.1512810337124@mail.yahoo.com>
References: <260343594.1670959.1512810337124.ref@mail.yahoo.com> <260343594.1670959.1512810337124@mail.yahoo.com>
Message-ID:

Hi Radu,

Try increasing 'nuke_limit' (default value is 50). Check out https://github.com/varnishcache/varnish-cache/issues/1764 for details.

Best,

--
Carlos Abalde

> On 9 Dec 2017, at 10:05, Radu Moisa wrote:
>
> Hi,
>
> I have an issue with varnish (v5.2.1) returning incomplete responses when the cache gets full and it starts nuking objects.
> The request that triggered the object nuke is returned incomplete (tested with curl) and the python requests library complains with "ChunkedEncodingError: ('Connection broken: IncompleteRead(0 bytes read)', IncompleteRead(0 bytes read))".
>
> Do you see anything wrong in the VCL file? Should there be a mandatory return statement in the vcl_recv function?
>
> vcl 4.0;
>
> backend pub1 {
>     .host = "pub1.example.com";
>     .port = "80";
>     .probe = {
>         .url = "/";
>         .timeout = 5s;
>         .interval = 10s;
>         .window = 5;
>         .threshold = 3;
>     }
>     .connect_timeout = 10s;
>     .first_byte_timeout = 900s;
>     .between_bytes_timeout = 900s;
> }
>
> backend pub2 {
>     .host = "pub2.example.com";
>     .port = "80";
>     .probe = {
>         .url = "/";
>         .timeout = 5s;
>         .interval = 10s;
>         .window = 5;
>         .threshold = 3;
>     }
>     .connect_timeout = 10s;
>     .first_byte_timeout = 900s;
>     .between_bytes_timeout = 900s;
> }
>
> # Enables use of "Cache-Control: no-cache"
> sub vcl_recv {
>     if (req.http.Cache-Control ~ "no-cache") {
>         set req.hash_always_miss = true;
>     }
>
>     set req.backend_hint = pub1;
>
>     if (req.http.host == "pub1-cache.example.com") {
>         set req.backend_hint = pub1;
>     }
>
>     if (req.http.host == "pub2-cache.example.com") {
>         set req.backend_hint = pub2;
>     }
> }
>
> # Use smaller TTL for non 200 backend response codes
> sub vcl_backend_response {
>     if (beresp.status != 200) {
>         set beresp.ttl = 60s;
>     }
> }
>
> Thanks a lot!
> Radu
>
> _______________________________________________
> varnish-misc mailing list
> varnish-misc at varnish-cache.org
> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From y.karayiannidis at stoiximan.gr Sun Dec 10 10:46:26 2017
From: y.karayiannidis at stoiximan.gr (Yiannis Karayiannidis)
Date: Sun, 10 Dec 2017 12:46:26 +0200
Subject: Mark of a varnish backend as sick
In-Reply-To:
References:
Message-ID:

Hi all,
I've got a question regarding the marking of a Varnish backend as sick.

Let's assume that my backend server is healthy and I manually set that particular backend as sick; what will happen to existing open connections?

To be more specific, is the sick operation the same as Haproxy's drain operation, i.e. is it a graceful operation?

Regards

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From guillaume at varnish-software.com Sun Dec 10 12:49:46 2017
From: guillaume at varnish-software.com (Guillaume Quintard)
Date: Sun, 10 Dec 2017 13:49:46 +0100
Subject: Mark of a varnish backend as sick
In-Reply-To:
References:
Message-ID:

Yup, completely graceful :-) Being sick only prevents new connections from being opened.

--
Guillaume Quintard

On Sun, Dec 10, 2017 at 11:46 AM, Yiannis Karayiannidis < y.karayiannidis at stoiximan.gr> wrote:
> Hi all,
> I've got a question regarding the marking of a Varnish backend as sick.
>
> Let's assume that my backend server is healthy and I manually set that particular backend as sick; what will happen to existing open connections?
>
> To be more specific, is the sick operation the same as Haproxy's drain operation, i.e. is it a graceful operation?
>
> Regards
>
> _______________________________________________
> varnish-misc mailing list
> varnish-misc at varnish-cache.org
> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From carlos.abalde at gmail.com Mon Dec 11 09:09:57 2017
From: carlos.abalde at gmail.com (Carlos Abalde)
Date: Mon, 11 Dec 2017 10:09:57 +0100
Subject: Varnish sending incomplete responses when nuking objects
In-Reply-To: <2065730732.2531489.1512975086551@mail.yahoo.com>
References: <260343594.1670959.1512810337124.ref@mail.yahoo.com> <260343594.1670959.1512810337124@mail.yahoo.com> <2065730732.2531489.1512975086551@mail.yahoo.com>
Message-ID: <505B57A9-ADB8-4FA1-9913-73814E3A2C5A@gmail.com>

> On 11 Dec 2017, at 07:51, Radu Moisa wrote:
>
> Hi!
>
> Thanks a lot for the hint!
>
> Just so that I understand it better, nuke_limit is the "Maximum number of objects we attempt to nuke in order to make space for a object body."
> If I set it to something like 9999999, varnish will throw out only the number of objects needed to make room for the new request, not the nuke_limit number of objects, right?

Yes, that's right. While trying to store an object in the cache, if not enough free space is available, Varnish will nuke up to 'nuke_limit' objects. This will happen incrementally, while the object is being fetched from the backend, stored in the cache, and eventually also streamed to one or more clients. If the 'nuke_limit' is reached, the object won't be cached and client responses will be closed (so clients will end up with a truncated response).

Best,

--
Carlos Abalde

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jmathiesen at tripadvisor.com Mon Dec 11 14:10:25 2017
From: jmathiesen at tripadvisor.com (James Mathiesen)
Date: Mon, 11 Dec 2017 14:10:25 +0000
Subject: Varnish sending incomplete responses when nuking objects
In-Reply-To: <505B57A9-ADB8-4FA1-9913-73814E3A2C5A@gmail.com>
References: <260343594.1670959.1512810337124.ref@mail.yahoo.com> <260343594.1670959.1512810337124@mail.yahoo.com> <2065730732.2531489.1512975086551@mail.yahoo.com> <505B57A9-ADB8-4FA1-9913-73814E3A2C5A@gmail.com>
Message-ID: <8B00E401-B4AE-4322-8AA6-01C65EA83CC0@tripadvisor.com>

I have caching turned off at the moment because of this (not a big deal -- the cache hit rate would be very low regardless). It's a bit awkward to work around, and this is the only case I can think of where varnish would cause a request that would otherwise succeed to fail.

I'm planning to have multiple caches (a small object cache + a large object cache, for example), but routing by size would not be possible if the response used chunked transfer encoding. Setting nuke_limit very high would work with chunked transfers, but it also makes it possible for a single response to evict everything else in the cache.
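With two named stores, the small/large split could be routed in VCL; a sketch, assuming Varnish 5.1+ (the store names and the 1 MB threshold are illustrative, not from this thread). As noted, chunked responses carry no Content-Length, so they cannot be routed by size:

```vcl
vcl 4.0;

import std;

# Assumes varnishd was started with two named stores, e.g.:
#   varnishd -s small=malloc,2g -s large=file,/var/cache/large.bin,50g

sub vcl_backend_response {
    # Route by declared body size; chunked responses have no
    # Content-Length header and fall through to the small store.
    if (std.integer(beresp.http.Content-Length, 0) > 1048576) {
        set beresp.storage = storage.large;
    } else {
        set beresp.storage = storage.small;
    }
}
```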
james

From: varnish-misc on behalf of Carlos Abalde
Date: Monday, December 11, 2017 at 4:11 AM
To: Radu Moisa
Cc: varnish-misc
Subject: Re: Varnish sending incomplete responses when nuking objects

On 11 Dec 2017, at 07:51, Radu Moisa > wrote:
Hi!
Thanks a lot for the hint!
Just so that I understand it better, nuke_limit is the "Maximum number of objects we attempt to nuke in order to make space for a object body."
If I set it to something like 9999999, varnish will throw out only the number of objects needed to make room for the new request, not the nuke_limit number of objects, right?

Yes, that's right. While trying to store an object in the cache, if not enough free space is available, Varnish will nuke up to 'nuke_limit' objects. This will happen incrementally, while the object is being fetched from the backend, stored in the cache, and eventually also streamed to one or more clients. If the 'nuke_limit' is reached, the object won't be cached and client responses will be closed (so clients will end up with a truncated response).

Best,

--
Carlos Abalde

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From raph at futomaki.net Fri Dec 22 20:33:34 2017
From: raph at futomaki.net (Raphael Mazelier)
Date: Fri, 22 Dec 2017 21:33:34 +0100
Subject: Varnish nighmare after upgrading : epilogue ?
In-Reply-To: <1a28c6e7-29b2-3998-b7d4-049d0eaf1fec@futomaki.net>
References: <620367f1-0247-b457-f65c-08fc898ea6b3@futomaki.net> <89923832-10d5-2a3f-617e-22e9a9f31f7b@futomaki.net> <1a28c6e7-29b2-3998-b7d4-049d0eaf1fec@futomaki.net>
Message-ID: <29807c51-f354-a0ec-51a3-c3c16dad1b6a@futomaki.net>

On 23/11/2017 21:57, Raphael Mazelier wrote:
>
> A short follow-up on the situation and what seems to mitigate the problem / make this platform work.
> We basically doubled all the servers (VMs), but on the other hand divided the RAM, and the memory allocated to varnish, by two (or more).
> We also reverted to using malloc with little RAM (4G), 12G on the VM(s). We also added a scheduled task to flush the cache (restarting varnish).
> This is completely counter-intuitive, because nuking some entries seems better than handling a big cache with no nuking.
> In my understanding it means that our hot content remains in the cache, and nuking objects is OK. This may also mean that our TTLs on objects are completely wrong.
>
> Anyway it seems to be working. Thanks a lot to the people who helped us (and I'm sure we can find a way to repay this).

Another follow up for posterity :)

I think we have finally succeeded in restoring nominal service on our application. The main problem on the varnish side was the use of a two-stage caching pattern for non-cacheable requests. We completely misunderstood the hit for pass concept, resulting in many requests being kept in the waiting list at both stages, especially at peak. Since these requests cannot be cached, it seems that piping them at level 1 is more than enough.

To be fair, we also fixed some little things in our application code too :)

Happy Holidays.

--
Raphael Mazelier

From phk at phk.freebsd.dk Fri Dec 22 20:56:35 2017
From: phk at phk.freebsd.dk (Poul-Henning Kamp)
Date: Fri, 22 Dec 2017 20:56:35 +0000
Subject: Varnish nighmare after upgrading : epilogue ?
In-Reply-To: <29807c51-f354-a0ec-51a3-c3c16dad1b6a@futomaki.net>
References: <620367f1-0247-b457-f65c-08fc898ea6b3@futomaki.net> <89923832-10d5-2a3f-617e-22e9a9f31f7b@futomaki.net> <1a28c6e7-29b2-3998-b7d4-049d0eaf1fec@futomaki.net> <29807c51-f354-a0ec-51a3-c3c16dad1b6a@futomaki.net>
Message-ID: <23675.1513976195@critter.freebsd.dk>

--------
In message <29807c51-f354-a0ec-51a3-c3c16dad1b6a at futomaki.net>, Raphael Mazelier writes:

>On 23/11/2017 21:57, Raphael Mazelier wrote:
>Another follow up for posterity :)

Thanks for reporting; if more people did that, searching email archives for clues would be much more productive.

>We completely misunderstood the hit for pass concept

We have struggled with that one from almost day one of Varnish; any and all ideas are welcome.

>Happy Holidays.

God Jul :-)

--
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
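The hit-for-pass distinction discussed in this thread can be sketched in VCL; the following assumes Varnish 5.1 or later, where the semantics changed relative to 4.x (the conditions and the 120s duration are illustrative only):

```vcl
vcl 4.0;

sub vcl_recv {
    # Plain pass: decided per request; the request never enters
    # the waiting list in the first place.
    if (req.url ~ "^/checkout") {
        return (pass);
    }
}

sub vcl_backend_response {
    if (beresp.http.Set-Cookie) {
        # Hit-for-pass (5.1+): remember for two minutes that this
        # object is uncacheable; later requests bypass the waiting
        # list instead of queueing behind a single fetch.
        return (pass(120s));
    }
    if (beresp.http.Cache-Control ~ "no-store") {
        # Hit-for-miss (the 5.0+ meaning of beresp.uncacheable):
        # later requests are treated as misses, which keeps request
        # coalescing -- and therefore the waiting list -- in play.
        set beresp.uncacheable = true;
        set beresp.ttl = 120s;
    }
}
```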