From yassine.aouadi90 at gmail.com Mon Mar 2 11:13:53 2020 From: yassine.aouadi90 at gmail.com (Yassine Aouadi) Date: Mon, 2 Mar 2020 12:13:53 +0100 Subject: Varnishncsa Random Log Sampling ? Message-ID: Hello, I am sending my Varnish logs to a remote SaaS solution and I want to reduce log costs by implementing a server-side sampling solution. First I split varnishncsa into two services, one for error logs and the other for access logs: CGroup: /system.slice/varnishncsa-error.service ├─18458 /usr/bin/varnishncsa -c -b -a -w /var/log/varnish/varnishncsa-error.log -D -P /run/varnishncsa/varnishncsa-error.pid -f /etc/varnish/varnishncsa_logmatic.format -q *Status > 399 CGroup: /system.slice/varnishncsa.service ├─18347 /usr/bin/varnishncsa -c -b -a -w /var/log/varnish/varnishncsa-access.log -D -P /run/varnishncsa/varnishncsa-access.pid -f /etc/varnish/varnishncsa_logmatic.format -q *Status < 400 Is there any way to go further with varnishncsa and perform random sampling of my access logs? For example, write only 10% of access logs. If it's not possible with varnishncsa, any suggestion? I tried rsyslog random sampling but I am facing memory leaks while stress testing the server under high load. Thanks, Yassine From dridi at varni.sh Mon Mar 2 11:43:25 2020 From: dridi at varni.sh (Dridi Boukelmoune) Date: Mon, 2 Mar 2020 11:43:25 +0000 Subject: Varnishncsa Random Log Sampling ? In-Reply-To: References: Message-ID: On Mon, Mar 2, 2020 at 11:15 AM Yassine Aouadi wrote: > > > Hello, > > I am sending my Varnish logs to a remote SaaS solution and I want to reduce log costs by implementing a server-side sampling solution.
> > First I split varnishncsa into two services, one for error logs and the other for access logs: > > CGroup: /system.slice/varnishncsa-error.service > ├─18458 /usr/bin/varnishncsa -c -b -a -w /var/log/varnish/varnishncsa-error.log -D -P /run/varnishncsa/varnishncsa-error.pid -f /etc/varnish/varnishncsa_logmatic.format -q *Status > 399 > > CGroup: /system.slice/varnishncsa.service > ├─18347 /usr/bin/varnishncsa -c -b -a -w /var/log/varnish/varnishncsa-access.log -D -P /run/varnishncsa/varnishncsa-access.pid -f /etc/varnish/varnishncsa_logmatic.format -q *Status < 400 > > Is there any way to go further with varnishncsa and perform random sampling of my access logs? For example, write only 10% of access logs. > > If it's not possible with varnishncsa, any suggestion? I tried rsyslog random sampling but I am facing memory leaks while stress testing the server under high load. Hi, I think the closest to what you want is rate limiting, see the documentation for the varnishncsa -R option. Otherwise you can always do the sampling one step downstream: instead of sending varnishncsa-.log whenever logrotate triggers a rotation, you run a script that sends 1 line every 10 lines. But I think rate limiting with -R is simpler, and instead of a percentage that depends highly on your traffic you can actually get a limit according to a budget, since you wish to reduce costs. Dridi From geoff at uplex.de Mon Mar 2 12:18:15 2020 From: geoff at uplex.de (Geoff Simmons) Date: Mon, 2 Mar 2020 13:18:15 +0100 Subject: Varnishncsa Random Log Sampling ? In-Reply-To: References: Message-ID: <1ca1650e-da27-e336-6127-7a851b79a865@uplex.de> On 3/2/20 12:13, Yassine Aouadi wrote: > > Is there any way to go further with varnishncsa and perform random > sampling of my access logs? for example write only 10% of access logs I've done that with log queries in the varnishncsa or varnishlog command line.
For example, this query filters out all logs for which the X-Varnish header does not end in 0: -q 'RespHeader:X-Varnish !~ "0$"' https://varnish-cache.org/docs/trunk/reference/vsl-query.html The queries can use regular expressions, as this one does, so you can do things like filter for the range [0-4], [02468] for the even numbers, or whatever your imagination comes up with. If you know that your site always produces a certain kind of content, say a cookie whose value is hex digits, you can base your filter on that, say something like [89a-f]. If you use that X-Varnish example, bear in mind that it only applies to client logs. On the backend side, X-Varnish appears in backend request headers, so there it would have to be BereqHeader:X-Varnish. HTH, Geoff -- ** * * UPLEX - Nils Goroll Systemoptimierung Scheffelstraße 32 22301 Hamburg Tel +49 40 2880 5731 Mob +49 176 636 90917 Fax +49 40 42949753 http://uplex.de From veereshreddy.r at gmail.com Thu Mar 5 14:09:06 2020 From: veereshreddy.r at gmail.com (Veeresh Reddy) Date: Thu, 5 Mar 2020 19:39:06 +0530 Subject: Fwd: varnish help In-Reply-To: References: Message-ID: Please help me solve the below issue to prevent Varnish from crashing ---------- Forwarded message --------- From: Dridi Boukelmoune Date: Thu, Mar 5, 2020 at 7:36 PM Subject: Re: varnish help To: Veeresh Reddy On Thu, Mar 5, 2020 at 1:51 PM Veeresh Reddy wrote: > > Hello, > > My varnish is crashing with this error > Assert error in mgt_cli_challenge(), mgt/mgt_cli.c line 256: > > would you please help me solve this issue? Hello, I'm sorry to see that your varnish server is crashing but I do not take direct solicitation.
Please ask for help on the misc mailing list instead: https://varnish-cache.org/lists/mailman/listinfo Otherwise I can put you in contact with a sales person if you need commercial support. Best Regards, Dridi -- Hello, Regards, Veeresha R From batanun at hotmail.com Sun Mar 15 21:54:35 2020 From: batanun at hotmail.com (J X) Date: Sun, 15 Mar 2020 21:54:35 +0000 Subject: Grace and misbehaving servers Message-ID: Hi, I'm currently setting up Varnish for a project, and the grace feature together with health checks/probes seems to be a great savior when working with servers that might misbehave. But I'm not really sure I understand how to actually achieve that, since the example doesn't really make sense: https://varnish-cache.org/docs/trunk/users-guide/vcl-grace.html See the section "Misbehaving servers". There the example does "set beresp.grace = 24h" in vcl_backend_response, and "set req.grace = 10s" in vcl_recv, if the backend is healthy. But since vcl_recv is run before vcl_backend_response, doesn't that mean that the 10s grace value of vcl_recv is overwritten by the 24h value in vcl_backend_response? Also... There is always a risk of some URLs suddenly giving a 500 error (or a timeout) while the probe still returns 200. Is it possible to have Varnish behave more or less as if the backend is sick, but just for those URLs? Basically I would like this logic: If healthy content exists in the cache: 1. Return the cached (and potentially stale) content to the client 2. Increase the ttl and/or grace, to keep the healthy content longer 3. Only do a bg-fetch if a specified time has passed since the last attempt (let's say 5s), to avoid hammering the backend If non-healthy (i.e. 500-error) content exists in the cache: 1. Return the cached 500 content to the client 2.
Only do a bg-fetch if a specified time has passed since the last attempt (let's say 5s), to avoid hammering the backend If no content exists in the cache: 1. Perform a synchronous fetch 2. If the result is a 500 error, cache it with, let's say, ttl = 5s 3. Otherwise, cache it with a longer ttl 4. Return the result to the client Is this possible with the community edition of Varnish? From dridi at varni.sh Mon Mar 16 08:58:49 2020 From: dridi at varni.sh (Dridi Boukelmoune) Date: Mon, 16 Mar 2020 08:58:49 +0000 Subject: Grace and misbehaving servers In-Reply-To: References: Message-ID: Hi, On Sun, Mar 15, 2020 at 9:56 PM J X wrote: > > Hi, > > I'm currently setting up Varnish for a project, and the grace feature together with health checks/probes seems to be a great savior when working with servers that might misbehave. But I'm not really sure I understand how to actually achieve that, since the example doesn't really make sense: > > https://varnish-cache.org/docs/trunk/users-guide/vcl-grace.html > > See the section "Misbehaving servers". There the example does "set beresp.grace = 24h" in vcl_backend_response, and "set req.grace = 10s" in vcl_recv, if the backend is healthy. But since vcl_recv is run before vcl_backend_response, doesn't that mean that the 10s grace value of vcl_recv is overwritten by the 24h value in vcl_backend_response? Not really, it's actually the other way around. The beresp.grace variable defines how long you may serve an object past its TTL once it enters the cache. Subsequent requests can then limit grace mode, so think of req.grace as a req.max_grace variable (which maybe hints that it should have been called that in the first place). > Also... There is always a risk of some URLs suddenly giving a 500 error (or a timeout) while the probe still returns 200. Is it possible to have Varnish behave more or less as if the backend is sick, but just for those URLs?
Basically I would like this logic: > > If healthy content exists in the cache: > 1. Return the cached (and potentially stale) content to the client > 2. Increase the ttl and/or grace, to keep the healthy content longer > 3. Only do a bg-fetch if a specified time has passed since the last attempt (let's say 5s), to avoid hammering the backend > > If non-healthy (i.e. 500-error) content exists in the cache: > 1. Return the cached 500 content to the client > 2. Only do a bg-fetch if a specified time has passed since the last attempt (let's say 5s), to avoid hammering the backend What you are describing is stale-if-error, something we don't support but could be approximated with somewhat convoluted VCL. It used to be easier when Varnish had saint mode built-in because it generally resulted in less convoluted VCL. It's not something I would recommend attempting today. > If no content exists in the cache: > 1. Perform a synchronous fetch > 2. If the result is a 500 error, cache it with, let's say, ttl = 5s > 3. Otherwise, cache it with a longer ttl > 4. Return the result to the client > > Is this possible with the community edition of Varnish? You can do that with plain VCL, but even better, teach your backend to inform Varnish how to handle either case with the Cache-Control response header. Dridi From batanun at hotmail.com Tue Mar 17 20:04:52 2020 From: batanun at hotmail.com (Batanun B) Date: Tue, 17 Mar 2020 20:04:52 +0000 Subject: Grace and misbehaving servers In-Reply-To: References: , Message-ID: Hi Dridi, On Monday, March 16, 2020 9:58 AM Dridi Boukelmoune wrote: > Not really, it's actually the other way around. The beresp.grace > variable defines how long you may serve an object past its TTL once it > enters the cache. > > Subsequent requests can then limit grace mode, so think of req.grace > as a req.max_grace variable (which maybe hints that it should have > been called that in the first place). OK.
So beresp.grace mainly affects how long the object can stay in the cache? And if ttl + grace + keep is a low value set in vcl_backend_response, then vcl_recv is limited in how high the grace can be? And req.grace doesn't affect the time that the object is in the cache? Even if req.grace is set to a low value on the very first request (i.e. the same request that triggers the call to the backend)? > What you are describing is stale-if-error, something we don't support > but could be approximated with somewhat convoluted VCL. It used to be > easier when Varnish had saint mode built-in because it generally > resulted in less convoluted VCL. > > It's not something I would recommend attempting today. That's strange. This stale-if-error sounds like something pretty much everyone would want, right? I mean, if there is stale content available why show an error page to the end user? But maybe it was my wish to "cache/remember" previous failed fetches that made it complicated? So if I loosen the requirements/wish-list a bit, into this: Assuming that: * A request comes in to Varnish * The content is stale, but still in the cache * The backend is considered healthy * The short (10s) grace has expired * Varnish triggers a synchronous fetch from the backend * This fetch fails (timeout or 5xx error) I would then like Varnish to: * Return the stale content Would this be possible using the basic Varnish community edition, without a "convoluted VCL", as you put it? Is it possible without triggering a restart of the request? Either way, I am interested in hearing about how it can be achieved. Is there any documentation or blog post that mentions this? Or can you give me some example code perhaps? Even a convoluted example would be OK by me. Increasing the req.grace value for every request is not an option, since we only want to serve old content if Varnish can't get hold of new content.
And some of our pages are visited very rarely, so we can't rely on a constant stream of visitors keeping the content fresh in the cache. Regards From martynas at atomgraph.com Wed Mar 18 10:36:55 2020 From: martynas at atomgraph.com (=?UTF-8?Q?Martynas_Jusevi=C4=8Dius?=) Date: Wed, 18 Mar 2020 11:36:55 +0100 Subject: Varnish backend request sent to itself? Message-ID: Hi, I'm using varnish-5.2.1 revision 67e562482 as a Docker image and trying to restore the setup that was working some time ago. However I'm getting 503 Backend fetch failed. The VCL file clearly specifies the backend: backend default { .host = "atomgraph.some_host.com"; .port = "80"; .first_byte_timeout = 60s; } Varnish itself runs on atomgraph.some_host.varnish, as specified in docker-compose.yml. When I log into the container, I see the client request: - ReqMethod GET - ReqURL /smth/smth/... - ReqProtocol HTTP/1.1 - ReqHeader Accept: application/rdf+xml,text/rdf+n3,application/n-triples,text/csv,application/rdf+xml,application/rdf+thrift,text/turtle,application/rdf+json - ReqHeader Cache-Control: no-cache - ReqHeader Authorization: Basic bGlua2VkZGF0YWh1YjpqYThhc3BhdHV0YWtFdGhV - ReqHeader Host: atomgraph.dydra.varnish - ReqHeader Connection: Keep-Alive - ReqHeader User-Agent: Apache-HttpClient/4.1.1 (java 1.5) - ReqHeader X-Forwarded-For: 172.30.0.3 However when I look at the backend request: - BereqMethod GET - BereqURL /smth/smth/... - BereqProtocol HTTP/1.1 - BereqHeader Accept: application/rdf+xml,text/rdf+n3,application/n-triples,text/csv,application/rdf+xml,application/rdf+thrift,text/turtle,application/rdf+json - BereqHeader Authorization: Basic bGlua2VkZGF0YWh1YjpqYThhc3BhdHV0YWtFdGhV - BereqHeader Host: atomgraph.dydra.varnish - BereqHeader User-Agent: Apache-HttpClient/4.1.1 (java 1.5) - BereqHeader X-Forwarded-For: 172.30.0.3 - BereqHeader X-Varnish: 32783 it looks like it is calling itself (i.e.
the local fake .varnish backend host instead of the real .com one), judging from the BereqHeader Host. If that is the case, then no wonder that the backend request fails: - BerespProtocol HTTP/1.1 - BerespStatus 503 - BerespReason Service Unavailable - BerespReason Backend fetch failed How can this be? Martynas From martynas at atomgraph.com Wed Mar 18 10:41:23 2020 From: martynas at atomgraph.com (=?UTF-8?Q?Martynas_Jusevi=C4=8Dius?=) Date: Wed, 18 Mar 2020 11:41:23 +0100 Subject: Varnish backend request sent to itself? In-Reply-To: References: Message-ID: My attempt to obfuscate the hosts failed :) Sorry for that. In short: the configured backend is atomgraph.dydra.com but BereqHeader Host is atomgraph.dydra.varnish. Why? On Wed, Mar 18, 2020 at 11:36 AM Martynas Jusevičius wrote: > > Hi, > > I'm using varnish-5.2.1 revision 67e562482 as a Docker image and > trying to restore the setup that was working some time ago. > > However I'm getting 503 Backend fetch failed. > > The VCL file clearly specifies the backend: > > backend default { > .host = "atomgraph.some_host.com"; > .port = "80"; > .first_byte_timeout = 60s; > } > > Varnish itself runs on atomgraph.some_host.varnish, as specified in > docker-compose.yml. > > When I log into the container, I see the client request: > > - ReqMethod GET > - ReqURL /smth/smth/... > - ReqProtocol HTTP/1.1 > - ReqHeader Accept: > application/rdf+xml,text/rdf+n3,application/n-triples,text/csv,application/rdf+xml,application/rdf+thrift,text/turtle,application/rdf+json > - ReqHeader Cache-Control: no-cache > - ReqHeader Authorization: Basic bGlua2VkZGF0YWh1YjpqYThhc3BhdHV0YWtFdGhV > - ReqHeader Host: atomgraph.dydra.varnish > - ReqHeader Connection: Keep-Alive > - ReqHeader User-Agent: Apache-HttpClient/4.1.1 (java 1.5) > - ReqHeader X-Forwarded-For: 172.30.0.3 > > However when I look at the backend request: > > - BereqMethod GET > - BereqURL /smth/smth/...
> - BereqProtocol HTTP/1.1 > - BereqHeader Accept: > application/rdf+xml,text/rdf+n3,application/n-triples,text/csv,application/rdf+xml,application/rdf+thrift,text/turtle,application/rdf+json > - BereqHeader Authorization: Basic bGlua2VkZGF0YWh1YjpqYThhc3BhdHV0YWtFdGhV > - BereqHeader Host: atomgraph.dydra.varnish > - BereqHeader User-Agent: Apache-HttpClient/4.1.1 (java 1.5) > - BereqHeader X-Forwarded-For: 172.30.0.3 > - BereqHeader X-Varnish: 32783 > > it looks like it is calling itself (i.e. the local fake .varnish > backend host instead of the real .com one), judging from the > BereqHeader Host. > > If that is the case, then no wonder that the backend request fails: > > - BerespProtocol HTTP/1.1 > - BerespStatus 503 > - BerespReason Service Unavailable > - BerespReason Backend fetch failed > > How can this be? > > Martynas From martynas at atomgraph.com Wed Mar 18 12:53:32 2020 From: martynas at atomgraph.com (=?UTF-8?Q?Martynas_Jusevi=C4=8Dius?=) Date: Wed, 18 Mar 2020 13:53:32 +0100 Subject: Varnish backend request sent to itself? In-Reply-To: References: Message-ID: Nevermind, I figured out I wasn't even passing my VCL config correctly :) On Wed, Mar 18, 2020 at 11:41 AM Martynas Jusevičius wrote: > > My attempt to obfuscate the hosts failed :) Sorry for that. > > In short: the configured backend is atomgraph.dydra.com but > BereqHeader Host is atomgraph.dydra.varnish. Why? > > On Wed, Mar 18, 2020 at 11:36 AM Martynas Jusevičius > wrote: > > > > Hi, > > > > I'm using varnish-5.2.1 revision 67e562482 as a Docker image and > > trying to restore the setup that was working some time ago. > > > > However I'm getting 503 Backend fetch failed. > > > > The VCL file clearly specifies the backend: > > > > backend default { > > .host = "atomgraph.some_host.com"; > > .port = "80"; > > .first_byte_timeout = 60s; > > } > > > > Varnish itself runs on atomgraph.some_host.varnish, as specified in > > docker-compose.yml.
> > > > When I login into the container, I see the client request: > > > > - ReqMethod GET > > - ReqURL /smth/smth/... > > - ReqProtocol HTTP/1.1 > > - ReqHeader Accept: > > application/rdf+xml,text/rdf+n3,application/n-triples,text/csv,application/rdf+xml,application/rdf+thrift,text/turtle,application/rdf+json > > - ReqHeader Cache-Control: no-cache > > - ReqHeader Authorization: Basic bGlua2VkZGF0YWh1YjpqYThhc3BhdHV0YWtFdGhV > > - ReqHeader Host: atomgraph.dydra.varnish > > - ReqHeader Connection: Keep-Alive > > - ReqHeader User-Agent: Apache-HttpClient/4.1.1 (java 1.5) > > - ReqHeader X-Forwarded-For: 172.30.0.3 > > > > However when I look at the backend request: > > > > - BereqMethod GET > > - BereqURL /smth/smth/... > > - BereqProtocol HTTP/1.1 > > - BereqHeader Accept: > > application/rdf+xml,text/rdf+n3,application/n-triples,text/csv,application/rdf+xml,application/rdf+thrift,text/turtle,application/rdf+json > > - BereqHeader Authorization: Basic bGlua2VkZGF0YWh1YjpqYThhc3BhdHV0YWtFdGhV > > - BereqHeader Host: atomgraph.dydra.varnish > > - BereqHeader User-Agent: Apache-HttpClient/4.1.1 (java 1.5) > > - BereqHeader X-Forwarded-For: 172.30.0.3 > > - BereqHeader X-Varnish: 32783 > > > > it looks like it is calling itself (i.e. the local fake .varnish > > backend host instead of the real .com one), judging from the > > BereqHeader Host. > > > > If that is the case, then no wonder that the backend request fails: > > > > - BerespProtocol HTTP/1.1 > > - BerespStatus 503 > > - BerespReason Service Unavailable > > - BerespReason Backend fetch failed > > > > How can this be? > > > > Martynas From batanun at hotmail.com Wed Mar 18 14:15:49 2020 From: batanun at hotmail.com (Batanun B) Date: Wed, 18 Mar 2020 14:15:49 +0000 Subject: Fix incorrect Last-Modified from backend? 
Message-ID: Hi, Long story short, one of our backend systems serves an incorrect Last-Modified response header, and I don't see a way to fix it at the source (third-party system, not based on Nginx/Tomcat/IIS or anything like that). So, I would like to "fix" it in Varnish, since I don't expect the maker of that software to be able to fix this within a reasonable time. Is there a built-in way in Varnish to make it generate its own Last-Modified response header? Something like: * If no stale object exists in cache, set Last-Modified to the value of the Date response header * If a stale object exists in cache, and its body content is identical to the newly fetched content, keep the Last-Modified from the step above * If a stale object exists in cache, but its body content is different from the newly fetched content, set Last-Modified to the value of the Date response header Any suggestions on how to handle this situation? Any general Varnish guidelines when working with a backend that acts like this? Regards From dridi at varni.sh Thu Mar 19 09:57:47 2020 From: dridi at varni.sh (Dridi Boukelmoune) Date: Thu, 19 Mar 2020 09:57:47 +0000 Subject: Fix incorrect Last-Modified from backend? In-Reply-To: References: Message-ID: On Wed, Mar 18, 2020 at 2:17 PM Batanun B wrote: > > Hi, > > Long story short, one of our backend systems serves an incorrect Last-Modified response header, and I don't see a way to fix it at the source (third-party system, not based on Nginx/Tomcat/IIS or anything like that). > > So, I would like to "fix" it in Varnish, since I don't expect the maker of that software to be able to fix this within a reasonable time. Is there a built-in way in Varnish to make it generate its own Last-Modified response header?
Something like: > > * If no stale object exists in cache, set Last-Modified to the value of the Date response header > * If a stale object exists in cache, and it's body content is identical to the newly fetched content, keep the Last-Modified from the step above > * If a stale object exists in cache, but it's body content is different to the newly fetched content, set Last-Modified to the value of the Date response header I don't think you can do something like that without writing a module, and even if you could you would still have a chicken-egg problem for streaming deliveries when it comes to generating a header based on the contents of the body (you would need trailers, but we don't support them). By the way, when it comes to revalidation based on the body, you should use ETag instead of Last-Modified. > Any suggestions on how to handle this situation? Any general Varnish guidelines when working with a backend that acts like this? I think that's a tough nut to crack. There are many things you can work around from a misbehaving backend but this case is not trivial. Dridi From martynas at atomgraph.com Thu Mar 19 10:04:22 2020 From: martynas at atomgraph.com (=?UTF-8?Q?Martynas_Jusevi=C4=8Dius?=) Date: Thu, 19 Mar 2020 11:04:22 +0100 Subject: Purging on PUT and DELETE Message-ID: Hi, upon receiving a PUT or DELETE request, I'd like Varnish to invalidate the current object (and its variants) *and* to pass the request to the backend. Essentially the same question as here: https://serverfault.com/questions/399814/varnish-purge-on-post-or-put The answer seems outdated though. I consider this a common use case for REST CRUD APIs, so I was surprised not to find a single VCL example mentioning it. 
Martynas From dridi at varni.sh Thu Mar 19 10:12:11 2020 From: dridi at varni.sh (Dridi Boukelmoune) Date: Thu, 19 Mar 2020 10:12:11 +0000 Subject: Grace and misbehaving servers In-Reply-To: References: Message-ID: On Tue, Mar 17, 2020 at 8:06 PM Batanun B wrote: > > Hi Dridi, > > On Monday, March 16, 2020 9:58 AM Dridi Boukelmoune wrote: > > > Not really, it's actually the other way around. The beresp.grace > > variable defines how long you may serve an object past its TTL once it > > enters the cache. > > > > Subsequent requests can then limit grace mode, so think of req.grace > > as a req.max_grace variable (which maybe hints that it should have > > been called that in the first place). > > OK. So beresp.grace mainly affects how long the object can stay in the cache? And if ttl + grace + keep is a low value set in vcl_backend_response, then vcl_recv is limited in how high the grace can be? Not quite! ttl+grace+keep defines how long an object may stay in the cache (barring any form of invalidation). The grace I'm referring to is beresp.grace: it defines how long we might serve a stale object while a background fetch is in progress. > And req.grace doesn't affect the time that the object is in the cache? Even if req.grace is set to a low value on the very first request (i.e. the same request that triggers the call to the backend)? Right, req.grace only defines the maximum staleness tolerated by a client. So if backend selection happens on the backend side, you can for example adjust that maximum based on the health of the backend. > > What you are describing is stale-if-error, something we don't support > > but could be approximated with somewhat convoluted VCL. It used to be > > easier when Varnish had saint mode built-in because it generally > > resulted in less convoluted VCL. > > > > It's not something I would recommend attempting today. > > That's strange. This stale-if-error sounds like something pretty much everyone would want, right?
I mean, if there is stale content available why show an error page to the end user? As always in such cases it's not black or white. Depending on the nature of your web traffic you may want to put the cursor on always serving something, or never serving something stale. For example, live "real time" traffic may favor failing some requests over serving stale data. Many users want stale-if-error, but it's not trivial, and it needs to be balanced against other aspects like performance. > But maybe it was my wish to "cache/remember" previous failed fetches that made it complicated? So if I loosen the requirements/wish-list a bit, into this: > > Assuming that: > * A request comes in to Varnish > * The content is stale, but still in the cache > * The backend is considered healthy > * The short (10s) grace has expired > * Varnish triggers a synchronous fetch from the backend > * This fetch fails (timeout or 5xx error) > > I would then like Varnish to: > * Return the stale content I agree that on paper it sounds simple, but in practice it might be harder to get right. For example, "add HTTP/3 support" is a simple statement, but the work it implies can be orders of magnitude more complicated. And stale-if-error is one of those tricky features: tricky for performance, and it must not break existing VCL, etc. > Would this be possible using the basic Varnish community edition, without a "convoluted VCL", as you put it? Is it possible without triggering a restart of the request? Either way, I am interested in hearing about how it can be achieved. Is there any documentation or blog post that mentions this? Or can you give me some example code perhaps? Even a convoluted example would be OK by me. I wouldn't recommend stale-if-error at all today, as I said in my first reply. > Increasing the req.grace value for every request is not an option, since we only want to serve old content if Varnish can't get hold of new content.
And some of our pages are visited very rarely, so we can't rely on a constant stream of visitors keeping the content fresh in the cache. Is it hurting you that less frequently requested contents don't stay in the cache? Another option is to give Varnish a high TTL (and give clients a lower TTL) and trigger a form of invalidation directly from the backend when you know a resource changed. Dridi From dridi at varni.sh Thu Mar 19 10:21:00 2020 From: dridi at varni.sh (Dridi Boukelmoune) Date: Thu, 19 Mar 2020 10:21:00 +0000 Subject: Purging on PUT and DELETE In-Reply-To: References: Message-ID: On Thu, Mar 19, 2020 at 10:05 AM Martynas Jusevi?ius wrote: > > Hi, > > upon receiving a PUT or DELETE request, I'd like Varnish to invalidate > the current object (and its variants) *and* to pass the request to the > backend. > > Essentially the same question as here: > https://serverfault.com/questions/399814/varnish-purge-on-post-or-put > The answer seems outdated though. I would do it like this: > sub vcl_backend_response { > if (beresp.status == 200 && bereq.method ~ "PUT|DELETE") { > ban("req.url == " + bereq.url + " && req.http.host == " + bereq.http.host); > } > } Or at least, I would do it in vcl_backend_response, there's no point in invalidating if the client wasn't allowed to change a resource for example. > I consider this a common use case for REST CRUD APIs, so I was > surprised not to find a single VCL example mentioning it. The problem is that so many things can go wrong. For example my snippet doesn't allow the ban to be processed in the background, so further adjustments are needed to make that happen. It also assumes that bereq's URL and Host are identical to req's, and isn't subject to client noise (spurious query parameters and whatnots). So indeed, I wouldn't want to advertise that kind of snippet without a heavy supply of red tape. 
Dridi From martynas at atomgraph.com Thu Mar 19 10:28:15 2020 From: martynas at atomgraph.com (=?UTF-8?Q?Martynas_Jusevi=C4=8Dius?=) Date: Thu, 19 Mar 2020 11:28:15 +0100 Subject: Purging on PUT and DELETE In-Reply-To: References: Message-ID: Thank you Dridi. But what I'm reading here https://docs.varnish-software.com/tutorials/cache-invalidation/ > Unlike purges, banned content won't immediately be evicted from cache freeing up memory, instead it will either stay in cache until its TTL expires, if we ban on req properties, or it will be evicted by a background thread, called ban_lurker, if we ban on the obj properties Which means that using your example, if I immediately follow up PUT/DELETE with a GET, it is not certain to get a fresh copy? Because "banned content won't immediately be evicted from cache"? On Thu, Mar 19, 2020 at 11:21 AM Dridi Boukelmoune wrote: > > On Thu, Mar 19, 2020 at 10:05 AM Martynas Jusevičius > wrote: > > > > Hi, > > > > upon receiving a PUT or DELETE request, I'd like Varnish to invalidate > > the current object (and its variants) *and* to pass the request to the > > backend. > > > > Essentially the same question as here: > > https://serverfault.com/questions/399814/varnish-purge-on-post-or-put > > The answer seems outdated though. > > I would do it like this: > > > sub vcl_backend_response { > > if (beresp.status == 200 && bereq.method ~ "PUT|DELETE") { > > ban("req.url == " + bereq.url + " && req.http.host == " + bereq.http.host); > > } > > } > > Or at least, I would do it in vcl_backend_response, there's no point > in invalidating if the client wasn't allowed to change a resource for > example. > > > I consider this a common use case for REST CRUD APIs, so I was > > surprised not to find a single VCL example mentioning it. > > The problem is that so many things can go wrong. For example my > snippet doesn't allow the ban to be processed in the background, so > further adjustments are needed to make that happen.
It also assumes > that bereq's URL and Host are identical to req's, and isn't subject to > client noise (spurious query parameters and whatnots). > > So indeed, I wouldn't want to advertise that kind of snippet without a > heavy supply of red tape. > > > Dridi From dridi at varni.sh Thu Mar 19 10:33:42 2020 From: dridi at varni.sh (Dridi Boukelmoune) Date: Thu, 19 Mar 2020 10:33:42 +0000 Subject: Purging on PUT and DELETE In-Reply-To: References: Message-ID: On Thu, Mar 19, 2020 at 10:28 AM Martynas Jusevičius wrote: > > Thank you Dridi. > > But what I'm reading here > https://docs.varnish-software.com/tutorials/cache-invalidation/ > > Unlike purges, banned content won't immediately be evicted from cache freeing up memory, instead it will either stay in cache until its TTL expires, if we ban on req properties, or it will be evicted by a background thread, called ban_lurker, if we ban on the obj properties > > Which means that using your example, if I immediately follow up > PUT/DELETE with a GET, it is not certain to get a fresh copy? Because > "banned content won't immediately be evicted from cache"? That's because bans using req criteria (as opposed to obj) need a request to happen to test the ban on a given object. And even bans with obj criteria don't happen immediately, they eventually happen in the background. But once a ban is in the list, an object is not served from cache before confirming that it isn't invalidated by a newer ban during lookup, so you shouldn't worry about that. Dridi From batanun at hotmail.com Fri Mar 20 22:11:58 2020 From: batanun at hotmail.com (Batanun B) Date: Fri, 20 Mar 2020 22:11:58 +0000 Subject: Grace and misbehaving servers In-Reply-To: References: , Message-ID: On Thu, Mar 19, 2020 at 11:12 AM Dridi Boukelmoune wrote: > > Not quite! > > ttl+grace+keep defines how long an object may stay in the cache > (barring any form of invalidation).
> > The grace I'm referring to is beresp.grace, Well, when I wrote "if ttl + grace + keep is a low value set in vcl_backend_response", I was talking about beresp.grace, as in beresp.ttl + beresp.grace + beresp.keep. > it defines how long we might serve a stale object while a background fetch is in progress. I'm not really seeing how that is different from what I said. If beresp.ttl + beresp.grace + beresp.keep is 10s in total, then a req.grace of say 24h wouldn't do much good, right? Or maybe I just misunderstood what you were saying here. > As always in such cases it's not black or white. Depending on the > nature of your web traffic you may want to put the cursor on always > serving something, or never serving something stale. For example, live > "real time" traffic may favor failing some requests over serving stale > data. Well, I was thinking of the typical "regular" small/medium website, like blogs, corporate profile, small town news etc. > I agree that on paper it sounds simple, but in practice it might be > harder to get right. OK. But what if I implemented it in this way, in my VCL? * In vcl_backend_response, set beresp.grace to 72h if status < 400 * In vcl_backend_error and vcl_backend_response (when status >= 500), return (abandon) * In vcl_synth, restart the request, with a special req header set * In vcl_recv, if this req header is present, set req.grace to 72h Wouldn't this work? If no, why? If yes, would you say there is something else problematic with it? Of course I would have to handle some special cases, and maybe check req.restarts and such, but I'm talking about the thought process as a whole here. I might be missing something, but I think I would need someone to point it out to me because I just don't get why this would be wrong. > Is it hurting you that less frequently requested contents don't stay > in the cache? If it results in people seeing error pages when a stale content would be perfectly fine for them, then yes. 
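The timer arithmetic being debated above can be sketched as follows. This is a simplified Python model of the ttl/grace/keep windows, not Varnish code; the min() capping of grace reflects the point that a long req.grace cannot extend an object's own short grace.

```python
def object_state(age, ttl, grace, keep):
    """Classify a cached object by its age against the ttl/grace/keep timers.

    fresh: served directly from cache
    stale: may be served while a background fetch refreshes it
    keep:  only usable for conditional (If-Modified-Since/ETag) revalidation
    gone:  eligible for removal from the cache
    """
    if age < ttl:
        return "fresh"
    if age < ttl + grace:
        return "stale"
    if age < ttl + grace + keep:
        return "keep"
    return "gone"

def can_serve_stale(age, ttl, obj_grace, req_grace):
    # req.grace can only shorten the object's grace window, never extend it:
    # a 24h req.grace is moot if the object's own grace is tiny.
    return age < ttl + min(obj_grace, req_grace)
```

For example, with ttl=5s and grace=10s, an 8-second-old object is "stale" and still servable; with an object grace of 1s, no req.grace setting can make it servable.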
And these less frequently requested pages might still be part of a group of pages that all result in an error in the backend (while the health probe still returns 200 OK). So while one individual page might be visited infrequently, the total number of visits on these kinds of pages might be high. Let's say that there are 3.000 unique (and cachable) pages that are visited during an average weekend. And all of these are in the Varnish cache, but 2.000 of these have stale content. Now let's say that 50% of all pages start returning 500 errors from the backend, on a Friday evening. That would mean that about ~1000 of these stale pages would result in the error displayed to the end users during that weekend. I would much more prefer if it were to still serve them stale content, and then I could look into the problem on Monday morning. > Another option is to give Varnish a high TTL (and give clients a lower > TTL) and trigger a form of invalidation directly from the backend when > you know a resource changed. Well, that is perfectly fine for pages that have a one-to-one mapping between the page (ie the URL) and the content updated. But most pages in our setup contain a mix of multiple contents, and it is not possible to know beforehand if a specific content will contribute to the result of a specific page. That is especially true for new content that might be included in multiple pages already in the cache. The only way to handle that in a foolproof way, as far as I can tell, is to invalidate all pages (since any page can contain this kind of content) the moment any object is updated. But that would pretty much clear the cache constantly. And we would still have to handle the case where the cache is invalidated for a page that gives a 500 error when Varnish tries to fetch it. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From batanun at hotmail.com Fri Mar 20 22:32:57 2020 From: batanun at hotmail.com (Batanun B) Date: Fri, 20 Mar 2020 22:32:57 +0000 Subject: Fix incorrect Last-Modified from backend? In-Reply-To: References: , Message-ID: On Thu, Mar 19, 2020 at 09:57 AM Dridi Boukelmoune wrote: > > By the way, when it comes to revalidation based on the body, you > should use ETag instead of Last-Modified. Sadly, there is no ETag available. And I can't see any way of adding it without "hacking" the software (patching their code in an unsupported way, causing problems every time we want to upgrade that software). But I'm waiting for a reply on a support case I created, asking about this. But if I had easy access to the body in Varnish this would be "trivial" to implement there. -------------- next part -------------- An HTML attachment was scrubbed... URL: From dridi at varni.sh Mon Mar 23 10:00:18 2020 From: dridi at varni.sh (Dridi Boukelmoune) Date: Mon, 23 Mar 2020 10:00:18 +0000 Subject: Grace and misbehaving servers In-Reply-To: References: Message-ID: Hi, On Fri, Mar 20, 2020 at 10:14 PM Batanun B wrote: > > On Thu , Mar 19, 2020 at 11:12 AM Dridi Boukelmoune wrote: > > > > Not quite! > > > > ttl+grace+keep defines how long an object may stay in the cache > > (barring any form of invalidation). > > > > The grace I'm referring to is beresp.grace, > > Well, when I wrote "if ttl + grace + keep is a low value set in vcl_backend_response", I was talking about beresp.grace, as in beresp.ttl + beresp.grace + beresp.keep. > > > > it defines how long we might serve a stale object while a background fetch is in progress. > > I'm not really seeing how that is different from what I said. If beresp.ttl + beresp.grace + beresp.keep is 10s in total, then a req.grace of say 24h wouldn't do much good, right? Or maybe I just misunderstood what you were saying here. Or maybe *I* just misunderstood your understanding :) > > As always in such cases it's not black or white. 
Depending on the > > nature of your web traffic you may want to put the cursor on always > > serving something, or never serving something stale. For example, live > > "real time" traffic may favor failing some requests over serving stale > > data. > > Well, I was thinking of the typical "regular" small/medium website, like blogs, corporate profile, small town news etc. > > > > I agree that on paper it sounds simple, but in practice it might be > > harder to get right. > > OK. But what if I implemented it in this way, in my VCL? > > * In vcl_backend_response, set beresp.grace to 72h if status < 400 > * In vcl_backend_error and vcl_backend_response (when status >= 500), return (abandon) > * In vcl_synth, restart the request, with a special req header set > * In vcl_recv, if this req header is present, set req.grace to 72h > > Wouldn't this work? If no, why? If yes, would you say there is something else problematic with it? Of course I would have to handle some special cases, and maybe check req.restarts and such, but I'm talking about the thought process as a whole here. I might be missing something, but I think I would need someone to point it out to me because I just don't get why this would be wrong. For starters, there currently is no way to know for sure that you entered vcl_synth because of a return(abandon) transition. There are plans to make it possible, but currently you can do that with confidence lower than 100%. A problem with the restart logic is the race it opens since you now have two lookups, but overall, that's the kind of convoluted VCL that should work. The devil might be in the details. > > Is it hurting you that less frequently requested contents don't stay > > in the cache? > > If it results in people seeing error pages when a stale content would be perfectly fine for them, then yes. 
> > And these less frequently requested pages might still be part of a group of pages that all result in an error in the backend (while the health probe still return 200 OK). So while one individual page might be visited infrequently, the total number of visits on these kind of pages might be high. > > Lets say that there are 3.000 unique (and cachable) pages that are visited during an average weekend. And all of these are in the Varnish cache, but 2.000 of these have stale content. Now lets say that 50% of all pages start returning 500 errors from the backend, on a Friday evening. That would mean that about ~1000 of these stale pages would result in the error displayed to the end users during that weekend. I would much more prefer if it were to still serve them stale content, and then I could look into the problem on Monday morning. In this case you might want to combine your VCL restart logic with vmod_saintmode. https://github.com/varnish/varnish-modules/blob/6.0-lts/docs/vmod_saintmode.rst#vmod_saintmode This VMOD allows you to create circuit breakers for individual resources for a given backend. That will result in more complicated VCL but will help you mark individual resources as sick, making the need for a "special req header" redundant. And since vmod_saintmode marks resources sick for a given time, it means that NOT ALL individual clients will go through the complete restart dance during that window. I think you may still have to do a restart in vcl_miss because only then will you know the saint-mode health (you need both a backend and a hash). > > Another option is to give Varnish a high TTL (and give clients a lower > TTL) and trigger a form of invalidation directly from the backend when > you know a resource changed. > > Well, that is perfectly fine for pages that have a one-to-one mapping between the page (ie the URL) and the content updated.
But most pages in our setup contain a mix of multiple contents, and it is not possible to know beforehand if a specific content will contribute to the result of a specific page. That is especially true for new content that might be included in multiple pages already in the cache. > > The only way to handle that in a foolproof way, as far as I can tell, is to invalidate all pages (since any page can contain this kind of content) the moment any object is updated. But that would pretty much clear the cache constantly. And we would still have to handle the case where the cache is invalidated for a page that gives a 500 error when Varnish tries to fetch it. And you might solve this problem with vmod_xkey! https://github.com/varnish/varnish-modules/blob/6.0-lts/docs/vmod_xkey.rst#vmod_xkey You need help from the backend to communicate a list of "abstract identifiers" of "things" that contribute to a response. This way if a change in your backend spans multiple responses you can still perform a single invalidation to affect them all. Dridi From batanun at hotmail.com Wed Mar 25 18:41:44 2020 From: batanun at hotmail.com (Batanun B) Date: Wed, 25 Mar 2020 18:41:44 +0000 Subject: Grace and misbehaving servers In-Reply-To: References: , Message-ID: On Mo, Mar 23, 2020 at 10:00 AM Dridi Boukelmoune wrote: > > For starters, there currently is no way to know for sure that you > entered vcl_synth because of a return(abandon) transition. There are > plans to make it possible, but currently you can do that with > confidence lower than 100%. I see. I actually had a feeling about that, since I didn't see an obvious way to pass that kind of information into vcl_synth when triggered by an abandon. Although, just having a general rule to restart 500-requests there, regardless of what caused it, is not really that bad anyway. > A problem with the restart logic is the race it opens since you now > have two lookups, but overall, that's the kind of convoluted VCL that > should work. 
The devil might be in the details. Could you describe this race condition that you mean can happen? What could the worst case scenario be? If it is just a guru meditation for this single request, and it happens very rarely, then that is something I can live with. If it is something that can cause Varnish to crash or hang, then it is not something I can live with :) > In this case you might want to combine your VCL restart logic with > vmod_saintmode. Yes, I have already heard some things about this vmod. I will definitely look into it. Thanks. > And you might solve this problem with vmod_xkey! We actually already use this vmod. But like I said, it doesn't solve the problem with new content that affects existing pages. Several pages might for example include information about the latest objects created in the system. If one of these pages were loaded and cached at time T1, and then at T2 a new object O2 was created, an "xkey purge" with the key "O2" will have no effect since that page was not associated with the "O2" key at time T1, because O2 didn't even exist then. And since there is no way to know beforehand which these pages are, the only bullet proof way I can see of handling this is to purge all pages* any time any content is updated. * or at least a large subset of all pages, since the vast majority might include something related to newly created objects -------------- next part -------------- An HTML attachment was scrubbed... URL: From dridi at varni.sh Wed Mar 25 22:18:18 2020 From: dridi at varni.sh (Dridi Boukelmoune) Date: Wed, 25 Mar 2020 22:18:18 +0000 Subject: Grace and misbehaving servers In-Reply-To: References: Message-ID: > > A problem with the restart logic is the race it opens since you now > > have two lookups, but overall, that's the kind of convoluted VCL that > > should work. The devil might be in the details. > > Could you describe this race condition that you mean can happen? What could the worst case scenario be?
If it is just a guru meditation for this single request, and it happens very rarely, then that is something I can live with. If it is something that can cause Varnish to crash or hang, then it is not something I can live with :) In general by the time you get to the second lookup the state of the cache may have changed. An object may go away in between, so a restart would cause unnecessary processing that would likely lead to an additional erroring fetch. Using a combination of saint mode and req.grace to emulate stale-if-error could in theory lead to something simpler. At least it would if this change landed one way or the other: https://github.com/varnishcache/varnish-cache/issues/3259 > > In this case you might want to combine your VCL restart logic with > > vmod_saintmode. > > Yes, I have already heard some things about this vmod. I will definitely look into it. Thanks. It used to be a no brainer with Varnish 3, being part of VCL... > > And you might solve this problem with vmod_xkey! > > We actually already use this vmod. But like I said, it doesn't solve the problem with new content that effects existing pages. Oh, now I get it! That's an interesting limitation I don't think I ever considered. I will give it some thought! > Several pages might for example include information about the latest objects created in the system. If one of these pages were loaded and cached at time T1, and then at T2 a new object O2 was created, an "xkey purge" with the key "O2" will have no effect since that page was not associated with the "O2" key at time T1, because O2 didn't even exist then. > > And since there is no way to know beforehand which these pages are, the only bullet proof way I can see of handling this is to purge all pages* any time any content is updated. > > * or at least a large subset of all pages, since the vast majority might include something related to newly created objects You can always use vmod_xkey to broadly tag responses. 
An example I like to take to illustrate this is tagging a response as "article". If you change the template for articles, you know you can [soft] purge them all at once. That doesn't solve the invalidation using keys unknown (yet) to the cache, but my take would be that if my application can know that, it should be able to invalidate individual resources affected by their new key (I'm aware it's not always that easy). Dridi From rsreese at gmail.com Tue Mar 31 21:12:14 2020 From: rsreese at gmail.com (Stephen Reese) Date: Tue, 31 Mar 2020 17:12:14 -0400 Subject: Varnish not respecting pass to backend for specified http hosts Message-ID: Running Varnish 6.0.6 in a Docker container for several Wordpress sites. I have several domains that I would like to pass to the backend versus having them cached, but the configuration is not behaving as intended. Looking to understand why I am unable to specify which sites I would like to pass. If I do something like: if ((req.http.host ~ "(domain.com) || (dev.domain.com)")) { set req.backend_hint = default; } else { return (pass); } then every hostname's content is cached where I would expect only the two specified domains to cache, and everything not defined, to pass. Also, if I do not specify the configuration, all sites are cached (as expected). If I use something like: if ((req.http.host ~ "(domain.com) || (dev.domain.com)")) { return (pass); } then no sites are cached where I would expect everything to cache except for the two domains. What might be causing this behavior? I looked at the requests with varnishlog, the undefined domains are indeed being fetched from the backend versus being cached: - VCL_call RECV - VCL_acl NO_MATCH forbidden - VCL_return pass - VCL_call HASH - VCL_return lookup - VCL_call PASS - VCL_return fetch Varnish configuration is attached. -------------- next part -------------- An HTML attachment was scrubbed...
URL: -------------- next part -------------- vcl 4.0; # Set the default backend web server backend default { .host = "app-proxy"; .port = "8080"; # Increase guru timeout # http://vincentfretin.ecreall.com/articles/varnish-guru-meditation-on-timeout .first_byte_timeout = 300s; } # Forbidden IP ACL acl forbidden { } # Purge ACL acl purge { "app-proxy"; "192.168.0.0"/16; "127.0.0.1"; "localhost"; "172.16.0.0"/16; "10.0.0.0"/8; } # This function is used when a request is send by a HTTP client (Browser) sub vcl_recv { # Block the forbidden IP addresse if (client.ip ~ forbidden) { return (synth(403, "Forbidden")); } if ((req.http.host ~ "(domain.com) || (dev.domain.com)")) { return (pass); } # Compatibility with Apache format log if (req.restarts == 0) { if (req.http.X-Pss-Loop == "pagespeed_proxy") { set req.http.X-Forwarded-For = req.http.X-Forwarded-For + ", " + client.ip; } else { set req.http.X-Forwarded-For = client.ip; } } # Normalize the header, remove the port (in case you're testing this on various TCP ports) set req.http.Host = regsub(req.http.Host, ":[0-9]+", ""); # Allow purging from ACL if (req.method == "PURGE") { # If not allowed then a error 405 is returned if (!client.ip ~ purge) { return (synth(405, "This IP is not allowed to send PURGE requests.")); } #return (purge); ban("req.http.host == " + req.http.host + " && req.url == " + req.url); # Throw a synthetic page so the # request won't go to the backend. return(synth(200, "Purge added")); } if (req.method == "BAN") { # Same ACL check as above: if (!client.ip ~ purge) { return(synth(403, "Not allowed.")); } ban("req.http.host == " + req.http.host + " && req.url == " + req.url); # Throw a synthetic page so the # request won't go to the backend. 
return(synth(200, "Ban added")); } # Only deal with "normal" types if (req.method != "GET" && req.method != "HEAD" && req.method != "PUT" && req.method != "POST" && req.method != "TRACE" && req.method != "OPTIONS" && req.method != "PATCH" && req.method != "DELETE") { /* Non-RFC2616 or CONNECT which is weird. */ return (pipe); } # Only cache GET or HEAD requests. This makes sure the POST requests are always passed. if (req.method != "GET" && req.method != "HEAD") { return (pass); } # Configure grace period, in case the backend goes down #set req.grace = 15s; #if (std.healthy(req.backend)) { # set req.grace = 30s; #} else { # unset req.http.Cookie; # set req.grace = 6h; #} # --- Wordpress specific configuration # Do not cache the RSS feed if (req.url ~ "/feed") { return (pass); } # Dont Cache WordPress post pages and edit pages if (req.url ~ "(wp-admin|post\.php|edit\.php|wp-login)") { return(pass); } if (req.url ~ "/wp-cron.php" || req.url ~ "preview=true") { return (pass); } # Pass through the WooCommerce dynamic pages if (req.url ~ "^/(cart|my-account/*|checkout|wc-api/*|addons|logout|lost-password)") { return (pass); } # Pass through the WooCommerce add to cart if ( req.url ~ "\?add-to-cart=" ) { return (pass); } # Pass through the WooCommerce API if (req.url ~ "\?wc-api=" ) { return (pass); } # Remove the "has_js" cookie set req.http.Cookie = regsuball(req.http.Cookie, "has_js=[^;]+(; )?", ""); # Remove any Google Analytics based cookies set req.http.Cookie = regsuball(req.http.Cookie, "__utm.=[^;]+(; )?", ""); # Remove any Google Analytics based cookies set req.http.Cookie = regsuball(req.http.Cookie, "__utm.=[^;]+(; )?", ""); set req.http.Cookie = regsuball(req.http.Cookie, "_ga=[^;]+(; )?", ""); set req.http.Cookie = regsuball(req.http.Cookie, "_gat=[^;]+(; )?", ""); set req.http.Cookie = regsuball(req.http.Cookie, "_gid=[^;]+(; )?", ""); set req.http.Cookie = regsuball(req.http.Cookie, "utmctr=[^;]+(; )?", ""); set req.http.Cookie = 
regsuball(req.http.Cookie, "utmcmd.=[^;]+(; )?", ""); set req.http.Cookie = regsuball(req.http.Cookie, "utmccn.=[^;]+(; )?", ""); # Remove the Disqus cookie set req.http.Cookie = regsuball(req.http.Cookie, "(^|;\s*)(disqus_unique)=[^;]*", ""); # Remove the Quant Capital cookies (added by some plugin, all __qca) set req.http.Cookie = regsuball(req.http.Cookie, "__qc.=[^;]+(; )?", ""); # Remove the wp-settings-1 cookie set req.http.Cookie = regsuball(req.http.Cookie, "wp-settings-1=[^;]+(; )?", ""); # Remove the wp-settings-time-1 cookie set req.http.Cookie = regsuball(req.http.Cookie, "wp-settings-time-1=[^;]+(; )?", ""); # Remove the wp test cookie set req.http.Cookie = regsuball(req.http.Cookie, "wordpress_test_cookie=[^;]+(; )?", ""); # Are there cookies left with only spaces or that are empty? if (req.http.cookie ~ "^ *$") { unset req.http.cookie; } if (!(req.url ~ "(wp-login|wp-admin|cart|my-account|checkout|addons|wordpress-social-login|wp-login\.php|forumPM|members)")) { unset req.http.cookie; } # Cache all static files by Removing all cookies for static files if (req.url ~ "^[^?]*\.(bmp|bz2|css|doc|eot|flv|gif|ico|jpeg|jpg|js|less|pdf|png|rtf|swf|txt|woff|xml)(\?.*)?$") { unset req.http.Cookie; return (hash); } # Check the cookies for wordpress-specific items if (req.http.Cookie ~ "wordpress_" || req.http.Cookie ~ "comment_") { return (pass); } if (!req.http.cookie) { unset req.http.cookie; } # Ban outside access to wp-admin #if (req.url ~ "wp-(login|admin)" && !client.ip ~ purge) { # error 403 "Forbidden"; #} # Cache all others requests # --- End of Wordpress specific configuration # Normalize Accept-Encoding header and compression # https://www.varnish-cache.org/docs/3.0/tutorial/vary.html if (req.http.Accept-Encoding) { # Do no compress compressed files... 
if (req.url ~ "\.(jpg|png|gif|gz|tgz|bz2|tbz|mp3|ogg)$") { unset req.http.Accept-Encoding; } elsif (req.http.Accept-Encoding ~ "gzip") { set req.http.Accept-Encoding = "gzip"; } elsif (req.http.Accept-Encoding ~ "deflate") { set req.http.Accept-Encoding = "deflate"; } else { unset req.http.Accept-Encoding; } } # Large static files should be piped, so they are delivered directly to the end-user without # waiting for Varnish to fully read the file first. # TODO: once the Varnish Streaming branch merges with the master branch, use streaming here to avoid locking. if (req.url ~ "^[^?]*\.(mp[34]|rar|tar|tgz|gz|wav|zip)(\?.*)?$") { unset req.http.Cookie; return (pipe); } # Do not cache HTTP authentication and HTTP Cookie if (req.http.Authorization || req.http.Cookie) { return (pass); } # Exclude caching Ajax requests if (req.http.X-Requested-With == "XMLHttpRequest") { return(pass); } # Cache all others requests return (hash); } sub vcl_pipe { # Note that only the first request to the backend will have # X-Forwarded-For set. If you use X-Forwarded-For and want to # have it set for all requests, make sure to have: # set bereq.http.connection = "close"; # here. It is not set by default as it might break some broken web # applications, like IIS with NTLM authentication. 
set bereq.http.Connection = "Close"; return (pipe); } # The data on which the hashing will take place sub vcl_hash { hash_data(req.url); if (req.http.host) { hash_data(req.http.host); } else { hash_data(server.ip); } # If the client supports compression, keep that in a different cache if (req.http.Accept-Encoding) { hash_data(req.http.Accept-Encoding); } return (lookup); } sub vcl_hit { # Allow purges if (req.method == "PURGE") { #purge; return (synth(200, "Purged Hit")); } return (deliver); } sub vcl_miss { # Allow purges if (req.method == "PURGE") { #purge; return (synth(200, "Purged Miss")); } return (fetch); } # This function is used when a request is sent by our backend (Nginx server) sub vcl_backend_response { set beresp.ttl = 1800s; # Cache static files if (bereq.url ~ "^[^?]*\.(bmp|bz2|css|doc|eot|flv|gif|ico|jpeg|jpg|js|less|mp[34]|pdf|png|rar|rtf|swf|tar|tgz|txt|wav|woff|xml|zip)(\?.*)?$") { unset beresp.http.set-cookie; } return (deliver); } # The routine when we deliver the HTTP request to the user # Last chance to modify headers that are sent to the client sub vcl_deliver { if (obj.hits > 0) { set resp.http.X-Cache = "cached"; } else { set resp.http.X-Cache = "uncached"; } # Remove some headers: PHP version unset resp.http.X-Powered-By; # Remove some headers: Apache version & OS unset resp.http.Server; # Remove Varnish version unset resp.http.X-Varnish; unset resp.http.Via; # Remove Google ModPageSpeed unset resp.http.X-Mod-Pagespeed; return (deliver); } From guillaume at varnish-software.com Tue Mar 31 21:56:18 2020 From: guillaume at varnish-software.com (Guillaume Quintard) Date: Tue, 31 Mar 2020 14:56:18 -0700 Subject: Varnish not respecting pass to backend for specified http hosts In-Reply-To: References: Message-ID: Hi, I think there is a bit of confusion regarding what "return(pass)" does. Basically, when a request comes in, you need to answer two questions: - do I want to cache it? - if I need data from the backend, well, what is my backend? 
"return(pass)" answers the first question with a big fat "no", and this is what you see in your log ("VCL_call PASS" and "VCL_return pass"), the request will be fetched from the backend, but won't be stored in cache. You then need to decide where to fetch the data from, but in your vcl, you only have one backend, so everything comes from the same backend. What you want is to create a second backend, and do if ((req.http.host ~ "(domain.com) || (dev.domain.com)")) { set req.backend_hint = default; } else { set req.backend_hint = other; return (pass); } Hopefully that will clarify things a bit. Out of curiosity, are you using the official varnish image ( https://hub.docker.com/_/varnish) or something else? Cheers, -- Guillaume Quintard On Tue, Mar 31, 2020 at 2:13 PM Stephen Reese wrote: > Running Varnish 6.0.6 in a Docker container for several Wordpress sites. I > have several domains that I would like to pass to the backend verse having > them cached, but the configuration is not behaving as intended. Looking to > understand why I am unable to specify which sites I would like to pass. If > I do something like: > > if ((req.http.host ~ "(domain.com) || (dev.domain.com)")) { > set req.backend_hint = default; > } else { > return (pass); > } > > then every hostname's content is cached where I would expect only the two > specified domains to cache, and everything not defined, to pass. Also, if I > do not specify the configuration, all sites are cached (as expected). If I > use something like: > > if ((req.http.host ~ "(domain.com) || (dev.domain.com)")) { > return (pass); > } > > then no sites are cached where I would expect everything to cache except > for the two domains. What might be causing this behavior?
I looked at the > requests with varnishlog, the undefined domains are indeed being fetched > from the backend verse being cached: > > - VCL_call RECV > - VCL_acl NO_MATCH forbidden > - VCL_return pass > - VCL_call HASH > - VCL_return lookup > - VCL_call PASS > - VCL_return fetch > > Varnish configuration is attached. > > _______________________________________________ > varnish-misc mailing list > varnish-misc at varnish-cache.org > https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dridi at varni.sh Tue Mar 31 22:06:05 2020 From: dridi at varni.sh (Dridi Boukelmoune) Date: Tue, 31 Mar 2020 22:06:05 +0000 Subject: Varnish not respecting pass to backend for specified http hosts In-Reply-To: References: Message-ID: On Tue, Mar 31, 2020 at 9:58 PM Guillaume Quintard wrote: > > Hi, > > I think there is a bit of confusion regarding what "return(pass)" does. > > Basically, when a request comes in, you need to answer two questions: > - do I want to cache it? > - if I need data from the backend, well, what is my backend? > > "return(pass)" answers the first question with a big fat "no", and this is what you see in your log ("VCL_call PASS" and "VCL_return pass"), the request will be fetched backend, but won't be stored in cache. > > You then need to decide where to fetch the data from, but in your vcl, you only have one backend, so everything comes from the same backend, what you want is to create a second backend, and do > > if ((req.http.host ~ "(domain.com) || (dev.domain.com)")) { No matter how you look at it this if statement is broken. You can either go for this: > if (req.http.host ~ "^(dev\.)?domain\.com") Or you can do this: > if (req.http.host == "domain.com" || req.http.host == "dev.domain.com") The ~ operator in this context matches regular expressions, and your regex doesn't make sense. 
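The broken pattern can be checked with any PCRE-style engine; Python's re module is used here as a stand-in for VCL's ~ matching. The "||" leaves an empty alternative between the two pipes, and an empty branch matches every input, which is why the condition was true for every Host header (the unescaped dots are a second, smaller problem).

```python
import re

broken = "(domain.com) || (dev.domain.com)"
# "a||b" is alternation with an EMPTY middle branch; the empty branch
# matches any string at position 0, so this pattern matches every host.
assert re.search(broken, "evil.example.org") is not None

fixed = r"^(dev\.)?domain\.com"
assert re.search(fixed, "domain.com") is not None
assert re.search(fixed, "dev.domain.com") is not None
assert re.search(fixed, "evil.example.org") is None
```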
Dridi > set req.backend_hint = default; > } else { > set req.backend_hint = other; > return (pass); > } > > > Hopefully that will clarify things a bit. > > Out of curiosity, are you using the official varnish image (https://hub.docker.com/_/varnish) or something else? > > Cheers, > > -- > Guillaume Quintard From rsreese at gmail.com Tue Mar 31 22:53:11 2020 From: rsreese at gmail.com (Stephen Reese) Date: Tue, 31 Mar 2020 18:53:11 -0400 Subject: Varnish not respecting pass to backend for specified http hosts In-Reply-To: References: Message-ID: On Tue, Mar 31, 2020 at 5:56 PM Guillaume Quintard wrote: > Basically, when a request comes in, you need to answer two questions: > - do I want to cache it? > - if I need data from the backend, well, what is my backend? > > "return(pass)" answers the first question with a big fat "no", and this is what you see in your log ("VCL_call PASS" and "VCL_return pass"), the request will be fetched backend, but won't be stored in cache. > > You then need to decide where to fetch the data from, but in your vcl, you only have one backend, so everything comes from the same backend, what you want is to create a second backend, and do This cleared things up, the following seems to work in that it will not cache requests for the two domains but appears to cache everything I would like to: if ( req.http.host == "domain.com" || req.http.host == "dev.domain.com") { return (pass); } else { set req.backend_hint = default; } Or since I only have one backend: if ( req.http.host == "domain.com" || req.http.host == "dev.domain.com") { return (pass); } > Out of curiosity, are you using the official varnish image (https://hub.docker.com/_/varnish) or something else? I am using the official Varnish image from the aforementioned link, works great! Nginx -> Varnish -> Nginx -> PHP-FPM. With fairly heavy Wordpress theme and plugins, the site sees ~1-3k ms DOMContentLoaded without Varnish and ~300ms with!
From rsreese at gmail.com Tue Mar 31 22:54:52 2020 From: rsreese at gmail.com (Stephen Reese) Date: Tue, 31 Mar 2020 18:54:52 -0400 Subject: Varnish not respecting pass to backend for specified http hosts In-Reply-To: References: Message-ID: On Tue, Mar 31, 2020 at 6:06 PM Dridi Boukelmoune wrote: > > if ((req.http.host ~ "(domain.com) || (dev.domain.com)")) { > > No matter how you look at it this if statement is broken. > > You can either go for this: > > > if (req.http.host ~ "^(dev\.)?domain\.com") > > Or you can do this: > > > if (req.http.host == "domain.com" || req.http.host == "dev.domain.com") > > The ~ operator in this context matches regular expressions, and your > regex doesn't make sense. Thanks for clearing this up!
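The resolution of the thread boils down to two independent decisions made in vcl_recv, which can be modeled as a toy function (a Python stand-in, illustrative only; the host names are the example ones used in the thread, and "default" is the single backend from the posted VCL).

```python
def handle_request(host):
    """Model the two independent vcl_recv decisions:
    (1) whether the response may be cached, (2) which backend serves it.

    return(pass) only answers the caching question with "no"; the fetch
    still happens, and with a single backend it always goes to the same
    place, cached or not.
    """
    cacheable = host not in ("domain.com", "dev.domain.com")  # return(pass) for these
    backend = "default"  # only one backend is defined, so every fetch uses it
    return cacheable, backend
```

This makes explicit why forcing req.backend_hint = default for the two domains had no visible effect: backend selection was never the variable, only the caching decision was.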