Caching Modified URLs by Varnish instead of the original requested URL

Uday Kumar uday.polu at indiamart.com
Sun Sep 3 15:57:08 UTC 2023


Thanks Guillaume, I'll look into it.

Thanks & Regards
Uday Kumar


On Fri, Sep 1, 2023 at 1:36 AM Guillaume Quintard <
guillaume.quintard at gmail.com> wrote:

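> I'm pretty sure it's lowercasing "\2" correctly. The problem is that you
> want to lowercase the *value* referenced by "\2" instead. std.tolower() is
> evaluated before regsuball() runs, so it only ever sees the literal
> two-character string "\2", which has nothing to lowercase.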
>
> On this, I don't think you have a choice, you need to make that captured
> group its own string, lowercase it, and only then concatenate it. Something
> like:
>
> set req.http.hash-url = regsuball(req.http.hash-url,
> ".*(q=)(.*?)(\&|$).*", "\1") + std.tolower(regsuball(req.http.hash-url,
> ".*(q=)(.*?)(\&|$).*", "\2")) + regsuball(req.http.hash-url,
> ".*(q=)(.*?)(\&|$).*", "\3");
>
> It's disgusting, but eh, we started with regex, so...
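>
> A somewhat more readable variant, as a rough sketch only: pull the value of
> q into a scratch header, lowercase it, then splice it back. The sub name
> lowercase_q and the header x-q-value are made up for illustration. It
> assumes req.http.hash-url was already set from req.url, and that the value
> of q contains no backslashes (regsub would read those as back-references
> in the replacement string):
>
> ```vcl
> import std;
>
> sub lowercase_q {
>     # Only act when a q parameter is present.
>     if (req.http.hash-url ~ "[?&]q=") {
>         # Copy the value of the first q parameter into a scratch header.
>         set req.http.x-q-value = regsub(req.http.hash-url,
>             "^.*?[?&]q=([^&]*).*$", "\1");
>         # Splice the lowercased value back in place of the original one.
>         set req.http.hash-url = regsub(req.http.hash-url,
>             "([?&]q=)[^&]*", "\1" + std.tolower(req.http.x-q-value));
>         # The scratch header must not leak to the backend.
>         unset req.http.x-q-value;
>     }
> }
>
> sub vcl_recv {
>     # Assumes req.http.hash-url was set earlier in vcl_recv.
>     call lowercase_q;
> }
> ```
>
> This keeps the rest of the URL intact and only touches the first q
> parameter.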
>
> Other options include vmod_querystring
> <https://github.com/Dridi/libvmod-querystring/blob/master/src/vmod_querystring.vcc.in>
> (Dridi might possibly be of assistance on this topic) and vmod_urlplus
> <https://docs.varnish-software.com/varnish-enterprise/vmods/urlplus/#query_get> (Varnish
> Enterprise), and the last, and possibly most promising one, vmod_re2
> <https://gitlab.com/uplex/varnish/libvmod-re2/-/blob/master/README.md> which
> would allow you to do something like
>
> if (myset.match(".*(q=)(.*?)(\&|$).*", "\1")) {
>    set req.http.hash-url = myset.matched(1) + std.tolower(myset.matched(2))
> + myset.matched(3);
> }
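>
> Spelled out a bit more, and going only by my reading of the vmod_re2
> README (so please check the method names against the version you install):
> a regex object is created in vcl_init, matched against the string, and the
> captured groups are read back with backref(). The object name q_re is
> hypothetical:
>
> ```vcl
> import std;
> import re2;
>
> sub vcl_init {
>     # Group 1: everything up to and including "q=",
>     # group 2: the value of q, group 3: the rest of the URL.
>     new q_re = re2.regex("^(.*?[?&]q=)([^&]*)(.*)$");
> }
>
> sub vcl_recv {
>     # Assumes req.http.hash-url was set earlier in vcl_recv.
>     if (q_re.match(req.http.hash-url)) {
>         set req.http.hash-url = q_re.backref(1)
>             + std.tolower(q_re.backref(2))
>             + q_re.backref(3);
>     }
> }
> ```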
>
> --
> Guillaume Quintard
>
>
> On Thu, Aug 31, 2023 at 1:03 AM Uday Kumar <uday.polu at indiamart.com>
> wrote:
>
>> Hi Guillaume,
>>
>> In the process of modifying the query string in VCL code, we have a
>> requirement of *lowercasing value of specific parameter*, instead of the *whole
>> query string*
>>
>> *Example Request URL:*
>> /search/ims?q=CRICKET bat&country_code=IN
>>
>> *Requirement:*
>> We have to modify the request URL by lowercasing the value of only the *q*
>> parameter,
>> i.e. /search/ims?q=cricket bat&country_code=IN
>>
>> *For that, we came up with the regex below:*
>> set req.http.hash-url = regsuball(req.http.hash-url, "(q=)(.*?)(\&|$)",
>> "\1"+std.tolower("\2")+"\3");
>>
>> *ISSUE:*
>> std.tolower("\2") in the above statement is *not lowercasing* the
>> string that's captured, but if I test it using std.tolower("SAMPLE"), it
>> lowercases as expected.
>>
>> 1. May I know why it's not lowercasing if *std.tolower("\2") is used*?
>> 2. Also, please provide possible optimal solutions for the same. (using
>> regex)
>>
>> Thanks & Regards
>> Uday Kumar
>>
>>
>> On Wed, Aug 23, 2023 at 12:01 PM Uday Kumar <uday.polu at indiamart.com>
>> wrote:
>>
>>> Hi Guillaume,
>>>
>>> *use includes and function calls*
>>> This is great, thank you so much for your help!
>>>
>>> Thanks & Regards
>>> Uday Kumar
>>>
>>>
>>> On Wed, Aug 23, 2023 at 1:32 AM Guillaume Quintard <
>>> guillaume.quintard at gmail.com> wrote:
>>>
>>>> Hi Uday,
>>>>
>>>> I'm not exactly sure how to read those diagrams, so I apologize if I'm
>>>> missing the mark or if I'm too broad here.
>>>>
>>>> There are a few points I'd like to draw your attention to. The first
>>>> one is that Varnish doesn't cache the request or the URL. The cache is
>>>> essentially a big hashmap/dictionary/database, in which you store the
>>>> response. The request/URL is the key for it, so you need to have it in its
>>>> "final" form before the cache lookup happens.
>>>>
>>>> From what I read, you are not against it, and you just want to sanitize
>>>> the URL in vcl_recv, but you don't like the idea of making the main file
>>>> too unwieldy. If I got that right, then I have a nice answer for you: use
>>>> includes and function calls.
>>>>
>>>> As an example:
>>>>
>>>> # cat /etc/varnish/url.vcl
>>>> sub sanitize_url {
>>>>   # do whatever modifications you need here
>>>> }
>>>>
>>>> # cat /etc/varnish/default.vcl
>>>> include "./url.vcl";
>>>>
>>>> sub vcl_recv {
>>>>   call sanitize_url;
>>>> }
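>>>>
>>>> To make that a little more concrete, here is a minimal sketch of what the
>>>> body could look like for your options.start=0 case. It assumes you want
>>>> Tomcat to keep receiving the original URL, so it normalizes a copy in a
>>>> header (req.http.hash-url, a name picked for illustration) and hashes on
>>>> that copy instead of req.url:
>>>>
>>>> ```vcl
>>>> # /etc/varnish/url.vcl
>>>> sub sanitize_url {
>>>>     # Work on a copy so req.url, and so the backend request, is untouched.
>>>>     set req.http.hash-url = req.url;
>>>>     # Drop options.start=0 wherever it appears in the query string.
>>>>     set req.http.hash-url = regsuball(req.http.hash-url,
>>>>         "([?&])options\.start=0(&|$)", "\1");
>>>>     # Remove a trailing "?" or "&" left behind by the removal.
>>>>     set req.http.hash-url = regsub(req.http.hash-url, "[?&]$", "");
>>>> }
>>>>
>>>> # /etc/varnish/default.vcl, next to the vcl_recv that calls sanitize_url
>>>> sub vcl_hash {
>>>>     # Same as the built-in vcl_hash, but keyed on the normalized copy.
>>>>     hash_data(req.http.hash-url);
>>>>     if (req.http.host) {
>>>>         hash_data(req.http.host);
>>>>     } else {
>>>>         hash_data(server.ip);
>>>>     }
>>>>     # Returning here stops the built-in vcl_hash from also hashing req.url.
>>>>     return (lookup);
>>>> }
>>>>
>>>> sub vcl_backend_fetch {
>>>>     # Keep the helper header out of the request sent to the backend.
>>>>     unset bereq.http.hash-url;
>>>> }
>>>> ```
>>>>
>>>> With this, /search/ims?q=bags&source=android&options.start=0 and
>>>> /search/ims?q=bags&source=android end up under the same cache key.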
>>>>
>>>>
>>>> That should get you going.
>>>>
>>>> Hopefully I didn't miss the mark too much here, let me know if I did.
>>>>
>>>> --
>>>> Guillaume Quintard
>>>>
>>>>
>>>> On Tue, Aug 22, 2023 at 3:45 AM Uday Kumar <uday.polu at indiamart.com>
>>>> wrote:
>>>>
>>>>> Hello All,
>>>>>
>>>>>
>>>>> For our spring boot application, we are using Varnish Caching in a
>>>>> production environment.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Requirement: [To utilize cache effectively]
>>>>>
>>>>> Modify the URL (Removal of unnecessary parameters) while caching the
>>>>> user request, so that the modified URL can be cached by varnish which
>>>>> helps improve cache HITS for similar URLs.
>>>>>
>>>>>
>>>>> For Example:
>>>>>
>>>>> Let's consider the below Request URL
>>>>>
>>>>> URL at time t, 1.
>>>>> samplehost.com/search/ims?q=bags&source=android&options.start=0
>>>>>
>>>>>
>>>>> Our Requirement:
>>>>>
>>>>> To make varnish consider URLs with options.start=0 and without
>>>>> options.start parameter as EQUIVALENT, such that a single cached
>>>>> response(Single Key) can be utilized in both cases.
>>>>>
>>>>>
>>>>> *1st URL after modification:*
>>>>>
>>>>> samplehost.com/search/ims?q=bags&source=android
>>>>>
>>>>>
>>>>> *Cached URL at Varnish:*
>>>>>
>>>>> samplehost.com/search/ims?q=bags&source=android
>>>>>
>>>>>
>>>>>
>>>>> Now, Url at time t+1, 2.
>>>>> samplehost.com/search/ims?q=bags&source=android
>>>>>
>>>>>
>>>>> At present, varnish considers the above URL as different from 1st URL
>>>>> and uses a different key while caching the 2nd URL[So, it will be a
>>>>> miss]
>>>>>
>>>>>
>>>>> *So, URL after Modification:*
>>>>>
>>>>> samplehost.com/search/ims?q=bags&source=android
>>>>>
>>>>>
>>>>> Now, 2nd URL will be a HIT at varnish, effectively utilizing the
>>>>> cache.
>>>>>
>>>>>
>>>>>
>>>>> NOTE:
>>>>>
>>>>> We aim to execute this URL Modification without implementing the
>>>>> logic directly within the default.VCL file. Our intention is to
>>>>> maintain a clean and manageable codebase in the VCL.
>>>>>
>>>>>
>>>>>
>>>>> To address this requirement effectively, we have explored two
>>>>> potential Approaches:
>>>>>
>>>>>
>>>>> Approach-1:
>>>>>
>>>>>
>>>>>
>>>>> Approach-2:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> 1. Please go through the approaches mentioned above and let me know
>>>>> the effective solution.
>>>>>
>>>>> 2. Regarding Approach-2
>>>>>
>>>>> At Step 2:
>>>>>
>>>>> May I know if there is any way to access and execute a custom
>>>>> subroutine from another VCL, for modifying the Request URL? if yes,
>>>>> pls help with details.
>>>>>
>>>>> At Step 3:
>>>>>
>>>>> Tomcat Backend should receive the Original Request URL instead of the
>>>>> Modified URL.
>>>>>
>>>>> 3. Please let us know if there is any better approach that can be
>>>>> implemented.
>>>>>
>>>>>
>>>>>
>>>>> Thanks & Regards
>>>>> Uday Kumar