Modifying req.url and restart() produces ncsa to log wrong URL resource

Marian Velez marianmvelez at gmail.com
Tue Sep 29 20:08:08 UTC 2020


Hiya all!

I've been trying to implement an abtest engine on my varnish. The problem
I've come across is related to the manipulation of the original req.url
property.

What I'm experiencing is that when setting a new "req.url" value and after
forcing a restart(), the URL that varnishncsa uses to log is just the one
it has set when it re-enters recv flow, but not the one it uses from the
backend or the last req.URL object either (which will even have much more
sense since).

I've made a sample snippet of what I observe just for the sake of
reproducing it and to take away all the abtest logic that may be just
confusing.

vcl 4.1;
backend default {
 .host = "127.0.0.1";
 .port = "8000";
}

sub vcl_recv {
  if (req.restarts == 0) {
    set req.http.X-Original-URL = req.url;
    set req.url = "/this_url_does_not_exist";
    return(restart);
  }
  else {
    #restore original URL
    set req.url = req.http.X-Original-URL;
  }
}

That vcl just sets an alternative req.url and triggers a flow restart.
varnishncsa logs this entry

::1 - - [29/Sep/2020:16:49:08 -0300] "GET
http://localhost:6081/this_url_does_not_exist HTTP/1.1" 200 12 "-"
"curl/7.72.0"


Of course that URL does not exist, but since its using the req.URL object
from which varnish recv flow started, it just keeps that for logging, and
not any later req.URL set.

Here attached is the varnishlog output from a fresh start:

*   << Request  >> 2
-   Begin          req 1 rxreq
-   Timestamp      Start: 1601409664.461391 0.000000 0.000000
-   Timestamp      Req: 1601409664.461391 0.000000 0.000000
-   VCL_use        boot
-   ReqStart       ::1 55180 a0
-   ReqMethod      GET
-   ReqURL         /this_object_exists
-   ReqProtocol    HTTP/1.1
-   ReqHeader      Host: localhost:6081
-   ReqHeader      User-Agent: curl/7.72.0
-   ReqHeader      Accept: */*
-   ReqHeader      X-Forwarded-For: ::1
-   VCL_call       RECV
-   ReqHeader      X-Original-URL: /this_object_exists
-   ReqURL         /this_url_does_not_exist
-   VCL_return     restart
-   VCL_call       HASH
-   VCL_return     lookup
-   Timestamp      Restart: 1601409664.461429 0.000037 0.000037
-   Link           req 3 restart
-   End

*   << BeReq    >> 4
-   Begin          bereq 3 fetch
-   VCL_use        boot
-   Timestamp      Start: 1601409664.461553 0.000000 0.000000
-   BereqMethod    GET
-   BereqURL       /this_object_exists
-   BereqProtocol  HTTP/1.1
-   BereqHeader    User-Agent: curl/7.72.0
-   BereqHeader    Accept: */*
-   BereqHeader    X-Forwarded-For: ::1
-   BereqHeader    host: localhost:6081
-   BereqHeader    Accept-Encoding: gzip
-   BereqHeader    X-Varnish: 4
-   VCL_call       BACKEND_FETCH
-   VCL_return     fetch
-   BackendOpen    26 default 127.0.0.1 8000 127.0.0.1 59480 connect
-   Timestamp      Bereq: 1601409664.462196 0.000642 0.000642
-   Timestamp      Beresp: 1601409664.463115 0.001561 0.000918
-   BerespProtocol HTTP/1.0
-   BerespStatus   200
-   BerespReason   OK
-   BerespHeader   Server: SimpleHTTP/0.6 Python/3.8.6
-   BerespHeader   Date: Tue, 29 Sep 2020 20:01:04 GMT
-   BerespHeader   Content-type: application/octet-stream
-   BerespHeader   Content-Length: 12
-   BerespHeader   Last-Modified: Tue, 29 Sep 2020 19:36:31 GMT
-   TTL            RFC 120 10 0 1601409664 1601409664 1601409664 0 0
cacheable
-   VCL_call       BACKEND_RESPONSE
-   VCL_return     deliver
-   Filters
-   Storage        malloc s0
-   Fetch_Body     3 length stream
-   BackendClose   26 default close
-   Timestamp      BerespBody: 1601409664.463236 0.001682 0.000120
-   Length         12
-   BereqAcct      155 0 155 199 12 211
-   End

*   << Request  >> 3
-   Begin          req 2 restart
-   Timestamp      Start: 1601409664.461429 0.000037 0.000000
-   ReqStart       ::1 55180 a0
-   ReqMethod      GET
-   ReqURL         /this_url_does_not_exist
-   ReqProtocol    HTTP/1.1
-   ReqHeader      Host: localhost:6081
-   ReqHeader      User-Agent: curl/7.72.0
-   ReqHeader      Accept: */*
-   ReqHeader      X-Forwarded-For: ::1
-   ReqHeader      X-Original-URL: /this_object_exists
-   VCL_call       RECV
-   ReqURL         /this_object_exists
-   ReqUnset       X-Original-URL: /this_object_exists
-   ReqUnset       Host: localhost:6081
-   ReqHeader      host: localhost:6081
-   VCL_return     hash
-   VCL_call       HASH
-   VCL_return     lookup
-   VCL_call       MISS
-   VCL_return     fetch
-   Link           bereq 4 fetch
-   Timestamp      Fetch: 1601409664.463212 0.001821 0.001783
-   RespProtocol   HTTP/1.0
-   RespStatus     200
-   RespReason     OK
-   RespHeader     Server: SimpleHTTP/0.6 Python/3.8.6
-   RespHeader     Date: Tue, 29 Sep 2020 20:01:04 GMT
-   RespHeader     Content-type: application/octet-stream
-   RespHeader     Content-Length: 12
-   RespHeader     Last-Modified: Tue, 29 Sep 2020 19:36:31 GMT
-   RespProtocol   HTTP/1.1
-   RespHeader     X-Varnish: 3
-   RespHeader     Age: 0
-   RespHeader     Via: 1.1 varnish (Varnish/6.5)
-   VCL_call       DELIVER
-   VCL_return     deliver
-   Timestamp      Process: 1601409664.463225 0.001833 0.000012
-   Filters
-   RespHeader     Accept-Ranges: bytes
-   RespHeader     Connection: keep-alive
-   Timestamp      Resp: 1601409664.463286 0.001894 0.000061
-   ReqAcct        96 0 96 299 12 311
-   End

*   << Session  >> 1
-   Begin          sess 0 HTTP/1
-   SessOpen       ::1 55180 a0 ::1 6081 1601409664.461310 18
-   Link           req 2 rxreq
-   SessClose      REM_CLOSE 0.002
-   End

As you can see from the FETCH block, what it effectively fetches from
backend is not what varnishncsa logs.
I think it has something to do with the order in which the restart affects
the flow, since its not replacing the req.url object, just appending a new
one. BACKEND FETCH uses the right URL object since it uses the last one,
but varnishncsa logs the first it picked up from that restart.

I've tried setting a blank req.url or unsetting in an attempt to avoid
having several req.URL objects introducing problems, but it seems that
property is readonly since I get errors like "Variable cannot be unset" or
"Setting req.url to empty sting - FAILED".

Unfortunately I have no other way to implement my abtest engine than by
using the restart() order from varnish, but this logging issue is extremely
affecting what I can see to debug my stuff.

Any clue of what trick I could use to workaround this?

Thanks in advance!
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://www.varnish-cache.org/lists/pipermail/varnish-misc/attachments/20200929/64cde78a/attachment.html>


More information about the varnish-misc mailing list