[Varnish] #1762: VSL API: endless loop and out of memory in vtx_scan() on forced synthetic transactions

Varnish varnish-bugs at varnish-cache.org
Tue Jul 14 18:49:53 CEST 2015


#1762: VSL API: endless loop and out of memory in vtx_scan() on forced synthetic
transactions
----------------------+----------------------------------
 Reporter:  geoff     |       Owner:  slink
     Type:  defect    |      Status:  assigned
 Priority:  normal    |   Milestone:  Varnish 4.0 release
Component:  varnishd  |     Version:  4.0.3
 Severity:  critical  |  Resolution:
 Keywords:            |
----------------------+----------------------------------

Comment (by slink):

 Further investigation of root cause scenarios, done in order to write a
 regression test, resulted in the following insights:

 * the bad vxid must have got into vtx->key.vxid by way of `vtx_parse_link`
 * which is only called for `SLT_Begin` (`vtx_scan_begin()`) and `SLT_Link`
 (`vtx_scan_link()`)

 (actually this was known before, but I am now confident that these are the
 only cases)
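
 To illustrate how a vxid travels from a record payload into the client's
 transaction key, here is a minimal sketch of parsing a link-style payload
 such as "bereq 1234 retry". It is not the actual `vtx_parse_link()` code:
 the struct, the function name `sketch_parse_link()` and the field sizes are
 hypothetical and chosen only for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical container for the three fields of a link-style payload. */
struct sketch_link {
	char     type[8];	/* e.g. "req" or "bereq" */
	uint32_t vxid;		/* id exactly as written by the emitter */
	char     reason[32];	/* e.g. "restart" or "retry" */
};

/*
 * Parse "<type> <vxid> <reason>". Returns 0 on success, -1 if the
 * payload does not contain all three fields. Note that the id is taken
 * verbatim: whatever the emitter printed ends up in out->vxid.
 */
static int
sketch_parse_link(const char *payload, struct sketch_link *out)
{
	unsigned int id;

	if (sscanf(payload, "%7s %u %31s", out->type, &id, out->reason) != 3)
		return (-1);
	out->vxid = (uint32_t)id;
	return (0);
}
```

 The point of the sketch is that the parser trusts the number in the record,
 so an id that is wrong at emission time is also wrong in the key.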

 There is no case in the code as of the 4.0.3 release where `SLT_Begin` is
 emitted with an unmasked vxid, so the issue must have its root cause in an
 `SLT_Link` record.

 In both cases where unmasked vxids are emitted for `SLT_Link`, the id
 comes directly from `VXID_Get()`:
 * `cache_fetch.c`
 {{{
 wid = VXID_Get(&wrk->vxid_pool);
 VSLb(bo->vsl, SLT_Link, "bereq %u retry", wid);
 }}}
 * `cache_req_fsm.c`
 {{{
 wid = VXID_Get(&wrk->vxid_pool);
 // XXX: ReqEnd + ReqAcct ?
 VSLb_ts_req(req, "Restart", W_TIM_real(wrk));
 VSLb(req->vsl, SLT_Link, "req %u restart", wid);
 }}}

 So unless I have overlooked anything significant, the root cause must have
 been a vxid spill, which was fixed with
 0dd8c0b864a9574df0f2891824b4581d0e846613 (master) /
 171f3ac585f2bda639f526c31ad0689aecb8f8b4 (4.0).

 `VXID()` masking would have prevented the issue from surfacing.
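
 As a rough sketch of what masking means here: it assumes that the two top
 bits of the 32-bit identifier word are reserved as client/backend markers
 and that only the lower 30 bits carry the vxid. The mask value is modeled on
 `VSL_IDENTMASK` and should be checked against the headers of the release in
 question. The names `SKETCH_IDENTMASK`, `sketch_vxid()` and
 `sketch_has_spilled()` are hypothetical.

```c
#include <stdint.h>

/* Assumption: bits 30 and 31 are marker bits, bits 0..29 are the vxid. */
#define SKETCH_IDENTMASK	(~(3U << 30))

/* Strip the marker bits, keeping only the identifier part. */
static inline uint32_t
sketch_vxid(uint32_t u)
{
	return (u & SKETCH_IDENTMASK);
}

/* Non-zero if a raw id has grown into the marker bits. */
static inline int
sketch_has_spilled(uint32_t u)
{
	return ((u & ~SKETCH_IDENTMASK) != 0);
}
```

 Under these assumptions a raw id of 0x40000001 has spilled into a marker
 bit, and `sketch_vxid()` reduces it to 1. An emitter that prints the raw
 value and a reader that only ever sees masked ids would then disagree about
 which transaction is meant.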

 This insight is consistent with two observations:
 * the issue only surfaced after `varnishd` had been running for longer
 periods of time
 * the issue didn't go away after a restart of the vsl client; a `varnishd`
 restart was required

 This gives confidence that the issue has really been understood completely
 and that the root cause has been fixed.

-- 
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1762#comment:9>
Varnish <https://varnish-cache.org/>
The Varnish HTTP Accelerator


