[Varnish] #1375: Varnish performance appears to be impacted by the presence of many vary headers
Varnish
varnish-bugs at varnish-cache.org
Tue Nov 19 15:02:34 CET 2013
----------------------+----------------------
Reporter: closer01 | Type: defect
Status: new | Priority: normal
Milestone: | Component: varnishd
Version: 3.0.4 | Severity: normal
Keywords: |
----------------------+----------------------
Varnish performance appears to be significantly impacted by the presence
of many vary headers
== Background ==
We use Varnish as our primary caching layer on a large platform.
We've seen unpredictable behaviour in Varnish in front of web applications,
characterised by long response times. Having investigated the response
times of our backends and having run a number of load tests, we believe we
have isolated the problem to Varnish's caching behaviour around vary
headers. Our own applications vary on a number of headers, each with a
number of variations.
We believe that Varnish demonstrates a significant performance decrease
when responses vary on a high number of headers, and a substantial
performance decrease when varying on a moderate number of headers with
many variations.
== Investigation ==
We've been able to reproduce this problem outside of our platform
environment on both Varnish 2 and Varnish 3 when running vanilla VCL.
The scenarios detailed below show:
- A page that varies on a high number of headers, with only one variation
per header
- A page that varies on one header, with a high number of variations
- A page that varies on a number of headers, each with a number of
variations
We ran our scenarios against both Varnish 2.0.1 and Varnish 3.0.4 for
comparison.
We've included data for Varnish 2.0.1 where we feel it's of interest, but
we are raising this as an issue in Varnish 3.0.4.
=== Testing Environment ===
We used AWS as our testing environment. Further information on our
instance sizes is documented here:
http://aws.amazon.com/ec2/instance-types/instance-details/
==== Varnish setup ====
- 1 64-bit, 'General Purpose' m1.medium instance
- RHEL 5.10
- A simple demo Node.js application as a backend, running locally
==== Loadtest setup ====
- 1 64-bit, 'General Purpose' m1.medium instance
- Load generated by JMeter against a single endpoint with immediate ramp-up
- Test duration: 3 minutes
- "Concurrent users": 800
Both our Varnish VM and load testing VM ran in the same availability zone
and should not be subject to high network latency.
=== Scenarios ===
==== Many Headers, 1 Variation per Header (Ref: Horizontal) ====
- Page varies on 400 headers
- Each header has only one value
- Each request supplies one randomly chosen header out of the 400
==== Many Variations, 1 Header (Ref: Vertical) ====
- Page varies on 1 header
- This 1 header has 400 variations
- Each request supplies a random value between 1 and 400 for this single
header
==== Many Variations, Many Headers (Ref: Diagonal) ====
- Page varies on 20 headers
- Each of those headers varies on 20 values
- Each request supplies a random value between 1 and 20 for one randomly
chosen header from the 20
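
As an illustration only, the three scenarios above could be sketched in Python as follows. The header names (`X-Test-Header-N`) and the helper functions are hypothetical; the ticket does not say what names the demo backend or the JMeter plan actually used.

```python
import random

# Scenario name -> (headers the page varies on, values per header),
# taken from the scenario descriptions in the ticket.
SCENARIOS = {
    "horizontal": (400, 1),
    "vertical": (1, 400),
    "diagonal": (20, 20),
}


def header_name(index):
    # Hypothetical naming scheme; the real header names are not given.
    return f"X-Test-Header-{index}"


def vary_value(scenario):
    """Vary response header value a backend would send for a scenario."""
    n_headers, _ = SCENARIOS[scenario]
    return ", ".join(header_name(i) for i in range(1, n_headers + 1))


def all_variants(scenario):
    """Every distinct (header, value) pair a request can supply."""
    n_headers, n_values = SCENARIOS[scenario]
    return [
        (header_name(h), str(v))
        for h in range(1, n_headers + 1)
        for v in range(1, n_values + 1)
    ]


def request_headers(scenario, rng=random):
    """Pick one header at random and give it one random value."""
    n_headers, n_values = SCENARIOS[scenario]
    name = header_name(rng.randint(1, n_headers))
    value = str(rng.randint(1, n_values))
    return {name: value}


if __name__ == "__main__":
    for name in SCENARIOS:
        print(name, len(all_variants(name)), request_headers(name))
```

Under these assumptions each scenario has 400 distinct single-header requests (400 x 1, 1 x 400 and 20 x 20).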
=== Results ===
Attached are graphs showing response times over time for each Varnish
3.0.4 scenario.
==== Response Times ====
v2 - Varnish 2.0.1,
v3 - Varnish 3.0.4
|| Varnish || Scenario || Min (ms) || Mean (ms) || Max (ms) || Standard Deviation (%) || Successful Requests (%) || Throughput (req/sec) ||
|| v2 || Horizontal || 2 || 3948 || 19763 || 3635.8 || 56.68 || 174.2 ||
|| v3 || Horizontal || 3 || 14546 || 139447 || 21897.94 || 95.49 || 28.7 ||
|| || || || || || || || ||
|| v2 || Diagonal || 1 || 385 || 1877 || 220.17 || 100 || 219.0 ||
|| v3 || Diagonal || 1 || 374 || 2520 || 214.65 || 100 || 221.0 ||
|| || || || || || || || ||
|| v2 || Vertical || 1 || 288 || 1347 || 170.06 || 100 || 297.6 ||
|| v3 || Vertical || 1 || 282 || 1409 || 168.48 || 100 || 298.2 ||
==== Varnish Stats ====
During 'horizontal' testing we observed that both versions of Varnish
appeared to restart themselves frequently due to segfaults. Data recorded
from varnishstat is therefore incomplete for that scenario.
|| Varnish || Scenario || Hits || Misses || Percentage Hits (%) ||
|| v2 || Diagonal || 36772 || 6565 || 84.9 ||
|| v3 || Diagonal || 37094 || 6583 || 84.9 ||
|| || || || || ||
|| v2 || Vertical || 51808 || 6824 || 88.4 ||
|| v3 || Vertical || 52114 || 6823 || 88.4 ||
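
The hit percentages in the table above can be reproduced from the hit and miss counts. The short Python sketch below assumes the percentage is hits / (hits + misses); the ticket does not state the formula, so treat that as an assumption.

```python
# Hit and miss counts copied from the varnishstat table in the ticket.
COUNTERS = {
    ("v2", "Diagonal"): (36772, 6565),
    ("v3", "Diagonal"): (37094, 6583),
    ("v2", "Vertical"): (51808, 6824),
    ("v3", "Vertical"): (52114, 6823),
}


def hit_percentage(hits, misses):
    # Assumed formula: share of cache lookups that were hits.
    return round(100.0 * hits / (hits + misses), 1)


if __name__ == "__main__":
    for (version, scenario), (hits, misses) in COUNTERS.items():
        print(version, scenario, hit_percentage(hits, misses))
```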
=== Conclusions ===
Our 'horizontal' tests exhibit behaviour in Varnish that results in very
long response times.
Compared with our 'vertical' tests, our 'diagonal' tests appear to exhibit
roughly 25% lower throughput, a higher average response time and higher
peaks in response time.
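
The 25% figure appears to come from comparing the 'diagonal' rows of the response time table with the 'vertical' rows. A minimal Python sketch of that comparison follows; the function names are ours, and the numbers are copied from the table.

```python
# Mean and max response times (ms) and throughput (req/sec) from the ticket.
RESULTS = {
    ("v2", "Diagonal"): {"mean_ms": 385, "max_ms": 1877, "throughput": 219.0},
    ("v3", "Diagonal"): {"mean_ms": 374, "max_ms": 2520, "throughput": 221.0},
    ("v2", "Vertical"): {"mean_ms": 288, "max_ms": 1347, "throughput": 297.6},
    ("v3", "Vertical"): {"mean_ms": 282, "max_ms": 1409, "throughput": 298.2},
}


def percent_change(new, base):
    """Signed change of `new` relative to `base`, in percent."""
    return 100.0 * (new - base) / base


def diagonal_vs_vertical(version):
    """Percent change of each 'diagonal' metric relative to 'vertical'."""
    diagonal = RESULTS[(version, "Diagonal")]
    vertical = RESULTS[(version, "Vertical")]
    return {
        metric: round(percent_change(diagonal[metric], vertical[metric]), 1)
        for metric in diagonal
    }


if __name__ == "__main__":
    for version in ("v2", "v3"):
        print(version, diagonal_vs_vertical(version))
```

For both versions the throughput change comes out at roughly -26%, with mean response time about a third higher.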
Whilst the 'horizontal' scenario doesn't correspond to a realistic
application, we believe it demonstrates an extreme case of a problem
within Varnish. We believe we're experiencing a more extreme variant of
the 'diagonal' behaviour on our own platform, and we were able to
reproduce this in our initial load tests against that platform. Hence, we
have reason to believe that Varnish's implementation of caching variations
is the root cause.
To add context, we noticed and became interested in this behaviour because
some of our applications vary on up to 10 headers and were failing to
respond under moderate load. Our backends were responding quickly, but
when we looked through our Varnish logs, requests appeared to take a very
long time within Varnish itself.
==== Other Observations ====
In the scenarios above, an equal number of variations (400) should be
present in each scenario, so we'd expect to see reasonably consistent
behaviour across all three. For the 'diagonal' scenario, where data is
available, our high hit ratio suggests that requests to the backend should
have been few and, given that our application was running locally, we are
unable to attribute the lower throughput to network latency.
With respect to the 'horizontal' scenario, the presence of segfaults
may also suggest that Varnish struggles to cope with pages that vary on
many headers.
== A Solution ==
We're not able to present any solution or patch to improve this behaviour,
although we have observed changes made in this area of the code
previously:
https://github.com/varnish/Varnish-Cache/commit/7bc0068d8f422c917042e35867e00a19f8956f46
=== Attachments ===
Graphs for Varnish 3.0.4 test results:
- Horizontal.jpg
- Vertical.jpg
- Diagonal.jpg
--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1375>
Varnish <https://varnish-cache.org/>
The Varnish HTTP Accelerator