Varnish intermittently returns incomplete images
Batanun B
batanun at hotmail.com
Fri May 8 18:23:08 UTC 2020
Hi,
Well, sure, there are some rather big objects (big for a regular web site, up to maybe 50 MB), but most objects are maybe 10-100 kB. The last image I tried that had intermittent problems was about 700 kB.
Some numbers from varnishstat:
MAIN.uptime: 23+01:28:11
MAIN.n_lru_nuked: 887748
MAIN.n_lru_limited: 459
SMA.s0.c_bytes: 17.59G
SMA.s0.c_freed: 17.49G
SMA.s0.g_bytes: 99.34M
SMA.s0.g_space: 607.18K
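To put these counters in perspective, here is a rough back-of-the-envelope calculation in Python. It is only a sketch: it assumes the K/M/G suffixes are binary multiples, and that g_bytes plus g_space approximates the configured malloc storage size. The helper name uptime_seconds is made up for this example.

```python
# Rough figures derived from the varnishstat values quoted above.
# Assumption: K/M/G are 1024-based multiples.
KIB, MIB, GIB = 1024, 1024**2, 1024**3


def uptime_seconds(text: str) -> int:
    """Convert a 'days+HH:MM:SS' uptime string into seconds."""
    days, clock = text.split("+")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return int(days) * 86400 + hours * 3600 + minutes * 60 + seconds


uptime = uptime_seconds("23+01:28:11")
n_lru_nuked = 887748
c_bytes = 17.59 * GIB   # total bytes ever allocated from the storage
g_bytes = 99.34 * MIB   # bytes currently in use
g_space = 607.18 * KIB  # bytes currently free

# Assumption: in-use plus free bytes approximates the configured cache size.
cache_size = g_bytes + g_space
fill_ratio = g_bytes / cache_size
nukes_per_second = n_lru_nuked / uptime
turnover = c_bytes / cache_size

print(f"approx. cache size: {cache_size / MIB:.1f} MiB")
print(f"fill ratio:         {fill_ratio:.1%}")
print(f"LRU evictions/s:    {nukes_per_second:.2f}")
print(f"cache turnover:     {turnover:.0f}x since start")
```

Under those assumptions the storage is about 100 MiB and more than 99% full, with roughly 0.45 forced evictions per second and the whole cache having been rewritten about 180 times over the 23 days of uptime.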
"n_lru_nuked" seems high. Would you recommend a bigger cache in this case?
Below is the output from "varnishadm param.show". I suspect that when we did the initial tweaking (which focused only on the VCL logic, not cache sizes), we glanced at this output when the server had recently been started and didn't have much traffic. Now the server has been running for a while and the traffic has increased (it is still only a test environment, though).
------
accept_filter -
acceptor_sleep_decay 0.9 (default)
acceptor_sleep_incr 0.000 [seconds] (default)
acceptor_sleep_max 0.050 [seconds] (default)
auto_restart on [bool] (default)
backend_idle_timeout 60.000 [seconds] (default)
backend_local_error_holddown 10.000 [seconds] (default)
backend_remote_error_holddown 0.250 [seconds] (default)
ban_cutoff 0 [bans] (default)
ban_dups on [bool] (default)
ban_lurker_age 60.000 [seconds] (default)
ban_lurker_batch 1000 (default)
ban_lurker_holdoff 0.010 [seconds] (default)
ban_lurker_sleep 0.010 [seconds] (default)
between_bytes_timeout 60.000 [seconds] (default)
cc_command exec gcc -g -O2 -fdebug-prefix-map=/build/varnish-ZKkrdt/varnish-6.0.6=. -fstack-protector-strong -Wformat -Werror=format-security -Wall -Werror -Wno-error=unused-result -pthread -fpic -shared -Wl,-x -o %o %s (default)
cli_limit 48k [bytes] (default)
cli_timeout 60.000 [seconds] (default)
clock_skew 10 [seconds] (default)
clock_step 1.000 [seconds] (default)
connect_timeout 3.500 [seconds] (default)
critbit_cooloff 180.000 [seconds] (default)
debug none (default)
default_grace 10.000 [seconds] (default)
default_keep 0.000 [seconds] (default)
default_ttl 120.000 [seconds] (default)
esi_iovs 10 [struct iovec] (default)
feature none (default)
fetch_chunksize 16k [bytes] (default)
fetch_maxchunksize 0.25G [bytes] (default)
first_byte_timeout 60.000 [seconds] (default)
gzip_buffer 32k [bytes] (default)
gzip_level 6 (default)
gzip_memlevel 8 (default)
h2_header_table_size 4k [bytes] (default)
h2_initial_window_size 65535b [bytes] (default)
h2_max_concurrent_streams 100 [streams] (default)
h2_max_frame_size 16k [bytes] (default)
h2_max_header_list_size 2147483647b [bytes] (default)
h2_rx_window_increment 1M [bytes] (default)
h2_rx_window_low_water 10M [bytes] (default)
http_gzip_support on [bool] (default)
http_max_hdr 64 [header lines] (default)
http_range_support on [bool] (default)
http_req_hdr_len 8k [bytes] (default)
http_req_size 32k [bytes] (default)
http_resp_hdr_len 8k [bytes] (default)
http_resp_size 32k [bytes] (default)
idle_send_timeout 60.000 [seconds] (default)
listen_depth 1024 [connections] (default)
lru_interval 2.000 [seconds] (default)
max_esi_depth 5 [levels] (default)
max_restarts 4 [restarts] (default)
max_retries 4 [retries] (default)
nuke_limit 50 [allocations] (default)
pcre_match_limit 10000 (default)
pcre_match_limit_recursion 20 (default)
ping_interval 3 [seconds] (default)
pipe_timeout 60.000 [seconds] (default)
pool_req 10,100,10 (default)
pool_sess 10,100,10 (default)
pool_vbo 10,100,10 (default)
prefer_ipv6 off [bool] (default)
rush_exponent 3 [requests per request] (default)
send_timeout 600.000 [seconds] (default)
shm_reclen 255b [bytes] (default)
shortlived 10.000 [seconds] (default)
sigsegv_handler on [bool] (default)
syslog_cli_traffic on [bool] (default)
tcp_fastopen off [bool] (default)
tcp_keepalive_intvl 75.000 [seconds] (default)
tcp_keepalive_probes 9 [probes] (default)
tcp_keepalive_time 7200.000 [seconds] (default)
thread_pool_add_delay 0.000 [seconds] (default)
thread_pool_destroy_delay 1.000 [seconds] (default)
thread_pool_fail_delay 0.200 [seconds] (default)
thread_pool_max 5000 [threads] (default)
thread_pool_min 100 [threads] (default)
thread_pool_reserve 0 [threads] (default)
thread_pool_stack 48k [bytes] (default)
thread_pool_timeout 300.000 [seconds] (default)
thread_pool_watchdog 60.000 [seconds] (default)
thread_pools 2 [pools] (default)
thread_queue_limit 20 (default)
thread_stats_rate 10 [requests] (default)
timeout_idle 5.000 [seconds] (default)
timeout_linger 0.050 [seconds] (default)
vcc_allow_inline_c off [bool] (default)
vcc_err_unref on [bool] (default)
vcc_unsafe_path on [bool] (default)
vcl_cooldown 600.000 [seconds] (default)
vcl_dir /etc/varnish:/usr/share/varnish/vcl (default)
vcl_path /etc/varnish:/usr/share/varnish/vcl (default)
vmod_dir /usr/lib/varnish/vmods (default)
vmod_path /usr/lib/varnish/vmods (default)
vsl_buffer 4k [bytes] (default)
vsl_mask -ObjProtocol,-ObjStatus,-ObjReason,-ObjHeader,-VCL_trace,-WorkThread,-Hash,-VfpAcct,-H2RxHdr,-H2RxBody,-H2TxHdr,-H2TxBody (default)
vsl_reclen 255b [bytes] (default)
vsl_space 80M [bytes] (default)
vsm_free_cooldown 60.000 [seconds] (default)
vsm_space 1M [bytes] (default)
workspace_backend 64k [bytes] (default)
workspace_client 64k [bytes] (default)
workspace_session 0.50k [bytes] (default)
workspace_thread 2k [bytes] (default)
------
________________________________
From: Guillaume Quintard <guillaume at varnish-software.com>
Sent: Friday, May 8, 2020 7:34 PM
To: Batanun B <batanun at hotmail.com>
Cc: varnish-misc at varnish-cache.org <varnish-misc at varnish-cache.org>
Subject: Re: Varnish intermittently returns incomplete images
Hi,
Do you have objects in your cache that are noticeably smaller than your images?
What you are describing sounds like an LRU failure (check nuke_limit in "varnishadm param.show"). Basically, on a miss, Varnish couldn't evict enough objects to make room for the new object, so it had to truncate it and throw it away.
If that's the issue, you can increase nuke_limit, or get a bigger cache, or segregate small and large objects into different storages.
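
To illustrate why nuke_limit matters when small and large objects share one storage, here is a toy model in Python. It is a deliberate simplification and not Varnish's actual allocator (Varnish allocates storage in chunks while the body streams in). The ToyCache class and the filled_cache helper are hypothetical names, and the sizes are taken from the numbers mentioned in this thread.

```python
from collections import OrderedDict


class ToyCache:
    """Toy LRU store with a cap on evictions per insert (not Varnish's real allocator)."""

    def __init__(self, capacity: int, nuke_limit: int):
        self.capacity = capacity
        self.nuke_limit = nuke_limit
        self.used = 0
        self.objects = OrderedDict()  # key -> size, least recently used first

    def insert(self, key: str, size: int) -> bool:
        """Try to store an object, evicting at most nuke_limit LRU entries."""
        nuked = 0
        while self.capacity - self.used < size:
            if nuked >= self.nuke_limit or not self.objects:
                return False  # gave up: could not free enough room
            _, freed = self.objects.popitem(last=False)
            self.used -= freed
            nuked += 1
        self.objects[key] = size
        self.used += size
        return True


def filled_cache(nuke_limit: int) -> ToyCache:
    """Build a 100 MiB cache completely filled with 10 KiB objects."""
    cache = ToyCache(capacity=100 * 1024 * 1024, nuke_limit=nuke_limit)
    small = 10 * 1024
    for i in range(cache.capacity // small):
        cache.insert(f"small-{i}", small)
    return cache


image_size = 700 * 1024

# With nuke_limit=50, evicting fifty 10 KiB objects frees only 500 KiB.
print(filled_cache(50).insert("image", image_size))
# With nuke_limit=100, the 70 evictions that are needed stay within the limit.
print(filled_cache(100).insert("image", image_size))
```

In this model the 700 kB image fails to fit with the default nuke_limit of 50 and fits once the limit is raised to 100, which matches the intuition behind the three options above: allow more evictions, make the storage bigger, or keep small objects out of the storage that holds the large ones.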
--
Guillaume Quintard
On Fri, May 8, 2020 at 10:14 AM Batanun B <batanun at hotmail.com> wrote:
Our Varnish (test environment) intermittently returns incomplete images. So the binary content is not complete. When requesting the image from the backend directly (using curl), the complete image is returned every time (I tested 1000 times using a script).
This happens intermittently. Sometimes Varnish returns the complete image, sometimes half of it, sometimes 20% etc... The incomplete image is returned quickly, so I don't think there is a timeout involved (we have not configured any specific timeout in varnish).
I see nothing special in varnishlog when this happens. But I don't know how to troubleshoot this in a good way. Any suggestions?