Understanding 503s

Maninder Singh mandys at gmail.com
Fri Apr 23 07:02:38 UTC 2021


I finally figured out why this was happening.

Hope this helps someone.

We were running php-fpm and had the following configuration.

pm = dynamic
pm.max_children = 166
pm.start_servers = 16
pm.min_spare_servers = 8
pm.max_spare_servers = 16

This worked fine under the usual load.
But whenever there was a traffic spike, we saw an increase in 503s.

This was because pm.start_servers was set to 16.

php-fpm takes a second or so to spawn more processes, and during that
time we saw 503s.
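
To illustrate why the ramp-up can lag behind a spike, here is a rough
Python sketch. It assumes the process manager checks the pool about once
per second, forks one worker on the first check, doubles that rate on each
later check up to a cap of 32, and never forks more than
pm.min_spare_servers workers in a single check. Those rules, the cap, and
the function name seconds_to_cover are assumptions made for illustration,
not a statement of what any particular php-fpm version does.

```python
def seconds_to_cover(demand, start=16, min_spare=8, max_children=166, max_rate=32):
    """Count checks (assumed ~1 per second) until `demand` concurrent
    requests each have a worker, or the pool hits max_children.

    All growth rules here are assumptions for illustration only.
    """
    running = start   # workers currently alive
    rate = 1          # workers the manager may fork on the next check
    ticks = 0         # number of checks elapsed
    while running < min(demand, max_children):
        # Every worker is busy in this loop, so idle workers = 0 and the
        # shortfall against min_spare is min_spare itself.
        fork = min(rate, min_spare, max_children - running)
        running += fork
        rate = min(rate * 2, max_rate)  # ramp the rate up, capped
        ticks += 1
    return ticks


# Defaults mirror the dynamic configuration quoted above.
print(seconds_to_cover(100))
```

Under these assumptions the pool needs many checks, not one, to grow from
16 workers to 100, and requests that arrive in the meantime have to wait
or fail.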

For a high-traffic site (like ours), we had to change this to:
pm = static
pm.max_children = 125

We chose the above values with our available RAM in mind.
They will be different for other setups.
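
A rough sketch of that sizing arithmetic in Python follows. The figures
are hypothetical (they are not the real numbers from this server), and the
function name max_children_for is made up for the example. The idea is
simply: RAM left over after everything else, divided by the memory one
php-fpm worker typically uses.

```python
def max_children_for(total_ram_mb, reserved_mb, per_child_mb):
    """Estimate how many php-fpm workers fit in RAM.

    total_ram_mb: total RAM on the machine
    reserved_mb:  RAM set aside for the OS, Apache, Varnish, etc.
    per_child_mb: average resident memory of one php-fpm worker
    """
    if per_child_mb <= 0:
        raise ValueError("per_child_mb must be positive")
    usable = total_ram_mb - reserved_mb
    # Never return a negative count if the reservation exceeds total RAM.
    return max(usable // per_child_mb, 0)


# Hypothetical figures: 16 GB machine, 4 GB reserved, ~96 MB per worker.
print(max_children_for(16384, 4096, 96))  # prints 128
```

Measure the per-worker memory on your own machine under real load before
picking a value, and leave some headroom below the result.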

Now we no longer see any 503s, as the server is already prepared to
handle more connections.

Thanks,

On Thu, 15 Apr 2021 at 14:03, Maninder Singh <mandys at gmail.com> wrote:

> Apache runs on port 8080 and is not open to the outside world.
> All requests are routed through varnish but then not all requests are
> cached.
>
> I guess in that case, varnish becomes the only client for apache.
> So, I should increase the KeepAliveTimeout.
>
>
>
> On Thu, 15 Apr 2021 at 13:45, Dridi Boukelmoune <dridi at varni.sh> wrote:
>
>> On Thu, Apr 15, 2021 at 7:27 AM Maninder Singh <mandys at gmail.com> wrote:
>> >
>> > Thank you Dridi.
>> > This is very helpful.
>> >
>> > FYI - My apache keepalive is
>> > KeepAliveTimeout 3
>> >
>> > You would suggest increasing this to 5-10 ?
>>
>> If varnish is httpd's only client then increase it to 70s. Varnish
>> will close unused connections after 60s by default, and if it's really
>> really busy that gives a 10s window for the actual shutdown to happen.
>>
>> If there are other direct clients in front of your httpd server, then
>> decrease backend_idle_timeout in varnishd to 2s, but then you will
>> force varnish to establish connections more often. This is already the
>> case of course, but at least that will reduce the risk of reusing a
>> closed connection and failing backend fetches for this reason.
>>
>> > We had lowered the KeepAliveTimeout as the server is a very busy one
>> > and we want to handle many connections.
>>
>> I understand, and there's a good reason to have a low default when you
>> can't trust the clients. It boils down to whether your httpd server is
>> openly accessible to more than just varnish, including potentially
>> malicious clients.
>>
>> Dridi
>>
>