100% CPU IOwait

Tue Nov 3 17:15:35 CET 2015

Hi Per,

Correction: at the time the machine was running Ubuntu, not Amazon Linux.

We didn't have proper monitoring at the time. Afterwards we did some load
testing with the request URLs, but we couldn't reproduce the error. The
other machine had this error line in the syslog:

varnishd[22217]: segfault at 18 ip 00007f061ea62565 sp 00007ef8a87e8170
error 4 in libjemalloc.so.1[7f061ea57000+30000]

Similar to this:
https://bugs.launchpad.net/ubuntu/+source/jemalloc/+bug/1333581
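
For anyone reading the log line above, here is a minimal sketch (Python, not
part of the original report) of how its fields can be decoded. It assumes the
usual x86 Linux segfault message layout, where the error value is the
hexadecimal page-fault error code (bit 0: page present, bit 1: write access,
bit 2: user mode) and the bracketed pair is the library's mapping base and
size. The name decode_segfault is made up for illustration.

```python
import re

# The syslog line quoted above, joined back onto a single line.
LINE = ("varnishd[22217]: segfault at 18 ip 00007f061ea62565 "
        "sp 00007ef8a87e8170 error 4 in libjemalloc.so.1[7f061ea57000+30000]")

# Assumed layout of the kernel's segfault message; all numbers are hex.
PATTERN = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<lib>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Return the decoded fields of one segfault line as a dict."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a recognised segfault line")
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    err = int(m["err"], 16)
    return {
        "fault_address": int(m["addr"], 16),
        "library": m["lib"],
        # Offset of the faulting instruction inside the mapped library.
        "offset_in_library": ip - base,
        # Page-fault error code bits (assumption: x86 encoding).
        "page_present": bool(err & 1),
        "write_access": bool(err & 2),
        "user_mode": bool(err & 4),
    }


if __name__ == "__main__":
    for key, value in decode_segfault(LINE).items():
        print(key, hex(value) if isinstance(value, int) and not isinstance(value, bool) else value)
```

Under those assumptions the line describes a user-mode read of an unmapped
address close to zero (0x18) from code inside libjemalloc.so.1. The computed
offset could be handed to a symbolizer such as addr2line, provided debug
symbols for that exact library build are available.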

Hi Paul,

Would the long TTL also apply to grace? Would sizing down help the cache evict?
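
To make the sizing question concrete, here is a rough back-of-the-envelope
sketch in Python. The ttl, grace and malloc size are the values from this
thread. The miss rate, average object size and the ~1 KB per-object overhead
are hypothetical placeholders, not measurements. It also assumes that an
object can stay in storage for ttl + grace (+ keep) unless it is evicted
earlier, and that every miss inserts a new object.

```python
from datetime import timedelta

# Values taken from the thread.
TTL = timedelta(minutes=11)
GRACE = timedelta(hours=5)
MALLOC_BYTES = 5.8 * 1024**3  # -s malloc,5.8G, assuming G means 2**30 bytes

# Hypothetical workload numbers, for illustration only.
MISSES_PER_SECOND = 50
AVG_OBJECT_BYTES = 20 * 1024
PER_OBJECT_OVERHEAD = 1024  # assumption: roughly 1 KB of bookkeeping per object


def retention(ttl, grace, keep=timedelta(0)):
    """Longest time an object may occupy storage if it is never evicted."""
    return ttl + grace + keep


def estimated_bytes(misses_per_second, lifetime, avg_object_bytes, overhead):
    """Storage needed if every miss inserts one object that lives for `lifetime`."""
    objects = misses_per_second * lifetime.total_seconds()
    return objects * (avg_object_bytes + overhead)


if __name__ == "__main__":
    lifetime = retention(TTL, GRACE)
    need = estimated_bytes(MISSES_PER_SECOND, lifetime,
                           AVG_OBJECT_BYTES, PER_OBJECT_OVERHEAD)
    print("retention:", lifetime)
    print("estimated need: %.1f GiB" % (need / 1024**3))
    print("configured:     %.1f GiB" % (MALLOC_BYTES / 1024**3))
    print("fits without early eviction:", need <= MALLOC_BYTES)
```

If the estimate comes out larger than the configured storage, the cache has
to evict objects before their grace period ends. In that case it is mostly
the grace window, not the 11 minute ttl, that drives the footprint.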

On Tue, Nov 3, 2015 at 1:30 PM Per Buer <perbu at varnish-software.com> wrote:

> Hi,
>
> On Tue, Nov 3, 2015 at 4:08 PM, Caires Vinicius <cairesvs at gmail.com>
> wrote:
>
>> We had some problems with malloc on the same kind of AWS instance with
>> -s malloc,5.8G (80% of the total memory). The only trace of the error
>> was a "cannot fork cannot allocate memory" message in syslog. We're
>> probably missing something; maybe the instance size isn't the right fit
>> for us.
>>
>
>
> This sounds like you're running out of virtual memory. Maybe you're running
> without swap space?
>
> Per.
>
>
>>
>>
>> On Tue, Nov 3, 2015 at 10:56 AM Per Buer <perbu at varnish-software.com>
>> wrote:
>>
>>> Hi,
>>>
>>>
>>> On Tue, Nov 3, 2015 at 1:42 PM, Caires Vinicius <cairesvs at gmail.com>
>>> wrote:
>>>
>>>> We've started to use Varnish 4 with Amazon Linux with EBS SSD of 40GB,
>>>> memory of 7.5GB. We use the file storage with 20G allocated with ttl of 11
>>>> minutes and grace of 5 hours, all the other configs are standard.
>>>>
>>> I would, on a general basis, recommend against using the file backend.
>>> It will start to struggle with fragmentation relatively quickly and the
>>> performance isn't all that great (lots of unnecessary synchronous reads).
>>>
>>>> Sometimes, when a lot of requests result in cache misses, we notice
>>>> that our request latency grows and the iowait stays at 100%, something
>>>> similar to this:
>>>> <https://www.varnish-cache.org/lists/pipermail/varnish-misc/2008-April/016139.html>.
>>>> Our threads also reach the maximum (1000).
>>>>
>>>> Do you guys have any idea why is that?
>>>>
>>> Yeah. A new object gets assigned a piece of memory and Varnish starts
>>> writing to it, which triggers a page fault. The kernel takes over and
>>> reads/merges the underlying page, Varnish then overwrites that page, and
>>> the page then gets written back to disk. This naturally slows down
>>> delivery, so Varnish spawns new threads.
>>>
>>> Try malloc. You should start with -s malloc,30G or thereabouts - if you
>>> have lots of small objects you might need to go down a bit to avoid
>>> swapping.
>>>
>>> Not related: You should also move /var/lib/varnish onto tmpfs. Linux
>>> will do a lot of writing if the shared memory segment is visible on a
>>> filesystem that is backed by a disk.
>>> --
>>> *Per Buer*
>>> CTO | Varnish Software AS
>>> Cell: +47 95839117
>>> We Make Websites Fly!
>>> www.varnish-software.com
>>> <http://info.varnish-software.com/signature>
>>>
>>
>
>
>