Since you're testing with a concurrency of only 10, I doubt thread
startup is a big issue. Nevertheless, 12 threads is too low if you
intend to use this in production. I advise setting thread_pool_min to
reflect your actual expected load.
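
As a rough sketch of how that could be done at runtime: the value 200
below is only a placeholder, not a recommendation, and depending on your
Varnish version you may need to point varnishadm at the management port
with -T host:port.

```shell
# Show the current value and its description.
varnishadm param.show thread_pool_min

# Raise the minimum at runtime. 200 is a placeholder; pick a number
# that matches the load you actually expect.
varnishadm param.set thread_pool_min 200

# To keep the setting across restarts, add it to your existing
# varnishd startup options:
#   -p thread_pool_min=200
```

Keep in mind that thread_pool_min applies per thread pool, so the total
minimum is thread_pools times thread_pool_min.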

I also see a few expiries. It's not easy to tell from varnishstat
what is causing the slowdowns, but I would try setting up grace to
ensure that cache misses don't tie up a significant number of threads.
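
One way to experiment with this, as a sketch only: the default_grace
parameter sets the grace period given to objects by default. The 30
seconds here is an arbitrary example value. Whether stale objects are
actually served can also depend on your VCL (for example req.grace in
some versions), so check the documentation for the version you run.

```shell
# Show the current default grace period.
varnishadm param.show default_grace

# Keep objects around for 30 seconds past their TTL (example value),
# so they can be served while a fresh copy is being fetched.
varnishadm param.set default_grace 30
```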

However, with only 10 concurrent requests, even one request going to the
backend will cause a significant hit to the request rate. To solve this,
you'll have to increase the concurrency of the test, the lifespan of the
cached objects, or both.
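
To put rough numbers on that, here is a back-of-the-envelope estimate.
It assumes a closed-loop test where each of the concurrent clients waits
for its response before sending the next request, so the request rate is
roughly the concurrency divided by the mean response time. The 1 ms hit
time and 100 ms miss time are made-up figures for illustration, not
measurements from your setup, and request_rate is just a helper name I
picked.

```shell
# Estimate requests per second for a closed-loop load test.
# Arguments: concurrency, hit time (ms), miss time (ms), hit ratio (0..1)
request_rate() {
    awk -v c="$1" -v hit="$2" -v miss="$3" -v h="$4" 'BEGIN {
        mean_ms = h * hit + (1 - h) * miss   # mean response time in ms
        printf "%.0f\n", c * 1000 / mean_ms  # requests per second
    }'
}

request_rate 10 1 100 1.00   # every request is a hit: prints 10000
request_rate 10 1 100 0.99   # 1% of requests miss:    prints 5025
```

Under these assumptions, a 1% miss rate roughly halves the measured
request rate, which is why longer object lifetimes or a higher test
concurrency make such a difference.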

Other than that, varnishstat -1 and some tinkering with varnishtop could
help track down the lousy performance you're seeing.
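
A possible starting point, as a sketch: counter and log tag names differ
between Varnish versions, so treat the patterns below as guesses to
adapt rather than exact names for your install.

```shell
# One-shot dump of the counters, filtered down to hits, misses,
# expiries and worker threads. Adjust the pattern to your version.
varnishstat -1 | grep -E 'cache_hit|cache_miss|expired|n_wrk|threads'

# Rank the URLs that are being fetched from the backend most often.
# Older versions log these as TxURL, newer ones as BereqURL.
varnishtop -i TxURL
# varnishtop -i BereqURL
```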

