rdbms as backend

Thu Aug 1 18:52:34 CEST 2013

Hello Leif,

Thanks for your interest in this topic! Nobody else picked it up, it seems. :-)

To tell the truth it's somewhat different:

1. 30K rps *total* is what we have to handle during typical hours.

2. 7K rps is what I managed to get ONE INSTANCE of Varnish to achieve
*as cache for one of our HTTP backends*. (our legacy solution caches
Oracle, MySQL *and* HTTP backends, transforming it all into JSON/HTTP
responses, used by lots of different client libraries and subsystems -
this is all internal infrastructure).

3. 20-30 instances of node.js spread across several machines (for the
obvious reasons of load balancing and failover), with nginx at the
front, would be doable; that's what tests of a working prototype
implemented in node.js suggest. That's sort of plan A. I'm
investigating plan B.

What I don't like about plan A is that it's not only heavy and costly;
what's even worse is that we'd have to build our own HA and load
balancing into it, and like most infrastructure stuff people develop
for themselves, it would be half-baked and coded in haste. Another
piece of old unmaintainable cruft in the making, after telling
ourselves once again "this time it will be different" (no, it won't be,
unless we take a different approach).

Varnish has probes, plus random and round-robin load balancing and
failover that's (probably?) battle-tested by many people and
companies, and frankly the bulk of the dev costs is on somebody else's
shoulders.

I don't like building a caching server myself any more than I'd like
building nginx or Apache replacements myself.

All clients talk http anyway. Some backends talk http anyway.

So the only thing I'd have to do to achieve nirvana would be making the
databases translate their result sets into JSON and serve them over
HTTP. We have to do that somewhere eventually anyway (lots of
different clients; we can't rewrite and upgrade them all every time
MySQL goes from version X to Y).
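To make the "result sets as JSON over HTTP" idea concrete, here is a
minimal sketch in Python, standard library only. Everything specific in
it is an assumption for illustration: the in-memory SQLite database
stands in for the real Oracle/MySQL backends, and the users table, the
/users path, the 60-second max-age and port 8081 are all made up.

```python
import json
import sqlite3
from http.server import BaseHTTPRequestHandler, HTTPServer


def rows_to_json(cursor):
    """Serialize all rows of an executed DB-API cursor as a JSON array of objects."""
    columns = [d[0] for d in cursor.description]
    return json.dumps([dict(zip(columns, row)) for row in cursor.fetchall()])


def make_demo_db():
    # Hypothetical stand-in for a real database backend.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE users (id INTEGER, name TEXT)")
    db.executemany("INSERT INTO users VALUES (?, ?)", [(1, "ann"), (2, "bob")])
    return db


class JsonHandler(BaseHTTPRequestHandler):
    db = None  # assigned by serve() before the server starts

    def do_GET(self):
        if self.path != "/users":
            self.send_error(404)
            return
        cursor = self.db.execute("SELECT id, name FROM users ORDER BY id")
        body = rows_to_json(cursor).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        # Tells an HTTP cache in front how long it may keep this response.
        self.send_header("Cache-Control", "max-age=60")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(db, host="127.0.0.1", port=8081):
    """Blocking, single-threaded demo server; not production code."""
    JsonHandler.db = db
    HTTPServer((host, port), JsonHandler).serve_forever()
```

Calling serve(make_demo_db()) and pointing an HTTP cache at
127.0.0.1:8081 would be enough to try it out. The piece that matters is
rows_to_json(); the HTTP part is deliberately as dumb as possible.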

We have already done loose coupling in the databases: to avoid having
to rewrite SQL on every upgrade or on a possible switch to another DB,
we implement everything possible in the databases as stored procedures
and use dirt simple queries calling those stored procedures, so the SQL
"frontend" stays the same while you can tweak the stored procedure
behind it to your liking. There's only a single step from there to
making the query results uniform.
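As a rough sketch of that pattern in Python: the client side only ever
builds a trivial CALL statement, and every result goes out in one
fixed JSON shape. The procedure name get_user, the "%s" placeholder
style (what several MySQL drivers use; other drivers differ) and the
columns/rows envelope are assumptions for illustration, not our actual
code.

```python
import json
import re

# Only plain identifiers are accepted as procedure names, since the name
# cannot be passed as a bound parameter.
_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def call_statement(proc, nargs, placeholder="%s"):
    """Build the 'dirt simple' query: CALL proc(?, ?, ...) with bound arguments."""
    if not _NAME.fullmatch(proc):
        raise ValueError("invalid procedure name: %r" % (proc,))
    marks = ", ".join([placeholder] * nargs)
    return "CALL %s(%s)" % (proc, marks)


def uniform_result(columns, rows):
    """Wrap any result set in the same JSON envelope, whatever the backend."""
    return json.dumps({"columns": list(columns), "rows": [list(r) for r in rows]})


if __name__ == "__main__":
    print(call_statement("get_user", 2))
    print(uniform_result(["id", "name"], [(1, "ann")]))
```

The statement text never changes when the procedure body does, which is
the whole point: clients depend on the name and the envelope, nothing
else.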

Admittedly, this sort of thing - plugging db into varnish - looks
weird, even outlandish. But it's so logical and fits so well I have
trouble giving up this thought!

I may give up though and simply add another layer between varnish and
databases.

Regards,
MK

On 8/1/2013 17:57, Leif Pedersen wrote:
> Hm, lemmie step back a sec. So as I understand, you currently get
> 30k frontend requests per sec, and with Varnish to cache results,
> you have about 7k backend requests per sec. Does this line up now?
> 
> Seems to me that if you're provisioning the middleware for 7k rps
> instead of 30k rps, the problem is much easier to solve. It may
> require a few machines, but it sounds like your DB costs are so
> high that saving you 76% on database traffic would be an easy
> budget to meet. You've done FAPWS3 on one machine at 3k rps? How
> about simply running that solution on 3 machines plus a spare or
> two, which should provision it for about 9k rps? One nice thing
> about this approach is that it's usually far easier to add (and
> fail over) HTTP nodes and database clients than to add database
> servers.
> 
> I'm pragmatic in this. If the middleware costs a lot more than a
> custom vmod to connect to the DBs, then I'd most likely do that. My
> skepticism is just that it doesn't seem likely. So I won't answer
> your question ("you would not connect to DB directly either?") in
> the absolute affirmative. However, with an experienced guess and
> only a little information about your problem space, that is my
> inclination, yes. And I wouldn't build the vmod for purity or fun
> -- it sounds to me like a daunting gnarly thing with lots of
> maintainability issues. But don't let me tell you not to, if you
> really believe in the cause. :)
> 
> With deference to the authors, I'd be a bit astonished to see such
> a vmod in Varnish's distribution. But if it's worth it in
> comparison to a middleware solution (be it Python, node.js, C++, or
> whatever), the results would certainly be interesting as a
> third-party vmod if you don't mind sharing.
> 
> - Leif