VCL language

Tue Mar 28 19:39:09 CEST 2006

> In message <2607.193.213.34.102.1143499332.squirrel at denise.vg.no>,
> "Anders Berg" writes:
>
>>seeing that Poul-Henning is back to the VCL compiler again in the code, I
>>think it is time to start a more detailed discussion about the VCL.
>
> Good time for it.

I thought so :)

>>My guess is that we are going to spend a lot of time on it, and it could
>>be a natural part of a face-to-face meeting. I also acknowledge that the
>>sooner we "freeze" the language, the less code Poul-Henning will have to
>>rewrite.
>
> It's important to keep two things clear of each other here.  On one hand
>...
> The other part, the variables and operations, needs to be
> hashed out, because that is the next bit that I need to start working
> on:  Calling into the compiled VCL program from the cache process.

Yes. I see that.

> But there is a lot of code yet to be written before this stuff gets
> in the critical path, so there is no need to yell "emergency" or
> anything like it :-)

Good. Hehe.

>>I think our small "proof-of-concept" and the general look-and-feel of VCL
>>will make it suitable and _really_ good for Varnish.
>
> I think VCL is the bit which will make people sit up and take notice :-)

I agree with that. That's why we have to try and get it right the first time :)

>>I/We haven't gotten down to trying/thinking/poking/defining/documenting
>>the VCL yet, but I _think_ I might have come up with a "system" to make
>>VCL easier to understand, and possibly easier to code, both for
>>Poul-Henning and the end user. I am attaching 2 documents:
>>vcl_diagram_v1.png and vcl_diagram_proposal.png (*v1 is approx. what we
>>have today)
>
> I can't say I have thought deeply about the data model yet, my initial
> mock-up was based on a semi-object oriented model where we basically
> had four data objects:
>
> 	Client	(Who asked)
> 		IP#
> 		Bandwidth estimate
> 		failed requests
> 		user agent
> 		...
>
> 	Request	(What they asked for)
> 		URL
> 		HEAD/GET/other
> 		Headers
> 		...
>
> 	Object	(Document in our cache)
> 		ttl
> 		length
> 		usage count
> 		refresh count
> 		...
>
> 	Backend	(Where we can get documents from)
> 		IP#
> 		responsetime
> 		...

Yes, and it can work, but I think we will end up finding variables that
don't really belong in any of these four objects. As you mention later, we
should try to mock up some configs and see.

>>Also, the object (document if you like) has 2 sets of variables. For
>>example I think that backend.obj.usage and client.obj.usage makes sense.
>>Let's say it's a number/factor to say how often this object is
>>used/refreshed. A JPEG will have a low backend.obj.usage (since it
>>typically is not requested often from the backend) but client.obj.usage
>>will be high (because it's requested often, logo etc...). I can also think
>>of more uses here.
>
> I don't disagree with the two different usable numbers, but I think
> I do disagree with the naming.  What you call backend.obj.usage isn't
> really a usage count, it is a refresh count, and since the client
> can't do that, just object.refreshcount would work without confusion.

Okay, I see. I can imagine scenarios where the object class/variables will
end up being used in a backend "context" and a frontend/client context.
Time will tell if this will be confusing. I am not "against" what we
already have, just trying to categorize it a bit more, to see if it works
out well.
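As a small illustration of the naming discussed above, here is how the two numbers could live on one object as a usage count and a refresh count, instead of being split into a client side and a backend side. This is only a sketch in Python; the class name and the two method names are hypothetical and not part of any existing code.

```python
from dataclasses import dataclass


@dataclass
class Document:
    usage_count: int = 0    # how often clients were served this document
    refresh_count: int = 0  # how often it was fetched again from a backend

    def serve_to_client(self) -> None:
        # Hypothetical hook: called once per client delivery.
        self.usage_count += 1

    def refresh_from_backend(self) -> None:
        # Hypothetical hook: called once per backend refresh.
        self.refresh_count += 1


# A logo JPEG: fetched once from the backend, served to many clients.
logo = Document()
logo.refresh_from_backend()
for _ in range(1000):
    logo.serve_to_client()

print(logo.usage_count, logo.refresh_count)  # prints "1000 1"
```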

> A fundamental rule in object-oriented programming is to make sure
> you have a good correspondence between your objects and the real world
> objects they represent, and I think splitting the "object" (or should
> we call it "document" instead ?) into a client and a backend side
> misses the point about the cache:  It is the cached documents which
> are interesting here.

Agreed. We should try to represent what we want to use the object (in an
object-oriented sense) for. Maybe we should call it a document.

> The other thing I would like to point out is that a given document
> does not have a static mapping to a backend.
>
> For instance, we may pick it up from a peer server during startup
> and then subsequently refresh it from one of a number of backends
> whenever it is in danger of expiring.
>
> So the object/document clearly cannot be tied to a particular
> backend without severely constraining the efficiency and flexibility.

Good point. I wasn't trying to map it to a specific backend (since, as you
say, that is not the point).
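To illustrate the point about there being no static mapping, here is a sketch where the document carries no reference to a backend at all, and a backend is chosen fresh on every refresh. It is Python for illustration only. Picking the backend with the lowest response time is my own assumption, not something we have decided, and the function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Backend:
    ip: str
    response_time: float  # seconds (unit is an assumption)


@dataclass
class Document:
    url: str
    ttl: float
    refresh_count: int = 0
    # Deliberately no "backend" field: the document is not tied to one.


def pick_backend(backends: List[Backend]) -> Backend:
    # Assumed policy: use whichever backend currently responds fastest.
    return min(backends, key=lambda b: b.response_time)


def refresh(doc: Document, backends: List[Backend], new_ttl: float) -> Backend:
    # Choose a backend at refresh time, then update the document's state.
    backend = pick_backend(backends)
    doc.ttl = new_ttl
    doc.refresh_count += 1
    return backend
```

With this shape, the same document can come from a peer at startup and from any backend later, without the model getting in the way.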

>
> But anyway, at the risk of sounding like a broken record, I think the
> best way to find out how VCL should develop is to try and use it,
> so let's sit down and write an actual real-life VCL program for VG's
> site, and see what we find out along the way.

I agree. We/I should mock some up. It's just that, given the
"restrictions" Squid has today, my squid.cfg files are not "advanced". But
I am sure I could come up with a "dream-scenario" .cfg :) Not only for VG
Nett (www.vg.no) but also for all the other sites (live.vg.no, tpn.vg.no
etc.) that I use Bluecoat for today in 1 big .cfg file.
You are far from sounding like a broken record :)

> Poul-Henning
>
> --
> Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
> phk at FreeBSD.ORG         | TCP/IP since RFC 956
> FreeBSD committer       | BSD since 4.3-tahoe
> Never attribute to malice what can adequately be explained by
> incompetence.