VCL BNF

Kacper Wysocki kacperw at gmail.com
Sun Mar 13 17:30:44 CET 2011


Varnish Control Language grammar in BNF notation
================================================

The VCL compiler is a one-step lex-parse-prune-symtable-typecheck-emit compiler.
Having looked for a grammar several times myself, and having discussed it
with several others, I have concluded that VCL needs a proper written
grammar. A grammar is useful to anyone who wants to parse, check or
highlight VCL outside the compiler itself.

BNF based on PHK's precedence rules
http://www.varnish-cache.org/docs/trunk/phk/vcl_expr.html
as well as vcc_Lexer and vcc_Parse from HEAD.
For those of us unfamiliar with BNF:
http://www.cui.unige.ch/db-research/Enseignement/analyseinfo/AboutBNF.html

Note on BNF syntax: As the BNF canon is somewhat unwieldy, I've opted
for the convention of writing terminal tokens in lowercase and
non-terminals in UPPERCASE. Optional items are the usual [..] and
repeated items are {..}. Literals are quoted, which improves
portability without sacrificing readability.

As for token and production names, I've tried to stay as true to the
source code as possible without sacrificing readability.

As an extension to BNF I have included comments, which are lines
starting with '#'.
I have tried to comment on grammar that is specific to particular major
versions of Varnish, and on other notable points. I have not checked the
grammar against older releases, and would appreciate comments on how the
grammar of V2.0 and 2.1 differs from 3.0.

There are bound to be bugs. Feedback and comments appreciated.

v0.1: probably not yet machine-parsable!

Nonterminals
------------

VCL ::= ACL | SUB | BACKEND | DIRECTOR | PROBE | IMPORT | CSRC

ACL ::= 'acl' identifier '{' {ACLENTRY} '}'
SUB ::= 'sub' identifier COMPOUND
BACKEND ::= 'backend' identifier '{' { [ 'set' | 'backend' ] BACKENDSPEC } '}'
PROBE ::= 'probe' identifier '{' PROBESPEC '}'
# VMod imports are new in 3.0
IMPORT ::= 'import' identifier [ 'from' string ] ';'
CSRC ::= 'C{' inline-c '}C'

# director definitions - simple variant
DIRECTOR ::= 'director' dirtype identifier '{' DIRSPEC '}'
dirtype ::= 'hash' | 'random' | 'client' | 'round-robin' | 'dns'

# can do better: specify production rule for every director type
DIRECTOR ::=
      'director' ('hash'|'random'|'client') identifier '{' DIRSPEC '}'
   |  'director' 'round-robin' identifier '{' { '.' BACKENDEF } '}'
   |  'director' 'dns' identifier '{' DNSSPEC '}'

DIRSPEC ::=
      [ '.' 'retries' '=' uintval ';' ]
      { '{' '.' BACKENDEF [ '.' 'weight' '=' numval ';' ] '}' }

DNSSPEC ::=
    { '.' BACKENDEF }
    [ '.' 'ttl' '=' timeval ';' ]
    [ '.' 'suffix' '=' string ';' ]
    [ '.' DNSLIST ]

DNSLIST ::= '{' { iprange ';' [ BACKENDSPEC ] } '}'

BACKENDEF ::= 'backend' ( BACKENDSPEC | identifier ';' )

# field spec as used in backend and probe definitions
SPEC ::= '{' { '.' identifier '=' fieldval ';' } '}'
# can do better: devil is in the detail on this one
BACKENDSPEC ::=
      '.' 'host' '=' string ';'
   |  '.' 'port' '=' string ';'
# wow I had no idea...
   |  '.' 'host_header' '=' string ';'
   |  '.' 'connect_timeout' '=' timeval ';'
   |  '.' 'first_byte_timeout' '=' timeval ';'
   |  '.' 'between_bytes_timeout' '=' timeval ';'
   |  '.' 'max_connections' '=' uintval ';'
   |  '.' 'saintmode_threshold' '=' uintval ';'
   |  '.' 'probe' '{' {PROBESPEC} '}' ';'
# another woww \0/
   |  '.' 'probe' identifier ';'

PROBESPEC ::=
      '.' 'url' '=' string ';'
   |  '.' 'request' '=' string ';'
   |  '.' 'expected_response' '=' uintval ';'
   |  '.' 'timeout' '=' timeval ';'
   |  '.' 'interval' '=' timeval ';'
   |  '.' 'window' '=' uintval ';'
   |  '.' 'threshold' '=' uintval ';'
   |  '.' 'initial' '=' uintval ';'


# there is no room in BNF for 'either !(..) or (!..) or !..' (parens optional)
ACLENTRY ::= ['!'] ['('] ['!'] iprange [')'] ';'
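
The three spellings mentioned in the comment can be checked mechanically.
Below is a minimal Python sketch, assuming an iprange is a double-quoted
string with an optional /mask as in the iprange terminal. The name
acl_entry_ok is hypothetical, and the code mirrors the BNF rule plus a
paren-balance check; it is not a transcription of the compiler's ACL parser.

```python
import re

# One entry per the ACLENTRY production:
#   ['!'] ['('] ['!'] iprange [')'] ';'
# The iprange shape (quoted string, optional /mask) follows the iprange terminal.
ACLENTRY_RE = re.compile(
    r'''^\s*(?P<neg1>!)?\s*
        (?P<open>\()?\s*
        (?P<neg2>!)?\s*
        "(?P<addr>[^"]*)"\s*(?:/\s*(?P<mask>[0-9]+))?\s*
        (?P<close>\))?\s*;\s*$''',
    re.VERBOSE,
)

def acl_entry_ok(text):
    """Return True if text looks like a single well-formed ACL entry."""
    m = ACLENTRY_RE.match(text)
    if m is None:
        return False
    # What the BNF rule cannot express: the parentheses must come as a pair.
    return (m.group("open") is None) == (m.group("close") is None)
```

For example, acl_entry_ok('!("10.0.0.1");') and acl_entry_ok('(!"10.0.0.1");')
both hold, while an entry with only an opening parenthesis is rejected.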

# totally avoids dangling else yarr
IFSTMT ::= 'if' CONDITIONAL COMPOUND
      [ { ('elsif'|'elseif') CONDITIONAL COMPOUND } [ 'else' COMPOUND ] ]

CONDITIONAL ::= '(' EXPRESSION ')'
COMPOUND ::= '{' {STMT} '}'
STMT ::= COMPOUND | IFSTMT | CSRC | ACTIONSTMT ';'
ACTIONSTMT ::= ACTION | FUNCALL

ACTION ::=
      'error' [ '(' EXPRESSION(int) [ ',' EXPRESSION(string) ] ')'
              | EXPRESSION(int) [ EXPRESSION(string) ] ]
   | 'call' identifier
# in vcl_fetch only
   | 'esi'
# in vcl_hash only
   | 'hash_data' '(' EXPRESSION ')'
   | 'panic' EXPRESSION
# note: purge expressions are semantically special
   | 'purge' '(' EXPRESSION ')'
   | 'purge_url' '(' EXPRESSION ')'
   | 'remove' variable
# V2.0: could do actions without return keyword
   | 'return' '(' ( 'deliver' | 'error' | 'fetch' | 'hash' | 'lookup'
                  | 'pass' | 'pipe' | 'restart' ) ')'
# rollback what?
   | 'rollback'
   | 'set' variable assoper EXPRESSION
   | 'synthetic' EXPRESSION
   | 'unset' variable

FUNCALL ::= variable '(' [ { FUNCALL | EXPRESSION | string-list } ] ')'

EXPRESSION ::= 'true' | 'false' | constant | FUNCALL | variable
   | '(' EXPRESSION ')'
   | number '*' number
   | number '/' number
   | duration '*' doubleval
# add two strings without operator in 2.x series
   | string '+' string
   | number '+' number
   | number '-' number
   | timeval '+' duration
   | timeval '-' duration
   | timeval '-' timeval
   | duration '+' duration
   | duration '-' duration
   | EXPRESSION comparison EXPRESSION
   | '!' EXPRESSION
   | EXPRESSION '&&' EXPRESSION
   | EXPRESSION '||' EXPRESSION
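
As written, the EXPRESSION rule lists the operators but does not encode
their precedence. The sketch below shows one common way to layer the boolean
part in a recursive-descent parser. It assumes the C-like ordering ('||'
binds loosest, then '&&', then '!'), which is my reading of the precedence
rules linked above, and it handles only 'true', 'false' and parentheses.
The names tokenize and evaluate are hypothetical.

```python
import re

TOKEN_RE = re.compile(r"\s*(\|\||&&|!|\(|\)|true|false)")

def tokenize(text):
    """Split text into the handful of tokens this sketch understands."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError("bad input at offset %d" % pos)
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

def evaluate(text):
    """Evaluate a boolean expression, one function per precedence level."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def parse_or():               # loosest: a || b
        value = parse_and()
        while peek() == "||":
            take()
            rhs = parse_and()
            value = value or rhs
        return value

    def parse_and():              # tighter: a && b
        value = parse_not()
        while peek() == "&&":
            take()
            rhs = parse_not()
            value = value and rhs
        return value

    def parse_not():              # tightest operator: !a
        if peek() == "!":
            take()
            return not parse_not()
        return parse_atom()

    def parse_atom():
        tok = take()
        if tok == "true":
            return True
        if tok == "false":
            return False
        if tok == "(":
            value = parse_or()
            if take() != ")":
                raise ValueError("expected ')'")
            return value
        raise ValueError("unexpected token %r" % (tok,))

    result = parse_or()
    if peek() is not None:
        raise ValueError("trailing input")
    return result
```

With this layering, "true || false && false" groups as
true || (false && false), whereas "(true || false) && false" is forced the
other way by the parentheses.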


Terminals
---------
timeval ::= doubleval timeunit
duration ::= ['-'] timeval
doubleval ::= number [ '.' [number] ]
timeunit ::= 'ms' | 's' | 'm' | 'h' | 'd' | 'w'
uintval ::= number # unsigned
fieldval ::= timeval | doubleval | timeunit | uintval
constant ::= string | fieldval
iprange ::= string [ '/' number ]
variable ::= identifier [ '.' identifier ]
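
To make the timeval and duration terminals concrete, here is a small Python
sketch that recognises a duration and converts it to seconds. The
multipliers for each timeunit are the conventional ones and are an
assumption on my part, as is tolerating whitespace between the number and
the unit. The name duration_seconds is hypothetical.

```python
import re

# Seconds per timeunit. These multipliers are assumed (conventional values);
# the grammar itself only lists the unit names.
UNIT_SECONDS = {
    "ms": 0.001, "s": 1.0, "m": 60.0,
    "h": 3600.0, "d": 86400.0, "w": 604800.0,
}

# duration ::= ['-'] doubleval timeunit, reading doubleval as
# number ['.' [number]]. 'ms' is listed first so it is tried before 'm'.
DURATION_RE = re.compile(r"^(-)?([0-9]+(?:\.[0-9]*)?)\s*(ms|s|m|h|d|w)$")

def duration_seconds(text):
    """Convert a duration such as '2m' or '-1.5h' to seconds."""
    m = DURATION_RE.match(text)
    if m is None:
        raise ValueError("not a duration: %r" % text)
    sign, value, unit = m.groups()
    seconds = float(value) * UNIT_SECONDS[unit]
    return -seconds if sign else seconds
```

A timeval is the same thing without the optional leading minus sign.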

comparison ::= '==' | '!=' | '<' | '>' | '<=' | '>=' | '~' | '!~'
assoper ::= '=' | '+=' | '-=' | '*=' | '/='

comment ::=
    /* !(/*|*/)* */
    // !(\n)* $
    #  !(\n)* $

longstring ::=
   '{"' !("})* '"}'

shortstring ::=
   '"' !(\")* '"'

inline-c ::=
   !('}C')*

string ::= shortstring | longstring
identifier ::= [a-zA-Z][a-zA-Z0-9_-]*
number ::= [0-9]+
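
The string, identifier and number terminals translate almost directly into
regular expressions. The Python sketch below is one such translation. It
assumes a shortstring may not contain a newline, which the grammar above
does not state, and the helper name classify is hypothetical.

```python
import re

IDENTIFIER_RE = re.compile(r"[a-zA-Z][a-zA-Z0-9_-]*")
NUMBER_RE = re.compile(r"[0-9]+")
# shortstring: no embedded double quote; excluding newlines is an assumption.
SHORTSTRING_RE = re.compile(r'"[^"\n]*"')
# longstring: {" ... "} where the body may contain anything except "}
LONGSTRING_RE = re.compile(r'\{"(?:(?!"\}).)*"\}', re.DOTALL)

def classify(text):
    """Name the terminal that matches all of text, or return None."""
    candidates = (
        ("longstring", LONGSTRING_RE),
        ("shortstring", SHORTSTRING_RE),
        ("identifier", IDENTIFIER_RE),
        ("number", NUMBER_RE),
    )
    for name, rx in candidates:
        if rx.fullmatch(text):
            return name
    return None
```

The long form is what allows double quotes and line breaks inside a string,
since only the two-character sequence "} ends it.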

Lexer tokens
------------

 ! % & + * , - . / ; < = > { | } ~ ( )
 != NEQ
 !~ NOMATCH
 ++ INC
 += INCR
 *= MUL
 -- DEC
 -= DECR
 /= DIV
 << SHL
 <= LEQ
 == EQ
 >= GEQ
 >> SHR
 || COR
 && CAND
 elseif ELSEIF
 elsif  ELSIF
 include INCLUDE
 if    IF
# include statements omitted as they are pre-processed away; they are
# not a syntactic device.
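
The two-character tokens in the table only work if the lexer prefers the
longest match, so that '!=' is not read as '!' followed by '='. A minimal
Python sketch of that rule, covering just the operator tokens listed above
and nothing else (no identifiers, strings or comments); the name
lex_operators is hypothetical and this is not a transcription of vcc_Lexer.

```python
# Two-character operators and their token names, from the table above.
TWO_CHAR = {
    "!=": "NEQ", "!~": "NOMATCH", "++": "INC", "+=": "INCR", "*=": "MUL",
    "--": "DEC", "-=": "DECR", "/=": "DIV", "<<": "SHL", "<=": "LEQ",
    "==": "EQ", ">=": "GEQ", ">>": "SHR", "||": "COR", "&&": "CAND",
}
# Single-character tokens stand for themselves.
ONE_CHAR = set("!%&+*,-./;<=>{|}~()")

def lex_operators(text):
    """Tokenize operator characters, trying two-character tokens first."""
    tokens, i = [], 0
    while i < len(text):
        if text[i].isspace():
            i += 1
        elif text[i:i + 2] in TWO_CHAR:
            tokens.append(TWO_CHAR[text[i:i + 2]])
            i += 2
        elif text[i] in ONE_CHAR:
            tokens.append(text[i])
            i += 1
        else:
            raise ValueError("unexpected character %r at offset %d" % (text[i], i))
    return tokens
```

For instance, "<= <" yields LEQ followed by a plain '<'.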


-- 
http://kacper.doesntexist.org
http://windows.dontexist.com
Employ no technique to gain supreme enlightment.
- Mar pa Chos kyi blos gros



