Parsing non-English URLs

Tollef Fog Heen tfheen at varnish-software.com
Wed Jun 2 09:59:18 CEST 2010


]] "Angie T. Muhammad" 

| I get some lines like:
|    14 TxHeader     b Referer:
| http://myserver.example.com/news/%D8%A7%D9%84%D9%85%D9%84%D9%8A%D8%A7%D8%B1%D8%AF%D9%8A%D8%B1-%D8%A7%D9%84%D9%85%D8%B5%D8%B1%D9%8A-%C2%AB%D9%85%D8%AD%D9%85%D8%AF-%D8%A7%D9%84%D9%81%D8%A7%D9%8A%D8%AF%C2%BB-%D9%8A%D9%82%D8%B1%D8%B1-%D8%A8%D9%8A%D
| 
| The point is that I need to make varnish accept such percentages or any
| other syntax in VCL file. So that some pages are cached longer based on
| their Arabic address (Like the one in the referer above).

Ordinary quoted strings in VCL are themselves %-escaped, so to refer to
the string above you have to escape every % as %25 and match against

«http://myserver.example.com/news/%25D8%25A7%25D9%2584%25D9%2585%25D9%2584%25D9%258A%25D8%25A7%25D8%25B1%25D8%25AF%25D9%258A%25D8%25B1-%25D8%25A7%25D9%2584%25D9%2585%25D8%25B5%25D8%25B1%25D9%258A-%25C2%25AB%25D9%2585%25D8%25AD%25D9%2585%25D8%25AF-%25D8%25A7%25D9%2584%25D9%2581%25D8%25A7%25D9%258A%25D8%25AF%25C2%25BB-%25D9%258A%25D9%2582%25D8%25B1%25D8%25B1-%25D8%25A8%25D9%258A%25D»

or, more easily, write it as a {"..."} long string, which is taken literally:

{"http://myserver.example.com/news/%D8%A7%D9%84%D9%85%D9%84%D9%8A%D8%A7%D8%B1%D8%AF%D9%8A%D8%B1-%D8%A7%D9%84%D9%85%D8%B5%D8%B1%D9%8A-%C2%AB%D9%85%D8%AD%D9%85%D8%AF-%D8%A7%D9%84%D9%81%D8%A7%D9%8A%D8%AF%C2%BB-%D9%8A%D9%82%D8%B1%D8%B1-%D8%A8%D9%8A%D"}
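If you have many such URLs, the two spellings above can be generated mechanically. This is only a rough sketch, assuming the escaping rule is exactly "write each % as %25" for ordinary strings and "leave everything alone" for long strings; the function names vcl_short_string and vcl_long_string are made up for illustration and are not part of Varnish.

```python
def vcl_short_string(url: str) -> str:
    """Return url as an ordinary VCL "..." literal.

    Assumption: the only character needing treatment is %, written as %25.
    """
    if '"' in url:
        raise ValueError('a double quote cannot appear in a "..." literal')
    return '"' + url.replace("%", "%25") + '"'


def vcl_long_string(url: str) -> str:
    """Return url as a VCL {"..."} long string, with the content unchanged."""
    if '"}' in url:
        raise ValueError('the sequence "} would end the long string early')
    return '{"' + url + '"}'


if __name__ == "__main__":
    # Only a prefix of the URL from the original mail, which is truncated there.
    prefix = "http://myserver.example.com/news/%D8%A7%D9%84"
    print(vcl_short_string(prefix))
    print(vcl_long_string(prefix))
```

The long-string form is the one to prefer when writing by hand, since the URL can be pasted in exactly as it appears in the log.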

-- 
Tollef Fog Heen
Varnish Software
t: +47 21 54 41 73



