Friday, July 27, 2012

Varnish Cache Purge, Ban and Ban Lurker

Let's walk through some Varnish basics before getting to purge and ban.

From the docs: Varnish is a web application accelerator.
Varnish can cache and serve your static assets and cacheable pages [CSS, JS, images, rendered PHP pages, HTML].
It reduces load on the web servers, even under high traffic.
It can even act as a load balancer [given a proper director configuration].

Varnish uses VCL [Varnish Configuration Language] to override the defaults and tune Varnish for your use case.

Varnish caches content [a cache object] against a key.
By default the key is a hash of the request URL and the Host header [or the server IP when no Host header is sent].
We can override the default by editing the vcl_hash subroutine in the VCL file.
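
As a rough sketch of such an override, the block below rebuilds the default key and then adds one more ingredient. It assumes Varnish 3.x syntax (matching the date of this post), and the X-Device header is a hypothetical example, not something Varnish sets for you.

```vcl
# Sketch, Varnish 3.x syntax. X-Device is a hypothetical request header.
sub vcl_hash {
    # Default ingredients of the key: URL plus Host (or server IP).
    hash_data(req.url);
    if (req.http.host) {
        hash_data(req.http.host);
    } else {
        hash_data(server.ip);
    }
    # Extra ingredient: keep one cached variant per device class.
    if (req.http.X-Device) {
        hash_data(req.http.X-Device);
    }
    return (hash);
}
```

Every extra ingredient multiplies the number of objects stored per URL, so add to the key sparingly.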

Do cache objects live long in Varnish?
In Varnish every cache object is stored with a TTL value.
Every object is auto-magically removed from the cache once it reaches its expiry.
The default TTL can be configured globally with the -t option when starting varnishd.
It can also be overridden per object in VCL by setting beresp.ttl.
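
A minimal sketch of the VCL side, assuming Varnish 3.x (where the subroutine is called vcl_fetch). The file extensions and the one-hour value are arbitrary choices for illustration.

```vcl
# Sketch, Varnish 3.x syntax.
# The global default would come from the command line, e.g. varnishd -t 120
# (seconds). The rule below overrides it for static assets only.
sub vcl_fetch {
    if (req.url ~ "\.(css|js|png|jpg|gif)$") {
        # Keep static assets for an hour regardless of the default TTL.
        set beresp.ttl = 1h;
    }
}
```

There is no return statement here, so the built-in vcl_fetch logic still runs after this rule.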

What if I have to invalidate a cache object manually?
This is where purge and ban come to the rescue :)

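Purge:
Actively invalidates [removes] the specified cache object.
Send Varnish a request for the object's URL with the HTTP method PURGE. You can use curl or any equivalent client.
The VCL has to recognise this method and call purge; it is not handled out of the box.
Can anybody purge my content?
No. Use an acl purge {} block to list the IPs/IP ranges that are allowed to send purge requests.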
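
Put together, a purge setup might look like the sketch below. It assumes Varnish 3.x syntax, and the addresses in the ACL are placeholders to be replaced with your own.

```vcl
# Sketch, Varnish 3.x syntax. The ACL entries are placeholder addresses.
acl purge {
    "localhost";
    "192.168.55.0"/24;
}

sub vcl_recv {
    if (req.request == "PURGE") {
        # Reject purge requests from clients outside the ACL.
        if (!client.ip ~ purge) {
            error 405 "Not allowed.";
        }
        # Look the object up so vcl_hit or vcl_miss can purge it.
        return (lookup);
    }
}

sub vcl_hit {
    if (req.request == "PURGE") {
        purge;
        error 200 "Purged.";
    }
}

sub vcl_miss {
    if (req.request == "PURGE") {
        # Also purges variants of the object that did not match this request.
        purge;
        error 200 "Purged.";
    }
}
```

With this loaded, something like curl -X PURGE against the page's URL from an allowed IP should answer "200 Purged."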

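Ban:
Passively invalidates cache objects. Supports regex.
Think of a ban as a filter over the objects already in the cache.
ban req.http.host == "example.com" && req.url ~ "\.png$"
This filters out all PNG objects cached for the given host [example.com is a placeholder hostname].
The line above is the CLI form; to trigger a ban over HTTP, call ban() with the same expression from vcl_recv.
Access control works the same way as for purge.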
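
A sketch of the vcl_recv variant, assuming Varnish 3.x syntax. The BAN method name follows common convention, and X-Ban-Url is a hypothetical header carrying the URL regex; neither is built into Varnish.

```vcl
# Sketch, Varnish 3.x syntax. X-Ban-Url is a hypothetical request header.
acl purge {
    "localhost";
}

sub vcl_recv {
    if (req.request == "BAN") {
        if (!client.ip ~ purge) {
            error 405 "Not allowed.";
        }
        if (!req.http.X-Ban-Url) {
            error 400 "Missing X-Ban-Url header.";
        }
        # Add a ban for this host covering every URL matching the regex.
        ban("req.http.host == " + req.http.host +
            " && req.url ~ " + req.http.X-Ban-Url);
        error 200 "Ban added.";
    }
}
```

Sending the regex in a header rather than in the request URL avoids having to URL-encode characters such as the backslash and the dollar sign.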

Purge vs Ban: how do they differ?
Purge:
Actively invalidates the cached object [sets the object's TTL to 0 and removes it the moment the purge request is received].
Ban:
A ban is a filter maintained by Varnish, not a command.
It is applied on lookup, before an object is delivered from the cache.
There can be multiple bans in the same Varnish instance.
A ban applies only to objects that were already in the cache when it was created.
It never prevents new objects from being cached or delivered.
Too many bans on the ban list will consume a lot of CPU.
Long-lived cache objects [assume infinite TTL] that get no hits remain untouched by bans and keep consuming memory.

Why does it consume CPU and memory?
Before a request is served from the cache, the object may have to be matched against several bans on the ban list.
Matching here means a regex match, hence it consumes CPU.
On heavy-traffic systems, the number of requests that go through the ban filter check can become a concern.
A ban is purely a filter.
It takes care of removing objects that are actively getting hits and that match the ban list.
But it does not take care of idle cache objects with high TTL values, even if they match a ban.
Hence the memory they consume is not released until their TTL expires, even though we have already invalidated them.

How to overcome this?
Use Ban Lurker.

What problem does this solve?
1. Banned objects can be discarded in the background.
2. The size of the ban list can be reduced.
Ban Lurker is a Varnish background thread that actively walks the cache and invalidates objects that match the ban list.
It is an enable/disable feature controlled by the ban_lurker_sleep parameter: 0 turns it off, which may be the default depending on your Varnish version [enable it: param.set ban_lurker_sleep 0.1].
Read more about ban lurker here
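
One caveat: the lurker has no client request to look at, so it can only act on bans that test obj.* fields. Bans written against req.url or req.http.host are still evaluated only at lookup time. A common workaround, sketched here in Varnish 3.x syntax, is to copy the request details onto the object; the x-url and x-host header names are just a convention.

```vcl
# Sketch, Varnish 3.x syntax. x-url and x-host are conventional helper names.
sub vcl_fetch {
    # Store the request details on the object so bans can refer to obj.* only.
    set beresp.http.x-url = req.url;
    set beresp.http.x-host = req.http.host;
}

sub vcl_deliver {
    # Hide the helper headers from clients.
    unset resp.http.x-url;
    unset resp.http.x-host;
}

# A lurker-friendly ban from the CLI then looks like:
#   ban obj.http.x-host == "example.com" && obj.http.x-url ~ "\.png$"
```

Here example.com is again a placeholder hostname.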

Final Point:
Purge does not refresh the invalidated object from the backend. That happens only on the next cache miss.
In case you want to force a cache miss and refresh the content from the backend, set
req.hash_always_miss to true.
In that case Varnish will miss the current object in the cache, thus forcing a fetch from the backend.
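
One possible way to wire this up, as a sketch in Varnish 3.x syntax. REFRESH is a made-up method name for this example and the ACL entry is a placeholder.

```vcl
# Sketch, Varnish 3.x syntax. "REFRESH" is a made-up method name.
acl purge {
    "localhost";
}

sub vcl_recv {
    if (req.request == "REFRESH") {
        if (!client.ip ~ purge) {
            error 405 "Not allowed.";
        }
        # Treat it as a normal GET, but skip whatever is in the cache.
        set req.request = "GET";
        set req.hash_always_miss = true;
    }
}
```

The response fetched from the backend is cached as usual, so later requests are served the fresh copy.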