Speed Up Your Web Site with Varnish

Varnish Subroutines

The Varnish subroutines have default definitions, which are shown in default.vcl. Redefining one of these subroutines does not necessarily keep the default definition from executing. In particular, if you redefine a subroutine but don't return a value, Varnish proceeds to execute the default subroutine. All the default Varnish subroutines return a value, so it makes sense that Varnish uses them as a fallback.
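
As a minimal sketch of that fallback, assuming the Varnish 3.x syntax used throughout this article, the following vcl_recv changes the request but never returns a value. The X-Example header name is purely illustrative and not part of any default configuration. Because nothing is returned, Varnish should go on to run the default vcl_recv afterward:

sub vcl_recv {
    # Hypothetical header, used only to show a change to the request.
    set req.http.X-Example = "1";
    # No return statement, so the default vcl_recv runs next.
}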

The first subroutine to look at is called vcl_recv. This gets executed after receiving the full client request, which is available in the req object. Here you can inspect and make changes to the original request via the req object. You can use the value of req to decide how to proceed. The return value is how you tell Varnish what to do. I'll put the return values in parentheses as they are explained. Here you can tell Varnish to bypass the cache and send the back end's response back to the client (pass). You also can tell Varnish to check its cache for a match (lookup).
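
To make the two return values concrete, here is a rough sketch, again assuming Varnish 3.x syntax. The /admin path is a made-up example of content you might not want served from the cache:

sub vcl_recv {
    if (req.url ~ "^/admin") {
        # Bypass the cache for this hypothetical private area.
        return (pass);
    }
    # Everything else is looked up in the cache.
    return (lookup);
}

Keep in mind that a vcl_recv that returns a value for every request never falls through to the default vcl_recv, so the default checks (on the request method, for example) are skipped.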

Next is the vcl_pass subroutine. If you returned pass in vcl_recv, this is where you'll be just before sending the request to the back end. You can tell Varnish to continue as planned (pass) or to restart the cycle at the vcl_recv subroutine (restart).
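
A small sketch of vcl_pass, assuming Varnish 3.x syntax, might tag the outgoing back-end request before continuing. The X-Passed header name is an invention for this example:

sub vcl_pass {
    # Hypothetical marker so the back end can tell the request bypassed the cache.
    set bereq.http.X-Passed = "1";
    return (pass);
}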

The vcl_miss and vcl_hit subroutines are executed depending on whether Varnish found a suitable response in the cache. From vcl_miss, your main options are to get a response from the back-end server and cache it (fetch) or to get a response from the back end and not cache it (pass). vcl_hit is where you'll be if Varnish successfully finds a matching response in its cache. From vcl_hit, you have the cached response available to you in the obj object. You can tell Varnish to send the cached response to the client (deliver) or have Varnish ignore the cached response and return a fresh response from the back end (pass).
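
The following sketch, assuming Varnish 3.x syntax, shows one way these two subroutines could be written. Honoring the client's Cache-Control header is only an illustration of returning pass from vcl_hit; it lets any client force traffic to the back end, so it may not be a policy you want in production:

sub vcl_hit {
    if (req.http.Cache-Control ~ "no-cache") {
        # Ignore the cached copy and get a fresh response from the back end.
        return (pass);
    }
    # Otherwise, send the cached response to the client.
    return (deliver);
}

sub vcl_miss {
    # Nothing suitable in the cache: fetch from the back end and cache the result.
    return (fetch);
}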

The vcl_fetch subroutine is where you'll be after getting a fresh response from the back end. The response will be available to you in the beresp object. You either can tell Varnish to continue as planned (deliver) or to start over (restart).
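
As an illustration of starting over from vcl_fetch, the sketch below (Varnish 3.x syntax assumed) restarts the request when the back end answers with a server error. The limit of two restarts is an arbitrary choice for this example:

sub vcl_fetch {
    if (beresp.status >= 500 && req.restarts < 2) {
        # Try the request again from vcl_recv.
        return (restart);
    }
    # No return for other responses, so the default vcl_fetch still applies.
}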

From vcl_deliver, you can finish the request/response cycle by delivering the response to the client and possibly caching it as well (deliver), or you can start over (restart).
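
A typical use of vcl_deliver is a last-minute adjustment of the response headers. This sketch assumes Varnish 3.x syntax and a back end that sends an X-Powered-By header, which is an assumption made only for the example:

sub vcl_deliver {
    # Strip a header you'd rather not expose to clients.
    unset resp.http.X-Powered-By;
    return (deliver);
}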

As previously stated, you express your caching policy within the subroutines in default.vcl. The return values tell Varnish what to do next. You can base your return values on many things, including the values held in the request (req) and response (resp) objects. In addition to req and resp, there also is a client object representing the client, a server object and a beresp object representing the back end's response. It's important to realize that not all objects are available in all subroutines. It's also important to return one of the allowed return values from each subroutine. One of the hardest things to remember when starting out with Varnish is which objects are available in which subroutines and what the legal return values are. To make it easier, I've created a couple of reference tables. They will help you get up to speed quickly, because you won't have to memorize everything up front or dig through the documentation every time you make a change.

Table 1. This table shows which objects are available in each of the subroutines.

             client  server  req     bereq   beresp  resp    obj
vcl_recv     X       X       X       -       -       -       -
vcl_pass     X       X       X       X       -       -       -
vcl_miss     X       X       X       X       -       -       -
vcl_hit      X       X       X       -       -       -       X
vcl_fetch    X       X       X       X       X       -       -
vcl_deliver  X       X       X       -       -       X       -

Table 2. This table shows valid return values for each of the subroutines.

             pass     lookup   error    restart  deliver  fetch    pipe     hit_for_pass
vcl_recv     X        X        X        -        -        -        X        -
vcl_pass     X        -        X        X        -        -        -        -
vcl_miss     X        -        X        -        -        X        -        -
vcl_hit      X        -        X        X        X        -        -        -
vcl_fetch    -        -        X        X        X        -        -        X
vcl_deliver  -        -        X        X        X        -        -        -


Be sure to read the full explanation of VCL, available subroutines, return values and objects in the vcl(7) man page.

Let's put it all together by looking at some examples.

Normalizing the request's Host header:

sub vcl_recv {
    # Treat www.example.com and example.com as the same site.
    if (req.http.host ~ "^www\.example\.com") {
        set req.http.host = "example.com";
    }
}

Notice you access the request's host header by using req.http.host. You have full access to all of the request's headers by putting the header name after req.http. The ~ operator is the match operator. That is followed by a regular expression. If you match, you then use the set keyword and the assignment operator (=) to normalize the hostname to simply "example.com". A really good reason to normalize the hostname is to keep Varnish from caching duplicate responses. Varnish looks at the hostname and the URL to determine if there's a match, so the hostnames should be normalized if possible.
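
A slightly more tolerant variant is sketched below, assuming Varnish 3.x syntax and its PCRE-based regular expressions. The (?i) flag makes the match case-insensitive, and the optional group covers the host with or without the www prefix. It also assumes the Host header carries no port number, because the pattern is anchored at the end:

sub vcl_recv {
    # Match example.com or www.example.com in any letter case.
    if (req.http.host ~ "(?i)^(www\.)?example\.com$") {
        set req.http.host = "example.com";
    }
}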

Here's a snippet from the default vcl_recv subroutine:

sub vcl_recv {
    if (req.request != "GET" && req.request != "HEAD") {
        # Only GET and HEAD requests are candidates for caching.
        return (pass);
    }
    return (lookup);
}

You can see that if it's not a GET or HEAD request, Varnish returns pass and won't cache the response. If it is a GET or HEAD request, Varnish looks it up in the cache.

Removing the request's cookies if the URL matches:

sub vcl_recv {
    if (req.url ~ "^/images") {
        # Drop the Cookie header so the request can be served from the cache.
        unset req.http.cookie;
    }
}

That's an example from the Varnish Web site. It removes cookies from the request if the URL starts with "/images". This makes sense when you recall that Varnish won't cache a request with a cookie. By removing the cookie, you allow Varnish to cache the response.

Removing response cookies for image files:

sub vcl_fetch {
    if (req.url ~ "\.(png|gif|jpg)$") {
        # Drop Set-Cookie so the image can be cached, then cache it for one hour.
        unset beresp.http.set-cookie;
        set beresp.ttl = 1h;
    }
}

That's another example from Varnish's Web site. Here you're in the vcl_fetch subroutine, which happens after fetching a fresh response from the back end. Recall that the response is held in the beresp object. Notice that here you're accessing both the request (req) and the response (beresp). If the request is for an image, you remove the Set-Cookie header set by the server and override the cached response's TTL to one hour. Again, you do this because Varnish won't cache responses with the Set-Cookie header.



Pages are not updated quickly

Ravi

I tried it in the past, but the problem was that refreshed pages were not shown instantly. It took minutes for a page to refresh, so I decided to get rid of it.

Is there any way to fix this problem for dynamic pages?

What about sites that customize contents for each user?

Anonymous

Thanks for the article.

> It will consider only caching GET and HEAD requests. It won't cache a request with either a Cookie or Authorization header. It won't cache a response with either a Set-Cookie or Vary header

... which means that tools like Varnish are only useful to speed up sites that don't customize content on a per-user basis. I wonder how other sites, e.g., Facebook, handle the issue, since each page is different for each user.

The Varnish Book

Rubén Romero

If you are getting started with Varnish, make sure to read the Varnish Book.

This book is the material we use for our Varnish training classes, and it gives a good introduction to the project's history, what Varnish is, how VCL works, and how Varnish relates to HTTP and cache invalidation.

Read the Varnish Book

