By Jason Harmon | Feb 15, 2013
In the typical usage of HTTP, the GET
and POST
verbs seem to get the most mileage. I’ve previously covered some aspects of moving up the Richardson Maturity Model scale. Implementing the PUT
and DELETE
verbs is typically a step up to RMM Level 2. There are other HTTP verbs, outside of the ‘by the book’ RESTful patterns, which can prove very useful in certain situations. One of the easiest to implement verbs, with some great benefits in scaling terms, is the HEAD
verb.
If you’re not familiar with HTTP HEAD
, it’s best to start with the specifications at http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html
9.4 HEAD
The HEAD method is identical to GET except that the server MUST NOT return a message-body in the response. The metainformation contained in the HTTP headers in response to a HEAD request SHOULD be identical to the information sent in response to a GET request. This method can be used for obtaining metainformation about the entity implied by the request without transferring the entity-body itself. This method is often used for testing hypertext links for validity, accessibility, and recent modification.
The response to a HEAD
request MAY be cacheable in the sense that the information contained in the response MAY be used to update a previously cached entity from that resource. If the new field values indicate that the cached entity differs from the current entity (as would be indicated by a change in Content-Length, Content-MD5, ETag or Last-Modified), then the cache MUST treat the cache entry as stale.
Simply put, HEAD
returns all of the HTTP headers, just like GET
, but provides no body content. All of the same rules regarding content caching can be applied, regarding headers in the request/response.
There are a few scenarios where HEAD
is exceptionally useful:
Existence checks:
In cases where the content of a resource is not particularly important, but the existence of the resource is, HEAD
is perfect. We can do everything just like a GET
and check the response code, without the weight of the response body.
As an example, we’ll look at Twitter’s public profile API.
** Update: the Twitter v1 API is deprecated…however you’ll still get the point **
GET http://api.twitter.com/1/users/show.json?screen_name=jasonh-n-austin
The server responds:
Cache checks:
When we call this using GET
, there is a definite weight to the response body…enough that I’ll snip it for sake of brevity. If we wanted to simply validate that a profile exists, we could use HEAD and simply evaluate the HTTP status (as there will be no response body):
HEAD http://api.twitter.com/1/users/show.json?screen_name=jasonh-n-austin
The server responds:
200 OK
That’s it! In cases where the profile doesn’t exist, the response code will be a 404 Not Found
(much like you’ve seen when you request a non-existent web page).
HEAD http://api.twitter.com/1/users/show.json?screen_name=jasonh-n-timbuktu
The server responds:
404 OK
There’s really no simpler way to determine if a resource exists in an API. This is very nice when there is a secured resource (such as a username or email of some user) which you cannot expose the details of. HEAD prevents any potential information being exposed, other than the existence of that item.
Cache check
When the resource in use supports caching, HEAD can be useful to check if there is a new version of the resource you have previously retrieved. When you are dealing with distributed scenarios in which data is potentially cached for periods of time (such as with a mobile device), it can be advantageous to use HEAD to investigate the previously retrieved resources. If there has been a change, the 200 status code indicates there has been an update; otherwise 304 indicates the data is the same as when you retrieved it last.
A well-implemented example of this in a public API is at Github. They implement two models of cache validation in the response headers:
ETag
: this is typically a calculated hash on the output object, expressed in theETag
response header.
* In the next request, the `If-None-Match` request header should utilize the ETag as previously supplied in the response.
Last-Modified
: normally a database-or-otherwise-maintained “last modified date” is maintained for the resource, which is supplied in theLast-Modified
response header.
* In the subsequent request, the `If-Modified-Since` request header can be populated with the previously retrieved 'last modified-date'.
In either case, the results are simple; 200 OK
means there is new content, 304 Not Modified
means nothing has changed.
While HEAD
can be useful for certain caching situations, most folks elect to simply use GET here, as the 304 Not Modified
will not supply a body, and the 200 OK
will reply with the new content in the body. This saves the extra hit for HEAD
+200 OK
as well as GET
+200 OK
+response body. However if you have one process which looks for updates, and another which does the work of retrieving the new content, this model can be very helpful in reducing traffic over the wires.
Example request:
Now that we have content, let’s check to see if it has changed:
Note that the If-Modified-Since
request header matches the value in the original Last-Modified
response header.
Summary
In REST talk, we usually refer to the four verbs, GET
/PUT
/POST
/DELETE
, as though that’s all there is to HTTP. The HEAD
verb reminds us that we have more tools in our bag, especially when wire traffic, data access costs, and potentially large response bodies are a big concern. Implementing HEAD
is typically fairly simple in most RESTful application frameworks, and can pay real dividends in terms of scaling a platform, when used wisely.
Good luck!