SEO top tip: Don’t neglect those HTTP status codes
Having recently read a post on Google Webmaster Central regarding how to prepare your site for planned downtime, I was reminded once again how much neglect there is around HTTP status codes.
The HTTP status code is a number returned by the web-server to the client (browser) to indicate the outcome of requesting a resource (most commonly a web-page). These are the same codes people mean when they refer to redirects as 301s or 302s. Some of the most common SEO problems on web-sites come from inappropriate configuration of the web-server resulting in the wrong codes being returned. The status is returned on the first line of the HTTP response, just ahead of the headers, so it isn’t shown on the page itself and most users need some sort of tool in their browser to view it. Non-browser clients, such as Googlebot, rely on this header information much more than regular users do.
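To make this concrete, here is a minimal Python sketch of what a header-viewing tool does with the top of a response. The function name `parse_response_head`, the `SAMPLE` text and the example.com URL are made up for illustration; a real client library handles many more edge cases.

```python
def parse_response_head(raw):
    """Split a raw HTTP response head into (status_code, reason, headers)."""
    lines = raw.split("\r\n")
    # The status line looks like: HTTP/1.1 301 Moved Permanently
    parts = lines[0].split(" ", 2)
    code = int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""
    headers = {}
    for line in lines[1:]:
        if not line:  # a blank line ends the headers
            break
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return code, reason, headers


# Hypothetical response head, as a server might send it for a moved page.
SAMPLE = (
    "HTTP/1.1 301 Moved Permanently\r\n"
    "Location: http://www.example.com/new-page\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)

code, reason, headers = parse_response_head(SAMPLE)
print(code, reason, headers["location"])
```

The point to notice is that the status code and the “Location” header never appear in the page body, which is why a user looking at the browser window can’t see them.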
I thought it might be useful to list some of the most common status codes, what they mean and then talk through a couple of typical problems when misusing them.
(Taken from the HTTP/1.1 standard, RFC 2616)
200 OK – Your request was successfully processed and the returned data corresponds to that request
301 Moved Permanently – This is a status code telling the client the resource they’re looking for has permanently moved to another location (the new location is provided in the “Location” field of the HTTP headers)
302 Found – More commonly known as a temporary redirect, this tells the client that the resource they’re requesting is currently elsewhere but that this may change soon.
304 Not Modified – This indicates that a resource has not changed since the client last requested it. It is used extensively to tell clients that they should use their cached copy of the resource.
400 Bad Request – This is the response from the web server if the request was not understood, typically because it was malformed; some applications also return it when the URL contains unexpected parameters.
401 Unauthorized – This is the response given when trying to access a location which requires the user to be logged in. This response should also be given if the client fails authentication.
403 Forbidden – The server has understood the request but is refusing to fulfil it. This is often accompanied by an explanation as to why.
404 Not Found – A common response, used by the server to say that it has understood the request but has found nothing to return for that URL. It is used mainly when the server doesn’t want to indicate why, or whether the condition is permanent.
500 Internal Server Error – If you see this response it’s because something the server is running (some code) has terminated in an unexpected way.
501 Not Implemented – Servers don’t only respond to ‘GET’ requests (most common) but also methods such as ‘POST’, most commonly used in form submissions. Not Implemented is returned when the server does not support the requested method.
503 Service Unavailable – Often used to indicate that the load on the server is currently too high to respond to the request, or that the server is down for maintenance. Sometimes this will come with a “Retry-After” header telling the client how long to wait before trying again.
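If you want to look these codes up programmatically, Python’s standard library carries the same names in `http.HTTPStatus`. The sketch below is only an illustration; the `describe` helper and the `CLASSES` labels are my own, not part of the standard.

```python
from http import HTTPStatus

# The first digit of a status code gives its general class.
CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code):
    """Return a one-line description such as '302 Found (redirection)'."""
    status = HTTPStatus(code)
    kind = CLASSES.get(code // 100, "other")
    return f"{status.value} {status.phrase} ({kind})"


for code in (200, 301, 302, 304, 400, 401, 403, 404, 500, 501, 503):
    print(describe(code))
```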
There are some very well-known misuses of HTTP status codes; the one we probably see most frequently is the use of a 302 instead of a 301. This is a problem because Googlebot regards 302s as temporary and therefore doesn’t index the resulting content. This occurs frequently because certain web-servers (notably Microsoft IIS) have redirects configured to be 302 by default.
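As a rough sketch of what the correct behaviour looks like, the Python below runs a tiny local server that answers a moved URL with an explicit 301, then checks the response without following the redirect. The paths `/old-page` and `/new-page`, the `RedirectHandler` class and the `check_redirect` helper are all hypothetical; on a real site you would configure this in the web-server itself.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class RedirectHandler(BaseHTTPRequestHandler):
    # Hypothetical map of permanently moved paths to their new homes.
    MOVED = {"/old-page": "/new-page"}

    def do_GET(self):
        target = self.MOVED.get(self.path)
        if target is not None:
            # Explicitly send 301 rather than relying on a default of 302.
            self.send_response(301)
            self.send_header("Location", target)
            self.send_header("Content-Length", "0")
            self.end_headers()
            return
        body = b"new page"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def check_redirect(host, port, path):
    """Return (status, Location header) without following the redirect."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        resp.read()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


# Port 0 asks the operating system for any free port.
server = HTTPServer(("127.0.0.1", 0), RedirectHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
status, location = check_redirect("127.0.0.1", server.server_address[1], "/old-page")
server.shutdown()
server.server_close()
print(status, location)  # 301 /new-page
```

`http.client` is used here because it reports the redirect response as-is, which is what you want when auditing; a client that silently follows redirects would only show you the final 200.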
More recently I’ve come across the common issue of a site working in the eyes of a user but returning the wrong HTTP status code. In this particular case the site was configured to show an error (404) page when a strange URL was entered, but that page was returned with a 200 OK status. This means that every broken or expired link leads to its own URL serving exactly the same content, which to a search engine looks like a large set of duplicate pages; I’m sure you can imagine the knock-on effects of this.
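A crude way to spot this kind of problem is to compare the status code with what the page actually says. The Python below is a heuristic sketch only; `looks_like_soft_404` and the `ERROR_MARKERS` phrases are assumptions you would need to tune for your own error page.

```python
# Phrases that typically appear on an error page (assumed; tune per site).
ERROR_MARKERS = ("page not found", "no longer exists", "cannot be found")


def looks_like_soft_404(status, body, markers=ERROR_MARKERS):
    """Return True when an error-looking page is served with a 200 status."""
    if status != 200:
        return False  # a real 404 (or any other code) is not a "soft" 404
    text = body.lower()
    return any(marker in text for marker in markers)


print(looks_like_soft_404(200, "<h1>Sorry, page not found</h1>"))  # True
print(looks_like_soft_404(404, "<h1>Sorry, page not found</h1>"))  # False
print(looks_like_soft_404(200, "<h1>Welcome to our shop</h1>"))    # False
```

In practice you would feed this the status and body returned for a deliberately nonsensical URL on your own site; if it comes back True, the error page is being served with the wrong code.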
Correct status codes shouldn’t be used only for search engine spiders but for all web clients. They facilitate many automated operations done over HTTP, and in my experience they’re one of the factors that indicate whether a site was built by a web-developer or by an application-developer asked to build a web-site.