It seems like every week we hear about more sites getting penalized for unnatural backlink profiles. From widgets to article directories, popular link building techniques from yesteryear are coming back to haunt website owners and SEO professionals alike. For many companies, backlink profiles are becoming more of a liability than an asset. Our goal should be to turn the tide and reclaim the valuable aspects of our link profile.
All websites should be actively auditing the links pointing to their sites to uncover unnatural links that may put their organic traffic at risk. Some link types that may have recently seemed in line with Google's Quality Guidelines are being pushed into the unnatural category. Other methods of easy links like directories, social bookmarking, or article directories may have been abandoned years ago, but are now causing link penalties for thousands of websites. As Google continues to tighten its grip on numerous forms of ‘unnatural links’, site owners are increasingly feeling the pressure to clean up poor quality backlinks, including ones that they had nothing to do with.
Overcoming a Penguin or Manual Link Penalty is hard work. In this post, we will examine one often-overlooked aspect of link penalty recovery – HTTP status response codes. During any backlink cleanup project, you will need to review a site’s backlink profile for unnatural links. For large sites with millions of backlinks, this task becomes extremely complex, but the goal is simple: find all unnatural links in an efficient manner. For sites that currently face a link penalty, recovery is going to involve contacting the webmasters of sites that host unnatural links, often multiple times, and asking for the links to be removed. For links that can’t be removed, you will need to create a disavow file.
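To make the disavow step concrete, here is a minimal sketch of generating the file's contents. It relies on the plain-text format Google's disavow tool accepts: one entry per line, a `domain:` prefix for whole domains, and `#` for comment lines. The function name `build_disavow` and the example hostnames are hypothetical, chosen only for illustration.

```python
from urllib.parse import urlsplit


def build_disavow(urls=(), domains=(), note=""):
    """Return disavow-file text: an optional '#' comment, 'domain:' entries, then URLs."""
    lines = [f"# {note}"] if note else []
    domain_set = {d.strip().lower() for d in domains if d.strip()}
    lines += [f"domain:{d}" for d in sorted(domain_set)]
    for url in sorted({u.strip() for u in urls if u.strip()}):
        host = (urlsplit(url).hostname or "").lower()
        # Skip URLs whose exact host is already listed as a domain-level entry.
        if host in domain_set:
            continue
        lines.append(url)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_disavow(
        urls=["http://spam.example/page1", "http://other.example/a"],
        domains=["spam.example"],
        note="Owner contacted twice, no response",
    ))
```

Listing a domain once is usually easier to maintain than listing every URL on it, which is why the sketch drops URLs that a domain entry already covers.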
One of the first steps in this cleanup process is downloading backlinks from Webmaster Tools and supplementing this with other backlink data sources like Ahrefs, Moz, or Majestic. You may decide to concentrate on cleaning up the links from Webmaster Tools first, as this list contains a good sample of the links that Google is using to understand your site. With backlink data in hand, you will want to crawl the backlink URLs to learn more about them (check out Screaming Frog). For example, you will want to know the status codes of linking URLs and the anchor text of the inbound links.
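Before crawling, it helps to merge the exports into a single de-duplicated list. The following is only a sketch under stated assumptions: each export has already been reduced to a plain list of linking URLs, and the source labels (`webmaster_tools`, `ahrefs`) and the function name `merge_backlinks` are hypothetical rather than anything those tools define.

```python
from urllib.parse import urldefrag


def merge_backlinks(sources, priority="webmaster_tools"):
    """Merge URL lists from several exports into one de-duplicated list.

    sources maps a source label to a list of linking URLs. Returns
    (url, set_of_labels) pairs, with URLs seen by the priority source first.
    """
    merged = {}
    for name, urls in sources.items():
        for raw in urls:
            # Drop surrounding whitespace and any #fragment before comparing.
            url = urldefrag(raw.strip()).url
            if url:
                merged.setdefault(url, set()).add(name)
    # False sorts before True, so priority-source URLs come first.
    return sorted(merged.items(), key=lambda item: (priority not in item[1], item[0]))


if __name__ == "__main__":
    exports = {
        "webmaster_tools": ["http://a.example/x#top", "http://b.example/"],
        "ahrefs": ["http://a.example/x", "http://c.example/"],
    }
    for url, seen_in in merge_backlinks(exports):
        print(url, sorted(seen_in))
```

Putting the Webmaster Tools URLs first mirrors the suggestion above to clean up that list before the supplementary sources.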
The status code provides a critical starting point for analyzing backlinks. Having this piece of information helps speed up the process of link analysis and also helps prevent some mistakes explained below:
200 status code: The URL is still valid and any inbound links on this URL could be helping or hurting your site. You will need a process in place to review these links for quality. In short, do you believe these links comply with Google’s Quality Guidelines? Quite frequently, commercial or keyword-rich anchors are a primary reason the site was penalized.
301 status code: Your crawler may report that no inbound links to your website exist on these URLs, and that is technically correct. Don’t ignore these URLs! You should follow the redirects and find the final destination URL in the redirect chain. If the terminal URL includes a link to your site, review this URL for signs of unnatural inbound links.
Note: If the destination URL includes an unnatural link, then you should consider disavowing both the initial URL (or domain) and the terminal URL (or domain) rather than one or the other. The initial URL could still be in Google’s index, as it can take Google a long time to recrawl low quality URLs. After the initial URL is recrawled and Google processes the redirect, the destination URL will be indexed. So by disavowing both URLs, you can avoid running into new problems down the road as Google updates its index.
302 status code: Once again, you should follow the redirects to review any inbound links on the terminal URL. In this case, you can usually disavow just the initial URL if you believe the redirect is indeed temporary, but we would recommend disavowing both to be safe.
400 level status code (400, 404, 410, etc.): The linking URL is no longer active. You don’t need to visit these URLs manually. If you have reason to believe these links may have been unnaturally acquired or come from poor quality domains, you should probably consider disavowing the URL or domain.
500 level status code (500, 503, etc.): These codes tell you that the server failed to deliver the page, often only temporarily, such as during site maintenance or a gateway timeout. Your crawler will report that no link exists. As these errors are often temporary, we recommend recrawling these URLs later to see if any return to a 200 or 301/302 response.
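The rules above can be sketched as a small triage function. This is an illustration rather than a definitive workflow: it assumes your crawler has already recorded each redirect chain as a list of (URL, status code) hops, and the function name `triage` and the action labels are made up for this example.

```python
def triage(chain):
    """Suggest a next step for one backlink URL.

    chain is a list of (url, status_code) hops, with the URL from the
    backlink export first and the final response last.
    """
    first_url = chain[0][0]
    final_url, final_status = chain[-1]
    # Initial and terminal URL, without repeating a URL that did not redirect.
    candidates = list(dict.fromkeys([first_url, final_url]))
    if 500 <= final_status < 600:
        # Possibly temporary: crawl again before deciding anything.
        return {"action": "recrawl_later", "review": None, "candidates": [first_url]}
    if 400 <= final_status < 500:
        # Page is gone; disavow only if the source looks low quality.
        return {"action": "consider_disavow", "review": None, "candidates": candidates}
    if final_status == 200:
        # Review the terminal page; if the link is unnatural, disavow every candidate.
        return {"action": "review_links", "review": final_url, "candidates": candidates}
    # Anything else, such as a redirect that was never followed to its end.
    return {"action": "inspect_manually", "review": None, "candidates": candidates}


if __name__ == "__main__":
    print(triage([("http://a.example/old", 301), ("http://a.example/new", 200)]))
    print(triage([("http://b.example/", 503)]))
```

For a redirected URL, the candidates list holds both the initial and the terminal URL, matching the advice to disavow both rather than one or the other.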
If used correctly, HTTP status codes can provide valuable information about a backlink before you visit it manually. Redirects are especially tricky, as you will want to consider disavowing both the initial and destination URL in the redirect chain. 400 level responses let you know ahead of time that the link is no longer active, but these URLs may still be indexed and counting towards your penalty. 500 level responses let you know the linking URL is not currently active but may become active again soon. Ignoring these 500 level URLs could hinder your cleanup if these URLs are normally available and contain unnatural links. Link penalty cleanup can be a big task with lots of potential steps to trip you up. Best of luck to you on this journey. If you have found more ways to utilize status codes to recover from a link penalty, please share below!