Ayima Offices

Competitive Backlink Analysis by Jane Copland

Let’s start by comparing search engine markets to sport. You might find that you are faster and stronger than you have ever been before, but until you race someone else, those results don’t mean terribly much. Everyone has to run together.

Participants in a lot of sports spend time analysing their rivals’ tactics and plays. The best of them are doing what the best search engine marketers are doing: they are not looking to copy tactics play-by-play. They are learning what has made their rivals better, and what their weaknesses are, so they can develop their own tactics to compete.

Competitor Identification

Before analysing what individual sites are doing, you need to identify who your true competitors really are. There is sometimes a difference (and sometimes no difference at all) between the top ten Google results for a site’s primary key phrase, and the true lay of a market’s land.

Analysing a market involves data, rank checking and battles with the Google AdWords API for search volumes, but it’s absolutely possible to do this in-house. I’ll take the UK housing market as an example. In order to get a true perspective of the market as a whole, we gather data from quite a few places, including the top 30 ranking URLs for a large number of industry-related searches, along with Google’s traffic estimates and our own click through rate estimations.

Each domain whose URLs appear is given scores depending on how much traffic those URLs can be expected to receive for their rankings. For the UK housing market, the competitive landscape currently looks like this:
UK Property Market - Market Intelligence Report
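
To make the mechanics a little more concrete, here’s a minimal sketch of how a visibility score along these lines could be put together, assuming you already have rank-check results and AdWords search volumes to hand. The CTR curve, field names and function names are illustrative assumptions, not our production model.

```python
# Minimal sketch of a market visibility score. Assumed inputs:
#   rankings: {(keyword, domain): position} from a rank check of the top 30 URLs
#   volumes:  {keyword: estimated monthly searches} from the AdWords API
# The CTR curve below is purely illustrative, not Ayima's actual model.
from collections import defaultdict

ILLUSTRATIVE_CTR = {1: 0.30, 2: 0.15, 3: 0.10, 4: 0.07, 5: 0.05}

def ctr_for_position(position):
    """Rough click-through rate estimate for a given ranking position."""
    if position in ILLUSTRATIVE_CTR:
        return ILLUSTRATIVE_CTR[position]
    return 0.02 if position <= 10 else 0.005  # long tail of pages two and three

def traffic_scores(rankings, volumes):
    """Sum expected monthly clicks per domain across every tracked keyword."""
    scores = defaultdict(float)
    for (keyword, domain), position in rankings.items():
        scores[domain] += volumes.get(keyword, 0) * ctr_for_position(position)
    # Highest expected traffic first, mirroring the report above
    return dict(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))
```
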
The Traffic Score here does not equal exact SEO traffic estimates: Google’s traffic predictions can be dubious. However, we can use traffic estimates to judge a site’s visibility relative to others.

In a more streamlined market, a keyword list is a lot easier to compile. Market Intelligence Reports like this example will often reflect the top-ten ranking sites for high volume queries in markets where keywords are fairly uniform (e.g. [online bingo] and [ink cartridges]). In very diverse markets (clothing, current events, etc.), this often won’t be the case. For our UK property market, some well-ranked sites specialise in houses, some flats, some sales and some rentals.

But identifying the strongest sites in an industry isn’t the only thing you should be doing when dealing with market-wide data. When you are keeping an eye on a broad cross-section of a market, you see things like this:
UK Property Market - Wall Street Journal ranking
In this example, the Wall Street Journal is ranking for two (and only two) of our valuable property keywords. However, at least one of those keywords is valuable enough and its ranking high enough that wsj.com has moved up 567 places in our market ranking, and can expect to receive a lot of traffic with that new ranking. Maybe it means nothing to others in the market and the wsj.com ranking will be gone next week. Whatever the reason, you’re at a big advantage if you have an overview of all changes and fluctuations. If this isn’t an anomaly, you’ve just noticed a new competitor before everyone else has. Such a massive jump, even from a powerhouse domain like the Wall Street Journal, isn’t that common. However, if you are engaging in weekly market analysis research, no new competitor will ever take you by surprise.

Whether a site makes a big leap into the market, or moves up more slowly (which is far more common), you will see its progress. And this is an important part of competitive analysis: it is not about blindly copying other people’s link development, or even about obsessively watching your own site rise and fall in a market. It’s about not being taken by surprise, recognising who is rising and falling at any one time, and finding out why.

A new property site from a major newspaper? A big link building campaign push by a small affiliate? Some tricky redirection work by existing brands, purchasing new domains? Different types of links making a difference? Panda making property search engines drop and property news websites rise? Many newly-competitive domains will start out by ranking for long-tail terms while they climb through pages four, three and two for their target keywords. Changes like these will be reflected here before you’ve noticed them making a big difference on your core terms.
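
As a rough illustration of this kind of week-on-week monitoring, here’s a minimal sketch that compares two snapshots of the market ranking and flags sharp movers like the wsj.com jump above. It assumes each snapshot maps a domain to its traffic score; the jump threshold is an arbitrary assumption.

```python
# Sketch of week-on-week market monitoring. Assumes two snapshots of the
# report above, each a dict of {domain: traffic_score}; min_jump is arbitrary.
def market_movers(last_week, this_week, min_jump=50):
    """Flag domains whose position in the market ranking improved sharply."""
    def positions(snapshot):
        ordered = sorted(snapshot, key=snapshot.get, reverse=True)
        return {domain: rank for rank, domain in enumerate(ordered, start=1)}

    old_pos, new_pos = positions(last_week), positions(this_week)
    movers = []
    for domain, rank in new_pos.items():
        previous = old_pos.get(domain, len(old_pos) + 1)  # new entrants count as last
        if previous - rank >= min_jump:
            movers.append((domain, previous, rank))
    # Biggest climbers first
    return sorted(movers, key=lambda m: m[1] - m[2], reverse=True)
```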

Backlink Analysis

After noticing sites continuing to rise through a market, you need to look for reasons behind the movement. From an off-page perspective, it’s advisable to keep fairly close tabs on the backlink profiles of your primary competitors, but when someone enters a market for the first time, you likely haven’t been tracking their link profiles over time.

The first thing you need is a comprehensive backlink crawl, but alongside this, I like to know just how sustainable a new competitor’s SEO really is. Finding a flagrant case of dangerous link acquisition isn’t a common result of backlink analysis, but sometimes you stumble on a newly ranking site whose backlink discovery looks like this:
Example Link Timeline Graph - Unsustainable Link Dev
Few prizes would be handed out for guessing the quality of this particular backlink profile, but most sites have a good enough framework of backlinks that you can’t accurately guess the outcome of a full crawl.
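
If your crawl data includes a first-seen date for each backlink, a quick pass over those dates will surface discovery spikes like the one in the graph above. This is only a sketch, and the spike threshold is an arbitrary assumption.

```python
# Sketch of spotting an unsustainable link-acquisition spike. Assumes the crawl
# reports a first-seen date (datetime.date) per backlink; the 5x threshold is
# arbitrary.
from collections import Counter

def discovery_spikes(first_seen_dates, factor=5.0):
    """Return months in which new-link discovery far exceeds the median month."""
    per_month = Counter((d.year, d.month) for d in first_seen_dates)
    if not per_month:
        return {}
    counts = sorted(per_month.values())
    median = counts[len(counts) // 2]
    return {month: n for month, n in per_month.items() if n > factor * median}
```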

It surprises me how few people seem to value the distribution of Class C IP links and the anchor text in backlink profiles. When we perform backlink crawls, our results are always sorted by the number of Unique Class C IPs linking, rather than by individual links, domains or hosts.
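
Here’s a minimal sketch of that grouping, assuming each backlink returned by your link-data API carries an IP address and an anchor text (the field names are assumptions).

```python
# Sketch of grouping a backlink profile by anchor text and unique Class C IPs.
# Assumes each backlink is a dict with "ip" and "anchor" keys (field names are
# assumptions; most link-data APIs can supply both).
from collections import defaultdict

def class_c(ip):
    """First three octets of the IP, e.g. '123.123.123.45' -> '123.123.123'."""
    return ".".join(ip.split(".")[:3])

def anchors_by_unique_class_c(backlinks):
    """Count unique Class C ranges per anchor text, strongest anchors first."""
    subnets = defaultdict(set)
    for link in backlinks:
        subnets[link["anchor"].strip().lower()].add(class_c(link["ip"]))
    return sorted(((anchor, len(ranges)) for anchor, ranges in subnets.items()),
                  key=lambda row: row[1], reverse=True)
```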

Grouping links by Class C IP ranges and then ordering by anchor text has proven to be the most effective way of predicting and understanding a website’s performance, especially when it comes to sites that build too many commercial backlinks and not enough brand links.

It also surprises me how people take a linking domain at face value, often not bothering to find out what sort of link a site has from, say, a prominent national newspaper. “Well damn,” they say. “They’ve got university links; they’ve got newspaper mentions. How are we going to compete with that?” On further investigation, you find that some newspapers like to supplement advertising revenue with text links and pay-per-post styled advertorials.

This is by no means the same as finding a competitor who is routinely cited in news stories, who perhaps has a ruthless PR consultant on-board with a BlackBerry address book full of journalists’ phone numbers and their favourite drinks.

Knowing the difference is very important.

Relationship Links

There’s also no point fretting over a competitor’s links from their associates and partners. We call these Relationship Links, because they’re the result of activity that has nothing to do with SEO. You can’t replicate them, but you can work out who your comparable partners are, and negotiate your own relationship links instead.
Relationship Links

Copycat SEO

At no point should competitive analysis be about copying someone else’s backlink profile. Use it for inspiration, new ideas and a better understanding of what Google values.

With backlink analysis comes the discovery of spam. This is basically unavoidable, save for when dealing with brand new domains and meticulously white hat link builders. Even if a domain has been owned by someone who is careful with SEO, junk links point to almost every website you care to research. Some of the junk pointing to your competitors’ sites helps them rank. I am not talking about the obvious automated spam that they most likely did not build, but the old, lazily-acquired links that helped a site out in 2003. Those links have sat around for years, passing PageRank, and some of them still do. This does not mean that a newly-acquired link on the same site would be as beneficial, which is another good reason never to fall into the trap of trying to copy.

Ongoing Backlink Monitoring

There are some interesting things you can discover about competitors’ linking activities by simply re-crawling their links on a semi-regular basis. Consider how many paid links are bought for a twelve-month period. Undoubtedly, some paid links exist for longer than this, due to webmasters either seeking payment for another year or forgetting about the link. However, a high turnover of medium-quality links on a twelve-month cycle is a fairly good indicator of a certain link building strategy.

On that point, I would be confident that Google takes notice when decent links come and go on a regular yearly basis too. This isn’t to say that links won’t appear and disappear: as SEOmoz found, the churn rate of the Internet is higher than you’d likely guess. If you are meticulous in re-crawling, however, you may see patterns in backlink profile changes that you would otherwise miss.
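
The simplest version of that re-crawl comparison looks something like the sketch below, assuming each crawl has been reduced to a set of (source URL, anchor text) pairs; the names are illustrative.

```python
# Sketch of diffing two crawls of the same competitor. Assumes each crawl has
# been reduced to a set of (source_url, anchor) tuples; names are illustrative.
def crawl_diff(previous, current):
    """Return links gained and links lost between two dated crawls."""
    return {"gained": current - previous, "lost": previous - current}

# Run this against dated snapshots and a yearly churn of similar-looking links
# (e.g. paid placements quietly expiring after twelve months) starts to stand out.
```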

Detecting Link Networks

Sometimes, you will come across websites where someone has tried to build backlinks via their own custom network of linking domains. This is a particularly poor way to do backlink development because it is so easily detectable: it is very difficult to maintain a large number of websites without creating footprints between them.

Few people are imaginative enough, cautious enough or rich enough to create a truly great link network. Due to the difficulty of maintaining hundreds or thousands of websites on different servers, tied to different hosts and different identities, these sites also tend to drop off the web quickly, like twelve-month-old paid links. Still, a good number of backlink networks are helping their chosen sites rank well. Again, this doesn’t mean you should create your own network when you find one working for someone else. You do, however, need to understand the link network so you know what you’re up against.

“Good” link networks don’t just link to the hub website, but they do often link only to sites in the hub site’s niche, and they are usually fairly uniform in subject matter, all closely related to the hub site’s own. The following backlink summary is a real link network that Ayima found:
Link Network Footprint
Note that the creators knew about diversifying their C Class IPs (many “SEO Hosting” companies now exist to make this easier). However, the C Classes (123.123.123.*) were sequential, and the number of B Classes (123.123.*.*) was very low.
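
Here’s a minimal sketch of that subnet check, assuming you have the list of linking IPs from a backlink crawl. It simply counts unique B and C Classes and looks for sequential /24s.

```python
# Sketch of the subnet footprint check. Assumes `ips` is a list of linking IP
# addresses pulled from a backlink crawl.
def subnet_footprint(ips):
    c_classes = sorted({tuple(int(octet) for octet in ip.split(".")[:3]) for ip in ips})
    b_classes = {c[:2] for c in c_classes}
    sequential = sum(1 for a, b in zip(c_classes, c_classes[1:])
                     if a[:2] == b[:2] and b[2] - a[2] == 1)
    return {
        "unique_c_classes": len(c_classes),
        "unique_b_classes": len(b_classes),    # suspiciously low in the example above
        "sequential_c_classes": sequential,    # adjacent /24s hint at one provider
    }
```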

Some other common traits of sites in link networks include similarities in information architecture, URL structure, artwork, file naming and internal redirection (if they serve ads, do they all use the same ad server?), and, in amusing cases, similar affiliate IDs if the webmasters have decided to make some affiliate income on the side.
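
Checks like these can be partly automated. The sketch below fetches a handful of suspected homepages and looks for shared affiliate or tracking IDs; the regexes, and the use of the requests library, are illustrative assumptions rather than a definitive footprint list.

```python
# Sketch of checking suspected network sites for shared on-page footprints such
# as affiliate or tracking IDs. The regexes and the use of the requests library
# are illustrative assumptions, not a definitive footprint list.
import re
from collections import defaultdict

import requests

FOOTPRINT_PATTERNS = {
    "affiliate_id": re.compile(r"[?&](?:aff|affid|ref)=([\w-]+)", re.I),
    "analytics_id": re.compile(r"UA-\d{4,10}-\d{1,3}"),
}

def shared_footprints(domains):
    """Map each footprint value to the set of domains whose homepage contains it."""
    seen = defaultdict(set)
    for domain in domains:
        try:
            html = requests.get("http://" + domain, timeout=10).text
        except requests.RequestException:
            continue
        for name, pattern in FOOTPRINT_PATTERNS.items():
            for match in pattern.findall(html):
                seen[(name, match)].add(domain)
    # Footprints that appear on more than one domain are the interesting ones
    return {key: sites for key, sites in seen.items() if len(sites) > 1}
```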

In particularly simple cases, Bing allows you to see all sites on one IP:
Bing Reverse IP Lookup
Domain Tools will also show the sites on a C Class:
Domaintools Reverse IP Lookup
And if you’re after a quick look at hosting information, the Flagfox Firefox plugin will spill it for you. The flag icon acts as a link to the Geotool page:
Geotool
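
If you’d rather script a quick hosting check for a short list of domains, a plain DNS resolution grouped by Class C does much the same job; this is only a sketch.

```python
# Sketch of a quick, scriptable hosting overview: resolve each domain and group
# the results by Class C. Assumes plain A records are enough for a first look.
import socket
from collections import defaultdict

def hosting_overview(domains):
    """Group domains by the /24 their resolved IP falls into."""
    by_subnet = defaultdict(list)
    for domain in domains:
        try:
            ip = socket.gethostbyname(domain)
        except socket.gaierror:
            continue
        by_subnet[".".join(ip.split(".")[:3])].append((domain, ip))
    return dict(by_subnet)
```
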
Again, building a network isn’t something we recommend doing. The effort needed to build something with the legs to outrun Google for long periods of time is immense. It’s not a good long term SEO strategy to build (or rent/buy space on) a network that fools search engines for a short period, and a “good” network can be many times more expensive to maintain than different, sustainable link building tactics.

In summary, competitive research is threefold:

  • Market analysis
  • Ongoing backlink crawling
  • Understanding a backlink profile

No athlete would show up at an event without knowing roughly how good the competition was, and many will also know a fair bit about how the competition trains and how they race. Even if your tactics are different from those of your competitors, it’s to your advantage to understand their methods.

Photo Credit: Poiseon Bild & Text

 

Jane Copland

Born and raised in New Zealand until the age of 18 when she moved to the United States to take up a swimming scholarship at Washington State University, Jane has always loved computers. "I remember the first time I used ...


Showing 17 comments

richardbaxterseo

Thank god you’re blogging about SEO again Jane. Brilliant read!

Darren

Hi there, you mention “Class C” addresses a lot in this blog post, and it seems a regular point of contention with regards to SEO. Classifying IP addresses in this way has largely been deprecated since the mid-90s with Classless Inter-Domain Routing (CIDR), which means that you can’t really gain any meaningful information regarding the make-up of a network where a site is hosted just by looking at the third octet of an IP address (i.e. xx.xx.yyy.xx). It would be puzzling if Google, with all of their engineering talent, would use this as a metric for valuing links, since it returns no meaningful data in this day and age with CIDR.

However, what you can find is an AS#, which tells you which network a site is hosted on. A network can be made up of consecutive /24 blocks (i.e. multiple adjacent “class Cs” in the old terminology) or it can be apparently disparate “class Cs”. From a networking perspective, there is no distinction between the two, because they are on the same AS#.

As such when detecting the value of a link, looking at the different AS#s involved should be far more relevant than simply looking at the third or second octet of the IP address. I think correlation could be being mixed with causation here, since a single hosting provider is likely to be allocated many consecutive class C ranges (but not necessarily so, they could be disparate), but very rarely consecutive class Bs. So you’re seeing those with lots of “class Bs” as ranking more highly, because those with more class Bs are using more hosting providers, which translates to more AS#s. Those with less class Bs, or consecutive Class Cs, are using less (or just a single) hosting provider, which would appear on just one AS#.

So if this were the case, what you’re really calculating when valuing links (and when detecting link networks) is the number of separate AS#s the links are coming from. An obvious link network will have a very low number of AS#s, whilst a natural ranking strategy will see a large number of them.

Let me know your thoughts.

Regards,

Darren

Rob Kerry

Darren: It’s a lot easier to talk in terms of a Class C IP range than a /24 block – most people in the hosting world do the same when mentioning a /24.

We’ve found that analysing by Class C correlates well with rankings, but your ASN theory would certainly be a good test. Some ISPs do of course also have multiple ASNs e.g. after acquiring other networks or when separating out their global network.

Either way, it’s important to move away from domain and IP filtering, as these correlate poorly to authority and ranking in our studies.

Fresh SEO

Is there a link tool that will return the Class C and Anchor Text for every link? I think Majestic will give you IP address, but no anchor text

Daniel Rosenhaus

Blekko’s /seo tag is also great for seeing a lot of other great SEO data on a domain like who else is on a C block, IP, link origin, where a site is hosted.

David

Love it – certainly discovering the relationship links, but also playing on this, as you can find similar relationships with just-as-powerful websites in the same vertical that will potentially link to you.

The C class is a bigger mystery. I’ve recently moved a bunch of sites to a VPS to improve uptime/reliability, and the interesting point is that they now mostly share the same C class IP, although a majority have their own unique IPs. It’s just another element of complexity to include, but there are ways to spread the love across different C class IP addresses – it just takes more time and effort, and so far it doesn’t appear to be needed.

Gareth James

@Fresh SEO I’ve just had a play with MajesticSEO and cannot see a way to sort a download by anchor text and c class. I may be wrong, though – I don’t find it user-friendly at all.

Rob Kerry

@Fresh SEO – I think that Raventools has this built into their backlink analysis tool, but I might be wrong. We’ve got an internal PHP tool that processes data from any link data API and does this grouping and sorting for us. I’ll see if we can release the code.

@Daniel – Blekko is a great data source, I just wish they wouldn’t keep blocking IPs! I must admit, I haven’t given their API a go yet – which could make competitor research a lot easier.

Jane Copland

Hey guys, thanks for all the comments.

Regarding the reporting of c classes (acknowledging both Rob and Darren’s points about the usefulness of the data), it certainly surprises me that few popular backlink tools allow you to filter by c classes. It seems like a sensible metric to show.

As Rob said, we’ve found it a better metric by which to sort than most others that we grab. One thing I should have mentioned is that a large number of links from one network (using the term loosely) hardly means anything bad – this is why we also have a separate tab in our backlink crawls for site-wide links, which sometimes make backlink profiles look worse than they are. In actual fact, sites can have legitimate blogroll links that skew some mythical perfect profile.

James Holden

I use exactly this intelligence led approach when pitching for new work. It’s essential to understand the competition before you can even begin to think about how to quote for a job.

It’s also very surprising how many times you see such narrow backlink profiles like the example above. Many SEOs are putting their eggs in far too few baskets. Diversity is essential because you’re always at risk from link networks being discovered (even if you weren’t aware your links were part of one, especially if you outsourced).

I wouldn’t be surprised if Google was looking at AS numbers, but I also wouldn’t expect it to play a large part in their algorithm. As the data above proves, and my experience agrees, most SEO hosting companies advertising multiple class C’s actually have consecutive class Cs so a simple analysis based on class B subnets is probably sufficient. For now.

When IPv6 is widespread, all of this will go out the window of course!

P Courtney.

Very interesting article on what not to do. As a newcomer, you have certainly enlightened me about backlinking strategies and helped me rethink my own strategy. However, I’m not sure how to backlink for maximum effect other than tracking my competitors and trying to outgun them.

Sandra

Nice read – I come across dubious link building practices more often than I’d like. But I don’t see the sites employing these tactics (or rather their SEO agencies employing these tactics) being penalised.

You’ll know how small an economy New Zealand is. Do you think a site whose backlink profile consists of link farms and low value links from unrelated international sites manages to retain its high ranking simply because none of its competitors are bothering to do it better?

Sean

Interesting your Geotool analysis was run on a site I used to do PPC/SEO for!

Jane Copland

@Sean – What a coincidence! I am pretty sure I just used whichever site happened to be open in Firefox at the time of writing :)

Dennis Narvedsen

Hi Jane

Fantastic read. I’ve spent several years analysing links and link networks and still managed to learn a couple of new things here. I’m going to recommend this to anyone asking for a guide on how to do a good competitive analysis on links.

Jane Copland

Dennis, thanks for the kind words!

Fresh SEO

@Rob Is it difficult to develop a tool like this in PHP? What about costs?