Cleaning Up Your Link Profile
A Detailed Guide to Link Profile Cleaning
April 01, 2014
On 24 April 2012, Google announced an algorithm update called Google Penguin which was to have a major impact on search results. Many websites now need to clean their back link profiles to account for Penguin and this article shows you how.
Link profile cleaning involves creating a list of the links that point towards a website, then getting in touch with the webmasters behind the links you don’t want and asking them to remove those links. The links you want removed are considered ‘low quality’ and this cleaning process ensures that you are left with only high quality links. While it is relatively easy to define link profile cleaning, carrying it out is a different matter entirely.
Although Google gives you the chance to download a list of links that point to your website, this is only a fraction of what is available. A combination of the following trio of resources will help us create a more comprehensive list: Google Webmaster Tools, SEOMoz’s Open Site Explorer and Majestic SEO.
Personally, I use the silver package from MajesticSEO for $49.99 a month and the Pro Package from SEOMoz for $99 a month (the first 30 days are free so take advantage). If you are serious about improving the SEO ranking of your websites, I wholeheartedly recommend opening accounts with both of these companies. The amount you pay per month will ultimately be dwarfed by the results these accounts can provide. To begin, we take all possible links from Google Webmaster Tools (GWT).
Begin by logging into your GWT account. Click on the requisite URL to decide upon the site you wish to manage. In the screenshot below, you can see that I have chosen www.stevenforsyth.com. Every site you manage should be included in your account and if they are not, it will be necessary to upload a verification HTML file to the server. This file can be downloaded from the GWT website before you upload it to the root of your live server. Click on ‘Verify Site’ which is located next to the website you have just added on GWT.
Once you have selected your website, you will end up on the GWT ‘Dashboard’ which is essentially an overview screen. There will be a navigational menu on the left hand side of your screen and you can expand the menu by clicking on the ‘Traffic’ link. When you click ‘Links to your Site’, the following screen will greet you:
The screen above shows the following: Who Links the Most, Your Most Linked Content and How your Data is Linked. Choose ‘Who Links the Most’ and click ‘More >>’ (found on the bottom of the links) to see this screen:
There are three fairly large grey buttons on top of the screen. The button on the right, ‘Download latest links’ is what we must focus on. It is a relatively new inclusion on GWT and came about specifically because of Google Penguin. As cleaning your profile is of paramount importance now, Google have helpfully added an option which enables you to download only the most crucial links. Clicking the above button results in all your links being downloaded into a CSV file.
This CSV file can be opened in Excel and should include a list of links from the previous three years. As we only require a single column for links, delete Column B. Once this has been achieved, ensure that the file is saved as Entire-Link-Profile.CSV.
Using SEOMoz’s Open Site Explorer
This is another method of getting useful back links. Log into your account and add the address of the site you wish to search before clicking on ‘Search’. This will allow you to see every link that is currently pointing to the website. Yet adjustments are necessary before the list can be downloaded.
The screenshot below this paragraph contains four drop down menus but the first one is irrelevant for our purposes. Set the second menu to ‘Only External’ in order to ensure that only external links pointing to your website are downloaded. ‘Pages on the Root Domain’ should be selected in the third menu while the final menu should be left alone.
Click the Filter button and then the Download CSV link, which is located immediately below it. After downloading your new file, open it and take note of all the rows with URLs. Select every URL in the column, then copy and paste them into the Entire-Link-Profile.CSV file beneath the existing links.
Using Majestic SEO
This is a high quality website that is always a great source of back links. Log into your account and enter the URL into the search box. As we are looking for all links, not just recent ones, it is important to click on the History Index radio button as it provides details of every link that has been created, even those which are inactive. The screen below is what you will see once you click ‘Search’ and you should now choose the Backlinks tab.
Choose Remove Deleted Backlinks before clicking ‘Explore’ to receive a file update. Scroll to the very end of the page to see the Download CSV link on the extreme left hand side.
After downloading the file, open it and copy the rows of URLs like you did before and put them in the Entire-Link-Profile.CSV file to have a complete list.
Use Excel to open this file, choose the complete URL column, click on the Data menu and finally, choose Remove Duplicates to get rid of the duplicates which will inevitably pop up after using a trio of different sources. The last step is to save the file.
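If you would rather not do the copying, pasting and duplicate removal by hand, the same merge can be sketched in a few lines of Python. This is only a rough sketch under stated assumptions: the sample exports and their column headers are made up for illustration, and real exports from each tool may place the URL in a different column. Unlike Excel, the comparison here is case-sensitive.

```python
import csv
import io


def read_url_column(csv_text, column=0):
    """Return one column from CSV text, skipping the header row."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    return [row[column] for row in rows[1:] if len(row) > column]


def merge_link_lists(*sources):
    """Merge several lists of URLs, dropping blanks and exact duplicates.

    The first occurrence of each URL is kept, so the original order survives.
    """
    seen = set()
    merged = []
    for source in sources:
        for url in source:
            url = url.strip()
            if url and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged


# Hypothetical sample exports; real files would be read from disk instead.
gwt_export = (
    "Links,First discovered\n"
    "http://a.example/page,2013-01-01\n"
    "http://b.example/,2013-02-01\n"
)
ose_export = "URL\nhttp://b.example/\nhttp://c.example/post\n"

profile = merge_link_lists(
    read_url_column(gwt_export),
    read_url_column(ose_export),
)
print(profile)
```

The merged list can then be written back out as Entire-Link-Profile.CSV, one URL per row.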
While we now have a file with a huge number of links, it is common for links to vanish over time for a variety of reasons. As a result, it is necessary to test the list of URLs to discover the active ones, as only they are useful for our purposes. Screaming Frog is an excellent piece of crawler software and is what we will be using. At £99 a year plus VAT, it represents a value for money investment. Click the links to download the software and buy a license.
Using Screaming Frog
Once you open Screaming Frog, you will come across the menu below. Click on <mode> and <list>.
Next, choose ‘select file’ and locate your Entire-Link-Profile.CSV file. Open it and find out how many URLs have been found. Close the dialog box and take the next step which is Custom Source Code Filter configuration. On the menu, choose <configuration> and <custom>. There will be a large number of options on offer and you will pick <Does Not Contain> in the first dropdown which is marked as ‘Filter 1’. Add your website address in the input field which can be seen in the screenshot below.
After clicking on <Custom>, choose Filter 1 and press ‘Start’.
You will find that this tool has the ability to go through your list of links and remove all inactive ones. These can be erased from the CSV file and you’ll be left with a list of links that can be reviewed and removed if necessary.
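To make the idea behind the ‘Does Not Contain’ filter concrete, here is a minimal Python sketch of the same check. It is not how Screaming Frog works internally: the tool searches the page source for plain text, whereas this sketch is slightly stricter and only looks at the href of each anchor tag. The function name and sample HTML are assumptions for illustration, and fetching the pages is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class HrefCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def page_links_to(html, domain):
    """Return True if any anchor in the HTML points at the domain or a subdomain."""
    parser = HrefCollector()
    parser.feed(html)
    domain = domain.lower()
    for href in parser.hrefs:
        host = (urlparse(href).hostname or "").lower()
        if host == domain or host.endswith("." + domain):
            return True
    return False


# Hypothetical page source for illustration.
sample = '<p>See <a href="http://www.example.com/offending-page.htm">this</a></p>'
print(page_links_to(sample, "example.com"))
```

Any URL in the list whose page no longer links to you is inactive and can be dropped from the CSV file.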
Back links have been taken from three different sources, have been copied to a single file and all inactive and duplicate links have been removed. At this stage, the process of link profile cleaning can begin in earnest. This includes the identification of low quality links, the gathering of contact details, getting in touch with the relevant webmasters and recording the date these individuals were first contacted. A spreadsheet is an extremely convenient method of recording all this information.
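As an illustration of what such a spreadsheet might hold, the sketch below writes one row per offending link. The column names, the status values and the sample record are my own assumptions, not a required layout; adapt them to whatever you actually need to track.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class OutreachRecord:
    """One row of the outreach log (hypothetical columns)."""

    link_url: str
    anchor_text: str
    contact_email: str
    first_contacted: str  # date of first contact, e.g. 2014-04-01
    status: str = "awaiting reply"


def to_csv(records):
    """Serialise the records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=[f.name for f in fields(OutreachRecord)],
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()


log = [
    OutreachRecord(
        link_url="http://spam.example/page",
        anchor_text="cheap widgets",
        contact_email="owner@spam.example",
        first_contacted="2014-04-01",
    )
]
print(to_csv(log))
```

Keeping the date of first contact in its own column makes it easy to see later which webmasters never replied.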
There are a host of different spam links and while plenty are easy to spot, some can remain undetected for a long time if you don’t actively look for them. The following is a concise guide which will hopefully help you identify and remove spam links.
Managed Blog Networks
Blog networks contain a large number of blogs and don’t tend to have a definitive owner. These were heavily targeted by Google Penguin. Managed blog networks expire within 12 months and have no traffic. It is common for these blogs to have a large volume of content with contextual links that utilise exact match anchor text. You’ll find that these blogs will have a closed comments section and no available contact details. The Whois database is a good resource if you’re looking for details of blog ownership.
Blog Comment Spam
This can become an absolute plague if the blog is not well moderated. Spam comments can be regular comments with exact match anchor text, or gibberish posted simply to get a link. Check the Whois database to see if there are details about ownership. If you find none, links of this nature need to be on your banned list.
Link Exchange Pages
At one time, the ‘link to my site and I’ll link to yours’ attitude was prevalent. This practice is also called reciprocal linking and makes sense if you link to someone in your niche. It can also be extremely useful for visitors if the links in question lead to valuable information. The problem is, spamming is very common as football websites link to knitting websites and so on.
In this instance, Google does not deem these links to be valuable as its mission is to improve the user experience. If you have irrelevant links on your website, remove them immediately. If the links are relevant and you keep them, avoid the use of exact match anchor text; the reason why is covered below.
Exact Match Anchor Text Links
These links can appear in many forms and in any location. It doesn’t matter where these links are placed, however, as Google will penalise websites that use too many of them. You need to remove links of this nature that are irrelevant. If you want to keep links to sites in your niche, change the anchor text, preferably to the naked URL of the site or your brand name.
That being said, having some exact match anchor text links is a good idea as long as they are used prudently. For instance, it is ok to have these links if the sites are of premium quality and link back to you in a natural way.
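To get a feel for how heavily your profile leans on exact match anchor text, you can count what share of your links use your target keywords. The sketch below is a rough illustration only: the sample anchors and keyword are made up, and Google does not publish a threshold, so treat the number as a guide rather than a pass mark.

```python
from collections import Counter


def anchor_text_shares(anchors):
    """Return each anchor text's share of all links, ignoring case and blanks."""
    counts = Counter(a.strip().lower() for a in anchors if a.strip())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {text: count / total for text, count in counts.items()}


def exact_match_share(anchors, keywords):
    """Return the combined share of links whose anchor text is a target keyword."""
    wanted = {k.strip().lower() for k in keywords}
    shares = anchor_text_shares(anchors)
    return sum(share for text, share in shares.items() if text in wanted)


# Hypothetical anchor texts taken from a link export.
anchors = ["Cheap Widgets", "cheap widgets", "Acme Ltd", "www.acme.example"]
print(exact_match_share(anchors, ["cheap widgets"]))
```

Brand names and naked URLs count towards the remainder, which is where you want most of your links to sit.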
Non-Industry Related Directories
Before Penguin, this was a common method of building large volumes of exact match anchor text links to a website. All you needed to do was submit details of your website to as many online directories as you could. There are, however, two major issues when it comes to using this method:
• Penguin severely punishes excessive usage of exact match anchor text links.
• Google will penalise websites that have links from directories in an unrelated industry.
You need to contact the webmaster to remove these links if the directories are managed and once again, you will need to use the Whois database to find contact details.
Probably the most important part of your profile cleaning is contacting the webmasters and requesting the removal of your link. You must remember that when these webmasters linked to your site initially, most of them did so as part of a request by you or your SEO team. So be polite in your first contact with them! There are lots of approaches one can take but I have found the following method to be the most successful:
Dear Webmaster (try to use the contact’s name; you may find it in the Whois database)
I am contacting you because my website, www.example.com, recently received a penalty during the latest Google Penguin update. We received notification from Google to clean our link profile and submit our site for reconsideration.
In light of this, we are removing any links that are not directly associated with our business. We have not been told that your site is considered to be one of the offending links, but we would like it removed anyway. Once we have removed all the links on our list we will then submit our site for reconsideration.
Unfortunately, if this link is not removed by next week, we will then have to submit the URL to Google’s Disavow Tool. We would prefer not to do this as it benefits neither of us. I hope and trust you can remove the link soon and apologies for the inconvenience.
The Link can be found here: www.example.com/offending-page.htm
The anchor text used is: Exact Match Anchor Text
Of all the different formats I have used in my emails, the above method easily yielded the best results. The email appreciates the work of the webmaster, is polite yet threatens action. If you get no response, it may be necessary to add them to the Disavow list. This is not a good option for anyone and Google will possibly want manual removal of the link.
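If you have dozens of webmasters to contact, filling in the email by hand gets tedious. The sketch below shows one way to generate each message from your spreadsheet rows. The template is an abridged version of the email above, and the placeholder names and the function are my own assumptions for illustration.

```python
from string import Template

# Abridged version of the email in this article; placeholders start with "$".
REMOVAL_REQUEST = Template(
    "Dear $name,\n"
    "\n"
    "I am contacting you because my website, $site, recently received a "
    "penalty during the latest Google Penguin update. We are removing any "
    "links that are not directly associated with our business and would "
    "like the link below removed.\n"
    "\n"
    "The link can be found here: $link_url\n"
    "The anchor text used is: $anchor_text\n"
)


def build_removal_request(name, site, link_url, anchor_text):
    """Fill in the template; a blank name falls back to 'Webmaster'."""
    return REMOVAL_REQUEST.substitute(
        name=name.strip() or "Webmaster",
        site=site,
        link_url=link_url,
        anchor_text=anchor_text,
    )


message = build_removal_request(
    "",
    "www.example.com",
    "www.example.com/offending-page.htm",
    "Exact Match Anchor Text",
)
print(message)
```

Read each message before sending it; a personal touch still gets better results than an obvious mail merge.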
Google have recently launched their Disavow Tool. This is great news for webmasters. However, before you start thinking you can use this tool on any link you want removed, I’m afraid I’ll have to stop you there. This tool should only be used on those links you simply cannot get removed. If the reviewer deems that one of the links you submitted could easily have been removed manually, then I doubt the reviewer would take any action. Google stress on their blog that the tool should only be used as a last resort, after all manual efforts have been exhausted.
It is still good news though. If you have gone through the whole process as outlined above, and have the evidence of your CSV file to go with it, then the added benefit of this tool to remove all remaining spam links practically guarantees a successful reconsideration. Before you storm off to use this tool I highly recommend you read Dr. Pete's article on SEOMoz. It's essential reading.
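For reference, the file you upload to the Disavow Tool is plain text: as I understand Google’s documented format, it takes one URL per line, or a whole domain written as domain:example.com, with comment lines starting with #. The sketch below builds such a file from your remaining links. The function name and sample entries are assumptions for illustration, so check Google’s current help page before uploading anything.

```python
def build_disavow_file(urls=(), domains=(), note=""):
    """Return disavow file text: an optional comment, then domains, then URLs."""
    lines = []
    if note:
        lines.append("# " + note)
    # Whole domains use the "domain:" prefix; duplicates are dropped.
    lines.extend("domain:" + d for d in sorted(set(domains)))
    # Individual pages are listed as full URLs.
    lines.extend(sorted(set(urls)))
    return "\n".join(lines) + "\n"


# Hypothetical entries for links that could not be removed manually.
disavow_text = build_disavow_file(
    urls=["http://spam.example/links.html"],
    domains=["blognetwork.example", "blognetwork.example"],
    note="Owner contacted on 2014-04-01, no reply",
)
print(disavow_text)
```

The comment line is a handy place to record your removal attempts, which ties the file back to the evidence in your spreadsheet.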
Finally, I think a Matt Cutts video on the tool will clear up any questions you may have regarding Google's new Disavow Tool.
I hope this page has proven useful to you and that you achieve that successful reconsideration request with Google. As always, I would love to hear your thoughts on the article. If you did find it useful, please share it below. Keep an eye out for my next article which will be on Google Authorship and managing Google Accounts.
SEO Product Manager
Tallaght, D24, Ireland. (086) 122-8514 firstname.lastname@example.org
Steven's home page: www.stevenforsyth.com