Getting the technical aspects of a site right can make the difference between ranking well and, in extreme cases, not being indexed at all.
In this post I will cover the top things to check when you carry out a Technical SEO Audit, to ensure that search engines can find your site and rank it appropriately. There are of course plenty of other aspects of website optimisation, and I will go into those in more detail in future posts. But today, it's all about Technical SEO!
1- Crawlability and Indexation

Ensuring your site can be crawled effectively by search engines is key to them being able to rank your site's pages. It is important to take all possible steps to make sure that the pages you want search engines to index can be found, and it's equally important to take care to direct search engines away from the pages you don't want to appear in results pages. Below are a few of the fundamental elements to have in place and to check for when performing a Technical SEO Audit.

Robots.txt File

A robots.txt file is one of the first places a search engine visits when looking at your site, so it's the perfect opportunity to tell it what you want indexed and help it along its crawling way. Within the file you simply place the sections of the site that you want to 'Allow' them to index and the ones you want to 'Disallow' from the index, whilst pointing the search engine in the direction of your sitemap or sitemaps. There are many more things you can put into a robots.txt file, such as rules to keep specific filetypes or parameters out of the index, and you can find more information here. Below is a good example of a very simple robots file and a good place to start.
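As a minimal sketch, a simple robots.txt might look like this (the disallowed paths and the sitemap URL are placeholders for illustration, not recommendations for any particular site):

```text
# Apply these rules to all crawlers
User-agent: *
# Keep private or low-value sections out of the index
Disallow: /admin/
Disallow: /search/
# Point crawlers at the XML sitemap
Sitemap: http://www.mysite.com/sitemap.xml
```

Anything not listed under a Disallow rule is crawlable by default, so a file this short is often all a small site needs.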
XML Sitemap

This is what you should be pointing to in your robots.txt file, and it should contain all the pages on your site that you want to be indexed. It should be in XML format, and you can also set crawl priorities for the pages. These show search engines which pages of your site you feel are the most important, or are more likely to change and need crawling again. I have written a great post on SEO Copywriting all about the best practices to employ with your XML sitemaps. It's worth reading, especially if you have a large site.

HTML (On-Page) Sitemap

An on-page sitemap is a good additional opportunity for search engines to find your important pages. You should look for this on the site; it is usually found in the footer of the pages. This helps not only with internal linking and the crawlability of the site, it also helps your users to navigate when they cannot immediately find the pages or categories they are looking for.

Response Codes

Any site that is regularly updated will have changes to its pages. Sometimes pages are moved, removed or even renamed, so it is important to ensure that your pages can still be reached, and that any link equity gained from external links is not lost. During a full site audit it is important to look at the header response codes returned to the crawler, as any 404s or 302s will need repairing. You should also look through Webmaster Tools to check for additional broken links that may have been missed, as these will need repairing too. If a search engine tries to visit your site through a broken external link, it will not be able to follow the link properly and a crawl opportunity will be missed. You can find more information here.
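As an illustrative sketch of the response-code check, the status codes from a crawl export can be filtered with a short script. The `crawl_results` data below is a made-up example, not output from any particular crawler:

```python
# Flag crawled URLs whose response codes need attention during an audit.
# 404s are broken pages; 302s are temporary redirects that should
# usually be reviewed and, where permanent, changed to 301s.

def flag_response_codes(crawl_results):
    """Return the (url, code) pairs whose status codes indicate a problem."""
    problem_codes = {404, 302}
    return [(url, code) for url, code in crawl_results if code in problem_codes]

# Example crawl export: (url, status code) pairs
crawl_results = [
    ("http://www.mysite.com/", 200),
    ("http://www.mysite.com/old-page/", 404),
    ("http://www.mysite.com/promo/", 302),
]

for url, code in flag_response_codes(crawl_results):
    print(f"{code}: {url}")
```

In practice the input would come from your crawling tool's export rather than a hand-written list, but the filtering step is the same.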
2- Domain Duplication and Canonicalisation
The duplication of content and pages can have detrimental effects on a site's rankings, so during a technical audit it is good to check that a site isn't duplicating its content at a technical level due to the site build or structure.

Canonical Duplicate Home Page

If inappropriately implemented when a site is created, the index page of a website may well be accessible from both of the following URLs:

http://www.mysite.com/
http://www.mysite.com/index.html
This can result in the home page being indexed twice, so in order to mitigate against duplicate content it is important to ensure that only one version of the home page is accessible or indexable. There are two possible ways to resolve this:

1- A 301 redirect from the index page to the root of the domain
2- A canonical link on the index page pointing to the root of the domain

Canonical Duplicate Domains

This is when there are two versions of your domain live and accessible to visitors and the search engines. This could result in a duplicate content penalty from the search engines and should be checked for during a technical audit. You should look to see if the site is accessible in both of the following ways:

http://www.mysite.com/
http://mysite.com/
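As a sketch of the 301 redirect option, on an Apache server with mod_rewrite enabled the rules might look like this (the domain names are placeholders, and other server types will need their own equivalent configuration):

```apacheconf
RewriteEngine On
# Redirect the index page to the root of the domain
RewriteRule ^index\.html$ / [R=301,L]
# Redirect the non-www domain to the www version
RewriteCond %{HTTP_HOST} ^mysite\.com$ [NC]
RewriteRule ^(.*)$ http://www.mysite.com/$1 [R=301,L]
```

A 301 is used rather than a 302 so that search engines treat the move as permanent and pass link equity to the surviving URL.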
In order to resolve this you can implement the same steps I mentioned above. There are other types of canonical links that need to be looked for on certain sites, such as:

– Canonical links for multiple language sites. You can find more information here.
– Canonical links pointing to the alternative page on a mobile site. You can find more information here.
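As a hedged illustration of those two cases, the annotations typically sit in the page head and look something like this (all URLs are placeholders):

```html
<!-- On the English page: declare a French language alternative -->
<link rel="alternate" hreflang="fr" href="http://www.mysite.com/fr/page/" />

<!-- On the desktop page: point to the separate mobile URL -->
<link rel="alternate" media="only screen and (max-width: 640px)"
      href="http://m.mysite.com/page/" />
<!-- On the mobile page: canonical back to the desktop version -->
<link rel="canonical" href="http://www.mysite.com/page/" />
```

The alternate/canonical pair works in both directions: the desktop page points out to the mobile version, and the mobile page points back, so search engines consolidate signals on the desktop URL.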
3- URL Structure
A well-formed URL structure is an additional way for a user and a search engine to understand what a site is about. When incorrectly implemented, it can also make it difficult for a site's pages to be indexed or, in some cases, cause pages to be duplicated.

URL Parameters

It is important to carefully examine the URLs that are returned when you crawl a site during a technical audit. These URLs may, on the whole, be fine and exactly what is intended for users and search engines to see. There may however be pages containing parameters which may also be indexed by search engines, and which can cause duplication issues. These parameters are often caused by searches, logins or filters on a site and can be identified by a '?' following the end of the normal URL. After the question mark there is usually an ID or set of IDs, depending on the change to the page. For example: http://www.mysite.com/page/?search=test. It is important to employ the canonicalisation I mentioned before to avoid this causing duplication, or you could get really technical and add a URL rewrite to remove the parameter, but this is dependent on your server.

Semantically Relevant URL Structure

In order for users and search engines to best understand the structure of your site, it is important that your URLs represent the most semantically relevant structure you can. This may mean that some URL rewrites are needed, but it's best to keep those to a minimum. A poor example of a URL structure would be http://www.mysite.com/c24/pagename/; a much better structure would be http://www.mysite.com/relevant-category/relevant-page/.
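As a sketch of the canonicalisation approach for parameters, the parameterised page simply declares the clean URL as its canonical version in the page head (URLs are illustrative):

```html
<!-- Served on http://www.mysite.com/page/?search=test -->
<link rel="canonical" href="http://www.mysite.com/page/" />
```

With this in place, any signals the parameterised variants pick up are consolidated onto the clean URL rather than splitting across duplicates.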
4- Site Speed
5- Coding Language
It is important that your site can be read properly by a search engine and that your pages render properly in different browsers. To that end, check that your site's code is up to the appropriate standard; you can do this using the W3C Validator. The tool will give you a list of any code inaccuracies and what needs to be improved.
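For reference, a minimal page that passes the W3C Validator looks something like this (the title and content are placeholders):

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Relevant Page Title</title>
</head>
<body>
  <p>Page content goes here.</p>
</body>
</html>
```

Real pages will obviously carry far more markup, but every element should nest and close correctly in the same way, and the validator will flag any that do not.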
6- Cache Dates