A Step-by-step Guide to a Technical SEO Audit for Publishers

Search engine optimization (SEO) can be daunting. It doesn’t always follow our intuition for site building, and Google rarely gives a firm answer on what works and what doesn’t (and when they do, SEOs still tend to question it). What’s a publisher to do with all that uncertainty?


At Distilled, our main business is SEO, and it can still be overwhelming sometimes. But here’s how we like to think of SEO, at its core: search engine optimization is usability for search engines. In order for search engines to be as enchanted with your site as your human visitors are, they have to be able to find, read, and understand your content.

To help you check that your content is findable and relevant, we have prepared a guide on how to see what search engines see, and how to fix any problems you may run into. Knowing that most of you aren’t SEOs, we’ve done our best to provide as many free tools to help you as possible.

As a publisher, you already know how to write valuable content, and probably have a fair idea of how to choose the right content for your visitors. But a website that is easy for people to read can be surprisingly confusing to search engines if you make a few technical mistakes. This guide walks through the typical issues that publishers run into.

Part I: It’s all about Indexation (Findability)

Just because your webpage exists doesn’t mean that search engines will find it, or find it valuable enough to list in their results pages. Most sites are going to have one or two pages that aren’t indexed: big sites often have a handful of pages that are too difficult for search engines to find, and small sites are often overlooked. Regardless of how strong or weak your site is, your first step is to check how many of your webpages are indexed.

Here’s how:

1. Get a list of all URLs on your site

If you’re using a good content management system (for example, WordPress or Drupal), you may be able to get a list of your URLs easily. If not, at Distilled we like to use a tool called Screaming Frog, which will scan your site and get a list of all of the URLs. Be aware, however, that Screaming Frog finds pages by following internal links from the homepage, so if you have a page that isn’t linked from the other pages on your site, Screaming Frog won’t find it.

For more detailed instructions on how to use Screaming Frog, please check their step-by-step guide. It is pretty intuitive to use and entirely free for the first 500 URLs.

You can also get a list of URLs by using one of any number of tools that build XML sitemaps and stripping out the XML content. XML-sitemaps.com is a good, free tool for this.
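If you would rather strip the XML yourself, the step can be sketched in a few lines of Python. This is a minimal sketch, assuming a standard sitemaps.org <urlset> file; the function name and the example sitemap are illustrative and not part of any tool mentioned above.

```python
import xml.etree.ElementTree as ET

# Namespace that the sitemaps.org protocol uses for <urlset>, <url>, and <loc>.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(xml_text):
    """Return every <loc> value in a sitemap as a plain list of URL strings."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Hypothetical two-page sitemap, used only to show the expected input shape.
EXAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.mysite.com/</loc></url>
  <url><loc>http://www.mysite.com/my-page</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url in urls_from_sitemap(EXAMPLE_SITEMAP):
        print(url)
```

To use it on a real file, read the sitemap from disk and pass its contents to urls_from_sitemap.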

2. Check to see what hasn’t been indexed

If you have a fairly small site, you can test to make sure that each page is indexed by simply typing in “site:http://www.mysite.com/my-page” in Google or Bing. Starting with “site:” means they will only return that page, so if it isn’t in their index, you’ll get a “not found” message.

Note: don’t put a space in between site: and the URL or this won’t work.
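If you have more than a handful of pages, it may be easier to generate the queries from the URL list you built in step 1 and paste them in one at a time. The sketch below only builds the query strings; it does not contact any search engine, and the helper name is made up for illustration.

```python
def site_query(url):
    """Build a "site:" query for one URL, with no space after the colon."""
    return "site:" + url.strip()


# Hypothetical URL list, standing in for the list gathered in step 1.
urls = [
    "http://www.mysite.com/",
    "http://www.mysite.com/my-page",
]

if __name__ == "__main__":
    for url in urls:
        print(site_query(url))
```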

If you have a bigger site, open a free account with both Google and Bing Webmaster Tools and submit your XML sitemap. Google will count the number of URLs that you provided in the sitemap and compare that to the number of URLs it has in its index with a bar graph. Bing will be more helpful and identify which links you submitted that it doesn’t have indexed. Don’t assume that Bing and Google will index in the same way, though.

3. Get pages indexed

If you find some pages that aren’t indexed, ask yourself a couple of questions:

A. Is the page new?

Search engines only crawl small sites every so often, so it could take a few days for new content to get noticed. If you want something noticed as soon as possible, make sure it’s in an XML sitemap that you’ve submitted to Google/Bing Webmaster Tools. “Noticed” doesn’t necessarily mean “indexed,” though, so make sure that it’s a high quality page, with at least 750 words of unique text, and multiple links to it.

To get an idea of how long it will take before your new page is indexed, Google and Bing Webmaster Tools both tell you how often they crawl your site:

For Bing Webmaster Tools, choose “Pages Crawled” in the line graph that is the homepage of Reports & Data.

If you don’t have Bing or Google Webmaster Tools set up for your site, you can see when Google last crawled your site by searching for “cache:http://www.mysite.com/.” Google will load its cached version of the homepage and the date it was last crawled at the top of the page. If you do this regularly you can get a sense of the maximum crawl frequency for your site. In the image below, you can see that when we accessed it on May 3, distilled.net was crawled on May 3.

B. Is the link to the page buried?

If you have a page that is only linked to in the footer of a few pages, or in the body of one of the relatively minor pages on the site, search engines may not notice it. Make sure that all of your pages (or, all important pages, if your site is large) have links in your top or side navigation, and cross link between similar pages as well.

If your top and/or side navigation is loaded with JavaScript, there’s a chance that search engines can’t read it. Load Google’s cache (“cache:http://www.mysite.com/mypage”) of the page, and see if Google was able to recreate your navigation. If it isn’t in the cache, there’s a good chance Google can’t read it.

Note: If your page is using a heavy dose of JavaScript, it’s possible that Google will load it improperly and wipe the entire page blank. Check the Text Version to be absolutely sure that Google cannot read a page.
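As a rough complement to checking the cache, you can look at the HTML your server sends before any JavaScript runs and see which navigation links are already in it. This is a minimal sketch using Python’s standard html.parser module; the class name, function names, and example markup are assumptions for illustration, and a link missing here only suggests (it does not prove) that a search engine will miss it.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags in the HTML as delivered (no JavaScript is run)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def static_links(html_text):
    """Return all link targets present in the raw HTML."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


def missing_nav_links(html_text, expected_paths):
    """Return the expected navigation paths that are not present as plain links."""
    found = set(static_links(html_text))
    return [path for path in expected_paths if path not in found]


# Hypothetical page: two links are plain HTML, one is only created by a script.
EXAMPLE_HTML = """
<nav><a href="/news">News</a><a href="/sports">Sports</a></nav>
<script>buildMenu(["/opinion"]);</script>
"""

if __name__ == "__main__":
    print(missing_nav_links(EXAMPLE_HTML, ["/news", "/sports", "/opinion"]))
```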

C. Does an older page 302 (temporarily) redirect to the missing page?

A 302 redirect tells search engines: “the URL you tried to access is temporarily unavailable, so load this page instead.” If you have an old page that was indexed pointing to a new page, search engines will index the old page instead of the new page. Change that 302 redirect to a 301 (permanent) redirect and the problem will go away.
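To see which kind of redirect an old URL returns, you can request it without following the redirect and read the status code. The sketch below uses Python’s standard http.client module, which does not follow redirects on its own; the function names are illustrative, and some servers answer HEAD requests differently from normal page loads, so treat the result as a hint.

```python
import http.client
from urllib.parse import urlsplit


def describe_status(status):
    """Turn an HTTP status code into a short note about the redirect type."""
    if status == 301:
        return "permanent redirect (301)"
    if status == 302:
        return "temporary redirect (302): consider changing it to a 301"
    if 300 <= status < 400:
        return "other redirect"
    return "no redirect"


def fetch_status(url, timeout=10):
    """Request one URL without following redirects; return (status, Location header)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()
```

For example, calling fetch_status on an old URL and passing the returned status to describe_status tells you whether that redirect is the temporary kind described above.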

For a full description of redirects, here’s SEOmoz’s infographic explaining it.

4. Get Content Indexed

Simply making sure that a URL is indexed often isn’t enough. There are a number of ways a site can load content that make it difficult for search engines to understand. Unfortunately, there isn’t an easy tool like Screaming Frog to help you with this for your entire site at once. Instead, look at Google’s or Bing’s cached version of your pages and see what’s there.

Here are some of the most common reasons for content to not get indexed:

A. It’s Not Text

Search engines aren’t good at understanding anything that isn’t text. If text is a part of an image, loaded in Flash, or in a video, you should assume that search engines cannot understand it. (Quick tip to see if something is HTML text or not: try to highlight it with your mouse. If it’s text, the text will lighten and the background will darken.)


  • If you want text on images to be read by search engines, load the picture without text and put the text in HTML below the image, then use CSS to float the text over the picture.
  • Build pages in HTML first, then layer Flash on top to make them prettier. Sites built primarily on Flash typically have a hard time ranking well.
  • Include a text transcript beneath your videos. It will be helpful to users with impairments and lets search engines know what you’re talking about in the video.

B. The Text is Loaded with JavaScript or is in iFrames

Text loaded in iFrames technically comes from somewhere off of that page, so search engines can read it, but choose to ignore it. JavaScript, on the other hand, is difficult for Google to render. To test whether search engines can read some text on your page, copy a long snippet of it into Google or Bing’s search box, and put the snippet in quotes. This asks search engines to return only pages that use that exact text, so your page should appear in the results if they were able to read it.

If the text can’t be read, it won’t return the page.

Note: This is also a great way to check if someone is scraping your site, if search engines have accidentally indexed the same page of your site twice, or if you have duplicate versions of the same page, with different URLs, competing with each other.
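If you run this check often, a small helper can turn a passage of page text into a quoted query for you. This is only a sketch: the helper name and the 12-word default are arbitrary choices for illustration, not a limit documented by any search engine.

```python
def exact_phrase_query(text, max_words=12):
    """Wrap the first few words of a passage in quotes for an exact-match search."""
    # Drop any double quotes in the passage so they do not end the phrase early.
    cleaned = text.replace('"', "")
    # split() with no arguments also collapses line breaks and repeated spaces.
    words = cleaned.split()[:max_words]
    return '"' + " ".join(words) + '"'


if __name__ == "__main__":
    passage = "Search engines aren't good at understanding anything that isn't text."
    print(exact_phrase_query(passage, max_words=6))
```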

Part II: Marking Up Content

There are several fields within a standard web page – primarily: title tags, headers, meta descriptions – that are used by the search engines to assess the content of the page. This post by Paddy Moogan explains page structure in more detail. By using these fields properly, you can highlight more important phrases to search engines, which helps them understand how to categorize your pages.

Write Your Snippet on Search Engine Results Pages

You’re able to customize what search engines show on their results pages with the <title> tag and the meta description (<meta name="description">). In the image below, the meta description is in the red box and the page title is the bolded, purple text.

To test how your page title and description will look on actual search engine results pages (SERPs), use the Snippet Optimizer.

Page Titles

Search engines will display approximately 70 characters on their results pages, but it’s really more about pixel width than character length. Try titles on the snippet optimizer to see what will fit well.

Page titles significantly impact the way search engines interpret the content of your page. If you build a page to target a specific keyword phrase that you would like to rank for, that should absolutely be in the page title. Otherwise, title your page with the most commonly used phrase. You can check the popularity of different phrases with Google Trends (which compares different phrases to each other, or based on the time of the year) or Google’s AdWords Keyword Tool.

We also recommend that you include your brand or site name at the end of each page title. If searchers already know of your site, they’ll be more likely to click on your link if they know it’s you.
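The two title recommendations above (stay near 70 characters, put the brand at the end) can be checked with a short script before you publish. This is a rough sketch: it counts characters rather than pixels, so it is only a first pass, and the separator and function names are assumptions rather than a required format.

```python
# Approximate figure from the text above; the real limit depends on pixel width.
TITLE_CHAR_GUIDE = 70


def build_title(page_phrase, site_name, separator=" | "):
    """Put the target phrase first and the brand or site name at the end."""
    return f"{page_phrase}{separator}{site_name}"


def title_fits(title, limit=TITLE_CHAR_GUIDE):
    """Return True when the title is within the approximate character guide."""
    return len(title) <= limit


if __name__ == "__main__":
    title = build_title("Technical SEO Audit for Publishers", "Distilled")
    print(title, len(title), title_fits(title))
```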

Meta Descriptions

Search engines will display up to 156 characters of your meta description. If the description is longer than that, it will end the description after the last full word that can fit, then add an ellipsis (“…”).
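The truncation rule described above can be approximated in code to preview a long description. This is a minimal sketch of that rule as stated here, not a reproduction of any search engine’s actual logic; it assumes the ellipsis is not counted toward the 156 characters, and the function name is made up.

```python
def preview_description(description, limit=156):
    """Approximate how a long meta description would be cut on a results page."""
    # Collapse line breaks and repeated spaces into single spaces.
    text = " ".join(description.split())
    if len(text) <= limit:
        return text
    cut = text[:limit]
    # If the cut lands in the middle of a word, back up to the last full word.
    if text[limit] != " " and " " in cut:
        cut = cut[: cut.rfind(" ")]
    return cut.rstrip() + "…"


if __name__ == "__main__":
    long_text = "word " * 40
    print(preview_description(long_text))
```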

Unlike page titles, search engines don’t read meta descriptions, so they don’t directly influence your rankings at all. However, search engines do pay attention to click-through rates. If your meta description is well written and convinces visitors to click through to your page, your page will rank higher.

You should also be aware that writing a meta description doesn’t necessarily mean that your meta description will display for all searches. If searchers use phrases that are on your page rather than in the meta description, Google or Bing may choose to display that snippet of text from your page instead.

Identify Text on the Page

On the page, search engines can read your text, but they don’t have the understanding of a human to be able to pull out the larger message without other clues. They will look at the size of text, the placement of it (higher on the page is seen as more important), and they will look at the number of times the same or similar phrases are repeated. But, you can help them along.

Header Tags <h1>, <h2>, …, <h#>

You’re probably using these already, because if you’re using a Word-like editor in your content management system, making some text a “heading” automatically applies this tag. But, if you decide to style the header yourself, you could be missing the tag. Check the HTML of your pages to see if a piece of text uses an <h#> tag before a header, or if it only uses a standard paragraph tag <p>.

Search engines see headers as an outline of your page, so the phrases used in header tags are interpreted as a better indication of the content on your page. Make sure that the title of your page really describes what you’re talking about, and put it in an <h1> tag.
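One way to check the HTML without reading it by eye is to list every header tag on the page and see whether the result reads like an outline. The sketch below uses Python’s standard html.parser module; the class name and the example markup are illustrative assumptions. Text that is only styled to look like a header (for example, a <p> tag with a CSS class) will not show up in the list.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingOutline(HTMLParser):
    """Collect (tag, text) pairs for every <h1> to <h6> on a page."""

    def __init__(self):
        super().__init__()
        self.outline = []
        self._current = None  # heading tag we are currently inside, if any
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            text = " ".join("".join(self._buffer).split())
            self.outline.append((tag, text))
            self._current = None


def heading_outline(html_text):
    """Return the page's headings in document order."""
    parser = HeadingOutline()
    parser.feed(html_text)
    parser.close()
    return parser.outline


# Hypothetical page: the middle line looks like a header but is only a paragraph.
EXAMPLE_PAGE = (
    "<h1>Technical SEO Audit</h1>"
    '<p class="heading">Styled like a header</p>'
    "<h2>Part I: Indexation</h2>"
)

if __name__ == "__main__":
    for tag, text in heading_outline(EXAMPLE_PAGE):
        print(tag, text)
```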

Alt Text and File Names <img src="descriptive-name.jpg" alt="alternative title">

The alternative text on images and their file names are often ignored, because human visitors understand the message of the image without any more words. Search engines, though, can’t see images, and need text to guide them. Make sure that you name your files something descriptive, and use alternative text to describe the photo.
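A quick way to find images that need attention is to scan a page’s HTML for <img> tags with no alternative text. This is a minimal sketch using Python’s standard html.parser module; the names and example markup are illustrative. Note the assumption that an empty alt attribute is also flagged, even though an empty value can be a deliberate choice for purely decorative images.

```python
from html.parser import HTMLParser


class ImageAudit(HTMLParser):
    """Record the src of every <img> whose alt text is missing or empty."""

    def __init__(self):
        super().__init__()
        self.missing_alt = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attributes = dict(attrs)
            alt = attributes.get("alt") or ""
            if not alt.strip():
                self.missing_alt.append(attributes.get("src") or "")


def images_missing_alt(html_text):
    """Return the src values of images that have no usable alt text."""
    parser = ImageAudit()
    parser.feed(html_text)
    parser.close()
    return parser.missing_alt


# Hypothetical markup: only the first image has a descriptive name and alt text.
EXAMPLE_IMAGES = (
    '<img src="descriptive-name.jpg" alt="alternative title">'
    '<img src="IMG_0042.jpg">'
    '<img src="logo.png" alt="">'
)

if __name__ == "__main__":
    print(images_missing_alt(EXAMPLE_IMAGES))
```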

Do Not Use Meta Keyword Tags

Google didn’t look at meta keywords in 2003, much less in 2013, and Bing never has. At best, meta keyword tags won’t do anything; at worst, your competitors will see which search terms you’re targeting, and search engines may see your site as spammy and lower your rankings. It’s best to leave them out entirely.

Technical SEO Doesn’t Have to be Hard

By paying close attention to indexation and on-page markup, you can communicate more effectively with search engines, improving your chances of ranking well on their results pages. Use this guide to clean up your site as it currently stands and to watch how you post content in the future. You’ll be surprised at how much little changes can help your organic traffic increase!



  • http://www.paulcarl.com/ Paul Carl Gallipeau

    Sorry to necro this thread but I just stumbled on this article and I wanted to say I agree 100% with you. I treat my meta description the same way I would treat AdWords ad copy, i try to sell the click!

  • Spook SEO

    Hello Kristina!

    Thank you so much for sharing this guide on how to check if our content is relevant. I have enjoyed reading your guide. I have learned on how to fix problems that we may run into. This is such a very helpful post. I will definitely share this post of yours to others.

  • Saven

    Kristina, Awesome info.. Thank you So Much!!

  • http://www.brickmarketing.com/ Nick Stamoulis

    “If your meta description is well written and convinces visitors to click through to your page, your page will rank higher.”

    Even though the data in a meta description isn’t factored in to the search engines when it comes to rank, it is incredibly important from a user experience standpoint. You optimize a website in hopes of driving traffic to your site. Your meta description is your “calling card” on the internet. What you post there needs to be enough to make your visitors click through. Even if you rank towards the top of the first page, you need something that is going to draw the searcher over to your site over the other links on the SERP.

  • Kristina Kledzik

    Thanks, hopefully something I shared will make your life easier or get you more traffic. :)

  • Kristina Kledzik

    Glad I could help! Meta keywords are tricky, because they seem like they’re so useful! But, they’re just a remnant of a forgotten time, back when search engines had such a hard time categorizing pages, they needed help from webmasters. :)

  • http://www.blueprintmarketing.com/ Thomas Zickell

    Hi Kristina,

    I thought that was a fantastic post and want to say that if you don’t have anything time to say Rick don’t say anything at all. Screaming frog spider SEO is a tool you probably have never heard of if you really thought that was not a Useful article you most likely have no idea what you’re doing and I don’t mean to offend you by saying that however. I answer questions on seomoz from people.think they know what they’re talking about. Kristina will be traveling from London to Boston to speak to people like myself at search love Boston I think you should reread the article and internalize it. If all you wanted were a bunch of tools sure she could’ve filled 20 pages with tools that you most likely would never be able to utilize anyway. She works were when I consider the best agency in the country when somebody of this caliber takes the time to write you should listen. The fact that your response does not add anything to the conversation except for showing me you need more tools to do search engine optimization because that’s how you think it’s done. I assure you there is no tool that will search engine optimize your website. Take a look at the company she works at and tell me that this is useless. It is extremely helpful to many. Once you get into complex issues with robot text and media data causing issues with 301 redirects turning into 302’s turning into a 404 sometimes even 500 keeping your content from being indexed correctly you will understand this is something you must know cold. Screaming frog is equal to about 40 of your run-of-the-mill tools so Next time do not throw stones in a glass house by criticizing somebody that took the time to do a fantastic job.

    Kristina I will hopefully good to meet you at search love Boston I am looking forward to another fantastic distilled conference very much.

    Thomas Zickell

  • http://www.richamorindonesia.com/ Rich Amor Indonesia

    This is a good advice to leave meta keyword tags, and leave them out entirely. Thank you


  • Rick – www.travelhaggler.com

    completely useless article. Very Basic. No useful tool listed either.

  • http://www.mabzicle.com/ MabZ ZiCLe

    pretty seems useful to me..thanks 😉