We all know Google loves content. But “more” content is not always a good thing. A website stuffed with irrelevant or out-of-date content is confusing and frustrating for users. It can also have devastating consequences for your SEO. See how we helped a college website turn things around.
Inefficient, diluted Google crawls
The time search engines spend on your site is known as the “crawl budget.” When the Google crawlers explore your website, they have limited time before they move on to other parts of the web. We don’t want Google to waste time crawling pages that are unimportant, out of date or duplicate. We want that crawl budget to be spent efficiently so the most relevant pages of your site receive the highest search engine ranking possible.
But poor Google is just a machine. It can’t differentiate which content is important for your audience to see unless you help it out. When you remove unneeded pages or tell Google not to crawl certain pages, it has more resources available to crawl, index and rank fresh, relevant content.
Case study: exploring the dark underbelly of a website
We recently helped a New England college improve its search engine rankings with a long-overdue content cleanse. In this case, the school had only a few thousand web pages that users could navigate via the front end of the site. But our assessment found there were actually 10,000+ public URLs that Google was struggling (and failing!) to cover within its crawl budget. Several factors were to blame:
- Past events: Event registration pages had been removed from navigation, but were still live and ranking in searches. Imagine the confusion of a prospective student who searches for an open house and sees several listings with dates in 2014, 2012, etc.
- Test content: Content managers were testing out new designs or copy for pages and not removing the test pages once they settled on a different option.
- Private content: Several professors had websites set up for their classes on the college’s main domain. Though those sites were meant to be internal only, they were never blocked from search. Since the pages were text-rich, they were magnets for search engines and drove unwanted traffic to the website.
- HTTP vs. HTTPS: The school was in the midst of switching from HTTP to HTTPS, but the content management system was creating duplicate URLs for every page.
- Old news: Press releases and news articles dated back 5+ years and weren’t adding value.
- CMS category pages: URLs created by the content management system were being crawled, but weren’t necessary for users. These included pages with the URL structure “taxonomy,” “website-index,” “event-category,” etc.
- Crawl errors: Issues included broken links, permission errors and PHP timeouts.
As a result, Google was able to crawl only ~2,000 of the college’s 10,000+ pages, and they weren’t the pages the college wanted to promote.
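To give a feel for how an audit like this might sort a crawl export into buckets, here is a minimal Python sketch. It is not the tooling used on the project. The segment names come from the list above, but the example.edu URLs, the bucket labels and the idea of matching on whole path segments are all assumptions made for illustration.

```python
from collections import Counter
from urllib.parse import urlsplit

# Path segments the case study names as CMS-generated. Matching on whole
# path segments is an assumption about how those URLs were built.
CMS_SEGMENTS = {"taxonomy", "website-index", "event-category"}


def classify(url: str) -> str:
    """Return a rough audit bucket for one crawled URL."""
    parts = urlsplit(url)
    segments = {s for s in parts.path.lower().split("/") if s}
    if segments & CMS_SEGMENTS:
        return "cms-category"
    if parts.scheme == "http":
        # Likely the insecure twin of an https page; needs confirming.
        return "http-duplicate-candidate"
    return "keep-or-review"


def audit(urls):
    """Count how many URLs fall into each bucket."""
    return Counter(classify(u) for u in urls)


# Hypothetical URLs, invented for illustration only.
sample = [
    "https://www.example.edu/academics/biotechnology-major",
    "http://www.example.edu/academics/biotechnology-major",
    "https://www.example.edu/taxonomy/term/42",
    "https://www.example.edu/event-category/open-house",
]
print(audit(sample))
```

A count like this only points to where the bloat is. Past events, test pages and old news still need a person to decide what stays.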
Plan of attack
We were able to block some of the excess content from being crawled by changing server settings and updating the website’s robots.txt file.
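As a rough illustration of the robots.txt side of this, the sketch below uses Python’s standard urllib.robotparser to check a draft set of rules before they go live. The Disallow paths are assumptions based on the CMS URL structures mentioned earlier, not the college’s actual file, and the example.edu URLs are made up.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical draft rules. The path prefixes are assumed from the
# "taxonomy", "website-index" and "event-category" URL structures.
ROBOTS_TXT = """\
User-agent: *
Disallow: /taxonomy/
Disallow: /website-index/
Disallow: /event-category/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(url: str, agent: str = "Googlebot") -> bool:
    """Return True if the draft rules allow `agent` to fetch `url`."""
    return parser.can_fetch(agent, url)


# Made-up URLs: one page to promote, one CMS-generated page to block.
for url in (
    "https://www.example.edu/academics/biotechnology-major",
    "https://www.example.edu/taxonomy/term/42",
):
    print(url, is_crawlable(url))
```

Keep in mind that robots.txt asks crawlers not to fetch a page. It does not take an already indexed page out of search results, which is one reason the plan also relied on server settings and redirects.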
We cleaned up the crawl errors by permanently redirecting or removing content. Some of the older pages had to be checked manually to ensure any essential content was included in other parts of the website before the pages were permanently removed.
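The manual check described above boils down to one decision per retired page: does its essential content live somewhere else now? A small sketch of that decision follows, assuming a hand-built mapping of old paths to replacement paths. The function name, the paths and the choice of a 410 status for removed pages are illustrative assumptions; the 301 status is simply the standard code for a permanent redirect.

```python
def plan_retirement(old_paths, replacements):
    """Map each retired path to a (status, target) pair.

    Pages with a replacement get a 301 permanent redirect. Pages with
    no replacement are marked 410 (Gone), an assumed choice here; a
    plain 404 would be another option.
    """
    plan = {}
    for path in old_paths:
        target = replacements.get(path)
        if target:
            plan[path] = (301, target)
        else:
            plan[path] = (410, None)
    return plan


# Hypothetical paths, invented for illustration only.
old_paths = ["/events/open-house-2012", "/test/homepage-option-b"]
replacements = {"/events/open-house-2012": "/admissions/visit"}

for path, (status, target) in plan_retirement(old_paths, replacements).items():
    print(path, status, target)
```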
Slimmer site = ranking success
When we started working with the college, the site was not ranking in the top 50 search results for any relevant keywords. Within three months of making the updates, seven key terms were ranking on page three or better, with terms such as “biotechnology major” and “education major” reaching page one.
We now have a strong foundation on which to build. Our ongoing work with the school, including blog development and content optimization, is designed to improve rankings even more.
What’s in your closet?
It’s easy for a website to grow out of control. Particularly if your website has been around for many years, is managed by multiple people and has lots of content, chances are high that an inefficient Google crawl is hurting your SEO.
As part of a comprehensive content audit, Centerboard can help you better understand what content to prioritize. You’ll make life easier for users and boost your search engine ranking.