Google Panda & Killing Old Content — A Deeper Look

Google Panda Recovery Timeline

Since I do a lot of site audit work, I often have clients with extensive volumes of old content, typically blog posts with some “news” articles in the mix.  With Google’s Panda algorithm focused on on-site quality considerations, many clients assume they have to kill all the old content, while many others who don’t stay up on SEO don’t know whether it’s hurting them or not, and it’s up to me to inform them what to do with that content.

Earlier today, Barry Schwartz posted an article over on Search Engine Roundtable entitled “Google: Panda Victims Don’t Necessarily Need To Delete Old Blog Posts”.

In that article, Barry covered a discussion where Marie Haynes (one of the industry’s leading link cleanup experts) asked Google’s John Mueller whether thousands of old, rarely read blog posts might harm a site specifically because of Panda.


John responded by essentially saying that it’s not an absolute one way or the other.  That’s an answer I give to clients way too often in response to questions across the entire spectrum of SEO. It’s just the nature of how complex the web is, and how complex search algorithms and multidimensional quality considerations have become.

What it comes down to is what I communicated in my most recent presentation on the Philosophy of SEO Audits at Pubcon in Las Vegas:

SEO is Google’s algorithmic attempt to emulate user experience

Summing up what John initially said about reasons not to kill off the old content:

There could be valid reasons to keep old content archives, and Google does their best to recognize where a site might have a lot of old content — but as long as the main focus of the site was high quality, they take that into consideration when they can as well.

He did, however, clearly communicate at least one scenario where you can’t ignore that content:

But sometimes it might also be that these old blog posts are really low quality and kind of bad, then that’s something you would take action on.


To many people this still leaves the question wide open for interpretation. Since John didn’t say “always do this” or “Don’t take a risk — when in doubt delete it”, that leaves people guessing way too often. The “yeah but…” response kicks in. Or worse, they just clump that type of response in with all the other reasons out there why Google shouldn’t be trusted.


Personally, I don’t see it as an insurmountable issue to figure out.

Yes, there are always at least some exceptions to the concept of a standard “best practices” approach here.  Yet most of the time, even with variables in place, it’s a straightforward decision-making process.


Look at the signals that content is sending, and its scale compared to quality content.  Whether it’s old or new is less relevant than the quality signals, especially at scale.

Individual page quality scores also need to be considered in relation to the totality of scores within an individual section of the site.  If the overwhelming majority of content in a given section is strong, that can outweigh the negative signals from those pages within that section that are low quality.

The same applies to the entire site.  If enough quality exists on scale across the entire site, that low quality portion is less harmful overall.
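One rough way to picture that weighing, as a minimal sketch only: rate each page during an audit, then look at what share of each section, and of the whole site, falls below your standard. The page list, the 0.0 to 1.0 scores, and the 0.5 cutoff below are all hypothetical audit inputs of my own invention. Google publishes no such scores, and this is not how their systems calculate anything.

```python
from collections import defaultdict

# Hypothetical audit data: (section, url, quality score from 0.0 to 1.0).
# The scores are an auditor's own ratings, not anything Google publishes.
PAGES = [
    ("blog", "/blog/a", 0.9),
    ("blog", "/blog/b", 0.8),
    ("blog", "/blog/c", 0.2),
    ("news", "/news/x", 0.3),
    ("news", "/news/y", 0.1),
]

LOW_QUALITY_BELOW = 0.5  # assumed cutoff for "low quality"


def low_quality_share(pages, cutoff=LOW_QUALITY_BELOW):
    """Return {section: fraction of pages under cutoff}, plus a "(site)" total."""
    totals = defaultdict(int)
    low = defaultdict(int)
    for section, _url, score in pages:
        # Count each page once for its section and once for the whole site.
        for key in (section, "(site)"):
            totals[key] += 1
            if score < cutoff:
                low[key] += 1
    return {key: low[key] / totals[key] for key in totals}


if __name__ == "__main__":
    for key, share in low_quality_share(PAGES).items():
        print(f"{key}: {share:.0%} of pages below the cutoff")
```

In this made-up data, the blog section is mostly strong, so its one weak page matters less, while the news section is weak across the board and is the obvious place to start cutting.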


More often, it’s the borderline cases that are the tough ones to decide about: mediocre content that may or may not be a problem overall, but where it’s “likely” to be weighing down the site.


Google has done a lot over the past few years to try and communicate what makes something low quality.  Though to be honest, they have done so in very generic “what would a user think” terms.

From my audit work, though, several patterns have emerged that fit that notion.

  • Page Speed
  • Topical Confusion / duplication
  • Topical association with your main message / goals
  • Usefulness to Users (intuitiveness of access, helpfulness)
  • Believability

Those are all part of my five super signals:

  • Quality
  • Uniqueness
  • Authority
  • Relevance
  • Trust

From a best practices perspective the answer is simple — set a single high standard for quality, uniqueness, and relevance. If you do so, authority and trust will be an outgrowth of that effort.

Anything below that standard gets killed off. That way you don’t have to guess about whether it may or may not be hurting your site.

One other reason to slash and burn it even if it’s borderline applies if you have a big site: that old or low quality content doesn’t get crawled as frequently as newer or higher quality content, yet it still gets crawled at some point.  And that works against you from a crawl efficiency perspective.

If at any point Google is crawling that mass volume of “borderline” content, their system may very well abandon the crawl. It’s well known that their systems abandon crawls all the time, and the bigger the site being crawled, the more likely that is. So why force them to crawl questionable content and, as a result, have other content that might be newer skipped? That’s crazy.

404 or 410 — WHICH IS BETTER?

One last recommendation — I’ve found that if you set those no-longer existing pages to a 410 “gone” server status, that sometimes helps speed up the pace at which Google removes those pages from their index.  It’s an unequivocal signal — 404s can sometimes be caused by unintentional mistakes, whereas 410 is a clear signal.
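As a minimal sketch of that distinction, assuming a site served by Python’s built-in `http.server` (most real sites would set this in their web server or CMS configuration instead), the idea is to keep an explicit list of deliberately retired URLs and answer those with 410, leaving 404 for everything else that isn’t found. The paths and the `status_for` and `serve` names are hypothetical, made up for illustration.

```python
from http import HTTPStatus
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical list of URLs that were removed on purpose.
RETIRED_PATHS = {"/blog/old-post-1", "/blog/old-post-2"}
# Hypothetical list of URLs that still exist.
LIVE_PATHS = {"/"}


def status_for(path, retired=RETIRED_PATHS, live=LIVE_PATHS):
    """Pick the HTTP status code for a request path."""
    if path in retired:
        return HTTPStatus.GONE       # 410: removed intentionally
    if path in live:
        return HTTPStatus.OK         # 200: page exists
    return HTTPStatus.NOT_FOUND      # 404: unknown, possibly a mistake


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Note: self.path includes any query string; this sketch ignores that.
        status = status_for(self.path)
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.end_headers()
        self.wfile.write(status.phrase.encode("utf-8"))


def serve(host="127.0.0.1", port=8000):
    """Start the demo server; blocks until interrupted."""
    HTTPServer((host, port), Handler).serve_forever()
```

Calling `serve()` and requesting `/blog/old-post-1` would return a 410, while a mistyped URL would still return a 404, which keeps the two cases easy to tell apart in error reports.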

Published by

Alan Bleiweiss

Alan Bleiweiss is a professional SEO consultant specializing in forensic audits, related consulting, client and agency training, and speaking to audiences of all sizes on all things SEO.

10 thoughts on “Google Panda & Killing Old Content — A Deeper Look”

  1. Nice post Alan.

    We have lots of posts that we want available on the site, such as the “turkey drive” we did in 2012, because they show our culture and that we give back to our community. However, since nobody is going to search specifically for the content on that page now, I see no reason to clog Google’s index with it. These types of pages we typically noindex,follow. There are always exceptions though. I think there are more exceptions than rules in search.

    1. Everett,

      You present a great additional reason to consider culling content, one that fits in with the QUART evaluation process but where you look at the real current value, and then consider crawl efficiency.

  2. Great advice, Alan. It definitely takes more than low/zero traffic to determine whether content should be deleted. I’ve had good success with this with clients, and have executed full-scale consolidation projects as well where clients had 10+ blog posts about the exact same topic, and we’re starting to see impact. Personally, I’ve done this for my guitar blog, and it’s had a lot of impact lately, as can be seen in the SEMRush organic traffic graph. I essentially deleted half the site (mostly duplicate content from pre-Panda years) and rewrote 25% of the site pages to ensure the content was unique and quality. Still have a little more work to do, but definitely surpassed Google’s “threshold” of site quality and organic traffic is growing in a major way. Back on page 1 of Google for “guitar blog” as well. This stuff works.

    1. Dan,

      Yeah, it definitely works when done properly.

      It seems like there’s always more work to do to improve things. At least when it comes to cleanup efforts, and improving signals. That’s a primary reason why an audit (whether you perform one yourself or have your in-house SEO or an agency do one) can help focus resources in a prioritized manner. At a certain point, some of that work is just not worth the effort due to diminishing returns based on the effort required, and that’s yet another consideration that needs to be factored in.

  3. You are right… 404’s can take a while. I always seem to use the 404 code but after reading this I will use 410 in the future.

    1. Stuart,

      Sometimes I let the 404 choice slip through, but rarely, and only when I’ve already buried the client with six months of new prioritized tasking that’s much more important. Even then I tend to at least drop in the line recommending they use 410s since they’ve got to make a change anyhow. And especially on scale — when there are a lot of pages to deindex, by setting them to 410, the next time I go into GWT and look at the error reports, if I see all those pretty 410s, it helps me filter them out from the unintentional 404s I find in the report…

  4. Google wants to clean old content that is low quality from your website. Just because an article on your blog is old does not mean you should remove it, even if it has not been viewed much. If the article is useful to the search partner then you should keep it on your website.

    I think the main thing to understand here is to change the content once a year at least to make sure that date ranges are not out of date. That your content is up to date with what your company is providing right now. Just like offline methods, if a brochure is out of date, you won’t hand it out to the client. Google wants to make sure that your brochure is up to date on the website also.

    1. Vitaliy,

      I agree that main content — services, about the company, and product offerings do need to be up to date so yes that’s a very important task.

      The primary issue with what Marie Haynes was originally talking about and asked John Mueller about relates not to that specific content but to blog posts — those are rarely considered “evergreen” content. (Evergreen content is a business’ life-blood — information that either never gets old or does need to be updated to keep it current if some aspect of the messaging in it changes) With blog content, it’s typically “timely” or “time sensitive” or only has secondary value overall to the business’ messaging.
