25 Apr 2011

Google Panda Update – Thoughts and Solutions from Jim Boykin

Google Panda Update – Background and Possible Solutions.

I’ve been in SEO for over 12 years and I’ve seen several major Google updates in that time…and this year there’s the Panda Update, which has sent 14% of search results to a new Google Hell, a Hell called Panda.

To understand Panda, you need to know some of the filters that Google has already put in place.

The Supplemental Results “filter” of the past.

One of the biggest “filters” that Google has is what used to be known as the Supplemental Results. If you are not familiar with the Supplemental Results (2005-2007), I strongly advise that you read about that history to know what Google has already done to take out 95% of the crap content.

First, read this post that I wrote in November 2005 where I explained what the Supplemental results were, and what they mean to your traffic.

It is worth noting what I experienced, and wrote about, in April of 2007 regarding the Supplemental Results:

Speaking of Supplemental Hell, here’s something else that I’ve been experiencing…if you publish 300 pages today and 3 months go by and no one links to any of these 300 pages…guess where they might be going? (Supplemental Hell?) – (Moral, don’t publish a bunch of pages at once (especially in a new folder) unless you know they’ll get some backlinks and trust to those pages within a few months)…I’ve seen many sites have entire folders go to supplemental hell after people published hundreds of pages in that folder in 1 day and then nothing there got a link for the first 3 months of existence….and seen other new folders/pages survive (so far) that got a small handful of nice backlinks to even a few pages in new folders…..kinda says something about checking for “quality rating” of new pages…no links to any pages in a folder and supplemental hell for them all. (Moral, publish a new folder with 300 new pages, get some of those pages some trusted backlinks, and hurry!)

Also, in April of 2007, Forbes wrote an interesting article about the Supplemental results that sounds eerily like a lot of the noises that I’m hearing again… deja vu….really worth reading again.

In July of 2007, I gave advice on how to stay out of this Google Hell…this advice is still great and worth reading today.

In August of 2007, Google removed the Supplemental results label on pages in the search results, and started indexing those pages more frequently… making it hard to detect which pages are in the supplemental results…so I wrote about a method that’s still good at finding pages that are definitely in the supplemental results…but FYI, keep in mind that today, there’s this + Panda.

After the days of the Supplemental Results

I believe that after they removed the ability to clearly see which pages are in the supplemental results, they went on a binge of putting a much larger percentage of pages into this “Supplemental index”.

So something to understand today with Panda is that Google was already pretty good at tossing the majority of everyone’s pages into the supplemental results. At least the deep pages, and the pages with little content, and the pages of duplicate content…I had a call with a client today who had 397,000 pages indexed in Google…I told him that Google probably already had all but perhaps a few thousand of those pages in the supplemental results…and now, after Panda II, he has about 20 quality pages…the rest need to be redone and updated for 2011 and beyond…or should I say, for life after being Pandasized.

Google’s been tossing duplicate pages, and poorly linked pages, and pages with little content into the supplemental results for years… this hasn’t been an issue with Google….the issue comes when you have original content, on a page that is not deep enough to be at the “supplemental level”…

So, first thing, remember that things like this have happened in the past….understand the supplemental result history…and know that this was 4 years ago…and they’ve gone way beyond this now..

Caffeine Update and Beyond:
Since then they’ve added some other signals….what used to be the “supplemental index” has probably been rolled into a later update that they did called Caffeine in 2009…but telling what is “good” and what is “bad” when it comes to original content with “power” has been a weak point…just because a page is original, and powerful, doesn’t mean it’s a quality page.

If I were Google…what would I look at?

So how would I, if I were Google, tell if a page were “good” or “bad”?

Time on Page?
….Is time on the page important?…maybe a little…if I see 1000 words, and the average time on the page is 15 seconds, I wouldn’t give all that content much weight in the content part of their equation… but it can still certainly “solve the question” someone was searching for…in fact, it can solve the question in just a few seconds and still be “good” content..

Pages Visited?
Does the number of pages visited on your site make a quality page?…is it better that someone engages with your site?…but can’t they get the answer w/o engaging with your site?…does Google care if they engage with your site…or does it care if the searcher quickly finds what they were seeking?

Brand Loyalty?
Is “brand loyalty” important?… well…does someone need to come back to a site for it to be a good search result?…maybe…but then again, Google probably doesn’t care too much if people go back to your site again and again.

Click Through Rate?
What if it gets high Click Through Rates (CTRs) in the Search Engine Results Pages (SERPs)?…yes, that can be an indicator that it is what people are looking for…and Google, I’m sure, is giving sites with high CTRs in the SERPs a ranking boost… and those with low CTRs, I’m sure, are getting a “negative” in that part of the algorithm. I believe that CTR in the SERPs has been a signal for years…but that still doesn’t tell you if the searcher found what they were looking for when they did go there.

But…consider this…
Consider, also, what if you were ranked #1, and your CTR was 50% (average CTR for #1 organic rankings is between 35% and 45% of all clicks)…so that would be a great Click Through Rate…but what if 98% of those people returned to the same Google search, and then clicked on the guy at #2…and then didn’t return to Google for another hour, when they performed some other search for something else…I’d say that they must have found what they were looking for at #2…and they did not find what they were looking for at your site….if 98% of the people that went to your site just backtracked to that same search and stopped at #2, then I’d say that your content might not be great for that phrase…even if it’s original content…and even if it’s at a level of not being supplemental’d.
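To put rough numbers on that scenario, here is a minimal sketch in Python. The function name and the 1,000-search figure are made up for illustration, and this is certainly not Google’s actual formula…it only shows how a great-looking 50% CTR can shrink to almost nothing once you subtract the people who went straight back to the same search.

```python
def satisfied_click_rate(ctr, return_rate):
    """Hypothetical metric: share of all searches where the searcher
    clicked the result and did NOT go back to the same search."""
    return ctr * (1.0 - return_rate)


searches = 1000      # assumed number of searches for the phrase
ctr = 0.50           # 50% of searchers click the #1 result
return_rate = 0.98   # 98% of those clickers return to the same search

clicks = searches * ctr
stayed = searches * satisfied_click_rate(ctr, return_rate)

print(f"clicked your result: {clicks:.0f}")
print(f"apparently satisfied: {stayed:.0f}")
```

Under these assumed numbers, 500 people click your listing but only about 10 of them appear to have found what they wanted.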

Why does dup content rank higher now?
So why, if you grab a phrase post-Panda, do the scraper pages rank higher?…I’d say it’s because you tripped a negative content filter…and a negative content score must be worse than a supplemental results page…thus the supplemental results show up first, above you….making it look like they’re ranking those pages higher…but even those scraper pages won’t get much traffic being in the supplemental index.

So how do you measure this?...where in Webmaster Tools, or Google Analytics, can you find who went back to Google?…answer…you can’t…does Google know…yes…do we…no, and that is the main reason why people are looking over their analytics and going insane seeking the “perfect” rules for what you need…that’s hard to do with the information that is available to you, even with Analytics and Webmaster Tools….there must be large signals that are missing to fill the gaps where analytics and usability fail in the analysis of Panda…

I believe that the main data that’s missing is who quickly went back to Google.
I know of no way to tell who went from your site back to Google, and what they did once they went back to Google….and, if they went back to Google, #1, did the searcher return to the same Google SERP and click on someone else….or, #2, did they return to Google and run another search?
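From the search engine’s side, though, telling those two cases apart would just be bookkeeping. Here’s a rough sketch of the idea in Python…everything in it is an assumption for illustration: the log format, the site names, and the 60-second cutoff for a “quick” return are all hypothetical, and nothing here is known to be how Google does it.

```python
from collections import Counter

# Hypothetical search-side log rows: (seconds, searcher, query, clicked_site).
# Seconds are relative to each searcher's first row; clicked_site is None
# when the row is a search with no click recorded.
LOG = [
    (0,  "a", "blue widgets", "yoursite.com"),
    (12, "a", "blue widgets", "competitor.com"),  # same SERP, clicked #2
    (0,  "b", "blue widgets", "yoursite.com"),
    (20, "b", "red widgets",  None),              # back to Google, new search
    (0,  "c", "blue widgets", "yoursite.com"),    # never came back
]

QUICK = 60  # assumed cutoff, in seconds, for a "quick" return


def classify(log, site):
    """Count what each searcher did right after clicking `site`."""
    by_user = {}
    for row in sorted(log, key=lambda r: (r[1], r[0])):
        by_user.setdefault(row[1], []).append(row)

    outcomes = Counter()
    for rows in by_user.values():
        for i, (t, _, query, clicked) in enumerate(rows):
            if clicked != site:
                continue
            nxt = rows[i + 1] if i + 1 < len(rows) else None
            if nxt is None or nxt[0] - t > QUICK:
                outcomes["stayed"] += 1
            elif nxt[2] == query and nxt[3] not in (None, site):
                outcomes["same_serp_other_result"] += 1  # case #1
            else:
                # anything else counts as case #2 in this sketch
                outcomes["new_search"] += 1
    return outcomes


result = classify(LOG, "yoursite.com")
print(result)
```

With the three made-up searchers above, you get one of each outcome: one stayed, one went back and clicked someone else, and one went back and searched for something different.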

We can’t measure this…Google doesn’t give us this information….so how do you tell if the content is good?

What Google has said since Panda
On February 25th, the day after Panda started, Amit Singhal, a Google engineer, gave an interview with the Wall Street Journal. Here’s the important part:

Singhal did say that the company added numerous “signals” or factors it would incorporate into its algorithm for ranking sites. Among those signals are “how users interact with” a site. Google has said previously that, among other things, it often measures whether users click the “back” button quickly after visiting a search result, which might indicate a lack of satisfaction with the site.

In addition, Google got feedback from the hundreds of people outside the company that it hires to regularly evaluate changes. These “human raters” are asked to look at results for certain search queries and questions such as, “Would you give your credit card number to this site?” and “Would you take medical advice for your children from those sites,” Singhal said.

…Singhal said he couldn’t discuss such details and how the search algorithm determines quality of site, because spammers could use the answers to game the system.

Data from hundreds, or billions?
Now looking at this…would I trust the data I get from “hundreds of people”…or from looking at 34,000 searches per second?….yeah, I’d use the data from the searches a lot more than I’d put weight in what “hundreds” of people say in answer to questions about credit cards and medical advice….

But anyways… a week later, Matt Cutts joins Amit Singhal in an interview with Wired Magazine:

Wired.com: How do you recognize a shallow-content site? Do you have to wind up defining low quality content?

Singhal: That’s a very, very hard problem that we haven’t solved, and it’s an ongoing evolution how to solve that problem. We wanted to keep it strictly scientific, so we used our standard evaluation system that we’ve developed, where we basically sent out documents to outside testers. Then we asked the raters questions like: “Would you be comfortable giving this site your credit card? Would you be comfortable giving medicine prescribed by this site to your kids?”

Cutts: There was an engineer who came up with a rigorous set of questions, everything from. “Do you consider this site to be authoritative? Would it be okay if this was in a magazine? Does this site have excessive ads?” Questions along those lines.

Singhal: And based on that, we basically formed some definition of what could be considered low quality.

Singhal: You can imagine in a hyperspace a bunch of points, some points are red, some points are green, and in others there’s some mixture. Your job is to find a plane which says that most things on this side of the plane are red, and most of the things on that side of the plane are the opposite of red.

Cutts:…But for example, our most recent algorithm does contain signals that can be gamed. If that one were 100 percent transparent, the bad guys would know how to optimize their way back into the rankings.

Keep in mind how these answers differ from the original interview with just Amit Singhal….in the original interview with just Singhal, it was said, “Google has said previously that, among other things, it often measures whether users click the “back” button quickly after visiting a search result, which might indicate a lack of satisfaction with the site.”

But now, a week later, with Matt Cutts there, the “click back to Google” isn’t mentioned…instead it’s just the “hundreds of people outside the company that it hires to regularly evaluate changes” that are talked about…so things like “clicking back” are not mentioned in the interview a week later when Matt’s there as well…but Matt Cutts does say “…But for example, our most recent algorithm does contain signals that can be gamed. If that one were 100 percent transparent, the bad guys would know how to optimize their way back into the rankings.”

It’s worth noting this link where over 2000 people are crying to Google saying they were collateral damage…and probably 14% are really collateral damage….you can add your site to that list for the Google engineers to check out: Google Webmaster Central Help Forum. In there, Wysz, a Google employee, states:

Our recent update is designed to reduce rankings for low-quality sites, so the key thing for webmasters to do is make sure their sites are the highest quality possible. We looked at a variety of signals to detect low quality sites. Bear in mind that people searching on Google typically don’t want to see shallow or poorly written content, content that’s copied from other websites, or information that are just not that useful. In addition, it’s important for webmasters to know that low quality content on part of a site can impact a site’s ranking as a whole. For this reason, if you believe you’ve been impacted by this change you should evaluate all the content on your site and do your best to improve the overall quality of the pages on your domain. Removing low quality pages or moving them to a different domain could help your rankings for the higher quality content.

We’ve been reading this thread within the Googleplex and appreciate both the concrete feedback as well as the more general suggestions. This is an algorithmic change and it doesn’t have any manual exceptions applied to it, but this feedback will be useful as we work on future iterations of the algorithm.

So…what are the possible solutions?:

You can add your site to the over 2000 sites above in the Google forum…You can analyze your webmaster tools data and your Google analytics data and get a lot of “partly smoking gun” information (which is helpful, and we do this as well)…

And it certainly can’t hurt to look at your pages and ask these questions:

  1. Would you be comfortable giving the site your credit card?
  2. Does the site have excessive ads?
  3. Do you consider this site to be authoritative?
  4. Would it be ok if this were in a magazine?
  5. Would you be comfortable giving medicine prescribed by this site to your kids?

Perhaps the biggest question of all I really think should be “How do I get people to not quickly go back to the same Google search?”

What do you think?

PS…did you hear that We Build Pages is changing names to Internet Marketing Ninjas in a few months?

Other posts on Panda that I did:


Comments

  1. Google Panda Update – A Broader View of U.S. Traffic Patterns | WebProNews April 26, 2011 at 8:17 AM

    […] Jim Boykin wrote an interesting piece about Panda, with a bit of a history lesson, referencing Google’s “supplemental […]

  2. Kevin April 26, 2011 at 9:26 AM

    Jim, the links you reference to past posts on the supplemental index all 404.

  3. We Build Pages April 26, 2011 at 9:50 AM

    Thanks for the heads up Kevin – fixed!

  4. SearchCap: The Day In Search, April 26, 2011 April 26, 2011 at 2:04 PM

    […] Google Panda Update – Thoughts and Solutions from Jim Boykin, http://www.internetmarketingninjas.com […]

  5. Joel April 26, 2011 at 6:53 PM

    Another good threshold question is “Would I recommend this site to someone else?” If the answer is yes, you shouldn’t have anything to worry about from Google.

  6. Adam April 26, 2011 at 11:44 PM

    i am so happy you have started blogging again. your articles are gold and we would love to get them at least once in a month. Thanks.

  7. The Panda Fallout – Has Google Missed The Mark? | Marketing Blog | Creative Development April 28, 2011 at 5:01 PM

    […] is not as clear cut as that. Even Jim Boykin, a long time SEO analyst, has been forced to create a long winded explanation of the changes Google went through. Basically he says that if the website looks trustworthy, then […]

  8. Google Panda Update – User Behavior and Other Signals. May 2, 2011 at 7:33 PM

    […] Google Panda Update – Thoughts and Solutions from Jim Boykin […]

  9. paavan May 3, 2011 at 9:46 PM

    I understand your solution, but some websites have over 10,000 pages. How can they survive? Changing that much content is very difficult.

  10. Edgar May 17, 2011 at 2:05 AM

    If what you say is true, we will soon see robots to defeat the algorithm.

    Suppose a robot searches google for my keywords, clicks on my competitor’s page and immediately clicks on the back button to make another choice (with proxies of course)… One could easily destroy his ranking.

    I have a hard time to believe that Google would implement something so simple to beat, but the harsh reality is that they probably do.

    Once, Google claimed that ‘you are safe from any nasty actions undertaken by your competitors’. I think that more and more nothing could be further from the truth.

  11. roey May 17, 2011 at 1:33 PM

    i agree only partly with this post
    don’t believe in ctr as a signal

  12. We Build Pages May 17, 2011 at 7:07 PM

    Roey, I don’t believe that CTR is a signal either…
    what I do believe is that when people go to your site from a google search, and then go back to google and click on another site, that’s a Very strong signal to google that your page wasn’t good….(yes, everyone can name “exceptions” to why people would do this…but most of the time, that behavior is a bad user experience in the eyes of google)…
    this metric isn’t click through rate…it’s a metric that only google knows.

  13. We Build Pages May 17, 2011 at 7:18 PM

    Edgar,
    if you were using this metric as one of the major metrics in your algorithm, what would you tell people…um….you’d probably say something like…like everything that google has been saying so far…

    but…spamming google with fake user bots emulating real “random….yet optimized” behavior isn’t as easy as it might seem…

    There are a lot of problems with faking user behavior…like….Google collects so much data from so many different sources…and I’d guess those “signals” would have to be consistent across all channels that they were measuring…and the amount you’d have to “spoof” would be huge (I could go on and on…I just deleted several paragraphs of text talking about that…better not to publish details of things you’d need to look at to consider a Panda Bot.)

    But yes….there might be a few really really clever “bad guys” who might be able to spoof this user behavior across thousands of pages of someone’s sites, using many times more thousands of google searches to emulate this behavior of all those users from all over the world, across every medium that google uses to measure these things… to a large enough degree to impact a site to be “hit” by panda…there’s only a few people who can do that (and I’d plead the 5th on if I’d know anyone who could ever do that)…but anyways..there’s millions and millions of people who publish “crap” on the web…If I were google, I’d take the few “bad guys” over the millions of people who publish content that makes people want to return to google….I’d hope that the Google team would have enough “cross reference” checks to make fake bot users impossible.

    I did talk to a friend who was working on a bot tool that he wanted to see if it could be used to help get his site OUT of panda… gee…if that worked, how much would the “Panda Poop Off” tool be worth?….
