January brought a spate of stories about Google’s search quality. Reading through some of these recent articles, you might ask whether our search quality has gotten worse. The short answer is that according to the evaluation metrics that we’ve refined over more than a decade, Google’s search quality is better than it has ever been in terms of relevance, freshness, and comprehensiveness. Today, English-language spam in Google’s results is less than one-half what it was five years ago, and spam in most other languages is even lower than in English. However, we have seen a slight uptick of spam in recent months, and while we’ve already made progress, we have new efforts underway to continue to improve our search quality.
Just as a reminder, webspam is junk you see in search results when Web sites try to cheat their way into higher positions in search results or otherwise violate search-engine quality guidelines. A decade ago, the spam situation was so bad that search engines would regularly return off-topic webspam for many different searches. For the most part, Google has successfully beaten back that type of “pure webspam” — even while some spammers resort to sneakier or even illegal tactics such as hacking Web sites.
As we’ve increased both the size and the freshness of our index in recent months, we’ve naturally indexed a lot of good content, and some spam as well. To respond to that challenge, we recently launched a redesigned document-level classifier that makes it harder for spammy on-page content to rank highly. The new classifier is better at detecting spam on individual Web pages, e.g., repeated spammy words of the sort you tend to see in junky, automated, self-promoting blog comments. We’ve also radically improved our ability to detect hacked sites, which were a major source of spam in 2010. And we’re evaluating multiple changes that should help drive spam levels even lower, including one change that primarily affects sites that copy others’ content and sites with low levels of original content. We’ll continue to explore ways to reduce spam, including new ways for users to give more explicit feedback about spammy and low-quality sites.
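To make the idea of an on-page signal such as "repeated spammy words" concrete, here is a toy sketch. It is not Google's classifier, whose details are not described here; the function names, the `top_n` parameter, and the 0.6 threshold are all hypothetical choices made for illustration. It simply measures how much of a page's text is taken up by its few most frequent words.

```python
import re
from collections import Counter


def repetition_score(text: str, top_n: int = 3) -> float:
    """Return the fraction of all words covered by the top_n most frequent words.

    Illustrative heuristic only; top_n is an assumed parameter.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    counts = Counter(words)
    top_total = sum(count for _, count in counts.most_common(top_n))
    return top_total / len(words)


def looks_repetitive(text: str, threshold: float = 0.6, min_words: int = 10) -> bool:
    """Flag text whose repetition score meets a hypothetical threshold.

    Very short texts are never flagged, since their scores are not meaningful.
    """
    word_count = len(re.findall(r"[a-z0-9']+", text.lower()))
    if word_count < min_words:
        return False
    return repetition_score(text) >= threshold


if __name__ == "__main__":
    comment = "cheap pills cheap pills buy cheap pills now cheap pills"
    print(repetition_score(comment), looks_repetitive(comment))
```

A real system would combine many signals and learn its thresholds from labeled data; a single word-frequency ratio like this one would misfire on short or legitimately repetitive pages.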
As “pure webspam” has decreased over time, attention has shifted instead to “content farms,” which are sites with shallow or low-quality content. In 2010, we launched two major algorithmic changes focused on low-quality sites. Nonetheless, we hear the feedback from the Web loud and clear: People are asking for even stronger action on content farms and sites that consist primarily of spammy or low-quality content. We take pride in Google search and strive to make each and every search perfect. The fact is that we’re not perfect, and combined with users’ skyrocketing expectations of Google, these imperfections get magnified in perception. However, we can and should do better.