Recently the European Commission opened a preliminary inquiry into competition complaints. Part of the complaint alleges that Google operates without sufficient transparency into how and why web sites rank in our search results. The notion that Google isn't transparent is tough for me to swallow. Google has set the standard in how we communicate with web site publishers. Let me tell you about some of the ways we explain to sites how we rank them and why.
One of the most widely discussed parts of Google's scoring has always been PageRank. That "secret ingredient" is hardly a secret. Here it is. That early paper not only gave the formula for PageRank, but mentioned many of the other signals in Google's ranking, including anchor text, the location of words within documents, the relative proximity of query words in a document, the size and type of fonts used, the raw HTML of each page, and capitalisation of words. Google has continued to publish literally hundreds of research papers over the years. Those papers reveal many of the "secret formulas" for how Google works and document essential infrastructure that Google uses. Some of these papers have spurred not only open-source projects but entire companies in their own right.
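As an illustration only: the formula in that early paper can be written as PR(A) = (1 - d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)), where T1...Tn are the pages that link to A, C(T) is the number of links going out of page T, and d is a damping factor (the paper suggests 0.85). The Python sketch below simply iterates that formula over a tiny made-up link graph. The function name, the example graph, and the fixed iteration count are assumptions for this example; it is not Google's production code, and it ignores every other ranking signal mentioned above.

```python
def pagerank(links, d=0.85, iterations=100):
    """Iterate the published PageRank formula over a small link graph.

    links: dict mapping each page to the list of pages it links to
           (a hypothetical toy graph, not real crawl data).
    d:     damping factor; 0.85 is the value suggested in the paper.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    # Start every page with the same score.
    rank = {page: 1.0 for page in pages}

    for _ in range(iterations):
        new_rank = {}
        for page in pages:
            # Sum PR(T) / C(T) over every page T that links to this page.
            incoming = sum(
                rank[source] / len(targets)
                for source, targets in links.items()
                if page in targets
            )
            new_rank[page] = (1 - d) + d * incoming
        rank = new_rank
    return rank


if __name__ == "__main__":
    toy_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
    for page, score in sorted(pagerank(toy_graph).items()):
        print(page, round(score, 3))
```

In this toy graph, page "c" collects links from both "a" and "b", so it ends up with the highest score, while "b", which is linked only from "a", ends up with the lowest.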
Academic papers are one thing, but Google also aims to engage and educate in many other ways. In 1999, Sergey Brin participated in the first Search Engine Strategies conference for webmasters. In 2001, Google became one of the first search engines to engage online at a publisher forum called WebmasterWorld. One representative (GoogleGuy) has posted over 2800 times, while another (AdWordsAdvisor) has posted almost 5000 times.
Google's efforts at transparency and communication have evolved with the web. We started blogging in May 2004 and have written thousands of posts on our official blog. Google now has over 70 official blogs, including an official webmaster blog specifically to help site owners understand how Google works and help them rank appropriately in our search results. Google publishes more blog posts than almost any other large company. We also provide extensive public documentation on our web site with advice for publishers, in dozens of different languages.
As the head of Google's webspam team (which tries to stop attempts to violate our clearly documented, public webmaster guidelines), I'm often asked questions about how Google works. That's why I started my own personal blog in 2005 and have written hundreds of posts about Google. The topics range from common web site mistakes to advice for new bloggers. I've had the pleasure of speaking to web site owners or doing public web site reviews at over 30 different search conferences.
We've tried all sorts of experiments to help site owners understand how Google's search ranking works. We've done multiple live webmaster chats online with hundreds of simultaneous participants. We've experimented with tweeting. We've participated in podcasts. And here's one of my favorite ways we've helped to break out of the black box and give advice to publishers: in the past year, we've taken questions from the public and posted hundreds of video answers on a webmaster video channel. Those videos have been watched over 1.5 million times (!). We also engage online across the blogosphere to answer questions about Google's practices.
The list goes on and on. Google has reached out to other search engines on methods to make life easier for website owners. The resulting standards include specifying preferred web site URL formats as well as Sitemaps, an easy way for webmasters to tell search engines about the pages on their site. Google provides a webmaster forum where both Google employees and helpful outside "superusers" hang out and answer questions about specific sites. We've run in-person website clinics to provide specific one-on-one feedback and advice in locations from San Francisco to India to Russia, as well as virtual site clinics in Spanish. We've even confirmed ranking signals that Google doesn't use in our algorithms, such as the keywords meta tag, which saves site owners from doing needless work and helps avoid frivolous lawsuits.
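For readers who have not seen one, a Sitemap is just a small XML file listing a site's URLs. The sketch below shows one possible way to generate a minimal file in Python. The XML namespace is the one defined by the public Sitemaps protocol; the function name and the example.com URLs are placeholders invented for this example, and the optional per-URL fields in the protocol are left out.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the public Sitemaps protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal Sitemap XML document (as a string) for the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for address in urls:
        url = ET.SubElement(urlset, "url")
        # <loc> is the only required child of <url>; ElementTree escapes
        # characters such as "&" in the text for us.
        ET.SubElement(url, "loc").text = address
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Placeholder URLs for illustration only.
    print(build_sitemap([
        "https://www.example.com/",
        "https://www.example.com/about",
    ]))
```

The resulting file is typically saved at the root of the site (for example as sitemap.xml) and then made known to search engines, for instance by submitting it in Webmaster Tools.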
The frustrating thing is that even if all 20,000 employees at Google worked full-time on answering questions from website publishers, we still couldn't talk to every site owner. Why not? Because the web has over 192 million domain names registered. That's why we introduced Google Webmaster Tools, a one-stop location to provide scalable, self-service information and to let webmasters provide us with data. Describing the powerful tools we provide to site owners for free would take an entire other blog post, but the offerings include:
- Site owners can get recommendations about issues like duplicate meta descriptions or missing title tags.
- Site owners whose sites we believe have violated our webmaster guidelines, and against which Google has taken corresponding action in our index, can submit a request for reconsideration.
- Site owners who have been hacked can get details about malware on their site. After they remove the hacked content, they can fetch pages from their site as Googlebot to make sure the malicious content is really gone.
- Site owners can find out about errors that Google encountered while crawling their site.
A Google employee recently blogged about using these free, public tools to diagnose an issue with his webhost where he had exceeded his bandwidth quota. Millions of webmasters have taken similar advantage of Google's free tools for site owners to get helpful information about their site.
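To give a feel for the first kind of recommendation in the list above (missing title tags and duplicate meta descriptions), here is a rough sketch of a check a site owner could run over their own pages. It uses only Python's standard html.parser module. The class name, the function name, and the sample pages are invented for this example, and it makes no claim about how Webmaster Tools itself computes its reports.

```python
from collections import defaultdict
from html.parser import HTMLParser


class HeadAudit(HTMLParser):
    """Collect the <title> text and the meta description of one page."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.description = None

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True
        elif tag == "meta":
            attributes = dict(attrs)
            if (attributes.get("name") or "").lower() == "description":
                self.description = attributes.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def audit(pages):
    """pages: dict mapping URL -> HTML source (hypothetical input).

    Returns (urls_missing_a_title, {description: [urls sharing it]}).
    """
    missing_titles = []
    by_description = defaultdict(list)
    for url, html in pages.items():
        parser = HeadAudit()
        parser.feed(html)
        parser.close()
        if not parser.title.strip():
            missing_titles.append(url)
        if parser.description:
            by_description[parser.description.strip()].append(url)
    duplicates = {
        text: urls for text, urls in by_description.items() if len(urls) > 1
    }
    return missing_titles, duplicates
```

Calling audit() with a handful of pages returns the URLs that have no usable title, plus any description text shared by more than one URL.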
At Google, we try to be as open as we can, even to the point of helping users export their data out of Google's products. At the same time, we don't think it's unreasonable for any business to have some trade secrets, not least because we don't want to help spammers and crackers game our system. If people who are trying to game search rankings knew every single detail about how we rank sites, it would be easier for them to 'spam' our results with pages that are not relevant and are frustrating to users -- including porn and malware sites.
Ultimately, criticising Google for its "secret formula" is an easy claim to make, but it just isn't true. Google has worked day after day for years to be open, to educate publishers about how we rank sites, and to answer questions from both publishers and our users. So if that's how people choose to define "secret," then ours must be the worst kept secret in the world of search.
Posted by Matt Cutts, Principal Engineer, Search Quality Team