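"The ranking of Google search results" is Google's lifeblood, so official explanations of it are rare. The reason is hard to fault: "competition and abuse", and neither of the two can be ignored!
Recently, though, Udi Manber, the Google VP responsible for search quality, shared some interesting material in "Introduction to Google Search Quality". If you are interested in search, do not miss it! Here are the points that caught my eye.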
- more than one thousand programmer/scientist years have gone directly into their development (that is to say "the ranking algorithms").
- The search quality group is divided into several teams:
  - The heart of the group is the team that works on core ranking
  - Another team in our group is responsible for evaluating how well we're doing
  - Another team is dedicated to new features and new user interfaces
  - There is a whole team that concentrates on fighting webspam and other types of abuse
  - There are other teams devoted to particular projects
- PageRank [1] is still in use today, but it is now a part of a much larger system. … made significant changes to the PageRank algorithm in January, 2008.
- Other parts of the ranking system include:
  - language models (the ability to handle phrases, synonyms, diacritics, spelling mistakes, and so on)
  - query models (it's not just the language, it's how people use it today)
  - time models (some queries are best answered with a 30-minutes old page, and some are better answered with a page that stood the test of time)
  - personalized models (not all people want the same thing).
- Google typically evaluates search quality in three ways: (1) automated evaluations every minute, (2) periodic evaluations of overall quality, and (3) evaluations of specific algorithmic improvements.
- In 2007, … more than 450 new improvements, about 9 per week on the average.
- … work on projects where the sole purpose is to simplify the algorithms. Simple is good.
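To make the "time models" point in the list above a little more concrete, here is a toy sketch of how a freshness signal could be blended with a base relevance score. This is purely my own illustration and says nothing about how Google actually does it: the names `time_aware_score` and `rank`, the exponential half-life decay, the 24-hour default, and the per-query `freshness_weight` are all assumptions made up for the example.

```python
def time_aware_score(relevance, age_hours, freshness_weight, half_life_hours=24.0):
    """Blend a base relevance score with a decaying freshness bonus.

    Hypothetical toy model, not Google's method.
    freshness_weight is in [0, 1]: close to 1 for queries that want very
    recent pages, close to 0 for queries best served by long-lived pages.
    """
    if not 0.0 <= freshness_weight <= 1.0:
        raise ValueError("freshness_weight must be in [0, 1]")
    # 1.0 for a brand-new page, halved after every half_life_hours.
    freshness = 0.5 ** (age_hours / half_life_hours)
    return (1.0 - freshness_weight) * relevance + freshness_weight * freshness


def rank(pages, freshness_weight):
    """Return page names ordered by descending time-aware score.

    pages is a list of (name, relevance, age_hours) tuples.
    """
    scored = sorted(
        pages,
        key=lambda page: time_aware_score(page[1], page[2], freshness_weight),
        reverse=True,
    )
    return [name for name, _, _ in scored]


if __name__ == "__main__":
    # Made-up example data: a 30-minute-old page and a one-year-old page.
    pages = [("fresh", 0.6, 0.5), ("classic", 0.9, 8760.0)]
    print(rank(pages, freshness_weight=0.8))  # a news-like query
    print(rank(pages, freshness_weight=0.1))  # an evergreen query
```

With a high `freshness_weight` the 30-minute-old page comes out on top, and with a low one the page that "stood the test of time" wins, which is the trade-off the quote describes. How a real system would decide the weight for each query is exactly the hard part, and this sketch leaves it out.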
Udi Manber is one of the scientists I admire most. The title "Chief Algorithms Officer" was first created by Amazon specifically to recognize his contributions.
Recommended reading:
- 20 (Rare) Questions for Google Search Guru Udi Manber
- Insight Into Google's Search Quality Efforts
In Google’s Q3 2006 earnings call, Google CEO Eric Schmidt spent a good deal of time on “personalization”, folded “personalization of information” into Google’s mission, and also mentioned some related plans. That caught my attention!
First, a few brief excerpts:
We believe that people’s information and the information they want to receive … needs to be accessible when and where they want it for them in a very personalized way.
The interesting thing is that this approach to having your information personalized is a benefit not only for the user who can continue to refine and target information … but also for businesses who want to know they are spending their money in an effective and targeted way.
As we continue to innovate and bring out … new products, we’ll also continue to … improve the experiences, bringing the most personalized and targeted information to people, which is ultimately our mission.
[We] provide access to the world’s information … [and] organize it in a very personalized and targeted way. That benefit drives the entire cycle of Google, and it’s fundamental.
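Coming right after the much-watched “Kiko auction”, this makes one worry: what lies ahead for the lightweight startups that are building their businesses around “Personalized”?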
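Paul Graham was one of Kiko’s investors. When Kiko was first put up for auction on eBay, he said that the launch of Google Calendar and its tight integration with Gmail was one of the main reasons Kiko failed. He advised new startups to learn from Kiko and stay out of Google’s path. So now that Google is preparing to move into “Personalized”, what should the companies in that space do?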
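Personally, I think Google will certainly not be a winner-take-all player in “Personalized”! In my post “Vertical Search or Personalized Recommendation” (《垂直搜索 or 个性化推荐》), I also noted that the applications that could adopt personalization technology are countless, and that there is no universally applicable recommendation algorithm, so Google cannot, and does not have the capacity to, extend its business to all of them. So if you pick one direction well and work out the recommendation method that fits it best, getting ahead of Google is entirely possible!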
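In fact, as I see it, what Google does best is build the products and services that Google employees use in their own work. Search goes without saying. As for the others, such as Gmail, Google Calendar, and Google Reader, I think competing providers in those areas basically have little chance left. Elsewhere, though, for example YouTube versus Google Video, or Findory versus Google News, I think the former may have better odds than Google. That is because Google employees probably rarely watch videos or browse news at work. Besides, Google has fully grown into a behemoth of a company, and that is one of its few weaknesses: as a company grows, bureaucracy usually grows with it, which makes it harder for the company to accept novel things.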
So, for lightweight startups committed to “Personalized”, my conclusion is: pick your direction well, be creative, and charge ahead boldly!