The easiest road gets harder the farther you walk; the hardest road gets easier.



Are Machine-Learned Models Prone to Catastrophic Errors?

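I have been busy lately, reading more papers and fewer blogs, and I nearly missed some very interesting posts. The "Introduction to Google Search Quality" I mentioned last time is one of them; this time I want to talk about another, "Are Machine-Learned Models Prone to Catastrophic Errors?". Unfortunately, both blogs are blocked by our great GFW.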

The opinions of a master like Peter Norvig deserve careful thought. Here are the points that interest me.

  1. Two phases of Google's search algorithms
    • An offline phase, which is time-consuming and query-independent.
    • An online phase, which responds to a user query in a few milliseconds.
  2. Tons of training data … from the armies of "raters" employed by Google
  3. The big surprise is that Google still uses a manually crafted formula for its search results, even though its best machine-learned model is now as good as, and sometimes better than, the hand-tuned formula on the results-quality metrics that Google uses.
  4. Two reasons for this
    • The human experts who created the algorithm believe they can do better than a machine-learned model.
    • Google's search team worries that machine-learned models may be susceptible to catastrophic errors on unforeseen query types that differ from the training data.
  5. Nassim Taleb, in "The Black Swan", divides phenomena into two classes
    • Mediocristan
    • Extremistan
  6. The current generation of machine learning algorithms can work well in Mediocristan but not in Extremistan.
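
To make the Mediocristan/Extremistan distinction in the last two points more concrete, here is a minimal sketch that is not from the original post. It assumes a Gaussian sample as a stand-in for Mediocristan and a Pareto sample as a stand-in for Extremistan. The function names, the parameters (mean 170, standard deviation 10, Pareto shape 1.1), and the sample size are all illustrative choices. It measures how much of the total a single largest observation accounts for: in the thin-tailed case no single point matters, while in the heavy-tailed case one point can carry a noticeable share of the sum.

```python
import random


def largest_share(samples):
    """Fraction of the total contributed by the single largest observation."""
    return max(samples) / sum(samples)


def simulate(n=10_000, seed=0):
    """Return (thin_tailed_share, heavy_tailed_share) for n samples each."""
    rng = random.Random(seed)
    # Thin-tailed stand-in for "Mediocristan" (illustrative parameters,
    # loosely resembling human heights in centimetres).
    thin = [rng.gauss(170.0, 10.0) for _ in range(n)]
    # Heavy-tailed stand-in for "Extremistan": Pareto with shape 1.1
    # (an assumed value, chosen only to make the tail very heavy).
    heavy = [rng.paretovariate(1.1) for _ in range(n)]
    return largest_share(thin), largest_share(heavy)


if __name__ == "__main__":
    thin_share, heavy_share = simulate()
    print(f"largest observation, thin-tailed sample:  {thin_share:.6f} of total")
    print(f"largest observation, heavy-tailed sample: {heavy_share:.6f} of total")
```

A model fitted to the thin-tailed sample sees essentially everything that matters in its training data. In the heavy-tailed sample, the observation that dominates the total may simply not have appeared yet, which is one way to read the worry about unforeseen query types.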

So the open question is whether new machine learning algorithms can be devised that work well in Extremistan, or whether it can be proven that this cannot be done.

 

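Founder of ResysChina
1. Continuously following personalized recommendation technology;
2. Continuously following Semantic Web technology;
3. Commenting on Internet businesses and products related to the two items above.

I believe in the power of technology!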
wendell.gu@GMail.com
