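Toby Segaran is a leading figure in Recommender Systems and the Semantic Web, and the author of two popular technical books, "Programming Collective Intelligence" [1] and "Programming the Semantic Web". He has an enviable skill: with a light touch he untangles the thread of an algorithm and turns dry formulas into code with a sense of humor. I'm a fan of his.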
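In the first half of this year, at the invitation of Andreas Weigend, he gave a lecture on "Recommender Systems" in Stanford's Data Mining and Electronic Business class. Andreas Weigend previously worked at Amazon as Chief Scientist and contributed a great deal to building Amazon's recommendation engine. Here are some of the main points, as listed by Toby himself: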
- Amazon makes 20-30% of its sales from recommendations. Only 16% of people go to Amazon with explicit intent to buy something
- The data that you collect matters much more than the algorithm you use. Amazon’s algorithm is essentially a large product-product correlation matrix for the past hour, but it works for them because they collect so much data through user actions
- Many problems including shopping, targeted advertising, dating, finding events, etc. can be framed as recommendation problems
- Very important takeaway: find ways to collect as much user input as possible without being disruptive. People don’t train systems, they try to benefit themselves, but this is the best kind of training data
- There are a lot of different types of data that can train a system: votes, clicks, page-view time, purchases, tagging, adding a title. The user does these things anyway, and you can use the data
- A/B testing is an effective and underused way to learn about people. Simply by varying the way you phrase something, you can learn more about your users
- Very few systems now are combining metadata or content with collaborative filtering. The consensus in the class when discussing a music recommendation system was that this could be very effective
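
To make the "product-product correlation matrix" point a bit more concrete, here is a minimal sketch of item-to-item similarity built purely from user actions. It is not Amazon's actual algorithm: the basket data, the function names `item_similarity` and `recommend`, and the choice of cosine similarity over co-purchase counts (instead of a true correlation coefficient or a one-hour window) are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def item_similarity(baskets):
    """Build an item-item similarity table from purchase baskets.

    Each basket is the list of items one user bought together.
    Similarity is the cosine between binary item vectors:
    co-occurrences / sqrt(count_a * count_b).
    """
    item_counts = defaultdict(int)   # number of baskets containing each item
    pair_counts = defaultdict(int)   # number of baskets containing both items
    for basket in baskets:
        items = sorted(set(basket))  # de-duplicate, fix the pair order
        for item in items:
            item_counts[item] += 1
        for a, b in combinations(items, 2):
            pair_counts[(a, b)] += 1

    sim = defaultdict(dict)
    for (a, b), together in pair_counts.items():
        score = together / sqrt(item_counts[a] * item_counts[b])
        sim[a][b] = score
        sim[b][a] = score            # the matrix is symmetric
    return dict(sim)


def recommend(sim, item, k=3):
    """Return up to k items most similar to `item` (ties broken by name)."""
    neighbours = sim.get(item, {})
    ranked = sorted(neighbours, key=lambda other: (-neighbours[other], other))
    return ranked[:k]


# Hypothetical purchase data, invented for this example.
baskets = [
    ["book", "lamp"],
    ["book", "lamp", "pen"],
    ["book", "pen"],
    ["lamp", "desk"],
]
sim = item_similarity(baskets)
print(recommend(sim, "book"))  # ['pen', 'lamp']
```

Nothing clever happens in the algorithm itself, which is the point of the first takeaway: the quality of the output depends almost entirely on how many baskets (or clicks, votes, page views) you can feed into it.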





