In October 2006, a little-known American entertainment company called Netflix announced the Netflix Prize, a competition offering US$1,000,000 to any member of the public who could improve the performance of Netflix's recommender system by at least 10%.
Can you guess the year Netflix was founded? Here's a hint: it was 1997.
Netflix evaluated submissions based on RMSE (surprise!), which we covered in depth on Day Eighteen and which can be expressed as sqrt(mean((y - y_pred)^2)).
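As a quick refresher, here is a minimal sketch of RMSE in plain Python: the square root of the mean of the squared errors. The function name `rmse` and the sample ratings are made up for illustration; this is not Netflix's scoring code.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length rating sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up ratings, purely for illustration
actual = [5, 1, 1, 4]
predicted = [4, 2, 1, 4]
score = rmse(actual, predicted)
print(round(score, 4))
```

Lower is better: a perfect set of predictions would score 0.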
The Netflix algorithm was called Cinematch (a mid-2000s name if there ever was one) and its best performance at the time was an RMSE of 0.9525, so to win the prize a participant had to achieve an RMSE of 0.8572 or lower.
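The arithmetic behind that target is short enough to check by hand. Here is a small sketch, where the helper name `improvement` is my own and the 0.8567 figure is the eventual winning RMSE from the results below:

```python
CINEMATCH_RMSE = 0.9525      # Cinematch baseline quoted in the text
REQUIRED_IMPROVEMENT = 0.10  # the 10% bar set by the contest


def improvement(baseline, rmse):
    """Fractional reduction in RMSE relative to the baseline."""
    return (baseline - rmse) / baseline


# RMSE needed to clear the bar: 10% below the baseline (about 0.8572)
target = CINEMATCH_RMSE * (1 - REQUIRED_IMPROVEMENT)

# Improvement achieved by the eventual winning RMSE of 0.8567
winning = improvement(CINEMATCH_RMSE, 0.8567)

print(f"target RMSE: {target:.5f}")
print(f"winning improvement: {winning:.2%}")
```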
Here's how it all went down:
|Year||RMSE||Improvement over Cinematch||Leading team|
|2007||0.8723||8.42%||KorBell (an alternate name for BellKor, the AT&T Labs team)|
|2008||0.8627||9.27%||BellKor in BigChaos (the previous year's leading team in a crossover episode with another team, BigChaos)|
|2009||0.8567||Winner! 10.06%||BellKor's Pragmatic Chaos (the previous year's leading collaboration joined by yet another team, PragmaticTheory)|
The leaderboard and original contest rules are still accessible here.
Netflix reported over 5,000 teams making valid submissions over the course of the competition, and the three-person KorBell/BellKor team spent over 2,000 hours in year one coming up with the 107 algorithms that led to their 8.4% reduction in RMSE.
So despite the seemingly negligible differences in RMSE from year to year, a tremendous amount of time and energy was devoted to carving out each decimal-point gain.
KorBell/BellKor joining forces with other teams to form some kind of data science suicide squad wasn't all that unusual. In 2007, the second-placed team on the leaderboard was called Dinosaur Planet. Dinosaur Planet later joined forces with a team called Gravity to form When Gravity and Dinosaurs Unite.
Nor was the competition without its fair share of drama. A team called The Ensemble (a merger of two teams, "Grand Prize Team" and "Opera Solutions and Vandelay United") matched KorBell/BellKor's final result with an RMSE of 0.8567, but the KorBell/BellKor collective submitted their results 20 minutes earlier.
Apart from being a neat historical detail, the story of the Netflix Prize serves as a useful introduction to the concept of collaborative filtering.
Collaborative filtering is a technique recommender systems use to predict what a user will like based on the ratings of other users with similar tastes. Let's get right to it.
Let's say we have a subset of Netflix shows, users, and the ratings users gave to each show.
User 14 hasn't watched Moana... But they watched Zootopia and liked it, giving it a 5. They also watched Jessica Jones and Daredevil but didn't like them, giving both shows a 1.
We know User 14 hasn't watched Moana. Should we recommend that they watch it?
Putting aside the fact that Moana is one of the greatest movies of all time and the perfect way to teach your kids that the same organizational practices critical to the execution of an existing business model will inevitably lead to the disruption of that same organization, let's see what we can infer from the data.
Based on our very small dataset, we can see that Users 14 and 29 exhibit similar behavior - they both gave Zootopia a high score and Jessica Jones a low one. We know that User 29 gave Moana a high score. If we're working off the assumption that Users 14 and 29 have similar tastes, we can reasonably expect User 14 to give Moana a high score as well.
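That line of reasoning can be sketched as user-based collaborative filtering: measure how similar two users are on the shows they have both rated, then predict a missing rating as a similarity-weighted average of other users' ratings. In the sketch below, User 14's ratings follow the text, but the exact scores for User 29 and the whole of User 99 are assumptions added for illustration. Cosine similarity is one simple choice among several.

```python
import math

# Illustrative ratings: User 14 follows the text; User 29's exact scores and
# the hypothetical User 99 are assumptions.
ratings = {
    14: {"Zootopia": 5, "Jessica Jones": 1, "Daredevil": 1},
    29: {"Moana": 5, "Zootopia": 5, "Jessica Jones": 1},
    99: {"Moana": 2, "Zootopia": 2, "Jessica Jones": 5, "Daredevil": 4},
}


def cosine_similarity(a, b):
    """Cosine similarity computed over the shows both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[s] * b[s] for s in shared)
    norm_a = math.sqrt(sum(a[s] ** 2 for s in shared))
    norm_b = math.sqrt(sum(b[s] ** 2 for s in shared))
    return dot / (norm_a * norm_b)


def predict(user, show):
    """Similarity-weighted average of other users' ratings for `show`."""
    weighted_sum, total_similarity = 0.0, 0.0
    for other, other_ratings in ratings.items():
        if other == user or show not in other_ratings:
            continue
        similarity = cosine_similarity(ratings[user], other_ratings)
        weighted_sum += similarity * other_ratings[show]
        total_similarity += similarity
    return weighted_sum / total_similarity if total_similarity else None


print(cosine_similarity(ratings[14], ratings[29]))
print(predict(14, "Moana"))
```

Because User 29 agrees with User 14 on every show they share, User 29's high Moana rating dominates the prediction.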
Taking this one step further, let's assume that each show has a set of characteristics that affect how it is likely to be rated by different users, and that each user has a set of preferences that affect how they are likely to rate different shows.
How could we categorize the four shows (all owned by Disney)? We could describe Moana and Zootopia as cartoons, and Jessica Jones and Daredevil as superhero shows.
And user preferences could be as simple as "likes cartoons" and "likes superhero shows".
|likes superhero shows||72||2||5||4|
|likes superhero shows||211||1||1||4|
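Here is a rough sketch of what hand-labelled characteristics and preferences could look like in code. The two categories come from the text, but the 0/1 encoding, the dot-product score, and the helper names `affinity` and `recommend` are assumptions made for illustration.

```python
# Show characteristics as (cartoon, superhero) flags. The categories follow
# the text; the numeric encoding is an assumption.
show_features = {
    "Moana": (1, 0),
    "Zootopia": (1, 0),
    "Jessica Jones": (0, 1),
    "Daredevil": (0, 1),
}

# User preferences expressed on the same two axes
user_preferences = {
    "likes cartoons": (1, 0),
    "likes superhero shows": (0, 1),
}


def affinity(preference, features):
    """Dot product of a preference vector and a show's feature vector."""
    return sum(p * f for p, f in zip(preference, features))


def recommend(preference, watched):
    """Return the unwatched show with the highest affinity score."""
    candidates = {
        show: affinity(preference, features)
        for show, features in show_features.items()
        if show not in watched
    }
    return max(candidates, key=candidates.get)


pick = recommend(user_preferences["likes superhero shows"], watched={"Jessica Jones"})
print(pick)
```

With only two hand-picked categories, anyone labelled "likes superhero shows" who has seen Jessica Jones gets pointed at Daredevil.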
But the world is richer and more complex than that. What if the reason User 211 rated Jessica Jones highly wasn't because they like superhero shows, but because they like shows with a strong female protagonist? Daredevil doesn't seem like an obvious recommendation to make in this case.
We need to make allowances for multiple user preferences and show characteristics, not all of which we might be able to articulate - or even be aware of.
Does that sound like a problem we can throw some randomly generated weights at? Because it sure sounds like a problem we can throw some randomly generated weights at.
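The text hasn't yet said how those weights get used, so treat the following as one possible reading rather than the method itself: give every user and every show a short vector of random weights (unnamed "hidden" characteristics), predict a rating as their dot product, and nudge the weights with gradient descent to shrink the error on known ratings. The number of factors, the learning rate, the epoch count, and User 29's exact scores are all assumptions.

```python
import math
import random

random.seed(0)
N_FACTORS = 2         # number of hidden characteristics (arbitrary choice)
LEARNING_RATE = 0.05  # step size (arbitrary choice)

users = [14, 29]
shows = ["Moana", "Zootopia", "Jessica Jones", "Daredevil"]

# Known ratings: User 14 follows the text; User 29's exact scores are assumed.
known = {
    (14, "Zootopia"): 5, (14, "Jessica Jones"): 1, (14, "Daredevil"): 1,
    (29, "Moana"): 5, (29, "Zootopia"): 5, (29, "Jessica Jones"): 1,
}

# Randomly generated weights: one vector per user and one per show
user_weights = {u: [random.uniform(0, 1) for _ in range(N_FACTORS)] for u in users}
show_weights = {s: [random.uniform(0, 1) for _ in range(N_FACTORS)] for s in shows}


def predict(user, show):
    """Predicted rating: dot product of the user's and the show's weights."""
    return sum(u * s for u, s in zip(user_weights[user], show_weights[show]))


def rmse():
    """RMSE of the current predictions over the known ratings."""
    squared = [(r - predict(u, s)) ** 2 for (u, s), r in known.items()]
    return math.sqrt(sum(squared) / len(squared))


before = rmse()

# Stochastic gradient descent on the squared error of each known rating
for _ in range(500):
    for (user, show), rating in known.items():
        error = rating - predict(user, show)
        for k in range(N_FACTORS):
            u_k, s_k = user_weights[user][k], show_weights[show][k]
            user_weights[user][k] += LEARNING_RATE * error * s_k
            show_weights[show][k] += LEARNING_RATE * error * u_k

after = rmse()
print(f"RMSE before: {before:.3f}, after: {after:.3f}")
print(f"Predicted rating for User 14 on Moana: {predict(14, 'Moana'):.2f}")
```

Nobody told the model what the two factors mean. They are whatever helps reproduce the known ratings, which is the point: the characteristics don't have to be ones we can articulate. With this little data, the Moana prediction shouldn't be taken seriously.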