Discussion about this post

Overcoming Bias Commenter:

I can confirm that the models at work in the Netflix Prize are ridiculously simple. Rather than complex Bayesian statistical formulations and multilevel models, you have early stopping with a little ridge regression. I was amused to see the earlier reference to single exponential smoothing; I had just used something similar to great effect.
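For readers unfamiliar with the technique mentioned above, single exponential smoothing is just a one-line recurrence. This is an illustrative sketch, not the commenter's actual model; the sample series and the choice of alpha are made up:

```python
def exponential_smooth(series, alpha=0.3):
    """Single exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_{t-1}."""
    smoothed = [series[0]]  # seed the recurrence with the first observation
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed

# Toy ratings series; alpha controls how quickly old observations decay.
print(exponential_smooth([3.0, 4.0, 5.0, 2.0, 4.0]))
```

A larger alpha tracks recent values more closely; a smaller one smooths more aggressively.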

Though it's not clear to me how competitive the contest was. On multiple occasions, essentially amateurs climbed into the top 10 within a few months of beginning their efforts. I have a model that beats the best published ones, and I don't really have any clue what I'm doing.

Overcoming Bias Commenter:

Same comment as Phil Goetz. Empirically, on the high-stakes, ultra-competitive Netflix Prize, the best performance came not from simple models but from combining many models ranging from simple to complex. But conversely, most statisticians who tried their hand at the Netflix Prize did much worse than the best performers. We may be looking at inadequate incentives, inadequate controls for overfitting, prestigious folk who are not the best performers, prestigious folk who overuse complex and impressive models with inadequate checking, or it may just be an empirical fact (though it would surprise me, and I would have expected the opposite) that the machine learning community has its act together and the statistical learning community doesn't.
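A minimal sketch of what "combining many models" can mean in its simplest form: a ridge-regularized linear blend of two models' predictions. Everything here is toy data for illustration, not an actual Prize submission; the two-model closed form just solves the 2x2 normal equations directly:

```python
def ridge_blend(p1, p2, y, lam=0.1):
    """Solve (X^T X + lam*I) w = X^T y where X's columns are p1 and p2."""
    a = sum(x * x for x in p1) + lam
    b = sum(u * v for u, v in zip(p1, p2))
    d = sum(x * x for x in p2) + lam
    c1 = sum(u * t for u, t in zip(p1, y))
    c2 = sum(v * t for v, t in zip(p2, y))
    det = a * d - b * b  # positive: the penalized Gram matrix is invertible
    return (d * c1 - b * c2) / det, (a * c2 - b * c1) / det

# Two toy models' predictions for four ratings, plus the true ratings.
model1 = [3.1, 4.2, 1.9, 4.8]
model2 = [2.9, 4.0, 2.2, 5.0]
truth = [3.0, 4.0, 2.0, 5.0]
w1, w2 = ridge_blend(model1, model2, truth)
blend = [w1 * u + w2 * v for u, v in zip(model1, model2)]
print(w1, w2, blend)
```

The winning teams blended dozens to hundreds of models, but the combination step itself can be this pedestrian: fit weights on held-out predictions, with the ridge penalty keeping the weights from overfitting.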

