Justin Santa Barbara’s blog

Databases, Clouds and More

Open Competition

I recently entered Netflix’s latest crowdsourcing competition, the Netflix OSS prize. Similar to their million-dollar prize for a 10% improvement in their recommendation algorithm, this one offered (smaller) prizes for enhancements to their open-source software. The first competition started a whole industry of competitions for crowdsourcing answers to tough problems: Kaggle, 99designs, even GE got in on the fun. I think this competition will likely encourage a similar movement: we’ll see lots of competitions that aim to give a boost to open source projects.

First, a bit of background. I’m a big believer in open source software and clouds. I submitted an entry based around making a more Machiavellian Chaos Monkey (I added 14 new ways instances could fail). I’m honoured to have been selected as one of the finalists. But don’t worry, this isn’t a request for votes (other than starring my repo, I don’t think there’s any way to show your support, and even then it’s a tough field!).

I think the competition is most interesting because of the way it was conducted. To enter, you had to fork the repo on GitHub, adding in your submission and links to your code, also on GitHub. That meant that every entry showed up in the network graph, even during the competition, and you could easily click around to see what everyone else was working on. It’s still up there today, and every entry should remain public forever. I’m certain none of this was accidental. It’s a new way to run a competition, and I would love to hear the story of what the lawyers said when this was proposed, but kudos to Netflix for getting it done!

There were a few very minor things that I think should be polished up should this be repeated. There was obviously an incentive to enter late in the competition. Open source is normally better when everyone is releasing early and often; this encourages greater collaboration. I think the majority of the entries were finalized in the last week or two. Some of them obviously had a lot more than a week’s work go into them - IBM ported the entire Netflix OSS suite to their own preferred stack. It would be great to encourage earlier submissions. It could be as easy as saying that, say, 20% of judging would be based on when the entry was received. The original Netflix prize did an excellent job of encouraging early submissions (though not with collaboration in mind): there was a hidden dataset which forced people to submit their work so they could get feedback; the first to hit 10% would win it all; and they awarded interim prizes along the way. So I think there are solutions here!

More problematically, collaboration is much trickier when there’s a “winner”. Along with $10k in cash, there’s the chance to attend the AWS conference in November, and there’s only one ticket available per category. Teams could enter, but they were responsible for sharing the prize among themselves. This is a much harder problem to solve.

More important than these minor issues were the things that Netflix/GitHub/AWS got absolutely right. Primarily that was being 100% open. It was great that everything had to be open source (Apache licensed) and completely open for everyone to see (on GitHub). It removes any concerns about IP for everyone - if you’re not comfortable making everything public, then don’t enter. (It’s my variant of the New York Times rule: if you wouldn’t want to see it on GitHub, you shouldn’t enter it into a competition.) This extreme transparency gives everyone a sense of fairness and - more importantly - fun.

In terms of what Netflix got out of it, I’m sure they would have liked to have more submissions (I’d guess they had about 40-50 entries). But I think that’s missing half the point: it’s like a slogan competition (“I want to go to Hawaii because…”). The goal is not to get a slogan for the next commercial “on the cheap”, but to get you to think about all the great things about Hawaii.

Netflix OSS gained a whole lot of awareness, probably 100 developers who are now familiar with the codebase to the point of having made code changes, and probably 1000 who have read (bits of) it. Given the T&Cs said that entering was not an offer of employment (a rather odd inclusion, at face value), they are probably hoping to hire a few people as well, and now have some great leads. And running a competition like this certainly gives their “developer cred” a nice little boost. For $100k (I’m guessing AWS are covering the AWS-related prizes), I think Netflix did very well.

I think this could be the first of many similar “open competitions”. It may even spawn a few companies (like Kaggle) to help companies run these prizes. Time will tell!