Affect Labs

The Eurovision Problem

Eurovision, to those uninitiated in this glorious annual ritual of self-parodying and ultra-serious Europop, is technically a European version of The X Factor. Only with a voting system Congress would be proud of, with countries picking local favourites, allocating points, and the winner being the country garnering the most points overall.

Voting has traditionally been a wondrous mish-mash of politics and geography combined with points directly proportional to the cheesiness of the act. For example, the UK always tends to vote Ireland up, and vice versa; Eastern European countries pat each other on the back, and Germany never gives points to France.

(This is somewhat of an exaggeration, but as a teenager, Eurovision was how I learnt international politics and, later, the French for ‘Bosnia-Herzegovina’.)

So, I went on what I’ll fondly call a public transport experiment on my way home from the airport on Saturday night. This is relevant, because it means I was on a bus in the middle of nowhere for most of Eurovision. Fortunately, thanks to Twitter, it was as if I was sat at home in front of the TV.

Nothing really comes close to Twitter for event coverage when you’re away from civilisation. It really was amazing. Snark and sarcasm from celebrities coupled with genuine patriotism, descriptions of astounding costumes, and mildly concealed insults (it’s not xenophobia if it’s Eurovision, right?).

The First Eurovision Problem

The title of this post is misleading; there are two Eurovision problems (discounting the fact the UK didn’t come last, disappointingly).

The first was my simple inability, when on the move, to follow only certain Eurovision-related tweets. I heard that @Schofe and @Wossy were providing great commentary, but their tweets got lost in the flood of either ‘all updates’ or ‘all eurovision’; I didn’t have a way to see ‘all (friends + eurovision)’.

Nor did I, using Tweetie, have a way to temporarily define a group of people whose updates I wanted to follow. I was tempted to create a new Twitter account just to follow a few people and get Eurovision that way, but figured it would be too awkward to do this by phone.
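For what it’s worth, the ‘all (friends + eurovision)’ view is just an intersection of two filters: who wrote the tweet, and what it mentions. Here is a minimal sketch in Python. The `Tweet` class, the `friends_and_topic` function and the sample timeline are all made up for illustration; nothing here talks to the real Twitter API or to Tweetie.

```python
from dataclasses import dataclass


@dataclass
class Tweet:
    author: str  # screen name without the leading @
    text: str


def friends_and_topic(tweets, friends, keywords):
    """Keep tweets written by someone in `friends` that mention any keyword."""
    friends = {name.lower() for name in friends}
    keywords = [k.lower() for k in keywords]
    return [
        t for t in tweets
        if t.author.lower() in friends
        and any(k in t.text.lower() for k in keywords)
    ]


# Invented sample timeline: two commentators plus a stranger.
timeline = [
    Tweet("Schofe", "Norway's fiddle player is winning #eurovision"),
    Tweet("Schofe", "Lovely dinner tonight"),
    Tweet("stranger", "#eurovision is on!"),
    Tweet("Wossy", "That costume! #Eurovision"),
]

# A temporary group, defined just for the evening.
picked = friends_and_topic(timeline, {"Schofe", "Wossy"}, ["eurovision"])
for t in picked:
    print(f"@{t.author}: {t.text}")
```

The group is just a set passed in at call time, so it can be thrown away afterwards without unfollowing anyone.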

Of course, this is all my own fault for following so many people in the first place, so I suppose the solution would be to do a grand Twitter prune, or set up a second account just for information overload. But that doesn’t really seem in the spirit of it.

The Second Eurovision Problem

This is a fun and meaty information filtering problem that relates to realtime predictions in a big way. I didn’t have a chance to watch Hubdub/Betfair/etc change as the show was going on, but I dearly wish I had.

Clearly, as people see the various acts, their opinion of the best one changes. Thus the probability of a certain act winning changes over time as more variables enter the equation. This is also affected by hype and, sadly, the aforementioned geography and politics (although I think this is less the case than it used to be).

With Eurovision, it’s likely a safe bet to say that as each act plays, it introduces a new probability of that act winning into the overall picture, and also affects the probability of previous acts’ victories. (Note that a bad song may increase the previous acts’ chances!)

The probabilistic question is whether to start off assuming each act is equally likely to win, or to break time into discrete units and assume that only acts that have played so far have a probability of winning (so at t=2, with two countries having played, the only possible winners are those countries).

Perhaps a mix of the two, mirroring the viewer’s tendency to ‘pick a favourite’ but also look forward to certain new acts. This combines hype and visibility. Once the act has played, it becomes a known variable, affected by future acts but also far more tangible than before.
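One way to picture that mix, as a rough sketch rather than a real model: give every act a pre-show ‘hype’ weight, and once an act has played, blend in a score for how it actually went down. The function name, the `trust` parameter and all the numbers below are my own assumptions for illustration.

```python
def win_probabilities(hype, observed, played, trust=0.7):
    """Blend pre-show hype with post-performance scores, then normalise.

    hype:     dict of act -> non-negative prior weight (every act has one)
    observed: dict of act -> non-negative score, for acts that have played
    played:   set of acts that have performed so far
    trust:    assumed share of a played act's weight taken from its score
    """
    weights = {}
    for act, prior in hype.items():
        if act in played:
            weights[act] = trust * observed[act] + (1 - trust) * prior
        else:
            weights[act] = prior  # still an unknown: hype only
    total = sum(weights.values())
    if total == 0:
        raise ValueError("at least one act needs a positive weight")
    return {act: w / total for act, w in weights.items()}


# Invented numbers: equal hype, Norway goes down well, the UK does not.
hype = {"Norway": 1.0, "UK": 1.0, "Iceland": 1.0}

before = win_probabilities(hype, {}, set())
after_norway = win_probabilities(hype, {"Norway": 3.0}, {"Norway"})
after_uk = win_probabilities(
    hype, {"Norway": 3.0, "UK": 0.5}, {"Norway", "UK"}
)

for label, probs in [("t=0", before), ("t=1", after_norway), ("t=2", after_uk)]:
    print(label, {act: round(p, 3) for act, p in probs.items()})
```

With these made-up numbers Norway’s share goes up again at t=2, purely because the UK’s song landed badly, which is the ‘a bad song may increase the previous acts’ chances’ effect. Setting every hype weight to zero and `trust` to 1 gives the other extreme, where only acts that have played can win.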

Would you feel more or less comfortable putting your money on Norway before or after they have played? How about after everyone has played? At what point would you commit £100 to a win - or would you always hedge and put some on your second favourite?

Where this becomes a really interesting problem, for me, is in social media analysis. I was very tuned into the Twitter conversation around Eurovision, although due to information overload and 3G black holes I didn’t see or digest every single tweet. What took place was the pub or living room conversation, on a larger scale.

To what extent did Twitter sentiment about the Eurovision participants reflect the overall voting?

To what extent did it reflect the voting of the United Kingdom?

To what extent was it wildly wrong?

The last of these is interesting. Given country X, with a ridiculous Euro-trash entry in some language nobody’s ever heard of, with pink hotpants and glitter and other ridicule-worthy aspects, the conversation traffic about it might be surprisingly positive. It would certainly be disproportionately high given the entry’s quality.

But does this reflect perhaps a sympathy vote? If everyone’s ridiculing Nowherezikstan, does that stop at Twitter snark or does it translate into points? How can we tell the difference between genuine excitement, ridicule just because it’s bad, and ridicule because it’s so bad it’s actually quite good?

Back to the first two questions. Thanks to Twitter geocoding, we can strip out the UK opinion from everyone else’s, or we can just assume that the majority of English-language tweets about Eurovision will come from the UK. We do need to do some filtering, or else we will just assume our own country wins; as countries can’t vote for themselves, we need to remove that as a possibility.
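The filtering step might look something like the sketch below. It assumes each tweet has already been boiled down to a pair of (country it was sent from, entry it talks about), which is the hard part and is not shown; the `home_support` function and the sample data are hypothetical.

```python
from collections import Counter


def home_support(tweets, home="United Kingdom"):
    """Count mentions per entry among tweets sent from `home`.

    tweets: iterable of (origin_country, entry_mentioned) pairs.
    Mentions of the home entry are dropped, since a country
    cannot vote for itself.
    """
    counts = Counter()
    for origin, entry in tweets:
        if origin == home and entry != home:
            counts[entry] += 1
    return counts


# Invented sample: four UK tweets and one from Norway.
sample = [
    ("United Kingdom", "Norway"),
    ("United Kingdom", "United Kingdom"),  # patriotic, but not a possible vote
    ("United Kingdom", "Norway"),
    ("Norway", "Iceland"),                 # not UK opinion
    ("United Kingdom", "Iceland"),
]

uk_view = home_support(sample)
print(uk_view.most_common())
```

The same function with a different `home` gives any other country’s view, so comparing the result against that country’s actual points becomes a loop over countries.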

The ultimate question and gold standard involve two things: how do the betting companies do it, and how can we build something that reflects Twitter/online sentiment (think Facebook Connect on a Eurovision live stream) over time, comparing that to votes? It’s like a constant, ongoing, realtime poll that could affect betting as well as simply being a fun way of automatically watching bar charts change as you talk.
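As a sketch of the ‘ongoing poll’ part only (not of what any betting company does), a rolling window over recent mentions would give the numbers behind those changing bar charts. The `RollingTally` class is hypothetical, and it assumes mentions arrive in time order with timestamps in seconds.

```python
from collections import Counter, deque


class RollingTally:
    """Share of mentions per entry over the last `window_seconds`."""

    def __init__(self, window_seconds=300):
        self.window = window_seconds
        self.events = deque()    # (timestamp, entry), oldest first
        self.counts = Counter()

    def add(self, timestamp, entry):
        # Assumes timestamps never go backwards.
        self.events.append((timestamp, entry))
        self.counts[entry] += 1
        self._expire(timestamp)

    def _expire(self, now):
        # Drop mentions that have fallen out of the window.
        while self.events and self.events[0][0] <= now - self.window:
            _, old = self.events.popleft()
            self.counts[old] -= 1
            if self.counts[old] == 0:
                del self.counts[old]

    def shares(self):
        total = sum(self.counts.values())
        if total == 0:
            return {}
        return {entry: n / total for entry, n in self.counts.items()}


# Invented stream of (seconds since start, entry mentioned).
tally = RollingTally(window_seconds=300)
for ts, entry in [(0, "Norway"), (100, "Norway"), (200, "Iceland")]:
    tally.add(ts, entry)
early = tally.shares()

tally.add(350, "Iceland")  # the mention at t=0 now drops out of the window
later = tally.shares()
print(early, later)
```

Redrawing the chart from `shares()` every few seconds, and logging the same numbers alongside the final votes, would give the comparison over time.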

Of course, there are problems associated with the IR/NLP side of things. How do we know which entry a tweet refers to? How do we track @-conversations to measure agreement with sentiment? (e.g. @Wossy says Norway’s act is amazing and 100 people say “@Wossy I agree!!!!”). How do we strip out the sarcasm, or do we? Do we build a probability model specific to Eurovision and refine it after every act by looking at the sentiment, or do we simply track mentions and normalise? Do we even normalise?
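To make the simplest of those options concrete (‘track mentions and normalise’, plus a crude take on the “@Wossy I agree!!!!” case), here is a sketch. Everything in it is an assumption for illustration: the `ENTRIES` alias table, the tweet dictionaries with `id`, `text` and `reply_to` fields, and the rule that a reply containing the word ‘agree’ counts as one more mention of whatever the original tweet was about. It makes no attempt at sarcasm, and “I don’t agree” would fool it.

```python
import re
from collections import Counter

# Hypothetical alias table: lowercase word -> entry name.
ENTRIES = {"norway": "Norway", "iceland": "Iceland", "azerbaijan": "Azerbaijan"}


def entry_of(text):
    """Return the one entry a tweet names, or None if it names zero or several."""
    lowered = text.lower()
    found = {
        name for alias, name in ENTRIES.items()
        if re.search(rf"\b{re.escape(alias)}\b", lowered)
    }
    return found.pop() if len(found) == 1 else None


def mention_shares(tweets):
    """Normalised mention counts, crediting 'agree' replies to the parent's entry."""
    by_id = {t["id"]: t for t in tweets}
    counts = Counter()
    for t in tweets:
        entry = entry_of(t["text"])
        is_agreement = re.search(r"\bagree\b", t["text"].lower()) is not None
        if entry is None and is_agreement and t["reply_to"] in by_id:
            entry = entry_of(by_id[t["reply_to"]]["text"])
        if entry is not None:
            counts[entry] += 1
    total = sum(counts.values())
    return {e: n / total for e, n in counts.items()} if total else {}


# Invented conversation.
conversation = [
    {"id": 1, "text": "Norway's act is amazing", "reply_to": None},
    {"id": 2, "text": "@Wossy I agree!!!!", "reply_to": 1},
    {"id": 3, "text": "Iceland was lovely", "reply_to": None},
    {"id": 4, "text": "@Wossy I disagree", "reply_to": 1},
]

shares = mention_shares(conversation)
print(shares)
```

Tweets naming more than one entry are skipped rather than guessed at, which throws away data but avoids crediting the wrong country.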

There are answers to some of these problems, varying from the complicated to the simple (“We don’t”). Some of it is more experimental, to see what’s the best result. And some of it is just academic fun :)

So, next year, if you see an interactive, realtime, constantly-changing chart of who’s going to win Eurovision, you know who created it — and some of the hurdles along the way!