Sunday, September 20, 2009

Why Did the Poisson Model Get the Manchester Derby So Wrong?

I'll warn you in advance that this article is very much a thought-process piece: I work through why the Poisson model made the predictions it did and what the potential issues could be. If you'd rather not see how the sausage is made, I suggest you pass on this one and wait for my next article, which should be out tomorrow (tonight if you're in Europe). In that article I will actually discuss the match itself and give my subjective thoughts.

The Poisson predictions for the Manchester derby seem to have been pretty far off - the model predicted a low-scoring affair, with 0-0 the most likely result. It's possible that it was a fluke and that if these teams played a million times the matches would mostly be low scoring. It's also early in the season, and the model's predictions with just four or five matches played probably aren't all that reliable. The small sample does help in one way, though: it makes it easier to work through how the model works.
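For anyone who hasn't seen how a Poisson model turns a pair of expected-goals numbers into a scoreline forecast, here is a minimal sketch. The expected-goals inputs below are made up for illustration, not the model's actual figures; the real model derives them from the schedule-adjusted factors discussed later.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """Probability of scoring exactly k goals given an expected-goals rate lam."""
    return lam ** k * exp(-lam) / factorial(k)

def scoreline_probs(home_xg, away_xg, max_goals=6):
    """Joint probability of each scoreline, treating the two teams'
    goal counts as independent Poisson variables."""
    return {(h, a): poisson_pmf(h, home_xg) * poisson_pmf(a, away_xg)
            for h in range(max_goals + 1)
            for a in range(max_goals + 1)}

# Illustrative inputs only. When both sides are rated this strong
# defensively, 0-0 comes out as the single most likely scoreline.
probs = scoreline_probs(home_xg=0.9, away_xg=0.7)
most_likely = max(probs, key=probs.get)
print(most_likely, round(probs[most_likely], 3))  # (0, 0) 0.202
```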

Where did the low-scoring prediction come from?

The model takes into account home-field advantage and higher-order relationships in the schedule - how good your opponents' opponents are, and so on - but I will simplify things here by looking only at how well these teams did against their opponents compared to how other teams did against those same opponents.

Before this match began (which is when the prediction was made), Manchester United had played 5 games. Their opponents were Arsenal, Birmingham, Burnley, Tottenham and Wigan, and against them United averaged a respectable 2.2 goals per match while conceding just 0.6. Man City had played Arsenal, Blackburn, Portsmouth and Wolverhampton, averaging 2 goals for and 0.5 against.

Let's look at how many goals each of those opponents scored and allowed per match, on average, in their other matches before this weekend:

Manchester United (2.2 for, 0.6 against) versus:
Arsenal - 4 for, 2 against
Birmingham - 0.5 for, 0.75 against
Burnley - 0.25 for, 2.5 against
Tottenham - 2.75 for, 1 against
Wigan - 1 for, 1.75 against
Average - 1.7 for, 1.6 against

Manchester City (2 for, 0.5 against) versus:
Arsenal - 3.67 for, 1.33 against
Blackburn - 1.33 for, 1 against
Portsmouth - 0.75 for, 2.25 against
Wolverhampton - 0.75 for, 1.5 against
Average - 1.625 for, 1.52 against

Note that Arsenal's numbers differ between the two lists: the figures in United's list include Arsenal's match against City, while the figures in City's list include Arsenal's match against United. Note also that the average team right now scores (and concedes) about 1.42 goals per match.
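For anyone who wants to check the arithmetic, here is a small sketch that recomputes those schedule averages from the per-opponent figures listed above. This is the simplified, unweighted version I'm using for this walkthrough, not the model's real calculation.

```python
# Per-opponent (goals for, goals against) per match, against
# everyone other than the Manchester club in question.
united_opps = {
    "Arsenal":     (4.0,  2.0),
    "Birmingham":  (0.5,  0.75),
    "Burnley":     (0.25, 2.5),
    "Tottenham":   (2.75, 1.0),
    "Wigan":       (1.0,  1.75),
}
city_opps = {
    "Arsenal":       (3.67, 1.33),
    "Blackburn":     (1.33, 1.0),
    "Portsmouth":    (0.75, 2.25),
    "Wolverhampton": (0.75, 1.5),
}

def schedule_average(opponents):
    """Unweighted average of the opponents' scoring and conceding rates."""
    n = len(opponents)
    gf = sum(f for f, _ in opponents.values()) / n
    ga = sum(a for _, a in opponents.values()) / n
    return round(gf, 3), round(ga, 3)

print(schedule_average(united_opps))  # (1.7, 1.6)
print(schedule_average(city_opps))    # (1.625, 1.52)
```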

The point here is that the Manchester clubs played opponents that, taken collectively, scored and conceded at rates above the league average. That made the clubs' well-below-average goals-conceded numbers look very impressive, and it made their strong scoring numbers look a bit less impressive. As a result, the model concluded that this was a match between two elite defensive teams that are good, but not exceptional, at scoring. Two teams that are world-class defensively and good but not great at scoring relative to the best clubs in the world - that's a common situation in the knockout stage of the Champions League, and the outcome there tends to be just what the model suggested for this match: both legs of the tie tend to be low-scoring, and often one of them is 0-0.

So that explains why the model (naturally, I'll say "the model" made the predictions when they go like this, and that "I" made them when they're spot on) thought this would be a low-scoring match: it saw two teams that are incredibly good at stopping the other team from scoring and merely good at scoring themselves. That raises another important question: should we really conclude from the results that those adjectives describe these teams? That's a tough call. On one hand, these teams did allow roughly a third of the goals that their opponents' opponents did while scoring only 50% more. That certainly points to great defense and good offense.

On the other hand, Arsenal accounts for much of the opponents' average goals scored, and each club also played a very weak team that is responsible for a large percentage of the goals their opponents have allowed. Of Manchester United's 5 matches, 2 were against teams that, in their small sample of other matches, had not been very good at scoring, managing just half a goal and a quarter of a goal per match. United also played two good attacking teams, Arsenal and Spurs, and one team a little below average in scoring. City faced Arsenal, one team about average in scoring, and two teams that averaged only three-quarters of a goal per match. Looking at those numbers, it seems that each team had a couple of soft spots on its schedule where a decent team would be capable of keeping a clean sheet much of the time. For both clubs the goals-conceded rates against those opponents were impressive, but not necessarily amazing. This is a place where sample size kicks in, because any team can go on a four- or five-match run where it seems impossible to score on them. That's not my point, though. I'm saying that if you look at the results of individual matches instead of aggregating everything, the defensive records don't look quite as impressive.

The story at the other end is similar. Both teams played some fairly stingy opponents, but a couple of defensively poor teams push the aggregate numbers up. The effect isn't as pronounced at that end, though.

Suggestions for a Better Model

What I've described is a simplification of what the model actually does - it doesn't just average the goals allowed by the opponents - but it does something pretty close. Each team's scoring factor is its total goals scored divided by the sum of its opponents' defensive factors; for that sum, opponents the team played at home have their factor multiplied by the home-field factor. As a result, a team that plays one good opponent and one weak opponent will appear to have had a schedule about as difficult as a team that plays two mediocre opponents.
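Here's a sketch of roughly how such a factor calculation can be implemented. The details are simplified stand-ins rather than the exact code: the home-field value of 1.3 is a placeholder, the defensive update just mirrors the scoring one, and the factors are left un-normalized.

```python
def fit_factors(matches, teams, home_factor=1.3, iters=50):
    """Rough sketch of the factor calculation described above.

    matches: list of (home_team, away_team, home_goals, away_goals)
    Returns {team: (scoring_factor, defensive_factor)}.
    """
    attack = {t: 1.0 for t in teams}
    defend = {t: 1.0 for t in teams}
    for _ in range(iters):
        for team in teams:
            goals_for = goals_against = 0
            opp_def_sum = opp_att_sum = 0.0
            for home, away, hg, ag in matches:
                if team == home:
                    goals_for += hg
                    goals_against += ag
                    # Played at home: the opponent's factor gets the
                    # home-field multiplier in the denominator sum.
                    opp_def_sum += defend[away] * home_factor
                    opp_att_sum += attack[away]
                elif team == away:
                    goals_for += ag
                    goals_against += hg
                    opp_def_sum += defend[home]
                    opp_att_sum += attack[home] * home_factor
            if opp_def_sum:
                attack[team] = goals_for / opp_def_sum
            if opp_att_sum:
                defend[team] = goals_against / opp_att_sum
    return {t: (attack[t], defend[t]) for t in teams}
```

The repeated-update loop is the simplest way I know to handle the chicken-and-egg problem that every team's factor depends on every other team's.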

That summing behavior compounds the previously discussed simplification that, to the model, all goals are equally important. While I have argued that over the course of a season goal differential is a very strong indicator of points, in the short run it seems wrong to rate two teams as equal when one wins 4-0 against a bad team and loses 0-2 to a good team while the other gets two 2-1 wins over average teams. I think this will even out as more matches are fed in, but with these small samples it can be a big problem and can badly misrate teams with unbalanced schedules.
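A quick toy version of that example, with made-up defensive factors chosen so the two schedules sum to the same difficulty: the 4-0-and-0-2 team and the two-2-1-wins team come out with identical scoring factors, because the model only ever sees goal totals over summed schedule difficulty.

```python
# Made-up illustrative defensive factors (higher = leakier defense).
def_factor = {"bad": 1.5, "good": 0.5, "average": 1.0}

team_a = [("bad", 4, 0), ("good", 0, 2)]          # 4-0 win, 0-2 loss
team_b = [("average", 2, 1), ("average", 2, 1)]   # two 2-1 wins

for name, schedule in [("Team A", team_a), ("Team B", team_b)]:
    goals = sum(gf for _, gf, _ in schedule)
    difficulty = sum(def_factor[opp] for opp, _, _ in schedule)
    print(name, goals / difficulty)  # both print 2.0
```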

I am still working out the kinks of a model that I think could be an improvement. The idea is to look at how well each team does in each individual match compared to how many goals its opponent scored and allowed on average, and then, to make a prediction, combine those per-match figures with the average goals scored and allowed by the upcoming opponent. I'll work that all out and compare its predictions to those of the Poisson model.
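Here's a first pass at what that could look like, with the caveat that the simple ratio averaging and the lack of a home-field adjustment are placeholder choices I haven't settled on; a real version would also need to guard against tiny or zero denominators in these small samples.

```python
def per_match_form(team_matches, opp_averages):
    """For each match, compare what the team did to what its opponent
    typically does, then average those per-match ratios.

    team_matches: list of (opponent, goals_for, goals_against)
    opp_averages: {opponent: (avg_goals_for, avg_goals_against)}
    """
    att_ratios, def_ratios = [], []
    for opp, gf, ga in team_matches:
        opp_gf, opp_ga = opp_averages[opp]
        att_ratios.append(gf / opp_ga)  # scored vs. what they usually concede
        def_ratios.append(ga / opp_gf)  # conceded vs. what they usually score
    n = len(team_matches)
    return sum(att_ratios) / n, sum(def_ratios) / n

def expected_goals(team_form, opp_average):
    """Combine a team's per-match attacking form with what the
    upcoming opponent concedes on average."""
    att_form, _ = team_form
    _, opp_ga = opp_average
    return att_form * opp_ga
```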
