Thursday, December 3, 2009

Stat Series: Fouls

I'll now continue the stat series by looking at fouls. Like corners and time of possession, about which I'll write in the future, fouls are often used as an indicator that one team was a lot better than the other in a match. A defender usually fouls a player with the ball because he cannot otherwise make a tackle, often because the attacker is making a run past him or receiving the ball in a dangerous area. Other than a few exceptions like fouling the keeper on a corner, fouls tend to be called on you when your team is getting outplayed, at least for the couple of seconds before they occur. But are fouls really a strong indicator of performance in a match?

If that is the case then we should see a link between the number of fouls committed by each team and the match result. If getting fouled a lot more often than fouling your opponents indicates that you are playing better, and it's a given that playing better leads to winning, then getting fouled more often should be well correlated with getting results.

Looking at the data, I was quite surprised to find that there is practically no connection between the number of fouls called for each team and the results of matches. In other words, foul statistics seem to not indicate anything at all in terms of which team was better on a given day. My data set is all matches from the previous two seasons in the English, Spanish, Italian, German and French top flight. I may expand the dataset for this study but frankly I didn't see the point after seeing what the results were for the last two years.

I looked at it a couple of different ways. The first was just to look at the correlation between foul differential (home fouls minus away fouls) and goal differential (home goals minus away goals). A positive correlation would indicate that fouling more often than your opponent tends to go with outscoring them. A negative correlation would be more in line with what you would expect: if you get fouled more often than your opponent, you will tend to outscore them. If the correlation coefficient, which varies between -1 and 1, is very close to zero, that indicates there is little linear relationship between the two. As it turned out, that was the case. The correlation coefficient was -0.01278. In other words, the data indicates that there is essentially no link between the number of fouls committed by each team and the goal difference in a match.
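For anyone who wants to try this on their own data, here is a minimal sketch of the calculation described above. The `pearson` helper and the `matches` list are hypothetical names, and the five matches are made-up numbers for illustration only, not rows from the actual data set.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up matches, illustration only:
# (home fouls, away fouls, home goals, away goals)
matches = [
    (12, 15, 2, 1),
    (18, 10, 0, 0),
    (9, 14, 1, 3),
    (16, 16, 2, 2),
    (11, 13, 3, 0),
]

# Differentials as defined in the text: home minus away.
foul_diff = [hf - af for hf, af, _, _ in matches]
goal_diff = [hg - ag for _, _, hg, ag in matches]

r = pearson(foul_diff, goal_diff)
print(f"correlation between foul and goal differential: {r:.3f}")
```

With only five invented matches the printed value means nothing; the point is just the shape of the calculation, which would be run over every match in the sample.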

To look into this further, I broke the matches up into three groups: those where the home team committed more fouls than the away team, those where the away team fouled more and those where the two teams committed the same number of fouls. In the sample there were 1,582 matches where the home side committed more fouls. In those, the home team won 729 times, drew 415 and lost 438, an average of 1.64 points per match. The away team fouled more in 1,824 matches. In those, the home team went 853-484-487 (wins-draws-losses) for an average of 1.66 points per match. There were 246 matches where the two teams were called for the same number of fouls. In those the home side went 122-55-69 for 1.71 points per match.
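The points-per-match figures follow directly from the records, using the usual 3 points for a win and 1 for a draw. A short sketch, where `points_per_match` and the dictionary labels are names made up for this example:

```python
def points_per_match(wins, draws, losses):
    """Average league points per match: 3 for a win, 1 for a draw, 0 for a loss."""
    played = wins + draws + losses
    return (3 * wins + draws) / played


# Home-team records (wins, draws, losses) quoted in the text.
records = {
    "home side fouled more": (729, 415, 438),
    "away side fouled more": (853, 484, 487),
    "same number of fouls": (122, 55, 69),
}

for label, (wins, draws, losses) in records.items():
    played = wins + draws + losses
    ppm = points_per_match(wins, draws, losses)
    print(f"{label}: {played} matches, {ppm:.3f} points per match")
```

Printed to three decimals, the middle record comes out at about 1.668, so the 1.66 quoted above looks truncated rather than rounded. Either way the gap to the first group is roughly two hundredths of a point.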

For the fellow nerds out there, there is no statistical significance when comparing 1.66 points per match and 1.64. More importantly though, there is no actual significance! "Winning the fouls battle" was only worth two hundredths of a league point per match. That is absolutely nothing. If you gave that up in every match over an entire season, it would still cost you less than one point on average: 0.76 for a 20-team league, where each team plays 38 matches. In a previous article on goal differential and points, I showed that there is a very strong link between goal differential over a season and points. Using the formula from that article, giving up 0.02 points per match every match for the entire season is about the equivalent of conceding one more goal overall. The stats indicate that the difference in points from getting fouled more or fewer times in a match is almost certainly due to random noise and not a real effect. My point is that even if it were due to something real, the difference is so small that it isn't relevant.
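The post doesn't say which significance test was run, so the following is only one conventional way to check the claim: a two-sample z-test on points per match, built from the win-draw-loss records quoted earlier. The function names are hypothetical, and the normal approximation is an assumption of this sketch. It also reproduces the season-long cost arithmetic.

```python
from math import erfc, sqrt


def summarize(wins, draws, losses):
    """Return (matches, mean points, sample variance of points) for a W-D-L record."""
    n = wins + draws + losses
    mean = (3 * wins + draws) / n
    squared_dev = (
        wins * (3 - mean) ** 2 + draws * (1 - mean) ** 2 + losses * (0 - mean) ** 2
    )
    return n, mean, squared_dev / (n - 1)


def two_sample_z(record_a, record_b):
    """Difference in mean points (b minus a), z statistic and two-sided p-value."""
    n_a, mean_a, var_a = summarize(*record_a)
    n_b, mean_b, var_b = summarize(*record_b)
    diff = mean_b - mean_a
    std_err = sqrt(var_a / n_a + var_b / n_b)
    z = diff / std_err
    # Two-sided p-value under a normal approximation (assumption of this sketch).
    p_value = erfc(abs(z) / sqrt(2))
    return diff, z, p_value


home_fouled_more = (729, 415, 438)  # home record when the home side fouled more
away_fouled_more = (853, 484, 487)  # home record when the away side fouled more

diff, z, p_value = two_sample_z(home_fouled_more, away_fouled_more)
print(f"difference: {diff:.3f} points per match, z = {z:.2f}, p = {p_value:.2f}")

# Season-long cost of giving up 0.02 points in every match of a 20-team league.
teams = 20
matches_per_team = 2 * (teams - 1)  # home and away against every other team
season_cost = 0.02 * matches_per_team
print(f"{matches_per_team} matches, about {season_cost:.2f} points over a season")
```

Under these assumptions the z statistic sits well inside the usual 1.96 cutoff for significance at the 5% level, which agrees with the conclusion above.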

To put the final nail in the coffin, I looked only at matches where one team or the other committed a lot more fouls. In those where the away team committed 5 or more fouls more than the home team, the home side went 442-273-260 for an average of 1.64 points per match. When the home side committed 5+ fouls more than the away team they went 330-196-208 for an average of 1.62 points per match. So the difference is the same as in the other case. I find it curious, though it doesn't matter at all, that the home teams did better in matches where the numbers of fouls committed by the two teams were close than at either extreme. So even in the extreme case, it's extremely likely that the difference is just due to noise, and even if it isn't, the difference is so small that it doesn't matter.

There is no evidence whatsoever of a connection between the number of fouls committed by each team in a match and the result. If in the future you see me using the difference in fouls to make the case that one team was better than the other despite the result, please yell at me in the comments.

In the future I will look at full-season information for different teams as well as different leagues. It is possible that there is a link between fouls and results in individual matches but that it is restricted to certain leagues due to style of play or officiating.


  1. Do you think that if instead of considering the total number of fouls committed in a game by a team you considered only the number of fouls committed in dangerous areas of the pitch, say for example the attacking third, that there might be a correlation between fouls committed and goals conceded? Intuitively it seems like the more fouls you commit in dangerous areas of the pitch the more set piece opportunities you give your opponent and the more likely they are to score. But that assumes that the chance of scoring from a set piece is higher than from the open play situation in which you foul.

    To take it back a level further I sometimes find myself thinking about when it is +EV for defenders to commit a foul and when it isn't. I know the question is too vague to even answer, I mean how can you even quantify with any accuracy what the chance of scoring a goal from any given open play situation is, but defenders with experience gain an understanding of when to foul and when not to foul and it would be interesting to know whether the conventional wisdom accrued over years of playing experience agrees with these hypothetical EV calculations.

    I'm a little preoccupied by this Adem Ljajic transfer situation right now so that was probably a bit rambly.

  2. Thanks for the comment.

    That would be quite interesting. I'm not aware of any recording of the location of fouls. There is one for shots, by soccernet, but I haven't seen one for fouls. I suspect there would be a difference looking just at dangerous fouls. If there wasn't then it would indicate that fouling is good because there will be some difference strictly based on the chances created.

    Fouling on purpose is something I've thought of as well. There are some cases where it's obvious, like handling the ball to stop a breakaway. Stopping a goal-scoring opportunity is another, although there it can be tougher since you are likely to be sent off which would have to be in your mental expected value equation.

    For analysis purposes the free-flowing nature of the game is a problem. Though it's one of our favorite things about the game, the lack of recorded stops and starts makes it tougher to attempt to answer such questions than in baseball or even American football.

  3. Jared,
    Do you think it makes sense to look at yellow/red cards as a possible predictor (with or without fouls)? I think this data should be available.


  4. Hmm, I feel that yellow/red cards should have some correlation, but it will be very slight.


  5. I haven't run everything, but the next article in the series is on cards. I'll leave a bit of suspense and not reveal anything. It should come out next week as I'll be writing about the World Cup draw happening shortly and giving a weekend preview.