Friday, January 29, 2010

Half-Time Report: La Liga Española

The Spanish league has finally reached the halfway point.

Most Impressive

This is pretty obvious: it's FC Barcelona. They are in my view the best team in the world and it's not close. While the league as a whole seems tougher, and their rivals in Madrid in particular appear to be, they keep rolling along. The Blaugrana are 5 points clear at the top of the points table and 9 goals better in goal differential. They are top two in pretty much every statistical category. In the less-important stats they are 5th best in foul differential (times fouled - fouls committed) and 2nd in corner differential. While they are second behind Real Madrid in shot differential, more importantly Barça are best at shot-on-target differential, averaging nearly four more shots on target than their opponents in each match. To make matters worse for the rest of the league, they have the highest ratio of goals to shots-on-target (shooting percentage) in the league to go along with having the most shots. They have been doing well defensively and in goal as well, with the second lowest shooting percentage against.

Last season at this stage they had 50 points and a GD of 46, so they are one point and 7 goals behind last year's pace. They did slow down a bit in the second half; if they can keep up their current pace they will break last year's record of 87 points and a goal differential of +70. About the only bad thing you can say about them is that they won't win the treble this year as they were eliminated from the Copa del Rey by Sevilla, largely thanks to the heroics of Sevilla goalkeeper Andrés Palop.
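To see what "keeping up the current pace" would mean, here is a rough sketch of a straight-line projection. It assumes 19 matches played (the halfway point of a 38-match season) and uses the current totals implied above, 49 points and a +39 goal differential; the function name is just for illustration.

```python
# Straight-line projection of a half-season total to a full season.
MATCHES_PER_SEASON = 38  # 20-team league, home and away


def project(total_so_far, matches_played, season_length=MATCHES_PER_SEASON):
    """Scale a running total to a full season at the current per-match rate."""
    return total_so_far * season_length / matches_played


# Implied by the text: one point and 7 goals behind last year's
# 50 points and +46 at the same stage.
points_now = 50 - 1
gd_now = 46 - 7

print(project(points_now, 19))  # projected points
print(project(gd_now, 19))      # projected goal differential
```

At that rate the projection comes out at 98 points and +78, both above last year's 87 and +70. It is only a pace figure and ignores schedule strength and any second-half slowdown.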

Biggest Disappointment

Here it has to be Atlético de Madrid. There has been much turmoil in the boardroom at the Calderón and that seems to have worked its way onto the pitch. Atleti last season managed to pip Villarreal and Valencia for the fourth spot and they qualified for the Champions League at the start of the current campaign. Unfortunately, they were a failure in that competition, having busted out fourth in their group, not even qualifying for the Europa League. In the league they have actually been decent over the last month or so and that has pulled them from the relegation zone up to midtable. They are still well out of the fight for a spot in Europe. Their best hope is in the Copa del Rey as they've made the semifinals.

Real Madrid

They don't fit either of the above, but I believe it's illegal to write an article on the Liga without spending time on Real Madrid. Actually, I think there is a lot of interesting stuff to say about them, so they'll get more space than Barça despite not being as good.

I have to give the Madridistas credit as things appear to have gone much more smoothly than I thought they would. They have pretty much sorted out the Raul situation: his role is as a sub or as cover for injured players, as he will likely be this weekend. They also have managed to play a reasonably balanced lineup, often featuring both Lass and Xabi Alonso in the midfield. Comparing this team to last year is like night and day, as it probably should be since they shelled out so much coin. Last season at this point they had 38 points and a goal differential of +14. Right now they are at 44 points and a GD of +30. Last season they averaged 2.18 goals per match; so far they are at 2.32 this season. I mentioned in the preview for the season that I thought their issues were on the defensive side of things and they have greatly improved there, going from 1.37 goals against per match last year to just 0.74 in the first half of this season. They have taken 0.8 more shots-on-target per match than last year and allowed their opponents 0.9 fewer. In rankings terms, they have the 5th fewest shots-on-target allowed, a drastic improvement from last year when they were only 13th best. Any season other than this one and the last they would be clear favorites at the top of the table with this level of play.

Something often brought up in the Spanish media is that Real Madrid are dependent on whoever their biggest star is. A few years ago it was Zidane and that has obviously now shifted to Ronaldo. I personally viewed this as fairly ridiculous given the surplus in attacking talent, but this year at least there seems to actually be evidence of this. The season thus far is a pretty good time to look at this because he has missed several matches due to the ankle injury he suffered playing for Portugal. He has played in 12 matches and missed 7 in the league. 7 is a small number but it at least gives something to work with; normally in the first half of a season the top players play nearly every match. Ronaldo missed the matches against Atletico de Madrid, Getafe, Racing, Sevilla, Sporting, Valencia and Valladolid. Those are tougher opponents on average than those he played against. While they got about the same number of points per match, 2.3, whether he played or not, the stats indicate that Real Madrid were far better when he was in the lineup. They average half a goal more per match when he plays and concede about two thirds of a goal per match fewer. When Ronaldo played they averaged 1.8 more shots on target per match and allowed their opponents an impressive 2.6 fewer shots-on-target per match. While he's far from a great defensive player this makes sense due to balance. If he makes them far more dangerous in attack, both when they are able to build it and on the counterattack, then that forces the opponents to be more defensive and you have fewer shots coming at Casillas.

The stats and results-only models both allow me to take into account the difference in opponent strength. Using them it is striking how much better Real Madrid were in the 12 matches featuring CR9 than the 7 matches without him. Using the stats model, Real Madrid with Ronaldo are just behind Barcelona with 88.6 expected points*. Without him they are 5th best with 65 expected points. Going off the results model, Real Madrid with Ronaldo would still be behind Barcelona but it would be close. They have an expected goal differential of 70 according to the model. Without Ronaldo that number drops to 39. Again I'll caution that the sample sizes here aren't large enough to say for sure, but going by that it seems that Ronaldo is worth about 31 goals in goal differential to the Merengues, which corresponds to about 20 league points. It'll be interesting to see how they fare over the next two matches, as he has been suspended for breaking Patrick Mtiliga's nose in their match against Malaga last week.
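The with/without gaps quoted above are simple differences, and a short sketch makes the arithmetic explicit. The variable names are illustrative only, and the points-per-goal figure is just what the "31 goals, about 20 points" conversion implies, not the regression formula itself.

```python
# Figures quoted in the text for Real Madrid with and without Ronaldo.
stats_model_points = {"with": 88.6, "without": 65.0}  # expected points
results_model_gd = {"with": 70, "without": 39}        # expected goal differential

points_gap = stats_model_points["with"] - stats_model_points["without"]
gd_gap = results_model_gd["with"] - results_model_gd["without"]

# The text converts the 31-goal gap to "about 20" league points,
# which implies roughly this many points per goal of differential.
implied_points_per_goal = 20 / gd_gap

print(round(points_gap, 1), gd_gap, round(implied_points_per_goal, 2))
```

The two models are separate estimates, so the stats model's gap of about 23.6 expected points and the results model's roughly 20 points need not match exactly.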

Luck

As I have said in previous articles, luck plays a role in a lot of ways and the two I write a lot about are the relationship between points and goal differential as well as performance in front of goal. The former appears to be essentially all luck, while converting shots into goals, or keeping your opponent from doing the same, are certainly reliant on both skill and luck. I'll write about this more in future.

When it comes to performance in close matches, which is really what getting a lot of points out of goal differential is all about, Deportivo, Tenerife and Athletic de Bilbao seem to be above expectation while Malaga is the lone unlucky standout. Going off the regression formula in my previous article on goal differential and points, the average team with Deportivo's goal differential would have 28 or 29 points, while they have 34. They currently sit fifth, so that is the difference between being in the thick of the race for a European spot and sitting midtable. Tenerife have been bad so far with a goal differential of -20. Going by past years, teams around there at this stage of the season usually have around 13 points so they are about 4 points closer to staying up than they should be. Athletic de Bilbao are similarly about 4 points over expectation. On the other end, Malaga are between 5 and 6 points below where they should be. Historically, teams with a goal differential of -5 at this point have had an average of nearly 23 points, 6 clear of the relegation zone, but they find themselves even on points with Tenerife in 18th. Last season, much as it pains me to say, Betis were very unlucky in this regard and got relegated largely because of it. Their fellow Andalusians could potentially go that same way.
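The "luck" figure in this section is just actual points minus the points an average team with the same goal differential would have. A minimal sketch using the numbers above follows; the regression formula from the earlier article is not reproduced here, so the expected values are taken straight from the text. The 17 points for Tenerife and Malaga are inferred from "about 4 points" above an expected 13 and "even on points with Tenerife", so treat them as assumptions.

```python
def luck(actual_points, expected_points):
    """Points above (+) or below (-) expectation for a given goal differential."""
    return actual_points - expected_points


# name: (expected points from goal differential, actual points)
teams = {
    "Deportivo": (28.5, 34),  # "28 or 29" expected, 34 actual
    "Tenerife": (13, 17),     # actual inferred from the text
    "Malaga": (23, 17),       # "nearly 23" expected; actual inferred
}

for name, (expected, actual) in teams.items():
    print(f"{name}: {luck(actual, expected):+.1f}")
```

Because Malaga's expectation is "nearly 23" rather than exactly 23, their gap lands between 5 and 6 points rather than a full 6.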

In terms of performance in front of goal, there are a few teams near the top that have struggled at one end or the other. Sevilla, my favorite club, are 4th best according to the stats model but only 6th best in the table. That is largely due to only being 14th best at shooting percentage allowed. Perhaps I'm being optimistic, but I think that's mostly due to luck. Palop has been in good form lately and with him in goal last year Sevilla were 3rd best at shooting percentage against. Villarreal have similarly struggled - they are only 17th best at shooting percentage against compared to 8th last season. I believe largely because of that they are in 9th instead of a few spots up where the stats model puts them. Getafe are currently 7th in the table, just outside of the Europa League. The stats model puts them 5th. It seems their issues have been at the attacking end as they are 15th in shooting percentage. With them I'm far less convinced that it's bad luck though, as they were 19th in the same category last season.

In the relegation battle, the stats model rates Zaragoza 5 spots higher than their current position in 19th. They rate dead last when it comes to shooting percentage against, allowing their opponents to score on a dreadful 44% of their shots-on-target. They were promoted from the second division, so they almost certainly should be near the bottom, but with numbers that bad I'm sure they've been unlucky as well. On the other hand, Racing and Espanyol should be in the relegation zone according to the stats model. This surprised me since they are 7 and 16 goals better in goal differential than the best team in the bottom three. For Espanyol it appears to be a combination of a bit of good luck when it comes to close matches, they are about 2.5 points above expectation there, and they are doing a bit better at stopping shots than one would expect from a team in their position. They are 13th best though so it's not too extreme. Racing is a different story as they are 3rd best in shooting percentage and 9th best in shooting percentage allowed. A big factor there is the play of this year's big revelation Sergio Canales. The talented 18-year-old has scored 5 goals on just 6 shots on target. Otherwise their lineup does not feature guys you would expect to be near the top when it comes to putting away chances, so they'll likely cool down.

Predictions

Before going into new predictions, I want to just go over a few I made for the season just after the season started.

Firstly, I predicted that the top 6 would be the same as last year. While predicting the top 6 is never easy, it looks like I'm off there as Atleti don't look like climbing back in there. I still think there's a good chance that both Sevilla and Villarreal move up, so 5 out of 6 would be in there. I predicted that Getafe would do much better than last season and that is clearly the case. Last year they ended even on points with Betis who were relegated. Right now they are 7th, 3 points out of the Europa League. Halfway through the season they are only 12 points short of where they were at the end last year. Osasuna I predicted would be in a safer position as they finished just one point clear of relegation. They are currently 6 clear and looking significantly better than several teams below them. Last season Almeria finished 4 clear of relegation; I picked them for the drop or to at least be even closer this year than that. Right now they are only 1 point clear so it's up in the air. Finally, I picked Sporting Gijon to be relegated. They were incredibly fortunate not to be last year despite finishing 14th; they had the worst goal differential in the league at -32. They were particularly bad in defense having conceded 79 goals. This season they have turned things around and are a legit midtable team. Overall I think I did pretty well with the predictions, but there were definitely a couple misses in there.

Here are the new predictions:

Barcelona will win the league. This shouldn't surprise anyone. Real Madrid still have some chance certainly, but Barcelona are better and have a five-point cushion. I don't expect any surprises here.

Real Madrid and Valencia will get the other two automatic Champions League spots. The way I see the league, and the models agree, there is Barcelona, then a fair drop down to Real Madrid, then a similar drop to Valencia. After that there is another gap to the next group of teams like Sevilla and Mallorca. Right now I think Real Madrid are close to a lock to finish in the top 2 and Valencia are almost there when it comes to the third spot. Right now they have a 6 point lead on Sevilla. If they can get a point in the Sanchez-Pizjuan this weekend they'll be in fantastic shape for that important third spot.

Sevilla will finish fourth. I'm a bit wary to make this prediction given that I'm biased and it goes in the direction of that bias. Right now Sevilla are just a point back of Mallorca and Mallorca are also 4 goals better in goal differential. Nonetheless, I think Sevilla have had well more than their fair share of injury problems and are better than the numbers indicate for the first half.

Mallorca and Villarreal edge out Depor and Getafe for the two Europa League spots. Right now Villarreal are sitting 9th, 7 points out of the Europa League. I think they'll claw their way up, jumping over Athletic de Bilbao who will drop down to the middle of the table. That leaves a four-club race for two spots which I think will come right down to the last week of the season. I give the edge to Mallorca and Villarreal, but I could see any of those four playing in Europe next year.

Tenerife will stay up, Almeria will drop. To be honest, I don't really believe this. As with the previous half-time reports, I'm trying to pick a team to stay up that is currently in the relegation zone and a team that is currently clear to drop. I think these are the most likely two sides to flip, but I think Tenerife are more likely to get relegated than Almeria.



*This is what they would average if they played a large number of seasons at the level shown by the stats and results from this year so far. This model appears to overestimate the chances of a result for bad teams against good teams, so differences at the top are probably larger in reality.

Monday, January 18, 2010

EPL Halftime Report (long)

I meant to do this last week, but better late than never as they say.

Most Impressive

I suppose it's Chelsea. The blues are three clear in points, one ahead of Manchester United but with a match in hand, and two goals better in differential than Arsenal, who sit second in that category. I'm not sure if others share it but the feeling I have from watching the league is that the top clubs are less dominant this year than in the last few seasons. Just glancing at the table though the opposite appears to be true. Chelsea are on pace for a goal differential of around 60 this year, Arsenal a few less and Man United at around 50. Over the last several seasons the club with the best goal differential has had somewhere between 50 and 60 more goals than their opponents. It looks like the big clubs are cruising along just fine.

Going forward Chelsea's pace will likely slow down due to the knee injury suffered by Michael Essien. Essien is in my view not only the best and most important player at Chelsea, but he is probably the best player in the league. Looking at the season thus far, he has played in 14 of Chelsea's 21 Premier League matches. While the sample sizes of 14 and 7 are very small, comparing the blues' results with and without him in the lineup gives interesting results. When he played, Chelsea outscored their opponents by 1.79 goals per match. When he didn't, they only scored an average of 1.29 more than their opponents. 1.29 would be behind both Arsenal (1.52) and Manchester United (1.36). The most important team stat is shots-on-target differential (SOTD). In the matches where Essien played, Chelsea averaged 11.3 shots-on-target and 4.2 shots-on-target against for a SOTD of 7.1 per match. When he didn't play they slipped to 7.0 shots-on-target and 2.9 for their opponents for an average difference of 4.1. So when he played they were about half a goal and 3 shots on target better than when he didn't. Adjusting for the strength of opponents gave similar results. It'll be interesting to see how things actually shake out.
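The with/without comparison above boils down to a few subtractions. Here is a small sketch using only the per-match figures quoted in the paragraph; the function name is made up for illustration, and no opponent adjustment is attempted.

```python
def sotd(shots_on_target_for, shots_on_target_against):
    """Shots-on-target differential per match."""
    return shots_on_target_for - shots_on_target_against


# Chelsea per-match averages from the text.
with_essien = sotd(11.3, 4.2)     # 14 matches
without_essien = sotd(7.0, 2.9)   # 7 matches

sotd_drop = with_essien - without_essien
goal_margin_drop = 1.79 - 1.29    # goal differential per match, with vs without

print(round(with_essien, 1), round(without_essien, 1))
print(round(sotd_drop, 1), round(goal_margin_drop, 2))
```

That reproduces the 7.1 and 4.1 figures, a drop of about 3 shots on target and half a goal per match. Note that shots-on-target against were actually lower without him; the whole drop comes from the attacking end.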

Most Disappointing

This just has to be Liverpool. Despite selling Xabi Alonso to Real Madrid in the summer, the reds had high hopes for the season. They finished second last year, with the highest goal differential, and their rivals in Manchester sold off the (second?) biggest star in the game. This season little has gone right and their dreams have certainly been tossed and blown. They find themselves out of the Champions League having picked up only one point against the two decent sides in their group, Fiorentina and Olympique Lyon. They lost to both at Anfield, something the commentators likely will ignore next time and every time in the future that they play at home in the Champions League. They are surely a dog to make it into Europe at all this year as Spurs, Man City and Villa all look pretty good and there are only two spots going to those four. It's hard to see Rafa Benitez lasting long; maybe a shakeup is what they need.

Luck

Normally I open the luck section of the report by discussing goal differential and how that is turned into points. Something I have admittedly glossed over with other leagues is injuries. For the English Premier League this season injuries have played a very large role for several clubs. Liverpool were without their three best players for several matches as Gerrard and Mascherano each missed 5 and Torres was out for 7. That may seem pretty bad until you consider Arsenal and Manchester United. Essentially every attacking player for the gunners and defensive player for United that most fans have heard of has been out for a significant number of matches. It is quite impressive that both of those clubs have the depth to get results despite all of that. At the top Chelsea had been in relatively good shape, though Essien had missed some matches with a hamstring injury. Unfortunately with a new and likely long-term injury to such an important player, I don't think we can consider them lucky at this point.

Moving on to goal differential, as I've said repeatedly (and will continue to do so) there is a very strong link between goal differential and points over a season and teams doing well or poorly in points compared to their goal differential seems to be only due to luck. This year there is one result that makes me question that a bit, which is Tottenham's 9-1 win over Wigan. The work I've done on goal differential essentially said that big results will, for the most part, even out over the course of a season. So a team's skill level is best represented by their goal differential but they could do well or poorly in close matches due to luck, and that would mean more or fewer league points. In the case of such a big win, I don't think this holds. Going up to a 4 or even 5 goal margin, I think the extra goals are informative; if two teams have otherwise identical records but A beat four different opponents by 4 goals each and B won those same matches by 5 then I think B is most likely better than A. In going from 6, 7 or 8 goals to 9, I don't think the same thing holds. In other words, I think the usual criticism of goal differential - that it overemphasizes big results - isn't all that great in normal circumstances but with that one extreme result it is valid for those two teams.

Looking at the league, the most unlucky team in this regard are Portsmouth. They are at the bottom of the table but are 15th in goal differential per match. Adjusting for schedule and applying the formula from a previous article on goal differential and points, they are running about 5 points below expectation. Next unluckiest are West Ham who are also running about 5 points low. To make matters worse for these clubs, their relegation rivals appear to be fortunate. Wigan rate as the luckiest. They are running at 7.5 above expectation leaving the huge loss to Spurs alone. Converting it into a 6-1 or even a 5-1 loss makes them still the most lucky with over 5 points more than the average team with their (adjusted) goal differential has had. Wolves, Hull, Blackburn and Burnley are all running between 4 and 5 points above expectation according to the model.
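One simple way to express "converting" a blowout into a smaller result is to cap the margin before adding it to a team's goal differential. This is only a sketch of that idea; the cap value and the function name are assumptions, and it is not necessarily how the adjustment above was actually done.

```python
def capped_margin(goals_for, goals_against, cap=5):
    """Goal margin for one match, limited to +/- cap goals."""
    margin = goals_for - goals_against
    return max(-cap, min(cap, margin))


# Tottenham 9-1 Wigan, seen from Wigan's side.
print(capped_margin(1, 9))          # counted like a 6-1 loss
print(capped_margin(1, 9, cap=4))   # counted like a 5-1 loss
print(capped_margin(2, 1))          # ordinary results are unchanged
```

Summing capped margins over a season gives an adjusted goal differential that keeps the information in 4- and 5-goal wins but stops a single 9-1 from dominating.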

Looking near the top of the table there is nothing too extreme. In a shocking twist, Manchester United appear to have been unlucky despite what you might think due to a certain 6th-minute-of-stoppage-time goal and, well, them being Manchester United. Not to worry though, they are only a point or two below where they should be and there is a lot of season left for Alex Ferguson to turn on his luckbox. Chelsea are two or three points below expectation. Arsenal are the top club that is worst off when it comes to this with between 4 and 5 points fewer than they should have according to their goal differential. Combining those two, if the top three had as close to average luck as possible when it comes to goal differential and points then Chelsea would have 50 or 51 points, Arsenal 49 or 50 points and Manchester United 48 or 49 points. That's pretty similar to where we are now.

Efficiency

Another way I measure luck is to compare each club's spot on the table with where they are according to my stats-based model. Unlike the previous section, there is a major skill component. A big reason for this difference is efficiency in front of goal. I'm actually adjusting the model to account for shooting percentage and shooting percentage against and will probably have the new version ready next week.

The first club that jumps out is again Portsmouth. In my stats model they are actually 10th in the league! My first thought when I saw this is that there must be something seriously wrong with the model. That may be true, as I said I'm adjusting it, but there is good reason to think that Pompey are an average team instead of one of the worst. The main argument is that they are 7th best in both shot and shot-on-target differential. I was quite surprised to see that they have taken more shots than their opponents. They are averaging right around 1 more shot and shot-on-target per match than they allow. Given that they have been outscored by 14 goals in 20 matches, it's clear that something is going wrong in front of one or both goals. It is emphatically both. While it is likely the case that the shots they are taking are less dangerous and maybe they are allowing tremendous opportunities for their opponents, the Portsmouth midfielders shouldn't be too happy with their teammates. They are dead last in scoring percentage, putting in an impressively low 13.7% of their shots-on-target*. That is only about 60% of the league average which is 23.2%. At the other end they are only slightly better at second worst. They have allowed their opponents to score on 28.6% of the shots that make it on target. If shooting was about average for both sides in their matches, they would have a goal differential of +4, some 18 goals better than where they are now. To summarize their season thus far, Portsmouth have been put into such bad financial shape that they are struggling to pay their players, they've been both bad and very unlucky in front of goal at both ends and on top of that they are running really bad when it comes to getting league points for a team with their goal differential while their rivals in the relegation battle have been fortunate. Good times.
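The shooting-percentage comparison can be checked with a couple of lines. The sketch below uses only the percentages and goal differentials quoted above; the shot-on-target counts are not given in the text, so `goals_at_rate` is shown as a generic helper rather than applied to Portsmouth's real totals.

```python
LEAGUE_AVG = 23.2       # league-average shooting percentage (goals per 100 shots-on-target)
POMPEY_FOR = 13.7       # Portsmouth's own shooting percentage
POMPEY_AGAINST = 28.6   # shooting percentage allowed to opponents


def relative_to_league(pct, league_avg=LEAGUE_AVG):
    """A shooting percentage expressed as a fraction of the league average."""
    return pct / league_avg


def goals_at_rate(shots_on_target, pct):
    """Goals expected from a number of shots-on-target at a given percentage."""
    return shots_on_target * pct / 100


print(round(relative_to_league(POMPEY_FOR), 2))      # roughly 60% of average
print(round(relative_to_league(POMPEY_AGAINST), 2))  # well above average allowed

actual_gd = -14
average_finishing_gd = 4  # figure quoted in the text
swing = average_finishing_gd - actual_gd
print(swing)
```

With the real shot-on-target totals for and against, `goals_at_rate` at 23.2% for both sides would give the "average finishing" goal differential directly.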

Another interesting side is Wigan. They are 14th in the table, 12th if you go by points per match to account for the differences in number played. The model puts them 13th best so they are basically right around where they should be if you go by that. They have been nearly as bad as Portsmouth at both ends of the pitch. They are 18th best at converting shots-on-target into goals and the worst at shooting percentage against. Like Portsmouth, they have been some combination of very bad and really unlucky in front of both goals but unlike them Wigan have been fortunate in close matches. That all washes out and it looks like they are about where they should be. If they want to stay up though, they'll need better shooting and goalkeeping from Kirkland.

Stoke City are on the opposite side of the luck wall. They are the worst side in the Premiership when it comes to shot-on-target differential. It's not very close either. Stoke have allowed their opponent to take an average of 3.7 more shots-on-target than they do in a match compared to 2.9 for second-worst Hull. This is hardly new territory for the Potters. Last season they allowed their opponents 155 more shots-on-target than they got. Next worst was Middlesbrough whose opponents took 82 more shots than they did. That is a pretty ridiculous difference and if I only gave you that info you'd be wise to think that Stoke were relegated last year, probably bottom of the table. Both last year and this season thus far the problem has been at the attacking end. Last season Stoke took just 137 shots-on-target and no other team was below 200. This season they are on pace for 134 and the next worst, Hull, will have 181 if they keep up their current rate.

In 2008-2009 Stoke got by due to very good finishing. They had the best shooting percentage in the league. Given the very low number of shots I wonder if this might somehow be part of their strategy - if they are more patient than the other teams and wait for a good scoring opportunity to shoot then that would lead to this pattern. This season they haven't been quite as good but they are 6th best in the league. They are also fourth best at shooting percentage against. Considering that they don't have the talent of the top clubs, these numbers are better than what one would expect from them. I'm sure they have gotten lucky, especially in scoring the number of goals they have the last season and a half on so few shots, but maybe there is something more going on. I personally hope they stay up for at least another couple years because it'll provide more data to see if it is just variance or something deeper.

At the risk of boring the readers that support bigger clubs, I want to also mention Birmingham City. The blues are strong in 8th position, which is better than most would have given them credit for after being newly promoted to the top flight. They are 8th despite being 15th in shot-on-target differential because of luck and apparently good play by their young goalkeeper Joe Hart. They have allowed their opponents a goal on only 13.3% of shots-on-target, the best in the league. In just over half a season he's put up the kind of shot-stopping performance that should get Capello's attention. Maybe not though since David James has been bad in this regard for Portsmouth and is considered the England #1. He also wasn't great last year as Portsmouth finished with the 16th highest shooting percentage against. There is more to keeping than stopping shots, but David James doesn't exactly have a great reputation in those areas.

Moving to the big boys, nothing is far off from expected. The stats model has the top three in the same order the table does right now with Manchester United just a bit better than Arsenal. Chelsea have been good but not great at both ends. They are 7th best at converting shots into goals and 6th best at preventing their opponents from doing so. Arsenal have the best shooting percentage in the league, scoring on 34.4% of their shots-on-target. It seems that Almunia is the player Arsenal fans complain about the most and perhaps that's justified. They are 10th best at shooting percentage against. That's not terrible, but it's easily the worst of the top 3 clubs. Manchester United are 4th best at shooting and have allowed the 5th lowest shooting percentage.

United fans may be interested to know how their performance at the attacking end compares to last season now that they have 100% less Ronaldo. Firstly, they are on pace for 84 or 85 goals this season. Even if they slow down they will likely pass the 68 they scored last season. I should point out that scoring league wide appears to be up. I'm not sure why that is other than that shooting percentages seem higher so maybe the goalies aren't doing as well. Returning to Man U, they are averaging about a shot and a half more per match this season compared to last but they are behind in shots-on-target by that same margin. They are scoring on a much higher percentage of their shots-on-target: 26.3% this year compared to 18.1% last season. Because there is a decent amount of noise in shooting % (expect an article on this shortly), I would say that the data point to a club that is not quite as strong in attack as last season but not far off at all.

Predictions

It will remain a three-horse race for some time. I'm a bit nervous about this since I was flat out wrong in my prediction that Liverpool would still be around at this stage. Arsenal and Man United are working their way out of the injury problems they've had for much of the season. Chelsea, even without Essien, have looked at least as strong as those two and currently have a cushion. It's tough to see any of the three dropping out of the race. It seems likely that at least two of them will still be fighting the last couple weeks. At this point I think Chelsea remain favorites, though I'm not willing to stick my neck out. Any of them could win it.

Spurs, Man City and Liverpool will fight for the two remaining European spots until the final few weeks of the season. I'm a bit on the fence as far as putting Villa in with this group. I think they're easily the most likely of the four to fall off, though more chaos at Liverpool could prove me wrong there. Right now if I had to pick two I'd take them in the order they are now - Tottenham for the Champions League playoff spot and Man City in the Europa League. Liverpool are far from out of it but they are now four points behind and going the wrong way. I think the reds are a dog to qualify for either competition at this point and are in serious trouble for the Champions League.

Portsmouth will stay up. I'm going to make this my bold pick. While it doesn't seem like a good idea to pick a team that is bottom of the table and in such a bad spot off the pitch, there is a good amount of evidence that suggests that Pompey aren't as bad as their table position. I think at the very least they will climb out of the cellar and fight for another year in the top flight until late in the season. If you disagree with me, and even I might, you aren't alone - Portsmouth are currently the team with the shortest odds in the relegation betting market.

Burnley will be relegated, Bolton will stay up. I'm going to leave the Portsmouth prediction out of it and make Burnley and Bolton my teams to flip. At around the quarter mark of the season, I predicted that Burnley would fall from the middle of the table which has happened. They started out strong with early wins over Manchester United and Everton, but have struggled overall. I like them to drop along with Hull and Wolves. Bolton aren't a great side and I think they will probably be in the fight all season, but they are better than those three.

Finally, I want to follow up on my other prediction from the quarter-time report, which was that West Ham would get out of the relegation zone and be well clear of it by the end of the season. So far so good. I think they'll keep moving up under new ownership. Other than picking Liverpool to stay in the title race, my predictions from then are looking good.


* again I'll say that this might be slightly off due to own goals (likely) not being counted as a shot-on-target. For simplicity of language I ignore this. Due to the rarity of an own goal and that they tend to come on good scoring chances I don't think it changes the implications of the analysis at all.

Thursday, January 14, 2010

Derbies: Can We Really Throw Out the Records?

One of the most common clichés in football is that in a derby the records of the teams don't matter. They are thought to be drastically more unpredictable matches in which anything can happen. Is the cliché accurate? Do results from the rest of the season really matter less in a derby?

Another question about them is how important home-ground advantage is. It seems that it could go either way. On one hand, the atmosphere is much more intense and hostile toward the away team. On the other, the travel is greatly diminished because, by definition, they are matches between clubs that are near each other.

Data and Methodology

Fortunately, these questions can both be answered by looking at actual results. To do so, I am using a relatively simple ordered-logit model that is similar to the one I used to look at results from Boxing Day and beyond. First, for each match, I calculate average goal differential for the home and away team in all their other matches. The main variable is the difference. For example, suppose for a match this is 0.5. This means that in all their other matches, the home team's average goal differential is half a goal better on average than the away side's. In a 20-team league, this would be a difference of 18 or 19 in goal differential. To check home-ground advantage I add what is called a dummy variable. It takes on a value of 0 if the match is not a derby and 1 if the match is. To check for the importance of other match results in a derby I include a variable that takes on a value of 0 if the match is a derby and the difference in average goal differential in other matches if it is. I also added controls for different countries.
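For anyone who wants to see the setup concretely, here is a minimal Python sketch of the three main variables for a single match. The function and key names are made up for illustration, the country controls are left out, and the sample numbers are hypothetical; treat it as a picture of the description above rather than the code behind the regression.

```python
def match_features(home_avg_gd, away_avg_gd, is_derby):
    """Build the main regressors for one match.

    home_avg_gd / away_avg_gd: each team's average goal differential per
    match in all of its OTHER matches that season.
    is_derby: True if the match is a city or local derby.
    """
    gd_diff = home_avg_gd - away_avg_gd   # main strength variable
    derby = 1 if is_derby else 0          # dummy: shift in home-ground advantage
    derby_gd_diff = gd_diff * derby       # interaction: 0 unless it is a derby
    return {"gd_diff": gd_diff, "derby": derby, "derby_gd_diff": derby_gd_diff}


# Hypothetical example: home side +0.8 per match, away side +0.3, in a derby.
features = match_features(0.8, 0.3, is_derby=True)
```

The dummy picks up any change in home-ground advantage, while the interaction term picks up any change in how much the teams' records matter.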

The benefit of this approach is that it is able to use aggregate results of other matches. Not only is that necessary to test whether the cliché is true, since it directly references results, but it gives a pretty accurate indicator of the relative strengths of the two teams. Not controlling for this would be even more problematic in a derby since the clubs involved aren't typical. Because they are usually from larger cities, in most of these rivalries one of the clubs is huge, a perennial favorite for the league title. The other tends to be mediocre for a top-flight club.

The data consist of every match for the last 10 seasons (1999-2000 through 2008-2009) in the English Premier League, Spanish Primera Division, Italian Serie A, French Ligue 1 and German Bundesliga 1. To determine which matches were derbies I simply went with those listed at footballderbies.com as either city derbies or local derbies. I left out rivalry matches such as Real Madrid - Barcelona because they do not have the geographical component. I feel pretty comfortable with what they listed for those with which I was familiar. There were 372 such matches.

Home-Ground Advantage in a Derby

I'll start with the advantage of playing at home. Due to the atmosphere and short traveling distance, it's not clear whether playing at home is more beneficial or less so in a derby. This question addresses not just derbies but home-ground advantage in general. It's clear that home sides have a big advantage but it's not clear why since both teams play on the same pitch under the same rules. There are several explanations that all likely play some role: the wear of travel, less familiarity with the pitch and surroundings, a push from the local crowd for the home team and perhaps the away team tightening up for the same reason, the ref being influenced by the crowd etc. In a derby, the familiarity bit is the same as for any other match, reasons based on the crowd are much stronger and those based on travel are much less present.

In the regression, if the coefficient on the derby dummy is positive and significantly different from 0 that would indicate that home-ground advantage is bigger in a derby than in a regular match. The opposite would be the case if it is negative and significantly different from 0. If the coefficient is close to zero then that would indicate that there is no evidence that a derby is different from a regular match when it comes to the benefit of playing at home. As it turns out the value was -0.322 with a standard deviation of 0.098 making it very strongly significant. From this it's clear that home-ground advantage is less important in a derby. I think it's safe to conclude that the burden of travel for the away team plays a far more important role in home teams having an edge than the difficulties that come from playing against a hostile crowd.

Because the model is what is called log-linear, there is unfortunately not a direct linear connection between the coefficient and the result. In other words, I couldn't say something like "if it is a derby then the home team is 10% less likely to win". The reason for this is that the change depends on the relative strengths of the teams involved. For now I'll leave out the other question and give a graph of results showing only the difference in the advantage the home side gets in a derby compared to a regular match, leaving out any potential "throw the records out" effect where weaker teams perhaps do better than usual. This graph is for the English Premier League but it would look very similar for other leagues:



The horizontal axis gives the difference in average goal differential for all other matches between the home and away teams. To get an idea of scale, a difference of 2 would be a team near the top of the table playing a team near the bottom. So +2 would be the club near the top at home, -2 would be the one near the bottom playing in a familiar ground. A difference of 1 is about right for a match between a team near the middle of the table and one near the top or bottom. Obviously it varies year to year but that should give you a general idea. As you can see, the effect is largest when the two sides are closest together. The vertical axis is the expected (average) points that the home team would get from the match. The biggest difference in home-ground advantage between a derby and a regular match is when the away team is just a bit better, 4 goals or so on the season. That leads to a difference of 0.22 expected points per match. For most matches it is between 0.15 and 0.22 expected points. In extreme cases where one team is a lot better than the other the difference is about a tenth of a point in expectation.

To put this in perspective, let's compare the edge for the two types of matches when the teams are equally skilled. In this case, according to the model the home team would be expected to win in a regular match 47.6% of the time, the away team 23.5% and there is a 28.9% chance of a draw. This makes the expected points for the home side 1.716 and away 0.995 for a difference of 0.722 points. In a derby with two evenly matched opponents, according to the regression, there is a 39.7% chance of a home win, 29.8% of an away win and 30.5% of a draw. That makes the expected points 1.497 for the home team and 1.198 for the away team and a difference of 0.298. So going from a regular match to a derby reduces the home team's advantage from 0.722 points to 0.298 or by about 59%. That's a pretty big difference and much more than I was expecting.
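The arithmetic above can be sketched in a few lines of Python. This assumes a standard ordered-logit form with two cutpoints, and the cutpoints are backed out from the quoted regular-match probabilities rather than taken from the fitted model, so they are an assumption. Shifting the index by the derby coefficient of -0.322 then lands very close to the derby probabilities quoted above; third-decimal differences in the expected points come from working with rounded percentages.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def outcome_probs(index, cut_away, cut_home):
    """Ordered-logit probabilities (home win, draw, away win) for a latent index."""
    p_away = sigmoid(cut_away - index)
    p_home = 1.0 - sigmoid(cut_home - index)
    return p_home, 1.0 - p_home - p_away, p_away


def expected_points(p_win, p_draw):
    """Three points for a win, one for a draw."""
    return 3.0 * p_win + p_draw


def home_edge(probs):
    """Expected points for the home side minus expected points for the away side."""
    p_home, p_draw, p_away = probs
    return expected_points(p_home, p_draw) - expected_points(p_away, p_draw)


# Assumption: cutpoints inferred from the quoted even-match probabilities.
CUT_AWAY = logit(0.235)        # regular match: 23.5% away win
CUT_HOME = logit(1.0 - 0.476)  # regular match: 47.6% home win
DERBY_COEF = -0.322            # derby dummy coefficient quoted in the text

regular = outcome_probs(0.0, CUT_AWAY, CUT_HOME)
derby = outcome_probs(DERBY_COEF, CUT_AWAY, CUT_HOME)

regular_edge = home_edge(regular)
derby_edge = home_edge(derby)
reduction = 1.0 - derby_edge / regular_edge  # share of the home edge lost in a derby
```

Because the shift is applied inside the logistic function, the same -0.322 moves the probabilities by different amounts depending on how lopsided the match is, which is why the effect can't be stated as a single percentage.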

Can we really throw the records out?

So derbies are different in that the home side has less of an advantage. But can we really throw out other results when two rivals collide?

Not surprisingly, it isn't close to true. The coefficients for the difference in goal differential for all matches and for derbies are not anywhere near enough together in size for one to conclude that the strength of the two teams makes no difference. In fairness, people making this claim (hopefully) aren't being literal but instead are arguing that upsets are more likely in a derby than a regular match. Is that the case?

As it turns out there isn't evidence that this is true either. In the derby matches studied, inferior teams did get better than expected results but they were well within the range that could be chalked up to randomness. For the fellow nerds, the value of the coefficient was -0.062 with a standard deviation of 0.134. Assuming there is nothing special about derbies in terms of bad teams getting better results, there is just over a 64% chance that the outcomes are this extreme due to variance alone. Again, if derbies are as predictable as regular matches then it is more likely than not that the results would be similar to these or that the underdogs would do even better. The standard cutoff varies by discipline but in the social sciences it tends to be a 5% chance to assume that something is statistically different from 0. With a p-value of 64% the data present no evidence whatsoever that a club's record in other matches is any less predictive in derbies.
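For the fellow nerds who want to check the figure, here is a minimal sketch of the calculation. It assumes the quoted "standard deviation" is the standard error of the coefficient and uses a normal approximation with a two-sided test; both are assumptions on my part about how the number was produced. The same function is applied to the home-ground coefficient from earlier for comparison.

```python
import math


def two_sided_p(coef, std_err):
    """Return (z, two-sided p-value) for coef / std_err under a standard normal."""
    z = coef / std_err
    return z, math.erfc(abs(z) / math.sqrt(2.0))


# Derby x strength interaction: the "throw out the records" effect.
z_upset, p_upset = two_sided_p(-0.062, 0.134)

# Derby dummy: the change in home-ground advantage.
z_home, p_home = two_sided_p(-0.322, 0.098)
```

Under those assumptions the interaction comes out at roughly 0.64, in line with the figure above, while the home-ground coefficient is far below the usual 5% cutoff.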

Let's step back for a second. Statistical significance is great for publishing in an academic journal, but this is a damn football blog. Suppose it is significant or, more accurately, nobody cares if it is or not. How big is the implied benefit to the bad team?

Not much. In the most extreme matches, the difference in goal differential between the best and worst clubs is usually around 90 goals for the leagues with 20 teams. This would correspond to about 2.35 per match. Let's go a bit more extreme to a nice round 2.5. In this extreme case, the model says that the home team in a derby will go from an 87.4% chance of winning down to 85.7%. The chance of a draw goes up from 8.7% in such a regular match to 9.9% in a derby. The horrible away side goes from a 3.9% chance in a regular match to 4.5% in a derby. So the dominant team's chances of winning go down by less than 2 percentage points and they are only about half a percentage point more likely to lose. This is a difference of only 0.04 expected points. Remember that this is the most extreme situation. Most seasons there aren't two teams with goal differentials that far off and I suspect that in recent years there hasn't been a derby that featured that kind of difference in quality. Despite all that, the worst team only does just a little bit better. So even if nerds like me were satisfied that it passed statistical tests, the difference between a derby and a regular match when it comes to the likelihood of an upset isn't actually significant.

Conclusion

To me, the most interesting thing is the first bit on home-ground advantage. I was surprised that the results indicate that once you account for the quality of the teams it is less important in a derby to be playing at home than a regular match. I thought it would be about the same or even a bit stronger. Perhaps this is projection because the atmosphere is very impressive and hostile and it's hard for those of us who aren't professional footballers to imagine being able to play well in front of 50,000 rabid fans that hate you for the shirt you're wearing. According to the last 10 years of results, it appears that the crowd doesn't matter nearly as much as the burden of traveling. I didn't find the second part very surprising. I think this is simply a case of people remembering big upsets and forgetting the others where the favorite won or just focusing on crazy stuff that happened on the pitch instead of the fact that the better team won.

Monday, January 11, 2010

Winter Break Review: Ligue 1

In previous writings I have mentioned that the French Ligue 1 is very tight with a lot of good teams. Bordeaux have now separated themselves and are 9 clear at the top. I think that margin is larger than it should be, but that's a big lead at this stage. After that though there isn't much to separate teams and the top 15 are all pretty strong.

Most Impressive

Perhaps I should rename this section "Let Me Tell You about the Team at the Top of the Table" since I'm going with Bordeaux. While Sevilla are the only club I support, I will admit to being a bit partial to Bordeaux as they are my favorite side to watch outside of the Spanish League. The clock is always ticking for French clubs in that situation, but they have a lot of quality throughout the side. While they don't get the attention of Gourcuff and Chamakh, the defense has been excellent all season. They lead the Ligue 1 with only 12 goals conceded and they were very impressive in the Champions League group stage, having only conceded two goals in 6 matches - including two matches each against Juventus and Bayern Munich.

I will say that Bordeaux seem to be running above expectation. Teams with their goal differential have about 39 points on average, so they have about 4 points more than expectation. Again, in studying goal differential I have found little to no evidence that this is anything other than luck so they've been fortunate there. More surprising to me is that my stats model has them only third best behind Lille and Marseille. The model puts heavy weight on shots and shots on target. Bordeaux are only fourth best in shot differential and 3rd best when it comes to shots on target. Since they have the best goal differential in the Ligue, it's clear they've excelled in front of goal at one or both ends of the pitch. Looking at the numbers they've been good at converting chances into goals, 6th best in the league, and fantastic defensively and in goal. They have the league best shooting percentage against having allowed a goal on less than 20% of their opponents' shots. They have conceded 5 or 6 goals fewer than they would have if they had allowed their opponents a league-average shooting percentage. While randomness or luck play a role I think they owe a good portion of that to Cédric Carrasso. Last season with Rame in goal Bordeaux were 11th at shooting percentage against, allowing a goal on about 26.5% of their opponents' shots. That is with largely the same set of defensive players. Carrasso had a similar record at Toulouse, though they have kept it up this season having played 4 different keepers. So it seems that Bordeaux have probably gotten a bit lucky to only concede 12 but they've gotten fantastic play from Carrasso.

Honorable mention for most impressive club goes to Montpellier. They, too, are fortunate to be where they are but it's impressive that they appear to be one of the best 5 or 6 clubs in the top flight a season after getting promoted from the Ligue 2.

Most Disappointing

I don't think there really is one here. It seems Lyon aren't as good as they have been, but halfway through the season they are three points out of the Champions League so it's far too early to call them a disappointment. I thought Marseille would compete for the league and they're 11 points out with a match in hand so I guess they're the closest. More on them in a bit.

Luck and/or Efficiency

The first thing I look at to measure luck is goal differential. I've written a lot about goal differential and luck before; the main point is that there is a strong link between goal differential and points and it seems that the differences are mostly, if not entirely, luck. The three clubs that are more fortunate in this area are Montpellier, roughly 5 points over expectation, Sochaux, 4 to 5 points, and Bordeaux at right around 4 points over expectation. Bordeaux are double winners really since the third unluckiest team when it comes to goal differential and points is Lille who have run about 4 points under expectation. Based on goal differential, Bordeaux would be expected to be between 1 and 2 points ahead of Lille but they are 9 clear. The two teams that have had the worst luck are Paris Saint-Germain, between 5 and 6 points below expectation, and Grenoble at around 4 points below expectation.

Looking at stats, mostly based on shots and shots on target as well as goals, the biggest difference between the model and league table is Toulouse. Toulouse are currently 14th, 15th if you go by points per match, but the model puts them 10th. While I don't think I've seen Toulouse, the data points to a few reasons for this. One is that they have run about 3 points below expectation for a team with their goal differential - they are 11th in goal differential per match. They have also had problems converting on their chances. Toulouse are 16th in the league in percentage of shots on target that are goals. Despite losing their goalkeeper to Bordeaux, they still have conceded few goals for the number of shots on target allowed. They are second there. So Toulouse are sitting near the relegation zone instead of midtable because of below average performance in close matches and some combination of bad luck and bad play in front of goal. This is negated to some degree by good fortune and play defensively and at goalkeeper.

The team near the top with the worst luck according to the stats model seems to be Marseille. The model rates them second, above Bordeaux, but they are well out of the title hunt. OM are best in the league in shot and shot-on-target differential. Despite that they find themselves only third in goal differential, 10 goals behind Bordeaux. They are right around average at converting their shots into goals, but have allowed more goals per shot on target than any team in the league. I found this quite surprising to be honest because it certainly seems like Mandanda is a very good keeper. Last season, Marseille were right around league average in this category. So it seems like they have gotten unlucky, and figure to concede fewer goals in the second half of the season, but I wonder if I've been overrating Mandanda a bit.

Another top club that the model rates higher than their position in the table is PSG. This appears to be fully explained by their subpar performance in close matches (again, I think this is mostly due to randomness), which has led to few points for a team with their goal differential. Les Parisiens have been pretty good with shots at both ends. They are 8th best at shooting percentage and 4th best at shooting percentage against. This isn't surprising since they have better than average players in attack and defense.


On the fortunate side of things are Auxerre and Montpellier. Like PSG, their place is largely based on differences between goal differential and points. To see how luck can be such a big factor, let's compare these three teams. I think most everyone would agree that PSG are ahead of those other clubs. By coincidence they have all played 12 matches that were decided by one goal or that ended in a draw. In their 12 matches, PSG are 2-5-5 for just 11 points. Auxerre are 5-5-2, 20 points, and Montpellier 7-3-2 for a huge 24 points. Despite being nearly unanimously regarded as the best of the three, PSG have gained 9 points fewer than Auxerre and a barely credible 13 fewer than Montpellier in close matches. There is a lot of variance in football.
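The close-match comparison is simple enough to check by hand, but here is a short sketch of it anyway. The records are the ones quoted above; the function and variable names are just for illustration.

```python
def points(wins, draws, losses):
    """League points from a win-draw-loss record (3 for a win, 1 for a draw)."""
    return 3 * wins + draws


# Records in the 12 matches decided by one goal or ending in a draw.
close_records = {
    "PSG": (2, 5, 5),
    "Auxerre": (5, 5, 2),
    "Montpellier": (7, 3, 2),
}

close_points = {club: points(*record) for club, record in close_records.items()}
gap_to_psg = {club: pts - close_points["PSG"] for club, pts in close_points.items()}
```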

Predictions

Bordeaux will win the league relatively comfortably. Though they probably are running over expectation, they are the best side in the league and have a 9-point lead at the midpoint of the season. I don't expect a surprise.

The top 3 will be Bordeaux, Lille and Marseille. I think these are the best three clubs in the Ligue 1. I think Marseille will jump over Montpellier and there will be a decent gap between the top three and the next tier.

The fight for fourth will come down to the wire. This isn't much of a prediction. Frankly I'm not willing to go out on a limb because I could easily see Montpellier, Auxerre, Lyon, PSG, Lorient or Stade Rennes finishing fourth. Even one of the next set of teams could go on a heater and take it. If there is a repeat of last year in the Coupe de France and one of the top sides doesn't win it, there will be a war for the Europa League spot and several clubs will be disappointed to be left out. At least this year, France deserve an extra spot as the best couple of clubs that finish out of the Europa League spots would be well above average for that competition.

Le Mans will get relegated and Saint Etienne will stay up. For (almost) every league I'm picking a team in the relegation zone to stay up and a team in the clear at the moment to get relegated. Here I'm going admittedly weak by flipping the 17th and 18th teams. Based fully on looking at stats, it seems that the bottom three teams are the worst in the league with St. Etienne clearly the best of that lot. Grenoble actually look better than Boulogne so I'll make them climbing out of the cellar a mini prediction to make the bottom of the table vaguely interesting. They are 6 points down and have played an extra match so they have some ground to make up. Le Mans seem to be pretty bad as well and are 6 points back of Nice so they got chosen for relegation.

Wednesday, January 6, 2010

Winter Break Review: Bundesliga

Continuing the winter-break series, I move to Germany.

Most Impressive

Though they sputtered a bit toward the end with some disappointing draws, I've got to go with Bayer 04 Leverkusen. It's tough to go to another side when there is an undefeated team at the top of the table with the highest goal differential. Looking at stats, Leverkusen also topped the table in shot differential (they took 6.88 more per match than their opponents) and shot-on-target differential (3.29 more per match than their opponents).

The new manager and some players being brought in seem to have helped. At the back, Hyypiä appears to still have it at 36. They've also gotten some contribution from Schwaab and the return of Reinartz. They are currently tied with Schalke for fewest goals allowed, quite a difference from last season where they finished 7th in that category. While they may well start allowing more goals, they are on pace to concede just 26, 20 fewer than they let in last season.

Up top, Stefan Kießling has put 12 in. Not only is he ahead of everybody else in a league with a lot of attacking talent, that equals his total from all of last season. While his luck is probably a bit better than last year, looking at the stats it seems to be more about getting and creating more scoring chances. He's on pace to get 30 more shots and 20 more on target than last season. I don't want to go too far with just a half season of data, but given his age, 25, it's certainly possible that he's coming into his peak.

Most Disappointing

It was looking like Bayern for a while but they have been playing great football and pulled themselves back into the race. I'll take Stuttgart. Last season they finished 3rd. Expectations weren't extremely high since they lost Mario Gomez last summer, but they should certainly not be even on points with a team in the relegation zone at this stage. It's even more amazing when you consider their solid run in the Champions League. They got out of their admittedly easy group in second place with 9 points in 6 matches. Compare that to 16 points in their 17 Bundesliga matches. Obviously the variance is high in a 6-match group format, but having seen 3 of those matches they played like a pretty good side against opposition that is significantly better than nearly all of their domestic opponents.

The reason behind their struggles is some combination of bad luck and bad play in front of goal at both ends of the pitch. They have taken 59 more shots than their opponents (3.47 per match) with 15 more on target (0.88 per match). Despite that, Stuttgart have allowed 7 more goals than they've scored. They are dead last at converting shots-on-target into goals* and only 15th best in shooting percentage against. I think luck is definitely a factor but it looks like they are struggling to replace Gomez and are not getting good enough play out of Lehmann. I'm no manager, but I would strongly consider replacing the 40-year-old netminder.

Luck or Efficiency

The first indicator of a team being lucky or unlucky that I look at is their goal differential and points. Looking at past data, there is a very strong correlation between the two and it seems that differences are all due to luck, or nearly so.

The first team that jumped out was Werder Bremen. According to the goal-differential model which adjusts goal differential for differences in difficulty in schedule (which at this point just means a small drop for the clubs that have played an extra home match and increase for those with an extra one away) they have been running over 5 and a half points below expectation. A team with their goal differential should have nearly 34 points on average and they have 28. They have 7 draws and in matches decided by a goal they have 1 win and 2 losses. So in 10 close matches they have 10 points. Considering they have won 6 matches by 2 or more goals and only lost 1 by that margin, that's not very good. Others would probably say that they need to do better in close matches; I think it comes down to getting unlucky.

While not as extreme, Hoffenheim are also running below expectation in points for a team with their goal differential. They are at 25 points and the average team with their goal differential would have nearly 30. In contrast to Bremen, Hoffe don't have a lot of draws, but they have 4 losses by a single goal to go with their 2 wins by one and 4 draws. Given their 5 wins and no losses by 2 or more goals, they've certainly been unlucky in close matches and should have more points.

The three teams at the other side of the luck spectrum are Freiburg, who should be about 4 points closer to the relegation zone, and the two Ruhr rivals Dortmund and Schalke. Dortmund are 3 points above expectation and Schalke 2.5. Those aren't too extreme, but both clubs will need to play better or keep getting fortunate results to stay where they are. This is especially important for Dortmund who are sitting in 5th in what promises to be an intense battle for the right to play in Europe next season.

The next stage in looking at luck is to compare how teams are in the table at the moment with how they rate in my stats model. The most important inputs are shots and shots on target. As a result, big differences are often due to a team shooting very well or poorly and/or stopping a high or low percentage of their opponents' shots. Unlike with goal differential and points, there certainly is a skill component at play here. If you have good forwards and your defenders and goalkeeper are solid then you are going to be above average when it comes to converting your chances into goals and preventing the same for your opponents. Having said that, there is definitely a luck component as well. I will study this more in the future. For now I think it's safe to assume that both luck and skill play a role.

The previously discussed Stuttgart are the club performing most below expectation given their stats. After them is Werder Bremen. The model rates Bremen 2nd best, just ahead of Bayern, but they are sixth in the table. When it comes to shots, they are second behind Leverkusen in both shot and shot-on-target differential. They are 10th in the league in percentage of their shots on target that are goals. Bremen seem to have gotten good play from Wiese as they have allowed the lowest shooting percentage against them in the Bundesliga. It seems that they are in 6th position almost solely because of their bad luck when it comes to getting points in close matches.

On the fortunate side of things, Mainz is the club with the biggest difference between the table and where they are in my stats model. This appears to be a combination of running a bit above expectation in close matches and being above average in front of both goals. Schalke also seem to be in an overly fortunate position. In addition to running over expectation in close matches, they also seem to be doing overly well at converting chances and in goal. They are only 6th best in shot differential and 9th best going just by those on target, but have the 4th highest percentage of shots-on-target going in and 3rd lowest allowed for their opponents. Having said that, they have some good attacking players in Kuranyi and Farfan and Neuer is a very good goalkeeper. One would expect them to be near the top in both categories.

Predictions

It will be a two-horse race between Bayer Leverkusen and Bayern Munich for the title. I don't think Schalke will be able to keep up this pace. Hamburg and even Bremen could enter the fight, but it would take a good run plus both Leverkusen and Bayern falling off. I haven't discussed Bayern much in this article, but it seems like van Gaal has finally got his system in place and the guys are playing great football. The betting markets have them as favorites to win the league at this stage. I still give a slight edge to Leverkusen, but I think it will come down to the last couple weeks.

The battle for European spots will come down to the last week. This might not be much of a prediction, but I think the Bundesliga will be the league with the most tension near the end of the season. After the top two it should be tight as I think Hamburg and Werder Bremen should put up a good fight with Schalke for the third Champions League spot. Hoffenheim and Dortmund can't be counted out either, so it will be 5 good teams fighting for those 3 positions depending on the cup result.

The bottom three now will all be relegated. My original intent when writing about the leagues during the break was to pick a club in the relegation zone to stay up and a club out of the relegation zone to get the drop. That lasted all of 1 article. Bochum, Nurnberg and Hertha Berlin are the three worst clubs in the Bundesliga and are in the bottom three spots. If you forced me to flip two I'd say Koln is the most likely to be relegated of the safe teams, with Freiburg just behind. Bochum is the only one of the bottom three that looks to have any chance of recovery.


* as I wrote in my Serie A article, this could be slightly off as there may be goals, especially own goals, that result from no shot being taken. For simplicity I will define shooting percentage as goals per shot on target.

Monday, January 4, 2010

Winter Break Review: Serie A

I'm going to try to go over each of the big leagues at this point discussing where we are and what should be expected going forward. I missed the Liga and will probably write that article at the halfway point. I will be giving my opinions more in these articles than usual but will still use the goal-differential and stats-based models.

Most Impressive

This is the easiest selection of any league. Inter have made the Serie A look like one of the small leagues that has one strong team. They are 8 points clear of their Milanesi rivals, who sit in second place with a match in hand, and 9 clear of Juventus. They are a ridiculous 13 goals better in goal differential than Juventus who are next best. It would take something major for Inter not to win at this stage.

Most Disappointing

I'm not sure where to go with this. To me the most disappointing club at the start of the season was Napoli, but they've done a good job of clawing their way back into the race for the European spots. Since Mazzarri took over they have a record of 5 wins and 5 draws, including a win at Juventus and draws against Milan and Parma. That's still not enough to challenge Inter, but it looks like they'll fight for Europe at least. I still wonder about their tactics. The board at Napoli seems to be quite fond of having just 3 backs as they have run a 3-4-3 or 3-5-2 with Reja, Donadoni and now Mazzarri. In my scoring model they rate as the 5th best team at scoring but only 14th best at stopping the other team from doing so. We're just short of halfway through the season so a lot can change, but it does seem like they have some defensive issues for a team challenging for a Champions League spot.

Milan are a similar story as they started out slowly but have since moved up to around where they should be. To be honest I'm not sure that there really is a standout team that fits the bill. Genoa and Fiorentina have slipped a bit but I think that's mostly just because the league is pretty tight after Inter. They are a short run of good results away from moving back up into the European spots.

Biggest Surprise

Again, it's tough to find one. Inter were expected to be better than the other contenders and they are. I'm a bit surprised that they are this much clear already, but it's far from shocking. There has definitely been some good football, but in terms of surprises or suspense as far as the results go, it's been on the boring side frankly.

Lucky (More Efficient?) Sides

Luck always plays a big role in sports results. It's not discussed enough in my view, which is why I try to write a lot about it. The models I use are a good way to look at which teams have gotten better or worse than average luck. The first way I measure it is comparing points and goal differential. One place where luck plays an important role is in the timing of goals. As I have argued, the number of points a team gets is based on two things: goal differential and variance. If a team has a lot of points compared to other teams with the same goal differential then they've been lucky and vice-versa. I make a slight adjustment to goal differential by using the goal-only model to account for differences in schedule.

The stats-based model can go a step further because it can also give an indication that a team has gotten lucky or unlucky when it comes to scoring and allowing goals. For example, a team may be consistently getting more shots than their opponent but have the opponent's keeper make a lot of great saves while giving up some goals on bad bounces at the other end. The stats model I believe to be very accurate in ranking the teams, though it is overly favorable to weaker sides.

Looking at goal differential and points, it is unsurprising that the teams that have been most unlucky are near the bottom of the table. The three with the worst luck are Catania (18th in the table), Lazio (16th) and Siena (20th). At the other end you have Milan, Parma and Sampdoria as the most fortunate. Looking at Milan, they are 3 goals behind Juve but have one point more. According to the model and the regression formula from this article, Milan are nearly 6 points above expectation for a team with their goal differential.
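The points-versus-goal-differential comparison described above can be sketched roughly as follows. This is only an illustration under stated assumptions: it fits a plain least-squares line of points on goal differential and treats the residual as "luck". It is not the regression formula from the linked article, it skips the schedule adjustment, and the team names and numbers in `toy` are made up rather than real Serie A figures.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - slope * x_bar, slope

def luck_table(teams):
    """teams maps name -> (goal_differential, points).

    Returns name -> points above (+) or below (-) what the fitted line
    expects for a team with that goal differential.
    """
    gds = [gd for gd, _ in teams.values()]
    pts = [p for _, p in teams.values()]
    intercept, slope = fit_line(gds, pts)
    return {name: p - (intercept + slope * gd)
            for name, (gd, p) in teams.items()}

# Hypothetical table, for illustration only (not real data).
toy = {"A": (12, 36), "B": (5, 30), "C": (0, 22), "D": (-4, 21), "E": (-13, 11)}
luck = luck_table(toy)
for name, extra in sorted(luck.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {extra:+.1f} points versus expectation")
```

A positive number means more points than the goal differential would suggest (lucky, in the sense used here) and a negative number means fewer.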

Digging a bit deeper, the biggest thing that jumps out from the stats-based model is that two teams appear to have been quite unlucky: Lazio and ChievoVerona. Lazio rate as 9th best in the model but are 16th in the table, 17th if you go by points per match. They also are 15th in goal differential per match so it's not all about getting a lot of big wins and close losses as discussed in the previous paragraph. I'll write more about shots later, but both shots and shots-on-target are strong indicators of the strength of a team. Lazio are 6th best so far this season in shot differential per match and 9th best in shots-on-target differential. They have shot the ball an average of nearly 2 more times per match than their opponents and have the same number of shots on target as they've allowed. That doesn't sound like a team closer to the bottom of the table than the middle.

Since they have taken and allowed the same number of shots-on-target but have allowed 6 more goals than they've scored, it's clear that they have problems in front of goal at one or both ends of the pitch. The problem is 100% on the attacking end. Lazio rate dead last when it comes to converting shots-on-target into goals. They have taken 7.5 shots on target for every goal, a scoring rate of 13.3%*. For comparison, the league average is 3.58 shots on target per goal, or scoring on 28% of shots on target. In other words, they require more than twice as many shots on target to get a goal as the league average. Defensively and in goal they are actually good. They have allowed the 4th lowest shooting percentage; their opponents have scored on just over 21% of their shots on target. I'm sure that with attacking numbers that bad they've been unlucky, but it seems clear that the Aquile need to get better showings out of Zarate, Rocchi and Cruz up top.
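For anyone who wants to check the Lazio arithmetic, here is a small sketch. It uses only the two figures quoted above (7.5 and 3.58 shots on target per goal); the function name is my own, and shooting percentage is taken to be goals per shot on target as defined in the footnote.

```python
def shooting_percentage(shots_on_target_per_goal):
    """Goals per shot on target, given the shots on target needed per goal."""
    return 1.0 / shots_on_target_per_goal

lazio = shooting_percentage(7.5)    # Lazio: 7.5 shots on target per goal
league = shooting_percentage(3.58)  # league average: 3.58 per goal
ratio = 7.5 / 3.58                  # how many times more shots Lazio need

print(f"Lazio: {lazio:.1%}, league: {league:.1%}, ratio: {ratio:.2f}")
# prints "Lazio: 13.3%, league: 27.9%, ratio: 2.09"
```

The ratio just above 2 is where "more than twice" comes from.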

ChievoVerona also seem to be running well below expectation. The model rates them as the sixth best side, while they sit 12th in the table (13th if you go by points per match). They are 6th best in shot differential per match and 4th in shots-on-target differential. Given that, it's surprising that they find themselves in the bottom half of the table. While I don't think I've seen them play this season, the numbers indicate that they have had some combination of being unlucky at both ends in front of goal, not doing a good enough job of finishing and letting too many shots go in. Looking at goals per shot-on-target, they rate 15th best at converting shots into goals and 12th best when it comes to goals conceded per shot-on-target allowed. While not as extreme as Lazio, Chievo have probably not been good enough at converting chances and been a bit unlucky as well.

At the other end of the spectrum are Genoa and Palermo. Both are about 6 spots higher in the table than the model ranking. Genoa have been extremely efficient in front of goal, scoring on a higher percentage of their shots on target than any other club in the Serie A. They have scored on nearly 42% of their shots on target, about one and a half times the league average. Particularly impressive are Crespo up top with 4 goals on only 7 shots on target and in the midfield Mesto who has 4 goals on just 6 shots on target. I have nothing against the Genoa players, and Crespo certainly was elite in his prime, but I'd say they have been fortunate to score as many goals as they have. I expect them to slow down. As for Palermo, I'm not really sure what to attribute their position to.
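One rough way to put a number on "fortunate" is to ask how often a league-average finisher would match the Crespo and Mesto totals. The sketch below is my own simplification, not part of either model: it assumes every shot on target is an independent chance converted at the league rate of 28%, which ignores shot quality and any real difference in finishing skill.

```python
from math import comb

def prob_at_least(goals, shots_on_target, rate):
    """P(X >= goals) for X ~ Binomial(shots_on_target, rate)."""
    return sum(
        comb(shots_on_target, k) * rate**k * (1 - rate) ** (shots_on_target - k)
        for k in range(goals, shots_on_target + 1)
    )

LEAGUE_RATE = 0.28  # league-average goals per shot on target, from the text

crespo = prob_at_least(4, 7, LEAGUE_RATE)  # 4 goals on 7 shots on target
mesto = prob_at_least(4, 6, LEAGUE_RATE)   # 4 goals on 6 shots on target
print(f"Crespo: {crespo:.1%}, Mesto: {mesto:.1%}")
```

Under those assumptions both come out as fairly unlikely, on the order of one in ten for Crespo and a bit rarer for Mesto, which fits the idea that some of Genoa's scoring has been good fortune. With samples this small I wouldn't read much more into it than that.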

Predictions

I'll make a few predictions and revisit them from time to time later in the season. A lot of these are pretty clear based on what I've written above.

Inter will run away with the league title. This prediction is coming in off a limb.

Juventus will jump over Milan. I don't think this is too crazy a prediction either. Juve have looked better to me and both of my models agree. I think they are sufficiently better to overcome the current deficit, which is one point plus Milan's match in hand. I expect Juventus to be clear while Milan and Roma fight for the third automatic Champions League spot.

The four Champions League teams next year will be Inter, Juventus, Milan and Roma. Right now Parma are tied with Roma and there are 7 other teams within 4 points. I don't think any of those teams will jump into the top 4.

Bologna will get relegated, Catania will be safe. In each of these articles I'll pick a club that is out of the relegation zone to go down and one in the bottom three to stay up. My picks for the Serie A are Bologna to get the drop and Catania to stay clear. According to the stats-based model, Bologna rate as the worst team in the Serie A. Part of this is that they are last in both shot and shot-on-target differential. They have some decent, though past their prime, attacking talent but have surely been fortunate in that they have put away a high percentage of their chances. Of the three bottom teams, the model rates Catania the best at 15th. I expect them to be stingier the rest of the season and get out of the relegation zone.




* This could be slightly off. I've never seen a definition of a shot on target so it's possible that you could have a goal without a shot. It seems likely that an own goal, especially the most embarrassing kind, would not be recorded as a shot. Nonetheless, for simplicity of language I will define shooting percentage as goals per shot on target.