Friday, October 30, 2009

North London Derby Preview

The biggest London derby kicks off at 12:45 local. If you are in the US you can see it on ESPN2 at 8:30 AM Eastern.

History

Though they played before, this derby really got going in 1913 when Arsenal moved from Plumstead to Highbury. That put the teams four miles from each other. Unlike the East Lancashire Derby, these two teams have met nearly every year. Since 1950 they've been in the same division every season but one. In fact they have met 144 times in league play. In those matches, Arsenal have had the upper hand winning 59. Tottenham have won 45 and 40 were draws.

Recent history is significantly worse for Spurs. They haven't beaten Arsenal in league play in nearly 10 years, last doing so at White Hart Lane on November 7th, 1999. The last time Spurs won at Highbury in league play was in 1993. In the last 10 years at home in the derby, Arsenal have 8 wins and 2 draws. Those draws came last season and 4 seasons ago so I suppose you could argue that Spurs are doing better in this fixture if you like.

Form

The two clubs are level on points though Arsenal have a match in hand. Despite that and the season still being pretty young, their form is pretty different. Spurs started out on fire with four wins. They have since gone 2-1-3. To be fair, two of their three losses were at Chelsea and against Manchester United so schedule plays a role. They can't feel good about their loss last week though as it was at home against Stoke City, the first win for the Potters away from home this season. Since losing 4-2 at Manchester City, the end of two straight matchdays of losing in Manchester, Arsenal have won 4 and last week got a draw at West Ham. Expanding beyond league play, both teams won at home in the midweek against opponents from Liverpool. Spurs beat Everton 2-0 while Arsenal took care of Liverpool 2-1. The previous week Arsenal also had a disappointing 1-1 draw at AZ Alkmaar, giving up the equalizer in stoppage time in the second half.

Injuries and Suspensions

Arsenal have several injuries. Rosicky is questionable with a knee injury and Denilson, Djourou, Wilshere and Walcott will all be out. Goalkeeper Lukasz Fabianski picked up a thigh injury in the League Cup so he should be out as well. It's unlucky for Fabianski, who was playing his first match of the year after being out with a knee injury.

For Spurs, Jermain Defoe is still out suspended and Modric out with a broken leg. Other concerns are Aaron Lennon and Giovani Dos Santos. Both have ankle injuries and may or may not be ready. Jonathan Woodgate and Ledley King will probably be available but have had injury issues as well.

Scoring and Conceding

Arsenal have been the best scoring team this season. They have scored 5 more goals than any other club despite playing one fewer match than most. For the Gunners, the problems look to be at the defensive end. They have conceded 13 goals, 5 more than Chelsea, 2 more than Manchester United and the same as Liverpool. Again though, that's playing one fewer match than those title contenders. It is inevitable that their goal-scoring rate will slow down as they are on an unsustainable pace. They rate as the best team at scoring but only 10th best at defending in my EPL ranking.

Looking at goal differential, Tottenham have been running a bit above expectation. The teams around them have significantly higher goal differentials. My model rates them as the 4th best scoring team, 9th best defensively and 6th best overall.

Predictions

I'm a bit in between models right now. I'm using the PLM to predict the result and the Poisson model to give me the most likely scorelines. Both models have Arsenal as huge favorites. The PLM gives them a 67% chance, Spurs just a 14% shot, with the other 19% the likelihood of a draw. The Poisson model gives 3-1, 3-2, 4-1, and 4-2 all about the same chance as the most likely scorelines, though each has only around a 5% chance. This is one of those matches where I don't fully trust the model. The problem is that Arsenal have been running at a ridiculous pace, almost certainly scoring more goals than expected given their skill level. They are on pace for 122 goals for the season while last year they scored only 68. I think Arsenal are still favorites but not to that extent. Something more like 50% for an Arsenal win and 25% each for Spurs and the draw seems about right to me.

I'm making my prediction 3-1 Arsenal.

Wednesday, October 28, 2009

MLS Playoff Predictions

In case you are unfamiliar, here is the format. The first round is the semifinals for each conference. Each pair of teams plays two games, one in each team's stadium. At the end of the two matches they add the scores up and the team with the most total goals wins and advances to the next round. If the scores remain tied (note that away goals don't matter as they do in some other competitions) then they play two 15-minute periods of extra time and penalties if necessary. That's the format for the conference semifinals. For the conference finals, it is only one match played at the higher seed's (smaller number) stadium. The MLS Cup final is a one-off that will take place November 22nd at Qwest Field in Seattle.

Personally, I like the one-game final at a site decided before the playoffs. I think the conference finals should follow the same format as the semifinals. Especially with the unbalanced schedules during the regular season, I think it gives too much of an advantage to the higher-seeded teams. If it's due to scheduling I think they should flip it so that the first round is just one match and the conference finals have the two-leg format.

Here are the matchups:

East
Real Salt Lake (4 seed) - Columbus Crew (1)
New England Revolution (3) - Chicago Fire (2)

West
Chivas USA (4) - Los Angeles Galaxy (1)
Seattle Sounders FC (3) - Houston Dynamo (2)

The first team listed plays at home the first leg. This gives an advantage to the higher seed because it's better to play at home the second leg since extra time could happen in the second leg.

Predictions

I'm actually unhappy with the version of the Poisson-Logit Model that predicts scorelines. The problem is that it doesn't seem to be sensitive enough to the attacking and defending qualities of the two teams. Therefore, I'm reverting back to the Poisson Model. Though it has problems as well, I think it will even out somewhat since each team plays at home making the predictions reasonably close. If it's tied after two legs, I'm giving the team that plays at home last a 55% chance of winning.
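A minimal sketch of the two-leg calculation described above, assuming independent Poisson goal counts in each leg. The expected-goals inputs are invented for illustration, and `two_leg_advance` and the other names are hypothetical rather than the code actually used for these predictions. Ties on aggregate are split 55/45 in favor of the team hosting the second leg, as stated above.

```python
from math import exp, factorial


def poisson_pmf(goals, mean):
    """Probability of exactly `goals` goals when the average is `mean`."""
    return exp(-mean) * mean ** goals / factorial(goals)


def goal_diff_dist(mean_low, mean_high, max_goals=12):
    """Distribution of (lower seed goals - higher seed goals) for one leg."""
    dist = {}
    for low in range(max_goals + 1):
        for high in range(max_goals + 1):
            p = poisson_pmf(low, mean_low) * poisson_pmf(high, mean_high)
            dist[low - high] = dist.get(low - high, 0.0) + p
    return dist


def two_leg_advance(leg1, leg2, second_host_edge=0.55):
    """Each leg is (lower seed mean goals, higher seed mean goals).

    The higher seed hosts leg 2, so it gets `second_host_edge` of the
    probability that the aggregate score is level (away goals do not count).
    """
    d1 = goal_diff_dist(*leg1)
    d2 = goal_diff_dist(*leg2)
    low = high = level = 0.0
    for diff1, p1 in d1.items():
        for diff2, p2 in d2.items():
            aggregate = diff1 + diff2  # > 0 means the lower seed is ahead
            if aggregate > 0:
                low += p1 * p2
            elif aggregate < 0:
                high += p1 * p2
            else:
                level += p1 * p2
    total = low + high + level  # slightly under 1 because of truncation
    return {
        "high_outright": high / total,
        "high_tiebreak": second_host_edge * level / total,
        "low_outright": low / total,
        "low_tiebreak": (1.0 - second_host_edge) * level / total,
    }


# Invented inputs: two evenly matched teams, each averaging 1.5 at home
# and 1.1 away. Only the tiebreak edge separates them.
result = two_leg_advance(leg1=(1.5, 1.1), leg2=(1.1, 1.5))
for key, value in result.items():
    print(f"{key}: {value:.1%}")
```

With evenly matched teams the outright chances are identical, so the whole gap between the two sides comes from the 55/45 split of the level-on-aggregate cases.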

Here are the predictions:

Real Salt Lake - Columbus Crew

Columbus got a tough draw in my opinion. Despite finishing best in the league, their reward is the best scoring team in the playoffs. Real Salt Lake were unlucky to get just 40 points and sneak into the playoffs and I have them fourth overall. In addition, Utah is never an easy place to play due to the altitude. I think this tie will be closer than most people think. The first leg should be huge. If Columbus can hold Salt Lake scoreless, they will be in great shape heading home for the second leg. It will be tough sledding if they are down at that point.

Poisson predictions:
Columbus to advance: 52.4% (42.1% outright, 10.2% in extra time or shootout)
Salt Lake to advance: 47.6% (39.2%, 8.4%)

New England Revolution - Chicago Fire

As unlucky as Columbus were in the matchup, Chicago were just as lucky. New England are by far the worst team in the playoffs. They only got in due to above-expectation results in close matches which, again, is mostly, if not fully, due to luck. The Revs are the worst team in the playoffs at both scoring and conceding. Despite all that, there is a lot of variance in this format and since the MLS has so much parity, it won't necessarily be as lopsided as I'm implying. Chicago should win but New England certainly have a chance.

Poisson Prediction:
Chicago to advance: 57.3% (46.7% outright, 10.3% in extra time or penalties)
New England to advance: 42.7% (34.3%, 8.4%)

Chivas USA - L.A. Galaxy

Both these teams were near the bottom at scoring but pretty good at keeping opponents from scoring. I wouldn't expect a high-scoring affair. I'm sure others think the Galaxy have a big edge, but I have these two teams as very even. I think this will be a very tight tie.

Poisson Prediction:
Galaxy to advance: 51.9% (40.2%, 11.7%)
Chivas to advance: 48.1% (38.5%, 9.6%)

Seattle Sounders FC - Houston Dynamo

This is a great matchup that should be better than the conference final will be no matter which of these teams advances. In my rankings these teams are right in the middle offensively and the two stingiest teams in the league. Houston edge out Seattle by one spot in each category. I expect a good, hard-fought tie that will still be up in the air in the second half of the second leg.

Poisson Prediction:
Dynamo to advance: 52.1% (40.5%, 11.6%)
Sounders to advance: 47.9% (38.3%, 9.5%)

Championship Predictions

As I mentioned above, there is a lot of parity in the league. While I think New England is the worst team in the playoffs, the gap between the Revs and the top teams isn't nearly as large as you would see in cup competitions in Europe or South America even in later rounds. That, combined with the format, which features a single match in the conference finals and the final, means things are wide open. Here are the title percentages according to the Poisson model:

Columbus - 17.4%
Seattle - 15.8%
Houston - 15.1%
Galaxy - 14.5%
Chicago - 12.2%
Salt Lake - 9.9%
Chivas - 8.5%
New England - 6.6%

Needless to say, there is no clear favorite.

MLS End of Regular Season Rankings

The season has wrapped up and the playoffs begin on Thursday. I'll shortly post an article with playoff discussion. I wanted to separate the ranking discussion because it turned out to be a lot longer than I thought it would be.

End-of-Season Rankings


O Rank - goal-scoring rank
D Rank - goal-preventing rank
EGD - expected goal difference if each team played two matches against every other team at level shown by season results

A feature of my rankings is that at the end of the season for most leagues the rankings converge to goal differential. They don't for the MLS and Scottish leagues due to the unbalanced schedule. Each team plays every other team twice with two additional matches against a conference opponent. The rankings stay pretty close to goal differential, but teams that had harder or easier opponents in those extra two games will have their expected goal differential adjusted according to the strength of those opponents. For example, Houston and Columbus finished even on goal differential but Houston's extra games were against tougher opponents - Salt Lake and Dallas compared to Chicago and Toronto.

For a comparison to other rankings, here are Elo ratings. There is an explanation link near the top of that page. Elo ratings are commonly used for chess. There is a formula that gives or takes away points based on the result, including the score, of each individual match. The Elo formula emphasizes the match result, which I ignore. I just take goals scored and allowed into account, giving no extra bonus for winning or penalty for losing. The other big difference between that and what I have here is that there is some carryover from last year in those Elo ratings and I just started from scratch at the beginning of this season. To be honest, I think including previous results is probably better for stating which teams are the best. It's one of many things I plan to work on in the future as I'm not sure how to best use old results.
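For readers unfamiliar with Elo, here is a minimal sketch of the standard chess-style update. It is not the exact formula used on the linked page, which also takes the score into account. The K-factor of 20 and the function names are assumptions chosen for illustration.

```python
def elo_expected(rating_a, rating_b):
    """Expected score for team A (1 = win, 0.5 = draw, 0 = loss)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a, rating_b, score_a, k=20.0):
    """Return the new (rating_a, rating_b) after one match.

    `score_a` is 1.0 for an A win, 0.5 for a draw and 0.0 for a loss.
    `k` controls how fast ratings move; 20 is an assumed value.
    """
    delta = k * (score_a - elo_expected(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Two equally rated teams: the winner takes half of K from the loser.
print(elo_update(1500.0, 1500.0, 1.0))
```

Because the update depends on win, draw or loss rather than on goals alone, a run of narrow wins moves a team up quickly under Elo while barely moving it in a goals-based ranking.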

For a subjective ranking, check out Soccernet's Power Rankings. They "consist of a consolidation of votes from Jeff Carlisle, Jen Chang, Steve Davis, Frank Dell'Apa, Kristian Dyer, Ives Galarcep, Allen Hopkins and Andrew Hush." In my view they seriously overrate New England and Chivas while underrating Salt Lake and Colorado. This doesn't surprise me as these were teams that over or underperformed in close matches, which I've shown is mostly (and probably all) luck.

Looking at the season as a whole, I think it's fair to say that Houston and Columbus are the top two teams in the league. I think you can argue either way which of the two is better. Columbus won the Supporters' Shield, given to the team with the most points in league play. Houston finished just one point back and played a slightly tougher schedule. Really there's not much separating them and a bounce here or there could have flipped things. The Galaxy finished the league even on points with Houston but again I think this was because they had good luck in close games.

Tuesday, October 27, 2009

Liga Española Matchday 8

Barcelona continued their great run with the 6-1 destruction of Real Zaragoza. Sevilla and Real Madrid both got 0-0 draws against bad teams. I'd love to talk up the league and discuss how exciting it will be but to be honest so far my favorite league is shaping up to play out like last year. Barça are currently 3 points clear of Real Madrid and 6 clear of Sevilla. They have looked consistently a lot better than those two rivals and certainly seem far less prone to drop points as Sevilla and Real Madrid did last weekend. For evidence of that look at the Copa del Rey result. Real Madrid lost 4-0 to Segunda B (third flight) team Alcorcon!

I think Barça will probably continue to pull away and a win in the Camp Nou rendition of the Clásico November 29th could put them 9 or 10 points clear. Last year, Barcelona went 12 clear of Real Madrid after a 2-0 win over the Merengues at the Camp Nou on December 13th. At that point Real Madrid were 6th in the table. From that point on they were very strong, but could not make up that deficit. We shall see if history repeats itself.

Rankings:


O Rank: rank of team in scoring
D Rank: rank of team in preventing scoring
EGD: expected goal differential if they played a full season at the level the results thus far have shown

As you can see, Barcelona moved to the top of the rankings in both categories. Like I said, it's shaping up to look like last season. Atleti and Villarreal moved up a fair bit. It'll be interesting to see where they are in a few weeks.

Monday, October 26, 2009

English Premier League Quarter-Time Report

I'm going to mix it up a bit and post some thoughts about the league in general since we are a quarter of the way through the season.

Biggest Surprise

Positive: Burnley
I think everyone expected the Clarets to be the caboose this season. They may well be, but currently are sitting in the middle of the table four points clear of relegation. They also have the most surprising result of the season by virtue of their 1-0 win over Manchester United.

Negative: Everton
The Toffees finished fifth last year and while I don't think anybody expected them to challenge for the league title, currently sitting 14th, 7 points out of the Europa League and just 3 points clear of relegation, is definitely not what they were hoping for.

Some Predictions

West Ham will not be relegated. The Hammers are currently second from the bottom and have had a bad run over the last 5 or 6 matches but I think they will be safe this season. Their schedule has been brutal so far and their results haven't been bad considering. Their goal differential is currently -4. Given their goal differential they have been running below expectation for points (see this article for that discussion) and that's without having huge wins like Arsenal have had. I think West Ham will be out of the relegation zone a month from now and out of the relegation fight well before the end of the season. I'll go so far as to say that I think they'll be closer in points to the team in 11th (midtable) than 18th (relegated). While it's certainly possible that they get the drop, I will be very surprised if that happens assuming they don't do anything rash while they are near the bottom of the table.

Burnley will slide. I said they are the surprise team of the first quarter of the season, but that's partly due to variance. They have for the most part either won close matches or been beaten by several goals. That's not a good sign when it comes to long-term prospects. I'm not going to flip the West Ham pick; Burnley have a good shot at staying up. Having said that I think they'll certainly be fighting for the right to play in the Premier League.

The fight for the league will remain competitive with 4 or more teams within striking distance at the new-year break. I'm far less sure of this prediction than the previous two. It's easy to see why as a team or two falling off makes it fail. I expect Chelsea, Manchester United, Arsenal and Liverpool to be within 5 points when they take a break after the end-of-the-year push. That's about halfway through the season. Man City and Spurs could definitely be in that mix as well, but I think both will be a bit further out. This is likely to be one of the most competitive seasons over the last few years.

Rankings:

Here are the updated rankings:



I'm starting to think that Everton are actually a lower mid-table team this season instead of a team that will fight for a spot in Europe. I still think the rankings have them too low but I'm getting less and less confident every week that they'll move up significantly. Arsenal are an interesting case as well as they have stayed on top another week. Their defense needs to improve if they are to have a shot at finishing where I have them now. If you are an Arsenal fan you'd better hope that their goal conceding slows down when their goal scoring does, because that's inevitable.

Saturday, October 24, 2009

Sunday Preview: Liverpool - Manchester United

My apologies for getting this out late. I was hoping to have it Thursday but some personal things came up plus I've been working on a new model and some other stuff.

The biggest rivalry in England has no doubt lost its luster but remains one of the better rivalries in sport.

History

The clubs at each end of the Mersey are easily the two most successful clubs in England. Both have 18 domestic league titles, both have 14 domestic cup titles (though Manchester United have 11 FA Cups to Liverpool's 7). Liverpool have the edge in Europe with 5 Champions League or European Cups plus three UEFA Cups. United have 3 Champions League/European Cups and a Cup Winners' Cup.

I say the rivalry has lost its luster because almost all of the above for Liverpool happened over 20 years ago. In the very likely event that Liverpool fail to win the league title this season, they will have gone 20 seasons without doing so. You could probably be a millionaire now if you'd bet just a few bucks (or quid if you like) on them having 20 dry seasons in a row. In contrast, Manchester United have won 11 league titles in that span.

Looking head-to-head, in league matches Liverpool have won 51 matches, United 58 matches and they have 43 draws. That 7-win gap is exactly from the last 19 years where Liverpool have not won the title. In that span Liverpool have won 11, United 18 and they have had 9 draws. In the last 10 seasons at Anfield Liverpool have 3 wins, Manchester United 5 wins and they played to a draw twice.

Form

Liverpool are on an historically bad run. They have lost 4 matches in a row in all competitions, which hasn't happened in more than 20 years. Looking just at the league, they have 3 wins and 2 losses in their last 5. Manchester United have been going in a more Manchester Unitedly direction. They have 10 wins and a draw in their last 11 matches in all competitions and 4 wins and a draw in their last five league matches.

Of note, Rafa Benitez has a lot of pressure on him and it's conceivable that the Liverpool manager could get fired, sacked as it were, if things go really badly tomorrow.

Injuries

Injuries are a big story, unfortunately. For Liverpool, Torres and Glen Johnson have been out and Gerrard was subbed out early in the Champions League loss due to a nagging groin injury. All three are in doubt but it looks likely that Torres and Johnson will play. On the other side, the most important injury news is that Manchester United are expected to be without Rooney due to an injured calf. Darren Fletcher and Park Ji-Sung will be out as well. Fortunately for them, Giggs and Evra are expected to be back.

Scoring and Conceding

Based on schedules and goals scored, the model rates the two teams as almost exactly even offensively. The biggest difference is at the other end. Defensively, Manchester United rate just behind Chelsea in second. Liverpool are significantly worse, ranking 7th defensively. The model says that if they played a full season at the level shown so far, Liverpool would be expected to concede 15 more goals than United.

Predictions

The model does not take form or injuries into account (something I'm working on probably with a new model). Given how extreme both are for this match, keep that in mind. It gives Liverpool a slight edge, largely due to playing at home. It says they will win 39% of the time, Manchester United 33% and they will play to a draw 28% of the time. To be honest I'm not sure how it should be adjusted. Gerrard being out or far less than 100%, Torres not being at his best assuming he plays and Liverpool's awful form and the potential chaos surrounding Rafa Benitez all go against Liverpool. On the other hand, Rooney being out is a blow for United. I think that overall probably favors Man U. I would say flipping them seems about right, so 33% for Liverpool, 39% for Man United and 28% for the draw.

For scorelines, according to the model the most likely result is a 1-1 draw with 2-1 and 1-2 after that. I think with Torres not fully fit, Gerrard and Rooney likely out, 1-0 or 0-1 are more likely to be winning scores than 2-1 and 1-2. I'll make my prediction 0-1 for Manchester United.

Wednesday, October 21, 2009

Stat Series: Corner Kicks

This is the first in what is hopefully a several part series on stats that are out there and what they say about a match.

Conventional wisdom on corners

A team is awarded a corner kick when the ball goes over the goal line outside of the goal and it is last played by the defense. In almost all cases this happens because the team winning the corner was in a dangerous position. Therefore, if A has a lot of corners and B just a few or none at all then A is dominating B and is surely the better team on the day. The number of corner kicks is a decent indicator of how the match went.

What does the data say?

To examine the significance of corner stats, I looked at all matches from the previous five seasons in the English Premier League, French Ligue 1, German Bundesliga, Italian Serie A and Spanish Primera Division. I was surprised by the results.

Again the main tool I used was the correlation coefficient. First I looked at the correlation between goals for the home team and their number of corners in the match as well as the same for the away team. I was quite surprised that not only was the value very small but it was negative! For the home team there is a correlation of -0.046 and -0.031 for the away team. In other words, while the relationship is extremely weak, the fewer corners a team has the more goals they tend to score. Because the sample size is large, 5432 matches total, these numbers are statistically significant at the standard 5% level. Having said that, they are clearly not significant in any practical sense. In other words, the data suggests that the number of corners really doesn't indicate much of anything about the number of goals a team will score but to the very small extent that it does there is a negative relationship.

Next I did the same thing, but for the opposing team's corners; I compared the number of goals scored by the home team with the number of corners won by the away team and vice-versa. This turned out to be positive, but very small. For home goals and away corners the correlation is 0.0195. For away goals and home corners it is 0.0265. So again, the numbers indicate that there is basically no relationship (these numbers are not statistically significantly different from 0), but there is, if anything, a slight positive relationship between the number of goals a team scores and the number of corners they concede.
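The significance claims for all four correlations can be checked with the usual t-statistic for a correlation coefficient, t = r * sqrt((n - 2) / (1 - r^2)). This is a quick sketch using the 5432-match sample size quoted above and the large-sample 5% cutoff of 1.96. The variable names are just labels for this illustration.

```python
from math import sqrt

N_MATCHES = 5432   # sample size quoted in the text
CRITICAL = 1.96    # two-sided 5% cutoff, fine for a sample this large


def t_statistic(r, n):
    """t-statistic for testing whether a correlation differs from zero."""
    return r * sqrt((n - 2) / (1.0 - r * r))


correlations = {
    "home goals vs home corners": -0.046,
    "away goals vs away corners": -0.031,
    "home goals vs away corners": 0.0195,
    "away goals vs home corners": 0.0265,
}

significant = {}
for label, r in correlations.items():
    t = t_statistic(r, N_MATCHES)
    significant[label] = abs(t) > CRITICAL
    print(f"{label}: r = {r:+.4f}, t = {t:+.2f}, significant = {significant[label]}")
```

The two own-corner correlations clear the cutoff while the two opposing-corner correlations fall short, the last one only just.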

Still wondering what was going on, I divided the matches into those where the home team won, those that were draws and those where the away team won. I then broke those into matches where the home team got more corners than the away team, they got the same number and those where the away team got more corners. Again I was surprised by the result. Home teams that won the match got more corners 55% of the time, the same number 10% of the time and fewer corners 35% of the time. That seems to make sense and contradict the previous findings. The story doesn't end there, however. In matches that were draws, the home team got more corners 60% of the time, the away team 30% of the time and 10% of the time they got the same number. Finally, when the away team won the match, the home team got more corners 63% of the time, the away team 28% and they got the same number 9%.

In other words, the home team won the corner battle a majority of the time no matter the outcome of the match. However, in matches that were draws they got more corners with higher frequency than those that they won. In losses for the home side they won the corner battle even more often. So again we have an indication that teams winning tend to get fewer corners than teams losing, though the difference isn't all that large.

Finally, I looked at the connection between margin of victory and corner difference in an individual match. Starting with correlation, it again was small and negative: -0.058. Looking into it further, I took the average corner differential for each different scoreline. Here is the chart:

Goal Difference : Corner Difference (sample size)
-3 : 1.36 (163)
-2 : 1.84 (423)
-1 : 1.84 (821)
0 : 1.64 (1445)
+1 : 0.98 (1261)
+2 : 0.65 (758)
+3 : 1.00 (349)

In words, in matches where the home team lost by 3 goals they averaged 1.36 corners more than the away team. As you can see, the story continues. The home team on average wins more corners than the away team in all types of matches. However, this margin is larger for matches that the home team loses than for those they win. Again we have a negative relationship.
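To put one number on each side of that comparison, the rows of the chart can be pooled into home losses and home wins, weighting each average by its sample size. This is only a rough summary of the table as printed, so it leaves out any matches decided by more than three goals.

```python
# (home goal difference, average home corner difference, matches) from the chart
rows = [
    (-3, 1.36, 163),
    (-2, 1.84, 423),
    (-1, 1.84, 821),
    (0, 1.64, 1445),
    (1, 0.98, 1261),
    (2, 0.65, 758),
    (3, 1.00, 349),
]


def pooled_corner_diff(selected):
    """Sample-size-weighted average corner difference over the chosen rows."""
    total_matches = sum(n for _, _, n in selected)
    return sum(avg * n for _, avg, n in selected) / total_matches


home_losses = pooled_corner_diff([r for r in rows if r[0] < 0])
home_wins = pooled_corner_diff([r for r in rows if r[0] > 0])

print(f"Home losses: +{home_losses:.2f} corners per match")
print(f"Home wins:   +{home_wins:.2f} corners per match")
```

Pooled this way, the home side's corner edge in its losses comes out at roughly double its edge in its wins.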

What is going on here?

The "correlation does not imply causation" cliche applies here. A team obviously can't improve their chances of scoring a goal by playing it back and kicking it out to give the other team a corner. It's just a theory, but here's what I think is going on.

At the start of this article, I stated what I feel is the conventional wisdom that due to the usually dangerous circumstances that create a corner, dominating a match will lead to more corners. I think this needs to be changed. It's not dominance that leads to most corners but sustained pressure. Corners are more likely to occur when both teams are in and around the area. For example, a team with a corner kick or other set piece around the goal often gets a corner when the ball is headed out by a defender or a shot is deflected wide off a defender or the goalkeeper. These situations are more likely to develop when a team is behind and pressing. If a team is ahead then they don't need to press forward as much, and usually won't because they would then be susceptible to a counterattack and a goal scored is less good for them than a goal conceded would be bad. On the other hand, a team that is behind is more likely to press forward since they are desperate for a goal. In a sense I'm saying that scoring a goal to go ahead will cause a team to later give up more corners.

The only evidence I can offer of this is that I think the same applies to home teams, which is why they average more corners than the away team. There isn't data on "pressing forward" but it certainly seems like a team will tend to push guys forward more when they are at home than away. Perhaps players are just the slightest bit more tentative when playing away and that makes a difference. Maybe this is the reason that there is a home-ground advantage even though teams are playing on the same pitch under the same rules.

Keep in mind that the explanation I'm offering is for why the very small relationship is in the direction it is. There is a lot of randomness when it comes to corners. Sometimes a team will win a corner just about every trip into the attacking third. Other times their attacks will just so happen to end in throw-ins, goal kicks, clearances or goals instead of corner kicks. In my opinion, the main thing to take away from all this is that corners alone don't say much about which team was better in a match. It's possible that a stat like "corners while the score is tied" would indicate which team is better, but I have never seen that sort of thing published.

What do you think of my theory? Do you have a different explanation for why teams that win more corners tend to score just a tiny bit less than those getting fewer? Were you surprised that there is pretty much no relationship at all between them? I'm very curious what you think.

Tuesday, October 20, 2009

La Liga Update

Here were the results last weekend:
Deportivo 1 - 0 Sevilla
Real Madrid 4 - 2 Valladolid
Valencia 0 - 0 Barcelona
Xerez 2 - 1 Villarreal
Espanyol 2 - 1 Tenerife
Mallorca 3 - 1 Getafe
Zaragoza 2 - 2 Racing
Malaga 1 - 2 Almería
Athletic 1 - 2 Sporting
Osasuna 3 - 0 Atlético

The big match to talk about was Valencia - Barcelona, which is one of the most entertaining 0-0 draws I can remember. Depor are looking more like Champions League contenders than a mid-table team right now with a nice win for them over Sevilla. Real Madrid continued to roll and their city mates Atlético continue to struggle. Villarreal also got a loss away at Xerez, almost certainly the worst team in the league.

Here are the rankings:

Premiership Update

I'll be quick here as I'm working on some other stuff at the moment.

After a week off here are the results from last weekend:

Aston Villa 2 - 1 Chelsea
Arsenal 3 - 1 Birmingham
Everton 1 - 1 Wolverhampton
Man Utd 2 - 1 Bolton
Portsmouth 1 - 2 Tottenham
Stoke 2 - 1 West Ham
Sunderland 1 - 0 Liverpool
Blackburn 3 - 2 Burnley
Wigan 1 - 1 Man City
Fulham 2 - 0 Hull City

Here are the rankings:



Things are pretty similar to last week. Despite Tottenham getting a win and Man City a draw, City moved above Spurs. Despite losing, Chelsea moved ahead of Liverpool, who also lost in comedic fashion. The biggest mover in the rankings was Bolton, who did much better than expected at Old Trafford. Unfortunately for them, a loss is worth no points even if it does move you up in my incredibly important rankings.

Poisson-Logit Model, a Deeper Look

I've done a bit more work with the Poisson-Logit Model (hereafter PLM), so I thought I would make a post about it that I can reference in the future.

How It Works

As the name implies, it has two stages: the Poisson stage and the logit stage.

The Poisson stage breaks the game down into two parts which are the scoring at either end. Using goals scored and conceded for each team, total home goals to get a home-ground advantage, and the schedule of each team it outputs a scoring and defending factor. When two teams play each other, the average goals a team will score is their scoring factor times the other team's defending factor, times the home-ground advantage multiplier if they are playing at home.
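As a rough sketch of the Poisson stage's output, the snippet below turns scoring and defending factors into average goals and then into scoreline probabilities. All of the factor values are invented, the function names are hypothetical, and treating the two teams' goals as independent is the plain Poisson assumption. The win, draw and loss totals at the end are what a Poisson-only model would give, not the output of the logit stage.

```python
from math import exp, factorial


def poisson_pmf(goals, mean):
    """Probability of exactly `goals` goals when the average is `mean`."""
    return exp(-mean) * mean ** goals / factorial(goals)


def expected_goals(scoring, opp_defending, home_multiplier=1.0):
    """Scoring factor times the opponent's defending factor, times the
    home-ground multiplier (left at 1.0 for the away team)."""
    return scoring * opp_defending * home_multiplier


def scoreline_grid(home_mean, away_mean, max_goals=10):
    """Probability of each scoreline up to `max_goals` goals per team."""
    return {
        (h, a): poisson_pmf(h, home_mean) * poisson_pmf(a, away_mean)
        for h in range(max_goals + 1)
        for a in range(max_goals + 1)
    }


# Invented factors for illustration only.
HOME_MULTIPLIER = 1.25
home_mean = expected_goals(1.6, 0.9, HOME_MULTIPLIER)
away_mean = expected_goals(1.2, 1.0)

grid = scoreline_grid(home_mean, away_mean)
home_win = sum(p for (h, a), p in grid.items() if h > a)
draw = sum(p for (h, a), p in grid.items() if h == a)
away_win = sum(p for (h, a), p in grid.items() if h < a)
most_likely = max(grid, key=grid.get)

print(f"Average goals: home {home_mean:.2f}, away {away_mean:.2f}")
print(f"Home {home_win:.1%}, draw {draw:.1%}, away {away_win:.1%}")
print(f"Most likely scoreline: {most_likely[0]}-{most_likely[1]}")
```

In this setup a defending factor below 1 means a better-than-average defense, since it shrinks the opponent's average goals.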

The logit stage uses previous data to get a representation of how often each result (home win, draw, away win) has historically happened when two teams meet with average goals as determined by the Poisson stage. So the inputs are the scoring and conceding factors for each team as well as the home-ground multiplier. The output is a home win probability, an away win probability and the probability of a draw. For occasions where scorelines are of interest I also have run a different version which has the same inputs and gives as an output the probability of all scorelines from 0-0 all the way to 10-9.

How Has It Done?

Past results with it have been quite good. I have run it on 8 seasons of data from several leagues. An important thing is that my suspicions were confirmed and it worked a lot better when I didn't use it for matches where the home team had played fewer than 10 matches. In other words, I waited 10 weeks into the season for there to be some kind of sample size so there wouldn't be crazy data or some of the small sample problems we've seen here (see the Manchester derby predictions).

The first test compares how many home wins, draws and away wins the model predicts with what actually happened. Here are the results:

Type   Pred.   Actual
Home   16456   16479
Draw    9565    9444
Away    8756    8853

As you can see, it's off by a bit but quite close when you think of them in percentage terms. Using the Chi-Square goodness of fit test, I got a p-value of 0.267. This indicates a good fit.
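The p-value can be reproduced from the table with a few lines. This sketch treats the predicted counts as the expected values and uses 2 degrees of freedom (three categories minus one); with 2 degrees of freedom the chi-square tail probability has the closed form exp(-x/2), so no statistics library is needed. The predicted and actual totals differ by one match, presumably from rounding, which is ignored here.

```python
from math import exp

predicted = {"home": 16456, "draw": 9565, "away": 8756}
actual = {"home": 16479, "draw": 9444, "away": 8853}


def chi_square_statistic(expected, observed):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((observed[k] - expected[k]) ** 2 / expected[k] for k in expected)


def p_value_two_df(statistic):
    """Upper-tail probability of a chi-square variable with exactly 2
    degrees of freedom. The closed form only holds for 2 df."""
    return exp(-statistic / 2)


stat = chi_square_statistic(predicted, actual)
p_value = p_value_two_df(stat)
print(f"chi-square = {stat:.3f}, p-value = {p_value:.3f}")
```

A p-value this far above the usual 0.05 cutoff means the test finds no evidence that the predicted split differs from what happened.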

For the next test, I broke the matches up into groups by how often the model predicted the home team to win (I did the same for draws and away wins and the results were similar). For each group I then compared the predictions to how often the home team actually won to see how accurate the model got the predictions. The results there were quite good. Here's a graph with them. Instead of using bar charts I'm using a line graph as I think it's easier to see where the (slight) differences are:



The groups are every 5%. For example, there were only 26 matches where it was predicted that the home team had less than a 5% chance. These predictions averaged out to 2.96%. The home team actually won 11.5% of the time so you can see that at the bottom of the graph the actual line is a fair bit above the prediction line. It moves back toward the prediction line because the model did better from 5% to 10%.

Other than the very bottom group, the actual results were within a couple standard deviations of the predicted percentage, usually a lot less. Running a similar goodness of fit test on this, I got a ridiculous p-value of .695. This model appears to fit the data incredibly well.
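The grouping test can be sketched as follows. The function names and the sample data are hypothetical; the only real numbers are those of the bottom group (26 matches, 2.96% predicted on average, 11.5% actual, which I am reading as 3 home wins out of 26). The standard deviation uses the usual binomial approximation.

```python
from math import sqrt


def calibration_table(predictions, outcomes, width=0.05):
    """Group matches by predicted home-win probability in bins of `width`
    and compare the average prediction with the actual win rate.

    predictions -- predicted home-win probabilities (0 to 1)
    outcomes    -- 1 if the home team won, 0 otherwise
    Returns {bin index: (matches, mean prediction, actual win rate)}.
    """
    n_bins = round(1 / width)
    bins = {}
    for p, won in zip(predictions, outcomes):
        index = min(int(p / width), n_bins - 1)  # p == 1.0 goes in the top bin
        bins.setdefault(index, []).append((p, won))
    table = {}
    for index, rows in sorted(bins.items()):
        n = len(rows)
        table[index] = (n, sum(p for p, _ in rows) / n, sum(w for _, w in rows) / n)
    return table


def deviations_from_prediction(n, mean_prediction, actual_rate):
    """How many binomial standard deviations the actual rate is from the
    predicted rate for a group of n matches."""
    sd = sqrt(mean_prediction * (1 - mean_prediction) / n)
    return (actual_rate - mean_prediction) / sd


# Made-up data: two long shots and three matches predicted around 43%.
table = calibration_table([0.02, 0.04, 0.42, 0.44, 0.43], [0, 1, 1, 0, 0])

# Bottom group from the post, assuming 11.5% means 3 wins out of 26.
z_bottom = deviations_from_prediction(26, 0.0296, 3 / 26)
print(f"bottom group is {z_bottom:.2f} standard deviations above the prediction")
```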

Is This Model Perfect?

If it were then quite frankly I would move to Vegas and be rich in a season. Unfortunately for me it's not that simple. I am extremely happy with the model and predictions it makes, but it certainly has limitations. The problem is that it doesn't take into account outside factors, most notably injuries. These things, when you have a huge data set, cancel each other out. Say you have a match where the model predicts the home side will win 40% of the time. That 40% is really an average. Sometimes the home team will have a star player with an injury and really be more like 30% to win, other times they will be healthy and the visiting side will have injury or suspension issues. In these spots they might be 50%. Over the whole set the injuries will even out so the home team will actually win 40% of the time, but looking at individual matches the predictions can be off even if on aggregate they are perfect or close to it.

Conclusion

I am quite happy with the model. It is just as simple as the Poisson model and makes predictions that are as accurate as possible given that simplicity. I feel quite comfortable using it as the base model so for now I will stick to it. Nonetheless, it should be thought of as a baseline and not the be-all-end-all. Some thought is certainly needed to assess how to adjust for factors outside the model such as injuries, suspensions and even things like tactics if some teams possibly match up particularly well against a certain style of opponent or something along those lines.

Monday, October 19, 2009

UEFA Playoff Mini-Preview

The draw took place earlier today. The matchups are:
Ireland - France
Portugal - Bosnia-Herzegovina
Greece - Ukraine
Russia - Slovenia

The playoffs will take place November 14th and 18th. The team listed first plays at home first.

I actually had the teams pretty similarly ranked to the FIFA rankings. From best to worst I had it: France, Portugal, Russia, Greece, Ukraine, Slovenia, Ireland and Bosnia-Herzegovina. Bosnia are an interesting case as they rate the best at scoring but are by far the worst defensively. Russia are the best defensively and I was surprised to see them 6th out of these 8 teams when it comes to scoring. Given the rankings, you could say that Portugal and Ukraine were fortunate while Greece and Ireland were the most unlucky. These rankings and the predictions below are all based on matches from both qualifying and the finals of Euro 2008 and the qualifying matches for this World Cup.

Predictions

I had to add a bit to the PLM, which I'll write more about later today or tomorrow, in order to get predictions because of the format. The playoff round is played out like the Champions League knockout round. Each team plays a home game, the scores are added together and if a team is ahead at that point they go through. If the two teams have the same number of goals then the team with the most away goals advances (no idea why they say they count double as that's confusing). If the scores in each leg are the same, so the teams are tied with the same number of away goals, then they go to extra time and penalties. The away-goals rule counts in extra time as well so if both teams score a goal then the away team that leg goes through.

A caveat about these predictions: they assume that in the case of extra time and penalties each team is equally likely to win. That's probably not realistic as it seems like the home team in the second leg is more likely to win in those situations. I went with a coin flip because I'm not sure what assumption to make. That's something I'll get into later before I do the full previews for these playoffs. Therefore, you should probably adjust things slightly (half a percent or so) toward the team listed second, which plays the second leg at home. Also keep in mind that injuries are not taken into account.
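Here is a minimal sketch of the two-legged logic with the coin-flip assumption. It is not the actual addition to the PLM: it uses plain independent Poisson scores for each leg, the average goals passed in are hypothetical, and anything that is still level after away goals is simply split 50/50 instead of modeling extra time.

```python
from math import exp, factorial


def tie_outcome(a_home_goals, b_away_goals, b_home_goals, a_away_goals):
    """Decide a two-legged tie after 180 minutes. Team A hosts the first leg.

    Returns "A" or "B" if the tie is settled on aggregate or away goals,
    or "level" if it would go to extra time.
    """
    a_total = a_home_goals + a_away_goals
    b_total = b_away_goals + b_home_goals
    if a_total != b_total:
        return "A" if a_total > b_total else "B"
    if a_away_goals != b_away_goals:
        return "A" if a_away_goals > b_away_goals else "B"
    return "level"


def poisson_pmf(goals, mean):
    return exp(-mean) * mean ** goals / factorial(goals)


def advance_probability(a_home, b_away, b_home, a_away, max_goals=10):
    """Chance that team A advances, given the average goals for each side
    in each leg. Level ties are treated as a coin flip (the assumption
    described above)."""
    means = (a_home, b_away, b_home, a_away)
    pmfs = [[poisson_pmf(g, m) for g in range(max_goals + 1)] for m in means]
    p_a = 0.0
    total = 0.0
    for g1, p1 in enumerate(pmfs[0]):
        for g2, p2 in enumerate(pmfs[1]):
            for g3, p3 in enumerate(pmfs[2]):
                for g4, p4 in enumerate(pmfs[3]):
                    p = p1 * p2 * p3 * p4
                    total += p
                    outcome = tie_outcome(g1, g2, g3, g4)
                    if outcome == "A":
                        p_a += p
                    elif outcome == "level":
                        p_a += 0.5 * p
    return p_a / total  # renormalize the mass cut off by max_goals


# Hypothetical average goals: A scores 1.5 at home and 1.0 away, same for B.
print(f"evenly matched tie: {advance_probability(1.5, 1.0, 1.5, 1.0):.3f}")
```

With identical teams the coin flip makes the tie exactly 50/50, which is why any real edge for the second-leg home side has to be added by hand.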

Ireland - France:
Ireland: 38.9%
France: 61.1%

Portugal - Bosnia-Herzegovina:
Portugal: 70.5%
Bosnia-Herzegovina: 29.5%

Greece - Ukraine:
Greece: 55.1%
Ukraine: 44.9%

Russia - Slovenia:
Russia: 58.4%
Slovenia: 41.6%

I'm surprised how close a lot of these are to be honest. It shows how much variance there is in a two-legged tie.

Saturday, October 17, 2009

East Lancashire Derby: Blackburn - Burnley

By reader request, here's a preview of an interesting match, the East Lancashire Derby. If you are like me before doing research for this article, you aren't very familiar with this rivalry. While it has been overshadowed by the Merseyside, Manchester, Northeast and various London derbies, it is important historically.

Geography

I'll go ahead and start with the geography of it since many of you might not be familiar enough with England to know where these towns are. As you could guess by the name of the derby, Blackburn and Burnley are located in Lancashire, in Northwest England. The towns are quite close together - according to Google the grounds for each team are just a 15 mile (24 km) drive from each other. That's a bit longer than most derbies but it's quite close nonetheless.

History

History is what sets this rivalry apart. It actually goes beyond just the history of the derby and is more about the history of the game itself. There were clubs before, but the first season of the Football League was 1888-1889. Twelve clubs took part. They were, in order of how they finished, Preston North End, Aston Villa, Wolverhampton Wanderers, Blackburn Rovers, Bolton Wanderers, West Bromwich Albion, Accrington, Everton, Burnley, Derby County, Notts County and Stoke City. You could make a great case that this is the oldest derby in football.

The first East Lancashire derby took place at Turf Moor, the stadium in Burnley, on November 3, 1888. Blackburn crushed their neighbors 1-7. Later that season Blackburn won again, 4-2 at Ewood Park. An interesting side note is that both clubs still play in their original stadia so Sunday's match will take place at Ewood Park.

Unlike the more famous derbies, these teams have missed each other due to playing in different divisions. Burnley had serious problems in the 1970s and fell from the top flight all the way down to the fourth division. They have since recovered and this is their first season in the Premiership. They were last in the top flight in 1975-1976. Meanwhile, Blackburn primarily bounced between the first and second flight. As a result, they have mainly met in cup matches and in lower divisions. This is the first time the two clubs have met in the top flight since 1966. Indeed the last time the clubs met in league play at all was the 1982-1983 season. They last played in the 2004-2005 FA Cup. Blackburn won that meeting 2-1.

The time off has not made things less heated. This derby is considered one of the most intense. Like the steel city derby in Sheffield, this is probably more passionate than those of the big clubs because virtually all of the fans are from the area or at least have ties there. Because the clubs are so close together and the fanbases local, Blackburn fans know and work with Burnley fans and vice-versa. The Independent has an interesting read; former players and managers discuss the derby. It sounds pretty intense even compared to other such matches.

The results through time have stayed very close. The teams have met 91 times. Blackburn have 39 wins, Burnley have 37 and 15 times things finished even.

Analysis of Season Performance

A couple things stood out to me when I looked at their results so far. The first is probably just due to the small sample of 8 and 7 matches. Neither team has a point away from home. Combined they have 7 away losses. Burnley have yet to score an away goal and have conceded 14 in four away matches. At home both are much better. Burnley have 4 wins in the same number of matches and Blackburn are a respectable 2-1-1. Based on that, Blackburn look to have a big edge. Having said that, who knows given that it will be so heated at Ewood Park and the clubs are so close together.

The other thing that struck me is how tough the schedule has been for Burnley, particularly away from home. As I said they are 0-0-4 with 14 goals against and not one scored. That looks absolutely terrible and if I only told you that you'd probably say they will be playing in the Championship (odd name for the second flight if you ask me) next season. They opened the season at Stoke City, who will probably finish mid-tablish. Their other three away fixtures were just brutal: Liverpool, Chelsea and Spurs. In light of that, their away record doesn't look all that bad. Their home schedule has been pretty tough as well; the big upset of the young season was Burnley's 1-0 win over Manchester United. I think it's pretty safe to say they've played the toughest schedule in the league. Given that, they are in pretty good shape at this stage.

The models I use, Poisson for the rankings and the PLM for the predictions, take schedule into account. They rate these teams pretty similarly. At scoring Blackburn are slightly higher in 12th with the Clarets 14th. Flip that around for defense; Burnley are 16th best and Rovers 17th. Overall the model puts Blackburn 16th and Burnley one spot higher at 15th.

Predictions

As you can see from the above, the PLM has these teams very close. In fact, if they were playing at a neutral site it would have each equally likely to win. Because Blackburn are playing at home, they naturally have an edge. The PLM predicts that they have a 47% chance, Burnley a 24% chance and there is a 29% chance of a draw. 1-0 and 1-1 are the two most likely scorelines.

Something I will work on in the future is how well rivalry matches can be predicted compared to normal matches. If you believe the pundits then they can't be at all. I think that's over the top, but there certainly are differences compared to normal matches. I'm not sure I've gotten a single guess correct, but I'll go with a 2-1 win for the home side.

Saturday Preview: Valencia - Barcelona

The biggest match this weekend in Europe is Valencia - Barcelona. This is pretty similar to Sevilla - Real Madrid the previous matchweek. For Barça it is the first true test of the season. They have rolled through their previous six matches. All are wins, by an average of over 2 goals. On the other side Valencia are similar to Sevilla in that they are a club that has title aspirations but isn't expected to beat the two big clubs. While it's only one match, given that the Che find themselves 7 points back right now a loss might well end their title hopes even at this early stage, or at least feel that way.

The match kicks off at 22:00 local (only in Spain would matches start at 10!). Spain is in the Central European timezone so subtract an hour if you are in the UK. In the United States you can find it at 4 PM Eastern on GolTV.

Recent History

Over the last 10 seasons, Barcelona have 9 wins, Valencia 6 wins and they have played 5 draws. In Mestalla, Valencia have 4 wins, Barcelona 5 wins and they played to 1 draw. It's been pretty close to even when the two have played in Valencia.

Form

Because La Liga is only 6 matches into the season, I'll just go over league results to this point.

Barcelona are an impressive 6-0-0 with 17 goals for and 3 against. Their biggest win so far is probably a 5-2 win over Atletico de Madrid at home. The teams they have beaten aren't impressive, but the scorelines are: 3-0, 0-2, 5-2, 1-4, 0-2 and 1-0. Almeria were the nil in that last scoreline. They played a very defensive style and clearly went out going for the point. I don't see Valencia trying that.

Valencia are 3-2-1 with 12 goals for and 9 against. Their biggest win was 2-0 over Sevilla the first weekend of the season. After beating Sevilla, they won 2-4 in Valladolid and since haven't been all that great with 1 win, 2 draws and a 3-1 loss at Getafe in there.

Injuries

Barça have some injury problems, but on the bright side have some players recovering. Henry is likely still out with a muscle strain. Ibrahimovic took a hit to the knee in Sweden's qualifying match Wednesday, but is expected to play. Iniesta seems to be fully healthy now after having had thigh problems all season. It looks like Bojan is back from a similar problem. The biggest issue for Barça is probably the effects of traveling and playing in the midweek in internationals for many of their players.

Valencia have two big injuries. Carlos Marchena, who has been a fixture in either central defense or as a defensive midfielder, is still out. Up top they may be without star striker David Villa. Villa tore a thigh muscle two weeks ago against Racing de Santander. There are rumors that he will and others saying he won't start. I think most likely he'll play some role but won't be 100%.

Playing at Home and Away

Valencia last season were a very strong team at home, third only behind the big two. They went 12-4-3 and had an impressive +21 goal differential. In contrast they were not very good away, only 8th best in the league. Playing in Mestalla was very important for them. Needless to say, Barcelona rolled last year both home and away. They were almost as good on the road as at home - 14-3-2 at home and just 3 points back at 13-3-3 away.

Scoring and Conceding Rankings

Last season, Barça were the best team at both scoring and defending. They scored 105 goals and only conceded 35. Valencia were the 4th best attacking team, with 68 goals. Their problems were more at the defensive end - they conceded 54 goals, good for 8th overall. This offseason not too much changed for the two sides, so I don't see any big reason why those figures should be a lot different.

Once again, I'll give the usual caveat that only 6 matches into the season these rankings aren't fully reliable because there is a lot of variance in the results of such a short period. At this point in the season, Barcelona rank second overall. The problem for the Catalans is really just the schedule; Real Madrid have scored the same number of goals and only conceded one more against a tougher schedule. Barça rate as second best in both scoring and defending - scoring they are behind Real Madrid, defending behind Sevilla. Valencia are significantly further down in both. The Che rank 6th best at scoring and are near the bottom, 16th, defensively. The model puts them there because the number of goals they've conceded, 9, is right in the middle but their schedule has been relatively easy in terms of how good at scoring their opponents have been. Overall Valencia rate as the 7th best team in the Primera.

Predictions

The PLM gives Barcelona an edge, saying the Catalans have a 57% chance to win, Valencia have just 17% with a 26% chance of a draw. Again, this may not be reliable due to the small sample sizes involved. Looking at the two clubs, it seems just about right. Without Marchena, Valencia haven't looked strong defensively and that's certainly bad news when playing Barça. While sitting back and maybe hoping Mata or Silva can create something on the break seems like a good strategy, I'm not sure Valencia are willing to go negative, especially at home. No result would surprise me in this one, but I think Barcelona are certainly favorites to win. I think they will by something along the lines of 1-3.

Wednesday, October 14, 2009

WCQ - Last Matchday Report and Thoughts

For the most part the day was pretty boring considering it was the last day of full qualifying in Europe and the Americas.

UEFA

The schedule was unfortunate for Europe and there was little drama. I watched Poland - Slovakia and it was entertaining and tense at a couple points. There was a lot of snow on the pitch and it snowed throughout the match. Slovakia needed the win to qualify automatically. They went out front on an early own goal. The rest of the match Poland dominated. The second half Slovakia camped in their own end and they looked like a weak team trying to hold out for a win or draw against a much superior opponent instead of a team trying to win to take first in their group. Poland had multiple great chances but Slovakia held on largely due to great play by their central defenders and a couple nice saves by their goalie.

That was it for drama. Switzerland and Israel played to a 0-0 draw. I caught a bit of it and from what I saw it looked as boring as the score would indicate. I like good defensive football more than the average fan but I'll opt out of watching Switzerland next summer if there is another match going on at the same time.

CONMEBOL

Uruguay - Argentina was a pretty big disappointment for me. I expected a lot of chances at both ends but it wasn't a particularly entertaining game to see. Chile actually playing and beating Ecuador took a lot of excitement out as it meant the loser of this match (Uruguay in the case of a draw) would still make the playoff, which they'd be favored to win over Honduras or Costa Rica. I thought Uruguay outplayed Argentina but it's not like they got chance after chance that they failed to convert or anything. The most amusing thing was Diego's post-game press conference where he told reporters to "keep sucking it", with "it" referring to a certain body part that only half the population has. Curiously the most common slang term for "it" in Spanish is feminine. I've never understood why that would be.

CONCACAF

CONCACAF was the only confederation that delivered when it came to drama. To be honest I wasn't expecting it at all. Coming in Costa Rica needed to win in Washington DC or have Honduras fail to do the same in El Salvador. Before yesterday I thought it possible that the Ticos could get a win over the US due to a lack of motivation on the American side. After Charlie Davies was seriously injured in an accident that killed another passenger and split the SUV in two, I thought the US would be more motivated and that Costa Rica would have no chance. I was apparently wrong and I think it went the other way. Other than Jozy Altidore, the US looked asleep out there at the start. They got a couple chances, but allowed Costa Rica two goals with bad defending on both of them. Meanwhile in El Salvador, the Hondurans took care of business and won 0-1. In the second half, Costa Rica went defensive and the US looked stronger, pulling one back. After a crazy sequence in which the manager and assistant were both sent off by the referee, he added five minutes of stoppage time. In the fourth, the US scored on a header.

There was a lot of emotion in RFK stadium even though the match result didn't affect the US, but that was nothing compared to what was going on in Honduras and the stadium in El Salvador. Because the US scored the last-minute goal, Honduras went from the playoff to automatically qualifying for South Africa. For Honduras it's the first time they've played in the World Cup finals since Spain '82. Here is a Honduran radio clip. It starts at the end of their match, the US goal comes 21 seconds in. The pure joy coming through the mic is impressive.

Now What?

These 23 teams have qualified for the finals:
South Africa (hosts)
Ghana
Ivory Coast
Australia
Japan
North Korea
South Korea
Denmark
England
Italy
Germany
The Netherlands
Serbia
Slovakia
Spain
Switzerland
Honduras
Mexico
The United States of America
Argentina
Brazil
Chile
Paraguay

There are three African groups that have not been decided yet, 4 UEFA places go to the playoff winners, one spot goes to the winner of the Uruguay - Costa Rica playoff, and the other the winner of New Zealand - Bahrain.

For UEFA, the 8 teams are paired up and play a two-legged tie with the winner advancing to the finals. The draw for that is seeded, meaning the 4 best teams (according to FIFA rankings) are put in one pot and the remaining four in the other. The pots will almost certainly be Russia, France, Greece and Portugal in one and Ukraine, Ireland, Slovenia, Bosnia in the other. Friday the FIFA rankings come out so we'll know for sure and the draw is Monday.

I have pretty seriously neglected Africa in this blog. They use a group format with 5 groups of 4 in the final round. To qualify one has to win the group. The Ivory Coast, Côte d'Ivoire if you like, and Ghana have already clinched their groups, but the other three are at least mathematically open going into the final match.

New Zealand and Bahrain already played the first leg of their playoff tie. It was a 0-0 draw in Bahrain. 0-0 in the first leg is a good result for the away team; historically teams that play to a goalless draw away tend to win the home-and-home over half the time. While New Zealand has an edge at the moment, it's obviously far from over at this stage.

Uruguay and Costa Rica are likely both disappointed as both teams could have clinched automatic berths with a win today.

The final African matchday, the second leg of New Zealand - Bahrain and the first leg of the other playoffs will take place November 14th. The second leg of all playoffs except New Zealand - Bahrain happens four days later on the 18th.

CONMEBOL Final Matchday Preview

Unfortunately I won't be able to give this region, and in particular Uruguay - Argentina anywhere near the writeup they deserve.

I'll go backwards and start with the PLM prediction and its validity. The model predicts the following for each team:

Argentina:
4th - 34.9%
5th - 61.0%
6th - 4.2%

Uruguay:
4th - 65.1%
5th - 32.5%
6th - 2.4%

Ecuador:
4th - 0.01%
5th - 6.55%
worse - 93.44%

In other words, it gives Uruguay about a 65% chance to win tonight and Argentina a 35% chance at a result. In the other match, it gives Ecuador about a 7% chance of getting a win.

I'll start by discussing the last prediction. I think it's way off but it's difficult to say exactly how much. Chile's motivations are at best going to be somewhat lacking and at worst they may actually prefer to lose! While I'm sure it's not the entire Chilean population, I think it's safe to say that most Chilenos aren't too fond of Argentina and wouldn't mind making their life difficult by letting Ecuador have these three points. The players will talk about being professionals and the like, but even if they don't actively try to lose their mixed emotions will surely play some role.

Some may be surprised that the model predicts a win for Uruguay nearly 2/3 of the time. Personally, I think it's pretty reasonable if you remember that the model is based only on results. In other words, it doesn't see the apparent gap in talent in favor of Argentina, it just sees their struggles through their bad scorelines. Because this is the last matchday, the two teams have played the same schedule and have played each other once. That match was in Argentina, so Argentina have actually faced a slightly easier schedule than the Uruguayos. Uruguay have scored 28 goals to 22 for Argentina, a difference of more than a third of a goal per match. That's a big difference. In a full league season like in the English Premier League or Spanish Liga, that would translate to over 13 more goals scored. On the defensive side, Argentina have conceded 20 and Uruguay 19. That's not much of a difference, but again keep in mind that Argentina's schedule is a bit easier as well.
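The goal arithmetic above is easy to check. This sketch assumes each side has played 17 of its 18 qualifiers (not stated in the post, but that is the CONMEBOL format with one matchday left) and a 38-match league season.

```python
MATCHES_PLAYED = 17   # assumption: 18-match campaign, one matchday remaining
LEAGUE_SEASON = 38    # matches per team in the Premier League or La Liga

uruguay_scored = 28
argentina_scored = 22

difference_per_match = (uruguay_scored - argentina_scored) / MATCHES_PLAYED
difference_per_season = difference_per_match * LEAGUE_SEASON

print(f"per match: {difference_per_match:.3f}")
print(f"over a league season: {difference_per_season:.1f}")
```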

So looking at just the results of this qualifying campaign we have a team that is a lot better at scoring and very close but maybe a bit better defensively than their opponent. Also important, the team that is better at both ends of the pitch is playing at home. In that kind of situation, the home side is a big favorite not only to get points but to win outright.

The natural objection to the above is that Argentina will be better this match because it matters so much. My initial thought was similar - "they have to turn it around at some point!". Thinking about it more rationally though, there isn't much evidence of this. Their last two matches have been very important and they failed to impress in either. Against the worst team in the confederation they won in the dying seconds on a goal that should have been waved off for offside. Granted, the conditions were absurd because of the weather and they were the better team on the day but failing to dominate and win by 2 or more goals there is a pretty bad sign. A month ago they got thoroughly outplayed by Paraguay and were lucky to lose by only a goal. They certainly have the skill, and therefore the potential, to take over and dominate against Uruguay, but I see no reason to think that Argentina should be a favorite to get any result. It's certainly not there when looking at the results alone and less so from watching them play.

My Prediction

As I said above, I agree with the PLM that Uruguay win this game more often than not. I also expect Ecuador to win suspiciously comfortably in Chile. Combining the two, if you want a single prediction out of me I'll say that Uruguay go through automatically and Ecuador get to the playoff against Costa Rica or Honduras. That leaves Argentina out and further tarnishes the reputation of the greatest footballer ever to live.

Whether that all comes about or not, you can expect a lot of drama.

UEFA - Final Matchday Preview

I meant to get this out sooner but yesterday had car trouble and today had to scramble to get a battery as the old one had died.

To be honest, tonight doesn't look to be very exciting in Europe. The way the schedules worked out, with one exception every important match is a team in contention playing a team that is eliminated, effectively if not mathematically. The only exception is Switzerland - Israel.

Here I will only discuss the groups that have something at stake.

1.
Portugal - Malta
Sweden - Albania
Denmark - Hungary

Portugal are in the playoff with a win. If they somehow don't win then Sweden are in if they win. Hungary have a mathematical chance but would need to crush Denmark and have Malta crush Portugal. Portugal should have an easy win.

(correct!) PLM Probabilities:
1st: Denmark 100%
2nd: Portugal 95.88%, Sweden 4.12%

2.
Switzerland - Israel
Greece - Luxembourg
Latvia - Moldova

Switzerland clinch the group with a result against Israel. Greece guarantee themselves at least a playoff with a win. Israel could still make the playoff with a win and a Greece draw or loss. Latvia are still technically in it but they would need to win, have Greece lose, have Israel draw or lose and have their scoring margin and Luxembourg's over Greece add up to at least 7.

PLM Predictions:
1st: Switzerland 78.91%, Greece 21.09%
2nd: Greece 77.04%, Switzerland 21.09%, Israel 1.86%, Latvia 0.01% (1 out of 10,000 simulations)

3.
San Marino - Slovenia
Poland - Slovakia
Czech Republic - Northern Ireland

Assuming Slovenia beat perhaps the worst national team in the world, Slovakia will need a win in Poland to secure the group, otherwise it's the playoff for the Slovaks. I don't expect motivation to be an issue for Poland because they are playing at home. The Czechs are in with a win if San Marino can pull off a miracle and get a point against Slovenia.

PLM Predictions:
1st: Slovenia 70.45%, Slovakia 29.55%
2nd: Slovakia 70.45%, Slovenia 25.66%, Czech Republic 3.89%

6.
Andorra - Ukraine
Kazakhstan - Croatia

See Portugal and Sweden above. If Ukraine win then they are in the playoff, if Croatia win and Ukraine fail then Croatia go instead. Ukraine should win but stranger things have happened.

PLM Predictions:
1st: England 100%
2nd: Ukraine 91.48%, Croatia 8.52%

Sunday, October 11, 2009

A Correction and Apology (edited)

After a discussion about the CONMEBOL predictions (which I'll post shortly) on a message board, I discovered a programming problem that made some of my predictions off. When I gave individual match predictions they were correct as I just used the spreadsheet directly, but the simulations were off. That meant that qualification percentages, for automatic or the playoff, were incorrect. The MLS article used a different version of the program that was correct; this applies only to the WCQ articles.

In the future I will put more effort into testing a new program when I change models. I am sure that the version I am now using is correct, so my predictions for Wednesday will be fine.

My apologies.

UEFA Playoff Seeding

With much controversy, two weeks ago FIFA announced that the draw for the playoff round for UEFA would be seeded according to the FIFA ranking this month. This is quite beneficial to the countries with a high ranking and bad for those that are ranked lower.

Here is a list of teams that could be in the playoff round, by FIFA ranking. I'm eliminating those that need crazy scenarios to make it. Teams in bold are certainly in, FIFA ranking within UEFA in parentheses.
Russia (5)
Croatia (7)
France (8)
Greece (9)
Switzerland (11)
Portugal (13)
Israel (16)
Ukraine (17)
Republic of Ireland (23)
Sweden (24)
Slovakia (26)
Bosnia-Herzegovina (27)
Slovenia (29)

Switzerland, Greece and Israel are in the same group. Croatia aren't very likely to make it, they would need for Andorra to get a result against the Ukraine. Similarly, Portugal are in with a win over Malta so they likely will make it. This makes the pots likely to be:

Seeded:
Russia, France, Greece/Switzerland/Israel, Portugal

Unseeded:
Ukraine, Ireland, Slovakia/Slovenia, Bosnia-Herzegovina

If Croatia manage to qualify then they would be seeded. In that case Portugal would be the odd team out if either Greece or Switzerland make the playoff, and Israel would be the odd team out if they make it instead. For Ukraine to get a seed, Portugal would need to fail to qualify. That's all assuming that the rankings don't change. Even if they do, the only way the seeding could change barring a massive shift in the rankings, which I don't see a reason for, would be for the Ukraine to jump ahead of Portugal. Right now the Ukraine are 22nd overall and Portugal 17th with Ukraine 36 points behind. Teams do have shifts that big so it's possible.
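The seeding rule itself is just a sort: order the eight playoff teams by ranking and put the top four in the seeded pot. A small sketch using the within-UEFA rankings listed above (the function name is made up, and the two sets of playoff teams are just example scenarios):

```python
# FIFA ranking within UEFA, taken from the list above.
UEFA_RANK = {
    "Russia": 5, "Croatia": 7, "France": 8, "Greece": 9, "Switzerland": 11,
    "Portugal": 13, "Israel": 16, "Ukraine": 17, "Republic of Ireland": 23,
    "Sweden": 24, "Slovakia": 26, "Bosnia-Herzegovina": 27, "Slovenia": 29,
}


def seed_pots(playoff_teams, ranks=UEFA_RANK):
    """Split eight playoff teams into seeded and unseeded pots by ranking."""
    if len(playoff_teams) != 8:
        raise ValueError("the playoff round has exactly eight teams")
    ordered = sorted(playoff_teams, key=lambda team: ranks[team])
    return ordered[:4], ordered[4:]


# One likely scenario: Greece, Portugal, Ukraine and Slovenia finish second.
likely = ["Russia", "France", "Greece", "Portugal", "Ukraine",
          "Republic of Ireland", "Slovenia", "Bosnia-Herzegovina"]
seeded, unseeded = seed_pots(likely)
print("Seeded:", ", ".join(seeded))
print("Unseeded:", ", ".join(unseeded))

# If Croatia take second place instead of Ukraine, Portugal drop out of the seeds.
with_croatia = [t if t != "Ukraine" else "Croatia" for t in likely]
seeded_cro, unseeded_cro = seed_pots(with_croatia)
```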

North and South American Scenario Updates

As I said in the UEFA article, I'll write more on both of these confederations in the next couple days.

It was a big day for both CONCACAF and CONMEBOL. North of Panama, the two big countries clinched spots in South Africa with impressive wins. Mexico rolled with a 4-1 win over El Salvador. The United States didn't look great, at least from what I read as I only caught the last five minutes due to traveling, but still managed to win away from home over a decent opponent. Costa Rica came up huge with a 4-0 win over Trinidad and Tobago. Wednesday will decide which of Costa Rica and Honduras go to the playoff or automatically qualify.

CONMEBOL is messier. Chile joined Brazil and Paraguay as automatic qualifiers from South America with their 2-4 win in Colombia. The loss officially eliminated the Colombians. In two very exciting matches, last-minute goals were extremely important. Argentina beat Peru 2-1 on a goal by Palermo (offside?) in the 93rd minute. Uruguay won 1-2 on a penalty kick at the very end that came as a result of a breakaway. Here are all the goals. Those goals really changed things.

Here are the scenarios:

CONCACAF is pretty simple. If Costa Rica get a win in Washington DC or Honduras fail to win in San Salvador then Costa Rica automatically qualify and Honduras face a playoff against the 5th South American team (see below). If Costa Rica get a draw or loss and Honduras win then the Hondurans are in automatically and the Ticos face a home-and-home with the South American team.

In CONMEBOL, 3 teams are battling for 1 automatic spot and 1 playoff spot. Two of them play when Argentina cross the Río de la Plata to face Uruguay. The third team is Ecuador, who play in Chile. Ecuador are currently the odd team out but are guaranteed a playoff spot if they win. They theoretically could qualify automatically, but would need to beat Chile by five or more goals and have Uruguay - Argentina end in a draw. Barring a ridiculous scoreline like Uruguay 0 - 13 Argentina, Ecuador are eliminated with anything less than a win. A win for Uruguay puts them in automatically. With a draw or loss, they will need Ecuador to fail to win in Chile. For Argentina, a win or draw puts them in the finals. With a loss, they would need Ecuador to draw or lose to make the playoff.

In summary, here's what has to happen for each to be eliminated:
Ecuador - fail to win in Chile
Uruguay - fail to win vs Argentina and Ecuador win
Argentina - lose to Uruguay and Ecuador win

I'll write more on this later, but I don't think that last scenario is unlikely at all and I'd be worried if I were a fan of Argentina. It seems likely that Chile doesn't put up much of a fight and Uruguay have looked better than Argentina lately.
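The CONMEBOL scenarios above boil down to a small lookup. This is a minimal sketch that ignores the extreme goal-difference cases (Ecuador winning by five or more, a 13-goal loss for Uruguay) and the Venezuela long shot; the function name and the result codes are made up for illustration.

```python
def conmebol_final_day(uru_arg, ecuador_wins):
    """Fate of the three contenders on the last matchday.

    uru_arg: "URU", "DRAW" or "ARG" for the Uruguay - Argentina result.
    ecuador_wins: True if Ecuador win in Chile, False for a draw or loss.
    Simplification: extreme goal-difference scenarios are ignored.
    """
    if uru_arg == "URU":
        automatic, other = "Uruguay", "Argentina"
    else:
        # A win or a draw in Montevideo puts Argentina in the finals.
        automatic, other = "Argentina", "Uruguay"

    # Ecuador take the playoff spot only with a win; otherwise it goes to
    # whichever of Uruguay and Argentina missed the automatic spot.
    if ecuador_wins:
        playoff, out = "Ecuador", other
    else:
        playoff, out = other, "Ecuador"
    return {"automatic": automatic, "playoff": playoff, "eliminated": out}

for result in ("URU", "DRAW", "ARG"):
    for ecuador_wins in (True, False):
        print(result, ecuador_wins, conmebol_final_day(result, ecuador_wins))
```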

edit: Like Norway, Venezuela are still alive. They make the playoff if they win in Brazil, Argentina beat Uruguay, Ecuador draw or lose and the number of goals that Uruguay lose by and those that Venezuela win by add up to more than 15, or exactly 15 with Venezuela scoring at least six goals more than Uruguay.

Saturday, October 10, 2009

UEFA - Post-Matchday 9 Update

I'll go more into detail on the groups where there is something to play for in the next couple days. Here is the current situation:

Already Qualified:
Denmark
England
Germany
Italy
The Netherlands
Serbia
Spain

Could Go Automatically, Guaranteed at Least Playoff Round
Slovakia
Switzerland

Certainly in Playoff Round:
Bosnia-Herzegovina
France
Republic of Ireland
Russia

Groups 4,5,7 and 8 are completely done. Group 9 has finished playing. Here is the situation in the other groups:

1. Denmark are in. Portugal make the playoff round with a win over Malta at home. Sweden can still make it if Portugal fail to win and Sweden get the win versus Albania.

2. Switzerland are in with any result at home against Israel. If they lose and Greece win versus Luxembourg then the Greeks win the group. Greece are guaranteed at least second if they win or Israel fail to win. If Greece draw or lose and Israel win then Switzerland finish first in the group, getting a spot in South Africa, and Israel finish second and go to the playoff round.

3. Slovakia at least have the playoff round locked up. They qualify automatically if they win in Poland or Slovenia fail to win in San Marino. If Slovakia draw or lose and Slovenia win then Slovenia qualify automatically and Slovakia go to the playoff. The Czech Republic and Northern Ireland are technically still alive, but effectively out. The Czechs make the playoff with a win at home over Northern Ireland and anything less than a win for Slovenia. Northern Ireland get in with a win and a loss by Slovenia where the sum of victory margins is more than 7, for example San Marino 3 - 0 Slovenia and Czech Republic 0 - 5 Northern Ireland.

6. As a result of their win today, the Ukraine make the playoff round with a win at Andorra. If they get a draw or lose then Croatia can make the playoff with a win at Kazakhstan. In the unlikely scenario where the Ukraine lose to Andorra and Croatia get a draw then the Ukraine would make the playoff as long as they lose by 3 or fewer goals.

Norway's chances are slim to say the least. The only scenario where they could make the playoff round is if Sweden draw against Albania and Portugal lose by 4 or more goals to Malta. To cover all bases, they would also make it if Portugal lose by exactly 3 goals and Sweden get a draw where they score at least 5 more goals than Portugal. Finally, it goes to the drawing of lots or a playoff game (for the playoff round) if Portugal lose by 3 goals and Sweden get a draw where they score exactly 4 more goals than Portugal, so Norway could get in if all that happens and Sweden win the random draw or playoff game.

Thursday, October 8, 2009

UEFA Group 4: Russia - Germany Preview

I may be overreacting to Euro 2008, but Russia - Germany is likely to be the best match in all of qualifying. We're getting a possible final in the qualifying round. While neither would be a favorite, I think both teams are in the second tier of countries that have an outside shot at winning the World Cup.

The Situation

Germany currently sit one point ahead of Russia. Both teams are expected to win their last match. Germany host Finland while Russia travel to Azerbaijan. Assuming they both get the win Wednesday, Saturday's match will decide the group. If Russia win then they win the group and qualify automatically. If Germany get a result then they win the group. Both teams are guaranteed a playoff spot even if they lose their last two matches.

History

History is on Germany's side. Russia, as Russia or the USSR, have never beaten Germany or West Germany in a competitive match. I personally think that means nothing as it relates to how this match will turn out, but it is an interesting anecdote given the nonfootball history of the countries. More relevantly, Germany are up a point because they beat the Russians 2-1 in Dortmund a year ago. Here are links to the goals from that match: Podolski Ballack Arshavin. (Sidenote: what's up with the shhh celebration when you score to put your team down a goal?). In that match, Germany dominated the first half, going into the break up 2-0. While Russia were the better side in the second, it wasn't enough to overcome their deficit.

The Ground

A big story related to the match is that it will be played on artificial turf in Moscow at Luzhniki Stadium. That should help the Russians because they will be, at least in theory, more accustomed to the surface plus they play more of a short passing game. I personally think that artificial turf should not be allowed, but maybe it is cheaper so some smaller countries have to go with it. Russia is not such a country and it seems bizarre that they are playing on an artificial surface. I suppose it could have something to do with the snowy and icy conditions in the Russian winter but the forecast for Moscow is for above freezing temperatures.

Injuries

For Russia, there seem to be no injuries that will keep a player out of the match. The biggest injury story is Andrei Arshavin, who has been battling a knee injury. Based on his recent performances for Arsenal, including a goal in the 6-2 thrashing of Blackburn last weekend, I'd say the knee is fine.

Germany are less lucky. They will not have the services of starting goalkeeper Robert Enke who is sick. René Adler is expected to replace him. Adler has little experience at this level, though he did play in the previous match against Russia and recorded clean sheets in qualifiers against Wales and Azerbaijan. Given the intensity of the match and the likelihood of plenty of chances for Russia, it'll be interesting to see how the young keeper handles the pressure.

Predictions

I'll mix it up and give my personal thoughts before going with the objective predictions of my current best model. Russia is a very good attacking team that plays a pretty aggressive style to begin with. They have now mathematically locked up second place and can count on a German win Wednesday over Finland, so a draw does them no good compared to a loss. This will cause them to push even harder to score. Germany are solid at both ends of the pitch so I expect the match to be extremely entertaining with plenty of chances for both teams. For a scoreline prediction, I'll say that Russia win 2-1.

The PLM predicts a lower-scoring, less-exciting affair. Again the usual caveat applies: the model just uses results and does not include differences in motivation and incentives compared to previous matches. As I argued above, Russia get little from a draw and a lot from a win so they should be more attack oriented than they have been in previous matches. With that said, the model predicts a 36.5% chance of a win for Russia, a 27.7% chance of a German win and a 35.8% probability of a draw. The most likely scorelines are 1-0 and 1-1, each with about a 15% likelihood. This is in line with the overall prediction, which is Germany to go through top 74% of the time and Russia 26%.
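For anyone curious how a results-based model can produce win/draw/loss odds and a most likely scoreline, here is a minimal sketch. It assumes each side's goals follow an independent Poisson distribution, which is a common simplification and not necessarily what the PLM does. The scoring rates (1.1 and 0.9 goals per match) are invented for illustration and are not the PLM's numbers for Russia and Germany.

```python
from math import exp, factorial

def poisson_pmf(k, rate):
    """Probability of exactly k goals when goals arrive at the given mean rate."""
    return exp(-rate) * rate ** k / factorial(k)

def match_probabilities(home_rate, away_rate, max_goals=10):
    """Sum scoreline probabilities into home win, draw and away win.

    Assumption: home and away goals are independent Poisson counts.
    Scorelines above max_goals per side are ignored (a negligible tail).
    """
    home_win = draw = away_win = 0.0
    best_score, best_prob = None, -1.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, home_rate) * poisson_pmf(a, away_rate)
            if h > a:
                home_win += p
            elif h == a:
                draw += p
            else:
                away_win += p
            if p > best_prob:
                best_score, best_prob = (h, a), p
    return {"home": home_win, "draw": draw, "away": away_win,
            "most_likely": best_score, "most_likely_prob": best_prob}

# Hypothetical scoring rates, for illustration only.
print(match_probabilities(1.1, 0.9))
```

Low scoring rates like these naturally produce a lot of draws and a 1-0 or 1-1 type of most likely scoreline, which is the general shape of the prediction above.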

I don't have strong feelings either way about either of these national teams, but Russia is in my opinion one of the most entertaining teams to watch play so I am hoping for a home win.

CONMEBOL - Matchday 17 Preview

Like the other confederations, it's getting down to the wire in the CONMEBOL with the final two matchdays this week. Here's the current table.



Here's a quick breakdown for each team:
Brazil and Paraguay are 100% qualified.

Chile need a win or two draws to get there mathematically. They could also get in Saturday if both Argentina and Venezuela fail to win their matches. In practice, it may be possible for Argentina to catch them if Maradona's boys get two wins and Chile a draw and a loss but it would depend on the scorelines. Right now Chile are up 6 in goal differential so if their loss and the Argentina wins are by multiple goals then they could drop. In this bad scenario, Ecuador would also have to beat Uruguay Saturday to drop the Chilenos down to the playoff spot. The playoff or better is all-but guaranteed. Chile are certain to finish above either Ecuador or Uruguay so they could only be eliminated in sixth if Argentina and Venezuela catch them. Right now Chile are 6 points ahead of Venezuela and up 12 goals in goal differential so Venezuela passing them isn't realistic. In summary, Chile are effectively in the playoff round right now and one more point quite likely will do the job and two more points or any draw or loss by Argentina puts them through automatically.

Ecuador are in a great spot but have a tough schedule ahead. They host Uruguay and then travel south to Chile on Wednesday. Their fate largely rests in their own hands. If they lose both then they will almost certainly be eliminated, not even making the playoff round. With two wins they are obviously in. With a draw and a win they are guaranteed at least a playoff spot but would depend on the Uruguay - Argentina result to make it automatically. Two draws or a draw and a loss and they do a lot of scoreboard watching. With two draws they could be anywhere from automatically in (if Argentina get a draw or loss Saturday and Uruguay - Argentina ends in a draw) to completely out (if Argentina beat Peru Saturday and Uruguay beat Argentina).

Argentina can guarantee at least a playoff spot with a win in Uruguay Wednesday. To qualify automatically they need for Ecuador to get at least one point fewer in their last two matches and to keep Uruguay from passing them. In other words, they can get in with two wins if Ecuador slip up at all and could get in with a draw and a win if Ecuador get two draws, a win and a loss, or two losses.

Despite being in 6th right now, Uruguay can guarantee a spot in South Africa with two wins. Of potential importance, Uruguay should easily win the tiebreaker with any combination of Ecuador, Argentina, Venezuela and Colombia because they are currently +8 in goal differential. Therefore, a draw against Ecuador and a win over Argentina would put the Uruguayos in at least fifth, fourth if Ecuador fail to beat Chile. A win against Ecuador and a draw against Argentina is not nearly as good; Uruguay would be out if Argentina beat Peru and Ecuador beat Chile.

Venezuela and Colombia can possibly get in but will need a lot of help even to make the playoff. Both face a brutal schedule: Venezuela host Paraguay and then travel to Campo Grande, Brazil while Colombia play at home against Chile and then travel to Paraguay.


Group Prediction

The PLM gives the following probabilities:

Top 4:
Brazil - 100%
Paraguay - 100%
Chile - 99.9%
Argentina - 44.3%
Uruguay - 32.2%
Ecuador - 21.8%
Colombia - 1.4%
Venezuela - 0.4%

5th:
Ecuador - 32.2%
Argentina - 31.5%
Uruguay - 27.2%
Colombia - 6.1%
Venezuela - 2.8%
Chile - 0.1%

Top 5 (automatic or playoff):
Brazil - 100%
Paraguay - 100%
Chile - 100%
Argentina - 75.9%
Uruguay - 59.4%
Ecuador - 54.0%
Colombia - 7.6%
Venezuela - 3.2%

Keep in mind that these predictions do not take into account teams having nothing to play for. As a result, I think Ecuador will be more likely than the model predicts because their match in Chile may be a friendly for the home side.
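As a quick consistency check on the three lists above, the top-5 number for each team should be its top-4 number plus its 5th-place number. The sketch below uses only the percentages already listed; any gap of 0.1 is presumably just rounding in the published figures.

```python
# Percentages copied from the lists above.
top4 = {"Brazil": 100.0, "Paraguay": 100.0, "Chile": 99.9, "Argentina": 44.3,
        "Uruguay": 32.2, "Ecuador": 21.8, "Colombia": 1.4, "Venezuela": 0.4}
fifth = {"Ecuador": 32.2, "Argentina": 31.5, "Uruguay": 27.2,
         "Colombia": 6.1, "Venezuela": 2.8, "Chile": 0.1}
top5 = {"Brazil": 100.0, "Paraguay": 100.0, "Chile": 100.0, "Argentina": 75.9,
        "Uruguay": 59.4, "Ecuador": 54.0, "Colombia": 7.6, "Venezuela": 3.2}

def rounding_gaps():
    """Return teams where top4 + fifth differs from the listed top5 figure."""
    gaps = {}
    for team, listed in top5.items():
        total = top4[team] + fifth.get(team, 0.0)
        gap = round(total - listed, 1)
        if gap != 0:
            gaps[team] = gap
    return gaps

print(rounding_gaps())
```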

Match Predictions

Colombia - Chile:
Colombia win: 34.7%
Draw: 39.1%
Chile win: 26.3%
Most likely scoreline: 0-0

Ecuador - Uruguay:
Ecuador win: 42.1%
Draw: 30.8%
Uruguay win: 27.1%
Most likely scoreline: 1-1

Argentina - Peru:
Argentina win: 81.3%
Draw: 15.4%
Peru win: 3.3%
Most likely scoreline: 2-0

Venezuela - Paraguay:
Venezuela win: 31.4%
Draw: 34.1%
Paraguay win: 34.5%
Most likely scoreline: 1-1

Bolivia - Brazil:
Bolivia win: 7.7%
Draw: 20.0%
Brazil win: 72.4%
Most likely scoreline: 0-2

Wednesday, October 7, 2009

World Cup Qualification: UEFA Group 1 - Matchday 9

Current Situation

Denmark are top of the group with 18 points. Sweden trail with 15 followed by Portugal and Hungary with 13 each. Albania and Malta have been eliminated. In the first tiebreaker, goal differential, Denmark are +11, Sweden +6, Portugal +4 and Hungary +4.

Saturday Denmark host Sweden while Hungary travel to Portugal. Wednesday's matches are very important as well - Denmark play at home against Hungary while Sweden and Portugal host Albania and Malta respectively. We can expect Sweden and Portugal to win, probably big, Wednesday. It would not even be out of the question for Sweden to win by 5+ goals given the potential importance. Unlike some of my previous scenarios where I assumed high scorelines didn't happen, I don't think that can be done here.

Here are the scenarios. They are pretty messy at this point, but obviously it will become much more clear Saturday night.

Denmark will automatically qualify if one of these happens:
- they win against Sweden
- they draw against Sweden and get at least a draw against Hungary

They could also get in other ways with some help. If they draw against Sweden and lose to Hungary then they could finish top if their goal differential remains higher than Sweden and, if necessary, the winner of Portugal - Hungary. Same goes for a loss to Sweden and a win over Hungary - they would then only need to beat Sweden in the tiebreakers.

The Danes could actually be eliminated from the competition. That can happen only if Portugal - Hungary does not end in a draw. If that happens and Denmark lose both matches then they will be eliminated assuming Sweden and Portugal beat their weak opponents. It can also happen if Denmark get a draw and a loss, in either order, and lose the tiebreakers to Sweden and/or Portugal and/or Hungary.
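The Denmark scenarios above can be checked by brute force. This is a minimal sketch that enumerates every win/draw/loss combination of the five remaining matches and looks only at points; tiebreakers are not modelled, so it reports the best and worst position Denmark could have once ties are broken. The function names are made up for illustration.

```python
from itertools import product

# Current points for the four teams still alive (from the situation above).
START = {"Denmark": 18, "Sweden": 15, "Portugal": 13, "Hungary": 13}

# Remaining matches as (home, away): two on Saturday, three on Wednesday.
MATCHES = [("Denmark", "Sweden"), ("Portugal", "Hungary"),
           ("Denmark", "Hungary"), ("Sweden", "Albania"), ("Portugal", "Malta")]

def final_points(results):
    """results: one of 'H' (home win), 'D' (draw), 'A' (away win) per match."""
    pts = dict(START)
    for (home, away), result in zip(MATCHES, results):
        if result == "H":
            gains = {home: 3}
        elif result == "A":
            gains = {away: 3}
        else:
            gains = {home: 1, away: 1}
        for team, gain in gains.items():
            if team in pts:  # Albania and Malta are already eliminated
                pts[team] += gain
    return pts

def denmark_rank_bounds(pts):
    """Best and worst place for Denmark on points alone (ties unresolved)."""
    ahead = sum(1 for t, p in pts.items() if t != "Denmark" and p > pts["Denmark"])
    level = sum(1 for t, p in pts.items() if t != "Denmark" and p == pts["Denmark"])
    return ahead + 1, ahead + level + 1

outcomes = list(product("HDA", repeat=len(MATCHES)))
third_or_worse = [r for r in outcomes if denmark_rank_bounds(final_points(r))[0] >= 3]
print(len(outcomes), "combinations;", len(third_or_worse),
      "leave Denmark third or worse on points alone")
```

Every combination that leaves Denmark third or worse on points has a winner in Portugal - Hungary, which matches the claim above.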

Sweden can only make it automatically with help or by winning tiebreakers with Denmark. If they win Saturday then they can get in with a better result than Denmark get Wednesday or by winning the tiebreakers. Right now they are down 5 goals on goal differential, but they would be down 3 after a one-goal win and down just 1 after a two-goal win Saturday. That plus the much weaker opponent Wednesday means it is far from impossible to catch Denmark in goal differential. A win Saturday plus a 5-goal win over Albania would likely be good enough for automatic qualification. The Swedes could also get in with a draw against Denmark and a Denmark loss Wednesday. Again, they would need to beat Albania by a large margin.
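The goal-differential arithmetic above works because a head-to-head win moves both teams at once: a one-goal Sweden win adds one to Sweden's differential and takes one from Denmark's. Here is a minimal sketch, assuming Sweden and Denmark end level on points and both win Wednesday; the one-goal Denmark win over Hungary in the example is a made-up scenario, and the function names are hypothetical.

```python
def gap_after_head_to_head(gap, chaser_margin):
    """Leader's goal-differential lead after losing to the chaser by chaser_margin."""
    return gap - 2 * chaser_margin

def margin_needed(gap, leader_margin):
    """Smallest win margin that puts the chaser strictly ahead on goal
    differential when the leader wins their own last match by leader_margin."""
    return gap + leader_margin + 1

current_gap = 11 - 6  # Denmark +11, Sweden +6

for saturday_margin in (1, 2):
    gap = gap_after_head_to_head(current_gap, saturday_margin)
    # Hypothetical: Denmark beat Hungary by a single goal on Wednesday.
    needed = margin_needed(gap, leader_margin=1)
    print(f"Sweden win by {saturday_margin} Saturday: gap {gap}, "
          f"need to beat Albania by {needed}")
```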

If Portugal-Hungary does not end in a draw then Sweden may need two wins or 4 points in their two matches plus winning the tiebreaker in order to make the playoff. Should Sweden get a draw and Portugal win Saturday we will have what I think is an unfortunate situation Wednesday where Sweden and Portugal will be trying to pile on as many goals and win by as much as possible trying to get the edge in the goal-differential and goals-scored tiebreakers. That possibility is certainly the downside of basing the first set of tiebreakers on results from the group as a whole instead of head-to-head results, though in this case it wouldn't matter because both matches between Portugal and Sweden were 0-0 draws.

Portugal need to win and get some help. They can still win the group with two wins but it's not likely. The most likely way for that to happen would be for Denmark - Sweden to end in a draw, for Hungary to win in Denmark next Wednesday and to win the tiebreaker over both Sweden and Denmark. Portugal's fate depends a lot on the Denmark - Sweden outcome. If Denmark win then Portugal would make the playoff round with two wins. If it's a draw then Portugal could get in with two wins, but would likely have to win the tiebreak with Sweden. In this scenario they would be going into Wednesday at least even on goal differential with more goals scored. If Sweden win then things are more complicated. Even with two wins they would need either Sweden or Denmark to get a draw or loss in their last match. In both cases the opponent would have nothing to play for. If you are Portuguese you should definitely be cheering for Denmark.

Hungary's situation is very similar to that of Portugal. It is slightly better if you just ask what happens if certain results happen - the Hungarians are at least in the playoff round with two wins if Sweden beat Denmark Saturday.


Group PLM Predictions

The PLM gives Denmark an 83.7% chance of winning the group, Sweden 11.8%, Portugal 3.4% and Hungary 1.1%. The most likely second-place team is Sweden according to the model. The Swedes have a 45.6% chance of finishing second and making the playoff, Portugal 33.1%, Denmark 11.5%, and Hungary 6.1%. Adding them up, Denmark have a 95.2% shot at either winning the group or reaching the playoff, Sweden 57.4%, Portugal 36.5% and Hungary 7.2%. This group has just under a 4% chance of having the worst second-place team.
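The arithmetic behind those totals is easy to reproduce. The sketch below uses only the percentages quoted above; reading the "worst second-place team" figure as whatever is left over after the four second-place-and-playoff chances is my interpretation of how the numbers fit together.

```python
# Percentages copied from the paragraph above.
win_group = {"Denmark": 83.7, "Sweden": 11.8, "Portugal": 3.4, "Hungary": 1.1}
second_and_playoff = {"Denmark": 11.5, "Sweden": 45.6, "Portugal": 33.1, "Hungary": 6.1}

# Chance of finishing first or reaching the playoff, per team.
first_or_playoff = {team: round(win_group[team] + second_and_playoff[team], 1)
                    for team in win_group}

# Whatever is left is the chance that the runner-up misses the playoff
# as the worst second-place team.
worst_runner_up = round(100 - sum(second_and_playoff.values()), 1)

print(first_or_playoff)
print("Worst second-place team:", worst_runner_up)
```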


Saturday's Matches

Denmark - Sweden (20:00 local, 2:00 PM Eastern United States)


First off, if you didn't follow Euro 2008 qualifying, you definitely want to check out what happened the last time these teams met in Copenhagen. The hotly contested match was 3-3 in the 88th minute when the referee awarded a Sweden penalty and sent off Christian Poulsen after talking to the linesman, who saw Poulsen strike a Sweden player in the midsection. A fan then ran onto the pitch in an apparent attempt to attack the referee, though he was stopped by a couple nearby Danish defenders. The four referees got together and decided to abandon the match. Here is a youtube link, here is the wikipedia page about the incident. I'm sure both of these sides would love to severely damage the other's qualification chances.

Form and Injuries

Form probably doesn't mean much given that these teams last played nearly a month ago. Sweden have won 3 matches in a row since losing 0-1 to Denmark in June. Denmark have two draws in their two matches. Their last match was particularly bad as it was a 1-1 draw at Albania. They could have all-but sealed a trip to South Africa with a win so the draw is certainly bad news.

For injuries, there is little to report. Daniel Agger has not played a match since playing in Sweden for Denmark due to a back injury. He was expected to play, but sat out training yesterday. It seems unlikely that he'll go 90 minutes if he does play.

PLM Predictions

The PLM predicts a match that is likely to be low-scoring. It gives Denmark a 48% chance to win, Sweden a 17% chance with a draw happening 35% of the time. The most likely scoreline according to the model is a 1-0 win for Denmark which will happen with 20% probability. I'm inclined to think that Denmark will struggle to keep Sweden from scoring, so I'll make a 1-1 draw my personal prediction.


Portugal - Hungary (19:45 local, 3:45 PM Eastern US)

The obvious place to start is with injuries, which unfortunately Portugal have a lot of. Cristiano Ronaldo missed the epic Sevilla win over Real Madrid because of an ankle injury suffered in the Champions League match against Olympique Marseille in the midweek. News reports indicate that he will be ready to play, but we shall see. By the way, he may be cursed and never play again. There are similar injured-but-should-play reports for Bosingwa and Tiago. Despite his name, Duda is less doubtful and much more likely to both play and be closer to 100%.

I have been unable to find any sort of injury report for Hungary.

Predictions

The PLM predicts that the Portuguese should win. They are given a 70% chance, Hungary a 7% chance and the remaining 23% being the likelihood of a draw. The most likely outcome is 2-0 to Portugal. I personally think the model overrates the probability of a draw. The reason for that is that a draw serves neither team. If the score is even near the end I think we can expect to see a lot of attacking football from both sides. It is much more likely than usual for there to be a goal that breaks the deadlock. The injury situation for Portugal is certainly not good, even if they get all four players mentioned above to play. The model does not take that into account, and I think in particular overrates them offensively for that reason. I think these factors somewhat cancel out on the Portugal side so they are probably around that 70%. I'd guess Hungary are more like 20% to win, with about a 10% chance of a draw which would practically eliminate both teams. I'll make 1-0 Portugal my just-for-fun prediction.