Monday, July 5, 2010

Diego in Europe

Addendum to the previous post: Maradona's numbers with his European teams. Only regular league appearances are counted, no cup games. On the left are Diego's statistics (darker fields in the win and draw columns are approximations); on the right are the team's numbers.



Leo

That was only one game. Don't worry, you are GOOD.

Points are a mix of goals, wins, draws, red cards, shutouts and blowouts, as explained.
Win shares are the number of team points directly produced by the player. 3 WS points = one win contributed. Percentages are the same thing, expressed as a % of the team's final points count.

The minimum grade is 0 points; the theoretical maximum is 5. To give you a sense of perspective, Pele's best season was 4,39.

Anyway, Lionel Messi's career in Barcelona so far:



England's other goalkeepers

As explained here. You can find David James there too.










Hand of God and God himself

A few words.
Calculations were done as explained here. In short: goals, wins, draws and shutouts on the positive side; games lost without scoring a goal on the negative side. All of that is divided by the number of games played.
Only regular national championship matches are counted: no all-star games, no cup games, no national squad games, no Puerto Rico 12:0 blowouts. Only meaningful games played against balanced competition.

Pele's statistics are perfect. Every game, every goal, every squad. So the calculations for his career are as good as they can get. The spreadsheet can be found here. Keep in mind, though, that soccer was a different game then. Substitutions were rare, if allowed at all; games were played all year round; and Brazil didn't have a unified league, so the results are from the São Paulo state league (Campeonato Paulista)...

Maradona's statistics are accurate for his Spanish and Italian career, not so much for the Argentinian part. But when in doubt, err on the side of the player. If I was biased in my approximations, I was biased in favor of Diego. Future exact calculations for the Argentinian part of his career (when the data becomes available) will not stray much from these. He was what he was. A wasted talent. So great, but so immature.

The graphics are straightforward: the middle column represents age, and years are colored according to the achievements for that year; the squads in which they played come next, and the calculated grade for each year is in the outside columns.


They both had the same career span, started and finished at the same age, played the same position and wore the same number.
Here they are:







Sunday, July 4, 2010

Pythagorean formula for soccer, the European one

As we have learned during this edition of the World Cup, predicting the outcome of a single soccer match is hard. But that's the excitement of a once-every-four-years tournament.
In our regular, everyday soccer, we have a slightly larger set of data to play with. The simplest method used in other popular sports, Bill James's Pythagorean formula, is not really accurate in soccer. There have been attempts to tweak the formula by changing the exponent, as was done for basketball, but swings of fortune are common in relatively short seasons, and on top of that you have to deal with three possible outcomes.

So, let's tilt at the windmill again.

A few observations first. The main difference between soccer and pretty much any other sport is the possibility of a draw. Since all win estimator formulas work on the principle of a clear-cut winner, the usual straightforward relation between runs/goals/points scored and the number of victories falls short in the case of soccer. The usual way to compensate for this is to count points won, not victories.

In the Premier League, or any other elite league with a more or less even distribution of talent among teams, tie games deduct around 8 % of the maximum points per season. For the EPL that means 20 teams, 380 games and 1140 points to distribute. Usually around 100 points are lost every year. If a game ends in a win, it's a three-point game; if it ends in a draw, it's only a two-point game. In a tie game both teams win and lose the same amount of points.
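The arithmetic in this paragraph can be sketched in a few lines of Python. The function names are made up for illustration; the only assumptions are a double round-robin schedule, three points for a win and one point each for a draw.

```python
def max_points(teams: int) -> int:
    """Points handed out in a double round-robin season if no game is drawn."""
    games = teams * (teams - 1)  # every team hosts every other team once
    return games * 3


def draw_deduction(teams: int, draws: int) -> float:
    """Share of the maximum points lost to draws (a draw awards 2 points, not 3)."""
    return draws / max_points(teams)


print(max_points(20))                     # 1140
print(round(draw_deduction(20, 100), 4))  # 0.0877
```

With about 100 draws in a 20-club league, the deduction comes out a little under 9 %, in the neighborhood of the 8 % used in the text.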
Let's start with a simplified Pythagorean formula with exponent 1 and see how it plays out.

                                   Win % = ( goals scored / ( goals scored + goals allowed ) ) X 0,92

The calculation is for last year's champions, Chelsea. The 8 % deduction is a round figure; for the 2009/10 season it is exactly 8,42 %, and that is the value that gives the result below.

GF = 103

GA = 32

Win % = 0,699 ( or 69,9 % of possible points won )

Multiply Win % by the maximum number of points for a single team.

Win % X 114 ( for a 20-club league; for the Bundesliga it's 102 )

In the end, we get 80 points for Chelsea. They actually won 86, but you would expect a first-place team to outperform expectations.
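As a check on the Chelsea numbers, here is a minimal Python sketch of the exponent-1 formula. The function name and parameters are hypothetical; the inputs (103 goals for, 32 against, an 8,42 % draw deduction, 38 games) are the ones quoted above.

```python
def predicted_points(goals_for, goals_against, draw_deduction, games):
    """Exponent-1 Pythagorean estimate, scaled down by the share of points lost to draws."""
    win_pct = goals_for / (goals_for + goals_against) * (1 - draw_deduction)
    return win_pct, win_pct * games * 3  # 3 points available per game


win_pct, points = predicted_points(103, 32, 0.0842, 38)
print(round(win_pct, 3))  # 0.699
print(round(points))      # 80
```

Using the rounded 8 % deduction instead gives a Win % of about 0,702, which still rounds to 80 points.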




That is, of course, 20/20 hindsight.

And now the future.

Calculations were done taking into account the last three seasons (weighted 60/30/10). Some numbers, the yellowish ones, are averages or estimations for the Championship or any other lower league. Results from the Championship are simply reduced by about 1/3 (2/5 to be precise; the assumption is that the Championship is 60 % the strength of the Premier League). The same goes for other lower leagues. After we calculate PP (predicted points), we adjust them with the three-year average of over/under performance (for example, in the last three seasons Everton overachieved by 2 points and Fulham underachieved by 3). In the end we have adjusted points and a predicted ranking of teams based on them.
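One possible reading of this procedure, as a rough Python sketch. The text does not say exactly which quantities are weighted or how the lower-league discount is applied, so applying both to per-season points is an assumption here, and the example club and its numbers are made up.

```python
WEIGHTS = (0.6, 0.3, 0.1)      # most recent season first, as in the text
LOWER_LEAGUE_STRENGTH = 0.6    # assumed: lower-league results count for 60 %


def weighted_points(seasons):
    """seasons: three (points, top_flight) pairs, most recent season first."""
    total = 0.0
    for weight, (points, top_flight) in zip(WEIGHTS, seasons):
        strength = 1.0 if top_flight else LOWER_LEAGUE_STRENGTH
        total += weight * points * strength
    return total


def adjusted_points(seasons, over_under):
    """Add the three-year average over/under performance to the weighted estimate."""
    return weighted_points(seasons) + over_under


# Hypothetical club: two top-flight seasons, then one in the Championship.
example = [(50.0, True), (45.0, True), (70.0, False)]
print(round(weighted_points(example), 1))     # 47.7
print(round(adjusted_points(example, 2), 1))  # 49.7
```

For a promoted team with no top-flight history, the correction term would simply be 0.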
As things stand today, with no knowledge of starting lineups, the amount of money poured into clubs, injuries, or the good or bad form of key players, here are the predictions for the final standings of the English Premier League for the 2010/11 season.







I would be stunned if all this plays out as above, but some interesting questions pop up. Some of the points allotted to the bottom teams will end up in the accounts of the top teams, so the disparity will probably be larger than shown. We'll see if 87 points will be enough for the title and 39 good enough to avoid relegation. We'll also see how many fewer draws, if any, we get, since the prediction is 1055 points in aggregate for the season. That is only 7,46 % of points lost, or around 85 tie games. The three-year average for the Premier League is a little less than 100 per year.
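The step from 1055 aggregate points to about 85 draws can be reproduced with a short Python sketch (the function name is hypothetical). It relies on the fact that each draw hands out two points instead of three, so every drawn game costs the league exactly one point.

```python
def implied_draws(teams, predicted_total):
    """Return (number of draws, share of maximum points lost) implied by a points total."""
    maximum = teams * (teams - 1) * 3   # double round-robin, 3 points per game
    lost = maximum - predicted_total    # one point lost per drawn game
    return lost, lost / maximum


draws, share = implied_draws(20, 1055)
print(draws)                  # 85
print(round(share * 100, 2))  # 7.46
```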
Here are the others:





Blue corrective points for Freiburg and Mainz indicate that the figure is not an average but last season's achievement, since they played in the second league in the previous years. Also, as in the case of Blackpool above, the correction for promoted teams with no history in the first league (in this table, the last three) is 0.
So the title goes to Bayern, and the cut-off for relegation will be 35 points. The dynamics of the season as predicted are pretty much in line with previous years: 75 to 80 draws.




For Calcio, things will remain mostly the same. Around 100 draws; get over 80 points and the title is yours, don't get over 40 and it's Serie B for you.


And, last but not least:





Fewer tie games next year, almost 90 points for the title, and above 40 points to avoid the Segunda División.

And that is it. This method shows good results in retrospect, and now it's time for the test drive. There are visible flaws, since we treat every league the same, but in many ways they are very much alike. Each has one or two ultra-dominant teams with a few dark horses in waiting. These are not exporter leagues: very few of their outstanding players play outside their own country, and if they do, they do so in one of the other leagues mentioned (Ballack, Luca Toni …). Since the creation of the UEFA Champions League in 1992, only three winners have come from outside these four leagues (3 out of 19), which speaks of the superior quality of competition. And so on.
Now, all we have to do is wait for the summer of 2011 to see how horribly wrong these predictions were.