Sunday, August 8, 2010

Defense

Before I begin I'll say a few words about the state of soccer statistics today.

Not long ago we didn't have any. Today we have them in overwhelming amounts. Just visit FIFA's World Cup page.
We are hoarding more and more data. The data are all valid, but fragmentation is a great problem when it comes to grasping the big picture. There are some indexes that can help, like the Castrol Index and the like, but the problem with them is that they are not intuitive. For example, Frank Lampard has 776 index points for last season on the official EPL site. On Castrol he has 845 and is 4th in his league.
Meaning? 
What is the lowest possible number? What is the highest possible number? Can the values be negative...? These are questions that cannot be answered offhand.
Higher is better, but in comparison to what “natural” measure (a percentage, a scale with upper and lower limits, some zero point...)?
You get the picture. Not very intuitive. So read the raw data instead, some might say...
Too much data can have the same effect: we still remain clueless. For example, until this World Cup we never had passing data in our stat sheets, at least not in Europe. Now?
As I said, FIFA's stat page. Dive in.

Enough rambling. My attempt with win shares and points is a little more intuitive but stems from a small set of data. For historical purposes it's OK. For the future, why not use everything we can? But since the target audience of this enterprise of ours is the common fan, my aim is for the synthetic soccer statistics I'm developing to be:

Intuitive and Comparable


Building on the previous post about saves, I'll try to build a gauge that can capture soccer defense.
If we add the number of saves to the goals conceded during the season, we get the number of times the defense collapsed and allowed the opposing strikers to take their shots. Obviously, fewer collapses mean a better defense. Dividing that sum by games played gives a per-game rate. Until I come up with a better name, let's call it the average number of negative events per game. Catchy, right? Here is the table for last EPL season:



Or like this: strength of defenses in England’s Premier League for the 2009/10 season


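The negative-events rate described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name, the team names and the season totals are all made up for the example and are not the real EPL figures.

```python
def negative_events_per_game(saves, goals_conceded, games):
    """Average number of negative events (saves + goals conceded) per game."""
    if games <= 0:
        raise ValueError("games must be positive")
    return (saves + goals_conceded) / games


# Hypothetical season totals, NOT real EPL data:
# team -> (saves, goals conceded, games played)
season = {
    "Team A": (120, 32, 38),
    "Team B": (150, 59, 38),
    "Team C": (170, 77, 38),
}

# Per-game rate for every team.
rates = {team: negative_events_per_game(*totals) for team, totals in season.items()}

# Lower rate means better defense, so sort ascending.
ranking = sorted(rates, key=rates.get)

for team in ranking:
    print(f"{team}: {rates[team]:.2f} negative events per game")
```

Sorting in ascending order puts the strongest defense first, which matches the "lower is better" reading of the rate.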

One other thing that indicates a better defense is a lower average number of fouls committed. Roughly speaking, in the case of the EPL, teams in the lower part of the above graph tend to commit more fouls, since their defenses are slower in reacting and positioning. There are exceptions, of course: some teams have that style of play, some teams have better goalkeepers, and other teams gave up trying, hence a lower number of fouls than expected. But to be on top after the smoke clears, you need to be nice to your opponents. That's why they invented fair play to begin with. Here is last season's order:



Finally, a combination of these two rates can, in my opinion, give a somewhat accurate measure of individual defensive quality. I have excluded cards and penalty kicks since they are rare and dependent on circumstances.
On the other hand, over the span of a season we accumulate enough numbers for saves, goals and fouls to give us some level of certainty in the trends we observe.
And here is the formula: D stands for defense, FC/G for the individual foul rate per game, and N/G (or NEG/G; the program for writing formulas has an issue with NEG, so I cannot write it in the formula, but you'll understand it anyway) for the team's sum of saves and goals conceded per game. The other mathematical operations are for cosmetic purposes only. Calculated like this, the D number looks like a percentage, goes upwards to indicate better performance and irons out the wrinkles that occur in cases of small sample size.



Math can be scary, so let's visualize. Here are some gentlemen from the EPL's last season, in no particular order (the median, at the bottom of the table, works better than the average in this case because of the big gap between the elite and the average teams); reds are below the league median:


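To see why the median is the safer yardstick when a few elite values sit far above the rest, here is a small sketch. The D values and club names are invented for the illustration and are not taken from the table.

```python
from statistics import mean, median

# Hypothetical D values, NOT the real table: two "elite" values and a pack behind them.
d_values = {
    "Club 1": 80,
    "Club 2": 78,
    "Club 3": 50,
    "Club 4": 49,
    "Club 5": 48,
    "Club 6": 47,
    "Club 7": 46,
}

league_mean = mean(d_values.values())
league_median = median(d_values.values())

# Clubs that would be painted red under each yardstick.
below_mean = [club for club, d in d_values.items() if d < league_mean]
below_median = [club for club, d in d_values.items() if d < league_median]

print(f"mean = {league_mean:.2f}, below it: {below_mean}")
print(f"median = {league_median}, below it: {below_median}")
```

With these made-up numbers the two elite values drag the mean up, so five of the seven clubs fall below it, while the median splits the league evenly.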

Enough for today. That's defense. Next time offense. Data, as always, from ESPNsoccernet.
