A couple days after the Arkansas game, I got a call from my dad. (You know, once he’d calmed down.) And he said, “So, I guess you’ll be doing your charts again this week.” “Yeah,” I said, having already completed the first steps thereof. “Should be pretty interesting this week.” He paused. “Have you got a minute? I was hoping you could kind of tell me what they mean.”
In the late ’70s / early ’80s, my dad trained as an industrial engineer at Auburn and he kind of gave me the spreadsheet bug in the first place. He’s certainly no stranger to lines and charts and graphs of this ilk. So I figure, if the lines and colors were mumbo-jumbo to him, they might be to a good portion of y’all as well. I’m taking his good suggestion: along with this week’s regular installment of Spread Sheets, this post will serve as a reference as to, well, what the Spread Sheets even mean.
Perhaps it’s best to start with why we want to do this at all.
There are many metrics by which one can measure a football game, but the two basic ways we establish a football team’s progress are its yardage and the score. The more important, of course, is the final score (SCORE-BOARD! SCORE-BOARD!). However, the final score may often belie the reality of how the game itself was played – how many times is a football game played closer than the final score would indicate? And the better team does not always win the game. How many times does a clearly superior team struggle to establish a strong point differential? Likewise with total yards, average yards, etc. – how often does a team get outgained by fifty to a hundred yards and still emerge the victor? These metrics offer little basis for evaluation of an entire team.
And especially in this hyper-modern era of college football, they offer little basis for comparison between teams. Say a team averages six yards a rush. Does that team have a star-studded backfield and a mammoth offensive line? Or does that team throw the ball so much that their eight-or-ten rushes a game – every one of them a gut draw – go for good yardage? And which one is preferable? In this hyper-modern era of college football, balance no longer indicates a team has spread the yards and play calls evenly between rushing and passing – instead it indicates a team has achieved an equilibrium between pass and run that maximizes the total effectiveness of the offense. For some teams, that means throwing the ball fifty times a game. For others, hardly throwing at all.
Granted, this is just another feather in NCAA football’s cap – in what other sport is there such a wealth of competing strategies, such fertile ground for creative game-planning? But it sure makes things hard to analyze.
More esoteric methods have been employed, which attempt to apply statistical concepts. Take, for instance, the Sharpe ratio, which compares the average yards gained by the offense on a certain play to the variance of the yards gained by that play. The theory is, if you run play A four times and get five, four, five and six yards (Sharpe ratio = 4.5), this is better than play B that gets zero, zero, zero and twenty yards in four attempts (Sharpe ratio = 0.03) and play C that gets four, three, four and five yards (Sharpe ratio = 3). Some would go as far as to suggest that you should run play A 1.5 (4.5 / 3) times more often than play C – the Sharpe ratios allow you to prioritize your play-calling. Seems reasonable, sure, but this method, too, is flawed. For one, what of the play that gets five, five, five and ten yards? Its Sharpe ratio is 0.68, which is even worse than play C, though even the little man under Les Miles’ hat could tell you that this last play is the best one of all.
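For anyone who wants to check that arithmetic, here's a minimal sketch of the ratio as the numbers above work out: the average gain, less a two-yard baseline, divided by the sample variance. The function name `sharpe_ratio` and the `ante` parameter are just my own labels for illustration, not anybody's official formula.

```python
from statistics import mean, variance

def sharpe_ratio(yards, ante=2):
    """Average gain above a two-yard baseline, divided by the sample variance.

    Assumption: this reproduces the figures quoted in the text; `ante` is a
    hypothetical name for the two yards subtracted from every play.
    """
    return (mean(yards) - ante) / variance(yards)

plays = {
    "A": [5, 4, 5, 6],
    "B": [0, 0, 0, 20],
    "C": [4, 3, 4, 5],
    "D": [5, 5, 5, 10],  # the "five, five, five and ten" play
}

for name, yards in plays.items():
    print(name, round(sharpe_ratio(yards), 2))
```

One thing the sketch makes obvious: a play that gains exactly the same yardage every time has zero variance, so the ratio blows up with a division by zero. Yet another reason not to lean on it too hard.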
And even if we ignore all that, these metrics want to take the offense as distinct from the defense, to attempt to measure them independently. As fate has painfully reiterated to our Auburn Tigers of late, this assumption – that neither has any bearing on the other – is flawed. A football team is, well, a team. And the defense’s role – the defense’s performance in that role – is not separable from the offense’s. All four modalities (your offense, your defense, your opponent’s offense, your opponent’s defense) have some degree of interaction through and with each other.
Small wonder that the BCS computers are so thoroughly distrusted, when all attempts to rank and compare teams are based on obscured, esoteric and essentially flawed concepts and computations.
My hope, therefore, is to establish a basis of evaluating a team and comparing it to others, uniting the scoreboard and the yardage total, and also the offense and defense. And to do so in such a way that is appealing, offers insight, and is perfectly transparent. And (eventually) offers us a basis not only of comparison, but of prediction. Keep in mind that while I did receive my undergraduate degree in engineering, minored in math / chemistry / physics, I do not do this for a living – I’m a pediatric neurology resident by trade. Spread Sheets is just as much of a public experiment as it is an attempt at genuine analysis. I’m open to any and all suggestions, corrections, scathing critiques, etc.
There are two main things I try to do: game progress lines, and topographic maps. Each starts with the same data set and the same calculation.
Every non-special-teams play in football has one of four results: a penalty is committed that (like holding or a false-start) determines the yardage total by itself, the ball carrier is tackled for some yards lost or gained (or a pass is incomplete), the offense scores a touchdown, or there is a turnover. It’s fairly easy to track the team’s yardage throughout the game. The key to establishing a single metric is to somehow translate scores and turnovers into an equivalent number of yards. I honestly don’t have the time to figure out how many yards driven down the field is worth the same as a touchdown or a turnover, but there are helpful people out on the interwebs who have. Turns out that, when you examine the outcome of events in a game, a turnover is statistically just as bad as losing fifty yards on a single play. And coincidentally, scoring a touchdown is just as good as getting a fifty-yard gain. This is the conversion factor that allows us to go from yards and points simply to yards: if there is a touchdown, add fifty to the yardage gained on the scoring play, and if there is a turnover, the yardage result is negative fifty.
(I wish I could point to a single place that describes the derivation thereof. If you lurk through the comments at Smart Football, though, you’ll find it before long. And again, correct me if I’m wrong.)
The other thing I like to do is to model each play, very simply, as a decision. This is a concept I borrowed from the aforementioned Sharpe ratio. When you compute the Sharpe ratio, you subtract two from the yardage gained by every play – IE, a five-yard gain becomes a three-yard gain, a two-yard loss becomes a four-yard loss. Why would you do that? On nearly any play, it is hypothetically possible for the offense to gain at least two yards without fail. Maybe it’s the QB draw, or the quick slant, or the screen or what have you, but there’s always a way to pick up two yards with minimal risk. Granted, to adopt that as your strategy would be pretty idiotic, as 2 + 2 + 2 + 2 does not equal 10. But it is worth pointing out that calling any other play is in some sense betting that you can do better than two yards.
Thus, the two yards are a sort of ante. Every blown-up flanker reverse that goes for minus-five is actually a loss of seven, because you could have just pulled the ball and plowed ahead for two. Every incomplete pass is minus-two. Every three-yard gain is only one yard better than simply falling forward. So we take the raw yardage result of the play and we subtract two, every time. This tells us not only if our plays work, but if they are genuinely worth the risk of running them at all.
A few other things I try to do: first of all, I don’t adjust pre-snap / deadball / non-additive penalties like I do yardage totals, because that’s a matter of pure and basic execution, not of play calling. So, a false start results in -5 yards, plain and simple. If a penalty gets tacked onto the end of a play, I take the total yardage gained and subtract two like I otherwise would. I don’t include special teams (punt returns, kickoff returns, field goals) and other matters of pure field position. That’s because I’m primarily concerned with how the offense and defense are playing, and issues of field position will become evident in how they do their jobs. To include special teams yards in my analysis is basically counting them twice. And turnovers returned for touchdowns don’t get counted double (IE, they are counted for -50, not for -100.) This is because the defense has little-to-no control over whether they can return turnovers for touchdowns consistently, and it really can’t be part of game-planning. A win for the defense is a win.
So how’s it all shake out? Let’s say our running back takes the pitch and runs for five yards: obviously, the raw yardage total is five. We subtract two for the decision ante and get: three yards gained. Let’s say that he instead takes the pitch and runs five yards for a score. We start with five, add fifty for the touchdown, and subtract two for the ante: fifty-three yards. What if that touchdown run gets called back due to a holding penalty? The yardage total is minus-10. What if that five-yard run gets a little boost from a personal foul against the defense? Five yards, minus two for the ante, plus fifteen for the penalty: eighteen yards gained. Or let’s say he runs for five yards, but then has the ball stripped and the other team recovers. Because this is a win for the defense, we start with negative fifty yards, and again subtract two for the ante: negative fifty-two yards.
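If it helps to see those rules written down in one place, here's a rough sketch of the adjustment as a single function. The function name, the argument names and the way I've split penalties into "standalone" and "added" are all my own inventions for illustration; the actual work happens in a spreadsheet.

```python
ANTE = 2    # the two-yard "decision ante" subtracted from every called play
BONUS = 50  # yards credited for a touchdown, or debited for a turnover

def adjusted_yards(raw_yards=0, touchdown=False, turnover=False,
                   standalone_penalty=None, added_penalty=0):
    """Adjusted yards for one play (hypothetical helper, names are assumptions).

    standalone_penalty: yardage of a penalty that decides the play by itself
                        (false start = -5, holding = -10); no ante applied.
    added_penalty:      penalty yardage tacked onto the end of a live play.
    """
    if standalone_penalty is not None:
        return standalone_penalty
    if turnover:
        # A win for the defense: flat -50, never doubled for a return score.
        return -BONUS - ANTE
    total = raw_yards + added_penalty
    if touchdown:
        total += BONUS
    return total - ANTE

print(adjusted_yards(raw_yards=5))                    # 3
print(adjusted_yards(raw_yards=5, touchdown=True))    # 53
print(adjusted_yards(standalone_penalty=-10))         # -10
print(adjusted_yards(raw_yards=5, added_penalty=15))  # 18
print(adjusted_yards(raw_yards=5, turnover=True))     # -52
```

The five calls at the bottom are the five running back scenarios from the paragraph above, in the same order.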
Clear as mud?
Moving on to the visual representations, IE, the actual charts:
The game progress line is an attempt to chart our team’s progress through the game, play by play. We can calculate the adjusted yards (as I explain above) for each play of the game, both for our offense and for the opponent’s offense. Then we can make a running total, IE, you add each play’s results to the sum of all the plays before them. To get the opponent’s plays added on to the same line, we multiply all the opponent’s yards by -1. That way, things that are good for Auburn take the line higher and things that are bad for Auburn take the line lower, while things that are good for our opponent take the line lower and things that are bad for them take the line higher. Then, we can keep doing the running total. For clarity’s sake, I color Auburn’s drives navy blue, and use another color for the other team’s drives. This is a graph of the first Gamecock drive from this year’s game, which didn’t amount to much:
Here’s the same game with Auburn’s first drive, which resulted in a touchdown:
The Gamecocks responded with a touchdown drive of their own, and the game was on:
We can do the entire game like that, and we end up with – essentially – a drive chart. (The vertical line indicates the start of the third quarter.) I call it the game progress line. Its purpose is pretty self-explanatory: represent the progress of the game as a single line. Here’s the whole shebang from the Carolina game:
And just as you can string plays together to show an entire game, you can also string games together, giving you a game progress line for the entire season to date. This, I think, could form a basis of comparison between teams. Here’s Auburn’s game progress line, with our most recent opponent’s and our upcoming opponent’s (with diamonds indicating the start of games):
How would you rank those teams?
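For the spreadsheet-averse, the running total behind the game progress line can be sketched in a few lines. The plays below are made up purely for illustration (they are not from the actual Carolina game), and `progress_line` is a hypothetical name of my own.

```python
from itertools import accumulate

def progress_line(plays, team="Auburn"):
    """Running total of adjusted yards from one team's point of view.

    plays: list of (offense, adjusted_yards) tuples in game order.
    Yards gained by the opponent's offense are multiplied by -1.
    """
    signed = [yards if offense == team else -yards
              for offense, yards in plays]
    return list(accumulate(signed))

# Made-up sample plays, already converted to adjusted yards.
plays = [
    ("South Carolina", 3),
    ("South Carolina", -2),
    ("Auburn", 6),
    ("Auburn", 53),           # a short touchdown run
    ("South Carolina", -52),  # a turnover
]

print(progress_line(plays))  # [-3, -1, 5, 58, 110]
```

Stringing games together for a season line is the same trick: just keep appending plays to the list and let the running total carry on.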
The topo charts are slightly different. Their purpose is to help identify, visually, the downs and quarters in which a football team or football player was most and least effective. First you have to make subsets of the data based on quarter and down. This is pretty easy to do with a nifty little spreadsheet trick, and computing all these yardage totals allows you to make a chart like so:
[Table: total adjusted yards by quarter, with columns D 1, D 2 and D 3 for first, second and third down]
This chart displays the total adjusted yards gained by South Carolina versus Auburn in 2010, split out for each down and quarter. IE, the Gamecocks managed to rack up 138 adjusted total yards on first down in the first quarter, but got -75 on first down in the fourth quarter (thanks, Connor!). Now, this is pretty informative in and of itself – you can see what downs we were winning and what downs we were losing, and also see the change in the trend over time.
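I do the splitting in a spreadsheet, but the same quarter-by-down totals can be sketched in code. Everything here is illustrative: the function name is my own and the sample plays are invented, not the real South Carolina numbers.

```python
from collections import defaultdict

def totals_by_quarter_and_down(plays):
    """Sum adjusted yards for each (quarter, down) pair.

    plays: list of (quarter, down, adjusted_yards) tuples.
    Only first through third downs are kept, matching the chart.
    """
    totals = defaultdict(int)
    for quarter, down, yards in plays:
        if down <= 3:
            totals[(quarter, down)] += yards
    return dict(totals)

# Invented sample plays: (quarter, down, adjusted yards).
plays = [
    (1, 1, 8), (1, 2, -2), (1, 1, 53),
    (2, 3, 4), (4, 1, -52), (4, 4, 1),
]

totals = totals_by_quarter_and_down(plays)
for down in (1, 2, 3):
    row = [totals.get((quarter, down), 0) for quarter in (1, 2, 3, 4)]
    print(f"D {down}", row)
```

Each printed row is one down, read across the four quarters.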
But it’s much cooler (and much easier to understand) if we make a topographic map of that data. What is meant by that? You’ve no doubt seen maps (like, actual maps of places) that use colors to indicate elevations:
So, imagine if instead of elevations on a map, we used colors to represent yardage totals, with hot colors representing higher totals and cool colors representing lower totals. Take a look at a topo map of that same data:
Each corner or intersection is a data point. For instance, the left border of the map is the first quarter, with the top left corner being first down in the first quarter, the middle point being second down, the bottom corner being third. The first vertical line to the right of the first quarter is the second quarter (same points for downs) and so on across. The computer makes the highest values red, the lowest values purple, and assigns an intermediate color to the ones in between based on a regular scale. Then, it colors the area in between the points like one might color the elevations on a topographic map.
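The charting software handles all of this for me, so I can't tell you exactly what scale it uses. But as a rough sketch of the idea, assuming a straight-line slide through the color wheel from purple up to red, the color for a single data point might be picked like so. The name `topo_color` and the 0.75 hue cutoff for purple are my assumptions.

```python
import colorsys

def topo_color(value, lowest, highest):
    """Map a yardage total to an (r, g, b) color, each channel in 0..1.

    Assumption: a linear scale where the highest total is red (hue 0)
    and the lowest is purple (hue 0.75), with everything else in between.
    """
    if highest == lowest:
        position = 0.5  # every value is the same; use the middle color
    else:
        position = (value - lowest) / (highest - lowest)
    hue = 0.75 * (1 - position)
    return colorsys.hsv_to_rgb(hue, 1.0, 1.0)

# Using the two totals quoted above as the extremes:
print(topo_color(138, -75, 138))  # (1.0, 0.0, 0.0), pure red
print(topo_color(-75, -75, 138))  # (0.5, 0.0, 1.0), purple
```

This only covers the colors at the data points themselves; filling in the area between them is a separate blending step that the sketch doesn't attempt.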
The purpose of this kind of map is to create a picture you can glance quickly at and understand. So, you can see how USC was killin’ us on first downs in the first quarter, but slowed down as the game wore on until Connor Shaw gift-wrapped the ballgame.
Another note: why don’t I include fourth downs in the topo maps? Because success on fourth down is less a matter of how many yards you gained, and more a matter of how many yards you had to go. Did you succeed, or did you not succeed? And that kind of binary value isn’t really compatible with this analysis. Plus, teams seldom try to convert fourth downs; it just doesn’t yield much useful data.
Also: I make maps for each backfield player, for each team’s offense, and for the game as a whole. On each map, the hottest colors indicate maximum effectiveness. As in, when was Auburn as an entire team effective? When was Stephen Garcia effective? When was the Clemson offense effective? And just as portions of the game progress line that point downward indicate things that went bad for Auburn and good for their opponent, portions of the graph that are purple / blue / green indicate things that went bad for the offense, but good for the defense.
Hope that’s a useful explanation, and hope that makes Spread Sheets a little easier to understand at a glance. As always, this column is just as much a public experiment as it is an attempt at thoughtful analysis: any questions, comments, suggestions or corrections are thoroughly welcome.