College Football Rankings, Attempted
Posted on 2017-10-10 in Posts
The activity of ranking teams has played an important role in college football through the post-war and modern periods. This comes about for several reasons: short seasons, a large field of teams, largely regional scheduling, and lack of a playoff. Together these lead to a poorly connected graph of teams and significant uncertainty about their relative quality. Sports writers and coaches, among others, have been polled to produce rankings, supposedly based on the opinions of those who would know best. But non-performance-related characteristics have been known to affect rankings at times (cough, Nebraska 1997).
Enter the algorithms. The Bowl Championship Series was established with the intent of pitting the top two aspiring national champions against each other at the end of the year. Computerized rankings were used as one component in selecting the teams. And since then computers and math have wormed their way into the backwoods of Wisconsin, the hollers of West Virginia, and the deltas of Louisiana.
So, in an attempt to see what one might accomplish with scant data and some linear algebra, I set about producing a plausible ranking of my own.
I decompose each team into two components: the part that scores points, and the part that prevents the opponent from doing so. These are fairly correlated with the offense and defense, but are not one-to-one (see, e.g., the pick-six).
The quality of team \(i\), then, can be summarized by the scalar quantity
\[ q_i = o_i + d_i \]
where the measures \(o\) and \(d\) are defined such that the expected score of a game between teams \(i\) and \(j\) would be \(o_i-d_j\) to \(o_j-d_i\). I make the assumption in this that scoring and scoring prevention are linear in team quality.
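A short worked step shows why one scalar per team is enough to rank under this model. Writing \(q_i = o_i + d_i\) for the sum of the two measures (an assumed notation), the expected margin of victory for team \(i\) over team \(j\) follows directly from the two expected scores:
\[ (o_i - d_j) - (o_j - d_i) = (o_i + d_i) - (o_j + d_j) = q_i - q_j \]
So the expected margin depends only on the difference between the two teams' scalars, and ordering teams by that scalar orders them by expected head-to-head outcome.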
So in any game, two observations are made: the number of points scored by one team \(p_i\) and the number of points scored by the other \(p_j\). And given the decomposition above, these are each noisy observations of the difference between the two teams' quality metrics
\[ p_i = o_i - d_j + n_i, \qquad p_j = o_j - d_i + n_j \]
where the noise \(n\) has an unknown zero-mean distribution that we'll assume is Gaussian or close to it.
The task, then, is to estimate the quality of each team, based on a set of observed game outcomes. Limiting oneself to game-level statistics is woefully out-of-date in a world where others are employing drive-level and play-level data to perform ranking. But this approach has several advantages: it is simple, it does not require fine-grained data, and it could be applied in almost any sporting context.
In subsequent posts I will more fully lay out my approach, my data set, and some results. As a preview, here is my Top 25 as of today:
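To make the estimation task concrete, here is a minimal sketch that fits the \(o\) and \(d\) measures by ordinary least squares. This is an assumption for illustration only: the post does not say which estimator is used, and the function name `fit_offense_defense`, the team labels, and the toy scores below are all invented. Each game contributes two rows, one per team's point total.

```python
import numpy as np

def fit_offense_defense(games, teams):
    """Least-squares fit of p_i = o_i - d_j + noise (hypothetical helper).

    games: iterable of (team_i, team_j, points_i, points_j)
    teams: list of team labels
    Returns (o, d) as arrays ordered like `teams`.
    """
    index = {t: k for k, t in enumerate(teams)}
    n = len(teams)
    rows, targets = [], []
    for ti, tj, pi, pj in games:
        i, j = index[ti], index[tj]
        # Row for team i's points: +1 on o_i, -1 on d_j.
        row = np.zeros(2 * n)
        row[i], row[n + j] = 1.0, -1.0
        rows.append(row)
        targets.append(pi)
        # Row for team j's points: +1 on o_j, -1 on d_i.
        row = np.zeros(2 * n)
        row[j], row[n + i] = 1.0, -1.0
        rows.append(row)
        targets.append(pj)
    A = np.vstack(rows)
    b = np.asarray(targets, dtype=float)
    # Adding a constant to every o and every d leaves o_i - d_j unchanged,
    # so the system is rank-deficient; lstsq returns the minimum-norm solution.
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x[:n], x[n:]

# Invented, noise-free toy schedule (not real results).
teams = ["A", "B", "C"]
games = [("A", "B", 30, 15), ("A", "C", 35, 5), ("B", "C", 25, 10)]
o, d = fit_offense_defense(games, teams)
q = o + d  # overall quality; only differences between teams are identified
ranking = [teams[k] for k in np.argsort(-q)]
```

Because of the shared additive constant, the absolute values of `q` carry no meaning on their own; the ordering and the gaps between teams do. With a real schedule, teams that are not connected through any chain of games would need extra handling, such as regularization, which this sketch omits.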
=== OVERALL ===
1. UGA (6-0): 42.00
2. ALA (6-0): 39.77
3. PSU (6-0): 39.64
4. CLEM (6-0): 39.45
5. ND (5-1): 35.31
6. OSU (5-1): 34.77
7. WASH (6-0): 33.83
8. AUB (5-1): 32.47
9. WIS (5-0): 30.44
10. MICH (4-1): 29.61
11. UCF (4-0): 28.88
12. TCU (5-0): 27.59
13. OKLA (4-1): 27.38
14. WSU (6-0): 26.43
15. ISU (3-2): 25.84
16. USC (5-1): 25.71
17. WAKE (4-2): 25.46
18. IOWA (4-2): 25.43
19. MSU (4-1): 25.18
20. MIAMI (4-0): 25.14
21. VT (5-1): 24.19
22. SDSU (6-0): 24.04
23. TEX (3-2): 23.82
24. OKST (4-1): 23.41
25. FSU (1-3): 22.77