College Football Rankings, Continued

Posted on 2017-10-22 in Posts

In a previous post, I outlined a college football ranking construct. In it, I assume that the quality of a team is made up of two parts: the part that scores and the part that keeps others from scoring. The quality of team \(i\), then, can be summarized by the scalar quantity \(q_i = o_i+d_i\). In any game, two observations are made: the number of points scored by one team, \(p_i\), and the number of points scored by the other, \(p_j\). Given the decomposition above, each is a noisy observation (with noise \(n\)) of the difference between the scoring team's offensive quality and its opponent's defensive quality:

$$p_i = o_i-d_j+n$$
$$p_j = o_j-d_i+n.$$

Suppose we have \(n\) teams and \(m\) games (here the scalar \(n\) counts teams and is not the noise term from the equations above). It is fairly straightforward to construct a linear system

$$\mathbf{y=Ax+n}$$

in which:

  • \(\mathbf x\in\mathbb R^{2n}\) is a vector containing the offensive and defensive quality estimates for each team.
  • \(\mathbf y\in\mathbb R^{2m}\) is a vector of game scores, with each game having two entries (one for each team).
  • \(\mathbf A\in\mathbb Z^{2m\times 2n}\) is an indexing matrix in which the row representing team \(i\)'s offensive output against team \(j\) has a \(1\) at index \(2i\), a \(-1\) at index \(2j+1\) (counting from zero), and zeros elsewhere.
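As a rough illustration of how such a system might be assembled, here is a minimal NumPy sketch. The function name `build_system`, the `(i, j, p_i, p_j)` game tuple layout, and the toy scores are assumptions made for this example, not the code behind the rankings.

```python
import numpy as np

def build_system(games, num_teams):
    """Assemble A and y from games given as (i, j, p_i, p_j) tuples.

    Team indices are zero-based. x is laid out as
    [o_0, d_0, o_1, d_1, ...], so offense i sits at index 2i
    and defense j sits at index 2j + 1.
    """
    m = len(games)
    A = np.zeros((2 * m, 2 * num_teams), dtype=int)
    y = np.zeros(2 * m)
    for g, (i, j, p_i, p_j) in enumerate(games):
        # Row 2g: points scored by team i, modeled as o_i - d_j.
        A[2 * g, 2 * i] = 1
        A[2 * g, 2 * j + 1] = -1
        y[2 * g] = p_i
        # Row 2g + 1: points scored by team j, modeled as o_j - d_i.
        A[2 * g + 1, 2 * j] = 1
        A[2 * g + 1, 2 * i + 1] = -1
        y[2 * g + 1] = p_j
    return A, y

# Hypothetical toy schedule: team 0 beats team 1, team 1 beats team 2.
games = [(0, 1, 28, 14), (1, 2, 21, 17)]
A, y = build_system(games, num_teams=3)
```

Each game contributes two rows, so two games among three teams give a \(4\times 6\) matrix.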

The process for estimating the scores is then:

  1. Scrape all the game data into files
  2. Set up the system of equations
  3. Clip the observed scores in \(\mathbf y\) to the 1st and 99th percentiles to reduce the impact of outliers
  4. Use the pseudoinverse (with some Tikhonov regularization) to generate a least-squares estimate
    $$\mathbf{\hat x = A}^+\mathbf y$$
  5. Sum the offensive and defensive components for each team to estimate total quality
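Steps 3 through 5 could be sketched roughly as follows. This is one common way to write a Tikhonov-regularized least-squares solve, \((\mathbf A^T\mathbf A+\lambda\mathbf I)^{-1}\mathbf A^T\mathbf y\), and is not necessarily the exact computation used for the rankings here. The function name `estimate_quality` and the regularization strength `lam` are assumptions; no value for it is given in this post.

```python
import numpy as np

def estimate_quality(A, y, lam=1.0):
    """Clip scores, then solve a Tikhonov-regularized least-squares problem.

    lam is a hypothetical regularization strength chosen for illustration.
    """
    # Step 3: clip observed scores to the 1st and 99th percentiles.
    lo, hi = np.percentile(y, [1, 99])
    y_clipped = np.clip(y, lo, hi)

    # Step 4: ridge solution x = (A^T A + lam * I)^{-1} A^T y.
    k = A.shape[1]
    x_hat = np.linalg.solve(A.T @ A + lam * np.eye(k), A.T @ y_clipped)

    # Step 5: even indices hold offense, odd indices hold defense.
    offense, defense = x_hat[0::2], x_hat[1::2]
    return offense, defense, offense + defense

# Toy example: a single game in which team 0 beats team 1, 28 to 14.
A = np.array([[1.0, 0.0, 0.0, -1.0],
              [0.0, -1.0, 1.0, 0.0]])
y = np.array([28.0, 14.0])
offense, defense, overall = estimate_quality(A, y)
```

Some regularization is useful because the unregularized system is rank deficient: adding the same constant to every offensive and defensive value leaves every \(o_i-d_j\) unchanged, so the data alone cannot pin down that shared offset.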

Given the results of this calculation, one can predict the score of a future game and likely estimate win probabilities. I hope to elaborate further on these and provide some predictions. But here are the current rankings, as I have them, with almost all the games completed for week 8:

    === OVERALL ===
 1.     ALA (8-0): 45.20
 2.     PSU (7-0): 44.59
 3.     OSU (6-1): 44.10
 4.      ND (6-1): 41.88
 5.    CLEM (6-1): 40.41
 6.     UGA (7-0): 39.76
 7.     TCU (7-0): 36.05
 8.    OKST (6-1): 35.84
 9.     WIS (7-0): 35.47
10.     AUB (6-2): 35.12
11.      VT (6-1): 35.11
12.     UCF (6-0): 34.10
13.    OKLA (6-1): 33.24
14.    WASH (6-1): 32.25
15.     ISU (5-2): 29.97
16.   MIAMI (6-0): 29.29
17.      GT (4-2): 28.01
18.    MSST (5-2): 27.64
19.    STAN (5-2): 27.31
20.    NCST (6-1): 27.15
21.     TEX (3-4): 26.23
22.    WAKE (4-3): 25.06
23.    MICH (5-2): 24.81
24.     FSU (2-4): 24.71
25.    IOWA (4-3): 24.52

    === DEFENSE ===
 1.     ALA (8-0): 11.53
 2.     UGA (7-0): 10.98
 3.    CLEM (6-1): 9.20
 4.     TCU (7-0): 9.16
 5.     PSU (7-0): 8.37
 6.     AUB (6-2): 7.50
 7.     OSU (6-1): 6.96
 8.    WASH (6-1): 6.86
 9.      VT (6-1): 6.70
10.     WIS (7-0): 6.41
11.     MSU (6-1): 6.34
12.     FSU (2-4): 5.29
13.     PUR (3-4): 5.03
14.     ISU (5-2): 2.94
15.    MICH (5-2): 2.84
16.      ND (6-1): 2.79
17.      SC (5-2): 2.78
18.     TEX (3-4): 2.57
19.     UCF (6-0): 2.11
20.   MIAMI (6-0): 1.57
21.    IOWA (4-3): 1.32
22.    WAKE (4-3): 1.25
23.    SDSU (6-2): 0.83
24.      NW (4-3): -0.39
25.     EMU (2-5): -1.06

    === OFFENSE ===
 1.      ND (6-1): 39.08
 2.    OKLA (6-1): 37.98
 3.     OSU (6-1): 37.14
 4.    OKST (6-1): 36.91
 5.     PSU (7-0): 36.22
 6.     LOU (5-3): 34.07
 7.     ALA (8-0): 33.67
 8.     UCF (6-0): 31.99
 9.     WVU (5-2): 31.99
10.    ARIZ (5-2): 31.96
11.    CLEM (6-1): 31.21
12.     SMU (5-2): 30.96
13.     TTU (4-3): 30.58
14.      GT (4-2): 30.41
15.    STAN (5-2): 30.29
16.    MSST (5-2): 30.03
17.    NCST (6-1): 30.00
18.     WIS (7-0): 29.06
19.     UGA (7-0): 28.78
20.      VT (6-1): 28.41
21.     MEM (6-1): 28.06
22.     TOL (6-1): 27.99
23.     USF (7-0): 27.90
24.   MIAMI (6-0): 27.72
25.    UCLA (4-3): 27.64
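
As a sketch of how the estimates above might feed a prediction, the model gives an expected score of \(o_i-d_j\) for team \(i\) against team \(j\). Turning that into a win probability requires a noise model, which this post does not specify. The example below assumes independent Gaussian noise with a hypothetical standard deviation `sigma` of 10 points on each team's score; both the function name `predict_game` and that value are illustrative assumptions.

```python
import math

def predict_game(o_i, d_i, o_j, d_j, sigma=10.0):
    """Predict both scores and team i's win probability.

    sigma is a hypothetical per-team score noise level (in points).
    """
    p_i = o_i - d_j  # expected points for team i
    p_j = o_j - d_i  # expected points for team j
    margin = p_i - p_j
    # Assumption: each score has independent Gaussian noise, so the
    # margin has standard deviation sigma * sqrt(2). The normal CDF at
    # margin / (sigma * sqrt(2)) simplifies to the erf expression below.
    win_prob_i = 0.5 * (1.0 + math.erf(margin / (2.0 * sigma)))
    return p_i, p_j, win_prob_i

# Offense and defense values for ALA and PSU taken from the tables above.
ala_pts, psu_pts, ala_win = predict_game(33.67, 11.53, 36.22, 8.37)
```

Note that the expected margin \((o_i-d_j)-(o_j-d_i)\) equals \(q_i-q_j\), the difference in overall quality, so the overall ranking orders teams by expected margin against a common opponent.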