THIS COLLEGE FOOTBALL season certainly has had its share of thrilling Saturdays, but as someone who believes in the smart use of analytics, I find its Tuesdays maddening. In 2014, the College Football Playoff selection committee ably seeded the country's strongest teams into its bracket. But this year, with tougher decisions to make, we're back to the "eye test," BCS-style politics and a very real chance that deserving teams will find themselves stranded outside the top four -- all because the committee has avoided the best statistical methods of measuring schedule strength.

Any ranking seems to force a fundamental choice: Are we interested in finding the "most deserving" team, which has compiled an impressive record, or the "best" one, which has the greatest chance to win? I would argue, however, that these two questions are actually the same. The most deserving team is the best team, provided you adjust for clubs' dominance and strength of schedule. In 2012, for example, Alabama (12-1) blew the doors off its regular season, outscoring opponents by an outrageous 361 points. Meanwhile, Notre Dame went undefeated by eking out five wins by a touchdown or less, including three against unranked opponents. That record (and history and their fan base) likely landed the Irish a title shot, but most analytics systems saw them, even at 12-0, as unworthy of a championship bid. The Crimson Tide pulverized Notre Dame in the BCS final 42-14.
As fans, our lizard brains want to claim that wins and losses are all that matter, but you know that's not true. Take a moment and see just how many teams you can name off the top of your head that are better -- or, if you prefer, more deserving of a superior bowl bid -- than the Houston Cougars. Past performance is predictive, but only if we interpret it properly.
Last year the CFP selection committee seemed not only to understand the importance of schedule strength but to act on it, booting TCU from its final four. This season, however, gaudy "W" totals have blinded the committee. Its first two releases ranked Baylor No. 6 in the country even though the Bears built their 8-0 record against the five worst programs in the Big 12, plus Rice, SMU and Lamar. (What, Incarnate Word wasn't available?) Baylor then lost to Oklahoma, a result deemed an upset by most of the world but not by anyone paying attention to strength of schedule.
Oklahoma State is in the CFP's top 10 (through Week 11) despite opposition nearly as weak as Baylor's. Ditto for Iowa. Ohio State, which spent most of its nonconference schedule stifling yawns against Hawaii and Western Michigan, is in the CFP top four even though committee members haven't been particularly impressed with the Buckeyes. "We think they're a team that probably hasn't played its best yet," chairman Jeff Long said after the second week of rankings. "We think their best games are in front of them." Translation: The committee is grading Ohio State not on results but on its hope that another Urban Meyer team will have a great finish.
Here's a thought: The selection committee should adopt some honest-to-goodness analytics to assess strength of schedule. The simplest way to do this is literally called the simple rating system (SRS), which mathematically answers the question of how strong every team in the country would have to be for all the scores, week by week, to come out the way they did, then expresses the results in points above or below average. SRS is part of ESPN's Football Power Index, and its strength-of-schedule ratings are also broken out at sports-reference.com. There are other valid standards too, such as asking how likely it is that an elite team would go undefeated against a given club's schedule, the method adopted by Brian Fremeau of ESPN Insider and Football Outsiders.
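To make those two ideas concrete, here is a minimal sketch -- the function names, toy scores and the normal-margin model in the second function are my own illustrative assumptions, not the committee's, ESPN's or Fremeau's actual implementations. The first function solves the defining equation of SRS as a linear system: each team's rating equals its average scoring margin plus the average rating of its opponents, expressed in points above or below average. The second estimates a Fremeau-style schedule measure: the chance a hypothetical elite team would run the table against a given slate.

```python
import numpy as np
from math import erf, sqrt

def simple_rating_system(games):
    """SRS: each team's rating = its average point margin
    + the average rating of its opponents, in points vs. average.
    games: list of (team_a, team_b, score_a, score_b)."""
    teams = sorted({t for g in games for t in g[:2]})
    idx = {t: i for i, t in enumerate(teams)}
    n = len(teams)
    counts = np.zeros(n)           # games played per team
    margins = np.zeros(n)          # total point margin per team
    A = np.zeros((n, n))           # coefficient matrix for the ratings
    for ta, tb, sa, sb in games:
        i, j = idx[ta], idx[tb]
        counts[i] += 1; counts[j] += 1
        margins[i] += sa - sb
        margins[j] += sb - sa
        A[i, j] -= 1.0             # opponent terms accumulate per meeting
        A[j, i] -= 1.0
    for i in range(n):
        A[i] /= counts[i]          # row i: rating_i - avg(opponent ratings)
        A[i, i] = 1.0
    b = margins / counts           # equals average margin per team
    # Ratings are only determined up to a constant; pin the average to 0.
    A = np.vstack([A, np.ones(n)])
    b = np.append(b, 0.0)
    ratings = np.linalg.lstsq(A, b, rcond=None)[0]
    return dict(zip(teams, ratings))

def prob_elite_undefeated(opp_ratings, elite_rating=25.0, sd=14.0):
    """Chance a hypothetical elite team (elite_rating points above average)
    wins every game, assuming game margins are normally distributed with
    standard deviation sd. Both parameters are illustrative assumptions."""
    p = 1.0
    for r in opp_ratings:
        edge = elite_rating - r    # expected margin against this opponent
        p *= 0.5 * (1.0 + erf(edge / (sd * sqrt(2.0))))
    return p
```

In a toy league where A beats B by 10, B beats C by 10 and A beats C by 20, the ratings come out to A = +10, B = 0, C = -10; the schedule measure then asks how often a +25 team would beat each of those opponents in turn. Note that A's strength of schedule (rating minus average margin) is worse than C's, which is exactly the adjustment record-watching misses.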
Any reasonable, comprehensive measure of schedule strength would be better than cherry-picking quality wins and dubious losses. Such a measure would suggest the committee is overrating Iowa and severely underrating Stanford. Statistics don't have to trump observation, but we all need data to check the eye test and hedge against favoritism.