METRICS 101 AND A METRICS MANIFESTO
METRICS! METRICS! METRICS! Seems to be all anyone has talked about the last couple of seasons. Whose metrics are good? Whose are bad? Why are they that way? Why did team X with shitty metrics get in, while team Y with great metrics got sent to the NIT? I am really, really far from an expert. In fact, I would bet I have made some errors below or assumed too much. If you think I’ve made a mistake, hit me up on Twitter @frankingeneral or by email at frank@onhighhoops.com. In any event, I will use my lay knowledge of metrics to attempt to answer most of the questions I laid out above.
THE MAJOR TYPES OF BASKETBALL TEAM METRICS
Efficiency Metrics
When we talk about team metrics in college basketball, we are dealing with two main types: efficiency metrics and resume-based metrics. Broadly speaking, efficiency metrics measure how many points you score per 100 possessions versus how many points you give up per 100 possessions. That leaves you with an offensive efficiency number, a defensive efficiency number, and a net efficiency number, which is offensive efficiency minus defensive efficiency. This is purely a statistical measure. You calculate the number of possessions in a game (the box score has no stat for possessions, so a formula has been created to estimate them), take the actual points you scored or surrendered across those possessions, and scale to 100 possessions. One small component of the possessions formula, the factor applied to FT attempts, differs slightly from metric to metric.
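To make the arithmetic concrete, here is a minimal Python sketch of raw (unadjusted) efficiency. The possession estimate is the commonly cited box-score formula: field goal attempts, minus offensive rebounds, plus turnovers, plus a factor times FT attempts. The function names, the 0.475 default factor, and the box-score numbers are my own assumptions for illustration. Each metric picks its own factor, and real implementations typically estimate possessions for both teams and average them.

```python
def estimate_possessions(fga, oreb, tov, fta, ft_factor=0.475):
    """Estimate possessions from box-score totals.

    ft_factor is an assumed value; it is the piece that varies from
    metric to metric (values around 0.44 to 0.475 are commonly cited).
    """
    return fga - oreb + tov + ft_factor * fta


def points_per_100(points, possessions):
    """Scale a point total to a per-100-possessions rate."""
    return 100.0 * points / possessions


# Made-up box score, for illustration only.
poss = estimate_possessions(fga=60, oreb=10, tov=12, fta=20)
off_eff = points_per_100(75, poss)  # points scored per 100 possessions
def_eff = points_per_100(68, poss)  # points allowed per 100 possessions
net_eff = off_eff - def_eff         # offensive minus defensive efficiency
```

None of this includes the opponent adjustment that the major metrics layer on top.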
The two major efficiency metrics—KenPom and Bart Torvik—both apply an adjustment based on the efficiency of the opponent: i.e. if you play the 300th most efficient offense and hold them to 50 points, no one is going to be impressed. So the metrics gurus account for that with their adjustments, which are generally proprietary portions of the efficiency formula. This is likely where the largest divergence is seen among efficiency metrics.
Generally, efficiency metrics do not care about wins and losses. They strictly care about efficiency as a means of predicting future success. It makes sense: the more you score and the less your opponents score, the more games you are going to win moving forward. Further, when you have offensive and defensive efficiency metrics, you can calculate a projected score, something both Torvik and KenPom do. Efficiency is how Vegas calculates lines, and it’s how a lot of sharp bettors make their picks. I have not tried it, but if you just tracked KenPom’s lines and bet every time there was a 2+ point gap between KenPom and the Vegas line you’d probably do ok.
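As a rough illustration of how two efficiency numbers turn into a projected score, here is a simplified Python sketch. It uses a plain additive method (team offense plus opponent defense, minus the national average, scaled by expected possessions). That method, the function name, and every rating below are my assumptions for illustration; this is not KenPom’s or Torvik’s actual formula.

```python
def projected_points(off_eff, opp_def_eff, avg_eff, tempo):
    """Project one team's points in a game.

    off_eff:     team's points scored per 100 possessions
    opp_def_eff: opponent's points allowed per 100 possessions
    avg_eff:     national average efficiency (assumed baseline)
    tempo:       expected possessions in the game
    """
    expected_eff = off_eff + opp_def_eff - avg_eff
    return expected_eff * tempo / 100.0


# Hypothetical ratings, for illustration only.
AVG_EFF = 105.0
TEMPO = 68.0

team_a_pts = projected_points(115.0, 98.0, AVG_EFF, TEMPO)
team_b_pts = projected_points(108.0, 95.0, AVG_EFF, TEMPO)
projected_margin = team_a_pts - team_b_pts  # compare this to the posted line
```

With these made-up inputs Team A projects to win by a little under 7 points, which is the kind of number you would hold up against the Vegas line.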
There is a 3rd relevant efficiency metric, and that is ESPN’s BPI. It functions generally the same way, with the notable addition of a controversial altitude adjustment. While it would seem to make sense to apply such an adjustment (altitude is a huge advantage for the home team), research by a Twitter user last season suggests the BPI’s altitude adjustment is far too harsh and overly penalizes teams playing at altitude. This issue also taints ESPN’s resume-based metric, SOR, discussed below. When looking at BPI or SOR, it is important to keep in mind the altitude penalty.
Generally speaking, efficiency metrics respond best to blowout road wins. Emphasis on “road” and “blowout,” but they also consider strength of opponent. It just seems that too many teams are able to goose their efficiency metrics by blowing out bad teams. Some efficiency metrics, namely Torvik and the non-team-sheet metric Haslametrics, attempt to overcome this issue by ignoring garbage time, or what Erik Haslam refers to as a game going “analytically final.” That means that for all intents and purposes the game is over, and anything occurring after that point will not be counted metrically. KenPom, to my knowledge, contains no such guardrail.
Results-Based Metrics
On the flip side we have the results-based metrics, also called resume metrics. These consist of ESPN’s Strength of Record (“SOR”), the Kevin Pauga Index (“KPI”) and, new for 2025, Wins Above Bubble (“WAB”). These are not meant to predict the future, but to simply evaluate each team’s resume in an objective way. Each functions slightly differently, and all function behind a layer of opacity, but I will endeavor to describe them the best I can.
KPI, a metric I have tweeted about endlessly, is a deeply flawed resume metric. It takes into account the opponent’s winning percentage, opponent’s strength of schedule, scoring margin, pace of game, location, and opponent’s KPI ranking. However, if you follow me at all or have spent any time digging through KPI results, you know that the opponent’s KPI ranking and opponent’s SOS are much, much smaller components than the opponent’s winning percentage. Using those components, KPI awards each game a score from 1.0 to -1.0, with wins scoring 0.0 to 1.0 and losses scoring 0.0 to -1.0. Those scores are ultimately averaged to determine the team’s KPI score.
SOR, per ESPN’s website, considers “opponent strength, pace of play, site, travel distance, day's rest and altitude, and are used to simulate the season 10,000 times to produce season projections.” Do with that information what you will. My belief/educated guess is that “opponent strength” is measured by ESPN’s own BPI, the efficiency metric discussed above. The “day’s rest” and “travel distance” components are certainly unique to SOR, but definitely make sense. If a team traveled across the country for a game (especially in the national conferences era), that should be accounted for. If a team is playing, oh, I don’t know, its 3rd game in 4 days at 11:00am on a Sunday, a results metric should account for that.
Finally, the best results-based metric: Wins Above Bubble (WAB). As the name implies, WAB attempts to calculate how many more games you’ve won than the average bubble team (usually the 45th ranked team) would have won versus the same schedule. Bart Torvik has calculated WAB for years on his site, but the NCAA has set its own formula for WAB based on NET (whereas Bart’s is based on his own efficiency rankings). But as Torvik (and KenPom for that matter) and NET converge later in the season, when more data is available, the WAB metrics should be fairly similar as well. In essence, for WAB, every game is scored between -1 and 1, and those scores are cumulative, so all 31 WAB game scores get added up to come to a final number. The score for each game is basically 1.0 for a win or 0.0 for a loss, minus the probability (calculated by efficiency) that the average bubble team would win that game. So if an average bubble team has a 70% chance to win and a 30% chance to lose, a win in that game is worth +0.3 WAB and a loss is worth -0.7 WAB.
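The per-game WAB arithmetic can be sketched in a few lines of Python. The function names are hypothetical, and the bubble team’s win probability is simply passed in as a number. How Torvik or the NCAA actually derives that probability is not covered here.

```python
def wab_game_score(won, bubble_win_prob):
    """WAB credit for one game.

    Actual result (1 for a win, 0 for a loss) minus the probability
    that an average bubble team would have won the same game.
    """
    return (1.0 if won else 0.0) - bubble_win_prob


def total_wab(games):
    """Add up per-game scores; games is a list of (won, bubble_win_prob)."""
    return sum(wab_game_score(won, prob) for won, prob in games)


# The example from the text: a bubble team wins this game 70% of the time.
win_score = wab_game_score(True, 0.70)    # about +0.3
loss_score = wab_game_score(False, 0.70)  # -0.7

# Hypothetical three-game stretch: beat a cupcake, win a coin flip,
# lose a game the bubble team usually wins.
stretch = total_wab([(True, 0.95), (True, 0.50), (False, 0.70)])
```

Notice how little the cupcake win is worth (+0.05) compared to how much the bad loss costs (-0.7). That asymmetry is the whole point of the metric.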
For results metrics, the best results vary by metric. For SOR and WAB, beating good teams is the key. For KPI, the key is to beat teams that will win a lot of games. Frequently those are high-quality teams, but a lot of lower-conference powerhouses win a lot of games and can be used to help goose one’s KPI when scheduling. In other words, if you’re a high major and your goal is to maximize your KPI rating, you want to beat a lot of mid- and low-major opponents that will win a ton of games in their conference.
What About NET?!
What about it? It’s by far the most talked about metric owing to its status as the NCAA’s own proprietary metric. It is also the most misunderstood. At its most basic, the NET is a hybrid efficiency and results metric. However, in a sign that NET either hews closer to efficiency or provides evidence for the predictive ability of efficiency, Torvik, KenPom and NET rankings all tend to converge by the end of the season with some differences.
One of the biggest myths in metrics is that the NET “ranking” is an actual ranking. Meaning people persist in their belief that the number 1 NET team is the number 1 team in the country and the number 100 NET team is the 100th best team in the country. Or at least most people think the committee is supposed to think that way. That’s why people frequently wonder “how did team A get into the tournament/get left out of the tournament when their NET is X?” So whenever this blog refers to a NET “ranking,” picture it in scare quotes.
The NET was never intended to be a ranking tool, and it never has been. If the NET were a ranking tool, you could dispense with the committee and just run down the NET to fill out the bracket. The NET is a sorting tool, as the committee likes to call it. It is used to place Ws and Ls into quadrants for subjective record evaluation, and it is now used as the backbone of the NCAA’s proprietary WAB formula.
It’s very easy for people to criticize and write off the NET as irrelevant. The NET is simply an easy target. It is owned and released by the NCAA, and as the NCAA’s own metric it is the most high-profile metric (although perhaps KenPom gives it a run for its money), which means it’s frequently cited by bracketologists, analysts, coaches, fans, etc. as a measure of how good a team is. While the NET certainly attempts to do that, it was never meant to be precise, or to rank the teams in an order that corresponds to their resumes. That’s why it has a predictive efficiency component as well as a results component.
METRICS MANIFESTO
So how does one use all of these team sheet metrics to effectively discuss teams? It’s pretty simple. When discussing the quality of a W or L, refer to NET, and the quads setup thereunder, which is what the committee does. When evaluating bids to the big dance, use the 3 resume metrics. If you want to evaluate possible bets or what the seeding might look like or which teams have the highest chance of success in the dance, look to an efficiency metric.
Following the snubs of Seton Hall, St. John’s and Providence last season, I spent untold hours crunching numbers and trying to figure out what made the committee tick. As part of that, I ran a correlation function between each metric (the ones discussed above except Torvik, which was not on the team sheet) as well as “metrics” like Q1 win%, Q1&2 wins, and strength of schedule.
The most correlated metric was the “All Metrics Average” which included KenPom, BPI, NET, SOR, and KPI, with a .938 correlation, followed by an average of strictly resume metrics, which was .925 correlated. Of the individual metrics discussed herein, KPI had the highest correlation, .875, followed by the other resume metric on the team sheet last season, SOR, at .868 correlated. SOR’s correlation was likely harmed by how badly it undervalued most of the Mountain West teams that got in due to the altitude adjustment.
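For anyone who wants to try this kind of analysis at home, here is a bare-bones Python sketch of the idea: average several metric rankings per team, then correlate that average with some outcome. I am assuming a plain Pearson correlation and using toy numbers. The function names, the choice of Pearson, the outcome variable, and every value below are illustrative assumptions, not my actual spreadsheet or data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def average_rank(*rank_lists):
    """Per-team average across several metric rankings (one list per metric)."""
    return [sum(ranks) / len(ranks) for ranks in zip(*rank_lists)]


# Toy rankings for five made-up teams (NOT real data).
metric_a = [10, 25, 40, 55, 70]
metric_b = [20, 15, 45, 50, 80]
outcome = [1, 2, 3, 4, 5]  # hypothetical stand-in for the committee's ordering

all_metrics_avg = average_rank(metric_a, metric_b)
corr_avg = pearson(all_metrics_avg, outcome)
corr_b = pearson(metric_b, outcome)
```

Swap in real team sheet rankings and whatever outcome you care about (seed line, in/out) and you can run the same comparison for each metric and each average.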
In other words, we need to shift the way we discuss metrics, tournament bids, bubble teams, etc. The focus is always NET and KenPom, and this year Torvik will slot in beside KenPom as the 2nd most-talked-about efficiency metric. Instead, bracketologists, bubble watchers, coaches and the like need to focus on the resume-based metrics for bids, and then the efficiency-based metrics for seeding.
I pledge to do my part. When discussing a team’s tournament prospects, I will focus predominantly on the resume-based metrics. When I discuss seeding or gambling, I will focus on efficiency metrics.