Dec 1, 2019

New Metric: Free Throw Advantage

Over the summer, I spent a lot of time reflecting on the Four Factors model of college basketball. If you have ever read the book Basketball on Paper by Dean Oliver, then you know what I'm talking about, and it probably explains why you are reading this blog. One of the big concerns I have always had with the Four Factors model is the Free Throw Rate (FTR). In fact, most advanced metrics analysts, including Dean Oliver, have conceded that FTR is the least significant of the Four Factors. Since I always want my understanding of the game and the numbers to be at the highest level, I sought a different solution to the Free Throw Rate component, which brings us to this article. I'll start with a crash course on advanced metrics, then elaborate on the details of FTR, then introduce my new concept of Free Throw Advantage.

Crash Course on Advanced Metrics

Advanced metrics in basketball measure the quality of play through the fundamental unit of a possession. As Dean Smith was once quoted as saying, "I don't really know what a possession truly is, but I know when one has ended." Possessions are completed in one of three ways: a score (FGM or FTM), a defensive rebound, or a turnover. Counting made field goals and made free throws separately, that gives four outcomes: two of them (FGM and FTM) are good for the offense and bad for the defense, and two of them (defensive rebounds and turnovers) are bad for the offense and good for the defense. In all, you get four factors to measure how a team uses its possessions. The formula for college basketball looks like this:
  • POSS = FGA - OR + TO + (0.475*FTA)
On the surface, this formula lacks Field Goals Made (FGM), Defensive Rebounds (DR), and Free Throws Made (FTM). Through some mathematical substitution, you can derive this same formula if you start with POSS = FGM + DR + TO + FTM. The 0.475 coefficient is an estimator derived from play-by-play data that translates the FTM term into FTA by taking into account the relative occurrence of two-shot fouls, three-shot fouls (fouls on a missed 3PA), one-shot fouls (a.k.a. and-1s), and one-and-one fouls (non-shooting fouls on the 7th, 8th, and 9th team fouls).
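As a rough check on that substitution, here is a minimal Python sketch. The box score is invented for illustration, and it leans on two simplifying assumptions that are not spelled out in the text: every missed field goal is rebounded by one team or the other, and the free throw term is estimated as 0.475 * FTA.

```python
def poss_from_endings(fgm, opp_dr, to, ft_term):
    """Count possessions by how they end: POSS = FGM + DR + TO + FT term."""
    return fgm + opp_dr + to + ft_term


def poss_from_box_score(fga, orb, to, fta, ft_coeff=0.475):
    """Box-score form: POSS = FGA - OR + TO + (0.475 * FTA)."""
    return fga - orb + to + ft_coeff * fta


# Hypothetical team box score, invented for illustration only.
fga, fgm, orb, to, fta = 58, 26, 10, 12, 20

# Assumption: every missed field goal is rebounded by one side or the other,
# so the opponent's defensive rebounds are the misses the offense did not get.
opp_dr = (fga - fgm) - orb

# Assumption: the free throw term is estimated as 0.475 * FTA.
ft_term = 0.475 * fta

by_endings = poss_from_endings(fgm, opp_dr, to, ft_term)
by_box_score = poss_from_box_score(fga, orb, to, fta)
print(by_endings, by_box_score)  # the two forms agree
```

Substituting DR = FGA - FGM - OR is what makes FGM cancel out, which is why the box-score form never mentions made shots.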

How a team uses its possessions is measured by efficiency. Since wins and losses in games are determined by points, there should be a metric that connects the fundamental unit of advanced metrics (possessions) to the determinant of game outcomes (points). The formula is simple:
  • Offensive Efficiency = Points / Possession
  • Defensive Efficiency = Points Allowed / Possession
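A minimal sketch of the two efficiency formulas in Python follows. Both box scores are made up, and using the opponent's own possession estimate for defensive efficiency is an assumption here (some analysts average the two teams' estimates instead).

```python
def possessions(fga, orb, to, fta):
    """POSS = FGA - OR + TO + (0.475 * FTA)."""
    return fga - orb + to + 0.475 * fta


def efficiency(points, poss):
    """Points per possession (PPP)."""
    return points / poss


# Hypothetical box scores, invented for illustration only.
team = {"points": 75, "fga": 58, "orb": 10, "to": 12, "fta": 20}
opp = {"points": 68, "fga": 61, "orb": 9, "to": 11, "fta": 14}

team_poss = possessions(team["fga"], team["orb"], team["to"], team["fta"])
# Assumption: defensive efficiency is measured against the opponent's possessions.
opp_poss = possessions(opp["fga"], opp["orb"], opp["to"], opp["fta"])

off_eff = efficiency(team["points"], team_poss)
def_eff = efficiency(opp["points"], opp_poss)

print(f"Offensive efficiency: {off_eff:.3f} PPP")
print(f"Defensive efficiency: {def_eff:.3f} PPP")
```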
Each of the four factors has a corresponding advanced metric, which allows a direct comparison of the factor to efficiency metrics.
  • Shooting: Effective Field Goal Percentage; eFG% = (FGM + (0.5 * 3PM)) / FGA
  • Rebounding: Offensive Rebounding Percentage; OR% = tOR / (tOR + oppDR)
  • Turnovers: Turnover Percentage; TO% = TO / Poss
  • Free Throws: Free Throw Rate; FTR = FTA / FGA
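The four formulas above translate directly into code. This is a minimal sketch with an invented stat line; the function and variable names are illustrative only.

```python
def efg_pct(fgm, three_pm, fga):
    """eFG% = (FGM + 0.5 * 3PM) / FGA"""
    return (fgm + 0.5 * three_pm) / fga


def or_pct(team_or, opp_dr):
    """OR% = tOR / (tOR + oppDR)"""
    return team_or / (team_or + opp_dr)


def to_pct(to, poss):
    """TO% = TO / Poss"""
    return to / poss


def ftr(fta, fga):
    """FTR = FTA / FGA"""
    return fta / fga


# Hypothetical single-game stat line, invented for illustration only.
fgm, three_pm, fga = 26, 8, 58
team_or, opp_dr = 10, 22
to, fta = 12, 20
poss = fga - team_or + to + 0.475 * fta

print(f"eFG% = {efg_pct(fgm, three_pm, fga):.3f}")
print(f"OR%  = {or_pct(team_or, opp_dr):.3f}")
print(f"TO%  = {to_pct(to, poss):.3f}")
print(f"FTR  = {ftr(fta, fga):.3f}")
```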

Effective FG% quantifies how many points a team's typical shot attempt will get them. Since points are built directly into the equation and the majority of possessions result in a shot attempt, eFG% is more correlated with efficiency ratings than the other three metrics. OR% tells you how many extra shots a team generates per possession. It does require an additional formula to make it compatible with efficiency metrics since points, shot attempts, and possessions are all absent from the computation, but extra FGAs per possession should result in more possessions ending with points. TO% quantifies how many possessions a team typically wastes (gets nothing out of). It is typically the second-best metric in terms of correlation to efficiency metrics because possessions are directly involved in the calculation. Finally, we have FTR, which attempts to estimate how many possessions result in free throws.

In Depth: Free Throw Rate

Looking at the FTR, it has two components: Free Throw Attempts (FTA) and Field Goal Attempts (FGA). Simply put, it shows how many free throws a team gets to shoot per shot attempt. Since shot attempts are the typical result of a possession, there is a connection to efficiency metrics. Unfortunately, this is as strong as the relationship is going to get.

FTR also runs into a problem with how the rules of the game affect FGA and FTA. If a player gets fouled in the act of shooting and the shot goes in, the player gets one FTA and one FGA added to the boxscore. If a player gets fouled in the act of shooting and the shot does not go in, the player gets two FTAs and zero FGAs added to the boxscore. The problem, as you will recognize fairly quickly, is that the missed-FGA scenario inflates FTR more than the And-1 scenario, even though the And-1 scenario is far better for efficiency (points per possession, or PPP) than the missed-FGA scenario. (Writer's Note: Please do not confuse this statement as a justification/defense for And-1s in the game of basketball. Your humble writer would greatly prefer And-1s being eliminated from the game entirely.)
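To put rough numbers on the two scenarios, here is a small Python sketch. The baseline of 50 FGA and 15 FTA and the 70% free throw percentage are assumptions chosen only for illustration, and both scenarios are limited to two-point shots.

```python
def ftr(fta, fga):
    """FTR = FTA / FGA"""
    return fta / fga


FT_PCT = 0.70                  # assumed free throw percentage
BASE_FGA, BASE_FTA = 50, 15    # assumed box score before the foul

# What each shooting foul adds to the box score, plus its expected points.
scenarios = {
    "and-1 on a made 2": {"fga": 1, "fta": 1, "exp_pts": 2 + 1 * FT_PCT},
    "foul on a missed 2": {"fga": 0, "fta": 2, "exp_pts": 2 * FT_PCT},
}

results = {}
for name, s in scenarios.items():
    new_ftr = ftr(BASE_FTA + s["fta"], BASE_FGA + s["fga"])
    results[name] = (new_ftr, s["exp_pts"])
    print(f"{name:<20} FTR -> {new_ftr:.3f}   expected points = {s['exp_pts']:.1f}")
```

Under these assumptions the missed shot pushes FTR higher while producing fewer expected points, which is the mismatch described above.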

FTR doesn't make any connection on the points-side of the efficiency equation. For example, a statistic like FT% would have to be added to quantify how many of those free throw attempts actually put points on the scoreboard. Dean Oliver insisted that more attempts was better for a team than higher accuracy, and the math (which I will not do in this article) actually proves him right (a 10% increase in FTA holding FT% constant will result in more points than a 10% increase in FT% holding FTA constant). I have put together a table below (well, actually a few tables) that further reflects this inherent problem (lack of any connection to the points-side of efficiency). I will focus my attention on the first table (on the left) because it uses a base number of 50 possessions, which makes the math easier, but all the tables are the same calculation with the only difference being the base number of possessions (the middle uses a base of 60 and the right uses a base of 70, both of which may be closer to the average college basketball game). Each table assumes a game of X possessions with no turnovers, no offensive rebounds, no And-1s, no fouls on 3PA, and no missed FTs on the first of 1-and-1s. In simple terms, these tables show that a possession will either result in one FGA or two FTAs. Let's look at the results.
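A table of this kind can be sketched in a few lines of Python under the stated assumptions (every possession ends in exactly one FGA or exactly two FTAs). The step size, the column layout, and the function name are choices made for this sketch and are not taken from the original tables.

```python
def ftr_table(possessions, step=5):
    """Rows of (FT possessions, FGA, FTA, FTR) for a fixed possession count.

    Assumes each possession ends in one FGA or in two FTAs, with no
    turnovers, offensive rebounds, And-1s, or fouls on 3PAs.
    """
    rows = []
    for ft_poss in range(0, possessions, step):
        fga = possessions - ft_poss   # one FGA per remaining possession
        fta = 2 * ft_poss             # free throws come in pairs
        rows.append((ft_poss, fga, fta, fta / fga))
    return rows


print(f"{'FT poss':>7} {'FGA':>4} {'FTA':>4} {'FTR':>6}")
for ft_poss, fga, fta, rate in ftr_table(50):
    print(f"{ft_poss:>7} {fga:>4} {fta:>4} {rate:>6.2f}")
```

Calling ftr_table(60) or ftr_table(70) gives the larger bases. Nothing in the rows depends on FT% or eFG%, so FTR on its own says nothing about points.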


What the table implicitly shows is the trade-off between eFG% and FT%. Let me explain. Since we know that a 3PM is worth three points, a 2PM is worth two points, and an FTM is worth one point, an efficiency of 1.00 PPP comes from a 33.33% average on 3P%, a 50.00% average on 2P%, and a 100% average on FT%. As our hypothetical table shows, free throws come in pairs on possessions whereas FGAs come in singles on possessions, which means FT% only needs to be 50% to achieve an efficiency level of 1.00 PPP. In reality, we know most teams average higher than 50%. Theoretically, any good FT-shooting team that can trade out FGAs in exchange for FTAs (a high FTR) will drastically improve their efficiency (assuming that team hits those extra free throws). Thus, FT% should be involved in the calculation when examining efficiency through FTR.
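The break-even arithmetic can be checked with a short Python sketch. It assumes the same simplified game (each possession is one FGA or two FTAs); the 70% FT% and the possession splits are hypothetical.

```python
def ppp(fg_poss, ft_poss, efg, ft_pct):
    """Points per possession when each possession is one FGA or two FTAs.

    A field goal attempt is worth 2 * eFG% points on average, and a pair of
    free throws is worth 2 * FT% points on average.
    """
    points = fg_poss * 2 * efg + ft_poss * 2 * ft_pct
    return points / (fg_poss + ft_poss)


# Break-even: 50% eFG% and 50% FT% are both worth 1.00 PPP.
print(ppp(50, 0, efg=0.50, ft_pct=0.50))
print(ppp(0, 50, efg=0.50, ft_pct=0.50))

# Hypothetical 70% FT-shooting team trading FGAs for pairs of FTAs.
for ft_poss in (0, 10, 20):
    value = ppp(50 - ft_poss, ft_poss, efg=0.50, ft_pct=0.70)
    print(f"{ft_poss} FT possessions of 50: {value:.2f} PPP")
```

With those assumed percentages, each possession moved from a shot to the line is worth an extra 0.4 points, which is why FT% has to enter the picture somewhere.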

Last of all, FTR is very erratic on a game-by-game basis. A team may have a season-long FTR of 0.42, but looking at that team's game log, one night may be as high as 0.72 and the next night may be as low as 0.12. As stated above, FTR is the least important of the four factors, and the reason may be its lack of correlation to game outcome (or margin of victory). For example, the poor FTR of 0.12 may be a blowout win (a clearly superior team takes and makes high-quality shots with no fouls from the defense) and the incredible FTR of 0.72 may be a close loss (the large number of FTAs kept the game closer than it should have been). With all of the aforementioned problems with FTR, I sought a new solution to the free throw component from the ground up.

Free Throw Advantage: Explained

Let's start with the formula, and then I'll explain my rationale.
  • Free Throw Advantage: FTV = (FTA/2) - (oppFLS - 12)
As for the components of the equation, FTA means Free Throw Attempts and oppFLS means opponent's fouls. For FTA/2, it is the scenario described above where free throws usually come in pairs, unless it is an And-1, a one-and-one, or a foul on a 3PA (and this formula produces a 0.5 increment when this happens, so it provides a little extra information on that front). For oppFLS - 12, this term may look odd at first. In college basketball, each team gets six free fouls per half. Theoretically, a team can commit twelve fouls in a game (six per half) and never give up a single free throw. Since FTAs are being reduced by the opponent's foul count in this equation, FTV is calculating how many extra free throws are being created by the offense. FTV can also be calculated on the defensive side by using oppFTA in place of FTA and teamFLS in place of oppFLS.
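In code, the formula and its defensive mirror might look like the following minimal sketch. The game totals are hypothetical and the names are illustrative.

```python
FREE_FOULS = 12  # six free fouls per half, two halves


def ftv(fta, opp_fouls):
    """Free Throw Advantage: FTV = (FTA / 2) - (oppFLS - 12)."""
    return fta / 2 - (opp_fouls - FREE_FOULS)


# Hypothetical game totals, invented for illustration only.
team_fta, opp_fls = 24, 18
opp_fta, team_fls = 16, 17

offensive_ftv = ftv(team_fta, opp_fls)    # 24/2 - (18 - 12) = 6.0
# Defensive side: oppFTA in place of FTA, teamFLS in place of oppFLS.
defensive_ftv = ftv(opp_fta, team_fls)    # 16/2 - (17 - 12) = 3.0

print(offensive_ftv, defensive_ftv)
```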

First off, what does the resulting value mean?
  • The first and obvious meaning is extra free throws, which is why it is called Free Throw "Advantage." It is measuring a team's advantage in getting to the free throw line versus their opponent.
  • In particular, it measures what a team does with the opponent's twelve free fouls and the nuances of And-1s, 1-and-1 attempts, and fouls on 3PAs. If those twelve free fouls do not result in free throw attempts by the offense, then FTV will be zero (assuming no And-1s, no misses on 1-and-1s, and no fouls on 3PAs). Likewise, if those twelve free fouls each result in free throw attempts by the offense, then FTV will be twelve (again, assuming the same thing as before). And-1s and fouls on missed 3PAs will generate an extra FTA, which translates into an additional 0.5 in FTV, whereas missing the first free throw of a 1-and-1 attempt will result in an additional -0.5 in FTV.
  • If FTV is positive (negative), it is good (bad) for the offense and bad (good) for the defense.
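The zero, twelve, and half-step cases can be reproduced with a small Python sketch that builds FTV from a foul-by-foul list. The 18-foul game and the helper name ftv_from_fouls are hypothetical.

```python
def ftv_from_fouls(fta_per_foul, free_fouls=12):
    """FTV from a list holding the FTAs awarded on each opponent foul."""
    fta = sum(fta_per_foul)
    opp_fouls = len(fta_per_foul)
    return fta / 2 - (opp_fouls - free_fouls)


# Hypothetical 18-foul games. In the baseline, the twelve free fouls give
# no free throws and each later foul sends the offense to the line for two.
baseline = [0] * 12 + [2] * 6
all_fouls_shoot_two = [2] * 18
one_foul_on_missed_3pa = [0] * 12 + [2] * 5 + [3]
one_missed_front_end = [0] * 12 + [2] * 5 + [1]

print(ftv_from_fouls(baseline))                # 0.0
print(ftv_from_fouls(all_fouls_shoot_two))     # 12.0
print(ftv_from_fouls(one_foul_on_missed_3pa))  # 0.5
print(ftv_from_fouls(one_missed_front_end))    # -0.5
```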

Second off, what advantages (haha, see what I did there) does FTV have over FTR?
  • The main difference between the two metrics is that FTV has far more meaning on a game-by-game basis. Since it is strictly dealing with extra free throw attempts, FTV is not subject to wide ranges of values like FTR. Using 2019 season-end data, the highest and lowest offensive FTV values for all teams were +12.0 and -7.5, respectively (these extremes were from two different teams). This narrower range is much better than the 0.12 to 0.72 range that one particular team experienced in 2019. Whereas season-long metrics like FTR are better suited to producing probabilistic models, game-by-game metrics should be better for single-game prediction (which should make FTV more applicable to predicting the tournament than FTR).
  • Since FTV is strictly dealing with extra free throw attempts, the offensive and defensive calculations can be merged to create an FTV margin value.
  • FTV scales with its off-setting measure, unlike FTR. When FTAs go up, FGAs typically go down (as shown by the model). With FTV, more fouls result in more free throws, but FTV does not change because it only looks at what results from the twelve free fouls.
  • Theoretically, averages and standard deviations should be applicable to FTV, in order to give it a season-long metric.
  • Finally, it incorporates fouls into the calculation, and fouls are an important component of the game. As advanced metrics move into the territory of foul theory (the value from drawing fouls on key players) and V.A.R. metrics (value above replacement - how much value a key player is above their replacement), FTV will have a role to play as fouls are already in the equation.
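As a sketch of how the margin and the season-long summary might be computed, the Python below uses a three-game log that is entirely made up. Defining the margin as offensive FTV minus defensive FTV is an assumption, since the text only says the two can be merged.

```python
from statistics import mean, stdev


def ftv(fta, opp_fouls, free_fouls=12):
    """Free Throw Advantage: FTV = (FTA / 2) - (oppFLS - 12)."""
    return fta / 2 - (opp_fouls - free_fouls)


# Hypothetical game log: (FTA, oppFLS, oppFTA, teamFLS) per game.
game_log = [
    (24, 18, 16, 17),
    (12, 15, 22, 19),
    (30, 22, 10, 16),
]

offense = [ftv(fta, opp_fls) for fta, opp_fls, _, _ in game_log]
defense = [ftv(opp_fta, team_fls) for _, _, opp_fta, team_fls in game_log]

# Assumption: margin = offensive FTV minus defensive FTV, game by game.
margin = [o - d for o, d in zip(offense, defense)]

print("Offensive FTV by game:", offense)
print("Defensive FTV by game:", defense)
print("FTV margin by game:   ", margin)
print(f"Season offensive FTV: mean {mean(offense):.2f}, st. dev. {stdev(offense):.2f}")
print(f"Season FTV margin:    mean {mean(margin):.2f}, st. dev. {stdev(margin):.2f}")
```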

Finally, what disadvantages does FTV have?
  • Like FTR, FTV does not use FT% in its calculation, which we know is important. However, the calculation of FTV provides one benefit that FTR doesn't. When FT% is combined with FTV or even FTV margin, a point value is created, and points matter when looking at efficiency (or PPP). When FT% is combined with FTR, then essentially the equation for the metric becomes FTM / FGA, which is a new (and somewhat useful) metric in its own right.
  • Since fouls are involved in the calculation, fouls add a randomness to the equation not present in the FTR. In my experience, fouls may be one of the toughest statistics to predict in the game of basketball. Just from data alone, a data scientist can predict with reasonable accuracy the score, the shooting percentages, the rebounding margins, and the turnover rates of a game involving two teams. What is nearly impossible to predict is the foul totals, or even more importantly, who accumulates the fouls. Some teams like UNC and MIST are more prone to drawing fouls on the opposing team's post players whereas dribble-drive teams like NOVA and UK are more prone to drawing fouls on the opposing team's guards. It is an element of randomness that could potentially make the metric invalid, but for now, FTV is in test-mode.
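On the FT% point in the first bullet, here is a hedged Python sketch of both combinations. The FTR identity (FTR times FT% equals FTM / FGA) is exact. The FTV point value is only one possible reading: it assumes each unit of FTV stands for a pair of free throws, so points = 2 * FTV * FT%. That conversion is not spelled out in the text, and the game totals are hypothetical.

```python
def ftv_points(ftv_value, ft_pct):
    """Assumed conversion: one unit of FTV is a pair of free throws."""
    return 2 * ftv_value * ft_pct


def ftm_per_fga(fta, ftm, fga):
    """FTR combined with FT%: (FTA / FGA) * (FTM / FTA) = FTM / FGA."""
    ftr = fta / fga
    ft_pct = ftm / fta
    return ftr * ft_pct


# Hypothetical game totals, invented for illustration only.
fta, ftm, fga, opp_fls = 20, 14, 58, 16

ft_pct = ftm / fta                       # 0.70
ftv_value = fta / 2 - (opp_fls - 12)     # 10 - 4 = 6.0

print(f"FTV = {ftv_value:.1f}, point value = {ftv_points(ftv_value, ft_pct):.1f}")
print(f"FTM / FGA = {ftm_per_fga(fta, ftm, fga):.3f}")
```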

Conclusion

Well, I hope you have enjoyed this foray into my summer project, although this metric was just a piece of the larger project. If you're interested in FTV, I have left all the details of my work in this article if you want to replicate them, and if you want to build upon the work, you can pick up right where I left off. Anyways, I hope our understanding of the game has increased as well as our predictive accuracy. Depending on how things go IRL, I will either produce an article in the middle of December or use all of that time to figure out what is going on in college basketball this season for the January QC Analysis. Until then, thanks for reading my work.
