Dec 23, 2018

Return and Improve Model: 2018 Revisited

I was unsure what topic to discuss in this last article before the January Edition of the Quality Curve Analysis. Of the four articles I have written this season, three have focused on the 2018 tournament and the lessons learned from it. Since the January QC Article will pivot our entire attention to the 2019 tournament, and 75% of the articles leading up to it have been 2018-centric, one more article about 2018's wild ride seems fitting. It's not like it could hurt.

Anyways, I'm going to take a second look at the Return & Improve Model. I first revealed this model for the 2017 tournament (link to the article if you want to refresh your memory). I wanted to do a quick article on it during 2018's Crunch Week, but IRL things popped up that Wednesday and forced me to put it aside. That may have been for the better, since some of the findings in this article could only have been discovered ex post facto. So let's jump right into it.


2018 Return & Improve Qualifiers

Let's start with some definitions for the Return & Improve (R&I) model. If you read the 2017 article, these definitions will sound repetitive. In a given tournament year, if a team in that tournament also went to the previous year's tournament, that team qualifies as an R&I participant. For example, if a team in the 2018 tournament also went to the 2017 tournament, that team is an R&I qualifier. A table of all 2018 R&I qualifiers is below. The teams are sorted high-to-low by the arithmetic mean of their statistical percentages (the mean itself is not shown in the table).


Quick Definitions for Understanding the Table
  • The percentages in the table reflect the percentage of the given statistic returned by the 2018 team from the 2017 team. For example, the 2018 HALL team returned 80.61% of their game starts (GS) from the 2017 HALL team. 
  • The SC column represents the Seed-Change from 2017 to 2018, where a value greater than zero means the 2018 team received a better seed than the 2017 team. 
  • The Wins column represents the number of wins accumulated by the 2018 team in the 2018 tournament. 
  • The WC column represents the Wins-Change from 2017 to 2018, where a value greater than zero means the 2018 team went further in the 2018 tournament than the 2017 team went in the 2017 tournament. 
  • If the R&I model was being used in a predictive capacity for 2018 (before the tournament started), the values for the Wins and WC columns would be unknown.
  • The row labelled "corr" shows the correlation coefficient between each statistical category column and the Wins-Change column.
The purpose of the R&I model is to find a relationship between "returning" production in specific statistical categories and "improving" tournament performance (WC > 0). The theory behind the model posits that returning a higher percentage of the previous year's production should result in this year's team winning more tournament games than the previous year's team did.
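To make the bookkeeping concrete, here is a minimal sketch of the two core calculations, return percentage and Wins-Change. The function and field names are illustrative, not the actual dataset's columns:

```python
# A minimal sketch of the R&I bookkeeping. Names (return_pct, wins_change)
# are illustrative, not taken from the article's actual dataset.

def return_pct(prev_team_total, returned_total):
    """Share of last year's production that is back on this year's roster."""
    return 100.0 * returned_total / prev_team_total

def wins_change(wins_this_year, wins_last_year):
    """WC > 0 means the team went further than it did the year before.
    Note the article's convention: a play-in loss counts as -1 wins, so
    merely making the field the next year (0 wins) already registers as
    an improvement."""
    return wins_this_year - wins_last_year

# Example: 2018 HALL returned 80.61% of 2017's game starts; with the
# 2017 team's GS total and the returned GS total, return_pct would
# reproduce that figure.
```
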

Thoughts in General on the Table
  • First, it should be noted that 2018 Providence is an auto-improve. In the table, PROV has 0 wins for 2018, yet they still show improvement with a WC > 0. In 2017, PROV lost in the play-in game to USC, which counts as a -1 in the Wins column. Simply by making the tournament field in 2018, the worst PROV could do was 0 wins, which is still an improvement over 2017.
  • Second, I don't think the correlation coefficient is an accurate test of our model's hypothesis.
    • First, the "return" data and the "improve" data are two different kinds of data. The return data is ratio-scale, ranging from 0.00% to 100.00%, whereas the improve data is interval-scale, ranging from -7 to +7. More importantly, improvement as the hypothesis defines it is really a binary outcome (improved or failed to improve), i.e., nominal-scale data, so correlating raw Wins-Change values against a continuous variable is a questionable test of it. This mismatch helps explain why the values for all stats lie in a narrow band between 0.008 and 0.179. As a rule of thumb, a correlation coefficient should be at least 0.400 before claiming a positive relationship exists, and at least 0.700 before calling that relationship a "strong positive relationship".
    • Second, though the correlation coefficient may be an invalid test of our hypothesis, the pattern within the range of values is interesting. The best-performing correlations are 3PM, 3PA, Pts, FGM, FGA, and Mins, all of which fall within 0.023 (13.45%) of the range's maximum value of 0.179. Returning players who logged a lot of your team's minutes, points, and shots sounds like a really good recipe for going deeper into the tournament. (Both correlation calculations are sketched after this list.)
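For the curious, here is a rough sketch of both correlation checks: the Pearson coefficient as it appears in the table's "corr" row, and a point-biserial alternative that treats improvement as the binary outcome argued for above. The column handling and names are my own illustration, not the article's actual code:

```python
# A hedged sketch of the two correlation tests discussed above.
from scipy.stats import pearsonr, pointbiserialr

def corr_row(return_pcts, wins_changes):
    """Pearson r between one stat's return percentages and Wins-Change,
    i.e., what the table's "corr" row reports for that stat."""
    r, _ = pearsonr(return_pcts, wins_changes)
    return r

def binary_corr(return_pcts, wins_changes):
    """If "improve" is treated as binary (WC > 0 or not), the
    point-biserial correlation is the more natural test."""
    improved = [1 if wc > 0 else 0 for wc in wins_changes]
    r, _ = pointbiserialr(improved, return_pcts)
    return r
```
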
Testing the Thresholds of the R&I Model

As I did in the original article, let's look at the R&I model historically and examine how well a particular statistical category predicts improvement when a certain threshold for that category is met. Before looking at the table, I first need to explain its notations.

Quick Reference for Understanding the Notations
  • This table does not include 2018's results. Only data up to the 2017 tournament is used.
  • 90R# = Number of teams that returned more than 90.00% in the particular stat column. For example, 21 under GS means "21 teams returned more than 90% of their game starts from the previous year."
  • 90R&I# = Number of teams that returned more than 90% in the particular stat column AND improved their tournament performance the following year. For example, 15 under GS means "15 teams returned more than 90% of their game starts from the previous year AND improved their win total from the previous year's tournament."
  • 90R&I% = "90R&I#" divided by "90R#". For example, 15/21 = 71.43% of teams that returned more than 90% of their Games Started (GS) saw an improvement in their tournament performance the following year.
  • 80 means greater than 80% but less than or equal to 90%.
  • 70 means greater than 70% but less than or equal to 80%.
  • 60 means greater than 60% but less than or equal to 70%.
  • 50 means greater than 50% but less than or equal to 60%.
Hopefully, the notations make sense from looking at the table below.
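In code form, the bucketing behind the table might look something like the sketch below, assuming a list of (return percentage, improved) pairs per stat category; all names are illustrative:

```python
# A sketch of the threshold table's logic. Each bucket is (lower, upper],
# matching the article's notation: "90" means more than 90%, "80" means
# greater than 80% but less than or equal to 90%, and so on.

BUCKETS = [(90, 100), (80, 90), (70, 80), (60, 70), (50, 60), (40, 50)]

def threshold_table(pairs):
    """pairs: list of (return_pct, improved_bool) for one stat category.
    Returns {lower_bound: (R#, R&I#, R&I%)}, mirroring the 90R#,
    90R&I#, and 90R&I% columns."""
    table = {}
    for lo, hi in BUCKETS:
        in_bucket = [imp for pct, imp in pairs if lo < pct <= hi]
        r_count = len(in_bucket)           # teams in this return band
        ri_count = sum(in_bucket)          # of those, how many improved
        ri_pct = 100.0 * ri_count / r_count if r_count else float("nan")
        table[lo] = (r_count, ri_count, round(ri_pct, 2))
    return table

# Example from the article: 21 teams returned more than 90% of GS,
# 15 of them improved, giving 15 / 21 = 71.43%.
```
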

Thoughts in General
  • It should be very obvious from the chart, but I'll state it anyway: as the returning percentage decreases, so does the probability of improving upon the previous year's tournament performance. In the key statistical categories of points and minutes, returning more than 80% of either implies roughly a 2/3 probability of improving your tournament performance, returning 70-80% implies roughly a 1/2 probability, and returning 40-70% implies roughly a 3/10 probability. Short and sweet: experience matters.
  • When it comes to bracket-picking strategies, I prefer to keep my odds as high as possible, so I recommend using the 80%-and-higher return percentages when making picks based on the R&I model. While this strategy is safer, it may only yield four or five picks out of a 63-game tournament. For example, the first five teams in the qualifiers list above all either improved upon or matched the previous year's win total, which would have given bracket pickers five correct games.
Comparing 2018 R&I to the Averages

2018 was the third-wildest tournament in history according to the Mad-o-Meter, so how did the R&I Model fare in 2018 compared to the averages? The chart below shows the R&I model in 2018, which can be visually compared against the historical averages (the chart above).

Thoughts in General
  • I did not include the 90%-and-higher threshold because only four categories qualified, and all four belonged to HALL, which improved its tournament performance.
  • The 80-90% threshold held to form, even though most categories had only one qualifier. In a year as crazy as 2018, I want my model's "sure-fire picks" to be solid gold.
  • The 70-80% threshold took the brunt of 2018's damage. For a threshold that historically sits at a 50-50 probability no matter the statistic in question, the 2018 results were at best a 1/2 probability and at worst a 1/4. In my opinion, this makes the 70-80% threshold too risky to reliably predict bracket games, especially since we accurately foresaw the craziness of 2018.
  • The 40-70% thresholds deviated from the averages in their own unique ways. The 60-70% thresholds under-performed the historical averages, the 50-60% thresholds met or slightly under-performed the historical averages (depending on the stat category), and the 40-50% thresholds over-performed the historical averages.
    • It is likely that this result is a product of 2018's craziness. It may be something to monitor for future tournaments that look similarly crazy.
    • These results also suggest an alternative way to employ the R&I model, one I will call group-based predictability (sketched after this list). If the average probability of improvement for the 40-70% threshold is between 3/10 and 4/10, then a given year's crop of teams within this threshold can be predicted from the group average. In the 2018 qualifiers list, all teams from CIN to FLA (15 total) fall into this threshold range, so the group-average prediction is 4.5 (15 * 0.3) to 6.0 (15 * 0.4) improvers. In actuality, five teams (KSU, NOVA, MICH, FLST, and KU) within this threshold improved, which is in line with the group-based target. Identifying which five will improve and which ten will not is the harder task.
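Here is a minimal sketch of the group-based calculation, with the 0.30-0.40 band taken from the historical 40-70% improve rates discussed above:

```python
# Group-based predictability: predict HOW MANY teams in a threshold band
# should improve, rather than WHICH ones.

def expected_improvers(n_teams, low_rate=0.30, high_rate=0.40):
    """Range of teams expected to improve, given the group size and the
    historical improve-rate band for that threshold."""
    return n_teams * low_rate, n_teams * high_rate

# 2018's 40-70% group ran 15 teams deep (CIN through FLA):
lo, hi = expected_improvers(15)   # (4.5, 6.0)
# Five teams actually improved, inside the predicted band.
```
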
Conclusions

As a bracket scientist, I think this model has a very valuable place in bracket-picking. Any model that can produce sure-fire picks in a tournament is a must-have, even if it only yields a handful of picks. The R&I model hasn't reached that level of reliability yet, but the insights it can produce, especially when given the right tweaks, are invaluable. Besides the group-based predictive targets described above, here are some other tweaks to the R&I model worth considering:
  • Return-and-match: Just as the name would suggest, this tweak to the model would lower the bar from "improving upon" to "matching" previous tournament success.
  • Weighted probability for individual teams: As stated in the Qualifiers section, some stats (3PM, 3PA, Pts, FGM, FGA, and Mins) showed higher correlations to improvement than others. By weighting each stat category's historical probability by the team's corresponding return value, an exact probability of improvement can be calculated for an individual team (see the sketch after this list).
  • Data relevancy: In this one-and-done era (or NBA arms-race era, as I like to call it) of college basketball, returning players year-over-year is a bigger challenge than improving tournament performance year-over-year. The data in this model dates back to the 2001-2002 season, yet the one-and-done era started with the 2006-2007 season, and to be even more specific, the arms race didn't pick up until 2009-2010. The data and the probabilities should reflect this paradigm, with the earlier seasons acting as a control group.
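As a rough illustration of the weighted-probability tweak, the sketch below assumes we already have a per-stat improve-probability curve (e.g., derived from the threshold table) and per-stat weights informed by the correlations. None of these values are fitted here; everything is an assumption for illustration:

```python
# A sketch of the weighted-probability tweak. prob_curves and weights are
# hypothetical inputs, not fitted values from the article's data.

def team_improve_probability(team_return_pcts, prob_curves, weights):
    """team_return_pcts: {stat: return_pct} for one team.
    prob_curves: {stat: callable mapping a return_pct to P(improve)},
                 e.g., a lookup into the historical threshold table.
    weights: {stat: weight}, presumably larger for the high-correlation
             stats (3PM, 3PA, Pts, FGM, FGA, Mins).
    Returns a single weighted-average probability of improvement."""
    total_weight = sum(weights[s] for s in team_return_pcts)
    weighted = sum(weights[s] * prob_curves[s](pct)
                   for s, pct in team_return_pcts.items())
    return weighted / total_weight
```
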
Though these changes might improve the versatility of the R&I model, in its current state it is best suited as a complementary tool, one that can break a tie when the primary tools produce contradictory or inconclusive predictions. Anyways, thanks for reading. The January QC Analysis will be the next article, expected around Jan 3-6.
