Project: Perfect Bracket: Welcome to the 2021-2022 College Basketball Season

Welcome back to the greatest challenge of all time. And what a season we just finished!!! I don't think that seven-word sentence truly captures what we witnessed, but I've got an entire article (well, maybe half) to expound upon it. I also want to get some go-with-my-gut predictions for this season, so there's a lot to take on without writing an entire book.

2021 NCAA Tournament and Lessons Learned

The 2021 Tourney was one for the record books. It produced a 23.46% M-o-M rating, an all-time high surpassing 2014's mark of 21.35%. It also produced 15 upsets, tying the record set in 2014. The PAC-12 came out of purgatory with three teams in the E8 after barely getting three teams in the field in the previous two tourneys. It also produced five new coaching contenders: Andy Enfield (USC), Juwan Howard (MICH), Mick Cronin (UCLA), Eric Musselman (ARK), and Wayne Tinkle (ORST). Since 1985, only two coaches have won National Championships without reaching the E8 in a previous season (a nice little homework assignment if you don't already know the two).

In the Final Results article that I published at the end of last season, I said "the high-rising tail of the QC was the key to understanding the outcomes of 2021." For most of the season, the QC remained constant in the #26-#50 positions. Most of the fluctuations occurred among teams in the upper-half (#1-#25 positions). From the time of the February Edition (Feb 3) to the Final Edition (Mar 14), the entire QC had shifted upward along all parts of the curve (yes, with the exception of a few spots). I theorized in the Final Edition that the high-rising tail was probably the result of these teams "catching up" to their true potential, and I thought this way because the efficiency improvement wasn't coming at the expense of the upper-half of teams. When the tail of the QC is elevated, it portends a higher quantity of upsets in the R32 and S16. It simply means these teams are playing better and the difference in quality between them and the upper-half is thinning. In simple terms, there would be very little difference between team #24 (a true 6-seed) and team #44 (a true 11-seed), and if you catch a weak 3-seed, you have an 11-seed with S16 likelihood. I understood exactly what the QC was telling me, but where I failed was application (more on this in a later section).

I also stated in the Final Results article that Seed-curve & Seed Displacement were not as important to the outcomes of 2021. The seed-curve approach takes strength and weakness in the SC compared to the QC and projects over-performance and under-performance accordingly. I explained why it can run into problems with the 1-8-9-16 groups. 1-seeds, 8-seeds, and 9-seeds were all expected to out-perform their expectations, but this is impossible when all three groups play each other in the first two rounds. The hypothetical maximum is 16 wins for 1-seeds (in line with expectations instead of over-achieving: 4-0 in R64, 3-1 in R32, and the remaining three advancing to the F4), 3 wins for 8-seeds (1-3 in R64 with the lone winner advancing to the E8), and 3 wins for 9-seeds (3-1 in R64 and 0-3 in R32). It's a near-perfect dream scenario with very low probability of happening. The seed displacement approach uses over-seeds and under-seeds to predict outcomes. Under-seeded lower-seeds should defeat over-seeded high-seeds. When it is really obvious, it works, but these situations are few and far between. When it only appears to work, it is nothing more than a coin flip. For example, WISC and LOYC were the largest under-seeds in 2021 (they were as strong as 3-seeds but received 8- and 9-seeds). Both won their R64 games with relative ease and met accurately seeded 1-seeds in the R32, but only LOYC pulled the upset. If the method were practical, these two nearly identical cases would not split 50-50 (either both should win or both should lose). For the record, I've come to rely less and less on this approach for individual games. I'm far more comfortable with over-seeds and under-seeds being mapped into the bracket to find pods/octets/regions with upset potential, and even this approach is still under development.
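The best-case arithmetic for the 1-8-9 groups can be checked with a short tally. This is only a sketch: the R64 and R32 records come straight from the paragraph above, while the split of the 1-seeds' last three wins (two in the F4, one in the title game) is my assumption about how the 16-win total is reached. The names `best_case` and `total_wins` are made up for illustration.

```python
# Best-case wins per round for each seed group.
# Round order: R64, R32, S16, E8, F4, NC.
best_case = {
    # 4-0 in R64, 3-1 in R32, the three survivors win out to the F4.
    # ASSUMPTION: with three 1-seeds in the F4, at most two can win a
    # semifinal and one wins the title, giving 2 + 1 more wins.
    "1-seeds": [4, 3, 3, 3, 2, 1],
    # 1-3 in R64; the lone winner beats a 1-seed in R32 and wins in S16.
    "8-seeds": [1, 1, 1, 0, 0, 0],
    # 3-1 in R64, then 0-3 against 1-seeds in R32.
    "9-seeds": [3, 0, 0, 0, 0, 0],
}

def total_wins(wins_by_round):
    """Sum a seed group's wins across all six rounds."""
    return sum(wins_by_round)

totals = {group: total_wins(wins) for group, wins in best_case.items()}

# The 8-seeds and 9-seeds play each other in four R64 games,
# so their combined R64 wins must equal 4.
r64_8v9 = best_case["8-seeds"][0] + best_case["9-seeds"][0]

# The four R32 games in these pods each have exactly one winner
# from the 1-, 8-, or 9-seed groups.
r32_pod_wins = sum(wins[1] for wins in best_case.values())

for group, wins in totals.items():
    print(group, wins)
```

The two consistency checks (`r64_8v9` and `r32_pod_wins`) are the whole point: every win one group gains in the first two rounds is a loss for one of the other two, which is why all three cannot over-perform at once.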

One lesson I learned from the past season, but will probably never get a chance to apply to future seasons, is the impact of no home-court advantage. For most of the season, fans were prohibited from attending games (depending on the state laws and restrictions of the two teams, and yes, some restrictions were lifted near the end of the season). This lack of home-court influence meant road teams were more likely to win if they were the more skilled teams. Even advanced metrics ratings systems adjusted their margin of error for home-court advantage. For example, Sagarin ratings were using a 2-pt margin of error for home-court advantage when in other years it was usually 3 to 3.5 pts. When it comes to the NCAA tournament, least distance traveled to site locations is historically a key factor in wins and losses, which is why the Selection Committee tries to give closer-to-home sites to the higher seeds. In a year with no home-court advantage, distance traveled might be an incentive instead of a detriment. You have more to lose if home is further away from Indianapolis. The E8 aligns with this notion: GONZ (Washington state), USC and UCLA (California), HOU and BAY (Texas), ARK (Arkansas), and ORST (Oregon), with MICH (self-explanatory) being the lone exception. I thought closer teams would have an advantage, which is why I liked 4-seed PUR (Indiana), and they didn't even win a game. Most of the close-by teams lost early: 1-seed ILL was the first to go home (albeit to an Illinois-based LOYC, who ended up losing to ORST), 2-seed OHST (Ohio) failed to win a game, 2-seed IOWA lost to ORE, and 3-seed WVU and 5-seed TENN are a stone's throw from Indianapolis but lost to further-traveled teams in 11-seed SYR and 12-seed ORST. My bracket probably would have scored higher from this strategy than curve-fitting.

Grading the Predictor

Failure of Application: Earlier, I mentioned I correctly read the QC but failed to apply it. When the QC was predicting a higher quantity of upsets in the R32 and S16, I should have looked to round-by-round upset history to see exactly what "higher quantity of upsets" means. On average during the advanced metrics era (2002-present), the number of upsets is 2.83 upsets in the R32 and 1.17 upsets in the S16. When you factor in recency bias, the averages are slightly higher. In other words, a higher quantity of upsets in these rounds would mean 4-5 in R32 and 2 (maybe 3) in S16. Though I never made any predictions public, I went with a 4-2-1-0-0-0 upset-by-round model for my own bracket. A failure to properly apply knowledge and insights produced this failing grade: F.

Seed Guides: All in all, I give a B+ or B. My seed-group analysis was pretty spot on, but far from perfect. I'm not going into full detail on this since there is an entire article you can check for yourself. The one glaring miss keeping me out of A-range for a grade is my comment on the 15-seed group when I said "Just like the 16-seed group, nothing to see here." That was a huge oversight.

Meta Analysis: I did try to apply meta-analysis to my personal bracket. I favored teams with the trading strategy (Top 50 2P% Def with either a higher rank in EFG% or a higher 2P% rank). It's why I had BYU making an E8 run. I also mentioned in the article that ORB was an anti-meta play as long as the team fit some of the other anti-meta criteria. This was the gold mine call. Three E8 teams (BAY, HOU and USC) ranked in the Top 10 in ORB%. All teams in the E8 ranked in the Top 128 in ORB%, and no other stat boasts this proficiency level (the next closest is TOR with all teams in the Top 150). ORB% may be in decline in college basketball, but BAY demonstrated how important it was in the NC Title Game. Their entire lead in the game was due to 2nd chance points. Again, another B+ or B in these predictions, and this is pretty good considering how untested this approach is.

Gut Predictions: These were not perfect, but they were as good as it gets and they were insights you would not get from any leading sports news, stats or analysis outlet. First, all year I said the tournament was likely to feature a forfeiture. Of course, it was a 2nd-day match-up between ORE and VCU, which is why I missed it. I also stated that one fortunate team could get to play 5 games in Lucas Oil Stadium and one in Hinkle Fieldhouse. The consistency in venue can do wonders for shooting. BAY was the fortunate team, playing four games in Lucas Oil and the other two in Hinkle Fieldhouse. For a team that was very dependent on and proficient in 3P%, I'm sure venues played a role. Unfortunately, the NCAA and CBS don't pre-release venue and tip-off time for all 63 games before the tournament starts (SOMETHING THEY SHOULD DO!!!!!!!!). Finally, I made a list of teams that preemptively traveled to Indianapolis. With the exception of GONZ, the rest of the list was Cinderellas, two of which advanced to the S16 from stunning upsets. 15-seed O-ROB won over both 2-seed OHST and 7-seed FLA, and 8-seed LOYC stunned 1-seed ILL. The only miss from this article was the one mentioned above about home-court advantage, but for this article I get an A.

Thoughts on the Upcoming Season -- Are We Back to Normal???

I worded it this way because it depends on how we define normal. Structurally, this season should be normal as we get back to full schedules and full venues, normal locations for tournament games, and normal mistakes by the selection committee for easy bracket picks.

In terms of bracket sanity/insanity, I believe 2022 may be abnormal (and abnormal implies a sane/calm tourney). The 2021-22 season features a one-of-a-kind scenario for college basketball: The 5th-year player. Due to Covid-19 cancelling the 2020 post-season, seniors were granted a fifth year. I'm not going to debate the merits of it, but I will say it adds something that college basketball always seems to lack: Experience. As I've said many times, I believe experienced talent wins national championships (This article explains experienced talent). I do intend to look at an updated Experience Talent Model at some point in the season, but whether or not I get around to doing all of the research for it is another story. Back to the discussion, experienced talent adds stability (low M-o-M ratings and low upset counts) to the tournament because experienced talent features players who can play the game (talent) and understand all of its detailed intricacies (experience). Compared to last year's record M-o-M rating and record-tying upset count, I think this year will be a lot calmer with this retained experience. If you believe in mean-reversion like I do, that's another reason why a calmer tournament is expected.

One of my newest tools, the Seed-Group Loss Table, was not included last year. The tool implements winning percentages among seed-groups and uses these to project seed-group performance. When a season features game cancellations throughout the season, this affects the reliability of that season's win percentages. For example, GONZ and BAY were scheduled to play during the regular season but the game was cancelled, and this missing game affects the eventual win-percentage of the group, making its results less reliable and less comparable to other years. For a fun article this year, I may hypothetically attempt to project 2021 with this tool and see what the results would be, but it should be back for the 2022 tournament.
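To show why cancellations muddy a win percentage, here is a minimal sketch. It is not the Seed-Group Loss Table itself, only the basic arithmetic behind the reliability concern. The function names and the 20-10 record are hypothetical placeholders, not real 2021 data.

```python
def win_pct(wins, losses):
    """Win percentage from the games that were actually played."""
    games = wins + losses
    if games == 0:
        raise ValueError("no games played")
    return wins / games

def full_schedule_range(wins, losses, cancelled):
    """Lowest and highest win percentage the full schedule could have
    produced, had every cancelled game been played."""
    games = wins + losses + cancelled
    lowest = wins / games                # every cancelled game is a loss
    highest = (wins + cancelled) / games  # every cancelled game is a win
    return lowest, highest

# HYPOTHETICAL record for illustration only: 20-10 with one game cancelled.
observed = win_pct(20, 10)
lowest, highest = full_schedule_range(20, 10, cancelled=1)

print(f"observed: {observed:.3f}")
print(f"possible full-schedule range: {lowest:.3f} to {highest:.3f}")
```

One missing game only moves the number a little, but a season full of cancellations widens the range for every group, which is what makes that season's percentages harder to compare with other years.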

Not only do I get to take my newest tool for a spin this year, I should get to write more than just QC articles for this season. With games starting on November 9, I can't wait to get this season underway. As always, thanks for reading my work.

Project: Perfect Bracket

Nov 1, 2021