Nov 5, 2023

Review of the 2022-23 CBB Season (Updated Mar 1)

If you are a regular reader of the blog, the title speaks volumes, but more on that at the end of the article. Let's dive into the final article and pop out some grades.

OS/US Predictive

Model Grade: B          Scientist Grade:C+

In my opinion, this model was a very good starting point for filling out a bracket. By "starting point," I mean it would have gotten you in the ballpark of the final outcome.

  • ALA: Projections for E8, NC, E8, and S16, and actual result was S16.
  • ARI: Projections for R32, R32, and E8, and actual result was R64.
  • PUR: No projections due to lack of historical equivalents, but the last time an octet lacked historical equivalents (2022), history was made, and the actual result here was history being made again (16 def. 1).
  • MARQ: Only one projection for S16, and actual result was R32.
  • HOU: Projections for E8 and NR, and actual result was S16.
  • TEX: Projections for S16, S16, R32, and R32, and actual result was E8.
  • KU: Projections for R32, R32, R32, and R32, and actual result was R32.
  • UCLA: Projections for E8 and E8, and actual result was S16.

In 5 of the 7 projected octets (ALA, ARI, MARQ, HOU, UCLA), the statistical mode was one round further than the actual result, which is why I used the phrase "in the ballpark." As for the other two, one was spot on (KU) and the other was an over-achievement (TEX). The PUR projections were indicative of a bad omen (another 16-seed over a 1-seed), but in all honesty, I wouldn't have predicted it, just like I wouldn't have predicted a 15-seed to the E8 in 2022 from no comparable history. I gave the scientist a lower grade than the model. I did use this model in combination with another to form a big-picture prediction, while filling in the gaps with other models. If I had paid attention to the statistical mode of each octet and built upon that starting point, my picks could have been much better than they were. This is something the bracket scientist should have noticed.
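
For anyone who wants to operationalize that lesson next year, here is a minimal sketch of the "starting point" exercise. The projection lists are the ones from the bullets above; the round ordering and the cautious tie-breaking rule are my own assumptions, not part of the model.

    from collections import Counter

    # Rounds ordered from earliest exit to latest finish.
    ROUND_ORDER = ["R64", "R32", "S16", "E8", "F4", "NR", "NC"]

    def projection_mode(projections):
        """Return the most common projected round for an octet's top seed.

        Ties are broken toward the earlier round (the cautious pick), which
        is my own assumption rather than a rule of the model.
        """
        counts = Counter(projections)
        best = max(counts.values())
        tied = [r for r, c in counts.items() if c == best]
        return min(tied, key=ROUND_ORDER.index)

    octets = {
        "ALA": ["E8", "NC", "E8", "S16"],
        "ARI": ["R32", "R32", "E8"],
        "MARQ": ["S16"],
        "HOU": ["E8", "NR"],
        "TEX": ["S16", "S16", "R32", "R32"],
        "KU": ["R32", "R32", "R32", "R32"],
        "UCLA": ["E8", "E8"],
    }

    for team, projs in octets.items():
        # e.g. ALA -> E8, ARI -> R32, KU -> R32
        print(team, projection_mode(projs))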

National Champion Profile

Model Grade: D          Scientist Grade: C+

This model bombed for the 2023 tournament. The eventual NC was a Tier 3 choice, the two Tier 1 choices did not advance past the S16, and five of the six Tier 2 choices (MIA being the exception) combined for 4 total wins in the tournament. I had a feeling this model would break (for a lot of reasons), but I stuck with it for my national champion pick (nothing else). I gave the scientist a higher grade than the model because I didn't give this model any weighting in my bracket beyond the NC, and even that still cost me four games. I also wrote this in the Taxonomical Approach to Post-Production:

WFs have three of their four since 2016 (the beginning of the PPB era). In the same mind as Rule #4, newer identities and newer pairings are very good reasons to do this overhaul and stay up-to-date on the ever-evolving profile of National Championship contenders.

As heavily as I hinted at it, just because two WFs have never won a national title together doesn't mean it will never happen. New pairings have happened more frequently since 2010, so it was only a matter of time. I will talk more about 2023 CONN and the evolution of the game in a later section.

OS/US Poll-Based

Model Grade: C          Scientist Grade: C

As always, this is my tie-breaker model (it's not one of the first ones I look at or use). If I don't have a lot of reliable model picks (which was the case in 2023) or a lot of conflicting models (less the case for 2023), this model breaks those ties. This model only had two picks for 2023 and was 1-1, which I would consider B or B- range for a tie-breaker model. However, the one miss cost an eventual F4 participant, so I had to tick it down a few grades, and I only felt it fair to give the scientist an equivalent grade, as if the scientist and the model walked off the cliff together (which sounds like the setup for a good joke).

OS/US Conference-Based

Model Grade: B+ (maybe even A-)          Scientist Grade: B-

My best advice was "do not pick your bracket with this model, but some of the domino-less predictions seem safer." If you had followed this, the only missed pick it produced was a TXAM win. While that advice was worthy of a grade in the A-range, I gave the scientist a lower grade because I did not follow the rules of this model in my personal bracket.

QC and SC Analysis

Model Grade: B+/B          Scientist Grade: C

The real bright spot of the QC/SC Analysis was the comparisons to 2014. Most models were showing similarities to 2022 and 2018 (and rightfully so), but 2014 had more relatable elements to 2023: No 1-seed in the NC title game and an entire octet practically mapped by it (2014 WICH vs 2023 KU). The shortcomings of the model have to do with the prediction of under-performance by 3-seeds and 7-seeds. Not only did 3-seeds out-perform their seed-group expectations (9 actual wins versus 8 expected wins), but they won more games than the 1-seed group (9 vs 5). The key takeaway is that in any tourney whose QC/SC analysis shows similarities to both 2014 and 2022, 1-seeds should be annihilated early and often. 2014 lost a 1-seed in the R32, another in the S16, and another in the E8, while 2022 lost one 1-seed in the R32 and two in the S16 (similar to 2023).
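
For reference, the seed-group accounting behind those numbers can be reproduced with a sketch like the one below. I'm borrowing the per-seed win expectations from the E(W) table described in the Mar 13 analysis further down (four teams per seed line); the function itself is just my illustration.

    # Per-seed win expectations, per the E(W) table in the Mar 13 analysis:
    # 1-seeds: 4, 2-seeds: 3, 3/4-seeds: 2, 5/6-seeds: 1, 7- thru 10-seeds: 0.5.
    EXPECTED_WINS = {1: 4, 2: 3, 3: 2, 4: 2, 5: 1, 6: 1,
                     7: 0.5, 8: 0.5, 9: 0.5, 10: 0.5}

    def seed_group_report(actual_wins_by_seed):
        """Compare each seed group's actual win total to its group expectation."""
        for seed, actual in sorted(actual_wins_by_seed.items()):
            expected = EXPECTED_WINS[seed] * 4  # four teams per seed line
            verdict = ("over" if actual > expected
                       else "under" if actual < expected else "met")
            print(f"{seed}-seeds: {actual} actual vs {expected} expected ({verdict})")

    # 2023 numbers from the text: 3-seeds won 9 (vs 8 expected), 1-seeds won only 5.
    seed_group_report({1: 5, 3: 9})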

Return-and-Improve Model

Model Grade: B          Scientist Grade: B

Overall, this model did pretty well. I reduced the scientist grade from the A-range because I deeply regret not discussing group-based probabilities in the final write-up (I think I discussed this in one of the articles on the R&I Model). In the end, my concerns were mostly validated. TCU statistically could have been an R&I team, but I was worried about their chemistry and cohesion. My concerns with AUB and IOWA falling in seed-line proved to be accurate, as neither hit the "improve" threshold. The 40% group was the biggest shocker. On average, teams in this R&I range improve about 38% of the time (approximately 3 out of 8 times), and coincidentally, this group had eight teams. The shock came as 5 out of 8 improved instead of 3 out of 8, especially considering one of these teams had to make an F4 run in order to achieve "improvement." For my personal bracket, I picked a lot of these games right, but I don't remember if I picked them specifically because of this model.
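
Just to put a number on that shock, here's a quick binomial check, treating each of the eight teams as an independent 38% chance to improve (independence is a simplifying assumption on my part):

    from math import comb

    p, n = 0.38, 8
    # Probability of seeing 5 or more improvers out of 8 at a 38% base rate.
    prob = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(5, n + 1))
    print(f"P(>=5 of 8 improve) = {prob:.3f}")  # roughly 0.14 -- surprising, not impossible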

Meta Analysis

Model Grade: B-          Scientist Grade: C+

The meta plays as determined by the scientist were elite 3P% and elite TOR, which called for GONZ as an S16 minimum, and COLG, PNST, and ORAL as R32 minimums. In my first scrubbing, I had 10-sd PNST advancing to the S16 and losing to 11-sd PITT, but I had to dial back PNST to bring my bracket closer in line with the Aggregate Model projections. I ultimately went with PNST, ORAL and GONZ to the R32, giving more weight to data-driven models over theoretical models like this one. ORAL was a total miss, and GONZ exceeded minimum expectations, even though I had them losing to TCU based on the R&I model.

The scientist got a lower grade for coming up short in two areas. First, the determination of meta plays was missing a few elements. Elite TOR was probably the best call, as five of the E8 teams were in the Top 70 in TOR (GONZ, TEX, CREI, FAU, and MIA). Elite 3P% was a decent call, as four of the E8 teams were in the Top 75 (GONZ, CREI, FAU, and MIA), but elite 3P%D was just as good if not better, as three of the F4 teams were in the Top 70 (CONN, SDST, and FAU) along with an E8 team in KNST. Theoretically, if shooting is down across the entire field of 64 like it was in 2023, then elite 3P%D should slam the door shut on an opponent getting hot from downtown (the so-called great-equalizer effect, something I truly hate in the game of basketball). Another missing element is elite DRB, as five of the E8 had Top 77 rankings (CONN, GONZ, SDST, CREI, FAU). Again theoretically, if shooting is down across the tournament field, then preventing easy points by rebounding the opponent's missed shots at a high rate should be another key factor in meta picks.

The second area of shortcoming for the scientist was terminology/thresholds. In the Final Analysis, I used the word "Elite" very loosely but settled for "Top 50 ranking" as the defining threshold for "Elite" (I've also used it pretty heavily in this section to keep consistency between this article and the 2023 final article, but ultimately, the goal is to bury the word "Elite"). In past tournaments as well as 2023, Top 64, Top 75, and Top 80 have been key statistical thresholds for E8 teams. In stronger quality years, the thresholds should be higher. In weaker years like 2023, the thresholds can be lowered from Elite (Top 50) to tournament quality (Top 64 or Top 68, depending on how loosely you define the field) or historical quality (Top 75-Top 80). Even looking at CONN, they were 88th in 3P%, barely missing the aforementioned cut-off for historical quality. Properly defined thresholds are definitely a lesson learned from the wildness of 2023.
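
Since the goal is to bury the word "Elite," here is a minimal sketch of those tiers as a function. The cut-offs are the ones named above; the function name and the loose/strict switch for the tournament field are my own framing.

    def quality_tier(rank, loose_field=True):
        """Map a national stat ranking to the quality tiers described above.

        Top 50 = "Elite", Top 64/68 = tournament quality (68 counts the
        play-in field), Top 75-80 = historical quality.
        """
        if rank <= 50:
            return "elite"
        if rank <= (68 if loose_field else 64):
            return "tournament quality"
        if rank <= 80:
            return "historical quality"
        return "below threshold"

    # CONN was 88th in 3P%, just missing the historical-quality cut-off.
    print(quality_tier(88))  # -> below threshold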

Seed-Group Loss Table

Model Grade: N/A          Scientist Grade: B-

Simply put, I should have followed the key takeaways from this model. Every model was suggesting weakness. This model in particular suggested no more than one 1-seed would make the F4, and based on historical matches with 2014 and 2016, a 66.7% chance that a 1-seed wouldn't be the national champ. I foolishly went with two (ALA and HOU). I could have significantly minimized the damage done to my bracket, but I couldn't figure out how to apply the matching model or the regression model to the rest of my models. Since this model is still in the works, I relegated it to the sidelines for predictive purposes (thus, no grade given).

Final Predictions

Model Grade: B to C+          Scientist Grade: C+ to C-

In all honesty, the models were very insightful this year, but not very helpful. Excluding the implosion of the Champ Model and the tie-breaker status of the poll-based OS/US model, the rest of the models held up pretty well, if only the scientist had used them more wisely. The 5-4-3-0-0-0 Agg Model was a bad call, as the final count was 4-3-3-1-0-0. Not only was it incorrect to predict five R64 upsets, but four of my five R64 upset picks were against 3- and 5-seeds, who went 8-0 in the 2023 R64. Imagine throwing out an NR, an F4, and an E8 team in the R64 just to hit five upsets. This is why I've abandoned curve-fitting strategies: You are double-punished when you are wrong.

As for the conference-based strategy, I simply chose the wrong conference to out-perform: I should have chosen the BEC instead of the SEC. If you are interested in the conference-based strategy, look at the record-by-conference section on every tournament year's Wikipedia page. It gives the round-by-round wins for each conference along with a winning percentage. (IMPORTANT NOTE: The Wikipedia pages count wins in the play-in games, but in my rendition of the model, I exclude these wins to maintain statistical comparability with the pre-2011 years that did not have play-in games.) For the most part, the winning percentages of the power conference teams stay in the same range, so the model provides a maximum/minimum quality check on your bracket. Likewise, the number of teams from a conference is usually an indicator of performance: Conferences with 3 or fewer bids and 7 or more bids usually under-perform (bottom three in win%), whereas those with 4 to 6 bids usually out-perform (above 60% win%). The most important detail is that this strategy isn't meant to be predictive; it is meant to be a guideline (a way to examine/control going too far or not far enough with certain teams toward which you may have a bias). As a hypothetical example, after you produce your first bracket for scrubbing, you realize that Power Conference X has five bids and you only have one win across all five teams (win% = 20%). Very rarely does a power conference collectively win less than 40% of its games (only 7 times out of 90 attempts since the 2008 tournament). Of these seven times, only one (the 2013 B12) was a 5-bid power conference (two were 3-bids, two were 7-bids, one was a 1-bid, and one was a 4-bid in 2023). These numbers demonstrate the higher likelihood of under-performance by conferences with 3 or fewer bids and 7 or more bids (5 of the 7 under-performances). A rough sketch of this quality check follows.
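
(As always with these sketches, the function is my own illustration; the bid-count ranges and the 7-of-90 figure come from the paragraph above, and the win/loss inputs are the hypothetical Power Conference X example.)

    def conference_check(bids, wins, losses):
        """Flag a bracket's conference win% against the guideline ranges above.

        Play-in wins must already be excluded to stay comparable with the
        pre-2011 tournaments.
        """
        win_pct = wins / (wins + losses)
        flags = []
        if bids <= 3 or bids >= 7:
            flags.append("bid count suggests under-performance")
        else:
            flags.append("4-6 bids usually out-perform (above 60% win%)")
        if win_pct < 0.40:
            flags.append("sub-40% win% is rare for a power conference (7 of 90 since 2008)")
        return win_pct, flags

    # Hypothetical Power Conference X: five bids, one win in your bracket.
    # The mismatch (5 bids should out-perform, yet the bracket says 20%) is the red flag.
    pct, notes = conference_check(bids=5, wins=1, losses=4)
    print(f"win% = {pct:.0%}")  # 20%
    for note in notes:
        print("-", note)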

The Curious Case of Crashing Connecticut

As I foreshadowed earlier in the article, I want to talk a little bit about the National Champ. First, there was not a clear favorite in 2023. CONN was a Tier 3 contender according to the Champ model because they did not win a conference title, they did not have a coach who had advanced to the E8 in a previous tourney, and they had a taxonomy mismatch plus deviations in shot distribution and post-production from previous champs. However, I did write an in-season micro-analysis article about them because I thought they were the only title contender midway through the season. If you saw the way they walked through the Phil Knight Invitational, especially the utter dismantling of 1-sd ALA, you would have thought the same thing. When they hit conference play, they hit a wall of resistance, part of which was self-inflicted and part of which was opponent personnel. In that article, I did an excellent job of analyzing the first part: The sudden onset of the turnover bug during that losing stretch was a key factor. While I noted the fall in shooting percentages, I never looked for an explanatory factor: Opponent personnel. CONN ran 20% of their shots through SG Jordan Hawkins, which is a good idea if you have a player like him who eventually becomes a lottery pick. However, the BEC was loaded with players who could match up with Hawkins:

  • XAV's Colby Jones went 34th overall in the 2023 NBA draft, and XAV swept CONN.
  • MARQ's Jones (6'5") and Joplin (6'7") helped MARQ win 2 of 3 matchups vs CONN, though I can't remember which drew the primary assignment on Hawkins.
  • CREI's Trey Alexander helped CREI split the regular-season series with CONN; he is built similarly to Colby Jones and will likely be drafted in the same range.
  • PROV's Devin Carter is a quick guard able to navigate screens and maintain contact with Hawkins.
  • JOHN's AJ Storr is a 6'6", 200-lb guard with the height and length to frustrate Hawkins.
  • HALL rotated Richmond, Odukale, and Ndefo on Hawkins to keep fresh legs on him, and they could switch interchangeably when their assignment screened for him.
  • Even GTWN and NOVA had players who could match up with Hawkins, although these teams weren't quite good enough to match CONN across the board or get a win against them.

Match-ups matter! In CONN's path to the title, guards like these did not exist. The best possible match-up was ARK's Ricky Council, who had the size (6'6") and athleticism to stay with Hawkins, plus four days to prepare for the match-up (although I did not watch much of the 2023 tournament for personal reasons). It's no surprise that Hawkins's worst performance in the tourney was against ARK's length and athleticism. No one else -- GONZ's Julian Strawther, STMY's Logan Johnson, MIA's Miller or Poplar, or SDST's Bradley -- had the measurables or the individual defensive skills to contain Hawkins. If I could have perfectly foreseen these five opponents (especially MIA and SDST in the F4 and NC games), I could have easily predicted that Hawkins would go 21 of 42 from 3-pt land in the tourney. In fairness, I did have CONN in the E8 through STMY and ARK because I thought they were a lock to be National Champions in 2024 if everyone stayed and they made an E8 run in 2023.

PUSH & PULL

Now, we come to the difficult part of the article. If you are a regular reader of the blog and have keen attention to detail like I do, the title of the article implies a significant change for PPB. Past season-opening articles were titled "Welcome to the {insert current season}," whereas this one says "Review of the...." Yes, you guessed it: There will not be a 2023-24 season of PPB (for subsequent years, I don't know yet). This decision was actually simple to make, and the reasons are two-fold: A push factor and a pull factor.

The push factor is none other than the current state of the game. College basketball has been changing rapidly over the last ten years, probably more than in any other ten-year period. So much so that the product on the court is almost unrecognizable from the product on the court when I started in 1998. I absolutely despise the way the game is played today (especially at the NBA level), and I despise it so much that I even gave it a name: Trashketball. It is better known as ball-screen offense: One man dribbles, one man ball-screens, and three men stand outside the 3-pt perimeter and do absolutely nothing. The NBA affectionately calls it "creating spacing," but when you stand in one spot and only move if the ball is passed to you by the one man dribbling, then you are affectionately "doing nothing." When you've seen basketball played through Dean Smith's motion offense, Pete Newell's reverse-action offense, John Wooden's high-post offense, Pete Carril's princeton offense, Bob Spear's shuffle offense, Rene Herrerias's flex offense, and the variety of off-ball screens and cuts that these systems have introduced to the game, then today's game looks, feels and plays far inferior to all of them. Even NBA practitioners prove my point when they argue the superiority of ball-screen offense: "Ball-screens are the highest form of basketball because they force the defense, particularly the ball-defender and the screener-defender, to make a choice." If you accept this argument as fact (and yes, there is some truth to it), then how much more dangerous would your offense be if the remaining three offensive players ran one off-ball screen (3rd and 4th players) and one off-ball cut (5th player)?

  • With trashketball, off-ball defenders have a two-option choice: 1) Stay attached to the shooter on the perimeter or 2) help protect the paint against dribble penetration from the on-ball action.
  • With off-ball action supplementing on-ball action, off-ball defenders now have to make a choice with more than two options, and these choices have to be coordinated with all other defenders. By the NBA logic that forcing the defense to make a choice produces a superior form of basketball, each off-ball defender being forced to make a multiple-option choice that must also be coordinated with the choices of the other four defenders should produce a form of basketball superior to the binary choice of plain-vanilla ball-screen offense!!!
  • By also running off-ball action instead of just on-ball action, you are running an offense that looks more like a motion offense or a reverse-action offense instead of trashketball. So yes, the illogical foundations of the current basketball ideology are taking basketball backwards, not forwards.
  • One example doesn't fully prove my thesis, but the fates of 2023 CONN and 2023 UNC are rather proof-positive. 2023 CONN had an offense mixed with set plays (usually Hawkins running through multiple off-ball screens to get an open 3-pt shot), ball-screen continuities (slightly more advanced ball-screen offense than simple trashketball), and dummy sets (alignments and actions that appear to be one of their set plays/continuities in order to bait the defense into over-defending something that isn't there or isn't going to happen). 2023 UNC ran trashketball: The ball-screen offense initiated through RJ Davis or Caleb Love with Armando Bacot ball-screening and the remaining three players "doing nothing." CONN won the tournament while UNC was left out of it. Yes, I am fully aware that the 2022 UNC team also ran trashketball and made a run to the title game. When you compare the game logs of the two UNC teams (22 vs 23), you will see that 2022 UNC warmed up from 3-pt land and 2023 UNC did not (35.4% vs 30.4% on approximately 24.6 3PA). This is the great-equalizer effect that I mentioned earlier: Made 3s are a good way to cover up the blemishes of playing low-quality basketball.

Trashketball is inferior to other forms of basketball for another big reason that I have yet to mention: It requires a specific set of rules in order to be feasible. Horrendous rule sets like the "freedom of movement" philosophy reward dribble-based offenses (like trashketball) with free throws by calling fouls on defenders for contact that was initiated by the offensive player, yet passing-based offenses are only rewarded with the fruits of their labor because they don't seek out contact. The rules dictate that if you play 'this way', you are more likely to get free throws, but if you play 'that way', you are less likely to get them. That is not how fouls should be determined, but it is a big reason why teams are shifting away from older, more skill-oriented offenses to low-quality, low-skill, low-output offenses like trashketball. For all of the college basketball geniuses who can't figure out why scoring is down year after year: It is down because low-quality basketball is being incentivized and rewarded.

Trashketball also needs an extended three-point arc to "create spacing." If the arc is 19'9" from the basket, off-ball defenders only have to go as far as 19'8" from the rim in order to protect the basket (and really, they don't even have to go out that far until their assignment receives the ball). When the arc is extended, defenders have to stay further out if they want to disrupt stationary shooters. Whether it be the motion, the reverse-action, the princeton, or the high-post offense, none of them needed this artificially created spacing to win NCAA championships. They created the spacing themselves, and highly skilled players executed them to produce high-probability scoring opportunities. After all, scoring was high when these systems were in play, and it has been falling as the game has shifted away from these higher-quality styles to the current lower-quality style of trashketball. When rule sets have to be designed and implemented in order to make a style functional, then that style is clearly inferior to the styles that didn't need these rule sets and also scored better without them! Yes, this long-winded rant explains why I've felt like college hoops has been pushing me toward other avenues of interest, but this push factor has been around for a while; it wasn't until the arrival of the pull factor that I made this decision.

The pull factor that is pulling me away from the blog is real-life priorities. 2023 was not a good year for me in real life (or with my bracket, to say the least). If this tells you anything about life, I'd rather have 2020 back than go through 2023 again. On the Friday before Selection Sunday, a close family member passed away, and the funeral was on the Thursday of the opening games. Then a week later, an immediate family member was diagnosed with cancer and has been on chemo since April. These two events were the reasons I didn't watch much of the 2023 tourney. Then in early October, another immediate family member was diagnosed with cancer, and they started chemo last week. As a result, my free-lance work as a college basketball writer/analyst/philosopher has to be replaced with care-taking duties. If not for the pull factor, I probably would have continued for another year.

FINAL PREDICTIONS

Well, if you've made it this far in the article, I do have something left for you: My premonitions of the 2023-24 season.

  1. The Return & Improve Model may be a very good tool for next year. Going by the preseason Top 25, DUKE, PUR, MIST, MARQ, and TXAM return close to 80% of their minutes from last year (which probably means they will return 70-80% of their production). This means DUKE should be good for 2 wins, PUR for 1 win, MIST for 3 wins, MARQ for 2 wins, and TXAM for 1 win, as long as they don't suffer injuries, don't drop in seed line from last year, and don't run into critical match-ups in the bracket.
  2. FAU returns 90% of their F4 team from last year. I am very skeptical of smaller-conference teams when it comes to Return & Improve, which is why I don't include their results in the model. They move to the AAC from CUSA, where MEM and UNT present as their only challenges. This year, they aren't a surprise team; they will be the hunted instead of the hunters, and I'm not sure how much more potential they have to unlock as players and as a team (especially compared to power-conference teams). If you think Dusty May is the second coming of Brad Stevens (2010 and 2011 BUT), then pencil them into the F4, but I would watch them closely all season before doing so.
  3. If you dig deep in my blog, I've said Tom Izzo is one of the best tournament coaches, if not the best, because every four-year player who has played for him has reached an F4. I'm not sure if this achievement still holds true given the fifth-year Covid rule as well as the transfer portal, but we are due for another Izzo F4 run (the last was 2019), and his team has already been mentioned in Point #1. I'm proclaiming before the season starts that MIST goes to the 2024 F4.
  4. CREI has a good chance to be the best shooting team in the NCAA next year, with high-percentage shooters at every position 1-4 and off the bench. They return approximately 60% of their minutes and reached the E8 last year, which removes a disqualifier from their NC profile for this year. Their defense is my concern, especially at the point of attack, since Nembhard moved to GONZ and will likely be replaced by UTST's Ashworth. If the NBA pays up for elite shooting percentages, then that's a good enough reason to pay attention to shooting numbers (as I've done for years).
  5. If there was one model that I would consider changing or implementing better, it would be the National Champ model. The simple rule, as I've jokingly stated in a few National Champ Profile articles, is take a Tier 1 contender or take CONN. In years that the model has worked, a Tier 1 contender has won it all, but in the years that the model has 'broken' (2011, 2014, and 2023), CONN has won it all. This simple rule isn't scientific at all, but I still like the principles on which the model is based. Instead, I would focus on cleaning out non-contenders (Tier 4 or lower) and let some of the other models have a say in the process as well.
  6. Just a friendly reminder, defending national champs do not advance past the S16 unless they return close to 90% of their roster. CONN is returning just 40% of their minutes and production from their 2023 title team. You know the drill.
  7. Finally, I wouldn't post a long-winded rant about trashketball without leaving some advice on how to bracket pick around it. Again, match-ups matter! Even though I suggest these teams may employ a specific style, the coach may think otherwise. The key is personnel, and if personnel can fit a lot of different builds, then they can match-up with a lot of different teams.
    1. Tower defense: Teams with a tall big man that don't want him navigating screens on the perimeter will leave him in the paint and force the guard to chase the ball-handler over the top of the screen in order to semi-contest any 3-pt attempt by the ball-handler. 2021-22 ARI employed this style with Christian Koloko. Top 25 teams that could fit this build include KU (Dickinson), PUR (Edey), CONN (Clingan), CREI (Kalkbrenner), GONZ??? (Gregg), ARI (Ballo), UK (Bradshaw), TEX (Shedrick), USC (Morgan/Iwuchukwu), STMY (Saxon), and ILL (Dainja). These teams would be susceptible to play-making PGs, much like 2022 HOU's Shead did to ARI in the S16 with 21 PTS and 6 AST, because the roll-game in pick-n-roll becomes more of a threat with the guard chasing over the top. (I marked GONZ with question marks because I don't know who their 5th starter will be: If it is Gregg at the 5-spot, then the tower defense may be the best option.)
    2. Hedge and Recover: Teams with athletic big men that can show a presence on the other side of the screen and then recover to their defender (the screener). The hedge should deter any vision or dribble penetration while allowing time for the guard to re-establish defensive position. 2023 SDST employed the hedge-and-recover as their primary ball-screen defense en route to an NR run. Top 25 teams that could fit this build include MIST (Sissoko), ARK (Mitchell), SDST (Ledee), UNC (Bacot), BAY ("Everyday John"), and ALA (Pringle). These teams would be susceptible to ball-handlers who can exploit the spatial gap between the hedger and the recovering guard, either with a quick-release, off-the-dribble shot (CONN Hawkins) or a pass between the two defenders to a cutter in the high-post area (Jackson Jr.).
    3. Ice Defense: Teams with length in the off-ball positions will want to freeze the ball on one side of the court and use the off-screen defenders as help-side defense. They "ice the screen" by having the ball-defender aggress over the top of the screen before the screen arrives and use the screener's defender to deny a dribble-drive rim attack. 2018 LOYC used this defense in their famous F4 run. Top 25 teams that could fit this build include DUKE, TENN, HOU, and TXAM. These teams would be susceptible to ball-handlers who can shoot off the dribble (MICH vs LOYC in the 2018 F4, BAY vs HOU in the 2021 F4, NOVA vs HOU in the 2022 E8) or to back-screens against help-defenders with a cross-court, baseline-to-opposite-wing pass.
    4. Trap Defense: Teams with length/reach in the guard spots and athletic big men can trap the ball screen where both the ball-defender and the screener-defender chase the ball-handler in double-team fashion. Their length can deny vision and the trap can intimidate the ball-handler into picking up their dribble, further aiding the defense. 2011 VCU employed this trapping strategy (called "Havoc") into their F4 run. Top 25 teams that could fit this build are MARQ and MIA. These teams would be susceptible to taller ball-handlers with passing ability who can see over/around the trap as well as mobile big men who can flare away from the trap to provide a passing outlet and 4-vs-3 counter-attack to the trap. Rubs are another good method to counter the trapping defense, where the ball-handler passes to the screener before the screen is set and then rubs his defender off of the screener in order to receive the ball again.
    5. Switching Defense: Teams with interchangeable players can simply switch assignments to defend the ball-screen, so that the screener-defender becomes the ball-defender and vice versa. It allows the defense to keep pressure on the ball to deter outside shots as well as lock the screener to the perimeter to prevent the roll-game. 2023 FAU rode this style of defense to the F4. Top 25 teams that could fit this build are FAU and NOVA. These teams would be susceptible to back-to-the-basket post-players if a smaller guard (PG or SG) were to switch onto the post-player. In the NBA, offenses punish switching teams by screening with a player whose defender is slower and then attacking the switch off the dribble, which essentially becomes a foot-race to the rim. Even if a team isn't fully switchable across all 5 positions, they can still employ this defensive strategy at the spots which are interchangeable, but it does require experience and communication.

If there was one piece of advice I could give that would out-weigh everything else, it would be to re-read the blog and take notes. I've made and documented a lot of correctable mistakes in seven years of PPB, and there's no reason for you to repeat them. Though I've been wrong a lot, I've been right a lot more, so there should be plenty of wisdom in these countless paragraphs. Likewise, if you want to try to emulate my work, I've left plenty of details on how to do it. Though I've yet to pick a perfect bracket, I hope my work plays some role in whoever eventually does. But for now, life calls upon my services, and like the tournament, we'll just have to wait and see how it plays out. As always, I greatly appreciate the time you take out of your life to read my blog. Farewell!

Follow-up to Comments

I was looking up some information on a few models and noticed the comments. First off, thanks for all of the well wishes. However, I do believe I've written my last article for PPB. My dad passed away five days before Christmas and Pete Tiernan moved on from college basketball analysis years ago. For me, it just doesn't bring the same enjoyment without my two biggest influences still around. Since this is too long to post in the comments section, I've just added it to the bottom of the article.

As for the questions, I'm still pretty confident CONN won't make it past the S16. Teams with athletic middle-men (SG/SF/PF/CF) give them fits (HALL & KU) and their defense is atrocious especially when compared to their 2022 counterpart. However, they are good enough to win two games in the first weekend, so I would hesitate to bet against them vs an 8/9 like I did against KU last year unless the 8/9 matches that description. No national champion in the Kenpom-era has advanced past the S16 in the next tournament except 2007 FLA who returned >95% of their 2006 production. The last reigning NC to actually make it to the S16 was 2016 DUKE, so it seems we are overdue.

MIST reminds me of 2023 UNC in many ways, so I am far less confident in that pick. While MIST is likely to make the tourney, their qualitative and statistical similarities to 2023 UNC have me doubting a 4-game winning streak in March. Tyson Walker needs to get healthy and play better than his true form (if that's possible). Pathing would also be a welcome blessing: The East Region (per the Top 16 reveal) with UNC & IAST would be a good landing spot for MIST; KU would also be a favorable match-up in the West, but DUKE not so much. I would also assume MARQ and BAY would be good match-ups based on recent games. It just depends on how all these teams get seeded. There was nothing scientific about the prediction, just one of the many qualitative guarantees that haven't been guarantees in the last few years (1-seeds perfect against 16-seeds, Roy Williams always winning a R64 game, a National Champion not having transfers in the starting lineup, etc).
As for the models, you are welcome to the data which I can legally give out (i.e., I don't think I can give out any subscription-based data like Kenpom/Torvik/Sagarin). Keeping them up-to-date will feel like a full-time job, but I enjoyed doing it, so it didn't feel like work. Knowing how to apply them, knowing when they work and when they don't, and getting them ready before Bracket Crunch Week has been a trial-by-fire over the last seven years.

As for my thoughts on potential champs, I wouldn't look anywhere other than PUR, TENN or HOU. I don't trust PUR's backcourt, but statistically (except for the negative TO rate) they look like the favorite. I don't trust TENN's shooting numbers, but if Vescovi, Zeigler and Mashack can find the mark from the perimeter, they would be my favorite. I don't trust HOU's reputation or analytics. They've run up the score on bad teams (30+ margin of victory against 9 clearly-not-in-tourney teams), and I believe this has inflated their metrics. As fundamental and dedicated as HOU is, their shooting numbers are as atrocious as CONN's defense, and HOU still loses to the same recipe: Spread 'em Out, Shoot 'em Up and Send 'em Home (ALA and MIA from previous years). This year, they are 1-3 against defensive PGs (0-1 vs KU's Dajuan Harris, 1-1 vs IAST's Lipsey, and 0-1 vs TCU's switching defense; they have one game vs KU remaining at home, with KU's McCullar unlikely to play, so a likely HOU win), which makes sense given how much of their offensive system runs through Shead, so look out for that element in their match-ups (like 2022 NOVA). If this year matches 2013, HOU best resembles 2013 champion LOU in many respects. 2013 was the last year without freedom of movement, so in my opinion, a team/style that won without freedom of movement is far less likely to win in this era of basketball with it.

As for my other thoughts on the season, 2023 was historically bad for ratings. Even though the F4 teams were from major viewing markets (NE U.S., FLA and Cali), none of them was among the top two draws from those regions (SYR/JOHNS, FSU/FLA, and USC/UCLA). I expect a lot more blue bloods in this year's E8 and F4 for ratings, and I wouldn't be surprised if the NCAA sent out a memo to officiating to ensure more blue bloods in the F4.
I doubt I could predict potential seeds of each team, but based on stats/metrics and the games I've seen, my list of R64 upset victims (1-6 seeds) would include ARI, IAST, BAY, OKLA, AUB, SCAR, UNC, and WISC. This doesn't mean all of them will lose in the R64 even if they get a top-6 seed since Cinderella-like opposition also matters. Likewise from the few models I've "short-cutted" so far, I would expect more upsets in later rounds (R32 and S16) than R64, similar to 2023 (4-3-3-1-0-0), so probably no more than 3 or 4 of the 8 teams.
As for the one thing I haven't studied (or want to study and probably still won't), 3P% and 3PR seem like important indicators of tourney advancement potential, with 3P% being slightly more important than 3PR. My hypothesis would be that teams with high 3PR and low 3P% could be potential upset victims (aka poor-shooting teams or poor shot selection), and teams with high 3P% and balanced 3PR (not top 30 but not bottom 250 either) could be potential deep runs (+PASE). If I had to postulate an ideal metric, it would be one that best predicts the NBA Finals as if it were a 1-game single-elimination tournament (the NCAA structure). In theory, if the college game becomes more like the NBA style of play, the only difference is the tournament structure. On any given night, a bottom-standing NBA team can knock off a top-standing NBA team, but in a best-of-7 format, the better team will win four games faster than the worse team. I believe the 3-point shot will explain a lot of this variability in outcomes.
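
For whoever picks up this study, here is the hypothesis as a screening rule. The only thresholds fixed by the paragraph above are "not top 30, not bottom 250" for balanced 3PR; every other cut-off in this sketch is my own guess, stated in national-rank terms.

    def three_point_screen(p3_rank, rate_rank, n_teams=363):
        """Screen a team using the 3P% / 3PR hypothesis.

        p3_rank:   national rank in 3P% (1 = best shooting)
        rate_rank: national rank in 3PR (1 = most three-heavy)
        The "high"/"low" cut-offs are assumptions for illustration.
        """
        high_rate = rate_rank <= 100            # assumed cut-off for "high 3PR"
        low_pct = p3_rank >= 200                # assumed cut-off for "low 3P%"
        high_pct = p3_rank <= 75                # assumed cut-off for "high 3P%"
        balanced_rate = 30 < rate_rank <= (n_teams - 250)  # not top 30, not bottom 250
        if high_rate and low_pct:
            return "potential upset victim"
        if high_pct and balanced_rate:
            return "potential deep run (+PASE)"
        return "no signal"

    print(three_point_screen(p3_rank=230, rate_rank=40))  # potential upset victim
    print(three_point_screen(p3_rank=20, rate_rank=60))   # potential deep run (+PASE)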

I've always wanted my work to play a role in predicting a perfect bracket, but with everything that's happened IRL, I don't think it would matter as much to me as it used to. I can't promise that I'll be back on here (unless I need something else from one of the articles), but in case I don't, good luck to you both Toz and Boston.

Mar 13, 2023

2023 NCAA Tournament - FULL ANALYSIS

The grand finale to the 2022-23 College Basketball Season is here. From this point forward, there's nothing left to it but to do it.

Over-seed/Under-seed -- Predictive

Let's start with this model. All of the previous years' models can be found on the Bracket Modeling tab above, as well as the explanation for the O/U/A notations.



This model is a relative comparison exercise to get a feel for the passivity or resistance of a region. Even though comparative octets are given, it does not mean the 2023 octet should match its equivalents in a pick-for-pick manner. There are slight differences between the matching octets, and other models may suggest different projections.

  • ALA octet: A lot of similarities to 2014 ARI, 2021 BAY, 2021 MICH, and 2022 GONZ. In all four examples, the 4-seed is stronger than 2023 UVA, so that means less resistance this year.
  • ARI octet: Compares to 2014 KU, 2017 DUKE, and 2019 UK. Two of these matches had key injuries (Embiid for 2014 KU and Washington for 2019 UK), whereas ARI does not.
  • PUR octet: This is the only region where I could not find historical equivalents. The closest was 2014 WICH, which was a stretch, plus it matches much closer to another 2023 1-seed octet. Last year, there were no historical equivalents for 2022 BAY or 2022 UK... and history was made.
  • MARQ octet: The best example is 2021 ALA, but I'm always skeptical of 2021 metrics due to the plethora of incomplete games/schedules. Other examples include 2017 ARI and 2016 XAV, but the lower seeds of 2023 are somewhat stronger than these two matches.
  • HOU octet: 2016 KU or 2017 GONZ with a weaker 4-pod are the best comparisons. In every year except 2018 (UVA) and 2022, the top-rated 1-seed advanced to the E8. HOU only fits one of the 2018 UVA upset criteria, which is the Sasser injury, but since it looked more like a strain than a tear, his highly likely return will uncheck that box and erase any chance at a 16-seed over a 1-seed.
  • TEX octet: The octets that best match are 2019 MICH, 2018 PUR, 2017 LOU, and 2014 NOVA. It was very difficult to find historical 2-seed octets where the teams are accurately seeded according to quality metrics like this TEX octet.
  • KU octet: 2022 BAY, 2017 NOVA, 2014 WICH, and 2018 XAV if it had slightly stronger lower seeds. None of these comparisons bode well for KU's chances, and I posted a micro-analysis on this year's KU team, which explains why I think they are in trouble.
  • UCLA octet: 2015 ARI and 2017 UK fit decently with this octet. The biggest difference is both of these matches ran into accurately seeded 1-seeds, which UCLA does not have to worry about.

National Champion Profile

Let's start with the initial rules: 1) Must be an 8-seed or better, 2) Must not lose their first game in the conference tournament (failures are red-texted), and 3) Must not have double-digit losses (failures have strike-through).

  1. ALA,    HOU,    KU,     PUR
  2. ARI,    TEX,    UCLA,   MARQ
  3. BAY,    XAV,    GONZ,   KNST
  4. UVA,    IND,    CONN,   TENN
  5. SDST,   MIA,    STMY,   DUKE
  6. CREI,   IAST,   TCU,    UK
  7. MIZZ,   TXAM,   NW,     MIST
  8. UMD,    IOWA,   ARK,    MEM

I have my reservations about keeping non-power conference teams (HOU, GONZ, STMY, SDST, and MEM) in this list, but since there are no restrictions involving conference affiliation (at the time), I have to include these teams in the model. Here are the 17 contenders in the National Champion Profile:



This table should be very familiar from the many articles I've written about the Championship Profile model. The only new column is 'Transfers,' located to the right of DQs. If you count horizontally for each team, you'll discover that transfers are not included in the DQ counts. Since I just added this metric to the CPM this year and I'm nervous it could break in its first year (especially considering how prevalent the transfer portal is in CBB), I have left it to the side. From this table and the disqualifier counts, here are the champion tiers:

  1. HOU and UCLA. If UCLA hadn't lost Jaylen Clark to injury, they would have had zero DQs because they would qualify as a 5O team. In fact, if UCLA were to win (with or without Clark), I would probably have to re-do the Post Production component again because several teams could be reclassified into a 6th archetype (2018 NOVA would be one team that could move in with UCLA). Since I'm really doubtful about post-Clark UCLA, I have to say the front-runner is HOU, with their only DQ being lack of competition. Yes, they have played ALA, UVA, STMY and MEM x3, but they weren't tested as much as 2014 CONN, who is the only non-P6 conference team to win a title. You may also notice an asterisk in HOU's post-production qualifier. They actually have the taxonomy of 2012 UK, the ShDs and RbDs of 2007 FLA, and the post production of 2008 KU (all of which were 2W archetypes). As I teased earlier, I do have a metric available for conference affiliation of national champions, but at the moment, it is not an official component of the model (and if it were, HOU would have another DQ). Finally, HOU and UCLA had 13- and 12-game winning streaks snapped with losses in their conference tournament finals. No eventual national champ entered the tourney with a winning streak that long (those streaks would have been 14- and 13-game streaks had they won). 2013 LOU entered on a 10-game streak and 2008 KU entered on a 7-game streak. All others are 5-game streaks or smaller, with most title runs starting on a 0-game win streak (which both currently have). Good thing they both lost!
  2. ALA, KU, ARI, MARQ, UVA and MIA. Instantly, I would ignore UVA and MIA because they are lower in the efficiency ranks than previous champions (not a component in the model). I'm also worried about the injury status of MIA post-player Omier. KU has a tough path through the bracket, according to the OS/US model above. MARQ also has a difficult path, but easier than KU's. ARI has a 2.5 because I only gave half of a DQ for being a 1-seed falling to a 2-seed. Only two NCs have fallen in seed-line from the previous year (1998 UK and 2016 NOVA), and both fell from a 1-seed to a 2-seed like ARI. ALA checks off all the boxes except PG production. Only three NCs had fewer points than ALA's, and none had fewer than 3.5 AST, but I like them as a next-best option to HOU because they are one of only three teams to qualify under the post-production rules (HOU also being one). To be a champion, I think you have to be built like one.
  3. PUR, GONZ, CONN, and STMY. Instantly, I would write off GONZ and STMY as potential champs because of conference affiliation. If GONZ didn't win it in 2021, when they had every advantage in their favor, they may never win one. I don't like PUR's youth (and lack of athleticism) on the perimeter. CONN was playing like a national champion at the beginning of the season, and their flaws caught up to them. I feel like they've learned from their flaws, but that time spent in reverse may have cost them a title. I'm not sure about the fifth-year rules for next season, but I think CONN can bring everyone back (assuming Jordan Hawkins doesn't go pro) except one bench player. If they make it to at least the E8 this year, then it removes a DQ for next year's team. The only things remaining would be a conference title and team composition.

Over-seed / Under-seed -- Poll-based

Now, let's look at another OS/US model, but I warn you, this one is a little more risky than the other two. I've not been able to put my finger on exactly what this model is evaluating, but when nothing else works or something is needed to break a tie between two teams, the Poll-based OS/US model serves that purpose. Last year, it was a perfect 4-0 in R64 match-ups, and if it doesn't produce a lop-sided count in OS/US identities, it is fairly reliable. Let's see the model.

First things first, the model produced 18 OS identities and 10 US identities, which is an acceptable balance. The thing that catches my eye is only two OS/US match-ups. I'm always hoping for a lot of OS/US match-ups so that I don't have to work as hard figuring out the 50-50 match-ups. Anyways, the model favors CHRL over SDST and AUB over IOWA. Basketball speaking, you need to shoot well to beat the stingy defense of SDST, and CHRL does shoot well (very few teams do this year). IOWA also shoots well, but AUB has good length and athleticism, which you need to contest shooters. Both favorites have a margin of differential equal to 7 (4 - (-3) = 7 and 5 - (-2) = 7), so I'll probably go with both of these predictions in my personal bracket. There are four match-ups where the teams are both OS or both US. This is what I've defined as a critical match-up, where one team has to win and one team has to lose, even though their US (OS) identity says they should both win (lose). There's nothing predictive about them, other than that their R32 opponent is guaranteed to face either an over-seeded opponent (good news for PUR and GONZ) or an under-seeded opponent (bad news for KU and KNST). For the record, 2022 BAY, who was accurately seeded according to this model, was paired against two OS teams (UNC and MARQ) and lost their R32 match-up, so maybe the news isn't as good as assumed for PUR and GONZ.

Over-seed / Under-seed -- Conference-based

Next up, let's look at the conference-based OS/US model. Although I haven't had an opportunity to investigate the effect of unbalanced conference schedules on this model, it is a fairly reliable model when the number of OS/US predictions is under twenty. From the current count, it has eighteen predictions, with a potential of two more depending on the outcomes of the Play-in Games (PIGs). Last year, the model was 14-3 with one no-decision due to a critical match-up (US USC vs US MIA), and all three misses came from the B10 conference.



The most important thing to remember about this model is that predictions are based on achieving or failing to achieve seed expectations. For example, if a 1-seed is projected as a potential OS, then failing to reach the F4 is a correctly predicted OS. Likewise, if a 1-seed is projected as a potential US, then reaching or exceeding the F4 is a correctly predicted US. The column headed E(W) is that seed's win expectation. 1-seeds are expected to win at least four games, 2-seeds three wins, 3- and 4-seeds two wins, 5- and 6-seeds one win, and 7- through 10-seeds 0.5 wins (the coin-flip seeds). You must also pay close attention to the wording of the notes. Most are worded as "OS Team A or US Team B". If either accomplishes its prediction, it doesn't matter what the other does. If Team A fails to achieve seed expectations, then Team B can fail too. Or, if Team B achieves seed expectations, then Team A can also succeed. Thankfully, there are no group-based OS/US scenarios like last year. Here's what I see (a small sketch of the E(W) bookkeeping follows the list):

  • If MIA wins one game, it doesn't matter what DUKE or UVA does. If DUKE fails to reach R32 and UVA fails to reach S16, then it doesn't matter what MIA does.
  • If PITT fails to win its play-in game, then DUKE and NCST both become OS and both must lose. If PITT wins the play-in game and its R64 game, then it doesn't matter what DUKE and NCST do.
  • If CREI wins one game, it doesn't matter what CONN does. If CONN doesn't win two games, then it doesn't matter what CREI does.
  • If NW wins one game, then it doesn't matter what IND does. If IND doesn't win two games, then it doesn't matter what NW does.
  • If ILL wins one game, then it doesn't matter what UMD and IOWA do. If UMD and IOWA both lose, then it doesn't matter what ILL does.
  • Since WVU and UMD play against each other and both are OS, it is easier for the model to be correct if UMD wins (which forces ILL to win) than if WVU wins. (This scenario leaves a high probability for a mis-pick.)
  • If ARI fails to reach the E8, then it doesn't matter what happens to UCLA and USC. If UCLA reaches the E8 AND USC wins one game, then it doesn't matter what happens to ARI.
  • If TXAM wins one game, it doesn't matter what happens to UK. If UK loses, then it doesn't matter what happens to TXAM, but TENN also can't reach the S16. If TENN reaches S16, then UK and MIZZ must both win a game. (Another high-probability scenario for mis-pick.)
  • ARK is identified as an OS and ILL looks like a potential US. This is what I would call a domino game for this model. If ARK wins, then ILL can't be US, so IOWA and UMD must both lose to confirm their OS identities, which means WVU wins and can't be OS. (A third scenario with a high probability for mis-pick.)
  • Considering that eighteen predictions exist with a potential two more dependent on the PIGs, this would give a total of twenty projections, which is our fail-safe threshold for model reliability. Logically speaking, the more predictions a model makes, the more potential there is for conflicting predictions, for which we already have three examples. I would not predict my entire bracket with this model, but some of the predictions (the domino-less ones) seem safer.
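
As promised, a minimal sketch of the E(W) bookkeeping. The win expectations come straight from the paragraph above; the function and data shapes are my own illustration.

    # E(W) per seed: 1-seeds four wins, 2-seeds three, 3/4-seeds two,
    # 5/6-seeds one, and 7- through 10-seeds the coin-flip half-win.
    E_W = {1: 4, 2: 3, 3: 2, 4: 2, 5: 1, 6: 1, 7: 0.5, 8: 0.5, 9: 0.5, 10: 0.5}

    def prediction_correct(identity, seed, wins):
        """Check an OS/US call: OS = fails seed expectation, US = meets/exceeds it."""
        met = wins >= E_W[seed]
        return not met if identity == "OS" else met

    # Example: a 1-seed tagged OS that loses in the E8 (3 wins) confirms the call.
    print(prediction_correct("OS", seed=1, wins=3))  # True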

Quality Curve & Seed Curve Analysis

There's a lot to look at here, so I'm going to keep the intros and the filler to a minimum. First, let's look at the QC compared to its minimum and maximum values over the games since the last QC analysis.



This is honestly the first time I have seen anything like this QC. Compared to its max/min values, the current QC looks like a roller coaster, repeatedly bouncing up to the max and then falling down to the min. For documentation purposes, here are some important details:

  • The max of the Final QC is higher than the max of the Feb QC at #8-10, #14, #18-19, #22-24, #27, #34, #36-40, and #42 to #44.
  • The min of the Final QC is lower than the min of the Feb QC at #1-7, #17-19, #22-25, #29, #36-40, and #49-50.
  • The overlap of these widening regions is #18-24 and #36-40. In both of these ranges, teams are playing closer to the minimum curve, with the exception of teams #22-24, who are playing at the midpoint of the max and min curves. For the record, these teams are UTST, MEM, ARK, DUKE, UMD, IAST, and KNST for the #18-24 group and USC, IOWA, (OKST), PNST, and MIA-FL for the #36-40 group.

I don't want to read into this situation because 1) I've never seen it before and have nothing upon which to base a hypothesis, and 2) There are a variety of interpretations. For example, these could be teams that are struggling (which is why they are playing at the minimum curve) and still headed lower, or these could be teams on the rise and filling in the gaps left by teams falling in the quality rankings, or these could be teams that have hit bottom and the tournament is a fresh start to turn upwards. I think a deeper dig into the eleven tourney teams could be a waste of time and probably an over-analysis of the QC and its function. Let's look at how the final QC compares to both the Jan and Feb QCs, both of which had their own issues.



Let's try to break down this eye-sore.

  • At #2, #7-12, and #46-48, the Final QC is above the Feb QC. Everywhere else, it is below.
  • At #9-10, #27-34, and #45-48, the Final QC is above the Jan QC. Everywhere else, it is below.
  • Overall, the quality of teams is lower today than at either previous QC analysis.

For historical perspective, let's see how the 2023 QC stacks up against the last five tourneys.


As the previous two QC analyses have pointed out, 2023 matches 2018 the most, and 2022 is the next closest. Both of those years had an M-o-M rating over 20% as well as 13+ upsets (I define an upset as a seed difference of four or more; a tallying sketch follows the list below). Here are the comparative details:

  • 2023 is better than 2018 at #9 and #22-50, similar at #5 and #11-14, and lower at #1-8, #10, and #15-21. In simple terms, it is flatter than 2018, which implies more chaos than sanity. I've also realized that flatter curves produce more upsets in later rounds (2nd weekend upsets) than in early rounds (1st weekend upsets). For example, the round-by-round upset count of 2018 was 5-5-3-0-0-0. It was only the fourth time since 1985 that there were three upsets in the S16 (1990, 2000 and 2002). 2022 also had three upsets in the S16.
  • 2023 is better than 2022 at #2-3, #5 and #16-50, similar at #4 and #15, and lower at #1 and #6-14. If the same logic holds, 2023 is slightly steeper than 2022, so it should be a little more sane than 2022 and could be implying fewer upsets in the early rounds. The round-by-round upset count in 2022 was 6-5-3-0-1-0, so based on both curves, 2023 could produce something like 5-4-3-1-0-0.
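
As referenced above, the round-by-round counts come from a simple tally like this one (the sample games are made up for illustration):

    ROUNDS = ["R64", "R32", "S16", "E8", "F4", "NC"]

    def upset_line(games, margin=4):
        """Count upsets per round, where an upset is a seed difference of 4+."""
        counts = {r: 0 for r in ROUNDS}
        for rnd, winner_seed, loser_seed in games:
            if winner_seed - loser_seed >= margin:
                counts[rnd] += 1
        return "-".join(str(counts[r]) for r in ROUNDS)

    # Made-up sample: a 13-over-4 and a 12-over-5 in the R64, an 11-over-3 in the R32.
    sample = [("R64", 13, 4), ("R64", 12, 5), ("R32", 11, 3), ("S16", 5, 4)]
    print(upset_line(sample))  # 2-1-0-0-0-0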

The only remaining piece of the puzzle: Did the Selection Committee properly appraise quality?



Here is the 2023 Seed Curve (SC). Yikes! 

  • Spikes at #2, #4, #6 and #8-10. In 2022, only the 5-seed, 8-seed, and 10-seed of the SC were above the QC, and all three met or exceeded group-based seed-expectations. 5-seed HOU knocked off 1-seed ARI, 8-seed UNC knocked off 1-seed BAY and ran to the title game, and 10-seed MIA ran to the E8 and was the only 10-seed to win. I'm not sure the same fate exists for this year's out-performers since 4-seeds, 8-seeds and 9-seeds are all competing in the same path for wins. So, let's look more into that path.
  • 1-seeds are facing better seeds disguised as 4-, 8- and 9-seeds. In 2018, the problem was weak 1-seeds, where two of the 1-seeds were 3- and 4-seeds in disguise (the first was upset by a 9-seed and the second made the F4). This problem of under-seeded competitors is more reminiscent of 2014. Two actual 4-seeds were 1-seed and 3-seed quality in disguise. Two actual 8-seeds were 5-seed and 6-seed quality in disguise, as were two 9-seeds. The 8-seed with 5-seed quality (UK) upset the lowest-ranked 1-seed (WICH), as well as the 4-seed with 1-seed quality in the next round. I hate to say it, but this exact scenario exists in the 2023 bracket: KU is the lowest-rated 1-seed, with 8-seed ARK as a 5-seed in disguise and 4-seed CONN as a 1-seed in disguise. Good job, Committee (sarcasm implied)!!!!!
  • The seed-curve doesn't hold any regard for the 3-seed, 7-seed, 11-seed or 12-seed groups. Given the trouble that 2-seeds have had in the R64 for the past two years, maybe that curse is on the 3-seed group this year. With 10-seeds above expected quality and 7-seeds below expected quality, maybe the 1-3 record from 2022 flips around in 2023 for the 10-seeds. Likewise, stronger than expected 6-seeds against weaker than expected 11-seeds could flip around last year's 1-3 record.
  • Yes, I'll say it: The Selection Committee dropped the ball this year, but they seem to do it every year, so 'they dropped it harder than usual' is probably more accurate. 2023 has weak 1-seeds (well, weakness up and down the curve), poor shooting metrics, and an inept Selection Committee -- and 2014 had these things too (along with the beginning of the horrendous Freedom of Movement philosophy). 2014's round-by-round upset count was 6-4-2-1-2-0 and its M-o-M rating was 21.35%, both in line with those of 2018 and 2022. Enjoy!!!!!
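To make the "seed in disguise" idea concrete, here is a hedged sketch. It assumes the standard four teams per seed line, so a quality-curve rank implies a seed of ceil(rank/4); the team names and ranks in the example are placeholders, not my actual QC values.

```python
import math

def implied_seed(quality_rank):
    # four teams per seed line: QC ranks 1-4 imply a 1-seed, 5-8 a 2-seed, etc.
    return math.ceil(quality_rank / 4)

def disguise(team, actual_seed, quality_rank):
    imp = implied_seed(quality_rank)
    if imp < actual_seed:
        return f"{team}: {actual_seed}-seed with {imp}-seed quality (under-seeded)"
    if imp > actual_seed:
        return f"{team}: {actual_seed}-seed with {imp}-seed quality (over-seeded)"
    return f"{team}: seeded about right"

# hypothetical QC ranks, for illustration only
print(disguise("CONN", 4, 3))   # CONN: 4-seed with 1-seed quality (under-seeded)
print(disguise("ARK", 8, 18))   # ARK: 8-seed with 5-seed quality (under-seeded)
```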

Return & Improve Model

The assumption behind this model is that chemistry can contribute to a deep run in March. If a team returns a certain threshold of key metrics, then they should improve upon the previous season's tournament performance. Here are the historical probabilities for this model.


For the most part, points matter (along with the various metrics that go into points -- field goals, three-pointers and free-throws), with steals and minutes being the second-most important. In basketball, I would assume it is harder to generate points than defense, so returning a high percentage of your team's offense from the previous season should be of the utmost value. One caveat: the probabilities get a little wonky at the 40% return level. This may be the result of other factors (like talent, either one-and-dones or the transfer portal) influencing the improvement instead of chemistry. The data goes back to 2003 for most teams, and as far as 2001 for others, but maybe in the near future, I will re-examine the model to see the results with a recency bias. For notation, >90R# is the number of teams that return at least 90% of a specific stat category, >90R&I# is the number of teams that return at least 90% of a specific stat category AND improved on the previous year's performance, and 90R&I% is the latter divided by the former. Each successive row takes out the previous row's counts, so 80 only looks at returns in the range of 80.00-89.99%. Finally, the model does not include data from non-power conference teams, so GONZ and STMY do not have any data in the model nor are they examined below.
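As a worked example of the >90R# / >90R&I# / 90R&I% notation, here is a minimal sketch using hypothetical rows of (percent of a stat returned, improved flag); the bucketing mirrors the table's mutually exclusive ranges.

```python
# Sketch of the R&I bucket notation. rows: hypothetical (pct_returned, improved)
# pairs for one stat category. Buckets are mutually exclusive: 80 covers 80.00-89.99%.

def bucket_rates(rows, thresholds=(90, 80, 70, 60, 50, 40)):
    results = {}
    for i, t in enumerate(thresholds):
        upper = thresholds[i - 1] if i > 0 else float("inf")
        in_bucket = [improved for pct, improved in rows if t <= pct < upper]
        r_count = len(in_bucket)      # e.g. >90R#: returners landing in this bucket
        ri_count = sum(in_bucket)     # e.g. >90R&I#: returned AND improved
        rate = ri_count / r_count if r_count else None   # e.g. 90R&I%
        results[t] = (r_count, ri_count, rate)
    return results

sample = [(92.1, True), (85.0, False), (83.3, True), (61.0, True), (44.8, False)]
for t, (r, ri, rate) in bucket_rates(sample).items():
    print(f"{t}: R#={r}, R&I#={ri}, R&I%={rate}")
```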

For documentation purposes, the only uncertainty in the 2023 data is the status of UK's Sahvir Wheeler. At the present moment, he is counted as a returner for UK, as the latest reports suggest he could play in the tournament. If it is discovered afterwards that he is not a returner (either because this information was wrong or because UK doesn't go deep enough to give Wheeler a chance to return -- similar to 2011 DUKE's Kyrie Irving), then UK's percentages will fall by approximately 12% in every stat except BLKs (no shocker there). Let's see the return and improve candidates for the 2023 season.


The data is sorted in descending order, with the emphasis on the PTS category. I did this manually because I didn't want Excel auto-sort to screw up my hard work. Here's what I see:

  • TCU is the only team to return at least 80% of their points and steals, as well as most of the other categories. This model suggests they have a 60% chance to advance to the S16, an improvement over the R32 run in 2022. The only concern I have with TCU is their battle with the injury bug all season. They've played several different lineups throughout the season, and just recently, they had a player leave the team for mental health reasons (or else they would be close to a 90% return).
  • The next team is IND, who returns 60%+ of their points and minutes and 58% of their steals. The model suggests only a 38% chance to return and improve, but since their threshold of improvement is simply winning one game, it seems doable.
  • The next group is AUB, IOWA, CREI and TEX, who return 50% of their points. The model suggests a 37% chance of improvement. CREI and TEX both need to win two games to achieve improvement. AUB and IOWA are tourney opponents, which means one has to lose (aka - a critical match-up). AUB must win two whereas IOWA only needs one win, but if AUB defeats IOWA and then loses to HOU, then both fail to improve. I also want to point out that both AUB and IOWA fell in seed line (which is usually a bad omen in this model) whereas TEX and CREI both improved upon their seed line.
  • The final group is TENN, MIST, USC, UK, BAY, MIA, ARI and MARQ, who return 40% of their points. The model concludes that this results in a 38% chance of improvement. Since we have eight teams, this means three of these teams -- on average -- will improve upon their 2022 tourney performance. MIST and USC play in a critical game. Of these eight teams, only MARQ and MIA improved their seed line. MARQ only needs one win to improve (probably one of the three improvers) whereas MIA needs four wins (probably one of the five failures). UK plays in a critical match-up against PROV, who only returned 25% from last year. UK only needs one win to improve whereas PROV needs three.
  • The one that concerns me the most is obviously HOU. They only return 30% of their previous season's production, which is a combination of four players graduating and two starters coming back from injury-shortened 2022 seasons.
  • Three of the four 1-seeds are in this range as well, with KU barely returning 25% of last year's championship production. Only one defending champion has advanced past the S16 since 2001, and it was 2007 FLA, who returned 90%+ of their 2006 championship team and repeated. This is another strike against KU.

Meta Analysis

What qualities (four factors) do tournament teams possess, and which teams possess the qualities to beat them? In theory, if your opponent is bringing rocks, you want to bring paper, and so forth for paper/scissors and scissors/rock. First, I want to look at the historical comparison of the four factors.


The top half of the chart is all 66 teams remaining in the tournament (the data of teams in the Wed play-in games are still included). Nothing really stands out beyond the annual decline in quality metrics. In the lower half of the chart, I took out the values of the 13-16 seeds to avoid distortions by lower-quality competition. The stats in green improved significantly, meaning the Top 12 averages are better than the Top 16 averages. For 2P%D and 3P%D, the increase is actually a bad thing because it means defense is worse. With historically weak 2P%D, a potential meta-play is strong 2P% teams. Let's take a deeper look into this tournament's meta.

This chart shows how many of the Top 20/40/60/80/100 teams in each stat are present in the 2023 tourney, and the totals are cumulative (so 21-40 is actually 21-40 added to 1-20). Now, the real matter of importance is the red and green. Green represents historical high counts, and red represents historical low counts. 2023 posts four historical lows.
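Here is a minimal sketch of how those cumulative counts come together, assuming a hypothetical mapping of tournament teams to their national rank in a single stat; the team names and ranks are placeholders, for illustration only.

```python
# Sketch of the cumulative Top-N counts for one stat (e.g., 2P%).
# tourney_ranks: hypothetical {team: national_rank_in_stat} for the field.

def cumulative_counts(tourney_ranks, cutoffs=(20, 40, 60, 80, 100)):
    # cumulative by design: the Top 40 count includes the Top 20 teams, and so on
    return {n: sum(1 for rank in tourney_ranks.values() if rank <= n) for n in cutoffs}

ranks_2p = {"TeamA": 2, "TeamB": 7, "TeamC": 19, "TeamD": 33, "TeamE": 58, "TeamF": 96}
print(cumulative_counts(ranks_2p))  # -> {20: 3, 40: 4, 60: 5, 80: 5, 100: 6}
```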

  • Even with weaker 2P%D (from the first chart), 2023 has 22 of the Top 60 2P% teams to go along with 19 of the Top 60 3P%. As I've said all year, shooting is down this year and even fewer of the elite shooting teams made the tournament.
  • To make matters more confusing, 2P%D and 3P%D counts are close to historical lows. This leads me to think elite shooting teams are worthwhile picks. With fewer teams to match their elite shooting and fewer teams to "elitely defend" their shooting, elite shooting teams are a meta-play. These teams would include 2-seed ARI (9,14), 15-seed COLG (7,1), and 3-seed GONZ (2,12) with (2P% rank, 3P% rank) as my notation (and each of these has conflicts with other models). I'm less confident in ARI's shooting numbers because of certain Ls this season where they went cold. Also, ARI probably faces UTST (34,9) in the R32. Other (less elite) shooting teams include ORAL (9,34), MIA (21,41), XAV (37,3) and a long stretch in PNST (51,9).
  • The other noteworthy meta-play concerns turnovers. 2023 has a record 20 of the Top 60 in TORD, which means roughly 1/3 of the teams in the field defend by taking the ball away from the opponent. Looking at the elite teams (Top 30), five are in the HOU regional, four in the KU regional, two in the PUR regional, and one in the ALA regional. There's no better way to stop a good shooting team than taking it away from them before they shoot it. If you want to play against the meta, good shooting teams with elite ball security (in order to get off a shot) may be the way to go. Of the shooting teams listed, GONZ (10th), COLG (21st), PNST (5th), and ORAL (1st) all rank highly in TOR.
  • For reference, last year, I called for 2P% and TORD to be the meta-plays, and predicted E8 runs for TEX and MIA. MIA worked, TEX not so much, but they would have made a better opponent for St Peter's than PUR did.

Seed-Group Loss Table

Since I didn't get to do extensive back-testing on this model and since there aren't any clear and obvious signals, I'm not going to post a full analysis with this model. Here are some brief takeaways though.

  • The data does not include 2021 as many games were cancelled due to health concerns, and this has a significant impact on loss totals.
  • 418 total losses among all the 1- thru 12-seeds. This is the 2nd-highest total, the highest being 2018 with 419 and the third-highest being 2016 with 417. Not good company for 2023, and the fourth-highest is 2011 with 406. Two of those three years didn't have a 1-seed national champion. 
  • Speaking of 1-seeds, they recorded 20 losses for the year, their second-highest total, behind only 2016's 23. These 20 losses account for 4.79% of the 418, with only two years posting a higher percentage: 2003 and 2016. Both years only had one 1-seed in the F4.
  • 3-seeds set a record year for losses with 33. However, those losses only accounted for 7.895% of the 418 total, which ranks fifth. This speaks to how weak the whole field is in 2023: the most losses ever by the 3-seed group, but only fifth in percentage-vs-field (a quick sketch for replicating these percentages follows this list).
  • 4-seeds tied two other years for most losses with 44. 6-seeds set a record with 47, and 8-seeds tied a record with 45. You would think 2023 would have shattered the all-time total with this many seed-groups hitting record highs, but the 5-seeds and 12-seeds tied record lows with 28 and 21 losses, respectively. These new highs and new lows are another reason why I don't trust the model just yet because I'm not sure it is reliable in normal years, let alone crazy ones like 2023.
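For anyone who wants to replicate the percentage-vs-field figures, here is a minimal sketch using the 2023 loss totals quoted above; the dict only includes the seed groups mentioned in the bullets, not the full 1-12 table.

```python
# Partial set of the 2023 seed-group loss totals quoted above.
seed_group_losses = {1: 20, 3: 33, 4: 44, 5: 28, 6: 47, 8: 45, 12: 21}
total_losses = 418  # all losses among the 1- thru 12-seeds in 2023

for seed, losses in sorted(seed_group_losses.items()):
    share = 100 * losses / total_losses
    print(f"{seed}-seeds: {losses} losses, {share:.2f}% of the field total")
# e.g., 1-seeds -> ~4.78%, 3-seeds -> ~7.89% (rounding may differ slightly from the text)
```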

This is all for tonight. I will try to get on in the morning and post my final predictions, but I still have a lot of work to do on my bracket and see if there are any worthwhile contests to enter.

Final Predictions

National Champ: HOU with ALA, UCLA and UK in F4 (somebody from the bottom-half of that region gets a cake-walk to the F4, probably PROV since I chose UK).

Model Picks: 

  • CHRL over SDST and AUB over IOWA, Poll-Based OS/US
  • TCU to S16, Return & Improve Model
  • UTST over MIZZ, Predictive OS/US
  • R64 Ws for PNST and ORAL
  • KU not to advance past the S16 (I actually have them losing to ARK).
  • Round-by-round upsets: 5-4-3-0-0-0 (it was really hard to find upsets given the teams I advanced). 
  • I wouldn't be surprised if our first 14-seed made the S16, but I wasn't comfortable with any of them against the 6-seeds. I also wouldn't be surprised if a 1-seed didn't win this year. But the lack of quality on the top three seed lines suggests 1-seeds are still the safest route.
  • For the first five years, I went with curve-fitting strategies to pick my bracket. This year and last year, I went with a conference-based strategy. I haven't shared any details of this strategy, but the inspiration for it can be found in the wikipedia pages of the tournament years (just in case you want to get a head start on me next year!!!). Anyways, I expect good things from the SEC. They have favorable pathings (ARK, UK, ALA), they have several over-seeded opponents (one of which you already know I have knocked out in the first round), and they are due for a bounce-back year after sending six teams in 2022 (all 1- thru 6-seeds) with only five wins to show for it (three by ARK alone). That should explain my F4.

This was the first year where I thought my models didn't provide a lot of certainty/clarity for the tournament. It's why I spent a lot of time chasing down rabbit holes to no avail. It could just be the lack of quality in the tournament, and not lack of quality in the models. Only time can answer that question. Good luck, hope I could help this year (although I don't feel like I did), and as always, thanks for reading my work.

Mar 6, 2023

Pre-BCW Teaser

After a few attempts at methodological improvements and a few rounds of deliberation, I've decided that a full review of and report on the SGLT will be near-impossible to complete in the two weeks before Bracket Crunch Week. Instead, I'll do a quick-hitter article, which ties a few loose ends together while giving me the time and flexibility to get ahead of BCW. I'll start with a status-check of the SGLT, followed by some teaser work for BCW.

Seed-Group Loss Table

This has been a pet-project of mine for quite some time. I have kept the data for it since 2006 on bracket worksheets, but I didn't put it together into a predictive model until 2019 (Part 1 and Part 2). Then, 2020 happened, and 2021 wasn't any better for a loss-based model given how many games were cancelled that season. While converting it from pen-and-paper to a digital spreadsheet, yours truly discovered a few typos in the process and realized I had also misapplied the tool. Nonetheless, it did well at identifying seed counts, but win totals for seeds were a different story. So, where does it stand today?

First, I can say that all data up to the 2022 tournament is correct and in spreadsheet format. So, everything from this point forward is methodological testing. Second, the model for 2023 is best used as a tie-breaking model. In simple terms, if a prediction is uncertain or two reliable models predict opposing outcomes, using it is better than nothing or a coin flip. Also, it is better suited for predicting F4 and E8 targets rather than R32 and S16 targets. In 2022, I erroneously applied it to the latter and the results were disastrous. Finally, the matching method is currently the preferred method, as I've been unable to do any testing with the regression method. It requires a little bit of logic (and validity-testing as well), but its results are the only ones I would even remotely consider to be reliable. All in all, if I post the results of the SGLT to the final article, it will be a Wed night or Thurs morning submission (plus I'd probably be waiting on the results of the play-in games again).

BCW Teasers

First, let's start with an update to the Experienced Talent Model. At the beginning of every season, we use an experience estimator for the current season (every player gets +1 added to their previous season's score). This produces a value which is akin to a maximum ET score. As the season progresses, experience scores are reevaluated to reflect the true experience gained from the current season. This revision almost guarantees lower ET scores across the board. First, let's look at the changes.
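To make the estimate-then-revise mechanics concrete, here is a minimal sketch; the roster scores and the all-or-nothing credit rule are simplified placeholders, not the model's actual scoring scale.

```python
def preseason_et(prev_scores):
    # pre-season estimator: every player gets +1 on last season's experience
    # score, which makes this a ceiling (maximum) for the team's ET score
    return sum(score + 1 for score in prev_scores)

def revised_et(prev_scores, played_this_season):
    # in-season revision: only credit the +1 to players who actually logged
    # the current season, so revised scores can only hold steady or fall
    return sum(score + (1 if played else 0)
               for score, played in zip(prev_scores, played_this_season))

roster = [2, 1, 3, 0, 1]  # hypothetical prior experience scores
print(preseason_et(roster))                                  # 12 (maximum ET)
print(revised_et(roster, [True, True, False, True, True]))   # 11 after revision
```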

First, there are a lot of swings in the rankings. For the teams that moved up, their ETM score probably didn't change; it had more to do with the ETM scores of neighboring teams falling. Falls can be the result of many things: injuries, lack of playing time, not playing at all, or -- in one case -- a player being assigned to the wrong team (Detroit is a good example). The only injury unaccounted for by the model is UCLA's Jaylen Clark, one because it is new (after I finished the revised scoring on Wednesday) and two because it won't send UCLA too far down the list. Second, my pre-season predictions weren't up to par, but then again, this season hasn't been up to par either (spoiler). In my analysis, I did say CONN, MIST and ORE are more likely to make it, and the rest of the question marks are likely to miss. These are looking highly likely. All of my missed predictions are teams that fit the other profile: they should make it but now look likely to miss (UNC, NOVA, DAME, and FLST).

The real question: Why does this model matter for March? Let's take a look.



The ETM Top 25 has been a great tool for identifying sleepers.

  • 2016: 10-seed SYR ranked as the 4th overall ET team and ran to the F4.
  • 2017: 4-seed FLA to the E8 and 7-seed MICH to the S16, which is good for a low-upset year.
  • 2018: 7-seed TXAM to the S16.
  • 2019: 5-seed AUB to the F4 and 12-seed ORE to the S16, both good for a low-upset year.
  • 2021: 11-seed UCLA to the F4 as the 9th-ranked ET team, and 6-seed USC to the E8.
  • 2022: 8-seed UNC to the NR game, 10-seed MIA to the E8, and 11-seed MICH to the S16.

Without the actual seeds for 2023 revealed, I'll have to use the bracketology projections. This means our sleeper pool contains MIST, UK, DUKE, and USC (the first three as the most likely candidates to make at least an E8 run). ORE, AZST, TXTC and NCST could also join the sleeper pool if they get invited to the tourney.

That's all I'm going to spoil for the ETM, so let's move onto one more tool in our toolbox. The Conference-based OS/US model is in for a doozy this year, which is why I'm getting an early start on it. The B10 is the biggest problem for this model. 2nd place in the B10 has a 12-8 record while 12th place has a 9-11 record. With 11 teams separated by only 3 games, their seeds have to be within 1-2 seed lines of each other or else the model will predict a ton of OS/US possibilities. The ACC and the SEC aren't much better. It could be one of those years like 2016 and 2018 where the model produces too many contingencies and the results of the model become self-contradictory. I called this situation an over-load (OS/US models can do this), and it makes the model less accurate in those years.

Before I start getting any more wordy for a quick-hitter article, this should hold us over for the next six days. I'm honestly thinking this year might be a nightmare. Until Selection Sunday, thanks for reading my work and I'll see you then.

Feb 27, 2023

Post-Player Production of Past Champions: The Archetypal Approach (Part 2)

I'm back with Part 2 of my study on post-player production of national champions. If you have not read Part 1, it is not necessary to have read it in order to understand the concepts and/or analysis presented in this article. However, I feel it is a worthwhile read as a contrast to this article, and I may make a few fleeting references to Part 1 concepts in this write-up. Let's start at the same starting point as last article, and we'll move forward from there.



The table above is the National Championship Profile model with post-production filled-in and all other components left blank. The values are the same values from that article, but since these two articles are a rebuilding job, these values will be corrected at the end of this article. Also, the table only goes back to 2001 because of data reliability (I have data back to 1998, but I'm only confident in the data up to 2001, and it's the data which will be used in both articles). The archetypal approach is represented by the left column labelled POST, and it concerns the function of post-production as a part of the team's dynamic. In simple terms, we will be looking at the whole picture of the national champion and evaluating post production's share of the whole. Instead of the O (Offensive-oriented post-production), D (Defensive-Oriented post-production), or B (Both offensive and defensive post-production), new archetype names and labels will be created.

Similar to the first article, the hardest part of these write-ups is the organization/presentation of ideas. In Part 1, I originally wrote an article that felt more like a novel in length, so at the last minute, I decided to reorganize the paragraphs and the flow of ideas. This resulted in a three-day delay in submission, but it halved the read time. Again with this article, the organization of ideas is a nuisance, but I think the easiest layout for Part 2 is to detail the parameters first, then unveil the archetypes with depth of analysis, and finally give a general wrap-up to the concept with all remaining questions answered.

STEP 1: Parameters for Qualifying Players

Like I did with the taxonomical approach, I wanted to start the archetypal approach with a minimum level of participation or utilization. If a player doesn't measure above the utilization parameters, they are not counted as part of the team's rotation. I required qualifying players to meet two utilization parameters:

  1. Play at least 10 minutes per game (I kept some players who averaged 9.8 MIN per game only because I'm not sure if it is a rounding error, and I kept one player on 2015 DUKE with less than 9.8 MIN per game because of a mid-season suspension to a rotation player and its effect on the MINs, FGAs and PTS per game), and
  2. Have a shot distribution > X%, where X = (Rotation Position - 1) x 0.4. If a player is ranked 7th in order of shot distribution, then his share of his team's shots should be greater than (7 - 1) x 0.4 = 2.4%. The 8th-ranked player would need more than 2.8%, the 9th-ranked more than 3.2%, and so on. When I show the full ordered table in Step 3, you'll see more visually why this matters.

Shot distribution (ShDs) is just a shorter way to say the percentage of a team's shot attempts that a player takes. I also calculate Free Throw Distribution (FTDs) and Rebound Distribution (RbDs), which is the share of free throws and rebounds, respectively, that a player takes. These three metrics will play a critical role in defining archetypes and post-player function in these archetypes.
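Here is a minimal sketch of both qualifying parameters working together; the player rows of (name, minutes per game, shot attempts) are hypothetical.

```python
def rotation_players(players, min_mpg=10.0):
    # players: list of (name, minutes_per_game, season_shot_attempts) tuples
    team_shots = sum(shots for _, _, shots in players)
    ranked = sorted(players, key=lambda p: p[2], reverse=True)  # order by ShDs
    qualified = []
    for position, (name, mpg, shots) in enumerate(ranked, start=1):
        shds = 100 * shots / team_shots            # share of team shot attempts
        threshold = (position - 1) * 0.4           # X = (Rotation Position - 1) x 0.4
        if mpg >= min_mpg and shds > threshold:
            qualified.append((name, round(shds, 1)))
    return qualified

team = [("A", 32, 400), ("B", 30, 350), ("C", 28, 300), ("D", 25, 250),
        ("E", 20, 180), ("F", 15, 120), ("G", 12, 60), ("H", 9.5, 40)]
print(rotation_players(team))  # H fails the minutes test, leaving a 7-man rotation
```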

STEP 2: Team Archetypes and the Function of their Post-Production

In the original National Champion Profile article, I talked in the Intro about how 2021 BAY resembled the composition of 2019 UVA, but they were a more consistent variant of 2019 UVA in a weaker 2021 field. The resemblance of the two teams is why I added point guard production and post production to the National Champ Profile model. If I had gone into far more detail like this reconstruction effort is doing, I probably would have made fewer errors in that model in the Final 2022 write-up. That is in the past, but I wanted to start with these teams as the first archetype, and by referencing my history on the subject, I wanted to document that I've been doing this well ahead of the ESPN half-time stats analysis.

Big 3 Archetype (B3): In the old notation, most of these teams were the D-archetype (Defensive-oriented post production). Now, they will be known as the Big 3 archetype, mainly because these teams have the same metrics in common with each other:

  • With some player rearrangements (2014 CONN Daniels, 2016 NOVA Jenkins, and 2018 NOVA Bridges as SF instead of F), the Big 3 Scorers fill the PG-SG-SF roles and the remaining post-players form a Big 3 of their own in the post.
  • The highest of the Big 3 scores at least 15.5 ppg or has at least a 19.8% shot distribution.
  • The lowest of the Big 3 scores at least 11.8 ppg or has at least a 15.7% shot distribution.
  • The drop-off in shot-distribution from third to fourth is at least 2.8% (2018 NOVA) but can be as much as 15.7% wide (2010 DUKE). Most of the 3rd-to-4th drop-offs are in the range of 4.1-8.6%.
  • The Big 3 Scorers all had 3P% >= 0.359, and for five out of six teams, they were also the Top 3 in FTDs.
  • The Big 3 Post combine for approximately 18.5-22.3% of the ShDs (if we count 2018 NOVA as an anomaly), and this averages out to about 7% per post player.
  • The Big 3 Post also combine for 3.0 - 7.5 ORBs per game, but no individual post-player averages more than 7.9 RBD per game.

Here are some other details about this archetype:

  • All occurred from 2010 and later, which is one year after the 3-point arc was extended to 20'9". Also, five of the six occurred from 2014 and later, which is the year that the (horrible) Freedom of Movement philosophy was instituted. Imagine that, a rules philosophy dictating how a team should be built and how they should play in order to win the game.
  • RbDs is not consistent among the six teams. For 2014 CONN and 2019 UVA, the Big 3 Scorers own a major share of the rebounds, which is counter to the other four teams in the group. For documentation purposes, I debated segregating these six teams into three archetypes along this discrepancy: 2010 DUKE and 2021 BAY as one, 2014 CONN and 2019 UVA as one, and 2016 NOVA and 2018 NOVA as one. I didn't like this approach because it gave the appearance of sorting by team/coach philosophy rather than metrics.
  • 2014 CONN and 2018 NOVA are the obvious anomalies in this archetype. 2018 NOVA because they had the smallest 3rd-to-4th drop-off, a 2.8% difference compared to the other five teams in the group (4.1%, 4.9%, 7.7%, 8.9%, and 15.7%). Only one other team not in the B3 archetype had a 3rd-to-4th drop-off higher than 2018 NOVA (2017 UNC - 3.2%), and for reasons to be demonstrated in the next group, 2017 UNC and 2018 NOVA are not grouped together. There is a good chance, depending on who wins the 2023 title, that this group alone gets an overhaul in the off-season. 2014 CONN probably has the best match to the statistical mean of the B3 parameters, but they don't match the statistical results of the group (RbDs is more guard-oriented, lacking a third post-player in the rotation, and the highest BLK total by a longshot). If there was another group to which 2014 CONN could belong, it would be the next archetype.

So, let's look at the next archetype group.

Go-To Player Archetype (Go2): In the old system, this group contains teams from all three labels (O, D, and B). Here's what these teams have in common:

  • The leading scorer on the team took at least 23.7% of the team's shot attempts, and the next highest player took at least 6.5% fewer. The next-closest team to all of these parameters is 2014 CONN (as stated in the previous group) with only a 23.4% ShDs by the leading scorer and only a 3.8% drop-off from 1st-to-2nd.
  • Three of the Go-To players were SFs and three were SGs (2014 CONN's Go-To was a PG).
  • The Go-To player either had the highest FTDs on the team as well or had a 3PR >= 0.431 (the more jump shots a player takes, the less likely they are to be fouled and get FTs).
  • The two starting post-players on Go2 archetypes have at least 13.2 RBDs per game, with five of the six teams having one of those two starters with at least 8.2 RBDs per game. Looking closer at the 13.6 combined RBDs per game, at least 5.7 of these are ORBs. If the Go-To Player is taking more than 1/4 of the shots and missing more than 50% of their shots, there should be plenty of ORBs to go around. (Note: Since there is little difference in minutes played and shot distribution for 2003 SYR post-players, I just counted the post-production of the highest two producers).
  • The next three highest in ShDs form a secondary Big-3 to the Go-To player, with the highest ranging from 14.5-18.5% ShDs and the lowest ranging from 11.8-14.6% ShDs. On every team in the Go2 archetype, at least one of the secondary-3 is a post-player, and in four out of six teams, two of the secondary-3 are post players.
  • Four of the six teams have post-production that could be described as a block-party, defensively speaking, as their total combined BLK is at least 3.7 BLK per game. Ironically, all four of these teams played before the Freedom of Movement rules philosophy, which turns legitimate blocked shots into fouls on the defender for the sake of artificially increasing box scores.

If there is one anomaly in this group, it would be 2003 SYR. For starters, it is one of three teams whose leading RbDs player is not a post-player (the other two are 2014 CONN and 2019 UVA). Unlike those two teams, the next three out of four in RbDs are post-players, so they can't be grouped with those teams. Likewise, 2003 SYR's Carmelo Anthony fits the description of a Go-To player, and they had a secondary Big 3, not a primary Big 3 like 2014 CONN and 2019 UVA. Their numbers could also be a function of being the only national champion on the list to play a zone as their primary mode of defense. The biggest differentiating factor is that the ShDs drop-offs from 4th-to-5th and from 5th-to-6th are the largest in the group, with 2002 UMD being the next closest on either of these drop-offs.

If there is another archetype to which either of these two teams could belong, it would be the next one.

Two-Way Archetype (2W): In the old system, these teams could easily be classified as a B-archetype, although I think I settled on O-archetype for a few of them. Here's what these teams have in common:

  • At least five players each take at least 12.4% of their team's shots (in the case of 2012 UK, they have six players above the 13.8% mark in shot distribution, which almost breaks the bounds of mathematics to achieve, and only the two NOVA teams come close to resembling it). With this many players getting this many shots, it is no surprise that each team has at least five players averaging at least 9.5 PTS per game.
  • Four of the five teams (2004 CONN being the exception) have each of their Top 4 in FTDs above 14.3%. Not to mention, three of their top 5 players in FTDs are post players (2012 UK being the exception here).
  • All five teams have their primary two post players collecting 40.5-54.9% of the team's rebounds. Not only are their top 2 in RbDs post-players, but so are three of their top 4 (again, 2012 UK excepted).
  • All have two post players averaging at least 1.3 BLKs per game, and the entire post producing at least 3.2 BLKs per game.
  • Four of the five teams have an even distribution of AST (only 2004 CONN has 11.5 AST concentrated in just two players).

The teams in this group are probably the most balanced teams to win a championship. The numbers show they can play offense and defense, both inside and outside, and with any player able to score, they should be able to take advantage of the opposing team's worst player. Both 2004 CONN and 2012 UK appear to be outliers in this group. For 2012 UK, it has a lot to do with being a 6-man rotation with a sparingly used post-player as a 7th in the rotation. That does a lot to explain how six players can achieve such a high threshold in ShDs. 2004 CONN is an anomaly in its own right. They would share similar stats with the Go2 archetype if not for the high ShDs of Okafor (2nd highest on the team), only 2.0% less than Gordon. Likewise, 2002 UMD and 2003 SYR could be in this group if not for the wide drop-off in ShDs from 1st-to-2nd. 2004 CONN seems to have a Big 2 with a secondary Big 3 in ShDs (almost like a poker Full House archetype), but this archetype is not shared by any other championship team. There are two other teams that are somewhat close to these metrics, but they do not share the same stat levels as 2004 CONN, or any team in the 2W archetype.

Let's look at these two teams and the next archetype.

5-Man Offense Archetype (5O): In the old system, these teams would have received the O-archetype. Here's what defines them, yet separates them from other archetypes:

  • Five players each take at least 13.8% of their team's shot attempts (1.4% higher than the 2W group) while each scoring at least 11.2 PTS per game (1.7 PTS per game higher than the 2W group), with one player scoring at least 20.8 PTS per game (only 2004 CONN of the 2W group is close).
  • Coincidentally (since two is a small sample size), both teams had their two primary post players collect 40.8% of their team's total rebounds. Both had four players each collect at least 1.7 ORBs per game, which amounts to an average of 6.8 extra shot attempts per game.
  • Both teams only have one post player averaging more than 1.3 BLKs per game (unlike the 2W group which had two). The lack of interior shot blocking is compensated by five players averaging at least 1.0 STLs per game. With five extra shots coming in transition, it's easy to get at least 11.2 PTS per game while keeping your opponent off the scoreboard.
  • Both teams have a concentrated distribution for AST, with the majority of AST shared between one or two players (2004 CONN of 2W had this trait and the ORBs trait, nothing else though).

With the exception of these schools being arch-rivals, the parameters and metrics tend to fit. In fact, only three national champs have players whose PTS per game figure is higher than their ShDs figure (as long as ShDs is greater than 6%), and two of those three teams are 5O-archetypes. Boozer took 13.8% of 2001 DUKE's shots but scored 14.0 PTS. Likewise, Tyler Hansbrough's and Ty Lawson's PTS are above their ShDs figures. When PTS > ShDs, it suggests high efficiency in scoring (in other words, these players probably lead in points-per-shot-attempt among all players studied). Unlike their offensive prowess, they don't have the defensive stats to qualify as a 2W archetype, especially in BLKs. Likewise, they don't have a high enough difference in 1st-to-2nd ShDs (2.8 and 2.0, respectively) to qualify as a Go2 archetype. If there is one team in the other groups that could resemble this 5-Man Offensive assault, it may be 2018 NOVA, but they lack the ShDs, FTDs and RbDs from post-production to resemble these two teams (probably a product of different eras of basketball).

Let's look at the last two teams and see what separates them from everyone else.

Post-Island Archetype (PI): They say that no man is an island, but these two teams had a center who defies this wisdom. Here's what these two teams and their "island" have in common.

  • A post-player who is 2nd in ShDs with at least 17.8% of the team's shot attempts, at least 25.7% of the team's FTDs, and at least 29.0% of the team's RbDs. If I remember correctly, the only other team that could qualify on this alone would be 2004 CONN (which is their 3rd possible group).
  • Both teams also have all of their post-players in the Top 5 of their team's FTDs.
  • Both teams also have all of their post-players in the top spots of RbDs (no post player is out-rebounded by a non-post player).
  • The lack of shot-blocking is a peculiar feature, as neither center has more than 1.4 BLKs per game. This could be a product of their importance as an interior presence: By not taking risks of fouling, the BLKs stat is sacrificed in order to remain on the floor. (2004 CONN has BLKs).

Since 2005 UNC is the other team that features a player (Marvin Williams) with a higher PTS total than their ShDs, it is very likely they could transition to the 5O archetype (with some tweaking of the parameters of course). If another team were to win a national championship as a PI archetype, it would also make sense to move 2004 CONN to this group (with 2015 DUKE and the new team) and set tighter parameters to distinguish this group from all others (which means ensuring that 2005 UNC qualifies more as a 5O archetype than a PI archetype).

STEP 3: General Review of the Archetypal Approach

I would suppose that the best start to this review would be an overall view of the ShDs, FTDs, and RbDs to visualize how each archetype develops around these three parameters.

Each player is ordered highest-to-lowest (left-to-right) for each of the three parameters. To give perspective to post-production, post-players are marked with yellow boxes and small forwards are marked with green boxes. Notice how post-production is utilized by each archetype:

  • B3: Post production is in the bottom-half of ShDs and FTDs, yet it is stretched across RbDs. In the B3 archetype, the post functions to defend the interior while claiming missed shots, both on offense and on defense.
  • Go2: Post production is in the 2nd quarter (secondary to the Go-To player) and the 4th quarter of ShDs (reserves also taking on the supporting role), spaced throughout the FTDs, and heavily concentrated in the upper-half of RbDs. In the Go2 archetype, the post functions as a supporting role on offense (to take some of the scoring weight) while securing any misses by either team.
  • 2W: Post production is two of the top 5 in ShDs, all in the upper 2/3 of FTDs, and mostly occupying the top spots in RbDs (usually one top spot occupied by the starting SF). In the 2W archetype, the post accounts for 40% of the minutes/rotation and 40% of the scoring options (a balanced distribution), but still defends at a high level (BLKs not shown) and finishes off possessions with a rebound.
  • 5O: Post production is two of the top 5 in ShDs, two of the top 5 in FTDs, and all at the top of RbDs (especially when considered on a per-minute-played basis) with not as much help from the SFs. In the 5O archetype, the post functions as two parts of a well-oiled, high-octane, 5-part scoring machine while dominating the glass on both offense and defense so that SFs can be more involved in the transition game rather than the half-court rebounding game.
  • PI: Post production occupies the 2nd spot in ShDs with very little drop-off from the 1st spot, takes 1/4 of the team's FTDs, and gets at least 10% more of the team's rebounds than the next post-player. In the PI archetype, the post functions as a one-man island (usually a center) whose presence on the floor dictates the team's will through high shot volume, high scoring potential (from the floor and the FT line) and high pursuit of missed shots.

Other details regarding post-production include:

  • Only two teams have a post-player as the top player in ShDs: 2009 UNC's Tyler Hansbrough and 2012 UK's Terrence Jones (big surprise that it isn't Anthony Davis).
  • Two post-players in the top 3 of ShDs have only happened twice: 2002 UMD and 2005 UNC, both of which seem like eternities ago in college basketball (I wouldn't bet on a team in 2023 or the near future with this attribute).
  • Only one team featured three post-players in the top 5 of ShDs (2005 UNC), and I would follow the same advice as the previous bullet point for this attribute.
  • Two post-players in the top 4 of ShDs have happened four times, and three of these four were the Go2 archetype. Ironically, in the Go2 archetype, the Go-To player is not a post-player.
  • Only three teams feature a top RbDs player that is not a post-player: 2003 SYR, 2014 CONN, and 2019 UVA. 2003 SYR's Carmelo Anthony was the prototypical Go-To player, so I wouldn't be worried about this one. 2014 CONN didn't have a post-player until the 5th spot in RbDs (a lot of this depends on how we classify 2014 CONN's DeAndre Daniels, which I felt was a struggle in both articles because of the various starting lineups of that team). 2019 UVA also didn't have a post-player until the 5th spot in RbDs, but this may have a lot to do with the pack-line defense.
  • Only in the B3 archetype did post-play have no importance in FTDs, probably because it was all about the team's Big 3 scorers.
  • Only two teams featured a 9-man rotation (according to the 10 MINs & X*0.4 parameters), whereas the rest were either 7- or 8-man rotations (although you could potentially count 2001 DUKE and 2012 UK as 6-man rotations when basing it on the distribution of minutes played). When shots are distributed evenly, 9-man rotations would each get 11.1%, 8-mans would get 12.5%, 7-mans would get 14.3%, 6-mans would get 16.7%, and 5-mans would get 20.0% (see the quick check below). I found it interesting how closely some of the actual cut-offs for ShDs mirrored these cutoffs for even distribution. Just look at each column in the ShDs in terms of how many are above and below these thresholds and the width of the margins on both sides of the cut-off points. It's why I stated that UK's 7-man rotation almost breaks the bounds of mathematics, as six of their players were above the 13.8% level. In fact, their top five in shot distribution almost fall in line with the 6-man and 7-man even distributions. It is why I feel I made a good choice in using ShDs as one of the factors in identifying a team's player rotation.
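As a quick sanity check on those cut-offs, the even-distribution share is just 100% divided by the rotation size:

```python
# Even-distribution shot shares by rotation size, matching the figures above.
for n in (9, 8, 7, 6, 5):
    print(f"{n}-man rotation: {100 / n:.1f}% of the shots per player, if split evenly")
# 9 -> 11.1%, 8 -> 12.5%, 7 -> 14.3%, 6 -> 16.7%, 5 -> 20.0%
```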

Larger Concerns for the Archetypal Approach:

  1. This is a lot of detail for one component of one model for 6 wins in bracket prediction. In the 2022 Final Analysis, the NCP model looked at all of the components of 17 contenders. In some years, this count can be over 20 contending teams. To do this amount of work for this many teams in a window as short as BCW is akin to fitting a watermelon through the eye of a needle.
  2. The other pressing factor is the evolution of the game and its impact on the adaptability of the component (regardless of whether it is the Taxonomical Approach or the Archetypal Approach). As the game evolves with new (and sometimes stupid) rules and offensive/defensive philosophies, new archetypes will emerge. In 2010, this approach would never have identified DUKE as a national champion, as the B3 archetype would not have existed at that time. The same goes for 2005 UNC and 2004 CONN. Likewise, at one point in the study, I identified eight different archetypes into which I could classify these 21 teams. In order to keep consistency with the Taxonomical Approach, I wanted to explain as much behavior as possible in as few labels as possible. Five archetype groups accomplish what the eight did, so it seemed appropriate to go with this count (six taxonomies in the other article, even though it could have been more). To further this notion of malleability, I've stated that certain teams could be moved into another archetype and the metrical/statistical thresholds would only need to be altered slightly to differentiate between archetypes. This aspect alone gives more of a hindsight feel to the model even though it is being applied as a tool for foresight.
  3. In the Taxonomical Approach article, I raised a question about weighting in the model that I would like to address. If the components of the NCP model counted as they currently exist, PG production, the Taxonomical Approach, and the Archetypal Approach would count as three weightings. In theory, the archetypal approach is the PG production combined with the taxonomical approach, sort of like fitting two puzzle pieces together. Should they count separately as three weights in the NCP model? Or should the two micro-approaches (PG and taxonomical) be evaluated as validation/invalidation thresholds for the macro-approach (archetypal) and the whole puzzle count as one weight in the NCP model? In 2022, I kind of split the difference, counting PG as one weight and post-production (merging the two approaches into one measure) as another weight. In all honesty, I think only a full back-testing of the NCP model itself could answer which approach is correct, so I'll stay with the same weightings as 2022 for now.

As always, thanks for reading my work, and like I said in Part 1, if you see any typos or any glaring misses, feel free to speak up in the comments section.