Project: Perfect Bracket: Post-Player Production of Past Champions: The Taxonomical Approach (Part 1)

The Championship Profile Model used to be a random assortment of models, checklists, and data scattered throughout various articles on the blog until Feb 2022, when I consolidated most of them into one tool. Over the off-season, I wanted to overhaul it due to the various errors in data and calculations in last year's final article. In early December, I did an update to the Model (read it here), but I left one specific component blank with the intention of standardizing that component. First off, it has been one of the most mentally taxing studies I have ever undertaken. The numerous ways I have attempted to quantify and qualify post-production, only to abandon each attempt, have felt like Sisyphean labor. It is a significant reason why it has taken until Feb to finish it. Second, I am still unsure of its appropriate weighting in the model, and after Part 2 is published, you will understand what I mean by this (and I will discuss this problem in that article's conclusion). In the end, I decided to take the component (in its present form as analyzed in the two 2022 articles) and dissect it along its two primary methodologies. From these two methodologies, you will be able to see and understand my original intentions with the component: Post-player output and its importance to a national title. Thus, the two articles being written will feel and read like a walk-through of rebuilding this component.

Let's begin with the table above, which should be a recognizable starting point. It's the NC Profile Model from the 2022 Final Analysis article with post-production filled in and all other components left blank. The values are the same values from that article, but since we are rebuilding the component, these values will be corrected. Also, the table only goes back to 2001 because of data reliability (I have data back to 1998, but I'm only confident in the data back to 2001, and it's the data we will be using throughout both articles). The taxonomical approach is represented by the right column labelled PT / R*, and it concerns the statistical thresholds that post-production must achieve in order to be national championship quality. Thus, this approach will be similar to an archeological expedition, where we dig through a lot of data (our artifacts) and try to organize/sort them into their proper place (spoiler alert, it could get very boring).

STEP 1: Logical Foundations for a Post-Player Taxonomy

What do we know about post-players?

Starts: Every team, champion or not, will 'regularly' start at least one post-player. You will hardly ever see five guards start every game or five post-players start every game. Some may regularly start two post-players. The obvious place to find a team's post-production is 'Games Started.'
Blocks: If there is one category in which post-players are nearly guaranteed to lead their team, it is blocks. Basketball is won and lost in the paint because that's where the easiest shots to make are located. If a team can get the ball into the paint, they have a high chance to score, so likewise, if a team can protect the paint, they can keep points off the board.
Rebounds: Since post players are usually close to the basket, they are closest to a missed shot. While it is not always the case that post-players lead their team in rebounds, especially with the higher volume of 3-point attempts in today's game, rebounds -- especially the offensive variety -- will usually identify a team's post players.
3-point Rate: Speaking of 3-point attempts, post players typically have a low 3-point rate (3PR), which is the percentage of their shot attempts that come from beyond the 3-point arc. More accurately said, post players will have a lower 3PR compared to the guards on their team. As we will see shortly, the post players on one national champion violated this theory.
AST/TO: For a metric commonly associated with point guards, a poor AST/TO ratio (a ratio less than 1.0) can usually identify post-players. Logically, a post-player is more likely to draw the attention of the defense than to run the offense, so post-players are more likely to commit turnovers than to make passes leading directly to baskets. As we will also see, a few post-players on national championship teams don't conform to this theory.
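As a rough illustration of the two ratio metrics defined above, here is a minimal Python sketch. The function names are my own placeholders, and the inputs are simple round numbers rather than any player's actual totals.

```python
def three_point_rate(three_pa: float, fga: float) -> float:
    """3PR: share of all field-goal attempts taken from 3-point range."""
    if fga == 0:
        return 0.0  # no attempts at all, so treat the rate as zero
    return three_pa / fga


def ast_to_ratio(ast: float, turnovers: float) -> float:
    """A/TO: assists per turnover. Below 1.0 means more turnovers than assists."""
    if turnovers == 0:
        return float("inf")  # assists without any turnovers
    return ast / turnovers


def turnovers_per_assist(a_to: float) -> float:
    """Invert an A/TO ratio to read it as 'turnovers per one assist'."""
    return 1.0 / a_to


# One 3-point attempt out of every eight shots.
print(three_point_rate(1, 8))
# An A/TO of 0.250 read the other way round: turnovers per assist.
print(turnovers_per_assist(0.250))
```

Reading A/TO as turnovers per assist can be easier to picture for post-players, since most of them sit below 1.0.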

Let's start with the first attribute as the foundation for our taxonomy and use the rest of the attributes for identifying and sorting. I've compiled a list of every starting forward and center on all national champions from 2001 to the present, as labelled by their team athletic website. First, you will notice that some guards are listed, and if you know your college player history, you should be wondering why they were classified as guards in the first place. I have included them because I want them to be scientifically deduced by the metrics, not because of an unknown arbitrary classification system. Second, I tried to separate each championship team by its team colors rather than an extra line or column (and in my opinion, knowing the team name and year isn't too essential for this exercise). For teams that share similar colors, like dark blue, or the back-to-back championships of FLA, I used alternate color schemes, but tried to use the same color throughout the table. (KU, DUKE and UK get their home-whites, and CONN and NOVA get grey since black text isn't easily visible when contrasted with their dark/navy blue).

Rather than attempt to identify the primary post player or the grouping of post players, the safest method would be to rule out the least likely candidate(s) to be the primary post player or post-grouping. The goal is to reduce every team to two candidates (or in rare cases, one) for post player. Again, if you know your college player history, you could probably do this exercise without the metrics, but for scientific purposes, I will use the metrics and explain the results. After the first few examples, I will short-hand the analysis so it reads faster.

Grey (2018 NOVA): Mikal Bridges is most likely the SF due to highest 3PR, highest A/TO, and fewest ORBs (the opposite of what we want to see in a post-player) and 2nd in BLKs and TRBs. He is not best in any category that would identify him as a post-player.
White (2015 DUKE): Justise Winslow is most likely the SF due to highest 3PR, highest A/TO, and fewest ORBs and TRBs, and tied for 2nd in BLKs.
Grey (2014 CONN): Lasan Kromah is most likely a SF due to highest 3PR, highest A/TO, fewest ORBs, and fewest BLKs. Based on per minute data (not shown), Phillip Nolan played the same amount of mins as Amida Brimah and half the mins of DeAndre Daniels, yet his metrics are less than the metrics of Brimah and mostly less than 50% of the metrics of DeAndre Daniels. Thus, Phillip Nolan is most likely a back-up/rotation post-player.
White (2012 UK): Michael Kidd-Gilchrist, SF, highest 3PR, median A/TO, tied for last in ORBs, median TRBs, and fewest blocks.
Grey (2011 CONN): Roscoe Smith, SF, highest 3PR, second-lowest blocks. Tyler Olander, back-up post-player, lowest in starts, TRBs, ORBs, BLKs.
White (2010 DUKE): Brian Zoubek, back-up post player, lowest starts, highest A/TO.
Light Blue (2009 UNC): Danny Green, SF, highest 3PR and A/TO, lowest ORBs and TRBs.
Green & Orange (2007 and 2006 FLA, respectively): Corey Brewer, SF, lowest BLKs, ORBs and TRBs, highest A/TO and 3PR.
Orange (2003 SYR): Carmelo Anthony, SF, highest 3PR and A/TO, lowest BLKs.
Red (2002 UMD): Tahj Holden, back-up post player, fewest starts, highest 3PR and A/TO, lowest ORBs, TRBs, and BLKs.
White (2001 DUKE): Mike Dunleavy, SG/SF, highest A/TO, lowest ORBs, not best in any category among four players, most unlikely to be primary post-player. Nate James, SG/SF, lowest BLKs and TRBs, while Boozer had highest DRBs (TRBs - ORBs) suggestive of post-defending.
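The rule-out reasoning in the list above can be sketched as a simple flag count. This is only my approximation of the process, not an exact procedure: each candidate gets one flag for every "least post-like" extreme (highest 3PR, highest A/TO, fewest ORBs, TRBs, or BLKs), and the candidate with the most flags is the first to be ruled out. The three players and their numbers below are invented purely for illustration, and ties or back-up cases (fewest starts) would still need a judgment call.

```python
def least_post_like(players: dict) -> tuple:
    """Return (name of the candidate to rule out, flag count per candidate)."""
    flags = {name: 0 for name in players}
    # A high value in these metrics points away from a post role.
    for metric in ("3PR", "A/TO"):
        worst = max(players, key=lambda n: players[n][metric])
        flags[worst] += 1
    # A low value in these metrics points away from a post role.
    for metric in ("ORB", "TRB", "BLK"):
        worst = min(players, key=lambda n: players[n][metric])
        flags[worst] += 1
    ruled_out = max(flags, key=flags.get)
    return ruled_out, flags


# Hypothetical per-game lines for three listed forwards on one team.
team = {
    "wing":  {"3PR": 0.45, "A/TO": 1.6, "ORB": 0.8, "TRB": 4.5, "BLK": 0.6},
    "big_a": {"3PR": 0.02, "A/TO": 0.7, "ORB": 2.5, "TRB": 7.0, "BLK": 1.5},
    "big_b": {"3PR": 0.10, "A/TO": 0.9, "ORB": 2.0, "TRB": 4.0, "BLK": 0.4},
}

ruled_out, flags = least_post_like(team)
print(ruled_out, flags)
```

In this made-up team, "wing" collects three of the five flags, so the two bigs remain as the post-player candidates.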

With these players removed, let's take an updated look at our taxonomy.

STEP 2: Filtering Metrics and Categorizing Post-Player Roles

From this list, the first distinguishing factor I notice is 3PR, so I'll filter out the players with the highest 3PR among all candidates and define categories for them.

You will notice a few additional columns. The unlabelled column between ORB and TRB is DRB, which is TRB - ORB. The last unlabelled column is the ORB ratio (I did not label it because I didn't want it to be confused with OR%, which is one of the Four Factors of Efficiency). ORB Ratio is the percentage of a player's ORBs to their TRBs (or ORB / TRB). The unlabelled column to the right of the player is the player's taxonomical identity, so let me explain the notation.
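For readers who want to reproduce the two unlabelled columns, here is a minimal sketch of the arithmetic described above. The function name and the sample rebound averages are my own placeholders, not values from the table.

```python
def derived_rebound_columns(orb: float, trb: float) -> tuple:
    """Return (DRB, ORB ratio) from per-game offensive and total rebounds."""
    drb = trb - orb                       # defensive rebounds are the remainder
    orb_ratio = orb / trb if trb else 0.0  # share of rebounds that are offensive
    return drb, orb_ratio


# Hypothetical player averaging 2.0 ORB and 6.0 TRB per game.
drb, orb_ratio = derived_rebound_columns(2.0, 6.0)
print(drb, round(orb_ratio, 3))
```

Note that ORB ratio is a per-player share of that player's own rebounds, which is why it is kept separate from the team-level OR%.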

WF (Wing Forward): 3PR > 0.125 and BLK <= 0.7. A perimeter-oriented forward who has the size to score inside but also the shooting range to play on the perimeter allowing a bigger guard to take a smaller defender into the post.
CF (Combo Forward): 3PR > 0.125 and BLK >= 1.4. A forward that can play inside and outside on both offense and defense.
Of the 42 players under consideration, only eight have a 3PR that is 0.138 or higher. Of the eight players with a 3PR > 0.125, half have BLKs <= 0.7bpg and half have BLKs >= 1.4bpg. With this wide of a margin between the two, there is no need to over-analyze the data when the parameters kind of set themselves. I chose 0.125 for convenience since it is exactly 1 3PA out of 8 FGA. The next closest data value to the minimum parameter is 0.079, which is about 1 3PA out of 12 FGA.
Looking at the other metrics, the taxonomy seems to fit. The 2P% of WFs is almost .100 higher than that of CFs, which leads me to believe WFs use their 3-point threat to drag bigger/slower defenders out to the perimeter and score 2-pointers off the dribble rather than through back-to-the-basket post-play. Likewise, the ORB average of WFs is approximately 1.0 less than that of CFs, which leads me to believe that they spend more time on the perimeter than CFs, placing them further away from potential ORBs.
Although eight is probably too small of a sample size to be practical, the results of the correlation coefficient on this group are interesting. Correlating 3PR to ORB, the value is -0.4445, which is expected but weak. As I've claimed, the more a player plays on the perimeter, then the more likely they are to attempt 3PAs than 2PAs and the less likely they will be in physical proximity to a missed shot. Thus, 3PR and ORB should be inversely related. Correlating 3PR to A/TO, the value is 0.7059, which is much stronger. By playing more on the perimeter, a player is more likely to pass the ball that leads directly to a basket, either by driving-and-passing or by direct-feed to the post. The player also likely has better passing and dribbling skills than the typical post player, making them a better fit to play on the perimeter than inside the paint.
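The correlation values quoted above are presumably Pearson coefficients, though the article does not say which tool produced them. A minimal sketch of that calculation is below. The eight 3PR/ORB pairs are invented stand-ins (the real values live in the table), so the printed number will not match -0.4445. Only the method is being illustrated.

```python
import math


def pearson_r(xs: list, ys: list) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (spread_x * spread_y)


# Invented values for eight forwards: higher 3PR loosely paired with fewer ORBs.
three_pr = [0.14, 0.28, 0.35, 0.40, 0.45, 0.50, 0.60, 0.67]
orb = [2.6, 2.3, 1.5, 2.4, 1.7, 1.2, 1.5, 1.0]

r = pearson_r(three_pr, orb)
print(round(r, 4))  # negative: the two metrics move in opposite directions
```

With only eight points, a single player can swing the coefficient noticeably, which is the small-sample caution raised above.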

The next taxonomies focus on points per game, or more accurately stated, the lack thereof. Basketball is won and lost in the paint, but these players won championships without proficient scoring in the paint.

Again, here's how I define the two new notations:

RC (Rim-protecting Center): Pts < 7.0 and BLK > 1.2. These are better known as shot-blockers.
PC (Paint-protecting Center): Pts < 7.0 and BLK <= 1.0. These players put a body on any opposing offensive player who enters the paint, and their intention is to redirect them away from the paint.
Again, the parameters seem to set themselves. The highest point total in the RC and PC groups is 6.8ppg (the remaining 23 unidentified players all score more than 9.7ppg).
In all honesty, I could have defined the BLK parameter as <=0.7 and >=1.0 and only one player would have changed roles. After deep thought, a 1.0bpg average is achievable by one game of 0 blocks and another game of 2 blocks, whereas a rim-protector/shot-blocker should theoretically get a block every game. A 1.2bpg average has a lower chance (theoretically) of including 0-block games than a 1.0bpg average. Also, the term 'prototypical shot-blocker' doesn't exactly describe 2021 BAY's Mark Vital, who happened to be the 1.0bpg player.
On the surface, the metrics look a little less supportive of separate identities than the previous two, but there are some slight differences. First, the DRB per game of the PC group is generally higher, as most PCs are >=2.4 DRB/gm whereas most RCs are <=2.4 DRB/gm. Second, the 2P% appears higher for the PC group as four of six are >=0.567, whereas four of five RCs are below this threshold. I could also make the counter-argument that higher 2P% in both groups is more related to the size of the player than the identity, so take this second assertion with caution.
The correlation analysis is also interesting, but for the wrong reasons. For example, PTS and ORB correlate with a value of 0.5236 for all eleven players, and between groups, it is 0.4439 for RCs and 0.6266 for PCs. However, the same analysis for PTS and DRB shows values of 0.7150 for all eleven, 0.7354 for RCs, and 0.9125 for PCs. DRBs should not correlate with PTS better than ORBs since DRBs and PTS occur at opposite ends of the court. Looking at the correlation analysis between ORBs and A/TO, it shows values of 0.6419 for all eleven, 0.6697 for RCs, and 0.7197 for PCs. The fact that these correlations are higher than ORB-PTS correlations could suggest these types of players are coached to get ORBs in the hands of the guards/perimeter players to reset the possession. Again, interesting, but for the wrong reasons.
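The 1.0bpg versus 1.2bpg argument above can be given a rough number under one loud assumption: that a player's blocks in a game follow a Poisson distribution around his season average. That model is my own simplification and not something the article uses, so treat the output as a back-of-the-envelope check rather than a measurement.

```python
import math


def prob_zero_block_game(blocks_per_game: float) -> float:
    """P(0 blocks in a game), ASSUMING blocks are Poisson with this mean."""
    return math.exp(-blocks_per_game)


for bpg in (1.0, 1.2):
    share = prob_zero_block_game(bpg)
    print(f"{bpg:.1f} bpg -> about {share:.1%} of games with no blocks")
```

Under that assumption, moving the cutoff from 1.0 to 1.2 trims the expected share of block-less games from roughly 37% to roughly 30%, which is in line with the direction of the reasoning above.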

Now, the final 23 players are up for labeling, and let's see the results.

Here's how I defined the parameters of the last two identities and how to interpret them:

OP (Offensive Post-player): PTS >= 9.7 and BLK <=1.2. In layman terms, the OP is a scorer/finisher on offense and a PC (paint-protecting post-player) on defense.
2P (2-Way Post-player): PTS >= 9.7 and BLK >=1.4. As the name would suggest, they can play both sides of the ball: Score on offense and rim-protect on defense.
Although the parameters are identical to those of previous identities, I'm not sure if it is coincidence or if it is another case of the parameters setting themselves. For example, all OPs have A/TO between 0.531 and 0.846 except one: 2001 DUKE's Boozer. But this raises the question: Is his A/TO value the exception to the rule (identity) or is it the result of an external influence? I could make a case for the latter because he is the only OP paired with a CF, and in many games he was on the floor with four other players all of whom had 3PRs above 0.425. This lineup/rotation would boost his AST as he passes out of double-teams to the open shooter.
On the same line of thought, 2Ps have essentially two ranges of values for their A/TO metric: Either range from 0.250 to 0.538 or from 0.942 to 1.278. For the 0.250 to 0.538 group, this equates to 4TOs per 1AST to 2TOs per 1AST. Thus, the 2P was either a finisher (focus on scoring, not passing, so low A) or a facilitator (focus on the highest percentage shot, so high TO, mostly >2.0 TOpg). All but one of these were paired with either an RC or a PC. For the 0.942 to 1.278 group, this again may be the result of an external influence, as all of these 2Ps were paired with another post scorer (WF, CF, 2P, or OP). The bifurcation of this group along the A/TO metric could potentially produce another taxonomical identity, but I'm not sure there is any additional predictive value gained in doing so.
One noticeable pattern in this group is that all members have >=1.9 ORBs. In the previous four identities, WFs reached that mark in 0/4 cases, CFs in 3/4, RCs in 1/5, and PCs in 3/6. Instead of the BLK metric, what if I categorized these players along PTS and ORBs, and would it make a difference? In the WF/CF group, no player achieved >2.6ORB/gm. In the RC/PC group, only two out of 11 players achieved 2.6ORB/gm. In the table above, all players highlighted in red text have <=2.6 ORBs. If you study the table along this parameter, you will see that it only aligns with one other metric: ORB-Ratio, and since ORB is the numerator in that calculation, the two should be close. The correlation analysis also shows this lack of alignment, especially when seeing that DRB has higher correlations with TRB than ORB does.
Up to this point, I've only pointed out one instance of what I thought could be an anomaly in the taxonomy, which was Mark Vital being either RC or PC (and for the most part, I believe it is correct to identify him as a PC). The only other potential anomaly that I see is 2012 UK's Terrence Jones as a CF. He gets this designation due to his 3PR of 0.138. I could just be picky, but the next closest to him is 2014 CONN's DeAndre Daniels at 0.284 3PR, more than twice that of Jones. Excluding Jones, the remaining CFs and WFs have a 3PR in the range of 0.284 to 0.673, with the highest 3PR being nearly five times that of Jones. If the threshold were raised to 0.250 3PR, then he would qualify as a 2P (PTS >= 9.7 and BLK >= 1.4). Ironically, his A/TO is 0.765, which is outside of the two ranges for the 2P group but splits the difference between them. More than likely, I would change the latter range to 0.765 - 1.278 because Jones aligns more with the traits of that group than the 0.250 - 0.538 group. For now, there's nothing really gained or lost by switching him from CF to 2P, and this will be understood later in the article.
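To see how sensitive the Terrence Jones call is to the 3PR cutoff, here is a small sketch that uses only the 3PR values quoted in the text. The other forwards' values are in the table but not in the prose, so they are left out rather than guessed. The helper names are my own.

```python
# 3PR values quoted in the article text.
KNOWN_3PR = {
    "Terrence Jones (2012 UK)": 0.138,
    "DeAndre Daniels (2014 CONN)": 0.284,
    "highest WF/CF value": 0.673,
    "closest value below the cutoff": 0.079,
}


def passes_3pr_filter(three_pr: float, cutoff: float) -> bool:
    """True if the player is pulled into the WF/CF group at this cutoff."""
    return three_pr > cutoff


def captured(cutoff: float) -> list:
    """Names from KNOWN_3PR that the WF/CF filter keeps at a given cutoff."""
    return sorted(n for n, v in KNOWN_3PR.items() if passes_3pr_filter(v, cutoff))


jones = KNOWN_3PR["Terrence Jones (2012 UK)"]
print(captured(0.125))  # Jones is kept at the 0.125 cutoff
print(captured(0.250))  # Jones drops out and falls through to the PTS/BLK rules
print(round(KNOWN_3PR["DeAndre Daniels (2014 CONN)"] / jones, 2))
print(round(KNOWN_3PR["highest WF/CF value"] / jones, 2))
```

The two printed ratios back up the gap described above: the next-lowest forward attempts 3-pointers at about twice Jones's rate, and the highest at nearly five times.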

For a quick recap:

WF: 3PR > 0.125 & BLK <= 0.7, and these parameters produced 4 results having 2P% > 0.583, 3P% between 0.283 and 0.386, ORB <= 1.7, A/TO > 0.940, PTS >= 10.3.
CF: 3PR > 0.125 & BLK >= 1.4, and these parameters produced 4 results having 2P% between 0.476 and 0.525, 3P% > 0.429 (for 3 out of 4, 0.333 for other), ORB >= 2.3 (for 3 out of 4, 1.5 for other), and PTS >= 10.7.
RC: PTS < 7.0 & BLK > 1.2, and these parameters produced 5 results who have TRB <= 3.7 (for 4 out of 5, 5.7 for other), DRB <= 2.4 (for 4 out of 5, 3.0 for other), ORB <= 1.4 (for 4 out of 5, 2.7 for other), and 2P% <= 0.551 (for 4 out of 5, 0.663 for other).
PC: PTS < 7.0 & BLK <= 1.0, and these parameters produced 6 results who have DRB >= 2.3 and 2P% >= 0.566 (for 4 out of 6, 0.422 and 0.487 for other two).
OP: PTS >= 9.7 & BLK <= 1.2, and these parameters produced 10 results who have ORB >= 2.0, DRB >= 3.7, 2P% >= 0.540 (for 6 out of 10, the other four were 0.496, 0.503, 0.520 and 0.524), and A/TO <= 0.846.
2P: PTS >= 9.7 & BLK >= 1.4, and these parameters produced 13 results who have ORB >= 1.9, DRB >= 3.9, 2P% >= 0.540 (for 10 out of 13, the other three were 0.498, 0.502 and 0.511), and A/TO between 0.250 and 0.538 or between 0.942 and 1.278.
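The six definitions in the recap can be written as one small classifier. This is a sketch of how I read the parameters, applied in the same order as the article (3PR filter first, then low scorers, then everyone else). The function name is a placeholder. The thresholds leave small gaps (for example, BLK between 0.7 and 1.4 for a high-3PR forward), and the sketch returns None for those rather than forcing a label. None of the 42 players in the study lands in a gap, but a future champion could.

```python
from typing import Optional


def classify_post_player(three_pr: float, pts: float, blk: float) -> Optional[str]:
    """Map per-game metrics to WF, CF, RC, PC, OP, or 2P (None if in a gap)."""
    if three_pr > 0.125:          # perimeter-capable forwards
        if blk <= 0.7:
            return "WF"
        if blk >= 1.4:
            return "CF"
        return None
    if pts < 7.0:                 # low-scoring defensive centers
        if blk > 1.2:
            return "RC"
        if blk <= 1.0:
            return "PC"
        return None
    if pts >= 9.7:                # scoring post-players
        if blk <= 1.2:
            return "OP"
        if blk >= 1.4:
            return "2P"
    return None


# Illustrative inputs only; these are not lines from the table.
print(classify_post_player(three_pr=0.30, pts=11.0, blk=0.5))
print(classify_post_player(three_pr=0.00, pts=6.8, blk=1.0))
print(classify_post_player(three_pr=0.05, pts=10.0, blk=1.4))
```

Because the 3PR check runs first, a player such as Terrence Jones (3PR of 0.138) is labelled CF before his PTS and BLK are ever compared against the OP/2P rules.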

STEP 3: Putting It All Together for the NCP Model

Since we have the parameters for identifying player roles and the likely metrics that role should produce, let's look at a finalized taxonomy before we discuss.

What else can we learn about the roles and post-production? Let's start with this breakdown:

WF: Paired with CF x1 (2018), OP x2 (2022, 2005), and 2P x1 (2016).
CF: Paired with RC x1 (2014), OP x1 (2001), and 2P x1 (2012).
RC: Paired with PC x1 (2019), OP x1 (2003), and 2P x2 (2011, 2004).
PC: Paired with PC x2 (2021, 2010) and 2P x1 (2015).
OP: Paired with OP x2 (2017, 2009) and 2P x2 (2013, 2008).
2P: Paired with 2P x3 (2007, 2006, 2002).
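As a cross-check, the breakdown above can be transcribed into a year-by-year lookup and tallied. The sketch below is my own bookkeeping of that list (2020 is absent because the tournament was not played). The appearance counts it produces should match the group sizes from the recap: 4 WF, 4 CF, 5 RC, 6 PC, 10 OP, and 13 2P, for 42 players in total.

```python
from collections import Counter

# Post-player pairing for each champion, transcribed from the breakdown above.
PAIRINGS = {
    2001: ("CF", "OP"), 2002: ("2P", "2P"), 2003: ("RC", "OP"),
    2004: ("RC", "2P"), 2005: ("WF", "OP"), 2006: ("2P", "2P"),
    2007: ("2P", "2P"), 2008: ("OP", "2P"), 2009: ("OP", "OP"),
    2010: ("PC", "PC"), 2011: ("RC", "2P"), 2012: ("CF", "2P"),
    2013: ("OP", "2P"), 2014: ("CF", "RC"), 2015: ("PC", "2P"),
    2016: ("WF", "2P"), 2017: ("OP", "OP"), 2018: ("WF", "CF"),
    2019: ("RC", "PC"), 2021: ("PC", "PC"), 2022: ("WF", "OP"),
}

# How many champion post-players carry each identity.
appearances = Counter(role for pair in PAIRINGS.values() for role in pair)

# How many titles each unordered pairing has won.
pair_counts = Counter(tuple(sorted(pair)) for pair in PAIRINGS.values())

for role, count in appearances.most_common():
    print(role, count)
print(pair_counts.most_common(1))
```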

Here are my thoughts:

2Ps have a pairing with every identity, arguably making it the most important identity. When paired with itself, it has won the most titles of any pairing, but none since 2007.
OPs have won a title with every pairing except PC. Although this rule could change, it does make logical sense considering that neither identity protects the rim.
PCs only win with PCs (both take up as much painted space as possible), RCs or 2Ps (if they can't keep opponents out of the paint, then they have rim-protection as a fail-safe). Rules #2 and #3 give a slight nod to the value of shot-blocking.
PCs have all six of their appearances since 2010 (one year after the 3-pt Line expansion). It is most likely a defensive philosophical shift with more area/space to defend.
WFs have three of their four since 2016 (the beginning of the PPB era). In the same vein as Rule #4, newer identities and newer pairings are very good reasons to do this overhaul and stay up-to-date on the ever-evolving profile of National Championship contenders.
WFs, CFs and RCs have never won a title when paired with themselves. If I had to guess, WFs because of no rim-protection, CFs because of lower 2P% (0.476 - 0.525), and RCs because their rim-protection doesn't compensate for a lack of points.
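The six observations above can be checked mechanically against the pairing breakdown. The sketch below is a sanity check of my own, using the same year-by-year transcription of that breakdown. It is not part of the NCP Model itself.

```python
from collections import defaultdict

# Post-player pairing for each champion, transcribed from the pairing breakdown.
PAIRINGS = {
    2001: ("CF", "OP"), 2002: ("2P", "2P"), 2003: ("RC", "OP"),
    2004: ("RC", "2P"), 2005: ("WF", "OP"), 2006: ("2P", "2P"),
    2007: ("2P", "2P"), 2008: ("OP", "2P"), 2009: ("OP", "OP"),
    2010: ("PC", "PC"), 2011: ("RC", "2P"), 2012: ("CF", "2P"),
    2013: ("OP", "2P"), 2014: ("CF", "RC"), 2015: ("PC", "2P"),
    2016: ("WF", "2P"), 2017: ("OP", "OP"), 2018: ("WF", "CF"),
    2019: ("RC", "PC"), 2021: ("PC", "PC"), 2022: ("WF", "OP"),
}
IDENTITIES = {"WF", "CF", "RC", "PC", "OP", "2P"}

partners = defaultdict(set)   # identity -> identities it has won a title with
years = defaultdict(list)     # identity -> one entry per player appearance
for year, (a, b) in PAIRINGS.items():
    partners[a].add(b)
    partners[b].add(a)
    years[a].append(year)
    years[b].append(year)

self_paired = {role for role in IDENTITIES if role in partners[role]}

print("2P partners:", sorted(partners["2P"]))
print("OP never paired with:", sorted(IDENTITIES - partners["OP"]))
print("PC partners:", sorted(partners["PC"]))
print("Earliest PC title:", min(years["PC"]))
print("WF titles since 2016:", sum(y >= 2016 for y in years["WF"]), "of", len(years["WF"]))
print("Never self-paired:", sorted(IDENTITIES - self_paired))
```

If a future champion breaks one of these patterns, updating the single `PAIRINGS` entry is enough to see which observation no longer holds.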

After all, you can't pick a perfect bracket if you can't pick the National Champion correctly. If you made it this far into the article, you're probably a champ in your own right. I also wouldn't be surprised -- from all of this data, all of this analysis, and having to re-organize and re-write the article -- if I overlooked a detail, typed a wrong number/digit, or left out one of the many insights I've had over the past three months. If you do see a typo or maybe an insight that could be added to the six rules above, feel free to leave a comment. As always, thanks for reading my work, and Part 2 of this study should be published around the same time next week.

Feb 17, 2023