Project: Perfect Bracket: Historical Relation of 2021 Seeds

Using the Torvik rating system, which is built on Bill James's familiar Pythagorean win percentage methodology, I gauge the strength of the 2021 seeds relative to their counterparts in previous years. (NOTE: I believe Torvik may have tweaked his formula, because some of the data in the spreadsheet I made years ago differ from the percentages now on his site. As a result, the data in this method may be inaccurate, and relying on its predictions carries higher risk.) Remember when 2018 had historically weak 1-seeds and one of them then bowed out to a 16-seed? This is another attempt at that approach.

The method compares all teams of a particular seed group regardless of year. Since the Torvik ratings go back to 2008 (13 total years), the teams of each seed group are ranked from 1 to 52 (4 teams per year x 13 total years). Thus, if a seed this year is ranked #1, it is the strongest team at that seed in the thirteen years, and if it is ranked #52, I'd pencil it in to be upset, if you get my drift. (NOTE: 11-14 seeds, along with 16-seeds, have different totals because they are or have been play-in game seeds, so the total number of teams is listed "out of ##" to denote this distinction.)
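The pooled ranking described above can be sketched in a few lines of Python. This is a minimal sketch and not the actual spreadsheet: the team labels and efficiency numbers are made-up placeholders, `pythagorean_rating` uses the general Bill James form with the exponent passed in as a parameter (this post does not state the value Torvik uses, so `EXPONENT` is an assumption), and `rank_seed_group` is a hypothetical helper name.

```python
def pythagorean_rating(off_eff, def_eff, exponent):
    """Bill James style Pythagorean win expectation.

    off_eff and def_eff are points scored and allowed per 100 possessions.
    The exponent is supplied by the caller (an assumption, see EXPONENT).
    """
    o = off_eff ** exponent
    d = def_eff ** exponent
    return o / (o + d)


def rank_seed_group(teams):
    """Rank every team of one seed group across all years, strongest first.

    teams: list of (year, name, rating) tuples.
    Returns {(year, name): rank}, where rank 1 is the strongest team.
    """
    ordered = sorted(teams, key=lambda t: t[2], reverse=True)
    return {
        (year, name): rank
        for rank, (year, name, _rating) in enumerate(ordered, start=1)
    }


EXPONENT = 10.0  # placeholder value, NOT taken from Torvik's site

# Placeholder data: these are not real teams or real efficiency numbers.
one_seeds = [
    (2019, "Team A", pythagorean_rating(122.0, 89.0, EXPONENT)),
    (2019, "Team B", pythagorean_rating(118.0, 92.0, EXPONENT)),
    (2021, "Team C", pythagorean_rating(125.0, 88.0, EXPONENT)),
    (2021, "Team D", pythagorean_rating(115.0, 95.0, EXPONENT)),
]

ranks = rank_seed_group(one_seeds)
for (year, name), rank in sorted(ranks.items(), key=lambda kv: kv[1]):
    print(f"#{rank} of {len(ranks)}: {year} {name}")
```

With four teams per year over thirteen years, the same function would produce ranks running from 1 to 52 for a seed group with no play-in teams.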

1-seeds: #3, #41, #42, #45. This year is unique in that three of the four 1-seeds rank in the bottom third among all historical 1-seeds. No other year has that feature. The closest two years by comparison are 2009 (31, 32, 34, 39) and 2012 (10, 24, 40, 46). In only five of the thirteen years under scrutiny have all four 1-seeds made it out of the first weekend alive (2008, 2009, 2012, 2016, and 2019). 2021 doesn't have the group strength of 2008 or 2019, and 2016 was historically the weakest group top to bottom (which explains why three of them crashed and burned in the E8, one win short of the F4).

16-seeds: #6, #28, #47, #51, #52, #55 out of 75. These teams match up well with 2014 (13, 15, 37, 50, 57, 61), with 2013 a close second (17, 21, 23, 30, 60, 67). I'm not too worried about this group doing anything involving glass slippers.

8-seeds: #9, #16, #25, #34. The closest year is 2009 (7, 17, 27, 40) with 2012 right behind (4, 11, 30, 34). Interestingly, this group's comparisons are the same two years as the 1-seeds' comparisons, but no 1-seed was beaten by an 8-seed or a 9-seed in either year.

9-seeds: #6, #15, #16, #25. Only one year comes close to paralleling 2021's 9-seeds, and that is 2010 (7, 13, 18, 24). It was also the year 9-seed UNI knocked off the top 1-seed, KU.

4-seeds: #23, #33, #43, #50. The two years that best approximate this year are 2018 (25, 32, 45, 46) and 2008 (19, 34, 38, 52). 4-seeds were not a threat to 1-seeds in either of these years, winning only three games total in each year. In both years, two 4-seeds lost to 13-seeds, one won a game and then lost to a 5-seed, and one made it to the S16 (with 2008's losing to a 1-seed and 2018's losing to a 9-seed).

13-seeds: #10, #16, #35, #44 out of 53. Two years match this year, in order of closeness: 2017 (14, 18, 26, 51) and 2016 (7, 30, 34, 42). As a group, 13-seeds won zero games in 2017, and they won one in 2016 (an injury-laden and over-seeded CAL team took the floor without its 3-pt specialist Jabari Bird and its playmaker Tyrone Wallace). With vulnerable 4-seeds in 2021 and a 13-seed group stronger on average than its comparisons, I'd expect one 13-over-a-4 outcome in 2021.

5-seeds: #27, #30, #32, #38. The best comparisons to 2021 are three years, in order from closest to most distant: 2010 (22, 23, 25, 40), 2011 (24, 34, 42, 46) and 2014 (33, 35, 45, 50). 2010 was closest on average, 2011 was closest on deviation, and 2014 looks like a minimum threshold compared to 2021 (each seed about 8 ranks worse at every position). 2010 had a 3-1 record vs 12-seeds, with two 5-seeds reaching the F4 and one of them finishing as National Runner-up. 2010's 5-seeds were stronger than 2021's 5-seeds. 2011 also had a 3-1 record vs 12-seeds, but two of those winners bowed out to 4-seeds in the next round, and the other one reached the E8. 2014 had a 1-3 record vs 12-seeds, with the sole winner losing to a 4-seed in the next round. 2021 cannot do any worse than 2014 (especially with the weak 4-seeds this year), and I doubt this group does better than 2010's 5-seeds, which put two teams in the F4 and totaled ten wins as a group. Splitting the difference seems like a good tactic.

12-seeds: #12, #40, #44, #50 out of 55. Unfortunately, 2021 has no good comparisons. On average, it compares closest to 2016, but the deviations are larger than I'd like. The one year (2014) that compares in terms of deviations is 7 points weaker, on average, at each spot (sort of like another minimum threshold). If I throw out one team (XAV) from 2014 because it lost the play-in game and was not part of the field of 64, 2014 becomes a very good match for 2021, and as a result, this model would predict no better than a 2-2 record for 12-seeds against 5-seeds in 2021. When I work out the data inaccuracies and the methodology of this tool over the off-season, it might produce a better model, but as a gentle reminder, this is a high-risk tool in its current state.

2-seeds: #9, #24, #38, #38. The two best years to compare with this year are 2008 (10, 23, 33, 40) and 2009 (5, 19, 43, 45). 2008 saw two 2-seeds bow out in the R32, one in the S16 and one in the E8, but they went up against a really strong class of 7-seeds and 10-seeds. 2009 saw two 2-seeds lose to 3-seeds, another one lose to a 1-seed, and the last one finish as National Runner-up. The competition against 2-seeds in 2009 was a lot weaker, which explains why they had more success in 2009 than in 2008.

15-seeds: #3, #24, #38, #43. This group compares about equally to 2009, 2014, 2015, and 2018. As with the 16-seeds, there is nothing to see here.

7-seeds: #11, #28, #47, #48. This year's 7-seeds can be compared to 2012 (12, 27, 43, 49). It's probably the best approximation of any seed-group to a historical counterpart. In 2012, the 7-seeds split 2-2 with 10-seeds, with one of the winners advancing to the E8.
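One way to put a number on how close two years are is to line up their pooled ranks position by position. The sketch below is an assumption about how such a comparison could be scored, not the method behind the spreadsheet: `compare_years` is a hypothetical helper, and the "average gap" and "position gap" it returns are only one plausible reading of the "average" and "deviation" language used in this post. The ranks are the 7-seed ranks quoted above.

```python
def compare_years(current, other):
    """Compare two years' pooled ranks position by position.

    Both lists are sorted first, so the strongest team in one year is
    matched with the strongest team in the other, and so on down.
    Returns (average_gap, position_gap):
      average_gap  = mean signed difference (positive means `other`
                     is ranked worse, i.e. weaker, on average)
      position_gap = mean absolute difference at each position
    """
    if len(current) != len(other):
        raise ValueError("both years need the same number of teams")
    diffs = [o - c for c, o in zip(sorted(current), sorted(other))]
    n = len(diffs)
    return sum(diffs) / n, sum(abs(d) for d in diffs) / n


seven_seeds_2021 = [11, 28, 47, 48]
seven_seeds_2012 = [12, 27, 43, 49]

average_gap, position_gap = compare_years(seven_seeds_2021, seven_seeds_2012)
print(average_gap, position_gap)  # -0.75 1.75
```

Under this scoring, the 2012 7-seeds sit less than one rank away from 2021 on average and under two ranks away at each position. Seed groups with play-in teams have unequal team counts from year to year, so they would need extra handling before this comparison applies.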

10-seeds: #19, #33, #37, #39. This year can be approximated by 2011 (18, 29, 32, 47), followed, though not as closely, by 2019 (11, 36, 43, 49). In 2011, only one 10-seed won against a 7-seed, and it also won another game to advance to the S16. In 2019, three 10-seeds won, but none advanced past the 2-seeds.

3-seeds: #43, #44, #46, #48. The closest year historically is 2012 (29, 37, 41, 49), and it's not really that close. This year's 3-seeds are historically atrocious (not bad, atrocious). Ironically, all of the 2012 3-seeds won their 3v14 match-ups, but two lost their next game, one went to the S16 and one went to the E8, losing to the eventual National Champion.

14-seeds: #9, #17, #25, #46 out of 53. There isn't a really good comparison to this year. I would suggest 2018 (11, 24, 35, 45), but it misses in the middle by about 8.5 ranks. Ironically, 2012 was historically the strongest year ever for 14-seeds, with ranks 1, 2, 4, 5, and 6, but none could pull off the 14-over-a-3 upset. 2018 did not have a 14-over-3 upset either, but the 3-seeds in 2018 were among the best historically.

6-seeds: #11, #19, #20, #24. For this group, there are also no good comparisons. 2009 was the best all-around year (1, 2, 6, 13). The next two would be 2015 (10, 28, 33, 36) and 2019 (14, 22, 33, 39), but they are much weaker on the low end of the group. I would say 2021 has the 2nd-best class of 6-seeds behind 2009 (and I mean, way behind). Three out of four won in 2009, but 2009 was an all-around strong year, so it's understandable.

11-seeds: #24, #33, #38, #52, #58, #64 out of 67. There are no good comparisons for 2021, and I think it has much to do with the use of play-in game participants. 2018 looks like the best comparison, but take it with a grain of salt.

Conclusion

I really want to fine-tune this tool (re-make the spreadsheet with Torvik's new formula) and re-adjust it so that it only features teams in the field of 64, not the field of 68. I'd also like to show a few tables with it, but for now, my interpretations will have to do. I will repeat one more time: this model is higher risk due to inaccurate data/updated formulas and untested methodology. USE AT YOUR OWN RISK.

My suggestion for using this tool: if there's one seed-group you are undecided on, use the comparisons from this model as a tie-breaker. I want this model to answer the question of why a seed-group did well in one year but not in others. Was the group historically strong or historically weak? Was its competition historically strong or historically weak? But again, updated data, improved methodology, and more tables with fewer words will make this a formidable model.

Mar 18, 2021
