Jan 11, 2016

What A Perfect Bracket Looks Like, Part 1

In the 90 hours between the complete reveal of the bracket and the tip-off of the first game, bracket pickers throughout the world use a variety of information, methods, and strategies to make their 63 picks. Whether you rely on points per game, W-L record, distance traveled, or even the mascot method, bracket history can be another valuable tool in making your picks. The idea is rather simple: If I want my pre-tournament bracket to look exactly like the post-tournament bracket, then I should examine the post-tournament brackets from previous years and conform this year's pre-tournament bracket to them.

Developing the System

When the 2016 Bracket is unveiled, suppose I look back to the final bracket of 2015 and decide to make my 2016 Bracket resemble the Elite 8 of 2015. If so, the Elite 8 of the 2016 Bracket would look like the following: 1v3, 1v2, 4v7, and 1v2. Unfortunately, if this approach had been used in 2015 with the 2014 results, I would have had some major discrepancies, as 2014 produced these final results: 1v11, 4v7, 1v2, 8v2. Assuming I picked the right regions, I could match two of the four Elite 8 pairings (both 2014 and 2015 produced a 1v2 and a 4v7 in the Elite 8), but I would have missed badly with the other two pairings (the 1v11 from 2014 would miss the 1v3 in 2015, and the 8v2 from 2014 would miss the 1v2 in 2015). To take the analysis deeper, each bad pairing actually costs me two games. If I picked the 8v2 in 2015 based on the 8v2 in 2014, I first have to pick the 8-seed to beat the 1-seed and then pick the 8-seed to beat the winner of the 4-5-12-13 group. Since the 1-seed beat both the 8-seed and the group winner, I've incorrectly picked two games with one bad pick.

The question we have to ask ourselves is how to make this historical data mean something. One answer that I have stumbled upon is Aggregation. If we take the seeds of all four pairs (8 teams) and add them together, we get a single value that can be used for year-to-year comparisons.
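As a minimal sketch of the aggregation step, the snippet below sums the seeds of the 2014 and 2015 Elite 8 pairings quoted above. The function name `aggregate_value` and the list-of-tuples layout are my own illustrative choices, not part of any established tool.

```python
def aggregate_value(pairs):
    """Add up the seeds of all eight teams across the four Elite 8 seed pairs."""
    return sum(x + y for x, y in pairs)

# Elite 8 seed pairs as listed in the text above
e8_2014 = [(1, 11), (4, 7), (1, 2), (8, 2)]
e8_2015 = [(1, 3), (1, 2), (4, 7), (1, 2)]

print(aggregate_value(e8_2014))  # 36
print(aggregate_value(e8_2015))  # 21
```

Two very different-looking Elite 8s collapse to two numbers, which is what makes them easy to compare across years.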



Just a casual look at the table shows some interesting patterns. The first is the similarity in aggregate values (AVs) even though the seed pairs are different. In the last 13 years, the final bracket results can be divided into just 5 groupings: 13 and 14 in 2007 and 2009, respectively; 19, 21, 21, and 22 in 2003, 2012, 2015, and 2008, respectively; 25 in both 2006 and 2010; 28 in 2004 and 2013 with 29 in 2005; and 36 in 2011 and 2014. Only in 2012 and 2015 did the seed pairings match exactly (which necessarily results in the same aggregate total of 21). A second interesting pattern is how close in time an AV is to its match. Most of the AVs occur within four years of their matching AV (the exceptions being the 19 in 2003 and the 28 in 2013). If this pattern holds true, it could be a sign that either the 2016 or 2017 Bracket will feature an E8 AV in the 28-29 grouping.
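The five groupings can be reproduced with a short sketch. The AVs below are the ones quoted in the paragraph above; the grouping rule (start a new group whenever the gap between neighboring sorted AVs exceeds 2) is my own assumption, chosen only because it recovers the groupings as described, and the names `av_by_year` and `group_avs` are hypothetical.

```python
# AVs by year, as quoted in the text for 2003-2015
av_by_year = {
    2003: 19, 2004: 28, 2005: 29, 2006: 25, 2007: 13, 2008: 22, 2009: 14,
    2010: 25, 2011: 36, 2012: 21, 2013: 28, 2014: 36, 2015: 21,
}

def group_avs(av_by_year, max_gap=2):
    """Sort years by AV and split into groups wherever the AV jumps by more than max_gap."""
    ordered = sorted(av_by_year.items(), key=lambda item: (item[1], item[0]))
    groups = []
    for year, av in ordered:
        if groups and av - groups[-1][-1][1] <= max_gap:
            groups[-1].append((year, av))   # close enough to the previous AV
        else:
            groups.append([(year, av)])     # gap too large: start a new group
    return groups

groups = group_avs(av_by_year)
for group in groups:
    print([av for _, av in group], [year for year, _ in group])
```

A different `max_gap` would give a different number of groups, so the cut-off is a judgment call rather than a property of the data.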

Putting Things in Perspective

While the data for the last 13 years shows five groupings, how does this information play out in practical terms? Let's first look at a theoretical approach to the Aggregation Model.

In theory, the Aggregation Model can produce AVs from 12 to 124 (and some of those AVs can come from different sets of seed pairs). The 12-AV can only happen one possible way: 1v2, 1v2, 1v2, 1v2. The 124-AV can only happen one possible way: 15v16, 15v16, 15v16, 15v16. With the AVs over the last 13 years ranging from 13-36, the Elite 8 of potentially perfect brackets falls at the low-numbered (strongest-seeded) end of the theoretical range of 12-124, within roughly its top quarter. Perhaps this is why they are aptly named Elite.
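The bounds can be checked by brute force. This sketch assumes the standard bracket layout in which seeds 1, 4, 5, 8, 9, 12, 13, 16 occupy one half of a region and seeds 2, 3, 6, 7, 10, 11, 14, 15 the other, and it counts each set of four pairs once regardless of which region it lands in. The variable names are illustrative only.

```python
from collections import Counter
from itertools import combinations_with_replacement, product

# The two halves of a region: an Elite 8 game pairs one seed from each half
TOP_HALF = (1, 4, 5, 8, 9, 12, 13, 16)
BOTTOM_HALF = (2, 3, 6, 7, 10, 11, 14, 15)

# One entry per possible seed pair (64 in all), stored as the pair's seed sum
pair_sums = [x + y for x, y in product(TOP_HALF, BOTTOM_HALF)]

# Count how many distinct sets of four pairs produce each aggregate value
ways = Counter(sum(combo) for combo in combinations_with_replacement(pair_sums, 4))

print(min(ways), max(ways))      # 12 124
print(ways[12], ways[124])       # 1 1
print((36 - 12) / (124 - 12))    # share of the range covered by AVs of 13-36
```

The last line works out to a little over 21%, which is where the "top quarter" description comes from.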

If we examine the Aggregation Model as far back as the first 64-team bracket in 1985, we find similar results.


The first noticeable trait is that the range of AVs is still mostly intact. From 1985-2002, the range of AVs is 18-40, much like the 13-36 range from 2003-2015. The second noticeable trait is the pattern of continuity in AVs. From 1985-1992, the AVs ranged from 20-25 with two outliers of 37 and 40 in 1986 and 1990, respectively. From 1993-1996, the AVs ranged from 18-22, meaning a lot of 1-4 seeds advanced to the Elite 8. From 1997-2002, the AVs ranged from 26-40, with one outlier of 21 in 1998. (NOTE: In this article, I put forth an explanation as to why the AVs from 1993-2015 took the pattern that they have. In a soon-to-be-written article, I will do the same for 1985-1992.)

With a theoretical approach to the Aggregation Model established, let's move on to an empirical approach. Earlier, I used the concept "Elite 8 Seed Pairs" and gave examples of them, but I never examined the implications of this concept. I will do this now. Elite 8 Seed Pairs (E8SP) are two seeds that can potentially meet when only 8 teams remain in a 4-Region, 16-Seed Bracket. A very common E8SP is 1v2, meaning the 1-seed meets the 2-seed in a particular region. This also means that certain combinations of seeds, such as 1v4 or 2v6, can never meet in the E8, because those seeds would have to face each other in an earlier round. Here is a chart detailing the history of the 64-team bracket in terms of E8SP.


Quick Facts:
  • 1v2 is the most common E8SP (42 total). It has covered 34% of all E8SPs by itself. In fact, if every Elite 8 pairing for the next five NCAA tournaments were a 1v3, the 1v2 would still hold the top spot. However, most of the 1v2s occurred in the first half of the tournament's history. From 1985 to 1999, the E8 featured either two 1v2s or zero 1v2s. I certainly wish we had that kind of predictability now. From 2000-2015, the majority of tournaments have featured one 1v2 (nine total), four have had two 1v2s, three have had zero 1v2s, and one has had three 1v2s (the Chalk Year of 2007). Since 1999 produced a dud for 1v2s, every sixth year has also produced a dud (which would suggest 2017, but this year could be a strong candidate for a dud as well).
  • 1v3 is the next most common E8SP. 1v3s have covered exactly half the amount (21) of 1v2s (42). Together, they cover a little more than half of all E8SPs (50.8%). In fact, only two tournaments have failed to produce either a 1v2 or 1v3 (1986 and 2011).
  • 1v6 is the third most common E8SP (8 total), but it only covers 6.5% of all E8SPs. It may be the third most common, but it hasn't happened since 2005. Are we overdue or is the 1v6 a dying breed?
  • 1v7 and 1v10 together have the same number of occurrences as 1v6 (4 for each). It has also been a long time since either happened: the last two 1v7s came in 2003 and 2004, and the last two 1v10s came in 1999 and 2008. 1999 was also the last year that any two of the 1v6, 1v7, or 1v10 pairs occurred in the same year (1986 and 1987 are the only other two years). This suggests it may not be wise to have two of your Elite 8 games come from these three pairs.
  • 1v11 is an interesting E8SP (6 total). First, it has happened more times than either 1v7 (4 times) or 1v10 (4 times). Second, all six times an 11-seed has made it to the Elite 8, it has faced a 1-seed. Don't despair though: three of the six match-ups have been won by the 11-seed, so the 1-seed is not as much of a guarantee as one would think. Third, three of the six 1v11s have happened in the last ten years, with two of those being won by the 11-seed.
  • 4v2s (6 total) and 4v3s (5 total) are another interesting pair of E8SPs. Starting in 1997, only two 4v2s have occurred, and they occurred in the exact same year that a 1v11 occurred (2006 and 2011). In that same time span, four 4v3s have occurred with two coming in 2012 (the other two in 1999 and 2004).
  • 4v6 has occurred three times, all between 1988 and 1992. 4v7 has occurred four times, three in the last four years and once more in 2005. 4v10 has occurred twice, 1990 and 1997.
  • 5v2 and 5v3 have occurred approximately half as often as their counterparts 4v2 and 4v3. They also account for 75% of all E8SPs involving 5-seeds. The 5v3 matchup has happened every 11th year (1989, 2000, and 2011). I can't wait for the 2022 bracket to be released. Interesting note: 5v7 has never happened, even though an 8v7 has.
  • 8v2 has to be the most mind-numbing E8SP with 5 total occurrences. All other E8SPs in the same area of the chart have 1 occurrence compared to its 5. The first two happened in the first two years. One more occurred in 2004. The most recent two happened in 2011 and 2014, which also happened to be two of the wildest tournaments in NCAA history. If we are expecting 2016 to be another year of weakness at the top, then penciling in a 6th 8v2 in the Elite 8 may be a safe bet.
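The percentages in these quick facts follow directly from the counts. The sketch below uses only the pair counts quoted in the bullets above (pairs whose totals are not stated are left out), with 124 total E8SPs from 31 tournaments of four regions each. The dictionary name `pair_counts` and the helper `share` are illustrative.

```python
TOTAL_E8SP = 31 * 4  # 31 tournaments (1985-2015), 4 regions each

# Counts quoted in the quick facts above; unlisted pairs are omitted
pair_counts = {
    "1v2": 42, "1v3": 21, "1v6": 8, "1v7": 4, "1v10": 4, "1v11": 6,
    "4v2": 6, "4v3": 5, "4v6": 3, "4v7": 4, "4v10": 2, "8v2": 5,
}

def share(pair):
    """Percentage of all E8SPs accounted for by one seed pair."""
    return 100 * pair_counts[pair] / TOTAL_E8SP

print(round(share("1v2"), 1))                 # 33.9
print(round(share("1v2") + share("1v3"), 1))  # 50.8
print(round(share("1v6"), 1))                 # 6.5

# Five more tournaments of nothing but 1v3s would add 5 * 4 = 20 occurrences
print(pair_counts["1v3"] + 5 * 4 < pair_counts["1v2"])  # True

# All pairs involving a 1-seed, and all pairs involving a 4-seed
print(sum(n for p, n in pair_counts.items() if p.startswith("1v")))  # 85
print(sum(n for p, n in pair_counts.items() if p.startswith("4v")))  # 20
```

The last two sums give a handy cross-check on how often 1-seeds and 4-seeds have reached the Elite 8 overall.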
We can take the E8SP empirical analysis a step further by looking at it in terms of seed appearances in the Elite 8. Using a generic description, E8SPs can be defined as XvY, where X and Y are mutually exclusive (a seed can only be in one group at a time) and collectively exhaustive (together X and Y contain all 16 seeds).
  • Group X contains seeds 1,4,5,8,9,12,13, and 16. 
  • Group Y contains seeds 2,3,6,7,10,11,14, and 15.
  • With 31 tournaments (from 1985-2015) and 4 regions per tournament, we should have 124 Xs and 124 Ys.
  • With 124 Xs and 124 Ys, we should have 124 E8SPs.
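The two groups translate directly into a validity check for any proposed seed pair. This is a minimal sketch; the names `GROUP_X`, `GROUP_Y`, and `is_valid_e8sp` are my own.

```python
GROUP_X = frozenset({1, 4, 5, 8, 9, 12, 13, 16})
GROUP_Y = frozenset({2, 3, 6, 7, 10, 11, 14, 15})

# Mutually exclusive and collectively exhaustive over seeds 1-16
assert GROUP_X.isdisjoint(GROUP_Y)
assert GROUP_X | GROUP_Y == set(range(1, 17))

def is_valid_e8sp(a, b):
    """True if seeds a and b sit in opposite groups and so can meet in the Elite 8."""
    return (a in GROUP_X and b in GROUP_Y) or (a in GROUP_Y and b in GROUP_X)

print(is_valid_e8sp(1, 2))  # True
print(is_valid_e8sp(1, 4))  # False: both seeds are in Group X
print(is_valid_e8sp(2, 6))  # False: both seeds are in Group Y
print(is_valid_e8sp(8, 7))  # True
```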
If we tabulate the occurrence of each seed in the E8SPs throughout the 31 tournaments, we get the table below.


Checking our Math:
If we add together the occurrences of seeds from Group X, we get 124 (85+20+8+8+2+1).
If we add together the occurrences of seeds from Group Y, we get 124 (58+31+13+9+7+6).
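The same check can be written out in a few lines. The per-seed counts below are the ones in the two sums above, with the 1-seed count taken as 85, the figure used in the quick facts that follow. Seeds 13-16 are assumed to have zero appearances since they do not show up in either sum.

```python
# Elite 8 appearances by seed, 1985-2015
group_x_counts = {1: 85, 4: 20, 5: 8, 8: 8, 9: 2, 12: 1}
group_y_counts = {2: 58, 3: 31, 6: 13, 7: 9, 10: 7, 11: 6}

expected = 31 * 4  # one X and one Y per region per tournament

print(sum(group_x_counts.values()), sum(group_y_counts.values()), expected)
# 124 124 124
```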

Quick Facts:
  • Every tournament to date has had at least one 1vY E8SP. In fact, 1-seeds make the E8 68.5% of the time with 85 total occurrences. Translation: A 1vY seed pair will happen about 8 times in a 3-year span. From 1985 to 1999, at least three 1vY E8SPs in a given tournament happened 11 times out of 15 attempts (73.3% of the time). From 2000-2015, the occurrence of at least three 1vY E8SPs has been far less dependable, happening 8 times out of 16 attempts (50% of the time). From 2010-2015, it has only happened 2 times in 6 attempts (33% of the time). Thus, if the trend is your friend, pay close attention to over-seeded 1-seeds or vulnerable 1-seeds that have strong 4- and 5-seeds in their region.
  • 2-seeds are expected to reach the E8 along with their counterpart 1-seeds. Unfortunately, 2-seeds do not live up to expectations, as they have reached the E8 only 58 times in 124 tries (46.8% of the time). From 1990-1999, 2-seed occurrences swung from 0 to 4 and back to 0 within that 10-year span. From 2000-2008, there was no discernible pattern as the number of 2-seed occurrences varied from year to year. From 2009-2015, there have been exactly two Xv2s each year. Also, every bracket starting from 2000 has had at least one Xv2. Talk about bracket-prediction gold.
  • 3-seeds usually pick up the slack when 2-seeds let us down. Combined with 2-seeds, they account for 89 occurrences in the E8 (71.8% of all occurrences), which is comparable to 1-seeds by themselves. From 1986-2002, 3-seeds were not very dependable at all, cleaning up only 15 of the 38 occurrences (39.5%) left behind by failed 2-seeds. From 2003-2015, they were more dependable, capturing 16 of the 28 occurrences (57.1%) left behind by failed 2-seeds. As far as bracket predictability, at least one Xv3 has happened in every tournament in that time span, excluding 2014.
  • 4-seeds are to 1-seeds as 3-seeds are to 2-seeds. Of the 39 E8SPs that did not feature a 1-seed, 20 featured a 4-seed. In other words, the remaining 19 X-pairs are shared among 5-, 8-, 9- and 12-seeds. The chart shows an On/Off Pattern for 4vYs, where at least one 4vY pair happens for a few consecutive years (ON) and then zero 4vY pairs happen for a few consecutive years (OFF). With 5 straight ON-years, if the pattern holds, look for the OFF-trend to happen soon.
  • 5-seeds are an enigma wrapped in a puzzle. They occur less often in E8SPs than 6- and 7-seeds and exactly as often as 8-seeds. Most of this can be explained by the fact that they get beaten by their first round opponent (the 12-seed) more than they should. You can't make an E8 if you can't get by the first round! From 2006-2015, only two tourneys featured a 5vY, and 2010 featured 2 of them. Three out of 40 attempts is not a risk I would recommend taking.
  • 6-seeds have been historically blessed (double entendre). Of the lower seeds (non 1-4 seeds), they have reached the E8 more than any other. However, the majority of these feats happened in the earlier years. In the 12 tournaments from 1986-1992 and from 1997-2001, Xv6 E8SPs happened in each of those years except three (1989, 1991, and 1998). In two of those years (1988 and 1992), two Xv6s occurred. Since 2002, only two tournaments featured an Xv6, the last time being 2010.
  • 7-seeds could be called the lucky sevens. From 1985-2002, Xv7 E8SPs happened 3 times in 72 attempts. From 2003-2005 and from 2012-2015, each of those tournaments featured an Xv7 except for 2013. Although there have been nine occurrences in total, an Xv7 has never happened more than once in the same tournament.
  • 8vY and 9vY are self-explanatory. When your 2nd game of the tournament puts you against the 1-seed, which boasts 85 E8SPs, your numbers are not going to look as great. 8vY has fared far better than 9vY, with 8 occurrences compared to 2. As the chart shows, 8vY tends to happen in bunches, especially in 2000 when two of them happened.
  • Xv10s (7 occurrences) happen almost as much as Xv7s (9 occurrences), which is shocking considering 7-seeds and 10-seeds face each other in the first round of the tournament. Xv10s do show the inverse of the Xv7 pattern. Where Xv7s have been popular lately (6 occurrences since 2003), Xv10s have only had one occurrence in the same time span.
  • Xv11s (6 times) happen almost as much as Xv10s (7 times). Four of those six have occurred from 2001-2015, with two in the last five years.
  • Xv12s have happened only once (2002). This may sound like something Yoda would say, but "The path of the 12 is one of most resistance." First, beat a 5 (8 5vY pairs). If you do, then you most likely have to beat a 4 (20 4vY pairs). If you do this, then you most likely have to beat a 1 (85 1vY pairs). The path of the 12 goes through 113 of the 124 X-pairs. It is safe to say that the lone time a 12-seed made it was the exception to the rule.
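The rates quoted in these quick facts can be reproduced from the raw counts. This is a small sketch using only numbers stated above (85 1-seed, 58 2-seed, and 31 3-seed appearances out of 124 slots, plus the three counts on the path of the 12); the variable names are illustrative.

```python
TOTAL_SLOTS = 31 * 4  # Elite 8 slots per group across 1985-2015

one_seed_rate = 85 / TOTAL_SLOTS
two_seed_rate = 58 / TOTAL_SLOTS
two_or_three_rate = (58 + 31) / TOTAL_SLOTS

print(f"{one_seed_rate:.1%}")      # 68.5%
print(f"{two_seed_rate:.1%}")      # 46.8%
print(f"{two_or_three_rate:.1%}")  # 71.8%

# Expected number of 1vY pairs over three tournaments (4 regions each)
expected_1vy_3yr = one_seed_rate * 4 * 3
print(round(expected_1vy_3yr, 1))  # 8.2

# X-pairs belonging to the seeds a 12-seed most likely has to beat: 5, 4, then 1
twelve_path = 8 + 20 + 85
print(twelve_path, "of", TOTAL_SLOTS)  # 113 of 124
```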
Stay tuned for Part 2 of this analysis where I apply a predictive tool to the Aggregation Model. If the predictive tool can show a correlation with the Aggregation Model for any given year, then the predicted aggregation should give us an idea of which E8SPs to expect for that given year.
