Nov 20, 2017

Unorthodox Bracket-Picking Methods

If you have followed Bracket Science in the past or PPB currently, you are already familiar with the mainstream bracket-picking tools, such as F4/Champ Contender/Pretender Rules, Upset/Victim Rules, QC/SC Analysis, and Aggregate Value Estimation. Tools like these are mainstream for a number of reasons, including reliably accurate picks, time/cost efficiency in creation, and simplicity in application. Not all bracket-picking methods have these qualities, and those that lack them are usually passed over for ones that have them. Since we at PPB are always trying to break the norms and raise our heads above the crowd to gain newer or better insights, I thought I would dig up one of my first bracket-picking systems and use it as the basis of this article on unorthodox methods.



INTRODUCTION

Following the completion of the 2008 NCAA tournament, I wanted a way to visualize tournament teams that would give a clearer meaning to their records and resumes. Instead of seeing a 1-seed with a 31-3 record, I wanted a visualization that would indicate the quality of those 31 wins and 3 losses (as you will see below). This new method was unorthodox in the sense that I saw the match-ups, the regions, and the entire field like I had never seen them before. In fact, this method is so unorthodox that I never even gave it a name until I wrote this article: The Loss-Mapping Methodology. I worked on this system for four tournaments and then, for the first time, applied it to picking the 2012 NCAA tournament (five total years of work). Though it proved pretty successful for my 2012 picks, I ultimately abandoned it because I felt its results were based more on interpretation and luck than on anything scientific. In this article, I will explain how to replicate it, post my example brackets from 2008-2012 for visual clarity, and examine its potential applications.

METHODOLOGY

I have tried to make the three steps below as detailed as possible, so if you try to replicate the method, pay close attention to the wording and use the visual examples. I believe this methodology could also be done in spreadsheet software (with a ton of formulas), but I have described it as if it were being done with pen and paper.
  1. To begin with, use a bracket print-out that includes each team's W-L record beside its name. This will drastically reduce the workload. The first step involves making a 4-column table (Seed Group Loss Table, SGLT) in the area below the lines where the F4/Champ picks are written. The first column lists the seeds 1-12. The second column (labelled "L") records the total number of losses for the respective seed group, so if all four 1-seeds each had three losses, then a "12" (3+3+3+3) would go in the L-column on the line for 1-seeds. The fourth column (labelled "NL") records the seed group's total number of losses to teams not in the tournament field. If any of the 12 losses of the 1-seeds came against teams that are not in that year's NCAA tournament, count all of these losses and put the sum in the NL-column on the line for 1-seeds. The third column (labelled "TL") is the seed group's total number of losses to teams that are in that year's NCAA tournament. It is simply the difference between the second column and the fourth column. If the 1-seeds had 12 total losses and only 1 of those losses was to a team that is not in that year's NCAA tournament, then place an "11" (12 minus 1) in the TL-column on the line for 1-seeds. Continue this process for seeds 2 through 12. This completes the first step.
  2. The second step is the first part of the mapping process, and it is very labor-intensive. As in the last step, start with the 1-seeds and work towards the 12-seeds (although I usually stopped at the 4-seeds). For each 1-seeded team, look at its schedule, and for each tournament team that beat it, place a "1" beside that team's name. Using the 2017 NCAA tournament as an example, NOVA had three losses during the regular season, one to MARQ and two to BUT. Therefore, find MARQ in the bracket and place one "1" next to its name, and find BUT in the bracket and place two "1"s next to its name. Do the same for KU, UNC, and GONZ, plus one additional step. Each of these three teams lost to at least one team not in the tournament field: KU lost to TCU and IND, UNC lost to IND and GATC, and GONZ lost to BYU. If a team lost to a non-tournament team during the regular season, write the number of such losses on the other side of the team's name from the "win vs seed" notations. For example, write "2-loss" next to the names Kansas and North Carolina and write "1-loss" next to the name Gonzaga. Repeat the process for seeds 2 through 12, writing the seed value of the team that lost: just as a "1" was written for each regular-season win over a 1-seeded team, write a "2" for each win over a 2-seeded team, a "3" for each win over a 3-seeded team, and so forth. This step can be performed concurrently with Step 1 and provides a quality-control check on Step 1 as well.
  3. This is the final step of the Loss-Mapping methodology, and it is the second part of the mapping process. For each region in the bracket, create a two-column mini-table (Regional Win Vs Seed Table, RWST) that totals the "win vs seed" notations for all teams in that region. These notations were created in the second step of this method. The first column lists the seeds 1 through 12. The second column contains the total number of wins by all teams in that region against that specific seed. For example, in the 2017 South Region, BUT has two 1s beside its name (NOVA, NOVA) and UK has one 1 beside its name (UNC) for a total of three. As a result, write "3" in the second column of the South Region RWST on the line for the 1-seed. Likewise, UNC has two 2s beside its name (LOU, DUKE), BUT has one 2 beside its name (ARI), and UCLA has two 2s beside its name (ARI, UK) for a total of five. Write "5" in the second column of the South Region RWST on the line for the 2-seed. Repeat this process for seeds 3 through 12 in the South Region, as well as for the three other 2017 regions.
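For anyone who would rather script the three steps than work them by hand, here is a minimal Python sketch. It is an illustration under stated assumptions, not the original pen-and-paper process: the function names and data layout are hypothetical, and the loss lists contain only the 2017 results named in the steps above, so the SGLT totals it prints are partial and not the real 2017 totals. MARQ's region is not stated in the text and is filled in as East.

```python
from collections import Counter, defaultdict

# Seed of each team whose losses are being mapped (1- and 2-seeds named in the text).
SEED = {"NOVA": 1, "KU": 1, "UNC": 1, "GONZ": 1,
        "LOU": 2, "DUKE": 2, "ARI": 2, "UK": 2}

# Region of each tournament team that shows up as a winner. Any winner missing
# from this dict is treated as a non-tournament team. MARQ's region is an
# assumption; it is not given in the article.
REGION = {"BUT": "South", "UK": "South", "UNC": "South", "UCLA": "South",
          "MARQ": "East"}

# Teams that beat each mapped team, one entry per loss. PARTIAL data: only the
# losses named in the article, not complete schedules.
LOSSES = {
    "NOVA": ["MARQ", "BUT", "BUT"],
    "KU": ["TCU", "IND"],
    "UNC": ["UK", "IND", "GATC"],
    "GONZ": ["BYU"],
    "LOU": ["UNC"],
    "DUKE": ["UNC"],
    "ARI": ["BUT", "UCLA"],
    "UK": ["UCLA"],
}


def seed_group_loss_table(losses, seed, region):
    """Step 1: per seed group, L (all losses), TL (to field teams), NL (to others)."""
    table = defaultdict(lambda: {"L": 0, "TL": 0, "NL": 0})
    for team, beaten_by in losses.items():
        row = table[seed[team]]
        for winner in beaten_by:
            row["L"] += 1
            row["TL" if winner in region else "NL"] += 1
    return dict(table)


def win_vs_seed_notes(losses, seed, region):
    """Step 2: for each tournament team, the seed of every mapped team it beat."""
    notes = defaultdict(list)
    for team, beaten_by in losses.items():
        for winner in beaten_by:
            if winner in region:
                notes[winner].append(seed[team])
    return dict(notes)


def regional_win_vs_seed_table(notes, region):
    """Step 3: per region, total the win-vs-seed notations by seed."""
    table = defaultdict(Counter)
    for winner, seeds_beaten in notes.items():
        table[region[winner]].update(seeds_beaten)
    return dict(table)


sglt = seed_group_loss_table(LOSSES, SEED, REGION)
notes = win_vs_seed_notes(LOSSES, SEED, REGION)
rwst = regional_win_vs_seed_table(notes, REGION)

print("SGLT (partial data):", sglt)
print("Win-vs-seed notes:", notes)
print("South RWST:", dict(rwst["South"]))
```

With this partial data, the South Region RWST comes out to 3 on the line for the 1-seed and 5 on the line for the 2-seed, which matches the worked example in Step 3. Because TL and NL are counted separately here, checking that L equals TL plus NL gives the same quality-control check mentioned in Step 2.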
Below is the very first loss map that I created following the 2008 tournament. 2009-2012 are at the bottom of the article.

APPLICATION

This is a very laborious and time-consuming method, which is one reason I only did it during the off-season instead of during Bracket Crunch Time. Since other methods are quicker and provide just as accurate information on match-ups or tournament quality, I prefer to be efficient with my Bracket Crunch Time and place this method on the back-burner. However, I do believe that if this method were built in a spreadsheet and tracked over the duration of the season, it would not be that much of a time-drainer during Crunch Time. So, the last thing to examine about this methodology is how to apply it. Below are some possible applications, but keep in mind that they are untested and unverified. They are mere assumptions that I have made about how it could work, because I have never fully developed this tool. Once the three steps above are completed for all years from 2002 to the present, I'm sure there will be more than enough information to discover patterns and draw insights.
  • Looking at the SGLT first, it seems as if the total losses for each seed group could be an indicator of tournament quality. If the total number of losses for the 1-seed group is higher than average, this would seem to indicate parity rather than strength (a sign of an upset-heavy dance). Also, I would think a higher-than-average total in the NL-column (losses to teams not in that year's NCAA tournament) would be an indicator of parity. Logically speaking, if 1-seeds lost to teams that didn't make the tournament, then they should be more likely to lose earlier in the tournament, because tournament teams should be better (simply by making the tournament) than the teams that actually beat the 1-seeds. The losses to tournament teams seem like a control variable. If tournament-bound teams play a higher number than usual of games against other tournament-bound teams, the loss totals will be higher in general for that year because one of those teams has to lose each game.
  • Looking at the RWST next, this seems like a way to make historical comparisons for each region in the bracket. For example, when I first applied this method to making picks, the 2012 MidWest Regional (with a 1-6-5-6 RWST vs 1-4 seeds) showed a lot of similarities to the 2010 East Regional (with a 2-5-4-6 RWST vs 1-4 seeds). The two regions played out almost identically as well. Both saw a 2-seed defeat a 1-seed to go to the F4. The two regions' S16 AMs were 1-13-11-2 and 1-12-11-2, respectively. However, this is also when I realized that I may have gotten lucky with this method. In all likelihood, a healthy Kendall Marshall would have meant a different outcome for the 2012 MidWest Regional Final. All of that aside, a method that can demonstrate parallels between a regional in the current year and a regional in a past year is extremely valuable. In the last six years alone (24 regions), nine have resulted in a 1v2 match-up (37.5% occurrence), with the 1-seeds having the edge 5-4. The second-most common regional match-up in the last six years is 4v7 with four occurrences (16.7%). Like I said, being able to identify a region that could produce a certain result based on its similarity to past regionals would be a very valuable tool, since 1v2 and 4v7 account for 13 of the last 24 regional final match-ups (more than half) and it predicts an F4 and an E8 contender (7 correct picks).
  • Looking at the map as a whole, this may be the most difficult place to find any patterns. Here are a few ideas, but I have no clue if they would even hold water.
    • A region may have numerous wins (five or more) versus a particular seed, but if all (or a large percentage) of those wins are in the opposite half of the region, it could mean that particular seed has a safe path to the Elite 8.
    • The number of teams with "win vs seed" notations beside their names may be indicative of parity for a particular region. For example, if six teams out of the region's sixteen have a 1 through 4 "win vs seed" notation beside their names, it could indicate a region of giants, where the S16 AM is 1-4-3-2 and the regional final match-up is 1v2. Likewise, if ten teams out of the region's sixteen have a 1 through 4 "win vs seed" notation beside their names, it could indicate a region of parity, where upsets happen frequently and either the 1-seed or 2-seed (or both) does not make the regional final.
    • One more thing to mention is the obfuscation of detail in the loss map. Using the 2012 West Regional as an example, MIST has four of the eight wins versus 4-seeds in that region, yet in a twist of irony, they bowed out to a 4-seed. How did MIST show so much strength against 4-seeds during the regular season yet manage to lose to a 4-seed when it mattered most? The answer is simple: The 4-seed they played against (LOU) was the only 4-seed in the tournament not from the B10 conference (IND, WISC, MICH). It is very easy to rack up wins against 4-seeds if a lot of teams from your conference are awarded those seeds. Details like that are masked when we simply assign a "win vs seed" notation to the team, so keeping an eye on them may clear up any inconsistencies in this method.
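The regional comparison idea above can be sketched in a few lines of Python. This is only an illustration under assumptions: the original comparison was made by eye, so the Manhattan distance used here is a stand-in metric, the function names are hypothetical, and the "made-up region" profile is invented purely to give the search something to reject. The two real profiles and the match-up counts are the ones quoted in the RWST bullet.

```python
def profile_distance(a, b):
    """Manhattan distance between two RWST profiles (an assumed metric)."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same seeds")
    return sum(abs(x - y) for x, y in zip(a, b))


def most_similar(target, history):
    """Return (label, distance) for the past region closest to the target."""
    scored = ((label, profile_distance(target, profile))
              for label, profile in history.items())
    return min(scored, key=lambda item: item[1])


# Wins vs 1-, 2-, 3-, and 4-seeds, as quoted in the article.
midwest_2012 = (1, 6, 5, 6)
history = {
    "2010 East": (2, 5, 4, 6),
    "made-up region": (5, 1, 0, 2),  # hypothetical, for contrast only
}

closest = most_similar(midwest_2012, history)
print("Closest past region:", closest)

# Regional-final frequencies over the 24 regions cited in the article.
regions = 24
share_1v2 = 9 / regions
share_4v7 = 4 / regions
share_both = (9 + 4) / regions
print(f"1v2: {share_1v2:.1%}, 4v7: {share_4v7:.1%}, combined: {share_both:.1%}")
```

Under this metric the 2012 MidWest and 2010 East profiles sit 3 apart (1+1+1+0). A distance on seed counts alone still hides the conference effect described in the MIST example, so a fuller version would probably need to keep the names of the beaten teams and not only their seeds.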
CONCLUSION

As I said at the top of the article, there are certain qualities in bracket-picking methods that I consider to be must-haves: reliable accuracy in picks, time/cost efficiency in creating/producing it, and ease in using and amending. The biggest knock I have on the loss-mapping method is the time it takes. After all, the time required was the sole reason that, for four seasons, I only did it after the season had already been completed. More importantly, I wouldn't want to put in the time to do the remaining brackets from 2002 to 2017 only to have it produce no reliable results. However, it is unique. It is also a solid way to eliminate the team names and any bias that might result from those names. And as I said above, the only reason I abandoned it was hindsight, even though it produced positive results for the one and only season I applied it. Nonetheless, it makes for the perfect example of unorthodox bracket-picking methods. As always, thanks for reading. The next article should be out on Dec 4, and I hope that doesn't upset anyone. (Hmmmm! Is that another teaser?)

EXTRA LOSS MAPS







