Mar 12, 2018

2018 Quality Curve Analysis - Final Edition

Well, the bracket has been released, the match-ups are set, and now it is time to see how the next three weeks will likely play out. If you have read the previous three editions (and if you haven't, here are the links to three very good reads: Jan, Feb and Mar), you will recognize the chart below. It is the Final 2018 Quality Curve.



What is the QC Telling Us?

The same thing I have said all year long: "Parity exists in 2018 and parity translates to an above average number of upsets."



Here are the important points:
  • The Final QC should be close to the Mar QC. With only two weeks of games between these two curves (whereas the Jan-to-Feb and Feb-to-Mar QCs had one month of games in between), there should not be much deviation in the results.
  • The fact that the top teams were playing their best near the end of January (Feb QC) is troubling. However, it is a better situation than it was at the end of December (Jan QC). The teams ranked along the 7-10 spots are also playing at their peak, which is signaled by the Final QC being higher than all other QCs along this range. From the 22nd to the 40th spots, we have seen gradual deterioration over the course of the season, as visualized by the Jan QC being highest and the Final QC being lowest along this range.
  • Lastly, seven of the teams that make up the QC are not in the NCAA tournament, meaning the actual field may be slightly weaker at the bottom than it could have been. Five of these seven were Bubble teams.
Let's see how the 2018 QC compares to the 2017 QC.

Wow! The gap between the two curves looks like the Grand Canyon. What could this mean? The 2017 QC showed significant strength among the Top 26 teams in the country. In 2018, that strength stops at the Top 6. From 7th to 27th, the teams are much weaker, which makes them more prone to upset. Ironically enough, from 28th to 50th the two curves are pretty much in line with one another. Since the former group is weaker this year and the latter group is approximately the same, there is relative weakness among the middle teams in 2018.

In 2017, all 16 teams seeded on the top four lines (1- thru 4-seeds) advanced past their opening round opponent. The last time this happened was 2007, which happened to be another statistically strong year. I knew the total number of upsets in 2017 would be lower because it was a strong year, but I never called for all 16 to advance past their first game because of the uniqueness of this era in college basketball (in retrospect, I probably should have seen it coming). I can safely predict from the relative weakness in the 2018 QC that it won't happen this year. With the sheer strength of 2017 teams, only four upsets happened in the R64 and all four involved 5- and 6-seeds (remember, wins by 9- and 10-seeds over 8- and 7-seeds do not fit our definition of upsets, which requires a seed differential of 4 or greater). With relative weakness showing in the 2018 QC, there is a good probability that the R64 upsets will not be isolated to the 5- and 6-seeds in the 2018 tournament.

Unfortunately, the QC does not tell us where the UPMs are located in the bracket. The Selection Committee doesn't seed teams based on their efficiency rankings, so we have to create a different curve to see where the strengths and the weaknesses are waiting. Let's do that now!
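For anyone who wants to count upsets the same way, here is a minimal sketch of the definition used above (a seed differential of 4 or greater). The function name and the sample games are made up for illustration; they are not results from any actual bracket.

```python
def is_upset(winner_seed: int, loser_seed: int) -> bool:
    """Return True when the winner's seed is at least 4 lines worse than the loser's."""
    return winner_seed - loser_seed >= 4


# Hypothetical (winner_seed, loser_seed) pairs, for illustration only.
games = [(12, 5), (11, 6), (10, 7), (9, 8), (13, 4)]

# Keep only the games that meet the 4-or-greater differential.
upsets = [game for game in games if is_upset(*game)]
print(upsets)  # [(12, 5), (11, 6), (13, 4)]
```

Under this definition a 10-over-7 or 9-over-8 result never counts, which is why the 7- thru 10-seeds drop out of the R64 upset tally.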

The 2018 Seed Curve


The chart above is the 2018 Seed Curve (SC) with each individual seed overlain on it. The action of the curve is important. If you go back to this article, I did a comparative analysis of all seed curves from 2003 to 2014, and within the groupings, the curves fluctuate the same way. In stronger years (2007, 2009, 2012, 2015), there is whipsaw action in the middle of the seed curve: sharp declines followed by sharp spikes followed by sharper declines. In weaker years (2006, 2010, 2011, 2014), there is a smoothness to the SC with some structural deviations (sometimes bow-like curves, sometimes saw-tooth curves) and elevation at the back-end of the curve. The 2018 SC has the smoothness of the weaker curves, but it only has one structural deviation (a bow at the front), and it is missing the elevation at the back-end. Believe it or not, the 2018 SC is "significantly lower" than the 2017 SC at all seed lines from the 6-seeds to the 12-seeds except at the 9-seed and 12-seed lines. Only at the 14- and 15-seed lines is the 2018 SC stronger than the 2017 SC. For comparative purposes, I'm not quite sure to which SC-grouping the 2018 SC belongs. Let's take a different perspective on the SC.


This is the 2018 SC with the 2018 QC overlain upon it. I used this same technique on the 2017 SC to find the points of strength and weakness. Do you see what I see? I'll say it in one word: Tightness. With the gross exception of the 1-seed group, the SC and the QC practically hug one another. This absolutely bothers me!!! It bothers me because I don't recognize it, or to put it better, I don't instantly recall any past tournament with such tightness between the SC and the QC.

I do have a theory though! As I have said before, the purpose of the QC Analysis is to find the gaps between our knowledge of tournament quality and the Selection Committee's knowledge of tournament quality. If the gap is large, there should be wide deviations between the QC (the tournament field if it was determined by efficiency rankings) and the SC (the efficiency curve skewed by some other measure of team quality). If the knowledge gap is smaller, there should be narrower deviations between the QC and the SC, or "tightness". 2018 is the first year in which advanced metrics rankings were included on team sheets provided to the Selection Committee. Did the presence of this information get the Selection Committee closer to our understanding of tournament quality?

I have not studied each and every team sheet of the tourney-bound teams, but I have seen one thing in the efficiency rankings that suggests this theory is plausible. Looking at the top 20 teams in efficiency rankings (which would translate to 1- thru 5-seeds in the QC), no team has worse than a 6-seed. Typically, one or more of these Top 20 efficiency teams receives a seed of 7, 8, 9, 10 or 11. Not in 2018! I did a lot of digging through my data sets (which is why it took so long to get this article out), and 2009 is the most recent tournament in which no Top 20 efficiency team received worse than a 6-seed.

While I don't think 2018 and 2009 fit the same mold when it comes to all-around team quality, if the committee's seed matches the efficiency seed, then the match-ups feature true seeds rather than disguised seeds. My interpretation: This lack of over-seeds and under-seeds should mitigate some of the craziness. I would have instantly called for 13-15 upsets, but if true seeds are facing off against one another, we could instead be looking at 11-13 upsets this year. Let's move on to identifying the areas of strength and weakness in the 2018 SC.
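To make the "efficiency seed versus committee seed" idea concrete, here is a rough sketch. It assumes the simple mapping implied above, four teams per seed line so that the Top 20 in efficiency covers the 1- thru 5-seeds; real fields complicate this (play-in games add extra teams on some lines). The function names and sample ranks are hypothetical.

```python
import math


def efficiency_seed(rank: int) -> int:
    """Seed line a team would get if the field were seeded purely by efficiency rank.

    Assumes four teams per seed line (ranks 1-4 -> 1-seed, 5-8 -> 2-seed, ...).
    """
    return math.ceil(rank / 4)


def seed_gap(rank: int, committee_seed: int) -> int:
    """Committee seed minus efficiency seed.

    Positive: under-seeded (the team is better than its seed line).
    Negative: over-seeded. Zero: a "true seed".
    """
    return committee_seed - efficiency_seed(rank)


# Hypothetical examples, not actual 2018 rankings.
print(efficiency_seed(20))   # 5  (the last team inside the Top 20)
print(seed_gap(20, 6))       # 1  (under-seeded by one line)
print(seed_gap(36, 6))       # -3 (a 6-seed that grades out as a 9-seed)
print(seed_gap(10, 3))       # 0  (committee seed matches efficiency seed)
```

A bracket full of zeros and small gaps is what the "tightness" between the SC and the QC would look like team by team.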

Strengths and Weaknesses of the 2018 SC

Looking back on the 2017 SC, the seed-lines where the SC surpassed the QC (4, 7, 10, and 11) had a relatively successful tournament:
  • All four 4-seeds made the S16 with one advancing to the E8
  • Two 7-seeds knocked off two 2-seeds in the R32 with one advancing to the F4
  • Another 7-seed led their 2-seed most of the game until losing it down the stretch.
  • Only one 10-seed advanced to the R32 (10s have to play 7s, and only one of each pair can win), but this particular 10-seed went toe-to-toe with the bracket's strongest 2-seed, narrowly losing.
  • Three 11-seeds knocked off three 6-seeds, with one of them advancing to the E8.
What does the 2018 SC/QC spread show us? The strength lies in the 2-seeds, 3-seeds, and 5-seeds with the 4-seeds and the 9-seeds narrowly coming up short of the mark. In a nutshell, I expect these three groups, each taken as a whole seed-line, to meet seed-expectations. Thus, 2-seeds should make the E8 (twelve total wins among all 2-seeded teams), 3-seeds should make the S16 (eight total wins among all 3-seeded teams), and 5-seeds should make the R32 (four total wins among all 5-seeded teams). I don't think I am going out on a limb in saying this: I think the 5-seed group should actually surpass seed expectations (more than four total wins in the tournament).
  • The 2-seed group as a whole is really strong. In fact, three of the Top 6 QC teams (mentioned above in the QC) received a 2-seed, when in reality, only two of the Top 6 should have received a 2-seed (assuming the other four of the Top 6 all get 1-seeds). If you include the 7th best team (which is where the separation between the 2017 QC and 2018 QC begins), you have all of your 2018 2-seeds accounted for and you haven't even finished out the Top 8 efficiency teams. For historical reference, 2-seed groups in years with strong 2-seeds (2012, 2015, 2016) have met the challenge, with 2015 being their worst performing year only because the 1-seeds were so strong in that year. Twelve total wins by the group doesn't seem out of the question.
  • The 3-seed group as a whole is skewed upward, meaning one team (MIST) is elevating the group average much higher than it would otherwise be (evidenced by 3 dots below the SC and 1 dot above). The one downside to the 2018 3-seeds is they share the same half of the bracket with the 2-seeds. Another problem could arise with the 6-seeds and the 11-seeds (discussed later). While eight total wins may be pushing it for this seed-group, a title run like 2011 CONN would account for six of the eight total wins. If you want to err on the side of caution, six total wins is much safer.
  • The 5-seed group as a whole is also strong. Like the 2-seeds did, all four 5-seeds are accounted for on the QC before reaching the Top 20. Only four times since 1985 have all 5-seeds advanced past their 12-seed counterparts: 1988, 2000, 2007, and 2015. Though the timing trends suggest we must wait until 2022 or 2023 (7-8 years apart) to see this feat, I do like the strength of this group. Considering that I think they will meet seed expectations as a group in 2018, I don't expect them to do it all in the R64 games, so I'll take three 5-seeds and a 12-seed.
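The seed-expectation arithmetic above (twelve, eight, and four total wins) can be sketched in a few lines. The 2-, 3-, and 5-seed figures come straight from the text; the other seed lines in this function simply follow the same "everyone advances until they meet a better seed" logic and are my assumption, as is the function name.

```python
TEAMS_PER_SEED_LINE = 4


def expected_wins_per_team(seed: int) -> int:
    """Wins a team needs to meet seed expectations if every game goes to the better seed."""
    if seed == 1:
        return 4  # reaches the F4
    if seed == 2:
        return 3  # reaches the E8
    if seed <= 4:
        return 2  # reaches the S16
    if seed <= 8:
        return 1  # reaches the R32
    return 0      # expected to lose in the R64


for seed in (2, 3, 5):
    total = expected_wins_per_team(seed) * TEAMS_PER_SEED_LINE
    print(f"{seed}-seeds: {total} total wins")
# 2-seeds: 12 total wins
# 3-seeds: 8 total wins
# 5-seeds: 4 total wins
```

A seed group "surpasses expectations" when its actual win total across all four teams beats the number printed for its line.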
Let's look at two groups whose portion of the curve is skewed downward by an outlier:
  • The 6-seed group has three notable teams: HOU, FLA, TCU. I say notable because these three outrank the most efficient 7-seed, and it is unusual for this seed group to have that many teams outrank the best 7-seed. However, none of these three outrank the worst 3-seed. Their group average is pulled down by the remaining 6-seed -- MIA -- whose efficiency ranking suggests it should be a 9-seed.
  • The 11-seed group has five teams clustered around its average (four above it, one below it) while its average is skewed downward by one team, SBON. I also don't want to mislead you with that statement, so I should point out that in 2017, the worst 11-seed upset the best 6-seed (USC over SMU: Not to make any rationalizations, but USC did participate in the play-in game, and SMU did not face any legitimate challenge over the course of the season except for conference foe CIN). SBON does play in the play-in game this year, but the winner only gets the third-best 6-seed. By now, you must be wondering who draws MIA from the 11-seed group. The lucky member is LOYC, who happens to be the strongest 11-seed of them all, and the only 11-seed whose committee-seed is in line with its efficiency-seed.
Let's look at the only group whose portion of the curve is skewed upward by an outlier, but not to the point where it crosses over the QC.
  • Yes, I am talking about the 4-seeds. Three teams in this seed-group are below their group average, because one of them (GONZ) is head-and-shoulders (this blog is not sponsored) above the rest of them. GONZ is the fourth team in recent years to be national runners-up in one year and receive a lower seed in the following year while still carrying over players from the run (2016 UNC, 2015 WISC, and 2011 BUT). Two of those three returned to the title game, with WISC only advancing as far as the S16.
The remaining seed groups have what I would call a balancing act happening on the SC.
  • The 1-seed group has the two strongest teams in the field along with two teams that should be a 3-seed and a 4-seed. This is why the 1-seed average is right in line with the 2-seed average. The last time a 1-seeded team was this grossly over-seeded was 2014 WICH, and they were upset by a grossly under-seeded #8 UK in the R32. This year, there are two 1-seeds -- KU and XAV -- that are over-seeded, yet only KU potentially faces an under-seeded #8 (by only one seed line, not three as in 2014) in the R32.
  • The 7s, 8s, 9s, 10s, and 12s are almost comical in how they line up above and below the average value. 
    • If the 7s and 10s look like they deviate 'almost perfectly' from the average, it is because they do. The 7-seeds are 1.27, 1.28, and 1.25 AEM points from each other, in order. The strongest and weakest 10-seed deviate from their next closest group member by 2.41 and 2.5 AEM points, respectively, and both the 2nd-ranked and 3rd-ranked 10-seed deviate from the average approximately 0.50 AEM points each. One final note on the 10-seeds: They are barely on-average stronger than the 9-seeds.
    • The 8s and 12s look like they are both skewed downward by a weaker outlier. In fact, the top three 8-seeds are stronger than the highest 9-seed. The weakest 8-seed MIZZ, who appears to be an over-seeded 10-seed, gets paired against the strongest 9-seed FLST, whose efficiency seed matches its committee seed. The 12-seeds, on the other hand, have only one team who is under-seeded (DAVD) while the other three grade as a 14-seed, 15-seed, and a 19-seed.
    • The 9s, with the exception of FLST, all grade out as over-seeds around the 11-13 seed range. Based on my data-digging, these 9-seeds are among the worst over the last ten years. As over-seeded as two of the 1-seeds are in 2018, I doubt they will receive any resistance from these 9-seeds, even if FLST sets up the 2017 R32 re-match with XAV.
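All of the "skewed upward," "skewed downward," and "balancing act" talk above comes down to where each team sits relative to its seed group's average. Here is a minimal sketch of that check. The ratings below are made-up placeholder values (not actual AEM numbers for any 2018 team), and the function name is hypothetical.

```python
from statistics import mean


def describe_group(ratings):
    """Summarize a seed group: average, teams above/below it, and gaps between neighbors."""
    ordered = sorted(ratings, reverse=True)
    avg = mean(ordered)
    above = sum(r > avg for r in ordered)
    below = sum(r < avg for r in ordered)
    # Gap from each team to the next-strongest team in the group.
    gaps = [round(a - b, 2) for a, b in zip(ordered, ordered[1:])]
    return {"average": round(avg, 2), "above": above, "below": below, "gaps": gaps}


# Hypothetical group with one strong outlier pulling the average up
# (1 dot above the curve, 3 dots below it).
skewed_up = describe_group([25.0, 19.0, 18.5, 17.5])
print(skewed_up)
# {'average': 20.0, 'above': 1, 'below': 3, 'gaps': [6.0, 0.5, 1.0]}
```

A group with roughly equal gaps and two teams on each side of the average is the balanced pattern; one large gap at either end of the list is the outlier-driven skew.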

My Concluding Thoughts

I have said all year long that parity exists in 2018, and it does. HOWEVER, the bracket structure of 2018 scares the shit out of me. With all of the strength at the top in 2017, it managed to produce ten upsets. With all of the weakness in 2018, I would have said "Start at 11 and just keep counting higher." The bracket structure has me seriously doubting the "just keep counting higher" portion of that sentence.

As for curve-fitting, I like a combination of the curves from 2003, 2006, and 2010.

You can see why I like the 2003 curve. It has the flatness from 1 to 2, the bowing from 2 to 5, and the smoothness along the back-end of the curve with only two intermittent peaks (one from 6 to 7 and one from 9 to 10). The 2018 SC follows the 2003 SC from 1 to 5 to a tee, but the 2018 SC has its two intermittent peaks in slightly different spots (one from 7 to 8 and one from 9 to 10). The one noticeable problem I have with the 2003 curve is the steepness of descent, which is not present in 2018.

The 2006 SC, if you start at the 3-seed, looks like the 2018 curve. The smoothness in the back of this curve is what I like about it.

The 2010 SC, while far less structurally similar from 1 to 5, has the back-end of the curve that I want to match. Theoretically speaking, 2010 and 2003 should sort of neutralize each other when combined.

What does a combination of these three curves produce? Let's start with the upsets and UPMs.


Like I said, I like the structure of 2003, but the steepness after the 5-seed line is way off-target. I was hoping the 2006 and 2010 curves, which had back-ends that I liked, would offset 2003's steepness. For the record, I like 5 - 4 - 2 - 1 - 0 - 0 for 12 total upsets. Now, let's look at the Aggregation Model.
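As a quick sanity check on that prediction line, here is a tiny tally. It assumes the six numbers map to the six rounds in order; the round labels (including "NC" for the title game) are my own shorthand for this sketch.

```python
# Assumed round order for the six numbers in the prediction.
ROUNDS = ["R64", "R32", "S16", "E8", "F4", "NC"]
predicted_upsets = [5, 4, 2, 1, 0, 0]

# Pair each round with its predicted upset count and total them.
by_round = dict(zip(ROUNDS, predicted_upsets))
total_upsets = sum(predicted_upsets)

print(by_round)
print(total_upsets)  # 12
```

The total of 12 sits inside the 11-13 range, rather than the 13-15 I would have called for on parity alone.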


To be honest, I'm pretty happy with all of these targets. Yes, I seriously doubt a 1-seed wins the title this year. For the record, I like an AM of 188 - 74 - 24 - 16 - 7 - ??? Hahahaha!!! Did you think I was going to reveal the seed of my national champion? As Dikembe Mutombo would say as he wags his finger at you, "Not Today!" Anyways, I hope you enjoyed reading this article. In three weeks, we will find out just how wrong I was about the significance of bracket structure on tournament results, but the tightness of the SC with the QC along with the top-down balance of seeding really has me biting my nails. Good luck with your picks, and expect one more article by Wed on Return & Improve.
