Feb 6, 2016

What is the Quality Curve?


As promised, I will be presenting an in-depth explanation of the quality curve (QC) and the seed curve (SC). A word of warning: this article is more technical than practical, meaning it is intended to explain a tool rather than report its results. So, if you are looking for insights into the 2016 Tournament, this article may not be what you are expecting. If you want to broaden your analytical approach to the tournament, you are probably going to love this article, so let's dive right in.

The Quality Curve

Just as its name would suggest, the quality curve is a line graph displaying the "quality" of the teams in a given year of the NCAA tournament. The above chart shows the quality curve of the 2016 Field in January (blue line) and the same field again in February (pink line).


It is drawn using the KenPom ratings for the Top 50 tournament-eligible teams. Teams are grouped into seeds by their position in the KenPom ratings, where every four teams belong to a seed. For example, teams 1-4 in the KenPom ratings are 1-seeds, teams 5-8 are 2-seeds, and so forth through the 10-seeds. For the 11- and 12-seeds, I group five teams instead of four to account for play-in teams, which brings the total to 50 (ten groups of four plus two groups of five). So what does all this mean?
  1. By using only tournament-eligible teams, it shows the best teams in the tournament field rather than the best teams in Division I basketball. In 2016, two top-rated teams (KenPom #8 Louisville and KenPom #22 SMU) are ineligible for the NCAA Tournament. If the ratings of these two teams were used to make the QC, the tournament field would look stronger than it actually is. (NOTE: The two QCs in the above image and previous images were created when LOU was known to be eligible. Every effort will be made to remake these QCs without LOU's data point and when this process is complete, this note will be removed.)
  2. By using the Top 50 KenPom teams, the QC sets an upper-bound curve. The bound applies to the total area under the curve, not to each seed individually. When the 2016 Seed Curve is produced, some seeds may take on a higher value than they do in the QC, but the total area under that Seed Curve can never exceed the total area under the QC. For some seeds to get stronger, they have to draw quality away from other seeds. This is why the QC represents the upper bound of the tournament's quality: the tournament as a whole will never be better than the QC indicates.
  3. By grouping teams into their respective seeds, it hides the actual quality of the individual teams in that group. In other words, it imposes the Law of Averages on the teams. For example, let's suppose the Top 4 KenPom teams have ratings of 0.9600, 0.9600, 0.9600, and 0.9500 (which closely describes the situation of the 2015 tournament). In KenPom ratings, a 0.0100 drop between teams is a really significant drop. When these four teams are averaged together to make the group rating for the 1-seeds, it comes out to 0.9575. That closely approximates the three strong teams, but drastically overrates the weak team in the group. The same Law of Averages phenomenon can be seen when one very strong team in a seed group is pulled down by three very weak teams in the same seed group. This could have the effect of making one seed group (e.g. 9-seeds) look very weak compared to another seed group (e.g. 8-seeds), even though one team in the 9-seed group could be evenly matched with its 8-seed opponent. This point is even more important when dealing with Seed Curves.
  4. By taking an ordered top-to-bottom approach, the QC allows for teams changing their order in the rankings as the season progresses. Teams that improve and become more efficient on the basketball court move up the rankings while teams that stagnate or regress move down the rankings. In other words, the #4 ranked KenPom team in January may be the #8 ranked KenPom team in February. By ignoring the actual names of the teams, we see the overall quality of the field through the QC: whether the field is getting better as a whole, getting worse as a whole, converging in quality (like the image above shows), or diverging in quality (stronger teams get stronger and weaker teams get weaker).
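
For readers who like to see the mechanics, the grouping described above can be sketched in a few lines of Python. This is only a rough illustration, not the actual process used to draw the charts: the function name quality_curve and the ratings list are made up for the example. The first four ratings are the ones from Point #3; the other 46 are invented filler values.

```python
from statistics import mean

def quality_curve(ratings, group_sizes=None):
    """Average rating per seed line when teams are slotted strictly by rating."""
    if group_sizes is None:
        # Seeds 1-10 hold four teams each; seeds 11 and 12 hold five each
        # to account for play-in teams (50 teams in total).
        group_sizes = [4] * 10 + [5] * 2
    ordered = sorted(ratings, reverse=True)
    needed = sum(group_sizes)
    if len(ordered) < needed:
        raise ValueError(f"need at least {needed} ratings, got {len(ordered)}")
    curve, start = [], 0
    for size in group_sizes:
        curve.append(mean(ordered[start:start + size]))
        start += size
    return curve

# Hypothetical input: the four ratings from Point #3, followed by
# 46 invented, steadily declining values (not real KenPom data).
ratings = [0.9600, 0.9600, 0.9600, 0.9500] + [0.9400 - 0.0040 * i for i in range(46)]

qc = quality_curve(ratings)
print(round(qc[0], 4))  # 0.9575, the 1-seed average from Point #3
```

Because the ratings are sorted before they are grouped, a curve built this way can only decline (or stay flat) from one seed to the next, which is the smooth shape the QC has in the charts.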

The Seed Curve

When the bracket is finally unveiled in March, the QC undergoes a transformation that turns it into the Seed Curve. While QCs are drawn by grouping teams according to their rank in the KenPom ratings, SCs are drawn using the KenPom ratings of the four teams (or five teams in the case of play-in teams) at that specific seed in the bracket.

The SC is the eventual result of tournament teams not being seeded in a top-down fashion according to their efficiency ratings (for example, the KenPom ratings). So what is going on with the SC?
  1. By grouping teams in order of the KenPom ranking, you get a gradually declining curve like the 2014 QC. By grouping teams based on their bracket seed, you get a fluctuating curve like the 2014 SC. Using 2014 as an example, the #2 ranked team in the KenPom ratings (Louisville) was given a 4-seed in the tournament, as was the 10th-overall ranked KenPom team (Michigan St), when 4-seeds should have gone to the teams ranked #13-#16 in the KenPom ratings. This discrepancy in seeding resulted in the extremely strong 4-seed average in the 2014 SC, as the 4-seeds averaged higher than both the 2- and 3-seeds. This means the SC is also subject to the Law of Averages, as described in Point #3 in the Quality Curve Section above.
  2. Since the SC represents teams that actually get into the tournament rather than the Top-50 ranked teams like the QC, it is quite possible that the 50 best teams do not get 1-12 seeds. This is the phenomenon described in Point #2 in the Quality Curve Section above. If the 50 highest ranked teams do not all make the tournament, the teams that take their places, whether ranked 51st or 101st, will surely drag down the average of their seed group, resulting in a lower curve. This is precisely demonstrated in the image above, as the SC only exceeds the QC at three points: the 4-seed, the 9-seed, and the 11-seed. In 2014, six teams in the Top 50 KenPom ratings did not get invited to the tournament, meaning six teams inferior to them made the tournament and dragged down the averages of their respective seeds. The points of excess (4-, 9- and 11-seeds) are the result of the seeding discrepancies described in Point #1 of SCs.
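
To make the QC-versus-SC comparison concrete, here is a small toy sketch in Python. Everything in it is hypothetical: the function names, the twelve ratings, and the three-seed "bracket" are invented for illustration and are not 2014 data. The toy bracket seeds the second-best team on the 3-line and swaps the lowest-rated top-12 team for a weaker outsider, loosely mimicking the two effects described above.

```python
from collections import defaultdict
from statistics import mean

def curve_by_rank(ratings, size):
    """QC-style curve: sort by rating, then average each block of `size` teams."""
    ordered = sorted(ratings, reverse=True)
    return [mean(ordered[i:i + size]) for i in range(0, len(ordered), size)]

def curve_by_seed(bracket):
    """SC-style curve: average the ratings of the teams placed on each seed line."""
    lines = defaultdict(list)
    for seed, rating in bracket:
        lines[seed].append(rating)
    return [mean(lines[seed]) for seed in sorted(lines)]

# Hypothetical ratings for the twelve best eligible teams.
top_12 = [0.96, 0.95, 0.94, 0.93, 0.92, 0.91, 0.90, 0.89, 0.88, 0.87, 0.86, 0.85]

# Hypothetical bracket: the 0.95 team lands on the 3-line, and the 0.85
# team is left out in favor of a weaker 0.80 team.
bracket = (
    [(1, r) for r in (0.96, 0.94, 0.93, 0.92)]
    + [(2, r) for r in (0.91, 0.90, 0.89, 0.88)]
    + [(3, r) for r in (0.95, 0.87, 0.86, 0.80)]
)

qc = curve_by_rank(top_12, size=4)
sc = curve_by_seed(bracket)

for seed, (q, s) in enumerate(zip(qc, sc), start=1):
    note = "SC above QC" if s > q else ""
    print(seed, round(q, 4), round(s, 4), note)

# Compare the totals (a stand-in for the area under each curve).
print(round(sum(qc), 4), round(sum(sc), 4))
```

In this toy case the under-seeded strong team lifts the 3-seed average above its QC value, yet the SC total still comes in below the QC total because a weaker team took a Top-12 team's place.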
