Nov 16, 2019

Warming Up the Crystal Ball: 2019-2020 Edition

In the inaugural article, I promised an article that I had already finished, since I did a lot of the work for it over the summer. If you know me by now, you know that this article is not the promised article. I do intend to auto-publish that one at the beginning of December, which actually aligns better with my work schedule since I won't have a lot of time during those two weeks to put anything together. Instead, you get this article, which has become sort of a standard for PPB, and it does have a lot of valuable information in it for March. Putting my pre-season biases into the written record lets you, the reader, see my point of view from the start and see how the data is being used to confirm or reject that point of view. Likewise, I would like to publish my preseason thoughts before any other headline-grabbing upsets happen. With that said, here's my outlook for the 2019-2020 season.

Nov 2, 2019

The 2019-2020 College Basketball Season

Welcome to the 2019-2020 College Basketball Season on Project: Perfect Bracket. There's a lot to discuss going forward, so this introduction will be brief in order to elaborate on the more important stuff. First, I will post a copy of my final 2019 predictions that I wrote in the "To my readers" section (due to a lack of time in writing a third full article during Bracket Crunch Week), then I will grade my 2019 predictions, and I will finish with a section on the unpredictable nature of PPB for this season.



2019 Final Predictions - Copy and Paste

Final Predictions:
Summary: Chalky in terms of 2011-2018 Definition of Chalky
Best Tournament Models: 2015 and 2003
2015 AM: 179 - 70 - 21 - 10 - 2 - 1
2003 AM: 174 - 67 - 19 - 9 - 5 - 3
These are actually good targets. Both SCs are similar, although the 2015 QC matches 2019 far closer than 2003 QC. 2019 is stronger in the middle and back than 2015. The 2019 SC is tight with the 2019 QC, which I theorized last year to minimize R64 upsets. I'm going to follow this again.

Here's where $#!T will hit the fan.
2015 UBR Model: 4-3-1-0-0-0
2003 UBR Model: 3-3-0-0-0-0
2019 PPB Projection: 4-3-1-0-0-0
I've tried every way possible to rule out confirmation bias and failed. I have no choice but to like these numbers! I honestly feel like R64 will be less than 4 upsets, so don't go too crazy in R64.

Other Predictions:
-- I like SYR to win at least 1 Game and TENN to win at least 2 games via R&I Model. I think those are safe picks.
-- I like all 1-seeds and 2-seeds to defeat their R64 counterparts via SC. No surprises here! The predicted four upsets must come from 3- thru 6-seeds (although I wouldn't blame you if you went less than four).
-- No more than two 1- or 2-seeds lose in R32 via SC/QC. To get 3 upsets in R32, one of the 11- thru 14-seeds must pull double duty.
-- Do not move NOVA past S16 via Tourney Profiles. Only one reigning NC has won a game after S16 round (2007 FLA, enough said).
-- Final Prediction: No Play-in Winner advances to R32 for first time ever.

Grading the Predictor

The obvious place to start is the section above, just so you, the reader, don't have to constantly scroll up and down to match each grade with its prediction.
  • AM Targets: The actual 2019 AM was 192-49-18-11-4-1. The E8, F4, NR and NC rounds were right where your bracket needed to be. If you are going to win a bracket contest, those are the more valuable rounds, so the AM predictions would have raised your odds tremendously. However, the R32 and S16 rounds were miles away, which could have resulted in picking an unfavorable upset. As a result, I grade this prediction around A-/B+.
  • UBR Model: The actual 2019 UBR count was 5-0-1-0-0-0 for six total upsets. This matched the six-upset total of 2003, but 2019 achieved those six upsets in a different fashion than 2003. To my credit, I mentioned that the "2019 SC matched the 2015 SC but 2019 was stronger in the middle and back". However, I failed to understand the impact it would have on the UBR Model (and the AM Targets for that matter). It is probably a good explanation as to why the 2019 UBR count bent away from 4-3-1-0-0-0 and towards 5-0-1-0-0-0. All in all, I would probably give a grade of B- or even C+ because it was off. I'm not sure how much of a role a lack of sample size or confirmation bias played, but I had a feeling the UBR model would go off the rails when I said "$#!T will hit the fan."
  • Other Predictions: I loved the strength of the 2019 1-seeds and 2-seeds (and I stated this earlier in the year and often throughout). I was stupid for even suggesting that a 7-10 seed could upset any of them in R32 (although a really strong 7-seed WOF came close to upsetting a 2-seeded UK with a significant injury). Nonetheless, this prediction was an F, plain and simple. "NOVA not advancing past S16" was a perfect guideline, and considering the only reigning NC to achieve this feat was the 2007 FLA team (returning 97% of its 2006 NC team), it is probably a guideline I will use for a long time to come. I give the guideline an A+. "No Play-in Winner advances to R32 for first time ever." GOLDEN!!! I didn't like their 6-seed match-ups (MARY and BUFF) in the R64, I didn't like the two teams (AZST and BELM) that won the play-in games, and I thought both teams won their play-in games too easily against two teams (JOHN and TEM) that I wouldn't have chosen for the play-in games. The gambler in me said fade them both, and that is what happened, but to BELM's credit, they did make it close against MARY.
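For readers who want to see the AM and UBR gaps side by side, here is a minimal Python sketch of the arithmetic behind those grades. The round labels are an assumption inferred from the rounds named in this article (AM counted from R32 through NC, UBR counted from R64 through NC), and the names `gaps`, `am` and `ubr` are hypothetical helpers, not part of any PPB tool.

```python
# Round labels are assumptions inferred from the rounds named in the article.
AM_ROUNDS = ["R32", "S16", "E8", "F4", "NR", "NC"]
UBR_ROUNDS = ["R64", "R32", "S16", "E8", "F4", "NC"]

# Figures copied from the final predictions and the grading section.
am = {
    "2015 model": [179, 70, 21, 10, 2, 1],
    "2003 model": [174, 67, 19, 9, 5, 3],
    "2019 actual": [192, 49, 18, 11, 4, 1],
}
ubr = {
    "2019 projection": [4, 3, 1, 0, 0, 0],
    "2003 model": [3, 3, 0, 0, 0, 0],
    "2019 actual": [5, 0, 1, 0, 0, 0],
}


def gaps(predicted, actual):
    """Per-round difference, predicted minus actual."""
    return [p - a for p, a in zip(predicted, actual)]


am_gap_2015 = gaps(am["2015 model"], am["2019 actual"])
am_gap_2003 = gaps(am["2003 model"], am["2019 actual"])
ubr_gap = gaps(ubr["2019 projection"], ubr["2019 actual"])

# Total upsets per UBR line (sum across all rounds).
upset_totals = {name: sum(counts) for name, counts in ubr.items()}

print("AM 2015 vs actual:", dict(zip(AM_ROUNDS, am_gap_2015)))
print("AM 2003 vs actual:", dict(zip(AM_ROUNDS, am_gap_2003)))
print("UBR projection vs actual:", dict(zip(UBR_ROUNDS, ubr_gap)))
print("Upset totals:", upset_totals)
```

The large gaps land in the first two AM rounds and in the R32 column of the UBR, while the later rounds sit within a few points of the targets, which is the pattern the grades above describe.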
I skipped over one of the "other predictions" so that I could go into more detail on the model involved in the prediction -- the Return and Improve Model. The TENN prediction to win at least two games worked perfectly. The SYR pick was a different story. If you read the article (Link) on the 2019 R&I Model, it showed SYR returning the highest percentages of any team eligible for R&I consideration. It also went into far more detail about SYR than my final predictions did. The issue with the SYR prediction was a lack of information. I honestly did not know about the suspension of starting PG Frank Howard (which happened on Wednesday) until after the games had already started on Thursday, and by then, it was way too late to do anything about it. When you take Frank Howard out of SYR's Return percentages, they drop from 90.72% to 71.8% for MINS and from 93.71% to 72.1% for PTS. According to the adjusted Howard-less numbers, SYR goes from a 66% chance to return and improve (win at least three games) to a slightly less than 50% chance. I also didn't like SYR's path to achieve R&I, so I called an audible to just one win. The knowledge of Howard's suspension would have changed my math completely. However, I still have to give myself an F on that prediction with the caveat that it was due to incomplete information.

As for the R&I Model itself, I took a different approach to implementation in 2019 -- an almost Bayesian-Probabilistic Approach. I actually liked this approach. It identified a lot of freebies (UVA and VT for at least 1 win, UNC and TENN for 2 wins, and FLST, KNST, KU, and NOVA as highly likely fails), it presented a lot of opportunistic probabilities (AUB and PUR), it identified a lot of high-probability traps (HALL, FLA, CIN, OHST, and GONZ), and the only thing it missed outright was the TXTC run to the title game. Disregarding the incomplete information involving SYR, the R&I model predictions could be given somewhere around a B+/B grade.

As for the Final QC Analysis article, this one is a real head-scratcher. It identified the 6v11 match-ups to be pretty safe for the 6-seeds, and three of the four won their match-up (ironically, the hottest 6-seed coming into the tournament was the one that lost to a power-conference 11-seed who hadn't beaten a tournament team since November). It also identified 2-seeds, 5-seeds and 7-seeds as areas of strength. The 2-seeds combined for eleven wins, which is one win shy of expectations based on top-seed advancement (four E8 appearances equals twelve total wins). It was two wins better than 2018's crop of 2-seeds, which was far weaker than 2019's crop. The 5-seeds collected four total wins, which matches expectations based on top-seed advancement (four R32 appearances equals four wins), but all four wins were collected by one under-seeded 5-seed advancing to the F4. The "spike at the end of the SC (the 12-seed group)" should have been an indication of potential 5-seed victims. The 7-seeds collectively won one game, and this prediction probably shouldn't have been made considering that the 10-seed group was in line on the QC-SC overlay. This was a clear oversight by yours truly. All in all, there were some hidden gems in this analysis and one clear oversight. As a result, I give myself a B+ on these predictions.
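The "expectations based on top-seed advancement" arithmetic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the seed-to-round expectations follow the guideline given elsewhere in this article (F4 for 1-seeds, E8 for 2-seeds, S16 for 3- and 4-seeds, R32 for 5- thru 8-seeds), and `expected_wins` is a hypothetical helper name.

```python
# Wins a team needs to reach each round.
WINS_TO_REACH = {"R32": 1, "S16": 2, "E8": 3, "F4": 4}

# Expected round per seed, per the seed-group expectations stated in the article.
EXPECTED_ROUND = {1: "F4", 2: "E8", 3: "S16", 4: "S16",
                  5: "R32", 6: "R32", 7: "R32", 8: "R32"}

# 2019 win totals for the three seed-groups discussed above.
actual_wins_2019 = {2: 11, 5: 4, 7: 1}


def expected_wins(seed, teams=4):
    """Total wins if every team in the seed-group reaches its expected round."""
    return teams * WINS_TO_REACH[EXPECTED_ROUND[seed]]


# Actual minus expected wins for each seed-group.
shortfall = {seed: wins - expected_wins(seed)
             for seed, wins in actual_wins_2019.items()}

for seed, diff in shortfall.items():
    print(f"{seed}-seeds: expected {expected_wins(seed)}, "
          f"actual {actual_wins_2019[seed]}, difference {diff}")
```

A group total hides how the wins were distributed, which is why the 5-seeds can match expectations even though a single team collected all four wins.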

Finally, there was one prediction that never made the blog that I wanted to discuss. Last year, I introduced a new tool called the Seed-Group Loss Table (Part 1 and Part 2). In Part 2, I created a linear regression model for each of the top four seed-groups to predict their F4 and E8 potential. I posted the formulas if you wanted to try them for your 2019 bracket, but I did not cover them as official models for the 2019 predictions. I WISH I HAD! Using the linear regression formulas and the L% and N/L% for each seed-group, the SGLT predicted the following:
  • 1-seeds: 0.227 for the F4 (either zero or one) and 2.433 for the E8 (either two or three).
  • 2-seeds: 0.864 for the F4 (either zero or one) and 1.786 for the E8 (either one or two).
  • 3-seeds: 0.9305 for the F4 (either zero or one) and 1.619 for the E8 (either one or two).
  • 4-seeds: 0.3252 for the F4 (either zero or one) and 0.573 for the E8 (either zero or one).
Each of the estimations was correct. For the F4, there was one 1-seed, one 2-seed, one 3-seed, and one 5-seed. For the E8, there were three 1-seeds, two 2-seeds, two 3-seeds, and zero 4-seeds. I really wish I had extended the SGLT beyond the top four seed-groups to see how it would have fared in predicting 2019's F4 appearance by a 5-seed. Unfortunately, the SGLT tends to lose accuracy for deeper runs when extending it to lower seeds. As the SGLT article itself suggested, it is probably best for predicting how the seed-group performs against its seed-group's expectations (F4 appearances for 1-seeds, E8 appearances for 2-seeds, S16 appearances for 3- and 4-seeds, R32 appearances for 5- thru 8-seeds). It worked well enough that I will probably include it in 2020's predictions.
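The check on the SGLT estimates can be reproduced with a short Python sketch. It reads each fractional estimate as the pair of whole numbers on either side of it (for example, 2.433 as "either two or three") and tests whether the actual 2019 count falls in that pair. The regression formulas themselves are not reproduced here, and the names `sglt`, `actual` and `bracket` are hypothetical.

```python
import math

# SGLT regression outputs listed above, by seed-group and round.
sglt = {
    1: {"F4": 0.227, "E8": 2.433},
    2: {"F4": 0.864, "E8": 1.786},
    3: {"F4": 0.9305, "E8": 1.619},
    4: {"F4": 0.3252, "E8": 0.573},
}

# Actual 2019 appearances by seed-group, as stated in the article.
actual = {
    1: {"F4": 1, "E8": 3},
    2: {"F4": 1, "E8": 2},
    3: {"F4": 1, "E8": 2},
    4: {"F4": 0, "E8": 0},
}


def bracket(estimate):
    """Whole-number pair surrounding a fractional estimate."""
    return math.floor(estimate), math.ceil(estimate)


# True where the actual count is one of the two bracketing values.
hits = {
    (seed, rnd): actual[seed][rnd] in bracket(est)
    for seed, rounds in sglt.items()
    for rnd, est in rounds.items()
}

for (seed, rnd), hit in hits.items():
    low, high = bracket(sglt[seed][rnd])
    print(f"{seed}-seeds {rnd}: {low} or {high}, actual {actual[seed][rnd]}, hit={hit}")
print("All estimates bracketed the actual count:", all(hits.values()))
```

Note that this is a loose test: any actual count within one of the estimate passes, so an estimate of 0.227 counts as correct for either zero or one F4 team.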

PPB for the 2019-2020 Season

If I had to make a prediction in November about this blog for B.C.W., it looks highly likely that I will produce one total article and hope I can be both precise and concise with all of the details. As for the yearly schedule, it is up in the air. For the last two years, I published articles on Monday morning right at midnight. It looks like Saturday or Sunday may be the best bet, and it may be monthly articles instead of bi-weekly. In past season-opening articles, I would lay out a schedule with likely targets for publish dates. I don't think I'm going that route this year because it may result in over-promising and under-delivering on my part. The only guidance that I can give at the moment is that I will prioritize QC Analysis articles above all else, followed by articles on new bracket-picking models or improvements to existing bracket-picking models, and lastly any articles with opinion/feedback/criticism and/or history-driven articles. Believe me, I would love to give my two cents' worth on the stupidity of the expansion of power conference schedules, and if you don't know what I am talking about, just look at how many ACC teams are playing conference games in the first week of November. What a joke!!! To finalize the issue of article scheduling, I will more than likely improvise the schedule this season and have a better understanding of my schedule for the following season (2020-21). I do know the contents of the next article because I wrote it before I wrote this one. I just don't know what date I'm going to set for the auto-publish. Anyway, I'm looking forward to another year of trying to accomplish a lifelong dream, and I hope you will join me for the ride.