NBA: Predicting the Outcomes of Playoff Series



Latrinsorm
12-08-2013, 08:34 PM
Back in this thread (http://forum.gsplayers.com/showthread.php?81766-Best-of-Seven-and-Predictions) I did some good work going from Rtgs to game winning % and game winning % to series winning %, but I took a very poor first touch when it came to going from regular season Rtgs to postseason Rtgs: I guessed "(ORtg + DRtg) / 2", but that makes no sense. What it should be is...

Avg + (ORtg - Avg) + (DRtg - Avg)

...that is, the average plus how much better the offense is (or minus how much worse it is) plus how much worse the defense is (or minus how much better it is). So if a +2 per 100 possession offense goes up against a +2 per 100 possession defense (that is, one that allows 2 more than average), we would expect the offense to score +4 per 100 possessions, while the ()/2 method predicts only +2. This has the effect of pushing all our game winning %s away from 50% (so 51%+ get higher and 49%- get lower), which is good, but it doesn't push them nearly far enough. If you'll recall, every one of the 14 series for the 2012 NBA postseason was predicted to go 7 games. With the corrected Rtg prediction, only 2 are predicted to go fewer (Thunder over Mavs in 6, Spurs over Jazz in 6; yes, only 2 years ago the Jazz were a playoff team). We have an improvement, but not nearly enough.
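
To make the gap between the two guesses concrete, here is a minimal Python sketch of the +2 offense against +2 defense example. The function names and the league average of 105 points per 100 possessions are illustrative assumptions, not figures from the post. DRtg is treated as points allowed per 100 possessions, so a "+2" defense allows 2 more than average.

```python
def predicted_rtg_halved(ortg, drtg):
    """First-guess method: (ORtg + DRtg) / 2."""
    return (ortg + drtg) / 2


def predicted_rtg_additive(ortg, drtg, avg):
    """Corrected method: Avg + (ORtg - Avg) + (DRtg - Avg)."""
    return avg + (ortg - avg) + (drtg - avg)


LEAGUE_AVG = 105.0          # assumed league average, for illustration only
offense = LEAGUE_AVG + 2.0  # offense scores 2 more than average per 100
defense = LEAGUE_AVG + 2.0  # defense allows 2 more than average per 100

# Expected scoring relative to league average under each method
print(predicted_rtg_halved(offense, defense) - LEAGUE_AVG)                 # 2.0
print(predicted_rtg_additive(offense, defense, LEAGUE_AVG) - LEAGUE_AVG)   # 4.0
```

The additive version simplifies to ORtg + DRtg - Avg, which is why it moves twice as far from average as the halved version in this case.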

But then it occurred to me: rather than wrestle with formalism, why not go through more seasons empirically and see if any trends emerge? Then we can look for those specific cases that diverge from those trends and potentially put value on things like defense, rebounding, pace, coaching, home court, etc. That's what I will do, but I wanted to write this down before I forgot it. Stay tuned; same bat time, same bat forum!

Anebriated
12-08-2013, 11:25 PM
If this works, my betting account thanks you.

Latrinsorm
12-11-2013, 09:56 PM
I still can't get it wide enough. Only 10 series of the last 60 have gone 7, but the method predicts only 10 of the last 60 won't, and all of them are predicted as 6 game wins. That's no good.

So instead of using league average (which by definition is the same for ORtg and DRtg) I tried using the averages of playoff teams. So for any given playoff series between teams A and B we would do...
predicted ORtg = avg ORtg + (home ORtg - avg ORtg) + (away DRtg - avg DRtg)
= home ORtg + away DRtg - avg DRtg
...and similarly with DRtg. The average playoff team will have a winning record, and therefore a higher ORtg than DRtg, so we need to use two figures.
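
A rough sketch of the two-average version in Python, for anyone who wants to plug in numbers. The function names and all the ratings below are made up for illustration; only the formulas come from the post.

```python
def predicted_ortg(team_ortg, opp_drtg, avg_drtg):
    """Team ORtg + opponent DRtg - average DRtg of playoff teams."""
    return team_ortg + opp_drtg - avg_drtg


def predicted_drtg(team_drtg, opp_ortg, avg_ortg):
    """The 'similarly with DRtg' case: team DRtg + opponent ORtg - average ORtg."""
    return team_drtg + opp_ortg - avg_ortg


# Hypothetical ratings (points per 100 possessions), not real team data
home_ortg, home_drtg = 110.0, 104.0
away_ortg, away_drtg = 107.0, 105.0
avg_ortg, avg_drtg = 108.0, 105.0  # assumed averages over playoff teams only

print(predicted_ortg(home_ortg, away_drtg, avg_drtg))  # 110.0
print(predicted_drtg(home_drtg, away_ortg, avg_ortg))  # 103.0
```

Because playoff teams as a group outscore their opponents, avg_ortg and avg_drtg differ, which is the reason two separate averages are passed in.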

Does it help? Sort of. We get a better spread (23 seven-game wins, 35 six-game wins, 2 five-game wins!!! Those two are 2010 Cavs over Bulls and 2013 Heat over Bucks), but consider the overall prediction without standard deviations:

old version
(home O + away D - avg) - (home D + away O - avg) = home MOV - away MOV

new version
(home O + away D - avg D) - (home D + away O - avg O) = home MOV - away MOV + avg MOV
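
The two identities can be checked numerically with a short Python sketch. The ratings are hypothetical and the function names are my own; the formulas are the old and new versions written out above.

```python
def mov(ortg, drtg):
    """Margin of victory per 100 possessions: ORtg - DRtg."""
    return ortg - drtg


def old_margin(home, away, avg):
    """Old version: one league average for both offense and defense."""
    (home_o, home_d), (away_o, away_d) = home, away
    return (home_o + away_d - avg) - (home_d + away_o - avg)


def new_margin(home, away, avg_o, avg_d):
    """New version: separate playoff-team averages for ORtg and DRtg."""
    (home_o, home_d), (away_o, away_d) = home, away
    return (home_o + away_d - avg_d) - (home_d + away_o - avg_o)


# Hypothetical (ORtg, DRtg) pairs, not real team data
home = (110.0, 104.0)
away = (107.0, 105.0)

old = old_margin(home, away, 106.0)
new = new_margin(home, away, 108.0, 105.0)

print(old, mov(*home) - mov(*away))   # 4.0 4.0
print(new - old, mov(108.0, 105.0))   # 3.0 3.0
```

The single average cancels out of the old version entirely, while the new version differs from it by exactly avg ORtg - avg DRtg, the same constant for every matchup.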

Mechanically what we've done is an offset towards the home team, which is great except 19 teams with home court advantage lost in this sample.

It also occurs to me to consider the margin of error in pythagorean expectations (a different but related model): 3 games per 82. Even if that magically stayed the same for a 7 game sample, we're talking the difference between a 5 game win and a 7 game loss. I think the sample sizes are just too small to get any kind of model to 1-game precision, so trying to quantify factors not yet in the model (home court, coaching, tuffness) is pointless. At best, it might be interesting to combine this and a simple win-loss prediction model and abstain from publishing any prediction on a series anticipated to go 7, because they would be the most susceptible to noise changing the result.