Readers familiar with this blog know that I’ve been working on a model to predict success in the NBA using the Wins Produced metric (see the Basics here and the development version here). This is one of the goals of this blog, its mission statement: build the model, test it out, get feedback, improve it, test again and so forth in an iterative loop. The final product for this offseason is the full set of predictions for every team, but there are a few things pending before we get there.

Yesterday I unveiled one of those, my rookie model, and it was deemed awesome.

Today we continue with the awesomeness as we review/revise the model, look at how the 2009 draft class fared and crunch the 2010 class. And now on with the show.

I looked at all the data (thank you, Draft Express) and found that the following variables correlate in a meaningful way:

- Win Score per 40 minutes (Can you Play?)
- Height (Are You Tall?)
- Age when drafted (Are you Young?)
- Position (What Position?)

The initial model I came up with yesterday looked as follows:

ADJP48 = K – A*HEIGHT + B*SIMPOS – C*DFTAGE + D*WS40

where K, A, B, C and D are constants.

This model had a correlation of 42% for every player who played more than 400 minutes as a rookie coming from college (from 1996 to 2010 that’s 373 players).
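For anyone who wants to play along at home, here is a minimal sketch of how a model of this form could be fit and scored. It is not my actual spreadsheet: the data is randomly generated, the coefficient values are made up, SIMPOS is assumed to be a simple numeric position code, and "correlation" is assumed to mean the plain correlation between predicted and actual ADJP48.

```python
import numpy as np

def predict_adjp48(height, simpos, dftage, ws40, K, A, B, C, D):
    """Evaluate ADJP48 = K - A*HEIGHT + B*SIMPOS - C*DFTAGE + D*WS40."""
    return K - A * height + B * simpos - C * dftage + D * ws40

def fit_coefficients(height, simpos, dftage, ws40, adjp48):
    """Ordinary least squares fit, returned as (K, A, B, C, D)
    in the same sign convention as the formula above."""
    X = np.column_stack([np.ones_like(height), height, simpos, dftage, ws40])
    beta, *_ = np.linalg.lstsq(X, adjp48, rcond=None)
    k, b_height, b_pos, b_age, b_ws = beta
    # The formula subtracts the HEIGHT and DFTAGE terms, so flip those signs.
    return k, -b_height, b_pos, -b_age, b_ws

# Synthetic players: every number below is invented for illustration only.
rng = np.random.default_rng(0)
n = 200
height = rng.normal(79.0, 3.0, n)             # inches (assumed unit)
simpos = rng.integers(1, 6, n).astype(float)  # assumed coding: 1 = PG ... 5 = C
dftage = rng.normal(21.0, 1.5, n)             # age when drafted
ws40 = rng.normal(8.0, 2.0, n)                # Win Score per 40 minutes
true_coefs = (0.5, 0.004, 0.01, 0.01, 0.02)   # hypothetical K, A, B, C, D

noise = rng.normal(0.0, 0.05, n)
adjp48 = predict_adjp48(height, simpos, dftage, ws40, *true_coefs) + noise

coefs = fit_coefficients(height, simpos, dftage, ws40, adjp48)
predicted = predict_adjp48(height, simpos, dftage, ws40, *coefs)

# Correlation between what the model predicted and what "happened".
r = np.corrcoef(predicted, adjp48)[0, 1]
print(f"correlation: {r:.2f}")
```

Swap the synthetic arrays for real draft data and the same two functions apply unchanged.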

I built a second, tweaked version using categorical variables (for Age and Height) based on reader Alex’s suggestion. This one had a correlation of 45% for the same group: every player who played more than 400 minutes as a rookie coming from college (from 1996 to 2010 that’s 373 players). We’ll call these models Yogi and Boo Boo.
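If "categorical variables" sounds mysterious, the idea is just to replace a number like age with a set of yes/no buckets. Here is a rough sketch of one way to do that. The bucket edges are hypothetical, picked only to show the mechanics, and are not the ones used in the model.

```python
import numpy as np

def one_hot_bins(values, edges):
    """Return a 0/1 matrix with one column per bucket defined by `edges`.
    A value x lands in bucket i when edges[i-1] <= x < edges[i]."""
    idx = np.digitize(values, edges)      # bucket index from 0 to len(edges)
    return np.eye(len(edges) + 1)[idx]    # pick the matching identity row

# Hypothetical bucket edges, for illustration only.
age_edges = [20.0, 22.0]            # under 20, 20 to under 22, 22 and over
height_edges = [76.0, 80.0, 83.0]   # inches (assumed unit)

ages = np.array([19.4, 21.0, 23.1])
age_dummies = one_hot_bins(ages, age_edges)
print(age_dummies)
```

When these columns go into a regression that also has a constant term, one bucket per variable is normally left out so the columns are not redundant with the constant.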

So I went ahead and did some data analysis and added some simple logic based on the predicted WP48 for each player and the hit rate (the % of players who were over .090 WP48 for their first four years). The graph for Yogi follows:

For Yogi, I selected a predicted WP48 of .095 as the cutoff point, and for Boo Boo I went with .067 WP48. The full modified table is here. To give me an idea of the value of the model I decided to look at:

- The probability of landing a better than average player (>.090 WP48) for his first four seasons
- The probability of landing a good player (>.150 WP48) for his first four seasons
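The two probabilities above boil down to a simple count, sketched below. The cutoffs (.095 for Yogi, .067 for Boo Boo) and the targets (.090 and .150) come from this post; the five players and their numbers are made up, and `hit_rate` is just a name I picked for the illustration.

```python
import numpy as np

def hit_rate(predicted, actual, cutoff, target):
    """Share of players the model says to draft (predicted >= cutoff)
    whose actual WP48 over their first four seasons beat `target`.
    Returns NaN when the model flags nobody."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    flagged = predicted >= cutoff
    if not flagged.any():
        return float("nan")
    return float((actual[flagged] > target).mean())

# Made-up players, for illustration only.
predicted = [0.120, 0.100, 0.080, 0.060, 0.040]   # model's predicted WP48
actual = [0.160, 0.070, 0.110, 0.095, 0.020]      # WP48 over first four seasons

yogi_above_avg = hit_rate(predicted, actual, cutoff=0.095, target=0.090)
yogi_good = hit_rate(predicted, actual, cutoff=0.095, target=0.150)
booboo_above_avg = hit_rate(predicted, actual, cutoff=0.067, target=0.090)
print(yogi_above_avg, yogi_good, booboo_above_avg)
```

The same function works for any subgroup: filter the two lists down to, say, Top 5 picks before calling it.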

I also decided to show this for:

- Any qualifying pick (>400 MP in his rookie Year)
- Any Top 5 pick
- Any Top 10 pick
- Any 1st Round Pick
- And Both models.

And this was done for 1995 to 2009. The table is here:

The best-performing scenario is both models calling for you to draft the player, followed by Yogi, then Boo Boo, then having a Top 5 pick. Yogi is pickier; Boo Boo casts a broader net and is more accurate. The best illustration I can give of their effectiveness is setting them loose on the 2009 rookies:

Early returns show that Model #2 (Boo Boo) did a fabulous job picking winners and calling the ADJP48 (61% correlation).

Now all that’s left is to throw it at the 2010 rookies and combine it with the rookie minute model (Cindy) and we are done!
