Re-examining myths and explaining how regression works

You know, sometimes it’s fun to poke the bear.

Did you know that when the bomb squad wants to defuse a bomb and they don’t know how to, they take it out to an empty field somewhere and blow it up? That’s kinda the opposite of what we do here. My defining trait is that I’m an engineer and a scientist. I don’t want to ignore questions and problems; I want to take a tool to them and try to solve or understand them.

Yesterday’s post was a result of this. I knew I was kicking an anthill. Today is no different.

Grab a drink, take a bathroom break, because this will take a while.

One of the hallmarks of society is intelligent, polite discussion. If any argument, tool or theory is worth anything, it can stand up to scrutiny and review (such as Wins Produced; see The Basics). I may not agree with you, but this is why I take the time to respond to the questions. I typically arrive at surprising and unexpected conclusions.

The point of having a blog is to invite discourse. As long as everyone, including me, has a thick skin and is prepared to be wrong (and right), we’ll keep advancing our understanding. Together.

And you know in this case, Guy is right.

What!!!!!

Not like you think (sorry, couldn’t help myself 😉 ). We do have the data to answer some of these  questions and some of the answers are really surprising.

Before we get to that, let’s talk a bit about linear regression.

Let's put that Ivy League education to good use

Linear regression is one of the most commonly used approaches to modeling the relationship between a variable Y (say wins) and one or more variables X (say box score stats). Linear regression is simple, useful and well understood, and in a lot of cases it works. It’s typically used for two things:

  • Predicting or forecasting Y (say wins) based on a known set of X’s. This is something we continually do here at ASLS.
  • Quantifying the strength of the relationship between a variable Y (say wins again) and a number of variables X (say, I don’t know, points, def. rebounds, offensive rebounds, assists, etc.). This is what Prof. Berri did. He did it again in Stumbling on Wins. I did it here and at least five other times.
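
Both uses can be seen in a minimal sketch of a one-variable least-squares fit. The numbers below are made up purely for illustration (they are not from my data set), and `fit_line` is a hypothetical helper, not the Wins Produced model itself, which uses many box score variables at once.

```python
def fit_line(xs, ys):
    """Ordinary least squares with one predictor.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Made-up illustration data: point differential per game vs. season wins.
point_diff = [-6.0, -3.0, 0.0, 3.0, 6.0]
wins = [22, 33, 41, 49, 60]

slope, intercept, r_squared = fit_line(point_diff, wins)

# Use 1: forecast Y for a new X.
forecast = intercept + slope * 4.5

# Use 2: quantify the relationship (wins per point of differential, and fit quality).
print(f"slope={slope:.2f} wins per point, R^2={r_squared:.3f}, forecast={forecast:.1f}")
```

The slope is the "coefficient math gave me": it is whatever minimizes the squared error for the data at hand, which is why it is not something to tweak by hand.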

What does this mean for this discussion? I’m happy with the coefficients math gave me for rebounding and other variables, and I’m not changing them.

You shut your mouth when math is talking! (Image courtesy of xkcd.com)

Ok, let’s get to the question and answer portion of the program. First let’s look at rebounding. Are there diminishing returns for rebounding? As I said, I have the data (yeah, every single player and season since 1979; I made Excel tap out today). Let’s take a look:

If I plot the rebound rate per 48 minutes of each team’s best rebounder (by quantity, i.e. the most total rebounds for that team) against the rebound rate of the rest of the team, there appears to be a diminishing effect. But a correlation of 3.4% does not hold much water. Did I just prove diminishing returns for rebounds? Not so fast there, Kemosabe:

If I take out the best rebounder (by quantity) for every team, I may not be taking out the best rebounder by rate (per 48 minutes). However, if I take out the best two rebounders on each team, I get a surprising result: there is a positive relationship between the two best rebounders on each team. So having two good rebounders next to each other increases returns (everyone else does see some mild dropoff at the extreme). Who knew?

Phil Jackson knew that's who!

Again fairly weaksauce correlation though.

The second question had to do with above-average rebounders. Can I find one with >15 total rebounds per 48 whose teammates were above average? Here’s a list of all the players with a rebound rate >15 per 48 who led their teams in total rebounds since 1979:

So the average rebound rate per 48 for teams without their best rebounder is 7.65. By my count, 40% of the people on this list qualify. This concludes the rebounding portion of our program.

Let’s talk Wins Produced. If I repeat the exercise I did for rebounding, comparing the best player on each team (in terms of Wins Produced) vs. the rest of the team:

No diminishing returns here. The data is a little stratified so let’s repeat the trick of looking at the second best separate from the rest but this time let’s add the third best:

Now Color coded for your protection

Hmm. Fascinating. Playing with good players makes you a better player. Instead of diminishing returns for WP48, we see increasing returns, and it holds for your top three. And it doesn’t affect the rest of the team that much. Again who knew?

Anything is Possible

50 Comments

  1. EntityAbyss
    11/24/2010

    So uh… case closed? Does that answer all the skeptics? Is there more that needs to be said? Hey, you do a whole bunch of posts; maybe you can start using this information to predict stats and WP48 with the returns factored in, although the difference won’t be too big.

    Also, could you do a post on the correlation between rebounds and rebound percentage? Wins Produced takes a player’s pace-adjusted rebounds rather than rebound percentage, and a team with more misses gets more rebounds.

    Also, Prof. Berri said he was gonna do something on the effect of drawing fouls. Would you do something like that? I know you have other stuff to do, but I love these posts.

    Also, technical free throws and And1s: wouldn’t accounting for them make WP and efficiency differentials more accurate (by getting the number of possessions right)? Just thoughts.

    • 11/24/2010

      EA,
      That’s kinda the point with the smackdown (predictions vs actuals). And actually, the analysis on this post opened up all sorts of interesting future possibilities. What about orebs and drebs? Points?

      For this post I felt rebounds per 48 was appropriate, since I was comparing a player to his teammates but pace adjustment is a possibility.

      There’s some work being done on improving the data to allow us to do things like individual d and drawing fouls. Right now it’s not a priority but it’s on the queue as a future improvement (meaning we’ll get to it at some point in the next decade :-))

  2. Austin
    11/24/2010

    Three things:

    1. The positive correlation between top rebounder and 2nd rebounder could be due to pace, since you’re using rebs/48. Given equally productive top-2 rebounders,

    2. Could you rerun the results of the study for defensive rebounds only? No one ever claims diminishing returns for offensive rebounds, so we might as well get to the pertinent variable.

    3. The most convincing studies I’ve seen for diminishing returns on defensive rebounding used lineup rebounding rates. Specifically, it was shown that if you added up the dreb% of a 5-man lineup, and compared it to the dreb% of that lineup while on the floor, it regressed significantly to the mean. Is there any way you could use that data from 82games on 5-man lineups to examine this specific hypothesis? Personally I find that to be the ideal case since there is a clear set of expected values (summed dreb%) to compare to actual values (lineup dreb%).

    Thanks as always for the fanservice and tackling the tough stuff, Arturo. I figure you’ve probably seen some of the stuff I said before, that’s why I threw three things at ya.

    -Austin

  3. Guy
    11/24/2010

    Boy, I confess I was very surprised to see so many top rebounders whose teammates were above average rebounders. Of course, I had said “rebounders,” not seasons, so I was talking about entire careers; you can get any kind of quirky result in a single season. But OK, let’s look at these seasons.

    I started at the top of your list looking at the guys with above-average teammates. First up, Jayson Williams 1996. His 20.74 Reb48 translates into an extra 342 “WP rebounds” for his team. His team had 3,853 boards, while 50% rebounding on 7236 opportunities would have given them 3,618, so the team was 235 rebounds above average. Not bad, but Williams’ teammates still fall short: 107 rebounds below average.

    So I looked at a few more:
    Tarpley 1988: +363 WP Rebounds. Team: +202. Teammates: -161.
    Walton 1978: +268 WP rebounds. Team +82. Teammates: -186.
    Fortson 2002: +372 WP rebounds. Team +138. Teammates: -234.

    None of these were panning out. Not one team had a rebound advantage as large as the individual player’s WP estimate. So I looked for someone whose teammates were rated very high on your table: Camby in 2007. Maybe he could pull it off. But sadly, no. Camby was +203 rebounds, but the team was just +46, so Camby’s teammates were -157. So at that point I gave up. Maybe there is someone on the list who passes the test, but clearly it can’t be very many.
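
The bookkeeping behind these five cases can be written out as a short sketch. It only reproduces the subtraction described above using the figures quoted in this comment; the function name is a hypothetical label, and the "WP rebounds" inputs are taken as given rather than recomputed.

```python
def teammates_vs_average(player_extra, team_extra):
    """Rebounds above (+) or below (-) average attributable to the teammates.

    player_extra: the player's rebounds above average ("WP rebounds").
    team_extra:   the whole team's rebounds above average.
    """
    return team_extra - player_extra


# Jayson Williams 1996: the team baseline is a 50% share of opportunities.
team_rebounds = 3853
opportunities = 7236
williams_team_extra = team_rebounds - 0.5 * opportunities  # 3853 - 3618 = 235

# (player WP rebounds above average, team rebounds above average)
cases = {
    "Williams 1996": (342, williams_team_extra),
    "Tarpley 1988": (363, 202),
    "Walton 1978": (268, 82),
    "Fortson 2002": (372, 138),
    "Camby 2007": (203, 46),
}

results = {
    name: teammates_vs_average(player_extra, team_extra)
    for name, (player_extra, team_extra) in cases.items()
}

for name, residual in results.items():
    print(f"{name}: teammates {residual:+.0f}")
```

In every one of the five cases the residual is negative, which is the pattern the comment is pointing at.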

    And let’s remember how incredibly easy a test this is. In a world where diminishing returns is a minor factor, there will be dozens of these players. Lots of great rebounders will play with average teammates, some will even have good rebounding teammates (and a huge team total). This is what we see with shooting efficiency: a great shooter at one position does not result in inefficient teammates. So we know what it looks like when production at one position is independent of the others. In that case, finding top performers with average teammates is easy. Yet with rebounds, it appears to be very hard to find even one.

    The regressions, honestly, are quite a hash. You can’t look at a team’s “top rebounder” because that’s relative to his teammates. Obviously, if a team’s best rebounder has a Reb48 of 8, there’s a limit on how many rebounds the rest of the team can possibly have. What you want to do is see how many rebounds a team got from its center, and compare that to the rest of the team. Then do the same for PF. The question is, “if I get an extra rebound from position/player X, how many extra rebounds will the team have?” I believe you will find very strong negative correlations, especially on defensive rebounds. And you will definitely find diminishing returns on Wins Produced — heck, Dr. Berri has even reported that finding.

    And let me note that I agree with Arturo when he says “I’m happy with the coefficients math gave me for rebounding.” The coefficients were derived from TEAM regressions, and I agree that the proper coefficient for a team rebound is 1. Where WP goes awry is applying the coefficients to individual players, a step that reflected only an untested assumption that still has never been validated.

    • Fred Bush
      11/24/2010

      Guy, all teammates of a rebounder are going to be “below average” by your methodology — guards don’t get as many rebounds as bigs, so once you take the big out of the equation, *and compare the rest of the team to other teams with their bigs*, of course you’re going to get a disparity. To support your argument, instead of looking at team averages, you would need to compare individual players to their counterparts at the same position, or remove the top rebounder from every team.

      • Guy
        11/24/2010

        Fred: sorry, I missed your comment until now. You are correct that failing to account for position would create the appearance of diminishing returns, even if they didn’t exist. However, in every case I am comparing a rebounder to the average at his position, so he is being compared to other big men. And the teammates are therefore being compared to the average at their respective positions as well. As best I can tell, the teammates of all these “good rebounders” are below-average rebounders for their respective positions (once we account for rebounding opportunities and pace). And by the way, if you made a list of the “worst rebounders” you would see the opposite effect: their teammates would be well above-average rebounders overall.

        • bduran
          11/24/2010

          Where can you find average rebounding by position? I did a quick search and didn’t find it. I’d be interested in how the average was calculated. It should be (total rebounds for the year collected by position)/(82*48). I was wondering if someone’s RB48 who played 12 min a game was weighted equally with someone who played more. This would bring the average down, assuming bench players are worse rebounders than starters.

          • EvanZ
            11/24/2010

            Hoopdata, basketballvalue, basketball-reference…

            • bduran
              11/25/2010

              Thanks, I was never able to find position average on basketball-reference, maybe I’m blind? Found it easily enough on hoopdata.

  4. some dude
    11/24/2010

    1. You can’t compare team rebounding rate to the best rebounder. What if that rebounder’s backup is almost as good or above average? You have to compare it to his 5 man unit’s rebounding teammates. Just look at the list you came up with. Jayson Williams only played 24 or so minutes per game in ’96, so that his teammates were above average doesn’t prove anything. Half the time he wasn’t in there to rebound.

    2. Taking out the 2 best rebounders doesn’t prove anything. Of course they’re going to have increasing returns. I’m confused as to why you think this is surprising. You expected diminishing returns!? This is what happened to Golden State last year. Diminishing returns has a threshold. When you take out the 2 best rebounders, assuming they play a lot of time together, you’re going to cross that threshold and screw yourself. You’re essentially regressing only on the extremes.

    3. You still have to compare the best rebounder to the team after he’s been replaced by someone else who is a worse rebounder to see if there’s diminishing returns (or vice versa).

    4. The WP48 stuff is intuitive. We’d expect it to be that way (Kobe does make Pau better!!!). That said, I wonder if the “everybody else” cancels each other out. What I mean is, role players on good to elite teams play better because of their top 3 players while role players on poor teams play worse because of their top 3 players and thus cancel each other out.

    Let’s say we looked at only the top 6 teams each year via point differential and regressed their top 3 vs the rest of the team. Would increasing returns be the result, now? I’d be curious.

    I applaud the work but fear OVB (omitted variable bias) is still at the heart of the rebounding stuff. That, and I don’t get the two-rebounder thing, since I don’t know how anything but increasing returns could be seen.

  5. some dude
    11/24/2010

    I checked up on Derrick Coleman in 1994 just cuz. Know who his direct backup was? Jayson Williams, with nearly the exact same individual rebound rate. So when Coleman sat, they brought in Jay-Will, who was over 14 RP48. Of course the team was above average! You’re looking at his backup who is also above average!

    I don’t doubt this is an issue throughout the list. And Guy pointed out the other issues.

  6. Rashad
    11/24/2010

    I think the problem is that the reason rebounds are so valuable in wins produced is more of as a signaling mechanism. I mean, a defensive rebound means that your opponent missed a shot, and then you gained possession. We usually think about the rebound measuring the gaining of possession, but unless the opponent missed that shot in the first place, the rebounding opportunity wouldn’t be there.

    Basically, the issue is horrendously complicated, because teams with good defense will tend to generate more defensive rebound opportunities (lower opp FG%). But at the same time, bad teams will generate more offensive rebound opportunities (poor own FG%).

    Then, teams that get fouled a lot while shooting will generate fewer rebound opportunities (no rebound off of a shooting foul).

    So in the end, looking at team rebound differential doesn’t tell you much, because you don’t know why. Basically, in terms of statistical analysis I’ve come to the conclusion that rebounds are good, we can’t really quantify exactly why, and Wins Produced gets close enough for government work that I’m happy with it overall.

  7. Rashad
    11/24/2010

    Now that I think about it a bit more, it should theoretically be possible to do a regression dis-aggregating the value of a rebound.

    I mean, you just add variables in to the regression, right? You could add in a measure of a team’s ability to create missed shots and see how that works out, and how it affects the value of rebounds in the regression. I guess this gets back to Berri’s attempts to include some measure of defense in the equation, and finding out that it didn’t change much.

    • 11/24/2010

      Rashad,

      We’ve actually done a lot of playing with the rebounding coefficients (I think between the prof. and me we must be at close to 40 posts, two books and multiple articles on the subject). There is a case for multicollinearity (i.e., multiple predictor variables in a multiple regression model that are highly correlated). Rebounding is that measure of a team’s ability to create missed shots, and all the actual WP data seems to say that the rebounding value is still properly assigned to the particular player. If it wasn’t, then WP48 numbers wouldn’t be as consistent for players when they move teams.
      Yes, the value of a rebound reflects more than the actual rebound (i.e., the defense played by the team to force the miss), but for the most part assigning the value to the player who got the rebound, while not right in every instance, seems to tend to the reality for a large enough sample size (law of large numbers again).

      As for dis-aggregating the value of a rebound, we now do oreb and dreb but ideally further categorization would help. We just need a game charting project !!! :-)

      • Guy
        11/24/2010

        ” If it wasn’t, then WP48 numbers wouldn’t be as consistent for players when they move teams.”

        I think it’s revealing that the arguments against diminishing returns usually take this form: “if there are diminishing returns, then X would happen, but we don’t see X.” The trick is that those arguing for diminishing returns never claimed “X” is true, and dispute that it logically follows from diminishing returns. There are many reasons player WP48 could correlate well from year to year despite diminishing returns. But that’s a distraction. Let’s just focus on the issue at hand. Where is the evidence that individual rebounds increase team rebounds on a roughly 1:1 basis? If that is true, the evidence will be everywhere and easy to see. We don’t need these intellectual bankshots of “if X then Y, etc.”

        And Rashad, you should know that there have been several studies of diminishing returns on rebounds. Evan Z linked to a good one on the previous post, which includes links to others. Every one finds huge diminishing returns for Drebs, and smaller ones for ORebs. Dr. Berri, AFAIK, has never studied the issue (he simply assumed no diminishing returns when he constructed WP, and now is stuck with that unfortunate position). So the consensus position based on research is that the effect is there, and there’s really no disagreement on that in the research community. But there are various estimates of the size of the effect.

        • ilikeflowers
          11/24/2010

          Where is your model? All this talk, time to put up.

        • 11/24/2010

          Guy,
          If I replace Player X (10 reb p48) with Player Y (11 reb p48) , give him the exact same minutes and hold every other variable in the universe constant then i’ll see exactly that. It’s an oversimplification of any situation. That said let me give you an example :-)

          San Antonio in 94-95 with Rodman for 1568 minutes, 823 boards: 3690 Treb for, 3320 against +370
          San Antonio in 95-96 with Will Perdue for 1396 minutes, 485 boards (-338): 3523 Treb for, 3582 against -59 (or a total difference of -429)

          Chicago in 94-95 with Perdue for 1592 min and Greg Foster for 299 (1891 min, 576 reb): 3400 Treb for, 3320 against +80
          Chicago in 95-96 with Rodman for 2088 min, 952 reb: 3658Treb for, 3117 against +541

          So replacing Rodman with Perdue in San Antonio gives me -338 on a player basis (and I get -429 for the team),
          and replacing Perdue/Foster with Rodman in Chicago gives me +376 on a player basis (and I get +461 for the team).

          Keep in mind that I only looked for one example and in both cases I found increasing returns (which confirms what I found for the full data set).
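
For anyone who wants to check the subtraction, here is a small sketch using only the totals quoted in this comment. The helper name is made up for illustration. These are raw season totals, so the comparison does not adjust for the different minutes played (1568 vs. 1396 in San Antonio, 2088 vs. 1891 in Chicago) or for changes in the rest of the roster. From the totals as quoted, the Chicago player-level swing works out to 952 - 576 = +376.

```python
def rebound_margin(rebounds_for, rebounds_against):
    """Team total rebounds minus opponent total rebounds for a season."""
    return rebounds_for - rebounds_against


# San Antonio: Rodman (94-95) replaced by Perdue (95-96).
sa_player_change = 485 - 823
sa_team_change = rebound_margin(3523, 3582) - rebound_margin(3690, 3320)

# Chicago: Perdue/Foster (94-95) replaced by Rodman (95-96).
chi_player_change = 952 - 576
chi_team_change = rebound_margin(3658, 3117) - rebound_margin(3400, 3320)

print(f"San Antonio: player {sa_player_change:+d}, team margin {sa_team_change:+d}")
print(f"Chicago:     player {chi_player_change:+d}, team margin {chi_team_change:+d}")
```

In both cases the change in team margin is larger in absolute value than the change in the individual totals, which is the sense in which this one example shows increasing returns.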

          • Guy
            11/24/2010

            “If I replace Player X (10 reb p48) with Player Y (11 reb p48) , give him the exact same minutes and hold every other variable in the universe constant then i’ll see exactly that.”

            No, you won’t see that. You will see team rebounds increase by something like .2 or .3 Reb48, not 1.0. Unless you mean holding all other player Reb48 constant. But of course that’s a tautology: the sum of player Reb48 must equal team Reb48. That’s not exactly in dispute. The question is: if one team has a 15.4 Reb48 center (+3), and another team has a 9.4 Reb48 center, how different will the two teams’ rebound totals be? That’s what we want to know. WP says the first team will have about 6 more rebounds per game. That’s ON AVERAGE, of course — obviously not true for every team. If there are no diminishing returns, then the gap should be more than 6 about half the time, less than 6 the other half, and about 6 on average. My argument is that the real gap is actually closer to 2 rebounds than 6.
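
The two competing predictions in this comment can be put side by side in a tiny sketch. The `pass_through` parameter is a made-up name for the share of an individual rebounding edge that shows up in the team total: 1.0 is the no-diminishing-returns reading, and 0.2 to 0.3 is the range this comment argues for. Neither value is estimated here.

```python
def expected_team_gap(center_a_reb48, center_b_reb48, pass_through):
    """Team rebound gap per 48 implied by the gap between two centers."""
    return (center_a_reb48 - center_b_reb48) * pass_through


# No diminishing returns: the full 6-rebound edge reaches the team total.
gap_full = expected_team_gap(15.4, 9.4, 1.0)

# Heavy diminishing returns: only 20-30% of the edge survives.
gap_low = expected_team_gap(15.4, 9.4, 0.2)
gap_high = expected_team_gap(15.4, 9.4, 0.3)

print(f"full pass-through: {gap_full:.1f} rebounds")
print(f"20-30% pass-through: {gap_low:.1f} to {gap_high:.1f} rebounds")
```

A regression of team rebounds on the center's Reb48 would estimate `pass_through` directly, which is the test being asked for.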

            Arturo: you have the data and the statistical tools to answer this easily. The only question is, are you willing to do it? You say you are willing to poke the bear (I guess that’s me?), but will you risk “poking-the-Berri?” :>)

            • greyline
              11/24/2010

              Actually Wages of Wins says that the first team will be better by .192 WP per game. It says nothing about how many more team rebounds the team will have.

              • Guy
                11/24/2010

                Greyline: So you are saying that each additional rebound by a team’s C should generate an extra .03 wins for the team. That’s a corollary of what I said. But if you go ahead and run that regression, you will find it does not have that effect. The payoff will be less than half of that.

                And really WP says both things. It says the first team is .180 higher in WP, BECAUSE it will get 6 more rebounds per game. The WOW authors are very clear that player rebounds create team rebounds. Remember, their regression showing rebounds are worth one point is a TEAM level regression — it’s not based on a regression using player stats. So the whole reason for valuing player rebounds at one point is because they are assumed to add an equal number of team rebounds.

          • Guy
            11/24/2010

            Just out of curiosity, I looked at Rodman’s impact on San Antonio and Chicago. The impact is indeed dramatic — but I think not in the way you expected. Five players had at least 1000 MP for SA in both 94-95 (with Rodman) and 95-96 (without). With Rodman, they averaged 6.37 Reb48. The next year, without Rodman, the same five players averaged 7.92, an incredible gain of 1.55 Reb48 per player. (Robinson went from 13.7 to 15.9; Person from 6.1 to 9.3). With four other players on the court, Rodman had apparently been taking about 6 rebounds per game from these teammates. (And that’s what he could do playing only 1568 minutes!).

            The next year he joined Chicago. Again, 5 players had at least 1000 MP that year and the year before. Before Rodman, they averaged 7.92 Reb48. With Rodman, they dropped to 6.67, a drop of 1.25 per player. (Pippen, for example, declined from 10.1 to 8.4.) So Rodman’s teammates had a loss of about 5 rebounds per 48 across 4 players.
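
The per-player and per-lineup numbers above follow from simple arithmetic, sketched here with the averages quoted in this comment. Multiplying by four assumes the four teammates on the floor each shift by the five-player average, which is a simplification.

```python
TEAMMATES_ON_COURT = 4  # players sharing the floor with the rebounder


def per_player_change(before, after):
    """Change in average Reb48 for the returning players."""
    return after - before


# San Antonio returning players: with Rodman (94-95) -> without him (95-96).
sa_change = per_player_change(6.37, 7.92)

# Chicago returning players: without Rodman (94-95) -> with him (95-96).
chi_change = per_player_change(7.92, 6.67)

sa_lineup_change = TEAMMATES_ON_COURT * sa_change
chi_lineup_change = TEAMMATES_ON_COURT * chi_change

print(f"San Antonio: {sa_change:+.2f} per player, {sa_lineup_change:+.1f} per 48 across four teammates")
print(f"Chicago:     {chi_change:+.2f} per player, {chi_lineup_change:+.1f} per 48 across four teammates")
```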

            Thanks for pointing me to such a fabulous example of diminishing returns, Arturo. Are you ready to come over to the Dark Side? :>)

          • some dude
            11/24/2010

            Your math is weird, Arturo.

            When I compare Rodman and Perdue based against the 50% mark (average), Rodman’s extra 400 or so rebounds only nets like +200 team rebounds. Probably less if we adjust for the fact that the 2nd year had more shot attempts and more minutes. 130-150 range for an extra 360 boards individually.

            I also have all of Rodman’s teammates’ rebounds in SA at 7.6 RP48 and Perdue’s teammates at 7.97 RP48. Of course this includes minutes Rodman and Perdue were not on the floor and still shows diminishing returns.

      • Westy
        12/2/2010

        A basic question here. You note, “If it wasn’t, then WP48 numbers wouldn’t be as consistent for players when they move teams.” Why is this a problem? You’ve noted that rebounds are much more consistent than, for instance, shooting efficiency from year to year. But why does this need to be the case? If rebounds were properly credited to, as you note, a full team effort, yes the consistency would potentially be lower from year to year for an individual player, but could it be that was more reflective of their value as a part of the team defense? If it’s understood that a player’s contribution to their WP48 via shooting can fluctuate, why couldn’t it happen via their rebounding?

  8. Cal
    11/24/2010

    So, let me just check that I’m interpreting the last scatter plot correctly:

    1) There’s no real correlation between the WP48 of the best player and the WP48 of his teammates.
    2) Same goes for 2nd best player.
    3) There is a positive correlation between WP48 of the third best player and the WP48 of his teammates.

    I think all we’re picking up here is that some teams are deeper in their talent than others and the WP48 of the third best player is a reasonable indicator of how deep a team is (notice the density of observations gets thinner as everyone else’s WP48 gets higher). I’d guess that if you did the 4th or 5th best player the correlation would be even better.

    Also, there’s an unusual observation at the top of the last scatter plot, way away from everyone else. What team is that? They must be ridiculously stacked.

    • 11/24/2010

      Cal,

      I screwed the placement on the equation on those.

      The second best player’s WP48 improves with the best player WP48 as does the 3rd best player’s WP48. Everyone else remains about the same.

      • Cal
        11/24/2010

        Oh, okay. Thanks for clarifying.

        I doubt there is causation here; it seems more likely to me that superstars attract more stars. Maybe there’s a good lesson for GMs to lock down their best players while they can. You know, an ‘if you sign him, they will come’ kind of effect. Might be something to look into.

        Either way, we know the better the first player on a team is, the better we expect the second and third will be. Sounds good to me.

  9. todd2
    11/24/2010

    How about a rebounding percentage? Rebounders benefit more if their teammates are better defenders.

  10. Guy
    11/24/2010

    Methodologically, it is simply incorrect to use “best on team” (whether in WP or rebounds) as your independent variable here. This variable must not be determined in any way by the dependent variable (hence the label “independent”). But that’s not the case in Arturo’s regressions. The “best WP48” is also, by definition, a cap on the WP48 of all other players. If the best player is .150, then we already know a very powerful fact about the other 10 guys: every one is .149 or lower! But if the best player is .310 WP48, he could have teammates as good as .309, a much higher “cap.” So this will have a very powerful tendency to create a positive correlation between best player and rest-of-players, offsetting any diminishing returns that may exist. I have never seen anyone use within-team ranking as an independent variable, and for good reason.
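
The "cap" effect described here can be illustrated with a small simulation. This is a sketch under stated assumptions, not a re-run of Arturo's regressions: every player's WP48 is drawn independently from the same normal distribution (mean .100, standard deviation .080, both made-up values), so by construction no player affects any teammate. The roster size and number of teams are also arbitrary.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def best_vs_rest_correlation(num_teams=5000, roster_size=10, seed=0):
    """Correlation between each team's best WP48 and the mean of the rest."""
    rng = random.Random(seed)
    best, rest = [], []
    for _ in range(num_teams):
        # Independent draws: no teammate effects of any kind are simulated.
        roster = sorted(rng.gauss(0.100, 0.080) for _ in range(roster_size))
        best.append(roster[-1])
        rest.append(sum(roster[:-1]) / (roster_size - 1))
    return pearson(best, rest)


correlation = best_vs_rest_correlation()
print(f"best vs. rest correlation with zero true effect: {correlation:.2f}")
```

Even with zero interaction between players, picking the "best" by within-team rank produces a clearly positive correlation, so a positive slope in a best-vs-rest plot cannot by itself be read as evidence of increasing returns.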

    And as it happens, Professor Berri reports that for every .100 increase in teammates’ WP48, a player’s WP48 will be reduced by .030, so we already know there are diminishing returns. It would be very helpful to know the reverse: for every .100 WP48 a player delivers, how much does that decrease WP48 for his teammates? Maybe Arturo could run that? That would tell us, on average, how much WP48 overstates an above-average WP48 player’s contribution.

  11. Anon
    11/24/2010

    Why is Ray Allen in your picture at the end and not Rajon Rondo?

  12. Ken
    11/24/2010

    Arturo,

    I have several thoughts about the rebounding debate here.

    First, it looks to me like the R^2 for all of these models is close to zero. Doesn’t that mean that all of the regressions in the post do a very bad job of explaining the data? You make that point yourself after the first chart, but don’t repeat it regarding the subsequent models where, in many cases, the R^2 is worse.

    Also, why wouldn’t Guy’s analysis of Rodman’s impact on his teammates in SA and Chicago be the right approach? It’s not inconsistent with the WP findings to say that a good rebounder would pull rebounds away from his teammates, leaving them all within around 10-20% of where they were in prior years. This would lead to both diminishing returns and fairly stable statistics over time. One explanation would be that rebounding is a skill, just like we’re treating it, but that you’re in competition with everyone else on the court for a rebound, so some of that skill is being used in a way that, while good for your stats, is neutral for your team. Having a rebounder for a teammate means that your competition just got a bit stiffer.

    Last, from a theoretical perspective, a basketball game has a finite number of rebounds, just like there are a finite number of shots. Even if rebounds is a good statistic to look at, wouldn’t it make sense for “rebounds/chance at rebounds” or something like that to be a BETTER statistic? It would be similar to accounting for missed shots in WP. Obviously rebounds correlate with winning much better than the number of made shots does, so I would guess that “missed rebounds” is far less important (smaller coefficient in the regression) than “missed shots.” But I think it would still make more sense to look at efficiency instead of a counting stat with rebounds just as we do with shots. (I guess the same would go for assists, turnovers, etc.)

    I realize that rebounding efficiency is not something for which there is data, so we obviously couldn’t use it in our present models. But I don’t think I fully understand your rationale for rejecting the hypothesis that some kind of rebounding efficiency statistic would be slightly better at valuing player performance.

    Does your data set have enough information to figure out how many rebounds a player didn’t get while on the court? If so, we could test whether putting that notion into the wins produced regression would improve it or not.

    • 11/24/2010

      Ken,
      Typically a low R^2 is indicative of low correlation. In this case, given the huge number of uncontrolled variables, the fact that the WP48 of the best player explains 10% of the variation of the second best is interesting.

  13. Guy
    11/24/2010

    Good news: it turns out Professor Berri has already measured diminishing returns for WP. So at least we don’t need to argue about that issue: he has already reported very large diminishing returns for WP. He finds that for each additional WP by teammates, a player’s WP declines by .3. Let’s consider the impact of replacing an average player (.100) on a team with a superstar (.300). For teammate #1 of this superstar, the WP48 of his four teammates has increased from .100 to .150. That means teammate #1 sees his WP48 fall from .100 to .085. Doesn’t seem that huge, right? But the superstar has this same effect on EVERY teammate. So he has reduced the total WP48 of four teammates by 4 * .015 = .060. In other words, we gained .200 by replacing an average player with the superstar, but at the cost of other players declining .060. So fully 30% of the WP48-estimated value of great players (on average) is really an illusion — it’s offset by losses in other players.
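
The 30% figure can be traced step by step in a short sketch. It takes this comment's reading of the coefficient (a .3 decline per unit of teammate WP48) at face value and assumes exactly four teammates share the floor with the superstar; the function and parameter names are made up for illustration.

```python
def offset_share(star_wp48, replaced_wp48, coefficient=0.3, teammates_on_court=4):
    """Share of a star's WP48 gain that is offset by teammates' declines."""
    gain = star_wp48 - replaced_wp48
    # From any one teammate's view, the average WP48 of his four
    # floor-mates rises by gain / 4 (only one of the four changed).
    teammate_avg_rise = gain / teammates_on_court
    loss_per_teammate = coefficient * teammate_avg_rise
    total_loss = teammates_on_court * loss_per_teammate
    return total_loss / gain


share = offset_share(0.300, 0.100)
print(f"offset share: {share:.0%}")
```

Under these assumptions the division by four and the multiplication by four cancel, so the offset share always equals the coefficient itself, whatever the size of the gap between the two players.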

    And this is just the overall average. I’m certain that diminishing returns are much lower for players whose WP48 results from being a high-volume, efficient shooter. But diminishing returns will be much higher, probably 50% or more, for players whose WP48 is largely based on rebounding. That would actually be a great research project for Arturo: estimating the diminishing returns coefficients for different types of players.

    Prof. Berri’s interpretation of his finding seems to be that good players literally make their teammates worse. More likely, what it reveals is that WP48 is simply overestimating the true value of above-average players by 30%. But either way, a .300 WP48 player is only delivering about 70% as many additional wins as WP currently estimates. Opinions may differ on whether this effect is “small” or “large”, but being off by 30% on average seems pretty large to me.
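As a check on the arithmetic in the comment above, here is a minimal Python sketch. It assumes a simple linear model in which each player's WP48 shifts by the coefficient times the change in the average WP48 of his four lineup-mates. The function name `teammate_offset` and its parameters are illustrative and do not come from Prof. Berri's study.

```python
def teammate_offset(coef, old_wp48, new_wp48, n_teammates=4):
    """Return (per-teammate change, total teammate change, share of gain offset).

    Assumption: a teammate's WP48 moves by coef times the change in the
    average WP48 of the other players on the floor with him.
    """
    gain = new_wp48 - old_wp48
    # Replacing one of a teammate's four lineup-mates moves their average by gain / 4.
    avg_shift = gain / n_teammates
    per_teammate = coef * avg_shift
    total = n_teammates * per_teammate
    share_offset = -total / gain
    return per_teammate, total, share_offset


per_player, total, share = teammate_offset(coef=-0.3, old_wp48=0.100, new_wp48=0.300)
print(f"per teammate: {per_player:+.3f}, all four: {total:+.3f}, offset: {share:.0%}")
```

Under this assumption the offset share always equals the size of the coefficient (30% for a coefficient of -0.3), whatever the size of the upgrade.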

    • 11/24/2010

      Guy, this is wrong.

      The post in question is: http://dberri.wordpress.com/2010/10/15/al-jefferson-and-diminishing-returns/
      The finding is that if player A goes from Team 1 to Team 2, his WP48 will change according to the following equation:
      .300 * (WP48 of Team 1 - WP48 of Team 2)

      So if you go from a bad team (.050) to a good one (.150), a player would be expected to improve by .030 WP48 on average, and vice versa (get worse going from a good team to a bad one). This actually lines up with my last chart and practical reality (you’re better off playing with better teammates).

      • Guy
        11/24/2010

        Arturo, I’m afraid you’ve got it backwards. The coefficient is negative .3. (It wouldn’t be “diminishing returns” otherwise, right?) Berri says this:

        “This study – across 30 years of data – indicated that teammate WP48 had a statistically significant and negative impact on player performance. The coefficient on this factor was -0.300. And this tells us that the Jefferson’s WP48 should decline by 0.025 as he moves from Minnesota to Utah [-0.300 * (0.109 – 0.025)].”

        So as a player’s WP48 improves, his teammates lose about 30% of that gain as their WP48 declines. (And vice versa.)

      • 11/25/2010

        Yes,
        It’s backward. It’s -.300. Still, even the most extreme shifts (say .050 to .150) only lead to a .030 WP48 shift (or about 1.25 wins per 2,000 minutes).
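The numbers in this exchange can be reproduced with a short Python sketch. It assumes the linear form quoted from Prof. Berri's post (coefficient times the change in teammate WP48) and the usual conversion of WP48 to wins (WP48 times minutes, divided by 48). The function names are illustrative.

```python
def wp48_change(coef, teammates_before, teammates_after):
    """Expected change in a player's WP48 when his teammates' WP48 changes."""
    return coef * (teammates_after - teammates_before)


def wins_from_wp48(wp48, minutes):
    """Convert a per-48-minute rate into wins over the given minutes."""
    return wp48 * minutes / 48


# Figures from the quoted passage: -0.300 * (0.109 - 0.025)
jefferson = wp48_change(-0.300, 0.025, 0.109)
# The "extreme" case in the reply: teammates go from .050 to .150
extreme = wp48_change(-0.300, 0.050, 0.150)

print(round(jefferson, 4))
print(round(extreme, 3), round(wins_from_wp48(extreme, 2000), 2))
```

The first value matches the roughly 0.025 decline in the quote; the second pair matches the .030 shift and the 1.25 wins per 2,000 minutes, both with a negative sign.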

        • Guy
          11/25/2010

          Yes, it’s a small change for any one teammate, but the change in production impacts four other players, so the cumulative effect is quite large. If my team replaces a .100 WP48 PF with a .300 PF, the other four players on the court EACH decline by .015. That’s a total loss to the team of .060. At 3,000 minutes, I’ve added 12.5 wins at PF, but subtracted 3.75 wins from other players. So the team only gains 8.75 wins, or 70% of the apparent gain at PF. Don’t you think that’s a very large effect? If your productivity declined 30%, wouldn’t your boss notice?

          Now, for the record, I don’t think great players really reduce teammate performance this much. It’s a little of that, but mostly it’s WP48 overstating the real productivity of many players. But either way, the result is the same: each additional Win Produced generated by a player only produces .7 actual wins for his team.

          I think part of the problem with the discussion of diminishing returns is Prof. Berri examines it always in terms of the effect that “rest of team” has on an individual player’s WP48. This allows him to say the effect is “small.” But it’s equally true, and I think a more helpful way to think about it, that the individual player’s WP48 affects his teammates’. And the story is the same with rebounds: the “great” and “bad” rebounders are pretty consistent each year in their reb48, but they are having a big impact on their teammates’ rebound totals. Since that effect is spread around 10 other players, and a lot of players don’t change teams, the y-t-y correlation for players is quite high. But watch what happens when these guys change teams: as you saw with Rodman, the effect is quite large.

  14. Man of Steele
    11/24/2010

    Funny thing for those wanting to use circumstantial evidence: you might want to run the numbers for the Warriors this year. Both Lee and Biedrins seem to be doing well (although I don’t have the numbers in front of me).

    • some dude
      11/24/2010

      Lee’s RB48 is down from 15.6 last season to 15.1 this season playing next to Biedrins.

      The gap is actually bigger when you adjust for the fact that Lee is seeing more missed shots in Golden State than New York to the tune of 1.54 more per 48 minutes.

      Last season Biedrins had an RB48 of 16.2. This season it’s 15.96, and don’t forget Lee missed time with the same opportunities.

      Again, indication of diminishing returns, but less so because let’s face it, the other 3 positions can’t rebound for the life of em. So even in the extreme example of Golden State, to this point both Biedrins and Lee have shown a lower RB48.

      Of course as a unit they’re rebounding better. They were the worst rebounding team of all-time last year. Nowhere to go but up landing Lee. But he isn’t adding as many rebounds as he did for New York.
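The opportunity adjustment described in this comment can be sketched in Python as follows. The RB48 figures and the 1.54 extra missed shots per 48 minutes come from the comment; the baseline number of missed shots per 48 minutes is not given in the thread, so `BASELINE_MISSES` is a hypothetical placeholder, and the function names are illustrative.

```python
def rebounds_per_miss(rb48, misses_per_48):
    """Rebounds collected per available missed shot."""
    return rb48 / misses_per_48


def opportunity_adjusted_gap(rb48_before, rb48_after, misses_before, extra_misses):
    """Drop in RB48 after rescaling the later season to the earlier season's opportunities."""
    rate_after = rebounds_per_miss(rb48_after, misses_before + extra_misses)
    # What the later RB48 would have been with the earlier number of misses.
    adjusted_after = rate_after * misses_before
    return rb48_before - adjusted_after


BASELINE_MISSES = 85.0  # hypothetical value, not taken from the thread

raw_gap = 15.6 - 15.1
adjusted_gap = opportunity_adjusted_gap(15.6, 15.1, BASELINE_MISSES, 1.54)
print(f"raw gap: {raw_gap:.2f}, opportunity-adjusted gap: {adjusted_gap:.2f}")
```

Whatever positive baseline is plugged in, more available misses combined with a lower RB48 makes the adjusted gap larger than the raw one, which is the point being made above.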

  15. 11/26/2010

    This is a great discussion. Kudos to everyone for the thoughtful debate. I am curious: is there no data for actual rebounding rate (as measured by available rebounds by lineup)? Seems easy enough to get that data from the play-by-play from Ryan Parker on basketballgeek.

    If that is the case, it seems like looking at the rebounding rate by availability would be a much better statistic to look at for diminishing returns.

  16. Italian Stallion
    11/29/2010

    I was born with a strong aptitude for mathematics, but unfortunately I never pursued it. So I have a strong appreciation for what you guys are attempting to do with math. However, I think sometimes mathematically oriented people get so lost in their numbers they fail to see a reality staring them right in their face.

    I watch about 300 basketball games every season (including every Knicks game).

    I can’t count how many times I’ve seen a shot go up and watched 2 (or more) defenders standing under the basket in position to get the rebound. In almost all cases, one defers to the other (on the rare occasions they don’t, bad things can happen). If they all deferred equally it would all net out to zero. But they don’t. Players usually defer to the best or most aggressive rebounder. So, of the rebounds that were available to multiple players, he gets more than the others.

    For years I watched other Knicks stand under the basket and defer to David Lee. If Mini Me was playing C for the Knicks, other Knicks would have gotten some of those rebounds.

    So what did Lee add?

    He added plenty, but certainly not as much as he actually rebounded.

    To me, this couldn’t be more obvious.

    If the numbers suggest otherwise, then IMO there’s got to be something about the study that is flawed even if I lack the mathematical and other insights required to determine what it is.

    The real problem is that this kind of thing does not apply equally to all players. There is NOT some clean uniform formula that will tell you how many rebounds a player actually added based on how many he got, the position he plays etc…

    Until every possible detail is tracked, we won’t have that information.

    Until such time as we do, stats have to be used as one tool in the analysis along with common sense and game watching. And the answers we get will remain gray.

    • Guy
      11/29/2010

      I agree 95% with this, IS. My only amendment (friendly?) would be to this statement: “The real problem is that this kind of thing does not apply equally to all players. There is NOT some clean uniform formula that will tell you how many rebounds a player actually added … Until every possible detail is tracked, we won’t have that information.” This is right, but in the absence of such information you still have to make the best assessment of a player that you can. We CAN measure the average gain in rebounds at the team level for a player with OReb% = X and DReb% = Y. That will be a much more accurate starting point than using Reb48 and pretending every extra player rebound equals an extra team rebound. Then, apply whatever game-watching and common sense adjustment you want to. But use the correct objective foundation before adding your subjective data.

      • Italian Stallion
        11/29/2010

        Agreed.

        That would be an improvement.

  17. […] think of this like a +/- for rebounding. The focus of most discussions/debates on rebounding (see here or here for recent examples) tends to be on forwards and centers, simply because they get the most […]

  18. 12/13/2010

    In addition to exhibiting fairly low r-squared values, all of your regressions show pretty small coefficients, suggesting that if there really is any effect, it is very weak.

    • 12/13/2010

      Will,
      The big problem with this kinda stuff is that I really can’t properly control for significant variables (like roster composition). That said, the correlations are weaksauce (which is what I would expect if, as I believe, player performance is for the most part a function of that player). Still, the best correlation is between the two top players.
