The Hit Parade: More on the Relationship between Hitting and Winning

This week’s Star column looked at the relationship between hitting and winning. Several articles have already been written on this topic in the advanced stats community, but the analyses generally seem to fall short in what seems like a rather obvious way. This piece sets out to correct that oversight.

This week, I need to mention that I am indebted to Pierre Chausse, another of my colleagues here at the University of Waterloo. Pierre is an econometric theorist and hails from La Belle Province, so he has really entered enemy territory. I am sympathetic to Pierre’s plight, having spent several years out in Vancouver myself, but I think by and large, even Pierre would agree that it hasn’t been too bad here with regard to hockey. After all, he can still get RDS. And of course, being from Quebec, Pierre can also flat out play hockey.

I would also like to mention that a key motivation for this piece was a fairly recent post-game interview given by Dallas Eakins. It is easily one of my favourite moments from this season. The interview occurred after the Oilers had lost a 2-1 game to the Canucks, and Eakins was asked whether he thought his team had hit the Canucks enough. It is worth noting that the Oilers had outhit the Canucks 28-13.

Eakins’ response was a thing of beauty. Pretty much right away, he pointed out that the number of hits a team has is as much of an indication of the things they did wrong as the things they did right. To quote Eakins, “do you know what the perfect game is? The perfect game is no hits. Do you know why that is? Because you have the puck. You don’t have to hit anybody, you have the puck.”

Eakins’ response really encapsulates something that advanced stats guys had noticed when looking at the data. To deliver a hit, your team can’t have the puck; the other team must. Delivering more hits than the other team therefore generally indicates that you had the puck less than they did. Since good teams generally have good possession stats, what you end up seeing in the data is either no pattern between hitting and winning, or perhaps even that good teams hit less.

At this point, what seemed rather obvious to Pierre and me was that you need to look at the correlation between hitting and winning when the effect of possession is removed (the term used in statistical analysis is “controlling for possession”). I’m not aware of a study that has done this.

Essentially, you treat the probability of winning a game as a function of both hits and possession (and anything else you may want to look at) and let the data tell you what that function looks like. You can then use the estimated function to figure out what effect hitting has on the probability of winning when possession is held fixed.
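For readers who want to see the idea in action, here is a minimal sketch of what "controlling for possession" does. Everything in it is an assumption made for illustration: the data are synthetic, the coefficients are made up, and a logit is used only because it is a common way to model a win/loss outcome. This post does not spell out the functional form we actually estimated.

```python
import numpy as np

def fit_logit(X, y, iters=25):
    """Fit a logistic regression by Newton-Raphson. X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))        # predicted win probabilities
        grad = X.T @ (y - p)                          # gradient of the log-likelihood
        hess = X.T @ (X * (p * (1.0 - p))[:, None])   # negative Hessian
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Synthetic games (NOT real NHL data); all numbers below are invented.
rng = np.random.default_rng(0)
n = 20000
poss = rng.normal(0.0, 5.0, n)                      # possession share, centred at zero
hit_diff = -1.0 * poss + rng.normal(0.0, 8.0, n)    # more possession means fewer hits
true_logit = 0.10 * poss + 0.03 * hit_diff          # assumed: both help you win
win = (rng.random(n) < 1.0 / (1.0 + np.exp(-true_logit))).astype(float)

ones = np.ones(n)
b_naive = fit_logit(np.column_stack([ones, hit_diff]), win)        # hits only
b_ctrl = fit_logit(np.column_stack([ones, hit_diff, poss]), win)   # hits and possession

print("hit coefficient, ignoring possession:       ", round(b_naive[1], 4))
print("hit coefficient, controlling for possession:", round(b_ctrl[1], 4))
```

In this made-up world, hitting genuinely helps, yet the regression that leaves out possession should show a much weaker hit coefficient than the one that includes it, because hits are standing in for a lack of the puck.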

One of the ways we looked at the data was mentioned in The Star – we looked at every game from 2007-08 to 2012-13 and considered a single function that would fit that data best. I should mention that, when it came to the measure of possession, we used both Fenwick and Corsi (specifically, Fenwick % and Corsi %). The results were identical. I should also mention that the effect we found was significant at the 1% level – meaning that we can feel very confident that this is not the product of random chance.

Another thing we did was to look at functions that are team-specific. That is, we ran 30 regressions – one for every team – and estimated 30 functions. This, I thought, was also quite interesting. We found a positive relationship between hitting and winning for every team, although the magnitude varied substantially across teams. In these regressions, we considered not hit differential, but hit percentage – the percentage of total hits in a game that were delivered by the team in question.
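As a rough illustration of the team-by-team approach, the sketch below computes hit percentage and then fits one regression per team. The two teams, their data and the slopes are all hypothetical, and a linear probability model (ordinary least squares on a 0/1 win indicator) is swapped in purely to keep the example short. It should not be read as the specification we used.

```python
import numpy as np

def hit_percentage(hits_for, hits_against):
    """Percentage of all hits in a game delivered by the team in question."""
    return 100.0 * hits_for / (hits_for + hits_against)

# The Oilers-Canucks game mentioned earlier: Edmonton outhit Vancouver 28-13.
oilers_hit_pct = hit_percentage(28, 13)
print(f"Oilers hit percentage: {oilers_hit_pct:.1f}%")

def fit_team(hit_pct, poss_pct, win):
    """OLS of win (0/1) on an intercept, hit % and possession %. Returns the coefficients."""
    X = np.column_stack([np.ones(len(win)), hit_pct, poss_pct])
    coef, *_ = np.linalg.lstsq(X, win, rcond=None)
    return coef

# Synthetic games for two hypothetical teams with different assumed returns to hitting.
rng = np.random.default_rng(1)
games = {}
for team, slope in [("Team A", 0.012), ("Team B", 0.004)]:
    n = 10000
    poss = rng.normal(50.0, 5.0, n)
    hits = np.clip(50.0 - (poss - 50.0) + rng.normal(0.0, 8.0, n), 1.0, 99.0)
    p_win = np.clip(0.5 + 0.02 * (poss - 50.0) + slope * (hits - 50.0), 0.01, 0.99)
    win = (rng.random(n) < p_win).astype(float)
    games[team] = (hits, poss, win)

# One regression per team; keep only the coefficient on hit percentage.
effects = {team: fit_team(*data)[1] for team, data in games.items()}
for team, b in effects.items():
    print(f"{team}: {100 * b:.2f} percentage points of win probability per point of hit %")
```

With real data you would loop over all 30 teams in the same way, feeding each team's own games into the fit.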

With this measure of hits, the effect we found is in terms of raising the hit percentage by one percentage point – so going from a hit percentage of 39% to 40%, for example. With this measure, we found that the teams who benefitted the most from hitting were Los Angeles, Phoenix, Anaheim, and Chicago. For LA, an additional percentage point in the hit percentage translated to a bump up in the probability of winning of 1.39 percentage points, while Phoenix, Anaheim and Chicago came in at 1.29, 1.25 and 1.21, respectively.
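To make "percentage points of win probability per percentage point of hit percentage" concrete, here is a small sketch of how such a number could be read off a logit coefficient. The logit form and the coefficient value are assumptions for illustration only. The coefficient is a hypothetical one picked so that the effect at a 50/50 game lands near the 1.39 figure quoted for LA; it is not an estimate from our data.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def win_prob_change(beta_hit, baseline_prob, delta_points=1.0):
    """Change in win probability (in percentage points) when hit % rises by
    delta_points, everything else held fixed, under an assumed logit model."""
    z0 = math.log(baseline_prob / (1.0 - baseline_prob))   # log-odds at the baseline
    return 100.0 * (sigmoid(z0 + beta_hit * delta_points) - baseline_prob)

beta = 0.0556  # hypothetical coefficient, not an estimate from the post

at_even = win_prob_change(beta, 0.50)       # game that starts as a coin flip
at_favoured = win_prob_change(beta, 0.80)   # game the team was already likely to win

print(f"effect at 50% baseline: {at_even:.2f} percentage points")
print(f"effect at 80% baseline: {at_favoured:.2f} percentage points")
```

Under a model like this, the same one-point rise in hit percentage moves the win probability less when the team is already a heavy favourite, which is one reason a single number per team is best read as an average effect.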

At first, I wondered if the differences in returns to hitting were being driven by differences in the teams’ actual hitting in the data. To that end, I checked to see where these teams lay in the spectrum. What I found was that, in our sample, Chicago had the best hit percentage, Phoenix and LA were middle of the pack, and Anaheim was 4th last. So all over the place.

The teams that had the lowest effect from hitting were the Islanders (0.25), Edmonton (0.34) and Vancouver (0.42). In terms of hit percentages, Vancouver was 4th best, Edmonton was middle of the pack, and the Islanders were bottom 10. So I feel pretty confident that these variations aren’t being driven by something inherent in the measure of hit percentage. I should also mention that these estimates were significant at the 5% level for all teams (and the vast majority at the 1% level) except for Vancouver. Leave it to VanCity to be different.

However, it is interesting to note that, in our sample, Chicago and LA both won Stanley Cups, while Anaheim won the year just before our sample begins. Edmonton and the Islanders were pretty bad during this time, but Vancouver was certainly very good.

Given that these teams changed not only their players but also their coaches over this time span, I’m not sure what to make of the differences between teams. But the fact that all teams had a positive return to hitting does seem to reinforce the effect we found when looking at the league as a whole.

At the end of the day, I don’t think there’s anything surprising about the fact that hitting does in fact correlate with winning. The ability to hit well, especially within a defensive system, can pay dividends that are noticeable while watching the game. But the fact that hitting and possession are negatively correlated certainly means that the relationship takes a little more work to uncover.

5 Comments

  1. March 28, 2014    

    Any chance you could share the regression equation?

    • Phil Curry
      March 28, 2014    

      Sure. I'll work on a new post that goes into the details about the regression. I should be able to get it up this afternoon (Friday).

      • Phil Curry
        March 28, 2014    

        Okay, I have a post ready to go - hopefully it'll be up soon. I hope this is what you were looking for.

        And thanks, by the way. I also went and checked out your site. I wasn't aware of it before. You've got some really good stuff there!

  2. james
    March 28, 2014    

    Echo above on regression equations. Ideally you'd be showing figures/tables, regression coefficients, p-values. Let's see the good stuff!

  3. Adam C
    March 28, 2014    

    The recording of hits in the NHL is notoriously inconsistent from rink to rink. Did you control for this by restricting your analysis to home or road games only?
