Leveling the Playing Field: A Look at Parity in the NHL

This week’s column in The Star looked at parity in the NHL. Parity refers to the extent to which teams are equal in terms of talent and ability. The specific form of parity addressed was the extent to which teams are evenly matched in a given season. (Sports economists also look at measures of parity that examine the extent to which some teams are consistently at the top of the standings while others are consistently at the bottom, but that’s not the focus of this column.)

When teams are evenly matched, we should expect the standings at the end of the season to be more compressed than when they are not. The dispersion in winning percentage is therefore a good place to start in terms of measuring parity.

The usual measure of dispersion is the standard deviation. The standard deviation gives a measure of how much teams differ from the average winning percentage (which must be .500). Specifically, the formula for the standard deviation is

\sigma = \sqrt{\frac{\sum \left( \textrm{Win Pct} - .500 \right)^2}{N}}

where N refers to the number of teams, and the summation is over those N teams.

If we were just going to look at recent NHL seasons, we could just use the standard deviation as our measure of parity. Unfortunately, this measure isn’t particularly useful when we want to look at seasons with differing numbers of games (either because we’re making comparisons across leagues or across eras).

The intuition for why this is the case is (hopefully) captured by the following example.

Consider a league in which 4 teams play 16 games each, and another league in which 4 teams play 160 games each. In the first league, the final standings are as follows:

Rank  Team     Wins  Losses
1     Tequila  11    5
2     Bourbon  9     7
3     Vodka    8     8
4     Wine     4     12


These standings suggest that Tequila was the strongest, though not by much over Bourbon and Vodka, and that there was a bit of a gap from there down to Wine. On the whole, the difference in potency is noticeable, but not striking.

Now consider the final standings for the second league that played 160 games each.

Rank  Team              Wins  Losses
1     Creemore          110   50
2     Okanagan Springs  90    70
3     Block 3           80    80
4     Blue              40    120

Based on these standings, I think it would be fair to say that Creemore was the best team, and actually quite dominant; that Okanagan Springs and Block 3 were pretty good, though the difference between them was fairly noticeable; and that Blue just wasn’t very good. In this case, I think one would be inclined to say that the gap between first and last is much bigger than in the first standings above.

However, the distribution of winning percentages is the same in both standings.
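To make that concrete, here is a minimal sketch that applies the standard deviation formula above to both sets of standings. The function name `win_pct_std` and the variable names are my own, chosen just for illustration.

```python
import math

def win_pct_std(records):
    """Standard deviation of winning percentage around .500.

    `records` is a list of (wins, losses) pairs, one per team.
    """
    pcts = [w / (w + l) for w, l in records]
    return math.sqrt(sum((p - 0.5) ** 2 for p in pcts) / len(pcts))

# Tequila, Bourbon, Vodka, Wine
league_16 = [(11, 5), (9, 7), (8, 8), (4, 12)]
# Creemore, Okanagan Springs, Block 3, Blue
league_160 = [(110, 50), (90, 70), (80, 80), (40, 120)]

sigma_16 = win_pct_std(league_16)
sigma_160 = win_pct_std(league_160)
print(round(sigma_16, 4), round(sigma_160, 4))  # both about 0.1593
```

Because the winning percentages are identical team by team, the standard deviation alone cannot tell the two leagues apart.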

The only difference between the two cases is that Creemore had to maintain their league leading winning percentage for 10 times as many games as Tequila did. The probability that Creemore could achieve their record through luck is considerably less than the probability that Tequila could. And therein lies the problem.

The solution, then, is to consider what would happen if all teams were exactly equal and each team had a 50-50 chance of winning each game. In this case, the standings would essentially be determined by a series of coin flips, with the coin being flipped 10 times more often in the second league.

When it comes to coin flipping, the distribution of heads (wins) is given by the binomial distribution. The more you flip the coin, the tighter the distribution of the proportion of heads becomes (the lower its standard deviation). For a binomial distribution with a probability of winning equal to ½, the standard deviation of the winning percentage is given by

\sigma_I = \frac{0.5}{\sqrt{G}}

where G is the number of coin flips, or games played in this case.
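As a sanity check on this formula, the sketch below computes the standard deviation of winning percentage directly from the binomial probabilities and compares it with 0.5 divided by the square root of G. The helper name `coin_flip_std` is hypothetical, and the check assumes nothing beyond a fair coin flipped G times.

```python
import math

def coin_flip_std(games):
    """Exact standard deviation of winning percentage for a team that
    wins each of `games` games with probability 1/2."""
    total = 2 ** games  # number of equally likely win/loss sequences
    variance = sum(
        # P(k wins) times the squared distance of k/games from .500
        math.comb(games, k) / total * (k / games - 0.5) ** 2
        for k in range(games + 1)
    )
    return math.sqrt(variance)

for g in (16, 160):
    # The two numbers on each line should agree up to rounding error.
    print(g, coin_flip_std(g), 0.5 / math.sqrt(g))
```

With 10 times as many games, the luck-only spread shrinks by a factor of the square root of 10, which is a little more than 3.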

In order to compare seasons with differing numbers of games, then, sports economists use the measure

R = \frac{\sigma}{\sigma_I}

which is the number reported in the column in The Star. As mentioned in The Star, if the parity measure for a given season is, for example, 1.7, then the standard deviation of winning percentages that year is 1.7 times the standard deviation we would expect to see by flipping coins for each game.
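Here is a rough sketch of R applied to the two example leagues from earlier. The function name `parity_ratio` is mine, and it assumes every team in a league played the same number of games with no ties.

```python
import math

def parity_ratio(records):
    """R = sigma / sigma_I for a list of (wins, losses) pairs.

    Assumes every team played the same number of games.
    """
    games = sum(records[0])
    pcts = [w / games for w, _ in records]
    sigma = math.sqrt(sum((p - 0.5) ** 2 for p in pcts) / len(pcts))
    sigma_ideal = 0.5 / math.sqrt(games)  # luck-only spread
    return sigma / sigma_ideal

r_16 = parity_ratio([(11, 5), (9, 7), (8, 8), (4, 12)])
r_160 = parity_ratio([(110, 50), (90, 70), (80, 80), (40, 120)])
print(round(r_16, 2), round(r_160, 2))  # roughly 1.27 and 4.03
```

The 16-game league is only a little more spread out than coin flipping would produce, while the 160-game league is about four times as spread out, which matches the intuition that Creemore's dominance is harder to put down to luck.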

The other issue that had to be addressed is the introduction of the “Bettman point”. In the 1999-2000 season, the league began awarding a point for overtime losses. In order to make sure that we’re comparing apples to apples when looking at different seasons, the standard deviation of winning percentages (not point percentages) is used for all seasons, where ties count as half a win.
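In code, the adjustment might look something like the sketch below. Treating an overtime loss as a plain loss is my reading of "winning percentages (not point percentages)", so take that as an assumption; the function name and the records are made up for illustration.

```python
def winning_percentage(wins, losses, ties=0, overtime_losses=0):
    """Winning percentage with ties counted as half a win.

    Assumption: an overtime loss counts as a plain loss here,
    since the measure is winning percentage, not point percentage.
    """
    games = wins + losses + ties + overtime_losses
    return (wins + 0.5 * ties) / games

# Hypothetical records, for illustration only.
print(winning_percentage(40, 30, ties=12))            # 46 / 82
print(winning_percentage(45, 30, overtime_losses=7))  # 45 / 82
```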

The following table shows the measure of parity, R, for all seasons since 1918-19. Note that the NHL began in 1917, but the 1917-18 season has been excluded because teams did not play an equal number of games. Specifically, the Montreal Wanderers only played 6 games while the Montreal Canadiens, Toronto Hockey Club and Ottawa Senators all played 22.

Year  Parity    Year  Parity    Year  Parity    Year  Parity    Year  Parity
1918  1.387777  1938  1.862858  1958  1.249762  1978  2.271952  1998  1.714227
1919  2.179449  1939  2.128324  1959  1.582644  1979  1.998511  1999  1.913641
1920  1.443376  1940  1.896111  1960  2.132515  1980  2.119355  2000  1.974018
1921  1.424001  1941  1.309307  1961  2.285565  1981  2.09563   2001  1.662108
1922  1.42156   1942  1.390444  1962  1.70154   1982  2.259109  2002  1.769893
1923  1.118034  1943  2.764055  1963  1.708522  1983  2.334014  2003  1.753278
1924  1.870829  1944  2.58328   1964  1.771467  1984  2.065015  2004  n/a
1925  1.623244  1945  1.177568  1965  1.741647  1985  2.1539    2005  1.945602
1926  1.700267  1946  1.464076  1966  1.873372  1986  1.328264  2006  1.821283
1927  1.788854  1947  1.30526   1967  1.377097  1987  1.639178  2007  1.172864
1928  1.754216  1948  1.142609  1968  1.989005  1988  1.696635  2008  1.559654
1929  2.350048  1949  1.467749  1969  2.233124  1989  1.64534   2009  1.517379
1930  2.223838  1950  2.633122  1970  2.527867  1990  1.822805  2010  1.506625
1931  0.808122  1951  2.096482  1971  2.648173  1991  1.676712  2011  1.424524
1932  1.222853  1952  1.410842  1972  2.518521  1992  2.604369  2012  1.485111
1933  1.232282  1953  2.209288  1973  2.400254  1993  1.838488  2013  1.731581
1934  1.607275  1954  2.296996  1974  2.816371  1994  1.510646
1935  1.136515  1955  1.923538  1975  2.882659  1995  2.051632
1936  1.222617  1956  1.747106  1976  2.558211  1996  1.384042
1937  1.769239  1957  1.718249  1977  2.711344  1997  1.708054


It is worth noting that in 1931 the measure of parity is actually less than 1. Recall that, when teams are equal, we should expect to see some dispersion in the standings just based on luck. The standard deviation of the binomial distribution tells us the amount of dispersion we would see if we flipped a coin a certain number of times for an arbitrarily large number of trials. In looking at the dispersion in the standings, the number of times the coin is flipped is determined by the number of games in the season, and the number of trials is determined by the number of teams. Because the number of teams is so small, it is possible, through randomness, to end up with a spread that is smaller than the binomial distribution predicts. Remember: it’s not impossible for the standings to have all teams with a .500 record, just very unlikely!
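A quick simulation gives a feel for how often this can happen. The sketch below is not based on any real schedule: it assumes a hypothetical 8-team league in which every pair of teams meets 6 times (42 games each), every game is a fair coin flip, and there are no ties. The function name and the seed are arbitrary choices of mine.

```python
import itertools
import math
import random

def simulate_parity(teams, meetings, rng):
    """Play a round robin of coin-flip games and return R = sigma / sigma_I."""
    wins = [0] * teams
    for a, b in itertools.combinations(range(teams), 2):
        for _ in range(meetings):
            winner = a if rng.random() < 0.5 else b
            wins[winner] += 1
    games = meetings * (teams - 1)  # games played by each team
    sigma = math.sqrt(sum((w / games - 0.5) ** 2 for w in wins) / teams)
    return sigma / (0.5 / math.sqrt(games))

rng = random.Random(1931)  # fixed seed so the run is repeatable
ratios = [simulate_parity(teams=8, meetings=6, rng=rng) for _ in range(2000)]
share_below_one = sum(r < 1 for r in ratios) / len(ratios)
print(f"Share of simulated seasons with R < 1: {share_below_one:.2f}")
```

Even though every team in this toy league is exactly equal, a sizeable share of the simulated seasons should come out with R below 1, simply because a handful of teams is a small sample.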

For this reason, it makes sense to look at average measures of parity over a number of years. The eras used in The Star column were chosen with two factors in mind: making sure there were enough years to diminish the effect of chance in any single year, and grouping seasons by some feature concerning the number of teams in the league.

The era from 1918 to 1930 was a time when the number of teams changed dramatically, starting with 3 teams and ending with 10. From 1931 to 1941, the set of teams was a bit more stable, with 7, 8 or 9 teams in the league (the original Ottawa Senators and the Montreal Maroons left the league during this time). In 1942, the New York (or Brooklyn) Americans folded, leaving the league with the teams now known as the Original Six, which persisted until the 1966-67 season. The 1967-68 season kicked off the Expansion Era, and the 1994-95 season was the first lockout. (A case could be made for lumping the 1994-95 to 2003-04 seasons together with the Expansion Era seasons, but there are reasons to think that the rapid expansion of the earlier years would have a different effect on parity than the slower expansion that came after, and this is in fact borne out in the data.) Finally, the last era considered is the Salary Cap Era.

I think the data present a fairly convincing picture of the effect that the salary cap has had on parity, and also illustrate the importance of a hard cap in this regard. The NBA’s soft cap is really a bit of a joke with all its exceptions. If it’s possible to spend almost triple the salary cap, as the New York Knicks did in 2005-06, then there really is no difference between a soft cap and a luxury tax, which MLB employs.
