This week’s column in The Star looked at parity in the NHL. Parity refers to the extent to which teams are equal in terms of talent and ability. The specific form of parity addressed was the extent to which teams are evenly matched in a given season. (Sports economists also look at measures of parity that examine the extent to which some teams are consistently at the top of the standings while others are consistently at the bottom, but that’s not the focus of this column.)
When teams are evenly matched, we should expect the standings at the end of the season to be more compressed than when they are not. The dispersion in winning percentage is therefore a good place to start in terms of measuring parity.
The usual measure of dispersion is the standard deviation. The standard deviation gives a measure of how much teams differ from the average winning percentage (which must be .500). Specifically, the formula for the standard deviation is
$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(w_i - 0.5\right)^2}$$
where $w_i$ is the winning percentage of team $i$, N refers to the number of teams, and the summation is over those N teams.
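To make the calculation concrete, here is a minimal Python sketch of the dispersion measure. It assumes deviations are taken from .500 and that the sum of squares is divided by N (the population form); the function name `winpct_std` and the four winning percentages are made up for illustration.

```python
import math

def winpct_std(win_pcts):
    """Standard deviation of winning percentages around the league mean of .500.

    Assumes the population form: the sum of squared deviations is divided by N.
    """
    n = len(win_pcts)
    return math.sqrt(sum((w - 0.5) ** 2 for w in win_pcts) / n)

# Hypothetical four-team league (illustrative numbers only)
example = [0.625, 0.550, 0.450, 0.375]
print(round(winpct_std(example), 4))
```

A league in which every team finished at exactly .500 would return 0 here, and the number grows as the standings spread out.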
If we were just going to look at recent NHL seasons, we could just use the standard deviation as our measure of parity. Unfortunately, this measure isn’t particularly useful when we want to look at seasons with differing numbers of games (either because we’re making comparisons across leagues or across eras).
The intuition for why this is the case is (hopefully) captured by the following example.
Consider a league in which 4 teams play 16 games each, and another league in which 4 teams play 160 games each. In the first league, the final standings are as follows:
Rank | Team | Wins | Losses |
1 | Tequila | 11 | 5 |
2 | Bourbon | 9 | 7 |
3 | Vodka | 8 | 8 |
4 | Wine | 4 | 12 |
These standings suggest that Tequila was the strongest, though not by much over Bourbon and Vodka, and that there was a bit of a gap from there down to Wine. On the whole, the difference in potency is noticeable, but not striking.
Now consider the final standings for the second league that played 160 games each.
Rank | Team | Wins | Losses |
1 | Creemore | 110 | 50 |
2 | Okanagan Springs | 90 | 70 |
3 | Block 3 | 80 | 80 |
4 | Blue | 40 | 120 |
Based on these standings, I think it would be fair to say that Creemore was the best team, and actually quite dominant; that Okanagan Springs and Block 3 were pretty good, though the difference between them was fairly noticeable; and that Blue just wasn’t very good. In this case, I think one would be inclined to say that the gap between first and last is much bigger than in the first standings above.
However, the distribution of winning percentages is the same in both standings.
The only difference between the two cases is that Creemore had to maintain their league leading winning percentage for 10 times as many games as Tequila did. The probability that Creemore could achieve their record through luck is considerably less than the probability that Tequila could. And therein lies the problem.
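A rough way to check both claims is sketched below in Python. It confirms that the two made-up leagues have identical winning percentages, then computes the chance that a perfectly average (coin-flip) team wins at least as many games as each league leader. The helper name `prob_at_least` is mine, and the calculation assumes every game is an independent 50-50 flip.

```python
from math import comb

def prob_at_least(wins, games):
    """P(at least `wins` heads in `games` independent fair coin flips)."""
    return sum(comb(games, k) for k in range(wins, games + 1)) / 2 ** games

# (wins, losses) from the two example standings above
league_16 = [(11, 5), (9, 7), (8, 8), (4, 12)]
league_160 = [(110, 50), (90, 70), (80, 80), (40, 120)]

pcts_16 = [w / (w + l) for w, l in league_16]
pcts_160 = [w / (w + l) for w, l in league_160]
print(pcts_16 == pcts_160)  # same distribution of winning percentages

p_tequila = prob_at_least(11, 16)     # lucky average team over 16 games
p_creemore = prob_at_least(110, 160)  # lucky average team over 160 games
print(p_tequila, p_creemore)
```

The first probability is roughly one in ten; the second is smaller by several orders of magnitude, which is the sense in which Creemore’s record is far harder to explain by luck.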
The solution, then, is to consider what would happen if all teams were exactly equal and each team had a 50-50 chance of winning each game. In this case, the standings would essentially be determined by a series of coin flips, with the coin being flipped 10 times more often in the second league.
When it comes to coin flipping, the distribution of heads (wins) is given by the binomial distribution. The more you flip the coin, the tighter the distribution of the share of heads becomes (the lower the standard deviation). The standard deviation of the binomial distribution (with a probability of winning equal to ½), expressed as a winning percentage, is given by
$$\sigma_{\text{coin}} = \frac{0.5}{\sqrt{G}}$$
where G is the number of coin flips, or games played in this case.
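As a quick illustration, the sketch below evaluates the coin-flip benchmark for the two example leagues. It assumes the benchmark is the standard binomial result for a proportion with a win probability of ½, namely 0.5 divided by the square root of G; the function name `coin_flip_std` is just a label I chose.

```python
import math

def coin_flip_std(games):
    """Std. dev. of winning percentage for a fair-coin team over `games` games.

    Assumes independent 50-50 games: sqrt(0.5 * 0.5 / G) = 0.5 / sqrt(G).
    """
    return 0.5 / math.sqrt(games)

for g in (16, 160):
    print(g, round(coin_flip_std(g), 4))
```

Playing ten times as many games shrinks the luck-only spread by a factor of the square root of 10, a little more than three.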
In order to compare seasons with differing numbers of games, then, sports economists use the measure
$$R = \frac{\sigma}{0.5/\sqrt{G}}$$
where $\sigma$ is the actual standard deviation of winning percentages and G is the number of games each team plays. This is the number reported in the column in The Star. As mentioned in The Star, if the parity measure for a given season is, for example, 1.7, then the standard deviation of winning percentages that year is 1.7 times the standard deviation we would expect to see by flipping coins for each game.
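Applying the measure to the two made-up leagues shows how it separates them. This is a sketch under my reading of the measure: the actual standard deviation of winning percentages (around .500, divided by N) over the coin-flip benchmark of 0.5 divided by the square root of G. The function name `parity_ratio` is hypothetical.

```python
import math

def parity_ratio(records):
    """Ratio of actual to coin-flip std. dev. of winning percentage.

    `records` is a list of (wins, losses); every team is assumed to have
    played the same number of games.
    """
    games = records[0][0] + records[0][1]
    pcts = [w / (w + l) for w, l in records]
    actual = math.sqrt(sum((p - 0.5) ** 2 for p in pcts) / len(pcts))
    ideal = 0.5 / math.sqrt(games)
    return actual / ideal

r_short = parity_ratio([(11, 5), (9, 7), (8, 8), (4, 12)])
r_long = parity_ratio([(110, 50), (90, 70), (80, 80), (40, 120)])
print(round(r_short, 3), round(r_long, 3))
```

The 16-game league comes out a little above 1, while the 160-game league comes out around 4, even though the raw standard deviations are identical.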
The other issue that had to be addressed is the introduction of the “Bettman point”. In the 1999-2000 season, the league began awarding a point for overtime losses. In order to make sure that we’re comparing apples to apples when looking at different seasons, the standard deviation of winning percentages (not point percentages) is used for all seasons, where ties count as half a win.
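In code, the adjustment might look like the sketch below. Counting ties as half a win comes from the paragraph above; treating an overtime loss as a plain loss is my assumption about how "winning percentages (not point percentages)" is applied. The function name and both example records are hypothetical.

```python
def win_pct(wins, losses, ties=0, ot_losses=0):
    """Winning percentage with ties counted as half a win.

    Assumption: overtime losses count as ordinary losses, so the extra
    point they earn in the standings is ignored.
    """
    games = wins + losses + ties + ot_losses
    return (wins + 0.5 * ties) / games

# Hypothetical 82-game records, one from the tie era and one from the shootout era
print(round(win_pct(40, 30, ties=12), 4))
print(round(win_pct(45, 27, ot_losses=10), 4))
```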
The following table shows the measure of parity, R, for all seasons since 1918-19. Each year listed is the year in which the season began, so 1918 refers to the 1918-19 season; the 2004 entry is blank because the 2004-05 season was lost to a lockout. Note that the NHL began in 1917, but the 1917-18 season has been excluded because teams did not play an equal number of games. Specifically, the Montreal Wanderers only played 6 games while the Montreal Canadiens, Toronto Hockey Club and Ottawa Senators all played 22.
Year | Parity | Year | Parity | Year | Parity | Year | Parity | Year | Parity |
1918 | 1.387777 | 1938 | 1.862858 | 1958 | 1.249762 | 1978 | 2.271952 | 1998 | 1.714227 |
1919 | 2.179449 | 1939 | 2.128324 | 1959 | 1.582644 | 1979 | 1.998511 | 1999 | 1.913641 |
1920 | 1.443376 | 1940 | 1.896111 | 1960 | 2.132515 | 1980 | 2.119355 | 2000 | 1.974018 |
1921 | 1.424001 | 1941 | 1.309307 | 1961 | 2.285565 | 1981 | 2.09563 | 2001 | 1.662108 |
1922 | 1.42156 | 1942 | 1.390444 | 1962 | 1.70154 | 1982 | 2.259109 | 2002 | 1.769893 |
1923 | 1.118034 | 1943 | 2.764055 | 1963 | 1.708522 | 1983 | 2.334014 | 2003 | 1.753278 |
1924 | 1.870829 | 1944 | 2.58328 | 1964 | 1.771467 | 1984 | 2.065015 | 2004 | |
1925 | 1.623244 | 1945 | 1.177568 | 1965 | 1.741647 | 1985 | 2.1539 | 2005 | 1.945602 |
1926 | 1.700267 | 1946 | 1.464076 | 1966 | 1.873372 | 1986 | 1.328264 | 2006 | 1.821283 |
1927 | 1.788854 | 1947 | 1.30526 | 1967 | 1.377097 | 1987 | 1.639178 | 2007 | 1.172864 |
1928 | 1.754216 | 1948 | 1.142609 | 1968 | 1.989005 | 1988 | 1.696635 | 2008 | 1.559654 |
1929 | 2.350048 | 1949 | 1.467749 | 1969 | 2.233124 | 1989 | 1.64534 | 2009 | 1.517379 |
1930 | 2.223838 | 1950 | 2.633122 | 1970 | 2.527867 | 1990 | 1.822805 | 2010 | 1.506625 |
1931 | 0.808122 | 1951 | 2.096482 | 1971 | 2.648173 | 1991 | 1.676712 | 2011 | 1.424524 |
1932 | 1.222853 | 1952 | 1.410842 | 1972 | 2.518521 | 1992 | 2.604369 | 2012 | 1.485111 |
1933 | 1.232282 | 1953 | 2.209288 | 1973 | 2.400254 | 1993 | 1.838488 | 2013 | 1.731581 |
1934 | 1.607275 | 1954 | 2.296996 | 1974 | 2.816371 | 1994 | 1.510646 | | |
1935 | 1.136515 | 1955 | 1.923538 | 1975 | 2.882659 | 1995 | 2.051632 | | |
1936 | 1.222617 | 1956 | 1.747106 | 1976 | 2.558211 | 1996 | 1.384042 | | |
1937 | 1.769239 | 1957 | 1.718249 | 1977 | 2.711344 | 1997 | 1.708054 | | |
It is worth noting that in 1931 the measure of parity is actually less than 1. Recall that, when teams are equal, we should expect to see some dispersion in the standings just based on luck. The standard deviation of the binomial distribution tells us the amount of dispersion we would see if we flipped a coin a certain number of times for an arbitrarily large number of trials. In looking at the dispersion in the standings, the number of times the coin is flipped is determined by the number of games in the season, and the number of trials is determined by the number of teams. Because the number of teams is so small, it is possible, through randomness, to end up with a spread that is smaller than the one the binomial distribution predicts. Remember: it’s not impossible for the standings to have all teams with a .500 record, just very unlikely!
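A small simulation can show how often this happens. The sketch below is illustrative only: it assumes a hypothetical league of 8 perfectly equal teams that meet each opponent 6 times (42 games each), with every game a 50-50 coin flip and no ties, and it uses a fixed random seed so the run is repeatable.

```python
import math
import random

def simulate_parity(n_teams, rounds, rng):
    """Parity ratio for one simulated season between equal teams.

    Each pair of teams meets `rounds` times; every game is a fair coin flip.
    """
    wins = [0] * n_teams
    for _ in range(rounds):
        for i in range(n_teams):
            for j in range(i + 1, n_teams):
                winner = i if rng.random() < 0.5 else j
                wins[winner] += 1
    games = rounds * (n_teams - 1)  # games played by each team
    actual = math.sqrt(sum((w / games - 0.5) ** 2 for w in wins) / n_teams)
    return actual / (0.5 / math.sqrt(games))

rng = random.Random(0)  # fixed seed, chosen arbitrarily
ratios = [simulate_parity(8, 6, rng) for _ in range(2000)]
share_below_one = sum(r < 1 for r in ratios) / len(ratios)
print(round(share_below_one, 2))
```

With so few teams, a sizeable share of the simulated seasons land below 1 even though the teams are identical by construction, which is why a single season such as 1931 should not be over-read.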
For this reason, it makes sense to look at average measures of parity over a number of years. The eras used in The Star column were chosen with two factors in mind: making sure there were enough years to diminish the effect of chance in any single year, and grouping seasons according to some feature of the number of teams in the league.
The era from 1918 to 1930 was a time when the number of teams changed dramatically, starting with 3 teams and ending with 10. From 1931 to 1941, the set of teams was a bit more stable, with 7, 8 or 9 teams in the league (the original Ottawa Senators and the Montreal Maroons left the league during this time). In 1942, the New York (or Brooklyn) Americans folded, leaving the league with the teams now known as the Original Six, an arrangement that persisted through the 1966-67 season. The 1967-68 season kicked off the Expansion Era, and the 1994-95 season was the first lockout. (A case could be made for lumping the 1994-95 to 2003-04 seasons together with the Expansion Era seasons, but there are reasons to think that the rapid expansion of the earlier years would have a different effect on parity than the slower expansion that came after, and this is in fact borne out in the data.) Finally, the last era considered is the Salary Cap Era.
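As an example of the averaging, the sketch below takes the last two eras and computes the mean parity measure for each from the table above. The era boundaries (seasons starting in 1994 to 2003, and 2005 to 2013) are my reading of the description; the column in The Star may have drawn the lines slightly differently.

```python
# Parity values copied from the table above, keyed by the year the season began
parity = {
    1994: 1.510646, 1995: 2.051632, 1996: 1.384042, 1997: 1.708054,
    1998: 1.714227, 1999: 1.913641, 2000: 1.974018, 2001: 1.662108,
    2002: 1.769893, 2003: 1.753278,
    2005: 1.945602, 2006: 1.821283, 2007: 1.172864, 2008: 1.559654,
    2009: 1.517379, 2010: 1.506625, 2011: 1.424524, 2012: 1.485111,
    2013: 1.731581,
}

def era_average(first_year, last_year):
    """Mean parity over seasons starting in [first_year, last_year]."""
    values = [v for y, v in parity.items() if first_year <= y <= last_year]
    return sum(values) / len(values)

pre_cap = era_average(1994, 2003)  # assumed boundaries, see lead-in
cap_era = era_average(2005, 2013)
print(round(pre_cap, 3), round(cap_era, 3))
```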
I think the data present a fairly convincing picture of the effect that the salary cap has had on parity, and also illustrate the importance of a hard cap in this regard. The NBA’s soft cap is really a bit of a joke with all its exceptions. If it’s possible to spend almost triple the salary cap, as the New York Knicks did in 2005-06, then there really is no difference between a soft cap and a luxury tax, which MLB employs.