Experimenting With Unbiased Analytics

Anyone who has taken Statistics 101 understands the importance of randomness in generating a useful data set, and you don’t need a PhD in game theory to acknowledge the wisdom of conceding a battle or two to win a war.


Putting these basic ideas into practice isn’t always easy.


In my first job out of Wharton, I worked for a consulting firm that had me building statistical models for a major credit card issuer. Our client wanted to know how different customers would respond to changes in the way the bank did business with them, and how that would affect profitability.


So, for example, if they gave someone a credit line increase, our model would look at specific characteristics of that person and predict what would happen in terms of card utilization, likelihood of revolving a balance, likelihood of being delinquent in payment, etc., which would then build up into an overall profitability number.


I won’t pretend the work made my soul “hum”, but it did teach me a few things.


To begin with, there was the bank itself.


In order to build a decent model, we would have needed them to give random credit line increases to thousands of randomly selected card holders, track the data that interested us, and then come back and build our models after a year or two.


Of course, they wanted the work done now, and being conservative, they weren’t in the business of extending credit at random either. As a result, we had to work with the data they had, which was biased. Anything we could learn about how people behaved in response to a credit line increase would be limited to those “good” credit risks the bank deemed worthy of receiving additional credit.


That told us nothing about everyone else, despite the fact that people with bad credit might be highly profitable. The bank’s unwillingness to experiment with a tiny group of cardholders and risk a minor blip in their default numbers was leaving a big opportunity on the table.[1]
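For readers who like to see the problem in numbers, here is a minimal sketch of the bias described above. Everything in it is an assumption made up for illustration: the `true_lift` function, the 0.3 risk cutoff, the 5% experiment size and the noise level are not the bank's data or our model. It simply assumes riskier customers are more profitable and compares what a conservative policy and a small random experiment would each let you estimate.

```python
import random


def true_lift(risk):
    """Hypothetical profit lift (in dollars) from a credit line increase.

    Assumption for illustration only: riskier customers are MORE profitable.
    """
    return 50 + 200 * risk


def mean(values):
    return sum(values) / len(values)


def simulate(n_customers=20000, seed=0):
    rng = random.Random(seed)
    # Synthetic risk scores: 0 = safest, 1 = riskiest.
    risks = [rng.random() for _ in range(n_customers)]

    # What the bank would like to know: average lift across ALL customers.
    population = mean([true_lift(r) for r in risks])

    # Conservative policy: only "good" risks (risk < 0.3) get an increase,
    # so the lift is observed only for them.
    biased = mean([true_lift(r) + rng.gauss(0, 20) for r in risks if r < 0.3])

    # Experimental policy: a random 5% of customers get an increase.
    randomized = mean([true_lift(r) + rng.gauss(0, 20)
                       for r in risks if rng.random() < 0.05])

    return {"population": population, "biased": biased, "randomized": randomized}


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name:>10}: {value:6.1f}")
```

Under these assumptions the conservative sample badly understates the average lift, while the small random sample lands close to it. The real relationship between risk and profit could look very different, which is exactly why you would want the experiment.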


So what does any of this have to do with sports analytics?


Well, anyone with even a passing interest in professional sports knows winning is paramount. We don’t cheer for our favorite team to lose, and unless a team is so out of it that they start thinking about next year’s draft, we don’t want coaches and players to throw games.


But is it always bad to lose?


Put differently, does a team that is so preoccupied with winning display the same sort of conservatism as a bank that leaves money on the table by refusing to extend higher risk / higher yield loans?


I think so, and here’s why.


At a very general level, the performance of a professional athlete is the product of four elements: (i) ability; (ii) opportunity; (iii) quality of opponents; and (iv) randomness.


As fans, who have no control over any of these variables, we take all these elements as “given”, but in fact, two of them (opportunity and quality of opponents) are completely determined by coaches and GMs.


So, for example, those of us who had fantasy football drafts this month pored over rosters looking for that middling running back who, with no competition in the backfield and a mediocre quarterback, will get a lot of “touches” and put up great numbers on the sheer volume of opportunities available to him (can anyone say Toby Gerhart?).


Baseball fans know that guys who hit 9th will get fewer plate appearances overall and be less likely to hit with runners on base.


My sport (hockey) is no different. A hockey player’s performance is highly dependent on the quality of his linemates, how much ice time he gets, the situations in which he’s used (e.g. does he get power play time), and the opponents he’s thrown out against.


These factors (cast broadly as opportunity and quality of opponents) are predetermined by his coach. We might speculate, but in the absence of actual data, we have no idea what would happen if the player had different opportunities or different opponents.


More worrisome is the possibility that rather than playing the “best” players, coaches may be making on-ice decisions that simply validate the roster choices handed to them by their GMs (presumably with their input).


For example, we recently looked at last season’s veteran NHL players and found a high correlation between what players were paid and their ice time. The correlation between pay and power play ice time, where scoring opportunities are far more frequent, was even higher, and not surprisingly, the correlation between power play ice time and power play points was also quite high.




Now unless NHL GMs are complete idiots, one would expect better players to get more ice time, but this phenomenon leaves open the possibility that GMs are “picking their winners” by signing players to big contracts and then protecting their own jobs by ensuring those highly paid players get lots of chances to score and make them look smart.


In other words, equally (or perhaps more) talented players are not getting the same opportunities, or better players are being used in situations where lesser players might do just as well.
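To make that worry concrete, here is a small simulation sketch. It is not our analysis of NHL data: the salaries, scoring rates and the rule that power play minutes follow pay are all invented assumptions. The point is only that if ice time is handed out by salary, pay and points can end up strongly correlated even when talent (scoring rate per minute) has nothing to do with pay.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_league(n_players=500, seed=1):
    rng = random.Random(seed)
    # Hypothetical salaries in $ millions.
    pay = [rng.uniform(1.0, 8.0) for _ in range(n_players)]
    # Hypothetical talent: power play points per 60 minutes,
    # drawn independently of pay on purpose.
    talent = [rng.uniform(3.0, 6.0) for _ in range(n_players)]
    # Assumed coaching rule: power play minutes track salary, plus noise.
    pp_minutes = [max(0.0, 30 * p + rng.gauss(0, 10)) for p in pay]
    # Points are scoring rate times opportunity.
    points = [t * m / 60 for t, m in zip(talent, pp_minutes)]
    return {
        "pay_vs_talent": pearson(pay, talent),
        "pay_vs_points": pearson(pay, points),
    }


if __name__ == "__main__":
    for name, value in simulate_league().items():
        print(f"{name}: {value:.2f}")
```

In this toy league the highly paid players "look smart" on the scoresheet purely because of opportunity. Real data will be messier, since better players presumably do earn more, but observational correlations alone cannot separate the two stories.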


In a highly physical sport like hockey, which has an 82-game season, a tight playoff schedule, and playoff games that could theoretically go on until the end of all time, fatigue plays a huge role. As a result, teams that have deep lineups tend to fare better, particularly in the post-season.


Fielding a winning hockey team isn’t just about finding the best players or giving lots of minutes to a handful of superstars; it’s about managing a complicated optimization problem that maximizes a team’s overall performance during every minute played. The best hockey coaches understand not only who their best players are in the abstract, but also when linemates, opponents or a “fresh set of legs” might make a lesser overall talent perform better.


Putting that thinking into practice is harder than it sounds because, like the bank I was working for almost 20 years ago, the only data we have is biased.


So, for example, when my colleagues in the Department of Hockey Analytics and I declared our unbridled affection for Benoit Pouliot, who at the time was earning a “paltry” $1.3 million on a one-year contract, we based our view on Pouliot’s performance with limited ice time.


The guy was very productive with the minutes he had and seemed to display a good balance of offensive and defensive play. That led us to wonder what he might do if given a bigger role.




We weren’t alone here (others in the hockey analytics community were seeing the same thing), and the Edmonton Oilers obviously saw Pouliot’s value as well, signing him this summer to a five-year, $20 million contract.


But here’s the problem. We really don’t know what would have happened if Pouliot had been given more ice time and tougher assignments (his linemates were actually pretty good last year). It’s possible his performance would have remained strong, but it’s also possible it would have tapered off, perhaps even precipitously. After all, it’s one thing to do well when your legs are relatively fresh, other guys are drawing the opponent’s toughest players, and you get the easy assignments. Being “the guy” is a different story.


By signing Pouliot to a big contract, the Oilers are essentially engaging in an expensive experiment. But is there a cheaper way to perform that experiment?


In my view there is.


And as was the case with my bank client, that would require a team to embrace experimentation and short-term risk. For example, rather than assume your highest scorers are the best players, or even try to guess at which guys you’re underutilizing, a team could pick a certain number of games during the season (at random, to avoid the possibility of drawing too many weak or strong opponents, home games vs. away games, etc.) and assign roles and ice time at random.


So the 4th line might become the 1st, penalty killers might suddenly find themselves on the power play, shutdown players might find themselves in an offensive role, etc.
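As a rough sketch of what such a plan might look like on paper, the snippet below picks a handful of games at random from an 82-game schedule and shuffles forwards into lines for each one. The function name `plan_experiment`, the eight test games, the three-man lines and the player labels are all hypothetical choices for illustration, not a recommendation for how a real team should run it.

```python
import random


def plan_experiment(players, n_games=82, n_test_games=8, line_size=3, seed=None):
    """Pick test games at random and assign players to lines at random.

    Returns a dict mapping game number -> list of lines, where the first
    list is the 1st line, the second is the 2nd line, and so on.
    """
    rng = random.Random(seed)

    # Choosing games at random avoids stacking the test on weak or strong
    # opponents, or on home vs. away games.
    test_games = sorted(rng.sample(range(1, n_games + 1), n_test_games))

    plan = {}
    for game in test_games:
        shuffled = list(players)
        rng.shuffle(shuffled)  # role assignment ignores pay and reputation
        plan[game] = [shuffled[i:i + line_size]
                      for i in range(0, len(shuffled), line_size)]
    return plan


if __name__ == "__main__":
    forwards = [f"F{i}" for i in range(1, 13)]  # 12 placeholder forwards
    for game, lines in plan_experiment(forwards, seed=42).items():
        print(f"Game {game:2d}: {lines}")
```

A real version would also have to randomize power play and penalty kill units and log the results, but the core idea is the same: the schedule and the roles come from a random draw, not from anyone’s hunch.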


I’ll admit the examples I’m proposing, while incredibly interesting to people who toil away in the world of observational data, are unlikely to be practical in the real world of professional sports. Banks have millions of customers, so randomly experimenting with a few thousand isn’t out of the question. Pro sports teams don’t have the luxury of millions of games to play with.


But even a more conservative approach to experimentation would still yield benefits. The important part is for teams to set up experiments where the outcome isn’t known in advance, which means the tinkering shouldn’t be designed solely to validate “hunches”.


Taking this kind of agnostic approach would help many teams settle essential questions – in a way that isn’t biased by a coach’s preconceptions or the complicated political relationships that exist in all sports franchises – of who their best players are, which ones are simply getting the benefit of better opportunities, and where the point of inflection is at which a better player shouldn’t be out there due to fatigue or other factors.


Each team would have its ideas about what it wanted to experiment with (determined by risk tolerance as much as anything else), but as long as they truly embraced the idea of unbiased experimentation, they could learn a great deal of new information.


Intentionally putting games at risk is not for the faint of heart, nor is it for a team that thinks it will be “on the playoff bubble”. But if done properly, such an approach could yield valuable insights at little cost. For an elite team that is essentially assured a playoff berth, such as Boston or Chicago, stepping back from a “must win every game” mentality to one that puts some games in play in order to learn key information during the regular season may in the end position them for bigger returns in the playoffs.


In sports as in business, letting go of those twin crutches – making decisions based on what we (think we) know for certain and obsessively focusing on winning every battle – can often be a dominant strategy.


[1] To be fair, the pendulum has swung back and forth a couple of times in the almost 20 years since I did this work, so finding the balance between “conservative”, “opportunistic” and “reckless” lending isn’t quite as easy as I’m making it sound.
