The ever-quotable basketball legend Charles Barkley recently said: “Analytics don’t work at all. It’s just some crap some people who are really smart made up to try to get in the game because they had no talent.” Many fans and players of the game may agree, but the teams have mostly bought in. Today, each of the 30 NBA (National Basketball Association) teams employs at least one analytics expert, and more than half have entire analytics departments in their front offices (Colás, 2017, p. 339).
Analytics in basketball refers to the use of large-scale data acquisition, statistical analysis, and information utilization to better understand the worldwide competitive sport. Analytics are used today by (and occasionally dismissed by) front offices, coaches, television analysts, journalists, and more to relay information, in some cases to improve a team, and in others simply to critique a team’s strategies. Basketball statistics, as they are currently understood, began in their most basic form in 1937 with points per game, followed soon by assists and rebounds; blocks and steals were not added until 1973, rounding out what is now known as the “box score” (Colás, 2017, p. 337). These statistics have been around long enough that they are now widely used and accepted by followers of the game, and no one questions whether they “work” or who “made them up to get into the game”. Once, of course, these stats, too, were considered new and revolutionary. At their core, though, they simply seek to measure important basketball information. Each game’s winner is the team that scores the most points, so it makes sense to keep track of which players score the most points in a game. Offensive rating, another stat, measures the number of points scored per possession (a possession being each time a team receives the ball), as opposed to per game; in practice it is usually reported per 100 possessions. This is considered in the public sphere to be an “advanced” statistic, simply because it does not show up in the box score, but many experts now deem it more descriptive than points per game as a measure of scoring efficiency, as it eliminates the confounding variable of the varying number of possessions from game to game. It is in this way that stats are refined over time and improve upon their most basic iterations.
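To make the contrast between points per game and points per possession concrete, here is a minimal Python sketch. The two games and their possession counts are hypothetical numbers chosen only for illustration, and the function name `offensive_rating` is an assumption of this sketch. It takes the possession count as given, although in practice possessions are typically estimated from box-score data.

```python
def offensive_rating(points: int, possessions: int) -> float:
    """Return points scored per 100 possessions."""
    if possessions <= 0:
        raise ValueError("possessions must be positive")
    return 100 * points / possessions


# Hypothetical games (illustrative numbers, not real data).
fast_game = {"points": 120, "possessions": 110}
slow_game = {"points": 105, "possessions": 90}

fast_rating = offensive_rating(fast_game["points"], fast_game["possessions"])
slow_rating = offensive_rating(slow_game["points"], slow_game["possessions"])

print(f"Fast-paced game: {fast_game['points']} points, rating {fast_rating:.1f}")
print(f"Slow-paced game: {slow_game['points']} points, rating {slow_rating:.1f}")
```

In this made-up example the team scores more points in the fast-paced game but is more efficient in the slow-paced one, which is the kind of distinction a per-game total hides.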
The methods used in basketball analytics have expanded rapidly in the past few years, and in many different spheres. Teams are performing independent analysis using data from Second Spectrum, which tracks the movement of all ten active players, and the ball, multiple times a second throughout a game, to learn which offensive and defensive setups are most effective (Colás, 2017, p. 337). With this impressive amount of available data, teams are stacking their front offices with talented minds who know how to analyze it. One example is Haralabos Voulgaris: once a professional gambler who gamed the NBA betting scene using his own analytics, he was recently hired to work in the Dallas Mavericks’ front office (Rader, 2018). Independent analysis is popping up everywhere as well, as the wealth of basketball data, combined with the vast number of interested analysts, has led to a proliferation of interesting studies. Need to know how important forcing a player to his “strong” or “weak” dribbling hand is, in terms of correlation with KPIs (to use a business term, though basketball is as big a business as any these days) like points and field goal percentage? That analysis is available, thanks to correlation matrices, multiple regression, and confidence intervals (Bartholomew & Collier, 2012, p. 25). Need to know to what extent scoring probabilities are affected by high-leverage in-game situations? Classification trees, bootstrap kernel estimates, and boxplots have your back (Zuccolotto, Manisera, & Sandri, 2018, p. 587). It’s easy for fans to access advanced statistics as well: websites such as ESPN, Nylon Calculus, and Corner Three routinely post advanced stats for public consumption, such as WAR, or “Wins Above Replacement”, which seeks to boil down the number of wins a specific player contributes to his team to a single number.
Dr. Eric Siegel, founder of Predictive Analytics World, posits that each application of predictive analytics is defined by two factors: what’s predicted, and what’s done about it (Siegel, 2013, p. 19). All phases of basketball management have begun to take the leap into analytical thinking. With the math-driven insight that three-point attempts have a higher expected value on average than two-point attempts (as their reward outweighs the added difficulty), NBA teams last season broke the league record for three-pointers made for the sixth consecutive year (Reynolds, 2018). Coaches can use data on the effectiveness of every individual lineup they’ve played to see who plays well with whom, and in which situations, without needing to rely on the “eye test”. Players can look at “shot charts”, which map their shooting efficiency from different areas of the court, in order to identify their strengths and weaknesses and better plan their practices. Front offices evaluate players’ values and can decide who is worth what contract, taking into account such factors as who has played well in what areas of the country, who has played well with which types of teammates and coaches, and how likely performance is to drop off after a player receives a long-term contract. All of this available information, though, must be taken advantage of, and different teams have done so to different extents. One of the most important factors in any business organization’s ability to become analytically competitive is a full buy-in from its executives, leading to a change in culture across the firm (Davenport & Harris, 2017, p. 765). Behind general manager and Northwestern graduate Daryl Morey (co-founder of the MIT Sloan Sports Analytics Conference), the Houston Rockets look on track to make the playoffs for the seventh straight season, while Phil Jackson’s more traditional, “go with the gut” approach led to three consecutive seasons of the New York Knicks missing the playoffs before he was fired.
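The expected-value argument for three-point shooting mentioned above can be sketched in a few lines of Python. The shooting percentages below are hypothetical round numbers, not league figures, and the function names are assumptions of this sketch. Expected points per attempt is simply the probability of making the shot multiplied by its point value.

```python
def expected_points(make_probability: float, shot_value: int) -> float:
    """Expected points per attempt: P(make) times the shot's point value."""
    if not 0.0 <= make_probability <= 1.0:
        raise ValueError("make_probability must be between 0 and 1")
    return make_probability * shot_value


def break_even_three_pct(two_point_pct: float) -> float:
    """Three-point percentage needed to match a given two-point percentage."""
    return 2 * two_point_pct / 3


# Hypothetical shooting percentages, for illustration only.
two_point_pct = 0.50
three_point_pct = 0.36

ev_two = expected_points(two_point_pct, 2)
ev_three = expected_points(three_point_pct, 3)
needed = break_even_three_pct(two_point_pct)

print(f"Expected points per two-point attempt:   {ev_two:.2f}")
print(f"Expected points per three-point attempt: {ev_three:.2f}")
print(f"Break-even three-point percentage:       {needed:.1%}")
```

Under these assumed percentages the three-point attempt is worth slightly more per shot, and any three-point percentage above roughly one third beats a 50% two-point shot. Real decisions also depend on factors this sketch ignores, such as fouls drawn, offensive rebounds, and shot selection.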
As much as data has already changed the game of basketball, it surely has further to go. Every day, more data points are available, and more talented minds are thinking up new ways to use those points to gain a competitive advantage for their team, or to reveal a shocking insight to the public. It’s an exciting time to be a student of data science, or a fan of basketball, or, if you’re lucky, both.
by Derek Reifer, Northwestern University
Colás, Y. (2017). The Culture of Moving Dots: Toward a History of Counting and of What Counts in Basketball. Journal of Sport History, 44(2), 336–349. University of Illinois Press. Retrieved January 26, 2019, from Project MUSE database.
Rader, D. (2018). Dallas Mavericks Hire Former Professional Gambler Bob Voulgaris. Forbes. Retrieved from https://www.forbes.com/sites/doylerader/2018/10/09/dallas-mavericks-hire-former-professional-gambler-bob-voulgaris/#5a581a406bf1
Bartholomew, J. T., & Collier, D. A. (2012). The benefits of forcing offensive basketball players to their weak side. Journal of Multidisciplinary Research, 4(2), 19–27.
Zuccolotto, P., Manisera, M., & Sandri, M. (2018). Big data analytics for modeling scoring probability in basketball: The effect of shooting under high-pressure conditions. International Journal of Sports Science & Coaching, 13(4), 569–589. https://doi.org/10.1177/1747954117737492
Siegel, E. (2013). Predictive analytics: The power to predict who will click, buy, lie, or die. Hoboken, NJ: John Wiley & Sons, Inc.
Reynolds, T. (2018). NBA sets sixth consecutive 3-point record. Retrieved from http://www.nba.com/article/2018/03/30/nba-set-break-3-point-record
Davenport, T. H., & Harris, J. G. (2017). Competing on Analytics: Updated, with a New Introduction: The New Science of Winning. Harvard Business Review Press.