For those of you who follow baseball, the Los Angeles Dodgers clinched a spot in the playoffs this past weekend. I predicted this outcome back in early spring. I’m not like a regular fan who predicts the home team will win; I am a guy who builds models for a living, typically for large companies, trying to inform important business decisions by making estimates and predictions.
I build models using many types of Trade-Off Analysis such as Conjoint, Discrete Choice and MaxDiff. In my spare time, I sometimes build models around sports. You can imagine why that might be a good idea. Imagine if you could predict the outcomes of all the games, matches and races. That would be cool, wouldn’t it?
Being a baseball fan, I developed a model for baseball this year using a couple of seasons’ worth of games and some advanced baseball statistics known as sabermetrics to predict the results of the season. This model, which would be useful to a team manager trying to build a winning team, determines the importance of various players’ skills on game outcomes and predicts the probability of each team winning based on the skills of the players involved in the game. Not surprisingly, the five most important attributes (in ranked order) in determining the outcome of a game are (1) starting pitching, (2) hit ratings for the starting lineup, (3) home field advantage, (4) relief pitching and (5) fielding ratings for the starting lineup.
A model with just these five ratings does a good job of predicting the likely winner of a baseball game. These ratings, along with the expected starting lineups of all the teams, allowed me to simulate the entire baseball season and predict the total number of games each team would win over the course of the year.
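To make the idea concrete, here is a minimal sketch of how ratings like these could be turned into a win probability and then into a simulated season. It is not the model described above: the logistic form, the weights, the `HOME_FIELD_EDGE` value, the rating scale and the function names are all assumptions chosen for illustration.

```python
import math
import random

# Hypothetical weights, ordered to mirror the importance ranking in the text.
# The real coefficients are not given in the article.
WEIGHTS = {
    "starting_pitching": 0.30,
    "lineup_hitting": 0.25,
    "relief_pitching": 0.15,
    "lineup_fielding": 0.10,
}
HOME_FIELD_EDGE = 0.20  # hypothetical bump for the home team (ranked third)


def win_probability(home, away):
    """Probability that the home team wins, from rating differences."""
    score = HOME_FIELD_EDGE
    for skill, weight in WEIGHTS.items():
        score += weight * (home[skill] - away[skill])
    return 1.0 / (1.0 + math.exp(-score))  # logistic link (an assumption)


def simulate_season(teams, schedule, n_sims=1000, seed=0):
    """Average number of wins per team across n_sims simulated seasons."""
    rng = random.Random(seed)
    totals = {name: 0 for name in teams}
    for _ in range(n_sims):
        for home, away in schedule:
            p_home = win_probability(teams[home], teams[away])
            winner = home if rng.random() < p_home else away
            totals[winner] += 1
    return {name: wins / n_sims for name, wins in totals.items()}


# Toy usage: two made-up teams playing 81 games in each ballpark.
teams = {
    "Strong": {skill: 1.0 for skill in WEIGHTS},
    "Average": {skill: 0.0 for skill in WEIGHTS},
}
schedule = [("Strong", "Average")] * 81 + [("Average", "Strong")] * 81
print(simulate_season(teams, schedule, n_sims=200))
```

Averaging over many simulated seasons gives an expected win total per team, which is the kind of number a pre-season forecast reports.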
The Los Angeles Dodgers won 104 games this season, just over 64% of all their games, a level that only a few teams have achieved in the history of baseball. My model’s pre-season forecast was that the Dodgers would win around 85 games this year. In baseball, the difference between winning half of your games and two-thirds of them is the difference between mediocrity and a World Series contender. Does this mean that the model is worthless? Not according to our view of things.
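For a rough sense of how large that miss is, the sketch below treats every game as an independent coin flip with the forecast win rate and asks how often such a team would reach 104 wins. The 162-game season length (consistent with 104 wins being just over 64%) and the independence assumption are simplifications made here for illustration, not part of the model described in this post.

```python
from math import comb

GAMES = 162          # assumed regular-season length
FORECAST_WINS = 85   # pre-season forecast from the text
ACTUAL_WINS = 104    # actual result from the text


def prob_at_least(wins, games, p):
    """Binomial probability of winning at least `wins` of `games` games."""
    return sum(
        comb(games, k) * p**k * (1 - p) ** (games - k)
        for k in range(wins, games + 1)
    )


actual_pct = ACTUAL_WINS / GAMES
p_game = FORECAST_WINS / GAMES  # per-game win rate implied by the forecast
tail = prob_at_least(ACTUAL_WINS, GAMES, p_game)

print(f"Actual win percentage: {actual_pct:.1%}")
print(f"Chance of {ACTUAL_WINS}+ wins under the forecast: {tail:.4f}")
```

Under these simplified assumptions the chance comes out well under one percent, which suggests the gap reflects something the inputs missed rather than ordinary game-to-game luck.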
Whether in models for baseball or for, say, the introduction of a new product, there are potentially many reasons why a prediction will be off, even if the model is valid. In the case of the Dodgers, we just didn’t have enough data on a core group of younger players who made significant contributions to the team. When we model a new product launch, we probably won’t know the quality of the final product, whether the advertising campaign will be effective in creating awareness and demand, or whether a competitor will launch a new campaign or product. Maybe a single event will have a significant impact, either positive or negative, on a product’s appeal or adoption.
Whether or not the model happens to yield accurate predictions over short periods of time (e.g., a two-week losing streak) for a given team in a given year, it can still be useful for making decisions about what actions to take to increase the probability of future success. Focusing on starting pitching is a good approach to team composition. The same applies to business decisions. Yogi Berra was right when he said, “It is difficult to make predictions, especially about the future.” But market researchers can overcome the challenges of prediction models with:
This is true whether the model is a choice model to help decide on features for an introduction and determine long-term profitability, a brand model indicating whether your brand is communicating the right things to set a product up for future success, or a sales call model to suggest call targeting and frequency.
No marketing model is a crystal ball that lets us see into the future. George Box, a famous statistician who as far as I know never played baseball, said, “All models are wrong. Some models are useful.” The key is whether or not useful learning and input to decision making comes from the model. After all, in the National League, 11 games separate the conference winner from the first wild card spot. Every win counts.