Batting average, on-base percentage, and ERA are all standard metrics used to compare baseball players. But how well do they describe a player’s day-to-day production? I wanted to look at two players’ game-by-game performance over the course of 100+ starts. Let’s begin with Albert Pujols.
I think it would be an understatement to say Pujols has been the most consistent hitter in baseball over the past 5+ years. He’s a sure bet to be in the MVP talks from the beginning of each season. I decided to look at his Win Probability Added (WPA) per game over the 2010 season. A positive WPA means he added to his team’s chances of winning the game, while a negative WPA means he hurt them.
I added a line at 0.0 WPA to make it easy to tell a good game from a bad one. More often than not, Pujols was above the line, and you can see that his highest WPA (around 0.38) is roughly double the size of his lowest (about -0.20).
Just so I make myself clear, I’m not trying to take anything away from Pujols as a player. But as you can tell, the series goes up and down very sporadically. If you had to predict Pujols’ next game performance by WPA, you’d probably have a tough time. With a time series approach, we can look for correlation between two games that are X lags apart (i.e., knowing the outcome of game 50, what can we predict about game 55?). If such a correlation exists, it lets us formulate an equation for predictions. But in this case no model seems to fit, because the correlation isn’t convincing at any lag. Thus the best approach would be a white noise fit, meaning that predicting Pujols’ next performance would be like predicting the Dow Jones index on Monday morning.
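To make the lag idea concrete, here is a minimal Python sketch of the kind of check I mean. It is not the exact analysis behind the chart: the series below is made-up random data standing in for game-by-game WPA, and the names `autocorrelation` and `significant_lags` are my own. The rule of thumb it uses is that for white noise, the sample correlation at any lag should mostly stay inside roughly ±1.96/√n.

```python
import math
import random


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given lag."""
    n = len(series)
    mean = sum(series) / n
    variance_sum = sum((x - mean) ** 2 for x in series)
    covariance_sum = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    )
    return covariance_sum / variance_sum


def significant_lags(series, max_lag):
    """Lags whose autocorrelation falls outside the approximate 95% white-noise band."""
    band = 1.96 / math.sqrt(len(series))
    return [
        lag
        for lag in range(1, max_lag + 1)
        if abs(autocorrelation(series, lag)) > band
    ]


# Hypothetical stand-in for a season of per-game WPA values (NOT real Pujols data).
random.seed(0)
fake_wpa = [random.gauss(0.0, 0.1) for _ in range(150)]

# For a white-noise series, few (if any) lags should be flagged.
print(significant_lags(fake_wpa, 10))
```

If a real WPA series flagged a lot of lags here, that would be a hint that something like an autoregressive model might be worth trying. If it flags few or none, white noise is a reasonable description.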
As promised, I also looked at a pitcher’s performance: Roy Halladay’s Fielding Independent Pitching (FIP) metric over the past 5 seasons. For those who don’t know, FIP is a runs-allowed number (on the same scale as ERA) that depends only on home runs, walks, and strikeouts. These are the three outcomes that don’t involve the fielders at all, which is why the metric is called fielding independent.
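For anyone who wants to compute FIP themselves, here is a rough sketch of the commonly published formula. A few assumptions to flag: the function name `fip` is mine, the constant (which puts FIP on the ERA scale) changes by league and season, so the 3.10 default is only a ballpark placeholder, and the usual formula also counts hit batters alongside walks, which I’ve included as an optional argument. The stat line in the example is invented, not Halladay’s.

```python
def fip(home_runs, walks, strikeouts, innings_pitched,
        hit_by_pitch=0, constant=3.10):
    """Fielding Independent Pitching for one stat line.

    `constant` is an assumed placeholder; the real value is set per
    league and season so that league-average FIP matches league ERA.
    `innings_pitched` should be a true decimal (6.2 innings in box-score
    notation is 6 + 2/3, not 6.2).
    """
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    weighted_events = (
        13 * home_runs
        + 3 * (walks + hit_by_pitch)
        - 2 * strikeouts
    )
    return weighted_events / innings_pitched + constant


# Invented stat line for illustration only.
print(round(fip(home_runs=20, walks=30, strikeouts=200, innings_pitched=240.0), 2))
```

The weights show why strikeout-heavy, low-walk pitchers come out looking so good: every home run costs far more than a strikeout saves.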
These two examples essentially tell us that the day-to-day production of baseball players is very random. Moreover, a player’s success over a long period of time (i.e., 162 games) is the better indicator of how good he is. It’s all about sample size, folks. Thus, my version of an answer to the main question of this post is, “Well, it depends on the period of time. Per season? Per game? Per at-bat?”