Graphs, Hitters, IceBat

Home Runs Follow Up

As I stated in my last post, predicting home runs from fly ball data could be another way to compare the home run hitting abilities of players. For the graph above, I plotted the fitted values from the model, split by hitter. Most of the fitted values sit near the extremes (close to 0 or 1), which is consistent with how a logistic model behaves. As for comparing hitters, it looks like Adam Dunn has the most fly balls predicted to become home runs, while Jason Bay is on the other side of the fence.
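To see why fitted values from a logistic model pile up near 0 and 1, here is a minimal sketch of the logistic function on its own. The inputs `z` are hypothetical linear-predictor values chosen for illustration, not output from the model in this post.

```python
import math

def logistic(z):
    """Map a linear predictor z to a probability strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical linear-predictor values (not taken from the fitted model).
for z in (-6, -3, -1, 0, 1, 3, 6):
    print(f"z = {z:+d} -> p = {logistic(z):.3f}")
```

Once the linear predictor moves a few units away from zero, the probability is already very close to 0 or 1, so any fly ball that is clearly short or clearly long ends up at an extreme.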

In case you were wondering, here is each player’s mean HR prediction for fly balls: Adam Dunn (~50.4%), Manny Ramirez (~39.2%) and Jason Bay (~34.0%). No surprises there, really. The order of these hitters’ HR predictions matches their career HR/FB rates and our general notion of their hitting styles. Dunn swings for the fences or strikes out otherwise, while Manny and Bay make more use of the entire field (though that may be an optimistic statement about Manny’s capabilities now). While Hit Tracker does an excellent job telling us how far home runs really went, and which park/weather factors affected the ball’s actual flight, we don’t really have an idea of what effect those factors had on non-HR fly balls. Though I am speculating, maybe research on this topic will pick up once the data from Field F/X (previewed in THT’s 2011 Annual) is published. Only time will tell…
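The per-player means above are just the average fitted probability over each hitter's fly balls. A minimal sketch of that calculation follows; the function name, the hitter labels and the probabilities are all placeholders for illustration, not values from the Hit Tracker data.

```python
from collections import defaultdict

def mean_prediction_by_hitter(rows):
    """Average the fitted HR probability per hitter.

    rows: iterable of (hitter, fitted_probability) pairs.
    """
    totals = defaultdict(lambda: [0.0, 0])  # hitter -> [sum, count]
    for hitter, prob in rows:
        totals[hitter][0] += prob
        totals[hitter][1] += 1
    return {hitter: s / n for hitter, (s, n) in totals.items()}

# Placeholder fitted values (illustration only, not real predictions).
fitted = [
    ("Hitter A", 0.90), ("Hitter A", 0.10),
    ("Hitter B", 0.20), ("Hitter B", 0.40), ("Hitter B", 0.60),
]
means = mean_prediction_by_hitter(fitted)
for hitter, m in sorted(means.items()):
    print(f"{hitter}: {m:.1%}")
```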

Graphs, Hitters, IceBat

Home Runs: They May Deceive You

First of all, Happy New Year, everyone! Hope you had a fun time doing whatever you do on these holidays. IceBat was a party pooper and decided to sleep all night in his freezer bed.

As you may remember, back in December I had to complete a couple of final projects. One idea that I didn’t use dealt with the concept that home runs are not always equal in displaying a player’s power or batting skills. We equate overpowering shots to right-center field by Prince Fielder with balls that graze the more-than-generous right field wall of Yankee Stadium. What I mean is, more variables than pure distance go into determining whether or not a fly ball becomes a home run. With this in mind, I can run a regression model to compute the probability that a fly ball will turn into a home run. I received a large data set (many thanks to Greg Rybarczyk at Hit Tracker) that spans the 2006-2008 seasons for three players (Adam Dunn, Manny Ramirez and Jason Bay). The data includes both observed and calculated values (similar to Hit Tracker’s own data, e.g. True Distance or Elevation Angle) on every long fly ball the players hit, totaling a tad over 700 observations. Included are variables such as which ballpark the ball was hit in, the date and time, and the outcome of the play (single, double, home run, out, etc.).

As you can tell from the graph above, the outcome of the play isn’t so clear when we are given only the ball’s elevation angle and distance traveled. The outcomes are scattered widely enough that we cannot infer any real relationship. I superimposed two boxes to show how similar balls can have different outcomes. In the case of the right-side box, a slightly different elevation angle could mean the difference between a home run and a fly ball.
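For readers curious what "a regression model to compute the probability that a fly ball will turn into a home run" might look like, here is a minimal sketch of a logistic regression fit by gradient descent. Everything in it is an assumption for illustration: the data is synthetic (randomly generated distances and elevation angles), only two predictors are used, and the fitting routine is a plain hand-rolled one, not necessarily how the model in this post was fit.

```python
import math
import random

def sigmoid(z):
    """Numerically safe logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(w, x):
    """Predicted HR probability; w[0] is the intercept."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def fit_logistic(X, y, lr=0.5, epochs=1000):
    """Fit weights by batch gradient descent on the average log-loss."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

# Synthetic fly balls (NOT the Hit Tracker data): distance in feet and
# elevation angle in degrees, both roughly standardized before fitting.
random.seed(0)
X, y = [], []
for _ in range(300):
    dist = (random.uniform(300, 450) - 375) / 40
    angle = (random.uniform(20, 45) - 32.5) / 7
    true_p = sigmoid(-0.5 + 3.0 * dist + 0.5 * angle)  # made-up "truth"
    X.append([dist, angle])
    y.append(1 if random.random() < true_p else 0)

w = fit_logistic(X, y)
print("intercept, distance, angle weights:", [round(v, 2) for v in w])
print("P(HR) for a long fly ball: ", round(predict(w, [1.5, 0.0]), 3))
print("P(HR) for a short fly ball:", round(predict(w, [-1.5, 0.0]), 3))
```

With real data you would add the other recorded variables (ballpark, weather and so on) as predictors; the sketch only shows the mechanics of turning fly ball measurements into a home run probability.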

Continue reading

Graphs, Hitters, IceBat, Pitching

No Time!!

Crazy schedule for me these next few weeks. I’ll try to stay active, I promise. If not, IceBat will take over, but I’m guessing he won’t say much (he’s pretty shy and likes to chill in the corner of my room). Anyways, I thought I’d share a recent report I did for my time series class. It’s about the general shift in runs scored per game (by one team) over the years of MLB’s existence. If you have some time (and enjoy a few technical terms), I’ve uploaded a link below. Happy December holidays!

R.G in MLB History

Graphs, IceBat, Pitching

Pitch F/X

Ever wonder about the exact location, movement, speed, rotation, and spin angle of a pitch? With Pitch F/X, every ball thrown in the majors is measured down to a science. It’s pretty awesome, but even after spending hours looking at the data, it can be a bit confusing as to what the variables mean and why they matter. I’ll try to explain most of the variables to the best of my abilities. I’ll be using F/X data from Dallas Braden’s perfect game on May 9, 2010 against the Tampa Bay Rays.

Continue reading

Graphs, Hitters, Pitching

How Consistent are Baseball Players?

Batting averages, on-base percentages, and ERA are all standard metrics used to compare baseball players. But how do they help us determine a player’s day-to-day production? I wanted to look at two players’ overall game performance over the course of 100+ starts. Let’s begin with Albert Pujols.

I think it would be an understatement to say Pujols has been the most consistent hitter in baseball over the past 5+ years. He’s a sure bet to be in the MVP talks from the beginning of each season. I decided to look at his Win Probability Added per game over the 2010 season (a positive WPA means he improved his team’s chances of winning the game, while a negative WPA means he hurt them).
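As a rough illustration of the idea, a player's WPA for a game can be thought of as the sum of the changes in his team's win expectancy across his plays, and the spread of those per-game totals gives one simple read on consistency. The sketch below assumes the before/after win expectancies are already known (in practice they come from a lookup by inning, score and base-out state); all numbers are hypothetical, and using the standard deviation as the consistency measure is my assumption for this example.

```python
from statistics import mean, pstdev

def game_wpa(plays):
    """Sum the change in team win expectancy over a player's plays.

    plays: list of (win_exp_before, win_exp_after) pairs for one game.
    """
    return sum(after - before for before, after in plays)

# Hypothetical win-expectancy pairs for three games (illustration only).
games = [
    [(0.50, 0.62), (0.62, 0.58), (0.45, 0.70)],  # a good game
    [(0.50, 0.46), (0.40, 0.37), (0.30, 0.28)],  # a bad game
    [(0.55, 0.60), (0.60, 0.57)],                # roughly neutral
]
per_game = [game_wpa(g) for g in games]

print("WPA by game:", [round(v, 2) for v in per_game])
print("mean WPA per game:", round(mean(per_game), 3))
print("std. dev. of game WPA:", round(pstdev(per_game), 3))
```

A smaller standard deviation would point to steadier game-to-game production, while the mean says whether that production is helping or hurting on balance.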

Continue reading


Tulo’s Crazy September

Update: The following graphics do have updated numbers from all of September.

(I’m just going to ignore the fact that you, the reader, have just realized I have removed the six-month-old dust from this blog.)

Full version: Tulo Heatmap

Troy Tulowitzki is crazy! I was reading an article recently about his supposed September surge in numbers. I decided to take a look and created a heatmap in R. The visual is pretty simple to read: light blue = not so good, and dark blue = on fire. Remember that only 15 games have been played so far this month. In that span, Tulo’s had 14 home runs. September could end right now and he’d have matched or set new highs for a month’s worth of baseball. Although there isn’t much of a trend in his past numbers, Tulo has shown better numbers in the second half of the season in the ’07 and ’09 campaigns. If you’re Colorado, you’re lovin’ Tulo’s contribution to the Rockies’ playoff push. Let’s hope the next two weeks are interesting.

Note: You can find a tutorial on heatmaps from the excellent data visualization blog, FlowingData.
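If you want the gist of what a heatmap like this does without opening R, here is a small sketch in Python. It rescales each stat column so the best month maps to the darkest shade and the worst to the lightest. The min-max scaling, the text "shades" and the numbers are all assumptions for illustration; they are not the data or the exact scaling behind the graphic above.

```python
def scale_columns(table):
    """Min-max scale each column of a row-major table to the range [0, 1]."""
    scaled_cols = []
    for col in zip(*table):
        lo, hi = min(col), max(col)
        span = hi - lo
        # A constant column carries no contrast, so map it to 0.
        scaled_cols.append([0.0 if span == 0 else (v - lo) / span for v in col])
    return [list(row) for row in zip(*scaled_cols)]

def shade(value, ramp=" .:*#"):
    """Pick a character from light (left) to dark (right) for a 0-1 value."""
    idx = min(int(value * len(ramp)), len(ramp) - 1)
    return ramp[idx]

# Placeholder monthly lines (HR, AVG, RBI); not real player numbers.
months = ["Month 1", "Month 2", "Month 3"]
stats = [
    [3, 0.25, 12],
    [6, 0.31, 20],
    [9, 0.28, 16],
]
scaled = scale_columns(stats)
for name, row in zip(months, scaled):
    print(name, "".join(shade(v) for v in row))
```

Scaling by column matters because home runs, batting average and RBI live on very different scales; without it, the largest raw numbers would dominate the coloring.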