How Well Did ZiPS Perform for the 2011 Rookie Hitters?

Introduction

The incredible Dan Szymborski is currently rolling out his ZiPS projections for the 2012 baseball season team-by-team at the Baseball Think Factory (most recently the Orioles). These projections not only include well-established veteran players, but also minor league players who have little to no track record in the major leagues, which is where I will focus my attention.

Szymborski was kind enough to send me a link to the 2011 projections. I went to Fangraphs and downloaded a table of all of the 2011 rookies (there may be some mistakes in it, such as Alexi Ogando, who was not actually a rookie). Then I found OPS+ statistics at Baseball-Reference (they are not available at Fangraphs). Finally, I compared the ZiPS projections to the actual totals from those rookies.

Method

There are 17 counting stats, four rate stats and one league-average stat projected by ZiPS. I was able to find the actual production for each of these except RC/27, which I have not compared yet and will not include in this analysis. Fangraphs has a wRC stat, but it is a counting stat (total number of runs) instead of a rate stat (runs created per 27 outs). That still leaves 21 stats to compare, so I consolidated a few of them.

I compared all of the counting stats other than games and at-bats (the playing time stats) together. For each player, I took the absolute difference between what ZiPS projected and what the player actually accumulated in runs, hits, doubles, triples, etc., and added those differences together. This gives a total “counting stat” difference between projection and actual. For playing time, I only looked at at-bats, as I figured it would give me the same basic information as games. For batting average, on-base percentage, slugging percentage and OPS+, I only looked at players with more than 100 at-bats (at-bats rather than plate appearances, since there are no projected plate appearances).
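The counting-stat total described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the spreadsheet actually used: the stat abbreviations, the function name `counting_stat_difference` and the sample numbers are all assumptions made up for the example.

```python
# Hypothetical abbreviations for the 15 counting stats compared together
# (games and at-bats are left out because they are playing time stats).
COUNTING_STATS = (
    "R", "H", "2B", "3B", "HR", "RBI", "BB", "SO",
    "HBP", "SB", "CS", "SH", "SF", "IBB", "GDP",
)


def counting_stat_difference(projected, actual, stats=COUNTING_STATS):
    """Return the sum of |projected - actual| over the given counting stats.

    Stats missing from either dictionary are treated as zero.
    """
    return sum(abs(projected.get(s, 0) - actual.get(s, 0)) for s in stats)


# Invented numbers for one imaginary player, for illustration only.
projected = {"R": 40, "H": 90, "HR": 10, "BB": 30, "SO": 80}
actual = {"R": 35, "H": 97, "HR": 10, "BB": 22, "SO": 95}

total = counting_stat_difference(projected, actual)
print(total)  # 5 + 7 + 0 + 8 + 15 = 35
```

Because every gap is taken as an absolute value, an over-projection in one stat cannot cancel an under-projection in another, so the total only grows as the projection misses.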

The dark line on each of the graphs is the line x=y, where every point would fall if ZiPS projected every statistic perfectly. It is not a trend line.

Results

Playing Time

This category covers at-bats.

The playing time numbers are way off in general. There are maybe 11 players whose actual playing time is in line with what was projected. This is understandable, since it is very difficult to project how much a team will need a player throughout the year and what its plans are for him. Here is that list of 11 players:

The correlation coefficient for at-bats is 0.09 for rookies. The coefficient for all players in 2011 is about 0.16.
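For readers who want to reproduce a correlation coefficient like this one, here is a minimal sketch of the Pearson calculation in plain Python. It assumes the figure quoted is Pearson's r between projected and actual at-bats, which the post does not state explicitly, and the function name `pearson_r` and the sample lists are invented for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a sequence is constant")
    return cov / math.sqrt(var_x * var_y)


# Made-up projected and actual at-bats for four imaginary players.
projected_ab = [500, 300, 450, 120]
actual_ab = [60, 310, 200, 150]
r = pearson_r(projected_ab, actual_ab)
```

A value near 1 would mean the points hug an upward-sloping line, and a value near 0, like the 0.09 for rookies, means projected at-bats say very little about actual at-bats.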

Counting Stats

This category covers runs, hits, doubles, triples, home runs, runs batted in, walks, strikeouts, hit by pitches, stolen bases, caught stealing, sacrifice hits and flies, intentional walks and grounded into double plays.

There were 11 players whose total absolute difference across all of the counting stats was less than 100. The best projection was for Luke Hughes. It was off by two runs, four hits, three doubles, two triples, one run batted in, two walks, one stolen base, one caught stealing, two sacrifice flies and three grounded-into-double-plays, and it was exactly right on home runs, strikeouts, hit-by-pitches, sacrifice hits and intentional walks. The worst projection was for Chris Carter, where the main culprit was his 586 projected at-bats versus 44 actual at-bats. It’s difficult to accumulate counting stats when you rarely play. Here is that top 11:

You may notice that 10 of the top 11 in the counting stat category are also in the top 11 of the projected at-bat category. Jemile Weeks is the only player from the at-bat top 11 not in the counting stats top 11 (he comes in 13th) and Chris Stewart is the only player in the counting stats top 11 not in the at-bat top 11 (he comes in 13th also).

The average sum of the absolute difference of all of these categories (total amount off for all stats) is 336.
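As a worked check of the counting-stat total, the per-stat gaps listed for Luke Hughes can simply be added up. The stat abbreviations used as dictionary keys are assumptions for the example; the gap values are the ones given in the text.

```python
# Per-stat gaps between the ZiPS projection and Luke Hughes's actual line,
# as listed in the Counting Stats section (keys are assumed abbreviations).
hughes_gaps = {
    "R": 2, "H": 4, "2B": 3, "3B": 2, "HR": 0,
    "RBI": 1, "BB": 2, "SO": 0, "HBP": 0, "SB": 1,
    "CS": 1, "SH": 0, "SF": 2, "IBB": 0, "GDP": 3,
}

hughes_total = sum(hughes_gaps.values())
print(hughes_total)  # 21
```

A total of 21 is comfortably under the 100 cutoff for the top 11 and far below the rookie average of 336.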

Rate Stats

This category covers batting average, on-base percentage and slugging percentage.

The rate stats all show a positive linear correlation between projected and actual values. The r^2 values for each are listed here:

Batting average is the most highly correlated statistic that this analysis covers and it only has an r^2 value of 0.16.
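Since this post quotes both correlation coefficients and r^2 values, it may help to convert between the two before comparing them. The sketch below assumes a simple one-variable linear fit, where r^2 is just the square of Pearson's r, and it assumes a positive relationship when taking the square root. The function names are made up for the example.

```python
import math


def r_from_r_squared(r_squared):
    """Magnitude of r for a simple linear fit with the given r^2."""
    if not 0.0 <= r_squared <= 1.0:
        raise ValueError("r^2 must be between 0 and 1")
    return math.sqrt(r_squared)


def r_squared_from_r(r):
    """r^2 for a simple linear fit with correlation coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must be between -1 and 1")
    return r * r


print(round(r_from_r_squared(0.16), 2))    # 0.4
print(round(r_squared_from_r(0.09), 4))    # 0.0081
```

Under those assumptions, batting average's r^2 of 0.16 corresponds to an r of about 0.4, while an r of 0.09 (the rookie at-bat figure, if it is indeed r rather than r^2) corresponds to an r^2 of less than 0.01.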

OPS+

OPS+ is the best overall measure of offense available here for comparing ZiPS to actual rookie performance.

The correlation coefficient for this graph is 0.1, which is not very strong. It is possible that differences in how OPS+ is calculated (park factors, etc.) are large enough to make this comparison invalid. However, as it looks now, ZiPS does not do a very good job of projecting rookie OPS+.

Conclusion

At least in 2011, ZiPS struggled to project the proper playing time for rookie hitters. However, it seems that the projected rates behind the counting stats are a bit more valid. Batting average and slugging percentage have among the highest correlations between projected and actual statistics, but their r^2 values are still below 0.2. Finding some way to improve rookie playing time numbers (if possible) would greatly increase the usability of these projections.
