Over the past few weeks on Twitter, I have been teasing a prospect evaluation system called JAVIER. Most everything is ready to go for the release at Beyond the Box Score; we are just waiting to do it during the All-Star break. However, since Baseball Prospectus and Baseball America both released their Top 50s, I thought I would tease this a little more in the form of a Google document. Below is said document, giving JAVIER’s output for the 33 hitters who are placed on at least one of the aforementioned midseason top 50s. Much of this is without the context that the BtBS posts will provide on the system, but it is rather self-explanatory. The numbers are not league or age-adjusted, so take some of that with a grain of salt. However, Anderson, Nimmo, Frazier, and Mondesi all rank poorly so far into their minor league careers. Nimmo is the odd one in that group, as his minor league numbers actually look fairly good. But of the 49 similar players, only one even made the major leagues – Delino DeShields.
I’ve also included age and league-adjusted values for the 2014 statistics for the above four prospects guessed as busts. I had to set Mondesi’s age range from 17 to 21, since not many prospects play High-A at age 18. There are four similar seasons to him with the age range set at 18-19 and the best MLB career from those was Ben Davis’s.
It’s fantasy baseball draft season again and here I sit…again…trying to analyze the heck out of everything. The initial trigger for what is to follow was a prospect draft and I’ll get to that at a later time. First, I want to lay the groundwork for my ranking system. I decided it would be worth my time and effort to create a post-hoc ranking system for past fantasy seasons. This is not common as fantasy baseball deals strictly with “what have you done for me lately,” not “what did you do 25 years ago.” But here we are anyway, talking about Julio Franco’s 13 home runs as a second baseman in 1989.
First, the ground rules. Players have different values based on different league rules. What statistics are counted? How many teams are there? What are the starting positions? Instead of adding adjustments for different league settings, I decided to use the Yahoo standard settings. This means 12 teams; offensive statistics of HR, R, RBI, SB and AVG; starting positions of 1 Catcher, 1 First Baseman, 1 Second Baseman, 1 Shortstop, 1 Third Baseman, 3 Outfielders, 1 Utility Player; and positional eligibility of 10 games played or 5 games started.
Now I could put a value to every single player season since 1989. I chose this year because Baseball America started ranking prospects in 1990 and I wanted to include all of their data in my database. I could go back farther, but 24 seasons is plenty good. I already had the backbone of a 5×5 fantasy system from some projection work I did a few years ago; I just needed to apply it to multiple seasons. The actual ranking system relies on z-scores to give a point total to each player’s contribution in each of the five offensive categories. Instead of using league averages and standard deviations, I took those numbers only from players with at least 320 plate appearances in a season (for the strike-shortened 1994 and 1995 seasons I used PA minimums of 225 and 284, respectively). This is roughly the top 230 to 250 players in playing time, which is a good selection of the readily available fantasy talent. The numbers may change a bit depending on this assumption, but not enough to matter. For instance, in 1989 the average HR total for the top 250 fantasy players was 10.9 with a standard deviation of 8.6. This means Julio Franco gets (13-10.9)/8.6 = 0.24 HR points in 1989.
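For anyone who wants to follow along at home, here is a minimal sketch of the z-score step in Python. The function names are my own for illustration, and the choice of population standard deviation in `z_points` is an assumption (a sample standard deviation would barely move the numbers with a pool of about 250 players).

```python
from statistics import mean, pstdev


def z_points_from_summary(value, pool_mean, pool_sd):
    """Category points: how many standard deviations above the pool average."""
    return (value - pool_mean) / pool_sd


def z_points(value, pool):
    """Same calculation, starting from the raw totals of the qualifying pool.

    Assumption: population standard deviation (pstdev) is used here.
    """
    return z_points_from_summary(value, mean(pool), pstdev(pool))


# Julio Franco's 13 HR in 1989, using the pool average and SD quoted above.
franco_hr_points = z_points_from_summary(13, 10.9, 8.6)
print(round(franco_hr_points, 2))
```

The same function is applied to each of the five categories, with the mean and standard deviation recomputed for every season.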
I also had to create a replacement value for every position in every season, but before I could do this, I had to find for what position each player was eligible. I used the 5 games started/10 games played requirement provided by Yahoo and matched up defensive stats for each player and gave them eligibility at the appropriate positions. Players maintained eligibility the following year for each position. Then I set the replacement value of each position as the non-adjusted fantasy value (NAFV) of the 12th best player at that position for that year (36th for outfielders). For first basemen and outfielders, I filtered the data to only show 1B, 1B/OF or OF, 1B/OF players since these positions tend to have the highest replacement value. If a player was eligible for multiple positions, his positional value was set as the most valuable of these positions for that year. This is admittedly the weakest part of the analysis, but I believe the numbers are still solid and provide an accurate representation of value. The replacement value for designated hitters was set as the least valuable of the other positions.
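The replacement-level lookup can be sketched roughly as follows. This is a simplified illustration, not my actual spreadsheet: the function name, the tuple layout and the toy numbers are all made up, and the 1B/OF filtering and multi-position rules described above are left out.

```python
def replacement_values(players, starters_per_position):
    """Return {position: NAFV of the Nth-best eligible player}.

    players: list of (name, eligible_positions, nafv) tuples (hypothetical layout).
    starters_per_position: e.g. {"C": 12, "OF": 36} for a 12-team league.
    """
    result = {}
    for position, n_starters in starters_per_position.items():
        # Rank every player eligible at this position by raw fantasy value.
        values = sorted(
            (nafv for _, eligible, nafv in players if position in eligible),
            reverse=True,
        )
        # Only defined when enough eligible players exist.
        if len(values) >= n_starters:
            result[position] = values[n_starters - 1]
    return result


# Toy two-team league with invented values, just to show the mechanics.
toy_players = [
    ("A", {"C"}, 3.0),
    ("B", {"C"}, 1.0),
    ("C", {"C", "SS"}, 2.0),
    ("D", {"SS"}, 0.5),
]
print(replacement_values(toy_players, {"C": 2, "SS": 2}))
```

With real data, the second argument would be 12 for every infield position and catcher, and 36 for outfield.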
In summary, player value is calculated as HR points + R points + RBI points + SB points + AVG points – Position Replacement Value.
For example, we’ll look at 2012 Mike Trout. The average stat line for a fantasy-relevant player was 11 HRs, 59 Rs, 56 RBI, 11 SB and a 0.264 AVG. Trout contributed 30 HRs, 129 Rs, 83 RBI, 49 SB and a 0.326 AVG. This gave him 1.49 HR points, 3.15 R points, 0.93 RBI points, 3.67 SB points and 2.26 AVG points for a raw value of 11.50. However, outfield was the second strongest position last year, with Ichiro Suzuki (9/77/55/29/0.283*) representing replacement level. Trout lost 2.23 points for his position and had a final PAFV of 9.27, which ranked 27th out of all fantasy offensive player seasons since 1989.
*I will be using a quadruple slash line (there are four slashes and five stats) to represent player seasons in the interest of time and virtual space. These slash lines will always be formatted as HR/R/RBI/SB/AVG.
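To make the Trout arithmetic above easy to check, here is a small sketch that adds up the five category point totals and subtracts the outfield replacement value. The function name and dictionary layout are illustrative only; the numbers are the ones quoted in the example.

```python
def fantasy_value(category_points, replacement_value):
    """Return (raw value, position-adjusted value) for one player season."""
    raw = sum(category_points.values())
    return raw, raw - replacement_value


# 2012 Mike Trout: category points and outfield replacement value from the text.
trout_2012 = {"HR": 1.49, "R": 3.15, "RBI": 0.93, "SB": 3.67, "AVG": 2.26}
raw, adjusted = fantasy_value(trout_2012, 2.23)
print(f"raw {raw:.2f}, position-adjusted {adjusted:.2f}")
```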
What do these points mean? Well obviously, higher is better; however, what is great, what is good and what is horrible? Fantasy point total is not normally distributed, so I can’t (or don’t know how to) create a system similar to ERA+ where better or worse values lie above or below an average of 100. Instead, I’ll make this chart:
I included all players with at least 130 at-bats in this chart because that was the cut-off for rookie eligibility. I figured it was a good enough place to stop, but again it doesn’t change much. The top 120 players are at least average and the bottom 250 are bad or horrible. About eight players have great fantasy seasons every year.
Along with the typical baseball acronyms (HR, RBI, etc.) I will be using two brand-spanking new acronyms, PAFV and NAFV. PAFV stands for position-adjusted fantasy value. It is the amount a player is worth based on his numbers compared to the fantasy average, with a position adjustment. This is his overall value. NAFV is non-adjusted fantasy value. This is how much a player is worth, regardless of position.
Last night, Brett Lawrie had a run-in with the umpire. Lawrie is a very high-energy player and responded to a called third strike with…lots of energy. He threw down his helmet, which bounced on the ground and hit the umpire, most certainly leading to a suspension, regardless of intent. There are lots of arguments about his intent, about how you can’t justify his response because of a bad call, blah blah blah. I’m not into that kind of stuff. Fact 1: Umpire called two pitches outside the normal strikezone strikes. Fact 2: Brett Lawrie didn’t like it. Fact 3: Brett Lawrie took his helmet off, spiked it on the ground and it hit the umpire. Moving on.
My first response to hearing about this was “Jose Molina involved in another controversial called strike? Seems to be a pattern there.” I then entered into a good conversation with Mike Ferrin on Twitter about catcher framing and how much we both love it and how undervalued it still is.
Here is the Brooks Baseball strikezone plot for right-handed hitters from Bill Miller last night. The three circled pitches are the Ball 1, Strike 2 and Strike 3 in the Lawrie PA.
After Jose Altuve scorched the Brewers for four hits in five plate appearances yesterday, I did a little research. I have been following his major league career a tiny bit, as I own him on a fantasy team. I picked him up after Kevin Goldstein’s obsession with him last season. Basically, he’s a little guy that can hit but lacks any sort of plate discipline. In fact, that’s exactly what happened last year. In 234 PAs, he had a .276 batting average (not bad for a 21-year old) and a 2.1% walk rate. However, this year in 77 PAs, his walk rate is an above-average 9.1%. Derek Carty found that walk rate stabilizes after 168 non-IBB/HBP PAs, so the sample size is not yet large enough, but it is about halfway there.
Obviously, this realization required a deeper look. That very day, Astros beat writer Brian McTaggart noticed that while the Astros’ batting average and slugging percentage through 19 games are very similar to last year, their on-base percentage is 20 points higher. Astros front-office analyst Mike Fast replied to that post with “Interesting comparisons. I love seeing that OBP where it is this year so far.” Of course you do, Mike.
So what is leading to the Astros’ increased on-base percentage, and is it a result of the new Jeff Luhnow-run front office? Well, I can’t answer that second question because I’m not on the inside, but I may have some input into the first. I do believe that there is a systematic change in the Astros hitters’ approaches this season.
The first difficulty in looking at this is that so many of the Astros hitters are fairly young, so they don’t have a good amount of MLB PAs to compare this season with. However, since walk rate does stabilize so quickly, the numbers should still be helpful. I compared each player’s 2012 plate discipline numbers with their 2011 ones. Jason Castro and Justin Maxwell did not play in 2011, so I used their 2010 numbers instead.
I found the top 12 hitters in PAs for the Astros this season, excluding rookie Marwin Gonzalez. Then I compared their BB%, O-Swing%, Z-Swing%, Swing%, O-Contact%, Z-Contact% and Contact% between 2011 and 2012. You can read about the definitions of these stats on Fangraphs (their custom leaderboards and player lists made this very easy). I used the PITCHf/x versions of each of them.
This is what I found. Positive numbers mean the number is higher in 2012 than it was in 2011. (I apologize for the poor formatting)
I have heard this many times and finally decided to look at it. Baseball commentators will say “Justin Masterson really attacks hitters early in the count so he tends to give up more solo home runs.” First of all, Masterson doesn’t get a lot of first strikes and he doesn’t give up a high relative amount of solo HRs. But moving on…
I looked at all pitchers from 2009-2011 who gave up at least 10 total home runs and compared their solo HRs with HRs with runners on base. I then compared this to each pitcher’s Zone% and First Strike%. The Zone% should capture how aggressive a pitcher is overall, while the First Strike% should capture how aggressive the pitcher is early in the count.
The Detroit Tigers agreed to sign Prince Fielder to a 9 year, 214 million dollar contract today. According to Seamheads, Miller Park has a home run park factor for LHB of 121. Comerica’s is 98. From katron, Prince Fielder’s hits superimposed onto Comerica Park:
At some point in the next few days, the name(s) of the people elected to Major League Baseball’s Hall of Fame will be announced. There has been much banter about who should or shouldn’t be elected as always. Honestly, I haven’t looked into it deeply enough to form my own opinion about some of the borderline players. However, there is one player that I have yet to see a vote for who I believe has earned consideration, namely, Tony Womack. Here we go.
I know what you’re saying: “Tony Womack was at best just above replacement level and should probably not even be on the ballot.” Maybe you’re right. But let’s take a closer look at his career, shall we?
Womack led the National League in stolen bases three consecutive years. You know who else did that? Lou Brock, Willie Mays, Kiki Cuyler and Max Carey, among others. That’s some good company. His 363 career stolen bases rank above Hughie Jennings, Buck Ewing and Rod Carew – three more Hall of Famers. He is one of only 19 players to exceed the 60 stolen base mark more than once in the expansion era. Simply put, Womack was one of the most prolific base-stealers of all time.
Baseball Prospectus says: “…Womack is a track star…”
Womack was also versatile. He played three key defensive positions for over 100 games each and two more for 40 games each. Though he never won a Gold Glove, he was certainly more than adequate each place he played. Not only was he versatile on the field, he was also versatile in the batting order, accumulating at least 100 plate appearances in five different lineup spots. A majority of his time was spent at the most important spot — leading off.
Baseball Prospectus says: “As a child, I [...] believed that Tony Womack was a great leadoff man…”
The incredible Dan Szymborski is currently rolling out his ZiPS projections for the 2012 baseball season team-by-team at the Baseball Think Factory (most recently the Orioles). These projections not only include well-established veteran players, but also minor league players who have little to no track record in the major leagues, which is where I will focus my attention.
Szymborski was kind enough to send me a link to the 2011 projections. I went to Fangraphs and downloaded a table of all of the 2011 rookies (there may be some mistakes, such as Alexi Ogando who is not actually a rookie). Then, I found OPS+ statistics from baseball-reference (not available at Fangraphs). Finally, I compared the ZiPS projections to the actual totals from those rookies.
There are 17 counting stats, four rate stats and one league-average stat projected by ZiPS. I was able to find the actual production for each of these except RC/27, which I have not compared yet and will not include in this analysis. Fangraphs has a wRC stat, but it is a counting stat (total number of runs) instead of a rate stat (runs created per 27 outs). That still leaves 21 stats to compare, so I consolidated a few of them. I compared all counting stats besides games and at-bats (playing time stats) together. I took the absolute difference between how many runs, hits, doubles, triples, etc. ZiPS projected and the player actually accumulated and added them all together. This gives a total “counting stat” difference between projection and actual. For playing time stats I only looked at at-bats, as I figured it would give me the same basic information as games. For batting average, on-base percentage, slugging percentage and OPS+, I only looked at players with more than 100 at-bats (no projected plate appearances).
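The combined "counting stat" difference described above can be sketched like this. The function name, the stat labels and the toy stat lines are made up for illustration and are not actual ZiPS projections.

```python
def counting_stat_error(projected, actual, exclude=("G", "AB")):
    """Sum of absolute projection misses across counting stats.

    Playing-time stats (games and at-bats by default) are left out,
    as are any stats missing from the actual line.
    """
    return sum(
        abs(projected[stat] - actual[stat])
        for stat in projected
        if stat not in exclude and stat in actual
    )


# Invented example lines, only to show the mechanics.
toy_projection = {"AB": 400, "R": 50, "H": 100, "2B": 20}
toy_actual = {"AB": 350, "R": 45, "H": 110, "2B": 18}
print(counting_stat_error(toy_projection, toy_actual))  # 5 + 10 + 2 = 17
```

Because every miss counts as positive, an over-projection in hits cannot cancel out an under-projection in doubles.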
The dark line on each of the graphs represents x=y (if ZiPS could perfectly project every statistic), not a trend line.
This category covers at-bats.
Yu Darvish’s actual stats are on the very bottom of this page.
As you may know, Yu Darvish, an elite Japanese pitcher, may be coming to the major leagues next year. Teams bid on a posting fee (paid to Darvish’s Japanese team, the Nippon-Ham Fighters) for the 26 year-old and the winning team – the Texas Rangers – gets to negotiate a contract with him. If the negotiations are successful by the 4 PM CT deadline on January 18th, Darvish will be pitching for the Rangers next season. The obvious question is how good is he? We have been told that he is really good and his numbers (career 2.00 ERA) make him look good, but how will he perform in the major leagues against the best hitting in the world? Daisuke Matsuzaka was supposed to be good, but he has not performed as well as hoped with the Red Sox. So it all boils down to this: What pitching statistics have historically correlated well between the NPB and MLB?
First, I decided to only use players who began their career in the NPB, which excludes American players like C.J. Nitkowski and Colby Lewis. According to Japanese Ballplayers, this list comprises 40 players, 29 of whom are pitchers. Then, I found the amazing Data Warehouse at Japan Baseball Daily, where all of the NPB stats in this article originated. Finally, I gathered all of the NPB stats for each of these 30 pitchers (including Darvish) and compared them with major league statistics from Fangraphs. I excluded all statistics from players in the NPB after they appeared in the major leagues. I added Darvish’s calculated major league statistics based on the trend line for each (marked by the red data point).
Darvish MLB FIP: 4.36
NPB FIP is calculated according to the formula on the Fangraphs Glossary, excluding intentional walks, since I don’t have that data. There is basically no correlation between NPB FIP and MLB FIP.
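For reference, a FIP calculation without an intentional-walk term looks roughly like the sketch below. The function name is my own, and the default constant is only a placeholder: the real constant is league- and season-specific, chosen to put FIP on the ERA scale, and the value used for the NPB numbers here is not stated.

```python
def fip(hr, bb, hbp, so, ip, constant=3.10):
    """Fielding Independent Pitching, with no intentional-walk adjustment.

    `constant` is a placeholder default (an assumption), not the value
    used for any league or season in this article.
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * so) / ip + constant


# Invented season line, only to show the arithmetic:
# (13*10 + 3*(30 + 5) - 2*200) / 200 + 3.0
print(fip(hr=10, bb=30, hbp=5, so=200, ip=200.0, constant=3.0))
```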
Seven months ago, back when The Process Report was still up and running, R.J. Anderson looked at which Rays player spent the most time on the field. I mulled this over a bit and decided that he was missing the offense part of the equation. I created a metric for time on field in May, but never got around to actually revealing my findings. Since that time, Fangraphs has updated their leaderboards, making this a much easier task. I am looking at how much time each player spent on the field (Time on Field) and how many wins they contributed per ten thousand minutes (Value Intensity).
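As a quick sketch of the second metric, Value Intensity is just wins scaled to a ten-thousand-minute basis. The function name and the example inputs are hypothetical, and which wins metric goes in is left open.

```python
def value_intensity(wins, minutes_on_field):
    """Wins contributed per 10,000 minutes spent on the field."""
    return wins * 10_000 / minutes_on_field


# Invented inputs: 5 wins over 25,000 minutes on the field.
print(value_intensity(5.0, 25_000))
```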
As far as I can tell, a player can spend time on the field in three ways: playing defense, making a plate appearance and running the bases. I’ll break each of these up and explain how I calculated the amount of time each player spent there.