Arbitration Projection Model Rumors
Twenty teams have officially finished their seasons and are already considering available free agents, which means they also have to consider whom they can afford. This involves predicting the salaries they will have to pay to arbitration-eligible players. Rather than being determined by the highest bid, these players' salaries are set by an arbitration panel. Of course, very few players ever actually make it in front of that panel, since teams and agents spend considerable resources trying to resolve their salaries in advance.
Last year, MLBTR owner Tim Dierkes asked me if I thought I could put together a model that predicted arbitration salaries. I had studied free agent salaries, but I decided that I could probably do almost as well with arbitration salaries. It went better than expected: the model was within 10% of the actual salary for 55% of players who signed one-year deals, and was within $1MM for all but 4 of the 156 arbitration-eligible players.
Unlike free agents, whose salaries are determined by the highest bid among 30 teams with 30 different ways of predicting and valuing future performance, arbitration eligible players receive salaries based on the similarity between their past performance and the performances of other comparable players. A well-designed model can do a good job of sifting out which statistics are most important and predict salaries accordingly.
Last year’s model was strong, but there were still a number of players who were poorly projected. One category with which I struggled was breakout stars entering arbitration for the first time. Jordan Zimmermann received a salary of $2.3MM, above my projection of $1.8MM. Even though he had only 23 career starts in an injury-checkered past before 2011, his solid 3.18 ERA in 161 1/3 innings in 2011 seemed to matter more than his previous injuries. Another thing I learned from my 2012 projections was that previous salary did not matter much for first-time eligible players. My biggest overestimates included projecting David Price at $7.8MM instead of his actual $4.35MM salary and Rick Porcello at $4.2MM instead of his actual $3.1MM, since I thought hefty Major League deals given to draft picks would give these players a leg up going into arbitration. I have since learned that they do not. I also missed big on some players who had strong rebound performances after being non-tendered the season before. The biggest miss was Melky Cabrera’s 2012 salary: I predicted only $4.4MM, well short of his $6MM earnings. It turns out that bouncing back after being non-tendered gives players like Cabrera a little extra room for raises, and such players are now projected for higher salaries in 2013.
I did a lot of work on improving pitcher projections for this year’s model. I originally included all pitchers in the same model, which gave them credit for wins, saves, and holds as they received them in each role. This was supposed to better incorporate swingmen and other pitchers with evolving roles, but I now have separate models for starters and relievers, which allows for more accuracy for everyone. In last year’s model, I ignored the importance of strikeouts for starters and had to introduce other measures to juice the salaries of elite starters. This year’s model incorporates elite starters much more smoothly. The starter/reliever distinction also gave me an opportunity to notice an important feature of arbitration -- declining marginal returns to individual statistics. It turns out that the gap in earnings is much larger between pitchers with 170 innings and 200 innings than between pitchers with 200 innings and 230 innings, and that a guy with 30 saves out-earns a guy with 20 saves by far more than a guy with 40 saves out-earns a guy with 30 saves.
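To make the idea of declining marginal returns concrete, here is a minimal Python sketch. The logarithmic curve is purely an assumption chosen because it is concave; it is not the functional form the model actually uses, and the function names are hypothetical. It only shows that under any such curve, the same 30-inning or 10-save step is worth less the higher the starting point.

```python
import math

def concave_credit(stat_total: float) -> float:
    """Illustrative 'credit' for a counting stat.

    The log form is an assumption used only because it is concave;
    the real model's functional form is not described in the article.
    """
    return math.log(stat_total)

def marginal_gap(low: float, high: float) -> float:
    """Extra credit earned by moving from `low` to `high` of a stat."""
    return concave_credit(high) - concave_credit(low)

# Innings pitched: the same 30-inning step at two starting points.
ip_gap_170_to_200 = marginal_gap(170, 200)
ip_gap_200_to_230 = marginal_gap(200, 230)

# Saves: the same 10-save step at two starting points.
sv_gap_20_to_30 = marginal_gap(20, 30)
sv_gap_30_to_40 = marginal_gap(30, 40)

print(f"IP 170->200: {ip_gap_170_to_200:.3f}  IP 200->230: {ip_gap_200_to_230:.3f}")
print(f"SV 20->30:   {sv_gap_20_to_30:.3f}  SV 30->40:   {sv_gap_30_to_40:.3f}")
```

In both pairs the first step is worth more than the second, which is the pattern described above. How steeply the returns decline in practice depends on the fitted model, not on this toy curve.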
In the coming weeks, we will present the projections for all 30 Major League teams’ arbitration-eligible players. Last year’s projections had a good foundation, but we believe this year’s will be even better. These can help teams and fans alike as they try to anticipate trades, extensions and non-tenders and determine how much money is available for free agents.
This past offseason, we projected the salaries for 155 arbitration-eligible players who received one-year contracts. The results were significantly better than I expected. In my first article on the projections, I estimated that we would be within $700K for about half of players, but we actually were within $200K for half of players. Our projected salary was within 5% of actual salary for 29% of players, within 10% of actual salary for 55% of players, and within 20% of actual salary for 81% of players. In fact, there were only four players who received a salary that was more than $1MM away from their projected salary.
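The accuracy figures above (share of players within a percentage band, and the error that half of players fall under) can be computed as in the following Python sketch. The function names are hypothetical. The sample rows are only a handful of the misses quoted elsewhere in this series, so the output describes those misses and will not reproduce the aggregate numbers for the full set of players.

```python
from statistics import median

# (player, projected $MM, actual $MM): a few of the misses quoted in this
# series. This is NOT the full data set behind the aggregate figures.
NAMED_MISSES = [
    ("Jordan Zimmermann", 1.8, 2.3),
    ("Rick Porcello", 4.2, 3.1),
    ("Melky Cabrera", 4.4, 6.0),
    ("Andrew Miller", 1.6, 1.04),
    ("Jesse Litsch", 1.3, 0.975),
    ("Jerome Williams", 1.4, 0.8),
]

def share_within_pct(rows, pct):
    """Fraction of players whose projection is within `pct` of actual salary."""
    hits = sum(1 for _, proj, actual in rows if abs(proj - actual) / actual <= pct)
    return hits / len(rows)

def median_abs_error(rows):
    """Error (in $MM) that half of the projections fall within."""
    return median(abs(proj - actual) for _, proj, actual in rows)

def misses_over(rows, threshold_mm):
    """Names of players whose projection missed by more than `threshold_mm`."""
    return [name for name, proj, actual in rows if abs(proj - actual) > threshold_mm]

for band in (0.05, 0.10, 0.20):
    print(f"within {band:.0%}: {share_within_pct(NAMED_MISSES, band):.0%}")
print(f"median absolute error: ${median_abs_error(NAMED_MISSES):.2f}MM")
print("missed by more than $1MM:", misses_over(NAMED_MISSES, 1.0))
```

Running the same functions over every projected and actual salary pair would yield the kind of summary reported above.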
However, within these aggregate numbers is a mixture of many very accurate projections and quite a few that were way off. Pitchers were either very easy to project or very hard to project. When pitchers matched up very well with historical comparables, they fit squarely into categories. However, some pitchers proved to be a new breed with weak sets of comparables. As a result, of the closest ten projections, nine were pitchers, but of the worst ten projections, eight were pitchers.
When there was more precedent for a player’s performance, projecting his salary was much easier. The reason that so many relief pitchers were among the best projections was that they have very defined roles, and they are paid according to their role. Closers, set-up men, middle relievers, and long relievers all tend to get salaries similar to those that other relievers in their service class and role have received in the past. Our projections were within $25K of actual salaries for relievers such as Craig Breslow, Brad Ziegler, Daniel Bard, Bill Bray, Edward Mujica, and Burke Badenhop. Each of these guys had a defined role and matched up nicely with historical comparables in similar roles.
Salaries are also very predictable for players who miss all or most of the previous season. These players almost never get big raises, and almost no players ever get pay cuts, so they often receive the same salary as in the previous season. So this year it wasn’t surprising when Manny Parra and Dallas Braden were given another go-around at their 2011 salaries of $1.2MM and $3.35MM. Next year, it won’t be surprising when Joey Devine and Brian Wilson get repeats of their 2012 salaries (if they are tendered contracts) after coming back from Tommy John surgery.
Defined back-up hitters’ salaries can be pretty predictable as well. As with relievers, players with roles that are comparable to several other players in recent history make for quick agreements between players and teams. Jeff Baker, Emmanuel Burriss, Wilson Valdez, and Chris Denorfia all had salaries within $35K of our projected estimates.
Not all projections were so easy. One subgroup of pitchers where we may have overestimated salaries is swingmen, or pitchers who were converted from reliever to starter, or vice versa, during the season. Andrew Miller only received $1.04MM despite a $1.6MM projection, and Jesse Litsch received $975K after we estimated $1.3MM. There may be room for improvement by correctly modeling pitchers like these going forward. Broken service time can really take a chunk out of a player's salary too, especially if it’s in an atypical way. Also somewhat of a swingman, Jerome Williams settled for $800K after being projected for $1.4MM.
Two of our four biggest misses came on pitchers who were eligible for arbitration for the first time but were coming off large salaries they received as part of Major League contracts signed as amateurs. These pitchers are Rick Porcello, whose salary we overestimated by $1.1MM, and David Price, whose salary we overestimated by $2.55MM.
Porcello was an interesting case because his numbers were pretty standard for a healthy, solid, but not elite, starting pitcher. Pitchers like those typically get salaries in the $3.0-3.5MM range, so Porcello’s salary wasn’t surprising. However, he already earned $1.536MM in 2011 as part of his original contract signed out of high school, so we projected him for $4.2MM. His 2011 salary seems to have been irrelevant in the discussion about his 2012 salary.
Price was coming off a $1.25MM salary in 2011, and with a 19-win season in 2010 and 224.1 IP in 2011, he seemed primed to get a nice raise. However, as I attempted to model the effect of his 2011 salary, I overshot. It seems like Price may have given in a little early in accepting a $4.35MM deal, though, because Tim Lincecum was the only pitcher in the five years before Price with a career ERA under 3.70 (Price’s was 3.38), at least 40 career wins (Price had 41), and over 200 IP in his platform season (Price had 224.1). Jered Weaver was given $4.265MM, the largest one-year deal for a starting pitcher in his first time through arbitration in that timespan, and he had a career ERA that was 0.34 higher than Price’s and fewer innings, though Weaver did have more wins (51) than Price. However, it seems reasonable to guess that Price should have landed closer to Weaver than to Lincecum. I will look for ways to better incorporate pre-arbitration salaries going forward.
The most surprising big miss was Melky Cabrera. We expected that he would receive a nice raise from $1.25MM to $4.4MM in 2012. Only 14 position players over the previous five years received a larger raise than that. However, Melky Cabrera and the Giants agreed on a $6MM salary for 2012. Only six position players got raises that large in the last five years: Jose Bautista in 2011, Josh Hamilton in 2011, Carlos Pena in 2008, Matt Holliday in 2008, Ryan Howard in 2009, and Rickie Weeks in 2011. Those players had anywhere from 29 to 54 home runs in their platform year; Cabrera had only 18. Only Rickie Weeks (a leadoff hitter) had fewer than 121 RBI. Cabrera had 87. Among the players who had raises larger than our $3.15MM estimate, none had fewer than 25 home runs either. I think this might just be a case of the Giants being out-bargained by the aces at ACES. I’m not sure that he would have gotten a raise anywhere near that large if the Giants had held out and taken Cabrera’s case to a hearing (however, the sides wouldn't necessarily have argued 'raise' for Cabrera and others with broken service time).
Overall, the first year of these projections went very well. Still, there is room for improvement. Going forward, we will take a closer look at swingmen and other pitchers who had multiple roles in their platform season. We will also see if there is some way to tell when a large salary before arbitration is going to affect a player’s salary when he is eligible for the first time. There may also be a class of hitters for whom projections are as cut-and-dried as they often are for relievers with defined roles, so we will look for this as well. As players are just starting to accumulate their statistics for the 2012 season, we are already preparing to evaluate what those statistics will mean for their bottom lines in 2013.
In the past couple days, I have been discussing some of the factors that play into arbitration salaries and the new model that I have developed for MLBTR to predict them. Yesterday, I discussed what gets a hitter paid. Today, we’ll look at pitchers.
One thing that advanced statistical analysis of pitchers has taught us is that luck, teammates, and opportunity play large roles in a pitcher’s success. A good defense can end rallies and convert a sure extra-base hit into an out, while a good offense can put you in the position to get a win or a save. The free agent market has clearly adjusted to this knowledge—Cliff Lee had just 12 wins and finished 21st in ERA in 2010. He still got $120MM as a free agent, because his peripherals indicated he was a better pitcher than that—his SIERA was 3rd in the league. This year, his ERA was 3rd in the league too and he got 17 wins, thanks to more support from his teammates. Even recent Cy Young Awards have gone to Zack Greinke, Felix Hernandez, and Tim Lincecum, who fell far short of the standard 20-win Cy Young Award winner. However, arbitration panels have not made these same adjustments. The statistics that matter to panels remain IP, W, and ERA for starting pitchers, and IP, ERA, saves and holds for relief pitchers.
Playing time is crucial for pitchers’ arbitration salaries, just as it was for hitters. Accumulating innings gets you a big raise, even with a mediocre season. Joe Saunders got a $1.8MM raise last year, with 203 1/3 IP despite a 4.47 ERA and a 9-17 record. This year, we project Mike Pelfrey to get a $1.9MM raise to about $5.8MM for his 193 2/3 IP, despite a 4.74 ERA and a 7-13 record. Both pitchers will get raises for bad performance, since IP reign supreme.
Wins are pretty important as well. Jorge de la Rosa had 16 wins in 2009, despite a 4.38 ERA, which got him a $3.6MM raise. Our model predicts that for each four wins a pitcher gets, he will receive about a 10% larger raise, even with all of his other statistics unchanged. For example, our model has Cole Hamels getting $14.0MM in arbitration this winter with a solid ERA but only 14 wins. On the last day of the season, Phillies skipper Charlie Manuel used Hamels as a reliever in the 5th inning with the hopes that he could back into his 15th win. It didn’t work, but our model says that if it had, he could have expected an extra $200K in arbitration with a little help from his teammates during his throw day.
Relievers get paid by role. An elite closer with a history of saves gets paid far more than a set-up man, who gets paid far more than a middle reliever, even with similar performances. Andrew Bailey is slotted for $3.5MM this winter, but turn his 24 saves into 24 holds and he’d only get $2.1MM with the same elite ERA of 2.07, even with his 51 career saves prior to 2011 still on his record. Take all those saves and holds away, and he’d get under $1.0MM with 174 career IP and a 2.07 ERA. Tyler Clippard had 38 holds this year for the Nationals, which boosts him up to a $1.7MM salary estimate. Take away 33 of those 38 holds to make him a middle reliever, and he only projects to get $1.3MM.
Even more so than hitters, one of the best ways for a pitcher to woo an arbitration panel is to have good teammates and a manager that puts him in a position to accumulate the right statistics. He’ll get more wins, saves, and holds with an offense that puts him in front, and more IP with a lower ERA with a defense that turns hits into outs.
Yesterday, I discussed the model that I developed for MLBTR to predict arbitration salaries. The model uses similar information to that which arbitration panels use to determine salaries, and generates an estimate for players that is very close to the actual salary the players earn. Today, I’ll talk a little bit about the salaries of hitters.
One of the most important determinants of a hitter's salary is playing time. For position players, this comes in the form of plate appearances. It shouldn't be surprising that back-ups make less than regulars, but even among regulars, position players who make it onto the field every day get paid more. For example, Hunter Pence got a $3.4MM raise last year for hitting .282 with 25 HR and 91 RBI, but with 658 PA. Adam LaRoche hit .270 with 25 HR and 85 RBI in 2009, but only got a $2.15MM raise for his 554 PA. This year, we predict Nelson Cruz only managing a $2.1MM raise despite 29 HR and 87 RBI, due to his 513 PA, while we have Hunter Pence getting a $4.2MM raise with 22 HR and 97 RBI, in part due to his excellent 658 PA. Getting onto the field matters to panels, both because you can accumulate bigger counting stat totals and because playing time is valued in its own right. Take Pablo Sandoval as another example. He has a career .307 batting average coming into his first year of arbitration, and has averaged over 20 HR per season. Our model projects him for just $3.2MM due to his 466 PA this season. Give him the same career rates of AVG, HR, RBI, and SB but with 650 PA in 2011, and he would get about $4.7MM.
Arbitration isn't fair. The one skill that really gets you paid is power: HR and RBI are far more important than other statistics. Knocking in runs matters, yet scoring them is not too important at all. In fact, once you factor in the AVG and SB that put hitters in position to score, the actual runs scored don’t seem to matter much at all to arbitration panels. Even AVG and SB, however, pale in importance next to almighty HR and RBI. Mike Morse had 95 RBI in the Nationals’ lineup this year, and combined with his .303 AVG and 31 HR, we have him coming in with a solid $3.9MM salary. Baseball-Reference.com estimated in August that Morse would have 50% more RBI if given the same RBI opportunities as Ryan Howard. What would Morse earn with 50% more RBI? Try $4.6MM. That’s $700K the Nationals will save on him simply by putting different guys in front of him in the lineup than the Phillies put in front of Howard.
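The Morse arithmetic can be laid out in a few lines of Python. The inputs are the figures quoted above; the per-RBI average at the end is a back-of-the-envelope derivation that assumes the salary gap is spread evenly over the extra RBI, which is a simplification and not how the model itself values each RBI.

```python
# Figures quoted in the text for Mike Morse.
base_rbi = 95
base_salary_mm = 3.9       # projected salary with his actual RBI total
boosted_salary_mm = 4.6    # projected salary with 50% more RBI

# 50% more RBI, as in the Baseball-Reference.com comparison to Ryan Howard.
boosted_rbi = base_rbi * 1.5
extra_rbi = boosted_rbi - base_rbi

# Salary difference attributable to the extra RBI, in $MM.
salary_gap_mm = boosted_salary_mm - base_salary_mm

# Average dollars per additional RBI, in $K.
# Assumption: the gap is spread evenly across the extra RBI.
per_rbi_k = salary_gap_mm * 1000 / extra_rbi

print(f"extra RBI: {extra_rbi:.1f}")
print(f"salary gap: ${salary_gap_mm:.1f}MM")
print(f"average per extra RBI: about ${per_rbi_k:.1f}K")
```

Under that even-spread assumption, each additional RBI in this scenario averages out to a little under $15K of projected salary.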
Position does not seem to matter much either—while catchers certainly get paid a premium for their hard work behind the plate, middle infielders get paid about as well as corner infielders and outfielders. Arbitration, apparently, was built to put smiles on the faces of Mark Teixeira, Prince Fielder and Ryan Howard, who accumulate massive HR/RBI totals in potent lineups, but play easy positions. Quietly skilled players who get on base in front of them and play harder positions get paid far less for their contributions. Shortstop Elvis Andrus, for example, comes in at $2.9MM in our projections. Sabermetricians would estimate that his WAR would be about 20% lower if he produced similarly but played 1B instead of SS. However, his arbitration salary would only be about 2% lower.
You can estimate a player's salary to a certain extent using more accurate estimates of value like WAR, but a more sophisticated model that utilizes the same flawed information that arbitration panels use can pick up on these kinds of inefficiencies. Tomorrow, we’ll discuss how panels decide what to pay pitchers.
Arbitration salaries totaled about $867MM in 2011, and within a few years they will total over a billion dollars across the league, yet the arbitration process is poorly understood and has rarely been studied to the extent that free agent salaries have. With the help of Tim Dierkes, Ben Nicholson-Smith, and other friends of MLBTR, I have fine-tuned a model for predicting arbitration salaries. By incorporating arbitration earnings from the last five years, the model is able to predict salaries using a range of related players. The model has a correlation of roughly .98 with actual salaries, and predicts actual earnings within $170K for more than half of players.
How good is the model? Well, it works well when it already knows what all the players made and can try to fit the data perfectly. So, I decided to see how well it did if I recreated the model without data from a year and then predicted the salaries from that year using the data from the other years. So I used 2007, 2008, 2009, and 2010 statistics and salaries to predict 2011 salaries, then 2007, 2008, 2009, and 2011 salaries and data to predict 2010 salaries, and so on. The result was still a very strong prediction: it was within $320K half the time. Even the most sophisticated model using service time, career wins above replacement, and single-season WAR (and remember that WAR is an actual one-size-fits-all estimate of player value) could only get within $700K half the time. For the average player, even a simplified version of my model cuts the error in half!
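The hold-one-year-out check described above can be sketched in Python as follows. Everything here is illustrative: the one-feature straight-line fit stands in for the real model, whose specification is not given in the article, and the rows are invented numbers (plate appearances in hundreds, salary in $MM), not actual arbitration data. The function names are hypothetical.

```python
from statistics import median

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def leave_one_year_out(rows):
    """rows: (year, feature, salary). Returns median abs error per held-out year."""
    errors = {}
    for held_out in sorted({year for year, _, _ in rows}):
        # Refit using every year except the one being predicted.
        train = [(x, y) for year, x, y in rows if year != held_out]
        test = [(x, y) for year, x, y in rows if year == held_out]
        intercept, slope = fit_line([x for x, _ in train], [y for _, y in train])
        errors[held_out] = median(abs(intercept + slope * x - y) for x, y in test)
    return errors

# Invented sample rows (NOT real data): (year, PA in hundreds, salary in $MM).
ROWS = [
    (2007, 3.0, 1.1), (2007, 5.0, 2.4), (2007, 6.5, 3.6),
    (2008, 2.5, 0.9), (2008, 4.5, 2.2), (2008, 6.0, 3.1),
    (2009, 3.5, 1.5), (2009, 5.5, 2.6), (2009, 6.8, 3.9),
    (2010, 2.8, 1.0), (2010, 4.8, 2.5), (2010, 6.2, 3.3),
    (2011, 3.2, 1.4), (2011, 5.2, 2.7), (2011, 6.6, 3.5),
]

for year, err in leave_one_year_out(ROWS).items():
    print(f"{year}: median absolute error ${err:.2f}MM")
```

The point of the design is that each year's salaries are predicted by a fit that never saw them, so the reported error reflects how the approach performs on new players and not how well it can memorize known ones.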
The salaries of arbitration eligible players are determined by arbitration panels or by contracts signed under the shadow of potential panel decisions. This represents a lot of players. Only about a third of playing time goes to free agents, and another third of playing time goes to players not yet eligible for arbitration. The other third of playing time -- and about 25% of payroll -- goes to players whose salaries will be determined by an arbitration panel, unless they reach an agreement first.
In contrast to the free agent market, which now incorporates a modern understanding of baseball, arbitration relies on simple statistics such as pitcher wins and runs batted in. When advanced statistics became available, teams incorporated these into their free agent bids, and stopped paying much attention to old-school statistics. Meanwhile, arbitration panels determine a player's salary based on "comparables," players with similar basic statistics and service time. The salaries that the model produces aren't far from what an educated fan might guess, but the subtle differences are important.
In Tim Dierkes's arbitration series, he has been giving rough estimates of salaries for players based on in-season projections, but we will be releasing the model’s official salary projections for the 2012 season shortly. The most influential factor for both hitters and pitchers is playing time. More plate appearances and innings pitched make a huge difference. For batters, unsurprisingly, home runs and runs batted in matter most to arbitration panels and our model, while stolen bases and batting average also play important roles. For starting pitchers, wins and ERA are the most important, while relief pitchers get paid mostly based on saves and holds, with a dash of ERA as well. This week, I will post another article on hitters and another article on pitchers explaining the importance of these statistics for certain players in more detail, and I will highlight a couple of unique cases for the 2012 season. Will the model miss by a lot for some players? It absolutely will. But it’s going to hit a lot more than it’s going to miss, and it can provide guidance on players that are harder to understand.