Tango’s Marcel projections are out. You can download the set here. Marcel is the simplest of all projection systems (as in, it’s so easy a trained monkey could run it), but it hangs in well with the big boys. Since Marcel is simple, I’ll keep this post simple.

Here are the Cardinal hitters:

nameLast reliability wOBA
Pujols 0.87 0.423
Holliday 0.87 0.394
Ludwick 0.84 0.357
Freese 0.12 0.343
Schumaker 0.83 0.338
Molina 0.83 0.33
Rasmus 0.68 0.327
Ryan 0.75 0.322
Greene 0.33 0.314
Lugo 0.79 0.307
LaRue 0.61 0.286
  • That would be a down year for Albert by his own glorious standards, but bear in mind he leads all offensive players in Marcel-projected wOBA by a big margin.
  • I’m loving that Holliday projection.
  • CHONE and Marcel diverge greatly on Colby Rasmus. CHONE projects a .340 wOBA, Bill James .330, and now Marcel is flinging poo at him.
  • On the other hand, Marcel likes David Freese, as does CHONE.

Here are our pitchers:

nameLast reliability mIP ERA mHR mSO mBB mHBP
Carpenter 0.7 157 3.01 9 115 41 6
Wainwright 0.81 190 3.34 15 156 59 4
Penny 0.77 155 4.47 16 103 56 5
Lohse 0.78 138 4.24 15 91 43 4
Garcia 0.11 30 4.2 4 23 12 1
Franklin 0.61 63 3.71 6 44 24 2
McClellan 0.57 66 3.89 6 53 27 2
Motte 0.42 54 4.25 7 48 21 2
Miller 0.5 51 3.88 5 46 20 3
Reyes 0.48 50 3.87 4 40 22 2
Boggs 0.48 79 4.73 7 59 38 4
Hawksworth 0.31 45 3.5 4 32 17 1
Walters 0.15 37 4.74 5 31 15 1
  • We could see another Lincecum-Carpenter-Wainwright finish in the Cy Young balloting.
  • I’m not sure what to make of Hawksworth’s low ERA projection, but if you calculate his FIP it comes out to 4.13.
  • Sign Kiko Calero! 55 IP, 3.60 ERA, 51 K.
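The FIP figure quoted for Hawksworth can be reproduced from his Marcel line. This is a minimal sketch, assuming the common form (13×HR + 3×(BB + HBP) − 2×SO) / IP plus a league constant of 3.20. The constant and the inclusion of HBP are assumptions here (the post does not say which version was used), and the function name `fip` is hypothetical.

```python
def fip(hr, bb, hbp, so, ip, constant=3.20):
    """Fielding Independent Pitching from component stats.

    The 3.20 default is an assumed league constant; the real value
    shifts from season to season.
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * so) / ip + constant


# Hawksworth's Marcel line from the pitcher table: 45 IP, 4 HR, 32 SO, 17 BB, 1 HBP
hawksworth_fip = fip(hr=4, bb=17, hbp=1, so=32, ip=45)
print(round(hawksworth_fip, 2))  # 4.13
```

Under those assumptions the result matches the 4.13 mentioned above, well above his 3.50 projected ERA.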

With the Matt Holliday clock now ticking there has been a lot of talk about the plan B options, so I thought it would be a good time to bring out my WAR simulation (which I’ve expanded to include defense).

I looked at the following options, all of which I’ve heard or read about over the past couple of days. [Quick update: as Dan at VEB points out, this list is not exhaustive, and I’ll try to run at least some of the suggestions people make here or over there.]

  • Holliday (hey I was curious) in left and David Freese at 3rd (MH)
  • Erik’s nightmare – AKA Miguel Tejada at third and Allen Craig in left (MT)
  • Mark DeRosa at 3rd and Craig in left (MD)
  • Freese at 3rd and a Craig/Kelly Johnson platoon in LF (KJ/AC)

Here’s the CDF graph

The x axis is position player WAR and the y axis is probability.  For those that are not statistically inclined, the probability is the probability that the respective WAR would not be exceeded.  For example, the probability that the Holliday team won’t exceed the 2009 Phils is ~0.8.  More simply stated, there’s a 20% chance that (based on CHONE offensive and my defensive projections) a Cards team with Holliday would outperform the position player production of the 2009 Phils.

The MT line is under the MD line.
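For anyone who wants to see how a CDF value like the ~0.8 above comes out of a simulation, here is a minimal sketch. The WAR distribution (mean 25, SD 4) and the benchmark of 28 are made-up placeholders, not outputs of my sim, and `empirical_cdf` is a hypothetical helper name.

```python
import random


def empirical_cdf(samples, threshold):
    """Fraction of simulated outcomes at or below the threshold."""
    return sum(1 for s in samples if s <= threshold) / len(samples)


random.seed(0)
# Hypothetical simulated team position-player WAR totals (mean and SD are made up)
simulated_war = [random.gauss(25.0, 4.0) for _ in range(10_000)]

benchmark = 28.0  # hypothetical stand-in for a reference team's position-player WAR
p_not_exceeded = empirical_cdf(simulated_war, benchmark)
p_exceeded = 1.0 - p_not_exceeded
print(f"P(WAR <= {benchmark}) = {p_not_exceeded:.2f}")
print(f"P(WAR >  {benchmark}) = {p_exceeded:.2f}")
```

The y value on the graph is the first number; the "chance of outperforming" is one minus it.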

Clearly this exercise doesn’t factor in that money saved could be put towards pitching (I hope to add pitching to the sim this week). That being said, it’s fairly obvious (even without the sim) that Holliday >> these Plan B options. Also of note is that going young and rolling the dice on Johnson is likely better than the proven vet options. We’ll see what pitching adds to the equation later this week.

Data: Offensive projections from CHONE, defensive projections mine (right sidebar) or CHONE for those with no MLB experience, 2009 data from fangraphs

There are a couple of different defensive projections currently available for all to see. You’ve got mine linked over on the right sidebar, and Jeff Z’s available through this link. The beauty of the two sets of projections is that the respective methodologies are discussed in the articles presenting them, and the projections are fairly simple to compare (i.e. only one number really). A second positive is that the methodologies differ by only one element, the FSR: Jeff includes 4 yrs of UZR (when available) and I include 3 yrs + the fans. Since that’s the only difference, it makes drawing insights and conclusions from the differences a little simpler, and that’s exactly what I’m going to step through here. Jeff is doing the same over at BtB, so go check out his piece as well.

First, I’d like to get a feel for just how different the two projections were.  For that a simple distribution should do the trick.  The absolute difference is across the x axis and the count is on the y.

Clearly the majority of the differences are less than 4 runs, and over half have a difference of 0 or 1. Given that there are differences, though, which positions are they coming from? In this chart absolute difference is again across the x, but now percent (by position) is on the y.

At first glance it seems like the outfield becomes more prevalent the farther right you go…

So now that we have a decent idea of the magnitude of the differences, it’s time to dig into where the actual differences are. Who is affected by adding in the FSR as a factor? I’ll answer that question by examining two parameters: 1) experience of the player and 2) position / FSR rank combination. This first table highlights the experience piece.

Game Bin AVG ABS Diff STD DEV Count
<50 1.85 1.14 41
50-100 1.88 1.45 64
100-150 1.90 1.45 53
150-200 1.54 1.30 30
200-300 1.38 1.11 81
300-400 1.26 0.85 46

As one would expect, the less experience the player has, the bigger the difference between the two projections. The FSR numbers are a larger piece of the puzzle for less experienced players, as I weighted them at 125 games no matter what the experience level of the player was. [Update: I used my effective defensive games to bin the games, not actual games as Jeff did in his analysis]

Finally, which position / FSR rank combos gained the most by inclusion of the FSR

and lost the most

All told, it appears that the FSR does make a difference, but it’s usually only on the order of a couple of runs, which is well within the margin of error for UZR. It has the potential to clear up the picture for players with limited major league experience, as it makes the “available data set” larger, so there is less regression to the mean.

A few weeks ago I posted a set of defensive projections for SS based on regressing a 3 year average UZR to a population based on the Fans’ Scouting Report created by tangotiger. After some discussion over at The Book Blog, I altered my methodology a little and have come up with a set of projections for all positions.

First a quick discussion about the methodology.  The projected values are a weighted average of the

  1. Player’s 3 year weighted UZR (5/4/3 style)
  2. The UZR mean of the “scouting population” to which the player belongs (more on this in a minute)
  3. The league average (i.e. 0).

The weights are

  1. Effective defensive games over the three year sample (also weighted, so not just the sum)
  2. 125 games
  3. 125 games

which basically means the larger the 3 year sample, the less impact the “regressions” have, which falls under the basic premise of the more data you have the less you need to regress.
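The weighted average above can be sketched in a few lines. The 125-game weights and the league mean of 0 come from the lists above; the function name `project_uzr150` and the example inputs are hypothetical, and the player's UZR/150 is assumed to already be the 5/4/3 weighted figure.

```python
def project_uzr150(player_uzr150, effective_games, scouting_mean_uzr150,
                   scouting_weight=125.0, league_weight=125.0, league_mean=0.0):
    """Weighted average of the player's own UZR/150, the mean of his
    scouting population, and the league average (0)."""
    total_weight = effective_games + scouting_weight + league_weight
    weighted_sum = (player_uzr150 * effective_games
                    + scouting_mean_uzr150 * scouting_weight
                    + league_mean * league_weight)
    return weighted_sum / total_weight


# Hypothetical player: +10 UZR/150 over his sample, scouting population mean of +5
veteran = project_uzr150(10.0, effective_games=250, scouting_mean_uzr150=5.0)
rookie = project_uzr150(10.0, effective_games=50, scouting_mean_uzr150=5.0)
print(veteran, rookie)
```

With the same raw UZR/150, the player with 50 effective games gets pulled much harder toward the scouting and league means than the one with 250, which is the "more data, less regression" premise.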

The scouting population is determined by where the player ranks in Tango’s Fans’ Scouting Report (FSR). I took the last three years of FSR data and found the average UZR/150s for various bins of players (currently done by ordinal ranking, but will likely transition to binning by overall score once 2009 numbers are computed by Tango). I then crossed that data with where the specific player ranked in the 2009 voting, with that number becoming the scouting regression factor.

For those that read my previous post on it, Method 2 was the methodology adopted (as MGL pointed out that it was the correct method).  Anyway on to the results.  First the leaders (with a minimum of 60 effective DGs)

Name Pos UZR/150
Travis Ishikawa 1B 5.6
Chase Utley 2B 10.8
Omar Vizquel SS 9.3
Evan Longoria 3B 11.9
Carl Crawford LF 10.9
Franklin Gutierrez CF 12.2
Jayson Werth RF 11.2

You’ll note that the projections are for UZR/150 so you’d need to utilize an expected playing time to convert these to runs.  For example, I find it highly unlikely that Omar Vizquel will get enough playing time to save ~9 runs, but clearly if he played 75 DGs then he’d save ~4-5 runs.
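The playing-time conversion is just a linear scale on defensive games. A minimal sketch, where `uzr150_to_runs` is a hypothetical helper name:

```python
def uzr150_to_runs(uzr_per_150, defensive_games):
    """Scale a UZR/150 rate to expected runs saved over a given number of defensive games."""
    return uzr_per_150 * defensive_games / 150.0


# Vizquel's projected 9.3 UZR/150 over 75 defensive games
vizquel_runs = uzr150_to_runs(9.3, 75)
print(round(vizquel_runs, 2))  # 4.65
```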

Now for the laggards

Name Pos UZR/150
Jason Giambi 1B -5.6
Alberto Callaspo 2B -5.7
Yuniesky Betancourt SS -10.1
Edwin Encarnacion 3B -8.9
Adam Dunn LF -14.9
Vernon Wells CF -10.1
Brad Hawpe RF -19.1

For those that want to make the argument that Dunn won’t be playing left field again, second to last went to Delmon Young. For those making the same argument about Giambi, second to last there was Billy Butler. I’m posting the results spreadsheet on Google Docs with the link over on the sidebar, so feel free to download it and use it for whatever you want. The sheet contains the position the projection is for, the projection itself, 3 year UZR/150, and the effective DGs.

Finally, since this is a Cardinals blog, I wouldn’t leave you without giving you the key returning Cardinal players

Name Pos UZR/150
Brendan Ryan SS 7.2
Colby Rasmus CF 5.5
Albert Pujols 1B 5.0
Ryan Ludwick RF 1.0
Skip Schumaker 2B -5.1
Julio Lugo SS -5.9

A couple of final caveats about the projections. I know there are players missing, and there are definitely player/position combos missing. As a first pass I only projected the position that each player had been identified with in the FSR. I plan to remedy that, but it’ll have to wait until the next iteration. Also, I didn’t apply an aging factor, which is clearly not a good way to go about projecting. In his BtB piece Jeff mentioned a -0.7 UZR, but I want to give some thought to how to apply that to UZR/150. Hopefully the next iteration will have some aging factor applied; until then, apply whatever you see fit. Anyway, download away, and let me know if you have questions/problems.

At the end of my last post I cautioned that the CHONE projections (and it applies to ALL projections) were a point solution (albeit the best guess at a point solution) and had to be thought of in concert with some error bars. In an attempt to get my arms around the ramifications of that point, I created a quick little Monte Carlo simulation in Excel/VBA to produce some distributions for offensive runs above average. I’ll quickly outline the basic methodology used, including my input set, and then I’ll present some initial results.

Since I knew I was going to be using CHONE projections I went back and collected the archived projections from 2009.  I did a quick comparison between the projected wOBAs and the actual wOBAs from this year for various levels of prior experience to get insight into what the standard deviation should be for the distributions that will feed the simulation.  The generalized results are in the table below

Experience SD
None/Low 0.038
Med 0.030
A Lot 0.025

I ran that information through the simulation in combination with this year’s CHONE projections and found what I thought to be a spread that was too wide (both on an aggregate team basis and an individual level). I have very little to base this on other than gut feel and combing back through the FanGraphs archives of team totals from seasons past.

To address the variances on the individual level (which in turn addressed the aggregate solution) I went from using a normal distribution to a truncated normal by placing upper and lower bounds on the simulated wOBA (I implemented this using a re-draw method rather than a rounding method). The upper and lower bounds were influenced by reviewing the 2009 data and looking for caps that existed for various projected production levels (i.e. a projected 0.300 wOBA never produced beyond a certain actual wOBA). That still did not give results on the aggregate that passed the “smell test”, so I cut the standard deviations in half. This last move was rather arbitrary on my part, and I plan to do some robustness testing along with analyzing a larger data set than just 2009. That being said, the simulation was “done” and is at a state where I didn’t mind putting results out for all to see.
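The re-draw approach to a truncated normal can be sketched as follows. This is a Python illustration, not the Excel/VBA code itself. The .337 projection and the halved 0.038 SD come from the tables in these posts, while the 0.280 and 0.400 bounds are hypothetical placeholders (the actual caps aren't listed here), and `draw_truncated_normal` is a made-up name.

```python
import random


def draw_truncated_normal(mean, sd, lower, upper, rng, max_tries=10_000):
    """Draw from a normal and re-draw until the value lands inside [lower, upper].

    Re-drawing (instead of clamping to the bound) avoids piling probability
    mass up at the bounds themselves.
    """
    for _ in range(max_tries):
        x = rng.gauss(mean, sd)
        if lower <= x <= upper:
            return x
    raise RuntimeError("bounds too tight for re-draw sampling")


rng = random.Random(42)
projected_woba = 0.337   # Freese's CHONE projection
sd = 0.038 / 2           # None/Low experience SD, cut in half
lower, upper = 0.280, 0.400  # hypothetical bounds, for illustration only

draws = [draw_truncated_normal(projected_woba, sd, lower, upper, rng)
         for _ in range(5_000)]
print(f"min={min(draws):.3f} max={max(draws):.3f} "
      f"mean={sum(draws) / len(draws):.3f}")
```

Each simulated season would take one such draw per player and then roll the wOBAs up to a team total.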

For today I have three scenarios to display results from:

  1. Signing Matt Holliday and going with internal options at the rest of the positions
  2. Going entirely with internal options (basically Freese at 3rd and Craig in left)
  3. Same as 2, except with the Freese and Craig projections adjusted downward as they “feel” a bit high

After the jump I’ll have more input data and the results.


First off a hello to all you PAH9ers; second, thanks to Erik for letting me tag along over here.

For those that haven’t heard of me I used to write here.  I’ll basically be doing the same thing over here that I did over there, which means a steady diet of numbers, graphs, and pitch f/x.  Some of it will be broad spectrum, but most of it will be Cards focused (or at least brought back to the Cards).  Now back to your regularly scheduled analysis.

CHONE projections for hitters are out, so now we can provide another piece of the 2010 projected value puzzle.  First just a quick table of the “relevant” Cardinals and their projected wOBA (for full stat lines go poke around the linked site)

Name 2010 Projected wOBA
Albert Pujols 0.433
Ryan Ludwick 0.356
Colby Rasmus 0.334
Skip Schumaker 0.330
Yadier Molina 0.329
Brendan Ryan 0.310
Julio Lugo 0.311
Allen Craig 0.346
David Freese 0.337

It’s good (but maybe overly optimistic?) to see both Freese and Craig come out as better than league average performers offensively. That really bodes well for Freese, as all signs point to him being a capable (i.e. league average-ish) defender. It’s much less stellar to see Boog so low.

Keep reading after the jump for some FA projections along with a little more Cards analysis


© 2011 Gas House Graphs