If you’ve read some of my previous stuff around the internet you likely know that I enjoy using simple simulations to do analysis. With the regular season upon us, I thought it would be interesting to run one such simulation before each series to serve as our PAH9 version of a series preview.
First, a quick overview of the simulation. It is built similarly to the one that I described at Fangraphs.
In order to run, the simulation needs the [...] teams' true talent win percentage. The simulation is a simple Monte Carlo that determines the winner of each game using random draws bounced up against log5-based winning percentages. For example, if we want to simulate the outcome of a game between Team A, which has a 0.600 true talent win percentage, and Team B, which has a 0.450 win percentage, we first calculate the probability that A beats B using the log5 equation linked above. That calculation says that Team A should have a 0.647 winning percentage against Team B. To simulate a game between these teams, the simulation draws a random number between 0 and 1; if the number is less than or equal to 0.647, Team A wins, otherwise Team B wins.
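For anyone who wants to see the mechanics, here is a minimal Python sketch of that single-game step. It is not the Excel/VBA version I actually run; the function names `log5` and `simulate_game` are made up for illustration, and the fixed seed is only there to make the run repeatable.

```python
import random

def log5(p_a: float, p_b: float) -> float:
    """Probability that team A beats team B, given each team's
    true talent win percentage, using the log5 formula."""
    num = p_a * (1 - p_b)
    return num / (num + p_b * (1 - p_a))

def simulate_game(p_a: float, p_b: float, rng: random.Random) -> str:
    """Draw a uniform random number in [0, 1); A wins if the draw is
    less than or equal to A's log5 probability, otherwise B wins."""
    return "A" if rng.random() <= log5(p_a, p_b) else "B"

rng = random.Random(42)  # arbitrary seed, chosen only for repeatability

# The worked example from the text: 0.600 team vs. 0.450 team.
print(round(log5(0.600, 0.450), 3))  # 0.647

# Over many simulated games, A's share of wins should settle near that value.
trials = 100_000
wins = sum(simulate_game(0.600, 0.450, rng) == "A" for _ in range(trials))
print(wins / trials)
```

The first print reproduces the 0.647 figure above; the second just shows that the random draws converge on it as the number of simulated games grows.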
So, how did I derive the true talent win percentages? The process is similar to the one used by JinAZ over at BtB for his power rankings, only I use projections instead of results (as the season continues I'll probably use updated projections). To approximate runs scored, I take the eight starting position players and calculate the projected runs/game for that lineup by plugging each player's CHONE projection into the Runs Created formula that Fangraphs uses and scaling that to a single game. To approximate runs allowed, I take (starter's projected ERA/.92)*IP/start + (bullpen projected ERA/.92)*(9-IP/start), and from that I subtract the team's defensive runs saved per game. These two calculations are plugged into PythagenPat to find each team's true talent win% with the respective lineup/SP combinations. On top of that %, I add a 0.040 home field advantage for the home team. Plug those into the simulation and you get the following for the Cards vs. D-Backs series:
The chart has Cards wins across the x-axis and frequency on the y-axis, so you can see that the simulation says the Cards should sweep about 18% of the time. The most common sequence was the Cards winning the first and third games and losing the middle game to Dan Haren. That sequence happened ~20% of the time.
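To show how a distribution like that gets tallied, here is a rough Python sketch of the series-level loop. The per-game win percentages in `placeholder_matchups` are made-up placeholders, not the numbers I derived for this series, and `simulate_series` is a hypothetical name, so don't expect the output to match the chart.

```python
import random
from collections import Counter

def log5(p_a, p_b):
    """Probability that team A beats team B under the log5 formula."""
    num = p_a * (1 - p_b)
    return num / (num + p_b * (1 - p_a))

def simulate_series(matchups, n_sims=100_000, seed=0):
    """Simulate a series many times.

    matchups: one (team win%, opponent win%) pair per game.
    Returns the frequency of each win total and of each W/L sequence.
    """
    rng = random.Random(seed)
    win_totals = Counter()
    sequences = Counter()
    for _ in range(n_sims):
        # One random draw per game, compared against that game's log5 odds.
        results = [rng.random() <= log5(p_team, p_opp)
                   for p_team, p_opp in matchups]
        win_totals[sum(results)] += 1
        sequences["".join("W" if won else "L" for won in results)] += 1
    win_freq = {w: c / n_sims for w, c in sorted(win_totals.items())}
    seq_freq = {s: c / n_sims for s, c in sequences.most_common()}
    return win_freq, seq_freq

# PLACEHOLDER inputs for a three-game series (not the post's actual values).
placeholder_matchups = [(0.560, 0.480), (0.520, 0.530), (0.570, 0.470)]
win_freq, seq_freq = simulate_series(placeholder_matchups)
for wins, freq in win_freq.items():
    print(f"{wins} wins: {freq:.1%}")
print("most common sequence:", next(iter(seq_freq)))
```

The win-total frequencies are what the chart plots, and the sequence counter is where a statement like "won games one and three, lost game two" comes from.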
For the curious, here's the table of true talent win % (with home field advantage) I derived:
|Game 1||Game 2||Game 3|
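If you want to see how a single cell in a table like that could be computed, here is a hedged Python sketch of the runs allowed and PythagenPat steps described earlier. A few things here are my assumptions for illustration: the starter and bullpen terms are divided by 9 because ERA is a per-nine-innings rate, the PythagenPat exponent uses the commonly cited 0.287, and every input number is made up rather than taken from the projections.

```python
def runs_allowed_per_game(sp_era, bp_era, ip_per_start, def_runs_saved):
    """Approximate runs allowed per game.

    ERA/.92 converts earned runs to total runs. ASSUMPTION: each term is
    divided by 9 so the per-nine-innings rate is scaled to innings pitched.
    """
    starter_runs = (sp_era / 0.92) * ip_per_start / 9
    bullpen_runs = (bp_era / 0.92) * (9 - ip_per_start) / 9
    return starter_runs + bullpen_runs - def_runs_saved

def pythagenpat(runs_scored, runs_allowed, exponent=0.287):
    """Win% from per-game runs scored and allowed.

    ASSUMPTION: exponent 0.287 is a commonly cited value, not
    necessarily the one used in the spreadsheet.
    """
    x = (runs_scored + runs_allowed) ** exponent
    return runs_scored ** x / (runs_scored ** x + runs_allowed ** x)

def with_home_field(win_pct, is_home, hfa=0.040):
    """Add the flat 0.040 home field bump to the home team only."""
    return win_pct + hfa if is_home else win_pct

# Made-up inputs: 4.8 runs/game lineup, 3.50 ERA starter going 6 IP,
# 4.00 ERA bullpen, defense worth 0.1 runs saved per game.
ra = runs_allowed_per_game(3.50, 4.00, 6.0, 0.1)
true_talent = pythagenpat(4.8, ra)
print(round(with_home_field(true_talent, is_home=True), 3))
```

The value that comes out is what gets fed into log5 for that game, once for each team with its own lineup and starting pitcher.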
So there ya have it… the Cards should win 2 of 3…. Again (well 40% of the time anyhow).
For any of you team-specific bloggers that stumble upon this, if you want a copy of the simulation (it's in Excel using VBA), just drop me a line and I can get you a copy (after I pretty it up a little and add a GUI-like function).