# Saber 101 & Saber 201

This is a reprint of a fanpost I did over at Viva El Birdos. Thought it’d be good to archive it here as well for future use.

In a thread yesterday there was a request for a sabermetrics primer, so I thought I’d take the lazy way out and just link some of the extraordinary work people have done in other places. I’ll break this into two primary sections: 1) a Saber 101 set of links for those who want to understand sabermetrics better, and 2) a Saber 201 for those who want to start doing some sabermetric research on their own.

Saber 101 Links

1) Alex Remington’s series at Yahoo Sports. I must admit I haven’t read through the entire series, but it has been highly recommended by people I respect a great deal. Up to this point he’s covered the following:

- BABIP: Batting Average on Balls in Play
- OPS+: Adjusted On Base Plus Slugging compared to league average (the B-Ref way)
- FIP: Fielding Independent Pitching
- wOBA: Weighted On Base Average
- WPA: Win Probability Added
- WAR: Wins Above Replacement

Alex’s series isn’t done, so I’ll update with additional links as he continues.

2) Michael Jong’s Sabermetrics 101 blog at fanhuddle. I especially recommend the piece on Linear Weights. His articles and glossaries also contain a ton of other very good links.

3) Tango’s stuff. Tango recently answered two sets of questions from folks who aren’t convinced that sabermetrics is all it’s cracked up to be. Both are good introductions to various topics and generate good discussion. First were ten questions from Mike Silva, and then there was a set of questions from a BCB member. Additionally, Tango hosts a wiki that would be a good resource to peruse.

4) Fangraphs’ value series. Dave Cameron describes how Fangraphs goes about calculating WAR for hitters and pitchers, including defining replacement value and looking at position adjustments.

5) Pitch F/X. Our very own vivaelpujols had a great primer on it as a fanshot on this very site.

I’m sure there’s a bunch more that folks will add in the comments, but this will be plenty to get you started.
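To make a couple of the stats from the list above a little more concrete, here’s a minimal Python sketch of BABIP and FIP using their commonly cited formulas. The function names and the sample stat lines are made up for illustration, and the FIP constant (3.10 here) is just a placeholder: it gets reset each season so that league FIP matches league ERA, so look up the right value before using this for anything real.

```python
def babip(h, hr, ab, k, sf):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = ab - k - hr + sf
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (h - hr) / balls_in_play


def fip(hr, bb, hbp, k, ip, constant=3.10):
    """FIP: (13*HR + 3*(BB + HBP) - 2*K) / IP + constant.

    `ip` should be true innings (200 1/3 innings is 200.333, not 200.1).
    `constant` is an assumed placeholder; the real one varies by season.
    """
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Made-up stat lines, not real players.
hitter_babip = babip(h=180, hr=30, ab=600, k=100, sf=5)
pitcher_fip = fip(hr=20, bb=50, hbp=5, k=200, ip=200)

print(f"BABIP: {hitter_babip:.3f}")
print(f"FIP:   {pitcher_fip:.2f}")
```

The linked articles are still the place to go for why the weights are what they are; this is only the arithmetic.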

Now for the Saber 201 stuff.

You probably need two things to start doing your own sabermetric research: 1) data and 2) analytical tools. I’ll try to provide a set of links for both. First, the data question.

1) Fangraphs can provide a lot of the data that the aspiring saberist needs. It’s got its version of WAR, wOBA, UZR, pitch type linear weights, batted ball profiles, various projection systems, and even summary type pitch f/x data. It’s a great place to start while you’re getting your feet wet.

2) Rally’s historical WAR data. Want to compare Pujols to Musial? Here’s where to start. You can purchase the whole database in csv format or search out the guys you want to look at for free. A lot of the Hall of Fame analysis on the saber side has been done using Rally’s data.

3) An actual database. Colin describes the process for a PC, and Sky for a Mac. Both methods require you to learn SQL along the way, but a database is a very valuable tool of the trade if you want to do any sort of complex querying of your data.

4) Pitch F/X. For the non-database inclined, you can get individual game information from Brooks Baseball or do some more complex querying using this new tool and get an Excel-type output. For those that want their own database, follow vivaelpujols’ primer found here (make sure to read through the comments as Mike Fast comes by to help out).
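If you’re wondering what “complex querying” with SQL actually looks like, here’s a small sketch. I’m using Python’s built-in sqlite3 module with an in-memory database only so the example is self-contained; that’s my substitution, not necessarily what either of the guides above sets up. The `batting` table, its columns, and the rows in it are all invented for illustration.

```python
import sqlite3


def top_hr_hitters(conn, min_ab, limit=3):
    """Career HR leaders among players with at least `min_ab` total at bats."""
    query = """
        SELECT player, SUM(hr) AS total_hr
        FROM batting
        GROUP BY player
        HAVING SUM(ab) >= ?
        ORDER BY total_hr DESC, player ASC
        LIMIT ?
    """
    return conn.execute(query, (min_ab, limit)).fetchall()


# Hypothetical table layout: one row per player-season.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE batting (player TEXT, year INTEGER, ab INTEGER, hr INTEGER)"
)

# Made-up rows, not real players or seasons.
rows = [
    ("Player A", 2008, 500, 30),
    ("Player A", 2009, 550, 40),
    ("Player B", 2008, 400, 10),
    ("Player B", 2009, 100, 5),
    ("Player C", 2009, 50, 8),
]
conn.executemany("INSERT INTO batting VALUES (?, ?, ?, ?)", rows)

result = top_hr_hitters(conn, min_ab=300)
print(result)
```

The GROUP BY / HAVING combination is the part that gets painful in a spreadsheet: it totals each player’s seasons and then filters on the totals in one pass.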

That’s probably enough (or even too much) to get you started. Now you need some tools to do the analysis.

1) Spreadsheets. Probably the most basic tool in the saberist’s toolbox. Use Excel or the OpenOffice version, whatever floats your boat. Most places allow you to download Excel-friendly data, so it’s a fairly seamless transition.

2) Statistical packages. If Excel doesn’t have enough horsepower to do what you want, there are open source statistical packages that you can download and use. Both R and gretl are good places to start. R is more powerful, but has a slightly steeper learning curve. Gretl is a little easier, but not as powerful (I’m going on others’ opinions here as I haven’t extensively used either: some R, not much gretl at all).

I think that’s all I’ve got for now. Feel free to add your own links in the comments.

Also, I should mention H/Ts all around, notably Tango and the BtB crew (I grabbed a bunch from the nominations in the sabermetric writing awards).

**But wait, there’s more. Pulled from this fanpost at Lookout Landing.**

· Statistics/Sabermetrics

o When samples become reliable

o Regression

§ Groups of players and regression towards the mean

o General Statistics

§ Ball in Play (BIP) statistics

· Voros McCracken’s Introduction to Batting Average on Balls in Play (BABIP)

o Defense Independent Pitching (1999)

o How Much Control Do Hurlers Have? (2001)

· Digging deeper: luck, fielding, and park factors

· Baseball Prospectus roundtable on BABIP

§ Linear Weights

· Empirical linear weight values, 1999-2002

· Offense

o Concepts

o Statistics

§ wOBA

§ More

· wRC+?

o Hit f/x

· Pitching

o Concepts

§ How can we tell if a pitcher is any good?

o Statistics

§ tRA

§ Fielding Independent Pitching (FIP)

o Miscellaneous

§ The importance of fastball velocity

§ Swinging strikes and strikeout rates

§ Linear weights and curveball movement

§ Pitch type linear weights explained

§ Pitchers, homeruns, and flyballs

§ The League Average Pitcher

· Part I

· Part II

o Pitch f/x

§ Command and the catcher’s target

§ Understanding pitch f/x graphs: location vs. movement

· Defense

o Concepts

§ Defense and inferential statistics

§ Sabermetrics 101: Evaluating Fielding

§ How much is a great fielder worth?

§ Excellent fielding presentation

§ What do you regress defensive metrics to?

o Statistics

§ Everything you need to know in one presentation

§ Ultimate Zone Rating (UZR)

· Creator Mitchel Lichtman’s explanation (advanced)

o Part I

o Part II

· Correlation and sample size, 2008 to 2009

§ Probabilistic Model of Range (PMR)

· PMR charts (through 2008)

§ John Dewan’s +/-: See The Fielding Bible Website

o Miscellaneous

§ Do fielders with good range commit more errors? No.

· Win Probability Added (WPA)

o What WPA can tell us about players

o Leverage Index (LI)

§ Crucial Situations (Tango)

· Part I

· Part II

· Part III

§ Unleveraging win probability (WPA/LI)

§ LI, relievers, and the Hall of Fame

· Wins Above Replacement (WAR)

§ Pitcher win values

· Part I

· Part II

· Part III

· Part IV

· Part V

· Part VI

· Part VII

§ Hitter win values

· Part I

· Part II

· Part III

· Part IV

· Part V

· Part VI

· Part VII

o Tom Tango addresses WAR misconceptions

§ Win values correlation to wins & Pythagorean record

o WAR and Salary

§ Offense by position group by decade

§ Historical positional adjustments

o 2009 replacement level position players

· Game Theory

o Bunting

· Miscellaneous

o “Clutch”

o Chemistry

§ Measuring clubhouse chemistry

o Plate discipline

§ Plate discipline year-to-year correlations

§ Plate discipline to event correlations

o Everything you wanted to know about the Pythagorean method

o Evaluating umpires with pitch f/x

o WAR by age for the Hall of Fame

o Every single one of Dave Allen’s Fangraphs posts because he makes incredible charts like this and this and this

· Data & Analysis Sources

o General

§ Run Expectancy, Run Frequency, Runs Created & Linear Weights Generator (using Markov chains)

§ How to build a pitch database

§ wOBA to WAR conversion spreadsheet

o Pitch f/x tools

§ Joe Lefkowitz’s Pitch F/x Tool

§ Brooks Baseball’s individual game analyzer

§ Josh Kalk’s pitch f/x tool (2008 only)

§ How to create a pitch f/x database

· Websites to know