Saber 101 & Saber 201

This is a reprint of a fanpost I did over at Viva El Birdos.  Thought it’d be good to archive it here as well for future use.

In a thread yesterday there was a request for a sabermetrics primer, so I thought I’d take the lazy way out and just link some of the extraordinary work people have done in other places.  I’ll break this into two primary sections: 1) a Saber 101 set of links for those who want to understand sabermetrics better, and 2) a Saber 201 set for those who want to start doing some sabermetric research on their own.

Saber 101 Links

1)  Alex Remington’s series at Yahoo Sports.  I must admit I haven’t read through the entire series, but it has been highly recommended by people I respect a great deal.  Up to this point he’s covered the following:

  • BABIP: Batting Average on Balls in Play
  • OPS+: Adjusted On Base Plus Slugging compared to league average (the B-Ref way)
  • FIP: Fielding Independent Pitching
  • wOBA: Weighted On Base Average
  • WPA: Win Probability Added
  • WAR: Wins Above Replacement

Alex’s series isn’t done, so I’ll update with additional links as he continues.
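
As a rough illustration of how two of the stats above are built from ordinary counting stats, here is a minimal Python sketch.  The function names are made up for this example, and the FIP constant (3.10 here) is an assumption: it is recalculated each season so that league FIP lines up with league ERA, so swap in the right value for the year you care about.

```python
def babip(h, hr, ab, k, sf):
    """Batting Average on Balls in Play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = ab - k - hr + sf
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (h - hr) / balls_in_play


def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding Independent Pitching.

    (13*HR + 3*(BB + HBP) - 2*K) / IP + constant.
    `ip` must be true decimal innings (200 1/3 innings is 200.333..., not 200.1).
    `constant` is an assumed league constant; it varies by season.
    """
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant
```

Calling `babip(h=150, hr=20, ab=500, k=100, sf=5)` divides 130 non-homer hits by 385 balls in play, and `fip(hr=20, bb=50, hbp=5, k=200, ip=200)` adds the assumed constant to 25/200.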

2)  Michael Jong’s Sabermetrics 101 blog at fanhuddle.  I especially recommend the piece on Linear Weights.  Throughout his articles and glossaries there are a ton of other very good links.

3)  Tango’s stuff.  Tango recently answered two sets of questions from folks who aren’t convinced that sabermetrics is all it’s cracked up to be.  Both are good introductions to various topics and generate good discussion.  First came ten questions from Mike Silva, and then a set of questions from a BCB member.  Additionally, Tango hosts a wiki that is a good resource to peruse.

4)  Fangraphs’ value series.  Dave Cameron describes how Fangraphs goes about calculating WAR for hitters and pitchers, including defining replacement value and looking at position adjustments.

5)  Pitch F/X.  Our very own vivaelpujols had a great primer on it as a fanshot on this very site.

I’m sure there’s a bunch more that folks will add in the comments, but this will be plenty to get you started.

Now for the Saber 201 stuff.

You probably need two things to start doing your own sabermetric research: 1) data and 2) analytical tools.  I’ll try to provide a set of links for both.  First, the data question.

1)  Fangraphs can provide a lot of the data that the aspiring saberist needs.  It’s got its version of WAR, wOBA, UZR, pitch type linear weights, batted ball profiles, various projection systems, and even summary-type pitch f/x data.  It’s a great place to start while you’re getting your feet wet.

2)  Rally’s historical WAR data.  Want to compare Pujols to Musial?  Here’s where to start.  You can purchase the whole database in csv format or search out the guys you want to look at for free.  A lot of the Hall of Fame analysis on the saber side has been done using Rally’s data.

3)  An actual database.  Colin describes the process for a PC, and Sky for a Mac.  Both methods require you to learn SQL along the way, but they are very valuable tools of the trade if you want to do any sort of complex querying of your data.

4)  Pitch F/X.  For the non-database inclined, you can get individual game information from Brooks Baseball, or do some more complex querying using this new tool and get an Excel-type output.  For those who want their own database, follow vivaelpujols’ primer found here (make sure to read through the comments, as Mike Fast comes by to help out).
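
To give a feel for the kind of querying a database of your own makes possible (item 3 above), here is a small sketch using Python’s built-in sqlite3 module.  The guides linked above walk through their own database setups; the table name, column names, and toy rows below are invented purely for illustration and are not real player data.

```python
import sqlite3


def top_obp(rows, min_pa=100):
    """Load batting lines into an in-memory table and rank hitters by OBP.

    rows: iterable of (player, ab, h, bb, hbp, sf) tuples (hypothetical schema).
    min_pa: minimum AB + BB + HBP + SF, a stand-in for plate appearances.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE batting "
        "(player TEXT, ab INTEGER, h INTEGER, bb INTEGER, hbp INTEGER, sf INTEGER)"
    )
    conn.executemany("INSERT INTO batting VALUES (?, ?, ?, ?, ?, ?)", rows)
    query = """
        SELECT player,
               CAST(h + bb + hbp AS REAL) / (ab + bb + hbp + sf) AS obp
        FROM batting
        WHERE ab + bb + hbp + sf >= ?
        ORDER BY obp DESC
    """
    result = conn.execute(query, (min_pa,)).fetchall()
    conn.close()
    return result


# Made-up stat lines for illustration only.
toy_rows = [
    ("Player A", 500, 150, 60, 5, 5),
    ("Player B", 450, 110, 30, 2, 3),
    ("Player C", 40, 20, 5, 0, 0),  # too few plate appearances, filtered out
]
ranking = top_obp(toy_rows)
```

The same SELECT / WHERE / ORDER BY pattern carries over to whatever real tables you end up loading; only the table and column names change.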

That’s probably enough (or even too much) to get you started.  Now you need some tools to do the analysis.

1)  Spreadsheets.  Probably the most basic tool in the saberist’s toolbox.  Use Excel or OpenOffice, whatever floats your boat.  Most places let you download Excel-friendly data, so it’s a fairly seamless transition.

2)  Statistical packages.  If Excel doesn’t have enough horsepower to do what you want, there are open source statistical packages that you can download and use.  Both R and gretl are good places to start.  R is more powerful, but has a slightly steeper learning curve.  Gretl is a little easier, but not as powerful (I’m going on others’ opinions here, as I haven’t used either extensively: some R, not much gretl at all).

I think that’s all I’ve got for now.  Feel free to add your own links in the comments.

Also, I should mention H/Ts all around, notably to Tango and the BtB crew (I grabbed a bunch from the nominations in the sabermetric writing awards).

But wait, there’s more. Pulled from this fanpost at Lookout Landing.

· Statistics/Sabermetrics

Probability vs. Certainty

When samples become reliable

o Regression

§ Basic concept

§ True talent

§ More regression

§ Groups of players and regression towards the mean

§ Regression vs. Progression

§ Reliability of statistics

o General Statistics

§ Ball in Play (BIP) statistics

· Brief explanation

· Voros McCracken’s Introduction to Batting Average on Balls in Play (BABIP)

Defense Independent Pitching (1999)

How Much Control Do Hurlers Have? (2001)

· Digging deeper: luck, fielding, and park factors

· Baseball Prospectus roundtable on BABIP

· Expected BABIP for pitchers

· Home/road BABIP splits

· BIP information dump

· Why DIPS does what it does

§ Linear Weights

· Empirical linear weight values, 1999-2002

· Offense

o Concepts

o Statistics

§ wOBA

· Brief explanation

· History of wOBA

· Usefulness of wOBA

· Getting to know wOBA

§ More

· wRC and wRAA

· wRC+?

o Hit f/x

§ Introduction

· Pitching

o Concepts

§ Evaluating pitcher talent

§ How can we tell if a pitcher is any good?

o Statistics

§ tRA

· Introduction

· Explained further

· Explanation without numbers

§ Fielding Independent Pitching (FIP)

· Brief explanation

o Miscellaneous

§ The importance of fastball velocity

§ Swinging strikes and strikeout rates

§ Break vs. Movement

§ Linear weights and curveball movement

§ Run value by pitch location

§ Pitch type linear weights explained

§ Age vs. fastball speed

§ Strikeouts & groundballs

§ Pitchers, homeruns, and flyballs

§ The League Average Pitcher

· Part I

· Part II

o Pitch f/x

§ System diagram

§ Command and the catcher’s target

§ Park adjustments

§ Understanding pitch f/x graphs: location vs. movement

· Defense

o Concepts

§ Evaluating Defense

§ Defense and inferential statistics

§ Sabermetrics 101: Evaluating Fielding

§ How much is a great fielder worth?

§ Valuing defense

§ Excellent fielding presentation

§ UZR vs. PMR

§ What do you regress defensive metrics to?

o Statistics

§ Everything you need to know in one presentation

§ Ultimate Zone Rating (UZR)

· Simple explanation

· Intermediate explanation

· Creator Mitchel Lichtman’s explanation (advanced)

Part I

Part II

· Correlation and sample size, 2008 to 2009

§ Probabilistic Model of Range (PMR)

· PMR charts (through 2008)

§ John Dewan’s +/-: See The Fielding Bible Website

o Miscellaneous

§ Fielding age curve

§ Do fielders with good range commit more errors? No.

· Win Probability Added (WPA)

Brief explanation

Further explanation

What WPA is and isn’t

Addressing misconceptions

What WPA can tell us about players

WPA is not predictive

o Leverage Index (LI)

§ Crucial Situations (Tango)

· Part I

· Part II

· Part III

§ Leverage Index chart

§ Unleveraging win probability (WPA/LI)

§ LI, relievers, and the Hall of Fame

· Wins Above Replacement (WAR)

Brief explanation

How to calculate WAR

§ Pitcher win values

· Part I

· Part II

· Part III

· Part IV

· Part V

· Part VI

· Part VII

· Year-to-year correlations

§ Hitter win values

· Part I

· Part II

· Part III

· Part IV

· Part V

· Part VI

· Part VII

· Part VIII

· Year-to-year correlations

Tom Tango addresses WAR misconceptions

Team WAR vs. actual wins

§ WAR: it works

§ Win values correlation to wins & Pythagorean record

Career WAR vs. Win Shares

WAR and relievers

o WAR and Salary

§ Linear relationship

§ The dollar value of a win

Positional adjustments

§ Explanation/misconceptions

§ Offense by position group by decade

§ Historical positional adjustments

2009 replacement level position players

· Game Theory

Bunting

Strategic walks

The suicide squeeze

When to bring in a reliever

Pitching & game theory

Stealing home

· Miscellaneous

o “Clutch”

§ Tom Tango on clutch hitting

§ ‘Clutchiness’ breakdown

§ Team clutch hitting

o Chemistry

§ Evaluating chemistry

§ Measuring clubhouse chemistry

o Plate discipline

§ Plate discipline year-to-year correlations

§ Plate discipline to event correlations

Playoff experience

Comparing win % estimators

Evaluating managers

Sabermetric Primer

Evaluating a Trade

Everything you wanted to know about the Pythagorean method

Evaluating umpires with pitch f/x

WAR by age for the Hall of Fame

Every single one of Dave Allen’s Fangraphs posts, because he makes incredible charts like this and this and this

· Data & Analysis Sources

o General

§ Run Expectancy, Run Frequency, Runs Created & Linear Weights Generator (using Markov chains)

§ BIP Spraychart application

§ How to build a pitch database

§ wOBA to WAR conversion spreadsheet

§ Weibull worksheet

§ Win Expectancy finder

§ Historical WAR database

o Pitch f/x tools

§ Joe Lefkowitz’s Pitch F/X Tool

§ Brooks Baseball’s individual game analyzer

§ Texas Leaguer’s F/X Tool

§ Josh Kalk’s pitch f/x tool (2008 only)

§ How to create a pitch f/x database

· Websites to know

www.fangraphs.com

www.thehardballtimes.com

www.insidethebook.com/ee

www.beyondtheboxscore.com

www.billjamesonline.net

www.baseball-reference.com

www.firstinning.com

www.baseballmusings.com

www.statcorner.com

© 2011 Gas House Graphs