The Pain Guy at VEB always seems to stir up the scouts vs. stats debate.  The current version stems from a post analyzing Pagnozzi’s swing.  My thoughts are in the comments there.  What I’d like to discuss here, however, is how stats guys can leverage the scouting world, either hypothetically (because we don’t have access to scouts) or actually (because we have pitch f/x data and the FSR), to improve the way we stat-heads see the game.

Before I dig too deep into that question, I need to take a slight tangent to talk about the way sabermetricians do their projecting/forecasting.  The basic formula is to take a weighted average of past data, regress that average toward some population mean, and apply an aging curve.  So where do scouts come into play?  I think there are opportunities to leverage scouting data in all three steps.  I’ll address them in order.

  1. Getting a weighted average – Generally, projection systems take 3-4 years of data, weighting the most recent season most heavily and gradually decreasing the weight the further back the data goes.  Scouting data can be added in at this step by pointing out opportunities to over/under weight recent data because of things like mechanical/philosophical changes or injuries.  This is a slippery slope, as overweighting recent results based on philosophical changes can get you a Kyle Lohse extension, but used correctly there could be some value there.
  2. Regression to the mean – In my opinion this is where the saberist can get the most bang for his buck by leveraging scouting data/information.  The question with regression to the mean is which mean to regress to.  You want to regress to the mean of a population that the player belongs to; that population could be all of MLB (like MARCEL), players with similar builds and histories (like PECOTA), or pitchers with similar stuff (like Nick Steiner did).  I think that using scouting data the way Nick did for pitchers is likely the next step in projections.  I don’t know of any projections that are currently as in-depth as what Nick did, but there are a few that use fastball velocity (MGL’s, for example).  I do something similar in my defensive projections, using the Fans Scouting Report as a proxy for actual scouting reports.  I wonder if a similar thing could be done for hitters using “swing type” buckets.
  3. Aging – Undoubtedly, players of different skill sets and types age differently.  The problem becomes binning players into certain types.  I’d guess that having scouts’ input on this grouping process would be helpful.
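
To make the three steps a bit more concrete, here is a minimal Python sketch of how they might fit together and where scouting input could plug in.  Everything in it is an assumption for illustration: the `Season` class, the function names, the 5/4/3 weights, the 1200 PA of regression ballast, the peak age of 29, and the linear aging slope are hypothetical placeholders, not the actual settings of MARCEL, PECOTA, or any other system.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Season:
    pa: int       # plate appearances in that season
    rate: float   # any rate stat (wOBA, for example)


def weighted_average(seasons: Sequence[Season], weights: Sequence[float]):
    """Step 1: PA-weighted average, seasons ordered most recent first.

    Returns the weighted rate and the weighted PA total behind it.
    A scout's note about a mechanical change or an injury could be
    expressed by passing different weights.
    """
    num = sum(w * s.pa * s.rate for s, w in zip(seasons, weights))
    den = sum(w * s.pa for s, w in zip(seasons, weights))
    return num / den, den


def regress(rate: float, weighted_pa: float, group_mean: float,
            ballast_pa: float) -> float:
    """Step 2: blend the player's rate with `ballast_pa` of group-mean play.

    `group_mean` is where scouting could matter most: it can be the MLB
    average or the average of a scouting-defined bucket of similar players.
    """
    return (rate * weighted_pa + group_mean * ballast_pa) / (weighted_pa + ballast_pa)


def age_adjust(rate: float, age: int, peak_age: int, slope: float) -> float:
    """Step 3: crude linear aging (assumed shape), up before peak, down after.

    `peak_age` and `slope` could differ by player type if scouts help
    bin players into aging groups.
    """
    return rate * (1 + slope * (peak_age - age))


def project(seasons: Sequence[Season], age: int, group_mean: float,
            weights: Sequence[float] = (5, 4, 3), ballast_pa: float = 1200,
            peak_age: int = 29, slope: float = 0.003) -> float:
    rate, weighted_pa = weighted_average(seasons, weights)
    regressed = regress(rate, weighted_pa, group_mean, ballast_pa)
    return age_adjust(regressed, age, peak_age, slope)


if __name__ == "__main__":
    # Made-up hitter: three seasons, most recent first.
    hitter = [Season(600, 0.350), Season(600, 0.330), Season(600, 0.340)]
    # Regress to a league-wide mean vs. a (hypothetical) scouting-bucket mean.
    print(project(hitter, age=29, group_mean=0.320))
    print(project(hitter, age=29, group_mean=0.340))
```

The only thing that changes between the two calls is the mean the player is regressed toward, which is the lever I think scouting information would move the most.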

I know that the Cardinals say they leverage scouts in their analytical models, which makes me happy.  Hopefully we can get Mo to pay a little more attention to the analytical department (I’m looking at you, Feliz and Miles).


Steve Sommer

Simulation analyst by day, father and baseball nerd by night


3 Responses to “Scouts vs. Stats and Searching for Synergies”

  1. Steve,
    This is great stuff… I suppose the problem is when you have faulty scouting. With TPG’s stuff, he passes off conjecture as fact, when we don’t actually have any way to quantify that. He thinks he can fix Miles and Pagnozzi when they already have pretty good swings… I would be far more optimistic if they DIDN’T have great swings.

    Most of that was irrelevant to your post. I guess the conclusion is that we have to decide what kind of scouting we use to leverage the stats.

    • Thanks. Clearly, faulty scouting would make the above process useless; however, linking the scouts and stats guys would allow things to be less conjecture and more concrete. I think an interesting experiment would be to have TPG bin guys by swing types / flaws (maybe even by year) and then do some analysis on the numbers of those guys.


© 2011 Gas House Graphs