Over at BtB I did a post on clustering hitters (grouping, for the non-math inclined) based on various stats, typically batted ball information combined with discipline information. I ended up discussing the clusters that resulted from using line drive rate (LD%), home runs per fly ball (HR/FB%), and walk rate (BB%). The following table summarizes the qualifying Cardinals:

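| Name | LD% | HR/FB% | BB% | Cluster | wOBA | Cluster wOBA |
|---|---|---|---|---|---|---|
| Albert Pujols | 16% | 20% | 17% | 7 | 0.449 | 0.398 |
| Matt Holliday | 16% | 13% | 11% | 6 | 0.390 | 0.363 |
| Ryan Ludwick | 19% | 12% | 8% | 5 | 0.336 | 0.362 |
| Colby Rasmus | 20% | 9% | 7% | 3 | 0.311 | 0.330 |
| Skip Schumaker | 22% | 5% | 9% | 9 | 0.336 | 0.343 |
| Yadier Molina | 20% | 5% | 9% | 9 | 0.337 | 0.343 |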

So what does this mean? Basically, if you average across players with LD%, HR/FB%, and BB% similar to those of the players listed, you get the wOBA in the last column (click here to see the full sets of clusters based on various stat sets). Is this predictive? Are players below their cluster's wOBA due for an improvement, and those above it due for regression? As Tango points out, not necessarily. There's clearly bias due to the data sets used to generate the clusters, so more work would be needed to find the correct data elements to include in the clustering. With that being said, it's still interesting to see what types of players are clustered together.
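To make the "above or below the cluster" comparison concrete, here is a minimal sketch that subtracts each player's cluster wOBA from his own wOBA, using only the numbers in the table above. The names `CARDINALS` and `woba_gaps` are my own, made up for illustration, and the sketch says nothing about how the clusters themselves were built.

```python
# Rows copied from the table above: (name, cluster, wOBA, cluster wOBA).
CARDINALS = [
    ("Albert Pujols", 7, 0.449, 0.398),
    ("Matt Holliday", 6, 0.390, 0.363),
    ("Ryan Ludwick", 5, 0.336, 0.362),
    ("Colby Rasmus", 3, 0.311, 0.330),
    ("Skip Schumaker", 9, 0.336, 0.343),
    ("Yadier Molina", 9, 0.337, 0.343),
]


def woba_gaps(rows):
    """Return {name: player wOBA minus cluster wOBA}, rounded to 3 places."""
    return {
        name: round(woba - cluster_woba, 3)
        for name, _cluster, woba, cluster_woba in rows
    }


if __name__ == "__main__":
    # List players from furthest above their cluster to furthest below it.
    ranked = sorted(woba_gaps(CARDINALS).items(), key=lambda kv: kv[1], reverse=True)
    for name, gap in ranked:
        print(f"{name:<15} {gap:+.3f}")
```

A positive gap means the player is outhitting his cluster and a negative gap means he is trailing it. Per the caveat above, the gap by itself is not a forecast of regression or improvement.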


On an unrelated note, tomorrow I'll be attending the St. Louis Chapter of SABR's hot stove luncheon, where I'll also be sitting on a panel that discusses Cardinal blogging. I'll be sure to report back with any interesting tidbits from the day.


Final unrelated note. This website amuses me.

Steve Sommer

Simulation analyst by day, father and baseball nerd by night


© 2011 Gas House Graphs