Before we start evaluating prospects, we should get a sense of the
offensive environments in which they played.
Thanks to Fangraphs, we never have to guess how much a player’s performance is
influenced by the league’s run environment; we just have to compare their wRC+
or ERA- (or other +/- stat) to the league average of 100. But Fangraphs only
publishes advanced pitching statistics for the major leagues, so let’s take a
step back and look at each league’s overall run environment before inspecting or
comparing minor league pitchers’ stat lines. We really should be investigating
league- and park-specific run environments regardless, but they have to be the
starting point when there are no league-adjusted stats readily available.
Basic league batting, pitching, and fielding stats for the 2015 season are
available for all leagues on Baseball-Reference, but raw count totals are not
very insightful on their own. So I converted the raw count totals to per-game
rate stats, added a few missing stats, and created bar graphs so we’re not just
stuck looking at numbers. To avoid overloading this page with graphs, I uploaded
all of that data into the interactive Tableau Public worksheet below. Let me
know if the worksheet gives you problems or slows things down too much;
otherwise, I’m looking forward to using this feature more in the future.
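
The conversion from raw counts to per-game rates described above can be sketched in a few lines of Python. This is only an illustration: the function name and the league totals are hypothetical, not the 2015 figures, and it assumes the denominator is team-games (each game counted once per team).

```python
def per_game_rates(totals, team_games):
    """Convert raw league count totals into per-game rates.

    Assumption: `team_games` counts each game once per team, so the
    result is a per-team, per-game rate such as R/G.
    """
    if team_games <= 0:
        raise ValueError("team_games must be positive")
    return {stat: round(count / team_games, 2) for stat, count in totals.items()}


# Hypothetical league totals for illustration only (not real 2015 data).
league_totals = {"R": 6000, "H": 12000, "HR": 900}
rates = per_game_rates(league_totals, team_games=1400)
print(rates)
```

Rounding to two decimals matches the precision of the R/G figures quoted in this post; keep the unrounded values if you plan to do further arithmetic with them.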
With 19 leagues and 20 stats to explore, I’ll mostly leave it to you to find
what you’re looking for, but let’s look at a few patterns, starting with the R/G
table, which should be the default setting for this worksheet. There’s a
difference of exactly two runs per game between the highest scoring league
(Rookie Pioneer League at 5.73 R/G) and the lowest scoring league (A+ Florida
State League at 3.73 R/G), and the league exactly in between those two (AAA
Pacific Coast League at 4.73 R/G) is itself known for its high scoring
environment. Of the full season leagues, the PCL had the second highest
run-scoring environment, with only the California League (4.9 R/G) scoring more
often. The South Atlantic League and Texas League scored about as often as the
AL, and the Southern League was closest to the NL. When you switch to the H/G
table, you see that three leagues stick out as above average (PCL, California,
and Pioneer), and that the GCL sticks out as below average. There isn’t much
variation in doubles per game across leagues, but triples per game tend to
decrease at higher levels in the minors, and home runs tend to increase. The
Florida State League is a pitcher’s haven because it’s where home runs go to
die, and this leads to announcers getting excited about deep fly outs too often.
The FSL was also the only league to have a lower BB% than the major leagues,
while the California League was the only full season league to have a higher K%
than the major leagues. I also included “AB/XBH” and “PA/HR” options, if you
prefer to look at the rates that way.
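
For reference, the BB%, K%, PA/HR, and AB/XBH rates mentioned above are simple ratios of league totals. The sketch below uses the standard definitions (walks and strikeouts per plate appearance, plate appearances per home run, at-bats per extra-base hit); the function name and input totals are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def league_rate_stats(pa, ab, bb, so, doubles, triples, hr):
    """Compute a few rate stats from league count totals.

    Assumes every denominator is non-zero, which holds for full
    league seasons but not necessarily for tiny samples.
    """
    xbh = doubles + triples + hr  # extra-base hits
    return {
        "BB%": 100 * bb / pa,   # walks per 100 plate appearances
        "K%": 100 * so / pa,    # strikeouts per 100 plate appearances
        "PA/HR": pa / hr,       # plate appearances per home run
        "AB/XBH": ab / xbh,     # at-bats per extra-base hit
    }


# Hypothetical league totals for illustration only.
stats = league_rate_stats(
    pa=50000, ab=44000, bb=4000, so=10000, doubles=2200, triples=300, hr=1000
)
for name, value in stats.items():
    print(f"{name}: {value:.1f}")
```

Note that lower PA/HR and AB/XBH values mean more power, which is the opposite direction from the per-game rates.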
The above data will give you an idea about the frequency of events across
affiliated leagues, but what if you want to know the value of those events?
Again, Fangraphs publishes this data for the major leagues (see their guts
page), but we’re out of luck on the minor league side. Fortunately, linear
weights are pretty easy to calculate if you have “tidy” play-by-play data.
Unfortunately, “tidy” play-by-play data isn’t easy to find for the minor
leagues. I ended up using the pitchRx package with RStudio to scrape the
necessary minor league data from MLB, but that source appears to be synced from
minor league gameday/gamelog pages, so some info is occasionally missing. Still,
there was enough to replicate most of the guts page for each league, and while
not perfect, the linear weights below will calculate wOBA to within a few points
of the Fangraphs reported wOBA. I didn’t include an average wOBA option in the
Tableau worksheet above, but you can get that info from the first column in the
table below.
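
To show how a set of linear weights turns into wOBA, here is a minimal Python sketch using the standard Fangraphs form of the formula (weighted unintentional walks, hit-by-pitches, and hits, divided by AB + BB - IBB + SF + HBP). The weights and the stat line are hypothetical placeholders, not the league-specific values from the table; substitute the weights for the league you care about.

```python
# Hypothetical placeholder weights, NOT the league weights from this post.
EXAMPLE_WEIGHTS = {
    "uBB": 0.69, "HBP": 0.72, "1B": 0.88, "2B": 1.25, "3B": 1.58, "HR": 2.05,
}


def woba(ab, h, doubles, triples, hr, bb, ibb, hbp, sf, weights):
    """Compute wOBA for one stat line from a dict of linear weights."""
    singles = h - doubles - triples - hr
    unintentional_bb = bb - ibb
    numerator = (
        weights["uBB"] * unintentional_bb
        + weights["HBP"] * hbp
        + weights["1B"] * singles
        + weights["2B"] * doubles
        + weights["3B"] * triples
        + weights["HR"] * hr
    )
    denominator = ab + bb - ibb + sf + hbp
    if denominator <= 0:
        raise ValueError("stat line has no qualifying plate appearances")
    return numerator / denominator


# Hypothetical stat line for illustration only.
example = woba(ab=500, h=150, doubles=30, triples=5, hr=15,
               bb=50, ibb=5, hbp=5, sf=5, weights=EXAMPLE_WEIGHTS)
print(f"wOBA: {example:.3f}")
```

Passing the weights in as an argument makes it easy to run the same stat line through different leagues' weights and see how much the run environment changes the result.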
The final column, labeled “Delta Mean,” is the average absolute difference between the Fangraphs wOBA and the wOBA calculated with those linear weights (I only included position players in the comparison). As you can see, except for the GCL and AZL, these weights will generally get you within .005 of the Fangraphs reported wOBA if you choose to use them (say, with the splits info from the www.baseball-reference.com minors pages). I think the only reason for the difference is that Fangraphs scales their weights so that the runSB value is always 0.200, but the stolen base info I scraped didn’t match league totals (the other stats were fine), so I had to scale my weights to the Fangraphs league average instead.
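
The two calculations in that paragraph can be sketched as follows. The first function is the mean absolute difference behind “Delta Mean.” The second is one plausible reading of “scaling to the league average,” not necessarily the exact procedure used here: because wOBA is linear in the weights, multiplying every weight by the same factor moves the league wOBA by that factor. All names and numbers are hypothetical.

```python
def mean_abs_diff(reference, calculated):
    """Average absolute difference between two equal-length lists of wOBA values."""
    if not reference or len(reference) != len(calculated):
        raise ValueError("inputs must be non-empty and the same length")
    return sum(abs(r - c) for r, c in zip(reference, calculated)) / len(reference)


def scale_weights(weights, unscaled_league_woba, target_league_woba):
    """Rescale all weights so the league wOBA lands on the target value.

    Assumption: a single multiplicative factor is applied to every weight.
    """
    factor = target_league_woba / unscaled_league_woba
    return {event: w * factor for event, w in weights.items()}


# Hypothetical player wOBA values for illustration only.
reported = [0.320, 0.350, 0.300]
computed = [0.324, 0.347, 0.305]
delta_mean = mean_abs_diff(reported, computed)
print(f"Delta Mean: {delta_mean:.3f}")

# Hypothetical unscaled weights and league averages.
scaled = scale_weights({"BB": 0.55, "1B": 0.70},
                       unscaled_league_woba=0.256, target_league_woba=0.320)
print(scaled)
```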
Looking ahead, I’d like to come back to this topic to look at how run
environments have changed across the minors over the past few seasons, and to
investigate run environments at the ballpark level. If you’re interested in
getting a better idea of how individual ballparks play now, Minor League Central
has reported one-year park factors for the 2011-2014 seasons, and will hopefully
update with 2015 numbers soon. Assuming the scraped pitchRx data adds up
correctly, I’d like to take those park factors one step further by looking at
L/R splits too.