library(pitchRx)
dat <- scrapeFX(start="2008-01-01", end="2013-01-01")
atbats <- dat$atbat
pitches <- dat$pitch
dat <- scrapeFX(start="2008-01-01", end="2013-01-01",
                tables = list(atbat = NULL, pitch = NULL, coach = NULL, runner = NULL,
                              umpire = NULL, player = NULL, game = NULL))
urlsToDataFrame can be used to manipulate any collection of XML files into a list of data frames.
pitchRx can easily produce two types of strike zone plots:
Do umpires favor home (as opposed to away) pitchers?
Given that the umpire has to make a decision, do home pitchers have a higher chance of receiving a called strike?
A called strike is a case where the batter does not swing and the umpire declares the pitch a strike (which is a favorable outcome for the pitcher).
A ball is an instance where the batter doesn’t swing and the umpire declares the pitch a ball (which is a favorable outcome for the batter).
By restricting ourselves to these two outcomes, we condition upon a situation where the umpire has to make a binary decision about the pitch.
We can use the mgcv package to visualize the probability of a called strike (given that the umpire has to make a decision).
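library(mgcv)  # provides gam()
# Attach at-bat information to every pitch
pitchFX <- plyr::join(dat$pitch, dat$atbat, by=c("num", "url"))
# Keep only pitches where the umpire had to make a call
decisions <- subset(pitchFX, des %in% c("Called Strike", "Ball"))
# Binary response: 1 = called strike, 0 = ball
decisions$strike <- as.numeric(decisions$des == "Called Strike")
strikeFX(decisions, model=gam(strike~s(px)+s(pz), family = binomial(link='logit')),
         layer=facet_grid(.~stand))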
We can also visualize the difference in probabilistic events by adding arguments to strikeFX.
Here we find the probability of a called strike during the top inning minus the probability of a called strike during the bottom inning (top inning == home pitcher).
strikeFX(decisions, model=gam(strike~s(px)+s(pz), family = binomial(link='logit')),
         density1=list(top_inning="Y"), density2=list(top_inning="N"),
         layer=facet_grid(.~stand))
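To make the idea of a "difference in probabilities" concrete, here is a minimal sketch that does the subtraction by hand. It uses simulated pitches (not real PITCHf/x data), and the true strike probability is an assumption made up for illustration. It is not necessarily how strikeFX computes the difference internally; it only shows the general recipe: fit the same GAM to each subset, predict on a common grid, and subtract.

```r
library(mgcv)

set.seed(1)
n <- 4000
# Simulated (not real) pitch locations in feet; column names follow PITCHf/x
sim <- data.frame(
  px = runif(n, -2, 2),
  pz = runif(n, 0.5, 4.5),
  top_inning = sample(c("Y", "N"), n, replace = TRUE)
)
# Assumed strike probability: high near the middle of the zone, low outside
p_true <- plogis(4 - 4 * sim$px^2 - 3 * (sim$pz - 2.5)^2)
sim$strike <- rbinom(n, size = 1, prob = p_true)

# Fit the same model separately to each subset
fit_top <- gam(strike ~ s(px) + s(pz), family = binomial(link = "logit"),
               data = subset(sim, top_inning == "Y"))
fit_bot <- gam(strike ~ s(px) + s(pz), family = binomial(link = "logit"),
               data = subset(sim, top_inning == "N"))

# Predict both models on a common grid and take the difference
grid <- expand.grid(px = seq(-2, 2, length.out = 25),
                    pz = seq(0.5, 4.5, length.out = 25))
grid$diff <- as.numeric(predict(fit_top, newdata = grid, type = "response")) -
  as.numeric(predict(fit_bot, newdata = grid, type = "response"))
summary(grid$diff)
```

Because the simulated data has no home/away effect built in, the differences here should hover around zero; with real data, regions far from zero are the interesting ones.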
strikeFX is nice for visualizing a lot of data (we just visualized over 1.5 million pitches).
PITCHf/x can also be used to regenerate (approximate) pitch trajectories.
It isn’t straightforward to animate millions of pitch trajectories, so we usually restrict our focus to a few cases.
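For a sense of what "regenerating" a trajectory involves, here is a minimal sketch. It assumes the usual PITCHf/x parameterization: an initial position (x0, y0, z0), initial velocity (vx0, vy0, vz0), and a constant acceleration (ax, ay, az) on each axis, in feet and seconds. The parameter values below are made up for illustration, pitch_position is a hypothetical helper (not a pitchRx function), and the front of home plate is assumed to sit at y = 1.417 ft.

```r
# Position (in feet) of a pitch t seconds after the initial tracking point,
# assuming constant acceleration on each axis
pitch_position <- function(p, t) {
  data.frame(
    t = t,
    x = p$x0 + p$vx0 * t + 0.5 * p$ax * t^2,
    y = p$y0 + p$vy0 * t + 0.5 * p$ay * t^2,
    z = p$z0 + p$vz0 * t + 0.5 * p$az * t^2
  )
}

# Illustrative (made-up) parameters for a single pitch
p <- list(x0 = -1.5, y0 = 50, z0 = 6,
          vx0 = 5, vy0 = -130, vz0 = -5,
          ax = -10, ay = 28, az = -20)

# Time at which the ball reaches y = 1.417: the earlier root of
# 0.5 * ay * t^2 + vy0 * t + (y0 - 1.417) = 0
t_plate <- (-p$vy0 - sqrt(p$vy0^2 - 2 * p$ay * (p$y0 - 1.417))) / p$ay

# Twenty evenly spaced snapshots from release to the plate
path <- pitch_position(p, seq(0, t_plate, length.out = 20))
head(path)
```

Each row of path is one frame of an animation, which is why animating millions of pitches quickly becomes impractical.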
VishnuDarvish - a case study
*Created by Drew Sheppard @DShep25
dat <- scrapeFX(start="2013-04-24", end="2013-04-24")
atbats <- subset(dat$atbat, pitcher_name == "Yu Darvish")
Darvish <- plyr::join(atbats, dat$pitch, by=c("num", "url"), type="inner")
Darvish contains info on every pitch thrown by Yu Darvish on April 24th, 2013.
animateFX can be used in a similar fashion to strikeFX for producing a series of plots that track pitch locations over time.
As animateFX animations progress, the pitches are being thrown directly towards you.
animateFX(Darvish, layer=list(theme_bw(), coord_equal(), facet_grid(.~stand)))
Real time animations are hard to digest!
Plotting that many pitches makes it even worse…
animateFX(Darvish, avg.by="pitch_types", layer=list(coord_equal(), theme_bw(), facet_grid(.~stand)))
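The idea behind averaging is to collapse many pitches into one representative pitch per group. Here is a minimal sketch of that idea using base R on a toy data frame. The values are made up, only two of the trajectory parameters are shown, and this is an assumption about the general approach rather than a description of what avg.by does inside pitchRx.

```r
# Toy data with made-up values; column names follow PITCHf/x
toy <- data.frame(
  pitch_type = c("FF", "FF", "SL", "SL"),
  vy0 = c(-135, -133, -120, -122),
  az  = c(-15, -17, -30, -28)
)

# One row per pitch type, holding the mean of each trajectory parameter
avg <- aggregate(cbind(vy0, az) ~ pitch_type, data = toy, FUN = mean)
avg
```

Animating the averaged rows instead of the raw pitches leaves one trajectory per pitch type, which is much easier to follow.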
RH <- subset(Darvish, stand=="R")
interactiveFX(RH, avg.by="pitch_types")