Monday, February 02, 2009

BABIP III: High Strikeout Hitters

Brett takes issue (I think) with my latest article on BABIP:

I don't want to spend the rest of the preseason writing about this, but I am intrigued by one thing that Brett said:
As for making contact (eg. Swisher), intuitively I agree with the authors that contact rate and BABIP should be inversely proportional.

They wrote: "One might expect a higher contact rate to lead to a higher BABIP, but the opposite actually seems to be the case. This is likely caused by the correlation between strikeouts and power, since players who swing hard tend to either miss entirely or crush the ball for hits."
Brett can't be specifically talking about low contact hitters, because the top BABIP hitters aren't pure strikeout hitters. Certainly, there are some high strikeout guys on these lists (Hunter Pence 2007, Ryan Howard 2006), but many of the hitters here aren't high strikeout guys.

I am curious about Hardball Times' theory. Are players who are swinging hard "crushing the ball for hits" when they're not swinging and missing?

Top 10 Strikeouts 2008
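| # | Player | K | BABIP | BA | HR | $ |
|---|--------|---|-------|----|----|---|
| 1 | Mark Reynolds | 204 | .329 | .239 | 28 | $20 |
| 2 | Ryan Howard | 199 | .289 | .251 | 48 | $30 |
| 3 | Jack Cust | 197 | .311 | .231 | 33 | $16 |
| 4 | Dan Uggla | 171 | .323 | .260 | 32 | $22 |
| 5 | Carlos Pena | 166 | .307 | .247 | 31 | $21 |
| 6 | Chris Young | 165 | .304 | .248 | 22 | $19 |
| 7 | Adam Dunn | 164 | .262 | .236 | 40 | $21 |
| 8 | Matt Kemp | 153 | .363 | .290 | 18 | $33 |
| 9 | Jim Thome | 147 | .276 | .245 | 34 | $20 |
| 10 | Ryan Ludwick | 146 | .349 | .299 | 37 | $32 |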

We've talked a lot about "regression to the mean" the last few days, so it's useful to know that the midpoint for BABIP in 2008 was .313. Strictly speaking, that figure is a median rather than a mean: of the 147 hitters who qualified for the batting title in 2008, the 74th hitter - Adam LaRoche - had a .313 BABIP.

This chart isn't a ringing endorsement for the theory that these guys are "smoking the ball" when they're not missing it. Dunn, Thome, and Howard are the only hitters who are well below the mean, but only Kemp and Ludwick are really well above it.
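To put a number on that, here is a minimal sketch that compares the ten BABIP values from the 2008 table against the .313 league figure quoted above. The values are copied from the table; the variable names are my own, and the group average is a simple unweighted one, which is an assumption on my part (it ignores differences in playing time).

```python
# BABIP for the ten 2008 strikeout leaders, copied from the table above.
babip_2008 = {
    "Mark Reynolds": 0.329,
    "Ryan Howard": 0.289,
    "Jack Cust": 0.311,
    "Dan Uggla": 0.323,
    "Carlos Pena": 0.307,
    "Chris Young": 0.304,
    "Adam Dunn": 0.262,
    "Matt Kemp": 0.363,
    "Jim Thome": 0.276,
    "Ryan Ludwick": 0.349,
}

LEAGUE_FIGURE = 0.313  # the 2008 figure for qualified hitters quoted in the post

# Split the group by which side of the league figure each hitter landed on.
above = [name for name, b in babip_2008.items() if b > LEAGUE_FIGURE]
below = [name for name, b in babip_2008.items() if b < LEAGUE_FIGURE]

# Unweighted average of the ten values (assumption: no weighting by playing time).
group_avg = sum(babip_2008.values()) / len(babip_2008)

print(f"above: {len(above)}, below: {len(below)}, group average: {group_avg:.3f}")
# prints "above: 4, below: 6, group average: 0.311"
```

Four of the ten finish above .313, six finish below, and the group as a whole sits right around the league figure.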

Top 10 Strikeouts 2007
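| # | Player | K | BABIP | BA | HR | $ |
|---|--------|---|-------|----|----|---|
| 1 | Ryan Howard | 199 | .336 | .268 | 47 | $32 |
| 2 | Dan Uggla | 167 | .286 | .245 | 31 | $16 |
| 3 | Adam Dunn | 165 | .309 | .264 | 40 | $28 |
| 4 | Jack Cust | 164 | .366 | .256 | 26 | $16 |
| 5 | Mike Cameron | 160 | .300 | .242 | 21 | $18 |
| 6 | Grady Sizemore | 155 | .334 | .277 | 24 | $31 |
| 7 | B.J. Upton | 154 | .399 | .300 | 24 | $30 |
| 8 | Brandon Inge | 150 | .308 | .236 | 14 | $10 |
| 9 | Jhonny Peralta | 146 | .329 | .270 | 21 | $16 |
| 10 | Carlos Pena | 142 | .305 | .282 | 46 | $31 |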

Looking at this backwards, half of the players from the 2008 list repeat on the 2007 list.

2007's mean is about the same as 2008's (.312), but now we have hitters like Upton and Cust who are truly smoking the ball when they're not swinging and missing. With the exception of Pena, there is an incredible amount of variance in the repeaters from one year to the next.
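The swings among the five repeaters are easy to tabulate. This is a small sketch using only the BABIP values from the 2007 and 2008 tables; the dictionary and variable names are mine, chosen for illustration.

```python
# BABIP for the five hitters who appear on both the 2007 and 2008 lists,
# copied from the two tables above.
babip_2007 = {
    "Ryan Howard": 0.336,
    "Dan Uggla": 0.286,
    "Adam Dunn": 0.309,
    "Jack Cust": 0.366,
    "Carlos Pena": 0.305,
}
babip_2008 = {
    "Ryan Howard": 0.289,
    "Dan Uggla": 0.323,
    "Adam Dunn": 0.262,
    "Jack Cust": 0.311,
    "Carlos Pena": 0.307,
}

# Year-over-year change (2008 minus 2007), rounded to three decimals.
changes = {
    name: round(babip_2008[name] - babip_2007[name], 3)
    for name in babip_2007
}

# List the changes from the biggest drop to the biggest gain.
for name, delta in sorted(changes.items(), key=lambda item: item[1]):
    print(f"{name:<12} {delta:+.3f}")

# The hitter whose BABIP moved the most in either direction.
biggest_mover = max(changes, key=lambda name: abs(changes[name]))
```

Pena moves by two points. The other four move by somewhere between 37 and 55 points, three of them down and one (Uggla) up.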

We're still not any closer to figuring out how much of this is good luck one year versus bad luck the next. And I'm not necessarily convinced that Brandon Inge is due some additional luck because he's taking some "good cuts" on the balls he's not completely missing.

Top 10 Strikeouts 2006
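| # | Player | K | BABIP | BA | HR | $ |
|---|--------|---|-------|----|----|---|
| 1 | Adam Dunn | 194 | .278 | .234 | 40 | $20 |
| 2 | Ryan Howard | 181 | .363 | .313 | 58 | $43 |
| 3 | Curtis Granderson | 174 | .337 | .260 | 19 | $14 |
| 4 | Bill Hall | 162 | .324 | .270 | 35 | $24 |
| 5 | Alfonso Soriano | 160 | .302 | .277 | 46 | $44 |
| 6 | Jason Bay | 156 | .338 | .286 | 35 | $31 |
| 7 | Richie Sexson | 154 | .303 | .264 | 34 | $20 |
| 8 | Grady Sizemore | 153 | .342 | .290 | 28 | $29 |
| 9 | Nick Swisher | 152 | .287 | .254 | 35 | $18 |
| 10 | Jhonny Peralta | 152 | .329 | .257 | 13 | $7 |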

Once again, more repeaters (Dunn, Howard, Sizemore, and Peralta). Now you've got even more hitters (six) above the mean (.314) for 2006.

One more...

Top 10 Strikeouts 2005
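| # | Player | K | BABIP | BA | HR | $ |
|---|--------|---|-------|----|----|---|
| 1 | Adam Dunn | 168 | .281 | .247 | 40 | $24 |
| 2 | Richie Sexson | 167 | .307 | .263 | 39 | $26 |
| 3 | Pat Burrell | 160 | .341 | .281 | 32 | $27 |
| 4 | Preston Wilson | 148 | .317 | .260 | 25 | $24 |
| 5 | Brad Wilkerson | 147 | .317 | .248 | 11 | $20 |
| 6 | Troy Glaus | 145 | .287 | .258 | 37 | $25 |
| 7 | Jason Bay | 142 | .355 | .306 | 32 | $40 |
| 8 | Brandon Inge | 140 | .315 | .261 | 16 | $15 |
| 9 | Alex Rodriguez | 139 | .349 | .321 | 48 | $49 |
| 10 | Jim Edmonds | 139 | .314 | .263 | 29 | $22 |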

These BABIP numbers all look pretty good. But for some of these players (Sexson, Wilkerson, Wilson, Inge), it's the low BA/high K combination - and not the BABIP - that should have lit up like a warning sign.
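Why a decent BABIP can sit next to a poor batting average comes down to the denominators. The sketch below uses the common definition BABIP = (H - HR) / (AB - K - HR + SF), which leaves strikeouts out entirely, while batting average (H / AB) counts every strikeout as an out. The two stat lines are hypothetical and chosen only for illustration; they are not real players, and the function names are my own.

```python
def hits_from_babip(babip, ab, k, hr, sf):
    """Invert BABIP = (H - HR) / (AB - K - HR + SF) to recover total hits."""
    balls_in_play = ab - k - hr + sf
    return babip * balls_in_play + hr


def batting_average(babip, ab, k, hr, sf):
    """Batting average (H / AB) implied by a BABIP and the other counting stats."""
    return hits_from_babip(babip, ab, k, hr, sf) / ab


# Two hypothetical hitters, identical except for their strikeout totals.
low_k = batting_average(babip=0.315, ab=550, k=90, hr=25, sf=5)
high_k = batting_average(babip=0.315, ab=550, k=160, hr=25, sf=5)

print(f"low-K hitter:  {low_k:.3f}")   # prints "low-K hitter:  0.297"
print(f"high-K hitter: {high_k:.3f}")  # prints "high-K hitter: 0.257"
```

Same BABIP, same power, and the extra 70 strikeouts cost the second hitter about 40 points of batting average under these assumptions.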

One significant problem with expecting BABIP to be higher for sluggers is that it loses sight of what a strikeout is.

It's a negative outcome. It's three swings and misses. There might be some hitters taking great cuts and just missing, but there are also a lot of bad swings on strike three, too.

After Pat Burrell's .341 in 2005, he's put up .298, .283, and .275. He might be below his expected BABIP, but after three years of putting up BABIPs under .300, it might be time to ask if our expectations are too high.

After Sexson's .217 in 2007, a lot of the numbers wags said Richie would bounce back. He did...to .275, which didn't make enough of an impact.

I think there are hitters who make great contact or swing and miss. But there are also hitters taking lousy swings and putting the ball into play. Don't assume someone who mashes the ball out of the park is making good contact when that ball stays in the yard.

2 comments:

Jason Collette said...

Strangely, there is not a very strong correlation between contact rate and BABIP. I've been playing around with those numbers myself tonight and found that over the last 5 seasons, for those batters with 500 or more plate appearances (737 in all), the mean BABIP is .308. This graph shows the rather weak correlation for the large sample size. I was rather surprised by the results myself.

Eugene Freedman said...

I think that while regression to the mean is relevant for pitchers, the theory from THT says that hitters have their own individual mean to regress to, based upon their GB/FB/LD rates, their speed, and their batting side. There are also some park factors.

That's why Jeter and Ichiro always appear at the top of the lists. It's not luck, it's design. But pitchers only have control over three outcomes: HR, Ks, and BB. Everything else is ballpark and defense. When a pitcher has one good year of BABIP, it can be assumed that the next year, if his three true outcomes are identical, his ERA should regress toward the mean because of the increased hits. It's all related to DIPS ERA or the other similar calculations.

For the hitters, though, Burrell seems to have a similar BABIP, around the league average, in every year except the one. That means he was lucky that year, not unlucky in the others. His style of swing, foot speed, and right-handed bat just don't lead to a high BABIP. He will regress toward his personal mean, not the league mean.