Tuesday, September 23, 2003

YOU MEAN IT COMES IN 31 FLAVOURS?

I seem to have steered you wrong, oh loyal reader.

While doing some more research, I realised that I was interchangeably comparing the totals from two different ways of calculating Runs Created in my previous post about Albert Pujols' great season, PUTTING IT INTO PERSPECTIVE.

It seems that depending on how you swing, there are about 24 different ways of calculating Runs Created--and they're all simply referred to as RC. Honestly.

Anyways, the Runs Created totals in the last chart listing the league Win Shares leaders over the years were calculated using the so-called basic method found at Baseball-Reference:

RC = (H + BB) * TB / TPA
This is the original formula that Bill James pioneered in the early 80s, as the Twins Geek describes in his excellent Intro to Runs Created.
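For anyone who wants to check the arithmetic, here is a minimal Python sketch of the basic formula. The function name is mine, and while Bonds' 2001 walks, total bases and plate appearances come from the chart in the earlier post further down, his hit total (156) is not listed there and is filled in as an assumption from his season line.

```python
def basic_runs_created(hits, walks, total_bases, plate_appearances):
    """Basic Runs Created: (H + BB) * TB / TPA."""
    if plate_appearances <= 0:
        raise ValueError("plate_appearances must be positive")
    return (hits + walks) * total_bases / plate_appearances


# Bonds 2001: BB, TB and TPA are from the chart in the earlier post;
# H = 156 is an assumption (it is not listed in that chart).
bonds_2001 = basic_runs_created(hits=156, walks=177,
                                total_bases=411, plate_appearances=664)
print(round(bonds_2001))  # 206
```

That lines up with the 206 RC quoted for Bonds later in this post.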

The RC totals in the charts comparing Bonds, Sosa, Gonzalez and Pujols are derived from a moderately more complicated method used on ESPN's website:

RC = [(H + BB + HBP - CS - GIDP) * (TB + .26 * [BB - IBB + HBP] + .52 * [SH + SF + SB])] / (AB + BB + HBP + SH + SF)
(If you're looking for information on how to take this formula and calculate RC27, or a sketchier, easier version called RC25, check out StrikeThree.com.)
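Here is a rough Python sketch of that formula exactly as printed above. The function name and the sample stat line are hypothetical, made up purely to show the mechanics; I make no claim that this reproduces the numbers ESPN actually publishes.

```python
def technical_runs_created(ab, h, bb, tb, hbp=0, ibb=0,
                           sb=0, cs=0, gidp=0, sh=0, sf=0):
    """Runs Created per the longer formula quoted in the post."""
    # On-base term: times on base, minus outs made on the bases.
    a = h + bb + hbp - cs - gidp
    # Advancement term: total bases plus partial credit for
    # unintentional walks, HBP, sacrifices and steals.
    b = tb + 0.26 * (bb - ibb + hbp) + 0.52 * (sh + sf + sb)
    # Opportunity term: plate appearances.
    c = ab + bb + hbp + sh + sf
    if c <= 0:
        raise ValueError("player has no plate appearances")
    return a * b / c


# Hypothetical stat line, not a real player.
sample = technical_runs_created(ab=500, h=150, bb=50, tb=250, hbp=5, ibb=5,
                                sb=10, cs=5, gidp=10, sh=0, sf=5)
print(round(sample, 1))  # 91.9
```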

This certainly looks more impressive, and three improvements suggest it's a more valuable metric: it includes Stolen Bases, it slightly discounts the effects of Intentional Walks, and it gives players credit for Hits By Pitch, which some say is a real science. (Personally, I have always been suspicious of giving players credit for non-intentional sacrifices like Sac Flies; they're no less team dependent than Runs or RBIs.) And if ESPN is going to do the calculations for you, who cares that they're more time consuming?

Yet even this doesn't take into account team factors or a player's real-life performance with runners in scoring position. Fortunately, that's why Bill James devised a new Runs Created formula:

Step 1 - Calculate the A, B and C Factors

Calculate the A, B, and C terms as follows:
A = (H + BB + HBP - GIDP - CS)
B = [ TB + ((BB + HBP - IBB) * .24) + (SB * .62) + ((SH + SF) * .5) - (SO * .03) ]
C = (AB + BB + HBP + SH + SF)

Step 2 - Calculate initial Runs Created (iRC) by inserting A, B, and C factors into theoretical team context and round off the result:

iRC = [ ((A + (2.4 * C)) * (B + (3 * C))) / (9 * C) ] - .9 * C

Step 3 - Calculate the adjustment for Home Runs with Runners On Base (HR-ROB)

expected HR = (AB with ROB / AB) * HR

HR-ROB = (HR hit with ROB) - expected HR
(round off)

Step 4 - Calculate the adjustment for batting Average with runners in Scoring Position (AvgSP)

AdjAvgSP = (AvgSP - Avg) * ABSP
(round off)

Step 5 - Calculate the Preliminary RC (PrelimRC)

PrelimRC = iRC + HR-ROB + AdjAvgSP

Step 6 - Calculate the team Reconciliation Factor (RF)

Calculate PrelimRC for every player on the team, then sum the rounded individual PrelimRC values to get the team PrelimRC.

RF = team R / team PrelimRC

Step 7 - Multiply team reconciliation factor times individual player's PrelimRC and round off to get the final RC result

RC = RF * PrelimRC

Simple, right? (For more of the gory details, go here.)
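If you'd rather let a computer do the seven steps, here is a minimal Python sketch. A few loud assumptions: the class, field and function names are all mine; I read the HR-ROB adjustment as home runs actually hit with runners on base minus the expected number; the two players are made-up stat lines, not real ones; and Python's round() rounds exact halves to the nearest even number, which may not match how James rounds.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    ab: int
    h: int
    bb: int
    hbp: int
    ibb: int
    tb: int
    hr: int
    sb: int
    cs: int
    gidp: int
    sh: int
    sf: int
    so: int
    ab_rob: int  # at-bats with runners on base
    hr_rob: int  # home runs hit with runners on base (assumed meaning)
    ab_sp: int   # at-bats with runners in scoring position
    h_sp: int    # hits with runners in scoring position


def initial_rc(p):
    """Steps 1-2: A, B, C factors dropped into the theoretical team context."""
    a = p.h + p.bb + p.hbp - p.gidp - p.cs
    b = (p.tb + 0.24 * (p.bb + p.hbp - p.ibb) + 0.62 * p.sb
         + 0.5 * (p.sh + p.sf) - 0.03 * p.so)
    c = p.ab + p.bb + p.hbp + p.sh + p.sf
    return round((a + 2.4 * c) * (b + 3 * c) / (9 * c) - 0.9 * c)


def prelim_rc(p):
    """Steps 3-5: add the two situational adjustments to iRC."""
    expected_hr = p.ab_rob / p.ab * p.hr
    hr_rob_adj = round(p.hr_rob - expected_hr)
    avg = p.h / p.ab
    avg_sp = p.h_sp / p.ab_sp if p.ab_sp else avg
    adj_avg_sp = round((avg_sp - avg) * p.ab_sp)
    return initial_rc(p) + hr_rob_adj + adj_avg_sp


def final_rc(team, team_runs):
    """Steps 6-7: scale every PrelimRC so the team total matches actual runs."""
    prelims = [prelim_rc(p) for p in team]
    rf = team_runs / sum(prelims)  # team reconciliation factor
    return [round(rf * x) for x in prelims]


# Two hypothetical hitters standing in for a whole roster.
slugger = BattingLine(ab=500, h=150, bb=50, hbp=5, ibb=5, tb=250, hr=20,
                      sb=10, cs=5, gidp=10, sh=0, sf=5, so=80,
                      ab_rob=220, hr_rob=10, ab_sp=120, h_sp=40)
glove_man = BattingLine(ab=400, h=100, bb=30, hbp=2, ibb=1, tb=150, hr=8,
                        sb=3, cs=2, gidp=12, sh=4, sf=3, so=70,
                        ab_rob=180, hr_rob=3, ab_sp=100, h_sp=24)
team_rc = final_rc([slugger, glove_man], team_runs=140)
print(team_rc)
```

The point of the last step is simply that the individual totals get stretched or squeezed until they add up to the runs the team really scored, give or take rounding.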

Incidentally, it's my understanding that these new Runs Created provide the framework for calculating offensive Win Shares.

OK, what I probably should have said up front is why Runs Created has mutated into these unwieldy permutations: while the simple version provides a good thumbnail of a player's offensive value, kinda like OPS, it's not perfect. For one, its usefulness tends to break down when evaluating extreme high-walk, high-slugging players like Ruth, Williams, and Bonds, again kinda like OPS. (If you're feeling really lazy, simple RC can more or less be expressed as OBP * SLG * AB. This underlines the fact that Runs Created is essentially the product of a player's OBP and SLG, the two halves of OPS, scaled over all his at-bats, and therefore is a useful tool to quickly help resolve certain debates when you don't have a copy of Baseball Prospectus handy.)
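To see why the lazy version works, here is a small Python sketch. It assumes a stripped-down OBP of (H + BB) / (AB + BB), ignoring HBP and sac flies, and counts plate appearances as AB + BB; under those assumptions the two formulas are algebraically identical. The stat line is hypothetical.

```python
def shortcut_rc(obp, slg, ab):
    """Lazy Runs Created: OBP * SLG * AB."""
    return obp * slg * ab


# Hypothetical stat line, not a real player.
ab, h, bb, tb = 500, 150, 50, 250

obp = (h + bb) / (ab + bb)  # simplified OBP (assumption: no HBP or SF)
slg = tb / ab

lazy = shortcut_rc(obp, slg, ab)
basic = (h + bb) * tb / (ab + bb)  # basic RC with TPA taken as AB + BB
print(round(lazy, 2), round(basic, 2))
```

With real OBP figures (which include HBP and sac flies) and real plate appearance totals, the two drift apart a little, hence "more or less".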

Why? Part of the problem is that at its extremes the formula suggests that these guys are getting on base nearly half the time AND then slugging themselves home to the tune of a run or more per game, as if that player were surrounded by a lineup of juggernauts like himself. (For a real life example, consider the case of Bonds 2001-2003: despite Bonds' superhuman OBP, he hasn't led the league in Runs, and despite his superhuman SLG, he hasn't come close to leading the league in RBIs.) As Tangotiger points out, this means 1 HR is now worth 3.6 Runs in the Runs Created formula, which clearly defies common sense. That the formula "works" as often as it does and provides a reasonable estimate of a player's offensive value for most players with OBPs between .300 and .400 Tangotiger calls "purely an accident".

Fortunately, there aren't too many of these guys around, and, accident or not, most RC formulas do fine for most purposes. Even the basic formulas

RC = (H + BB) * TB / TPA
and even
RC = OBP * SLG * AB
aren't too much worse than the others unless you're Keith Law or Paul DePodesta (or the next wannabe). And to get a quick sketch of a player's offensive value to go alongside OPS, it can't be beat.

Other systems, such as Baseball Prospectus' Equivalent Runs (where EQR = 5 * OUT * EQA^2.5), Pete Palmer's Linear Weights, David Smyth's Base Runs, and Jim Furtado's Extrapolated Runs, try to overcome the real limitations of Runs Created in different ways. But that's another post for another day.
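Just to show how the EQR formula quoted above hangs together, here is a tiny Python sketch. The function names are mine. It takes Bonds' 2001 EQA (.428) and EQR (195) from the chart in the earlier post further down and backs out the number of outs those two figures imply; that out total is a derived illustration, not an official stat.

```python
def equivalent_runs(outs, eqa):
    """EQR = 5 * OUT * EQA^2.5, as quoted in the post."""
    return 5 * outs * eqa ** 2.5


def implied_outs(eqr, eqa):
    """Invert the formula to see how many outs an EQR/EQA pair implies."""
    return eqr / (5 * eqa ** 2.5)


bonds_outs = implied_outs(eqr=195, eqa=0.428)
print(round(bonds_outs))
```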

So, with no further ado, here are the original RC = (H + BB) * TB / TPA results, with the difference from ESPN's RC totals in brackets:

Bonds 2001
206 RC (+11)

Sosa 2001
182 RC (+8)

Gonzalez 2001
172 RC (+6)

Pujols 2003
156 RC (-1)

And earlier I mentioned something about a debate...

Bonds 2003
145 RC (+1)

Seems clear to me.


PUTTING IT INTO PERSPECTIVE

I heard BP’s great Joe Sheehan on FAN590 radio yesterday talking about Pujols’ “historic” season—the only thing standing between Bonds and a record third-straight MVP.

So, just how good is Albert Pujols this year?

Well, if you’re like me and you think the MVP should go to the player who has contributed the most to his team’s victories, then Pujols’ ’03 season, dropped back into ’01, might still only have been good enough for fourth place in that year’s NL MVP race.

Player             | AVG/OBP/SLG    | OPS   | AVG/OBP/SLG road | OPS road | TPA | HR | R   | RBI | SB-CS | TB  | BB/K          | IBB
Barry Bonds 2001   | .328/.515/.863 | 1.379 | .321/.514/.817   | 1.332    | 664 | 73 | 129 | 137 | 13-3  | 411 | 177/93 (1.90) | 35
Sammy Sosa 2001    | .328/.437/.737 | 1.174 | .321/.444/.709   | 1.153    | 711 | 64 | 146 | 160 | 0-2   | 425 | 116/153 (.76) | 37
Luis Gonzalez 2001 | .325/.429/.688 | 1.117 | .308/.424/.701   | 1.125    | 728 | 57 | 128 | 142 | 1-1   | 419 | 100/83 (1.20) | 24
Albert Pujols 2003 | .363/.443/.679 | 1.122 | .329/.403/.610   | 1.013    | 661 | 42 | 130 | 123 | 4-1   | 387 | 76/60 (1.27)  | 12

Bonds, Sosa, and Gonzalez all had career years in 2001 (although Bonds seems determined to have a few more before he’s done), but my guess is that Gonzalez’s year was the most impressive in the sense that it represented the biggest leap from his then career norm. It’d be interesting to do a comparison.

Pujols is behind them in homers and total bases, and his OPS drops to 1.013 on the road. That's strange, since Busch Stadium has been fairly neutral over the years, slightly favouring pitchers, albeit less so than Pac Bell. (Wrigley Field is also on the pitching side of neutral, but Bank One Ballpark is a definite hitter's park, giving Gonzalez approximately the same advantage that Pac Bell takes away from Bonds, which is amazing given the success of Johnson, Schilling, and Kim.)

What's really interesting to my eyes is that Pujols is so far behind in intentional walks. That clearly says something about what opposing pitchers think about the rest of the Cardinals' lineup. It suggests that Pujols' command of the strike zone is better than Gonzalez's (whose BB/K drops below 1 without IBB), and that Pujols is working a little harder to earn those stats on more than reputation.
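Here's a quick Python sketch of that BB/K comparison with the intentional walks stripped out. The walk, strikeout and IBB totals are straight from the chart above; the function and variable names are mine.

```python
def unintentional_bb_per_k(bb, k, ibb):
    """Walks per strikeout after removing intentional walks."""
    return (bb - ibb) / k


# (BB, K, IBB) from the chart above.
lines = {
    "Barry Bonds 2001": (177, 93, 35),
    "Sammy Sosa 2001": (116, 153, 37),
    "Luis Gonzalez 2001": (100, 83, 24),
    "Albert Pujols 2003": (76, 60, 12),
}

ratios = {name: unintentional_bb_per_k(*stats) for name, stats in lines.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}")
```

Gonzalez slips under 1.00 once the free passes are removed, while Pujols stays above it.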

Let's see how all of this translates in the fancier metrics.

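Player             | AdOPS+ | EQA  | EQR | VORP  | RC    | RC 27 | WS
Barry Bonds 2001   | 262    | .428 | 195 | 154.0 | 195.2 | 15.97 | 54
Sammy Sosa 2001    | 201    | .367 | 164 | 116.0 | 173.9 | 11.51 | 42
Luis Gonzalez 2001 | 176    | .350 | 151 | 100.5 | 166.4 | 10.43 | 37
Albert Pujols 2003 | tbd    | .366 | 147 | 96.7  | 156.7 | 11.11 | 40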

Pujols looks much more competitive now, and will probably get another 30 or so plate appearances before the end of the season. I think he's a good bet to make a definitive push ahead of Gonzalez, and there's a chance he reaches Sosa.

The punchline, of course, is that Albert Pujols really did come fourth in the 2001 NL MVP with 29 WS (although given Randy Johnson’s 37 WS and Berkman’s 32 WS, he probably didn’t deserve quite so high a finish). But it was still a pretty amazing year, especially for a rookie.

Player             | AdOPS+ | EQA  | EQR | VORP | RC  | RC 27 | WS
Albert Pujols 2001 | 158    | .329 | 126 | 71.6 | 133 | 8.37  | 29

2001 might have been an unusually strong year for several players, but it's something we've grown accustomed to over the last few years.

Bear in mind that Bill James wrote:

A 30-Win Share season is, in general, an MVP candidate-type season. People have won MVP awards with less; people have failed to win MVP awards with 40 Win Shares. But when a player gets to 30 in an ordinary season, he's probably going to be visible in the MVP voting.
Here are the top-ranking players by year in Win Shares since the strike:

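Lg | Year | Player         | Pos   | OPS   | AdOPS+ | RC  | WS
AL | 2003 | Alex Rodriguez | SS    | 1.002 | tbd    | 134 | 32
NL | 2002 | Barry Bonds*   | LF    | 1.381 | 275    | 185 | 49
AL | 2002 | Alex Rodriguez | SS    | 1.015 | 152    | 149 | 35
AL | 2001 | Jason Giambi   | 1B/DH | 1.137 | 202    | 162 | 38
NL | 2000 | Jeff Kent*     | 2B    | 1.021 | 165    | 147 | 37
AL | 2000 | Jason Giambi*  | 1B/DH | 1.123 | 188    | 156 | 38
NL | 1999 | Jeff Bagwell   | 1B    | 1.045 | 169    | 149 | 39
AL | 1999 | Roberto Alomar | 2B    | .955  | 140    | 127 | 35
"  | "    | Manny Ramirez  | LF    | 1.105 | 174    | 151 | 35
"  | "    | Derek Jeter    | SS    | .989  | 161    | 149 | 35
NL | 1998 | Mark McGwire   | 1B    | 1.222 | 217    | 179 | 41
AL | 1998 | Albert Belle   | LF    | 1.055 | 171    | 162 | 37
NL | 1997 | Tony Gwynn     | LF    | .957  | 156    | 134 | 39
"  | "    | Mike Piazza    | C     | 1.070 | 167    | 153 | 39
AL | 1997 | Frank Thomas   | 1B/DH | 1.067 | 181    | 148 | 39
NL | 1996 | Jeff Bagwell   | 1B    | 1.021 | 179    | 144 | 41
AL | 1996 | Alex Rodriguez | SS    | 1.045 | 160    | 156 | 34
NL | 1995 | Barry Bonds    | LF    | 1.009 | 168    | 125 | 36
AL | 1995 | Edgar Martinez | DH    | 1.107 | 183    | 145 | 38
*won league MVP award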

A-Rod's comparatively pedestrian league-leading 32 Win Shares this year, albeit still MVP-worthy by James' standards, underlines how many historically great seasons we've seen of late, and how, even by the lofty standards of Sosa, McGwire, Bagwell, Thomas and others, Pujols has been amazing this year. Historically amazing. In fact, he’s got a chance to tie, and maybe even pass, the best non-Barry Win Shares total since ’93 (when Barry had 47 WS).

But given what Barry et al. have done at the plate over the last decade, not to mention Johnson, Martinez, Maddux, and Clemens on the mound, it’s perhaps a little understandable that we’ve grown slightly blasé when face to face with another historically amazing season or whatever you want to call it. (Especially when no sexy records like HR, BA, or RBI are broken in the process.)

The fact that Pujols failed to reach .400 or lost out on the Triple Crown should not diminish what he’s done. This is one of the fifty or so great seasons of all time. (Cover your mouth when you yawn.)

As for the MVP, whether you choose to penalize Barry and his 39 WS because of playing time lost to his father’s illness is up to you. But it seems strange to penalize Pujols for the same thing. Because according to Win Shares, Pujols has done everything a player born after 1932 (not named Barry Bonds) can be expected to do.
