# One of the Better Statistical Discussions You’ll Read This Year – Well, or at Least Today

Earlier this week, there was a dust-up about the WAR statistic (Wins Above Replacement), which, yes, is quickly becoming a tired old fight.

ESPN’s Jim Caple wrote a nice piece about WAR, but one that ultimately makes an argument that doesn’t need to be made. The essence of the argument was, “WAR shouldn’t be used as the only statistic.” I don’t think anyone would dispute that in this day and age.

So it’s quite a credit to FanGraphs’ Dave Cameron that he was able to write a response piece that didn’t suffer from the same level of “did this really need to be written”-itis. Indeed, Cameron’s article – “What WAR is Good For” – is one of the better statistical analysis pieces I’ve read in quite some time.

The thrust of Cameron’s piece, as indicated by the title, is that WAR – like so many other statistics – is valuable because it seeks to answer a question that we all have about baseball. Indeed, the real value in the article to me is the idea that statistics are answers to questions, and the better statistics tend to be the ones that answer the narrowest, simplest questions.

Here’s a particularly engaging section from Cameron, which will become my go-to attack on W/L record, at least with respect to those persons who appreciate nuance:

Whether it is batting average, strikeout rate, swing percentage, or average velocity, each one was designed to answer a pretty simple question. How often does that player get a hit? How often does he swing? How hard did that pitcher throw his fastball? These are questions that are worth asking, and so we track things in baseball that allow us to answer those questions with data, assuring us of getting a pretty accurate answer in most cases. In fact, I think a litmus test for the usefulness of a statistic can be simply translating the definition into a question and figuring out how often anyone ever asks that question?

Let’s take Wins for a pitcher, for instance. For a long time, they’ve been hailed as one of the most important statistics in baseball, but the actual question that statistic is answering goes something like this.

“How many times did that pitcher complete at least five innings, leave the game with his team having outscored the opponents through the point at which he was removed, and then watch his relievers finish the game for him without surrendering the lead that his teammates helped create in the first place?”

No one would ever ask that question. It’s not something that’s worth knowing, nor does it help anyone understand what actually happened in any real way. And that’s one of the reasons why W-L record for a pitcher has been marginalized and will eventually just be discarded, since it serves no real purpose. It answers a useless question, making it mostly a useless statistic.

Magnificent.

The entire article is well worth your time, especially if you’re endeavoring to broaden your statistical background. Cameron’s approach is largely inclusive – more stats are a good thing – which is a good attitude to have in an age where so many statistics are available to us. In that way, Cameron and Caple ultimately agree: WAR, like many other stats, is best used as a piece of the puzzle. It offers one answer – or, perhaps more accurately, one slice of the answer pie – to one of the questions worth asking.

#### Brett

Brett Taylor is the editor and lead writer at Bleacher Nation, and can also be found as Bleacher Nation on Twitter and on Facebook.

### 59 responses to “One of the Better Statistical Discussions You’ll Read This Year – Well, or at Least Today”

1. Am I the only one whose feed aggregator is picking up new posts way late? And in bunches?

1. Nope. The BN RSS feed has been pretty jerky-jerky for the last week or so.

1. Hrm… I’m positive I wrote herky-jerky. Silly Mac OS autocorrect.

2. You’re making me hungry, man.

1. …and horny, too, I bet.

3. Good to know.

2. I literally LOL’d at the title of the piece. Much better than the ESPN “WAR Horse” attempt. It has been Pocket’d for later reading.

3. Wow. You have to scroll way, way down the referenced page to find an Eric Burdon reference.

1. I’ve never been so disappointed in the FanGraphs readership.

4. Read this article the other day and loved it. I totally agree with the part about looking at stats as the answer to a question. I have gotten into debates before with my dad and a few of my older relatives about the validity of W-L, RBI, etc., and will be using this idea next time the topic comes up. Hopefully they see the light, and when I call home this summer to talk Cubs the conversation is more about FIP and not W-L.

5. I haven’t read Cameron’s piece yet. But, I did read Caple’s article. Here is my take. The problem that Caple really has, and one that I can certainly sympathize with, is that WAR is an aggregate of statistics, it is not really itself a statistic.

There should be little argument about the value of advanced statistics (e.g. xFIP vs. W-L). But the potential problem with WAR (which really shouldn’t even be discussed in the singular, since there is more than one WAR) or VORP is that they become dangerous in that they boil down a very complicated set of variables into a single shiny number. We see this play out here as some players’ “WAR” values are anomalously high (Darwin Barney and Campana come to mind).

While I appreciate the simplicity of a metric that compares the value of a slugging 1B, a swift-footed CF, and a starting pitcher, I also am a baseball fan because of the complexities of the game and the puzzle that is putting together a team. Certainly the creators of WAR and VORP aren’t out to take shortcuts, but these numbers are used that way. In that way, I agree with Caple.

6. Tom Seaver: “good pitchers win games, bad pitchers lose games.”

1. Felix Hernandez bad, Lance Lynn good?

1. Felix’ 2010 season is just so blatantly obviously an example of the problem with W/L. He went 13-12 in 34 starts. In his 9 no decisions he gave up 14 TOTAL RUNS and only gave up 3 runs TWICE. In his 12 losses he gave up 3 or fewer runs 8 times. Arguably he should have lost 4 games and won 30. He had 30 games where he gave up less than 3 runs. Set the bar higher… he gave up 2 or fewer runs 25 times. He won 13 games. An absolutely dominant season and he won 13 games.

1. My favorite evidence against W/L is 2004 Ben Sheets. What a dirty, dirty pitcher he was that year…

1. You are correct about that. Great season even tho he lost 14 games and won 12.

2. Wow, no kidding, almost as good as King Felix. Sheets’ 2004 season:

34 Starts
26 starts 3 runs or less
21 starts 2 runs or less
8 starts with ZERO runs (only got 5 of his wins from those)
1 start gave up 6 runs, 2 starts gave up 5 runs. 31 starts with 4 runs or less.

Dominant. 12 wins.

7. Like the quest for the unified field, some fans are looking for the one universal statistic that will show a player’s or team’s worth. It already exists; it’s called won-loss. You win more, you are a success. You lose more, you’re not.

1. A comment that’s not worth reading about a statistic answering a question that’s not worth knowing.

2. As a team stat, W/L is really the only thing that matters in the end.

As an individual stat, it’s pretty stupid and arbitrary.

8. I completely agree that W/L will be completely abandoned in my lifetime (I’m 44). The best recent example of a statistic disappearing is Game Winning Run Batted In, which was a product of the ’80s and disappeared almost as quickly as it appeared.

1. Eh, I highly doubt the statistic will ever go away, nor would I want it to. The reality is that the W-L statistic (for pitchers, of course) is, at the very least, an important part of baseball history.

While it may not be a particularly useful statistic, milestone numbers such as 300 wins are an important part of the lore of baseball.

1. Lore should sometimes be left in the past. If a pitcher goes out and gets shelled but his team is hitting well too and wins 15-13, did he really have a good game? On the other hand you can lose 1-0 games. Did the losing pitcher have a bad game? W/L for an individual player is useless. It should only be used for the team.

1. I think W/L shouldn’t be abandoned, just recognized for what it tells you.

Thinking a 15-14 pitcher is better than a 10-3 pitcher just because they won 5 more games is silly.

Saying that you anticipate Garza winning 20 games this year means that you think he will perform at an ace level. Now, looking at his 2012 and saying he only won 10 so clearly he was garbage, is silly.

You can look at a guy’s career who has 300+ wins and know that he was really good for a really long time. But to debate 350 vs. 325 wins being a better or worse pitcher, is silly.

It’s a broad brush stat. It’s like saying a guy is a .300 hitter. Ok, what’s his OBP, what’s his SLG, BB%, K%, etc… It’s a casual, scratch that, lazy, way of saying how good someone is.

2. Here is hoping that quality start is the next “stat” to be abandoned. What a bunch of garbage. Well I only gave up 3 runs in 6 innings so if I had teammates that were worth a crap I should have gotten a win. Bah. OMG I sound like my father.

1. Well… the reason Quality Starts is a decent stat is that we can know the outcomes of games when a pitcher goes X innings and gives up Y runs. So if a team wins most of the games where that happens, then you can say a pitcher had a quality start. It’s not at all perfect. It’s certainly better than W/L.

9. Cameron posted another article today about his Top 10 Best Transactions of the Off-Season and the Scott Baker signing comes in at #7. Cameron thinks that if Baker proves he’s healthy and pitches well, the Cubs could sign him to an extension, which would be a nice change from assuming every signing is getting flipped.

http://www.fangraphs.com/blogs/index.php/the-10-best-transactions-of-the-off-season/

10. I take from your bullet that the thrust was that a statistic should answer a narrowly defined question and that the stats, when fitted together, should form a perception that leads to an overall picture of what is happening.
I believe that the W/L ratio is indeed a good example. Is it designed to indicate how well a pitcher has performed? Probably! But it actually gives a rough indication of whether a pitcher was good enough to pitch for a particular team. A clearer focus becomes apparent when IP and IP/gm avg are observed.
One size does not necessarily fit all. A set of stats may work in evaluating a certain player but not another. Does the K rate of Adam Dunn have the same significance as that of BJackson?
The science of stats is great. The stats answer a question. A set of stats answers a set of questions designed to show a perceivable picture.
The art of using that science… well, that is where the big bucks are.

1. The problem is that you can have two pitchers earn a W on the same day with dramatically different performances.

Player A: Pitches a perfect game and gets the W
Player B: Pitched the last third of the top of the 8th inning, offense scores a run to put them in the lead during the bottom half of the inning, closer brought in for a save in the 9th, he gets a win.
Player C: Pitches a CG but with only 1 run given up, the offense doesn’t support him and he gets a loss.

Please, tell me how, by looking at the W column, you can discern ANYTHING about these three pitchers’ performances?

1. oh, then there is:

Player D: Gives up the tying run but the offense gets a run back in the next half inning. Gets a W.

WTF. Defend that one, W/L folks.

1. http://hardballtalk.nbcsports.com/2012/04/12/jeff-gray-and-the-absurdity-of-pitcher-wins/

Except:

“So, in the span of about 18 hours (Jeff) Gray went from one career win to three career wins, all while facing two hitters and throwing a grand total of three pitches. Obviously he just knows how to win.”

2. Hansman, we are in agreement. Wow. Wins are just stupid. I don’t think they should be discarded, but they should not be used in determining things such as Cy Young, All-star spots, Hall of Fame worthiness, who is the better pitcher and so on. If I pitch 5 innings, give up 8 runs, but my team scores 10 runs in that time frame, I get a win, but the next game, I pitch 8 innings, give up 1 run, and my team gets shut out, I get a loss. It makes no sense at all.

2. Please don’t lose sight of the concept that the object of the game is to outscore the other guys over 9 innings.
Please don’t lose sight of the fact that resources are limited.
The ’27 Yanks did not need the equals of Drysdale and Koufax to win. The ’63 Dodgers did not need the equals of Ruth and Gehrig to win.
Jason Marquis was no C.C. However, he was good enough to win with The Cubs and get in around 180 or 190 IP a year. He didn’t put undue pressure on the BP and he won. Was he an elite pitcher? NO! Was he good enough for The Cubs? Yes! They won a very good % of the games he started.
W/L % gives a rough indication as to whether or not the pitcher was good enough for the team on which he pitched!

1. So King Felix wasn’t good enough to pitch for the M’s? I really can’t believe that is what you are trying to sell. What I’m guessing you are saying is that if you compare a starting pitcher’s W/L% versus the team’s W/L%, then you have an idea if the pitcher is better than the other starters on his team.

11. I did not read the article nor do I know the specifics of what makes up the WAR statistic. However, I will argue that having fewer statistics rather than more is a good thing. The reason I say that is you can look at 20 different stats, and some will be good and some not as good for a certain player. What needs to be created is a weighted index of these individual statistics that allows us to account for the value of each individual trait as it corresponds to the overall value of the player. I believe that is likely what WAR does. However, statistics similar to ERA+ are good examples for comparing players against each other on a similar playing field, as they account for some variation in teams played, stadium, etc. However, I believe these statistics could go deeper to account for many, many more variables that affect performance, and you could actually create predictive models for what players could do for your team based on specific parts of their actual performance.

12. But if WAR really answers the question it claims to answer, why would you need any other stats? The ultimate goal is to win games. If there is a stat that will accurately tell me how many games a player will help me win, why do I need any other stat? By acknowledging that WAR isn’t the only stat you should look at, aren’t we admitting that it doesn’t do what it claims to do?

1. Yes, you do still need other stats. Suppose that A and B are two 21-year-old players who have the same WAR. OK, groovy, they’ve contributed the same to winning at 21. However, WAR is the composite of numerous stats. Suppose further that A’s WAR is held down by less power than B whereas B’s WAR is held down by drawing a lot fewer walks than A. Power is something that tends to increase as a player ages, so even if B’s native power will always be a bit greater, A can do a much better job of catching up. However, batting eye is a more fundamental tool, and it’s much less probable that B will catch up to A at all.

So, breaking down WAR components by those that can develop versus those that are more fundamental tools should be very important when evaluating young players.

Similarly, suppose that C and D also are 21 and have the same WAR as A & B. However, C did it with a really high BABiP whereas D actually had a very low BABiP. Well, we actually can project that D will jump ahead of A & B because D was a bit unlucky, and that C will drop behind the other three because he was a bit lucky.

Now, none of that alters what those four 21 year-olds did in that season: all made roughly equal contributions to winning. And this is key: *that is all WAR does for any one season.* However, the other stats leave us with better ideas about which of them will be doing most to help their teams in 3-4 years.

1. That is the wrong way to look at it. WAR accounts for all of those things, so it doesn’t matter if they get the WAR by power, or speed, or OBP, they are still worth the same in value of wins.

You did have an interesting point about different parts of a person’s game changing over time. This is true and this is why WAR is an index for their current value, not a predictive measure of value. Those players A and B may change over time, thus their WAR will change, and that is why WAR is a stat that is more valuable than its collective parts.

2. WAR is an abstract constructed from primaries. Well constructed abstracts are very useful. Indeed, we would be pretty much confused in life without them. However, we should not forget that they are tethered to those primaries. That is to say, info is lost in the process of abstract creation and a return to the primaries can indicate just what and how much of that info has been misplaced.

1. If you accurately predict the relationship between your primaries and your abstract, the abstract will provide much more information than any specific primary ever could. For example:

If you consider it as a weighted index based on the primaries, you get 2 players:

Player 1: WAR 4, BA .300, OBP .380, Field % .995
Player 2: WAR 3.2, BA .325, OBP .380, Field % .915

Now, these are extreme examples I created off the top of my head, but they prove the point: if you looked only at BA, player 2 is much better, and his OBP is the same, so no biggie. But if you look at all 3 traits together, player 1 has more value to the team as a whole, because his offense is fine but his defense is much better.

I agree you can’t ignore the primaries, but that is only because you need them to create the abstract. Once you get the abstract, ignoring it for one of the primaries either means you have no faith in the abstract, or you just don’t care, and that would not be ideal.
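For anyone who wants to play with the idea, here is a minimal Python sketch of a weighted index built from the three primaries in the example above. The weights are invented purely for illustration and the `index_score` helper is a hypothetical name; this is not how any real version of WAR is calculated.

```python
# Hypothetical weights, chosen only for illustration.
# Real WAR is NOT computed from these inputs or these weights.
WEIGHTS = {"ba": 1.0, "obp": 2.0, "fld": 3.0}

# The two made-up players from the comment above.
players = {
    "Player 1": {"ba": 0.300, "obp": 0.380, "fld": 0.995},
    "Player 2": {"ba": 0.325, "obp": 0.380, "fld": 0.915},
}

def index_score(stats, weights=WEIGHTS):
    """Combine the primary stats into one number using fixed weights."""
    return sum(weights[name] * stats[name] for name in weights)

scores = {name: index_score(stats) for name, stats in players.items()}

# Who looks better on batting average alone, and who on the combined index?
best_by_ba = max(players, key=lambda name: players[name]["ba"])
best_by_index = max(scores, key=scores.get)

print("Best by BA alone:", best_by_ba)
print("Best by weighted index:", best_by_index)
```

With these particular weights, batting average alone favors Player 2 while the combined index favors Player 1, which is the point of the example. Pick different weights and the ranking can flip, so the index is only as good as the weights behind it.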

1. Leaving aside the poor quality of defensive stats…
Continuing the Dr’s chemistry thing…
Gibbs free energy gives a bunch more info than temp/pressure/vol. Same with the other derived thermo stuff. But if you only look at chemistry through the eyes of Maxwell you may be missing something.
The structure of baseball, much like the economy, is much more dynamic than that of the physical sciences. Things change!
What is the effect of the banned greenie on older players? Will younger players who have never used the substance adjust better when they age? There are a whole lot of questions such as that which could, and probably will, affect stats such as WAR.
A method has to know its limitation. Those using the method must understand where those borders are.

13. Cameron’s title is awesome.

14. While I am not a fan of pitcher W-L as a statistic and agree with the many faults outlined in the comments, the other side of the coin is that it’s an approximation for a pitcher’s performance. Generally, a pitcher with more wins pitched better than a pitcher with fewer wins. Of course, there are exceptions. There are also exceptions with virtually every other stat, but maybe not as many as with W-L. In evaluating the validity of a stat, we should analyze its accuracy in evaluating a player’s performance relative to the margin of error. For example (and these numbers are lazy illustrations only), maybe a pitcher who wins more games than another pitcher actually pitched better than the other pitcher 75% of the time. Maybe a pitcher with better ERA than another pitcher actually pitched better 90% of the time. Maybe a pitcher with a better FIP than another pitcher actually pitched better 98% of the time. It doesn’t mean W-L is meaningless; it’s just not the most accurate measure available. And, the real bottom line is what statistic is the most accurate indicator of who pitched better.
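One way to make that "how often does the stat pick the better pitcher" idea concrete is a pairwise agreement rate. The Python sketch below is only an illustration: the four pitchers, their numbers, and especially the "true quality" column are all made up, and deciding what counts as true quality is the hard part that this sketch simply assumes away.

```python
from itertools import combinations

# Entirely invented data: (name, wins, ERA, hypothetical "true quality" score).
pitchers = [
    ("A", 13, 2.27, 9.0),
    ("B", 18, 3.50, 6.0),
    ("C", 10, 4.20, 4.0),
    ("D", 15, 3.00, 7.0),
]

def agreement(rows, stat_index, higher_is_better):
    """Share of pitcher pairs where the stat and true quality pick the same one."""
    agree = total = 0
    for p, q in combinations(rows, 2):
        if p[stat_index] == q[stat_index] or p[3] == q[3]:
            continue  # skip ties, since neither side picks a winner
        stat_prefers_p = (p[stat_index] > q[stat_index]) == higher_is_better
        truth_prefers_p = p[3] > q[3]
        total += 1
        agree += stat_prefers_p == truth_prefers_p
    return agree / total

wins_rate = agreement(pitchers, 1, higher_is_better=True)
era_rate = agreement(pitchers, 2, higher_is_better=False)

print(f"Wins agree with true quality on {wins_rate:.0%} of pairs")
print(f"ERA agrees with true quality on {era_rate:.0%} of pairs")
```

The rates this prints say nothing about real pitchers, because the inputs are invented. The sketch only shows the shape of the comparison: run the same check for each stat against the same yardstick, then see which one agrees most often.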

15. I believe it was North side Irish that had a good point earlier. What if Baker, Feldman and DeJesus all have great years leading up to the trade deadline? I’ve always agreed to signing Garza to an extension if he is healthy, but the other players I’ve just assumed we flip. Maybe Rizzo and Castro are not living up to the lofty expectations we have for them and are just average. This causes us to be 10 games out at the trade deadline. Given the free agent market next year and the fact that our top prospects are still progressing, do we try to wrap up Baker, Feldman or DeJesus? Love the web site btw.

1. I doubt it. DeJesus is getting old, so I don’t think you want to re-sign him unless it is a 1 year deal, though his value may be higher as a trade piece if he is doing that well. As for Baker and Feldman, I could see the Cubs keeping one of them and flipping the other, but I wouldn’t expect a large extension to any of those three.

1. DeJesus has a 2014 club option for \$6.5 million and a \$1.5 million buyout. FYI, did you know Starlin Castro’s middle name is DeJesus?

2. I think it’s an excellent question, but just to note: The Cubs were 18 games out at the deadline last year. I fully expect them to be better than that, but probably not 8 games better.

Interestingly at the deadline last year Pittsburgh was the 1st Wild Card and 3 games back in the division. After the break they were the 4th worst team in MLB.

16. I was asked if King Hernandez was “good” enough to pitch for The Mariners based upon his W/L.
Brett’s bullet specifically stated that stats should be used to answer specific and narrowly defined questions.
It is the mosaic formed from these thread-like questions that forms the picture of a more broadly asked question.
With that in mind, the question need be asked: How good does a pitcher have to be in order to effectively pitch for Seattle?
Keep in mind that Sandy Koufax effectively pitched for the low-scoring Dodgers in the ’60s and Steve Carlton won 27 games for the Phillies in 1972. Also, for whatever reason, that was a different era. Again, for whatever reason, today’s game no longer sees the 332 IP season from a starter. King Hernandez, for whatever reason, may have been overworked at 232 IP during 2012.
*
I did not consider IP/game. Some games the SP’er pitched as few as 3.2 innings
When M’s pitchers allowed:
less than 3 runs, the team’s record was 54W/26L
3 runs, the team’s record was 11W/16L
more than 3 runs, the team’s record was 10W/45L
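To put those three buckets on a common scale, here is a minimal Python sketch that turns each record into a winning percentage. The bucket labels and the `win_pct` helper are just illustrative names, and the records are the ones quoted in this comment, not independently checked.

```python
# Team records quoted in this comment, keyed by runs allowed in a game.
records = {
    "fewer than 3 runs": (54, 26),
    "exactly 3 runs": (11, 16),
    "more than 3 runs": (10, 45),
}

def win_pct(wins, losses):
    """Winning percentage as a fraction of games played."""
    return wins / (wins + losses)

for bucket, (wins, losses) in records.items():
    print(f"{bucket}: {wins}-{losses}, {win_pct(wins, losses):.3f}")

# Sanity check: the three buckets should cover a full season.
total_games = sum(wins + losses for wins, losses in records.values())
print("Total games:", total_games)
```

The buckets add up to 162 games, and the winning percentage drops from .675 to .407 to .182 as runs allowed go up.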
*
In order for a SP’er to be really effective for this team, he must limit his ER/game to 2 or less.
King Hernandez did that 21 out of 33 times. Since this is no longer the era of the 332 IP/yr, the 232 innings that he pitched may have had an effect on the 12 games in which he gave up 4 or more runs (there were no games in which he allowed exactly 3 ER’s).
*
The other SP’ers for the M’s allowed 2 or fewer runs in a game 59 times. That is a high number that leads to several other questions.
Was the park a factor? Was the travel of other teams a factor? Were the other pitchers really that good? And so on?
*
King’s record was 14 and 9. Had he lowered the number of games in which he allowed 4 or more runs from 12 to 8, there is a good chance his W/L record would have been closer to 17W/6L. That raises the question: could he have lowered the number of those games by pitching fewer innings, which may have lowered his fatigue level on some of those days? Again, for whatever reason, this is not the era of the 332 IP or even the 232 IP season.
*
Was King good enough to effectively pitch for the 2012 M’s based upon his W/L record? In a bit less than 2 of every 3 games he started, he was very much good enough. He also contributed to games in which he did not pitch by resting the BP with his 232 IP. Perhaps Koufax and Carlton played on better teams? I don’t know. However, they were better for their teams than Hernandez was for the 2012 M’s. But, for whatever reason, this is a different era. Based upon a 14W and 9L ratio on a team that requires its SP’ers to allow 2 or fewer ER’s in a game, yes he was good enough… but not Koufax/Carlton good enough.