6.20.2007

The beauty of the Internet, in one post

There are a lot of reasons I love being a Mariners fan. (Yes, you read that right.) One of the big ones is that, while I have no empirical evidence to support this claim, I believe we are the most information-savvy bunch in the major leagues.

I love reading other blogs, especially U.S.S. Mariner and Geoff Baker's blog at The Seattle Times. (You can find the others I love on the right-hand side.) I know a lot about sports, but I love to learn, and I always come away a better-educated fan when I visit these sites. The USSM crew's knowledge of baseball, built as sabermetricians, and Baker's knowledge of baseball and the Mariners, built as a beat writer, make for tons of good information.

The two had basically coexisted separately until recently, when their blogs crossed paths in a very public way.

It's why I love the Internet.

It started with this post by Baker, on the likelihood that a team with an ERA higher than 4.50 (read: the Mariners) will make the playoffs.

I received a very interesting email today from a friend of this blog, Jack Lattemann, who has done an exhaustive study of whether teams with an earned run average of 4.50 or higher can even post winning records, let alone contend for a playoff spot. Jack has graciously allowed me to pass on his findings. They don't look good on the Mariners, who have a 4.84 team ERA despite a rock solid bullpen. He found that no team before 1969 had qualified for the playoffs. Not surprising, given the two-league format. There were a few more, post-1969, that made it. During the two-division format (four teams making the playoffs) from 1969-1993, the only playoff team with a 4.50 ERA or higher was the 1987 World Series champion Minnesota Twins, who finished 85-77 with a 4.63 ERA.

Dave Cameron over at U.S.S. Mariner -- after praising Baker for his coverage on the blog and the work he's done to interact with fans and the blogging community -- took exception to this kind of analysis, as did a number of commenters on Baker's blog. And with good reason.

Where to start with this paragraph - how about with the glaring, obvious problem, and one that I’ve been railing on for years here - Earned Run Average, by itself, is not any real indicator of pitching quality. It’s just not. I know it’s commonly accepted as the be-all, end-all pitching statistic, but the reliance on this inherently problematic stat has led to more bad analysis over the years than just about any other statistic out there. Using ERA to draw broad conclusions about pitching ability is a great way to be wrong on a large scale.

In reality, ERA kinda sorta measures the ability of the team’s run prevention skills when a specific pitcher is on the hill. ERA doesn’t attempt to separate responsibility for said run prevention between pitcher and defenders. It doesn’t attempt to take into account the context of the run scoring environment. And, just in case those weren’t big enough problems (they are), it introduces the biases of ballpark specific official scorers by excluding “unearned runs”, which are often classified as such due to arbitrary decisions on what constitutes an error.
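
To make Cameron's point about unearned runs concrete, here is a minimal sketch comparing ERA with plain runs allowed per nine innings. The two staffs and their totals are hypothetical numbers made up for illustration; only the formula (nine times earned runs, divided by innings pitched) is the standard definition of ERA.

```python
def era(earned_runs: int, innings_pitched: float) -> float:
    """Earned run average: earned runs allowed per nine innings."""
    return 9 * earned_runs / innings_pitched


def ra9(runs_allowed: int, innings_pitched: float) -> float:
    """All runs allowed per nine innings, earned or not."""
    return 9 * runs_allowed / innings_pitched


# Hypothetical season totals for two staffs over the same number of innings.
INNINGS = 1440.0
staff_a = {"runs": 780, "earned": 700}  # scorers charged many runs as unearned
staff_b = {"runs": 740, "earned": 720}  # scorers charged few runs as unearned

for name, staff in (("A", staff_a), ("B", staff_b)):
    print(
        f"Staff {name}: ERA {era(staff['earned'], INNINGS):.2f}, "
        f"RA/9 {ra9(staff['runs'], INNINGS):.2f}"
    )
```

In this made-up case Staff A has the lower ERA but actually allowed more runs, and the only difference is how many runs the official scorers labeled unearned.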

Cameron goes on to explain that you have to account for the run-scoring environment of each era before making any kind of cross-era comparison.

(P)icking a random ERA number that reflects “bad pitching” and applying it to any context is going to result in a list that means absolutely nothing. If you want to use ERA to evaluate a pitching staff, you’d be forced to come to the conclusion that the Washington Nationals currently have a better pitching staff than the Chicago White Sox. After all, they have a lower ERA. Of course, everyone understands that there’s a huge difference between pitching in RFK stadium against National League hitters and not facing the DH and facing American League hitters in New Comiskey park. We wouldn’t expect Mike Bacsik to post a 4.59 ERA if he was traded to the White Sox. No one would.

Baker's response on his blog?

Back to yesterday's post, I actually understated the infrequency with which teams have made the playoffs with an ERA of 4.50 or more. While the number of playoff teams with an ERA that high, since the advent of the wild-card, was 13 out of 96, the number of teams making the playoffs with an ERA that high was just 13 out of 354. One in 27. Is this all just a coincidence, as some suggest? Is there really no direct relation between ERA and making the playoffs? Well, let's just say that the more runs you allow, the more you have to score to win. The more wins, the easier to make the playoffs. Could we get scientifically more precise? Of course.

But I had no problem with Lattemann using an ERA of 4.50 and higher as a measuring stick. The average ERA in baseball last year was 4.44 and it's averaged out to roughly that since this decade began. So, anything 4.50 and worse would generally stand to be below average.

Some people objected to using ERA at all, while others say we should have adjusted it for park factors -- which looks at the difference between runs allowed at home versus the road and adjusts statistics accordingly. Well, park factors may have been needed if, say, we'd used ERA to gauge a Cy Young Award race. But not in this case, since we're merely looking at who made the playoffs. In other words, who won more games by scoring more runs than they allowed (or allowing fewer runs than they scored?) Park factors are irrelevant here. You score (and allow) the runs where the games are played and that alone determines who wins. At the end of a season, all MLB cares about in deciding a winner is who won the most games, not how easy or difficult it was to score runs in those games because of ballpark intangibles.
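
For anyone who wants to check the arithmetic behind the "13 out of 96" and "13 out of 354" figures, here is a rough sketch. It assumes the counts cover the 1995 through 2006 seasons and that 354 is every team-season in that span; Baker's post does not spell that out, so treat the season range as my assumption. The count of 13 comes straight from his post.

```python
# Assumption: the wild-card era counted here runs from 1995 through 2006.
# MLB had 28 teams through 1997 and 30 teams from 1998 on.
teams_by_season = {year: (28 if year < 1998 else 30) for year in range(1995, 2007)}

PLAYOFF_BERTHS_PER_SEASON = 8   # three division winners plus a wild card, per league
HIGH_ERA_PLAYOFF_TEAMS = 13     # figure quoted in Baker's post

total_team_seasons = sum(teams_by_season.values())
total_playoff_berths = PLAYOFF_BERTHS_PER_SEASON * len(teams_by_season)

share_of_playoff_teams = HIGH_ERA_PLAYOFF_TEAMS / total_playoff_berths
share_of_all_teams = HIGH_ERA_PLAYOFF_TEAMS / total_team_seasons

print(f"Team-seasons: {total_team_seasons}, playoff berths: {total_playoff_berths}")
print(f"Share of playoff teams with a 4.50+ ERA: {share_of_playoff_teams:.1%}")
print(f"Share of all team-seasons: {share_of_all_teams:.1%} "
      f"(about one in {total_team_seasons / HIGH_ERA_PLAYOFF_TEAMS:.0f})")
```

Under that assumption the totals come out to 96 playoff berths and 354 team-seasons, which matches the numbers in the quote.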

Finally, Baker relented in his next post.

First, let me deal with the issue of the posts from this morning and yesterday. You know, when enough people tell you you're wrong, you start to think that maybe they're on to something. So, I went back and had another look at those numbers using the suggested ERA+ method to account for park factors. The reasoning I listened to, from "Sammy" in the comments thread, Dave at the USS Mariner site, and others, has convinced me that comparing teams using a statistic that could adapt to changing year-to-year run conditions -- rather than a static number that couldn't change -- was the best way to go. ...

And the park factors do matter. I erred in saying they didn't. After all, you compete for the playoffs with other teams. The ability of those teams to score and prevent runs, based on the factors in the ballparks of the day, will have a bearing on it.
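
Here is a small sketch of why an adjusted number tells a different story than a fixed 4.50 cutoff. It uses one common formulation of ERA+ (100 times league ERA times park factor, divided by team ERA, with 1.00 as a neutral park); neither post shows the exact calculation Baker ran, so take the formula as my assumption. Both teams and all of their numbers are hypothetical.

```python
def era_plus(team_era: float, league_era: float, park_factor: float = 1.0) -> float:
    """ERA+ under one common formulation: 100 is league average, higher is better.

    park_factor is a ratio where 1.00 is neutral and values above 1.00
    mean the home park inflates run scoring.
    """
    return 100 * league_era * park_factor / team_era


STATIC_CUTOFF = 4.50  # the fixed threshold used in the original study

# Hypothetical teams from two very different run-scoring environments.
teams = [
    {"name": "High-offense era team", "era": 4.60, "league_era": 4.90, "park": 1.02},
    {"name": "Low-offense era team", "era": 4.00, "league_era": 3.60, "park": 0.97},
]

for team in teams:
    adjusted = era_plus(team["era"], team["league_era"], team["park"])
    flagged = team["era"] >= STATIC_CUTOFF
    print(f"{team['name']}: ERA {team['era']:.2f}, ERA+ {adjusted:.0f}, "
          f"flagged by static cutoff: {flagged}")
```

The first team gets flagged by the 4.50 cutoff even though it was better than its league, while the second slips under the cutoff despite being well below average for its era.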

USSM's response? Big-time kudos.

Geoff has been paid to write about baseball for a long time. He’s a very smart guy, and he puts a lot of work into what he does. And yet, when some fans challenge a point he made, he’s willing to listen, evaluate what they’re saying, and take another look at his stance. Truth is more important than pride, and Geoff proved that in spades.

Baker deserves a lot of credit for taking the time to dive into the issue. He’s already set the bar for Mariner beat-writers to follow, and now he’s just pushing it even higher.

Congratulations to the Seattle Times - you made a fantastic hire.

It's hard to imagine that there are many cities where virtual conversations such as this one take place. And that, my friends, is why I consider myself fortunate to be a Mariners fan and why I love the Internet.
