The 25 Best Hitting Seasons by a Catcher:

A Comparison of Runs Created, Batting Runs, and Runs Generated

by Jim McMartin

Paper presented at the 1998 Allen Roth SABR meeting, Los Angeles, August 1, 1998.
(Updated Post-Season 1998)

Please send any comments and/or criticisms to Jim McMartin

 

Many argue that the 1997 MVP award in the National League should have gone to catcher Mike Piazza of the Los Angeles Dodgers rather than to outfielder Larry Walker of the Colorado Rockies. While both hitters had outstanding seasons in 1997, those in Piazza’s corner point out (1) catching is the most demanding fielding position and (2) Piazza’s hitting numbers in 1997 were surely the all-time best by any catcher.

While there is no doubt that catching is the most demanding position in baseball, it is something else to assert that Piazza’s 1997 performance was the best hitting season by any catcher in baseball history. After all, a guy named Johnny Bench had some pretty good seasons. And some other catchers in the Hall of Fame--Mickey Cochrane, Gabby Hartnett, Bill Dickey, Roy Campanella, and Yogi Berra--all swung a pretty good stick during their careers. The main purpose of this research is to determine the top 25 hitting seasons by a catcher.

One of the joys of baseball is that discussions of "who’s the best?" are never-ending. Comparing everyday (position) players involves an assessment of the players’ strengths on such dimensions as hitting for average, hitting for power, clutch hitting, speed and base stealing skills, fielding skills, as well as intangible assets such as leadership and being a "team" player. Moreover, since standards of excellence have changed dramatically over the years, we need to relativize a hitter’s or pitcher’s single-season performance to the league average (sort of like grading on the curve) whenever we make cross-era comparisons. Because baseball skills are multi-dimensional and it is difficult, if not impossible, to accurately measure and weigh each dimension, global discussions regarding the best players of all-time are bound to be ongoing.

However, valid comparisons among position players are possible if we narrow our sights to focus on just one aspect of baseball performance, such as hitting. How do we determine who is the best hitter? There are numerous ways to do this and, at present, sabermetricians have yet to agree on a common comprehensive yardstick. In this paper I compare two well-known, comprehensive measures, Runs Created and Batting Runs, and I compare both to a measure I recently developed called Runs Generated.

Comparing the Measures

Based on research described below, the major conclusion I draw is that while Runs Created and Batting Runs are slightly more accurate measures of hitting, Runs Generated is more accessible ... that is, it is considerably easier than the other two for interested fans to compute for themselves. So before we get to the question of which catchers had the best hitting seasons, we first need to take a look at the strengths and weaknesses of these three measures of hitting.

Bill James, the father of sabermetrics, has given us Runs Created as a way to measure hitting. The original formula for Runs Created is [(hits + walks)(total bases)] divided by [AB + BB]. This elegant formula has been replaced by increasingly elaborate ones that take into account more than hitting skills (stolen bases, for example). James’s "technical version" of RC, for instance, involves this formula:

RC (tech) = [(H + BB + HBP - CS - GIDP)(TB + .26[BB - IBB + HBP] + .52[SH + SF + SB])] divided by [AB + BB + HBP + SH + SF]
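The two Runs Created formulas quoted so far can be sketched in a few lines of Python. This is only an illustration of the arithmetic as printed here; the function and parameter names are my own, not James’s.

```python
def runs_created_basic(h, bb, tb, ab):
    """Original Runs Created: (hits + walks)(total bases) / (AB + BB)."""
    return (h + bb) * tb / (ab + bb)


def runs_created_tech(h, bb, hbp, cs, gidp, tb, ibb, sh, sf, sb, ab):
    """'Technical version' of Runs Created, as quoted in the text."""
    on_base = h + bb + hbp - cs - gidp
    advancement = tb + 0.26 * (bb - ibb + hbp) + 0.52 * (sh + sf + sb)
    opportunities = ab + bb + hbp + sh + sf
    return on_base * advancement / opportunities


# Example: the 1998 Piazza line from the update at the end of this paper
# (AB 561, H 184, TB 320, BB 58) under the original formula.
print(round(runs_created_basic(h=184, bb=58, tb=320, ab=561), 1))
```

The technical version needs eleven inputs, several of which (GIDP, IBB, SH, SF) are not printed in a typical box score, which is the accessibility problem discussed in this section.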

James’s 1998 version of Runs Created, Formula #24, found in STATS’ All-Time Major League Handbook, involves 12 variables and 8 constants. It is quite a challenging task for anyone to apply his newest formula to evaluate hitters.

Runs Created is composed of three factors:

A Factor = H + W + HB - GIDP - CS

B Factor = TB + ((W + HB - IBB)*.24) + (SB*.62) + ((SH + SF)*.5) - (.03K)

C Factor = AB + W + HB + SH + SF

Runs Created = [((A + (2.4C))(B + 3*C)) divided by 9C] - .9C
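As a rough sketch, the three factors and the final formula above translate into Python as follows. It covers only the terms printed here; the situational adjustments mentioned in the next paragraph are left out because their form is not specified. Names are illustrative, and the stat line in the example is hypothetical.

```python
def runs_created_factors(h, w, hb, gidp, cs, tb, ibb, sb, sh, sf, k, ab):
    """Runs Created from the A, B, and C factors as printed above.

    Omits the unspecified adjustments for home runs with men on base
    and batting average with men in scoring position.
    """
    a = h + w + hb - gidp - cs
    b = tb + 0.24 * (w + hb - ibb) + 0.62 * sb + 0.5 * (sh + sf) - 0.03 * k
    c = ab + w + hb + sh + sf
    return ((a + 2.4 * c) * (b + 3 * c)) / (9 * c) - 0.9 * c


# Hypothetical stat line, for illustration only.
rc = runs_created_factors(h=150, w=50, hb=0, gidp=0, cs=0, tb=250,
                          ibb=0, sb=0, sh=0, sf=0, k=0, ab=500)
print(round(rc, 1))
```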

Runs Created is an "absolute" number that is useful for comparing hitters within the same season. Whether it is useful for cross-era comparisons is another matter. In any case, James’s admirable interest in precise measurement has resulted in such complexity that RC, regrettably, rules out any chance of active participation by interested fans. Moreover, because RC includes unspecified adjustments for "Home Runs with Men on Base" and "Batting Average with Men in Scoring Position," the interested fan cannot compute it for himself or herself. RC has become a totally inaccessible measure of hitting. The RC data for catchers were taken from James’s 1998 book.

Pete Palmer and John Thorn, based on their linear-weights solution, have proposed a measure of hitting they call Batting Runs. Batting Runs (see page 571 of the 5th ed. of Total Baseball) requires knowledge of 11 variables and 9 constants:

Batting Runs = (.47)1B + (.78)2B + (1.09)3B + (1.4)HR + (.33)(BB + HB) + (.30)SB - (.60)CS - (.25)(AB-H) - .50 (OOB)

This last term, OOB, may not be familiar to many fans. It stands for Outs On Base, and it equals Hits + Walks + Hit by a Pitch - Left on Base - Runs - Caught Stealing.
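For readers who want to see the arithmetic, here is a minimal Python sketch of the Batting Runs formula and the Outs On Base term exactly as quoted above. The function names are mine, and the stat line in the example is hypothetical, not any real player’s.

```python
def outs_on_base(h, bb, hb, lob, r, cs):
    """OOB = Hits + Walks + Hit by Pitch - Left on Base - Runs - Caught Stealing."""
    return h + bb + hb - lob - r - cs


def batting_runs(singles, doubles, triples, hr, bb, hb, sb, cs, ab, h, oob):
    """Linear-weights Batting Runs, using the constants quoted in the text."""
    return (0.47 * singles + 0.78 * doubles + 1.09 * triples + 1.40 * hr
            + 0.33 * (bb + hb) + 0.30 * sb - 0.60 * cs
            - 0.25 * (ab - h) - 0.50 * oob)


# Hypothetical stat line, for illustration only.
br = batting_runs(singles=100, doubles=30, triples=5, hr=20, bb=60, hb=5,
                  sb=10, cs=5, ab=550, h=155, oob=10)
print(round(br, 2))
```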

Like Runs Created, Batting Runs is a measure of total offensive contribution rather than pure hitting. And like Runs Created, it is not easy for the average fan to compute for himself or herself (e.g., who knows how many men Mike Piazza left on base in 1997?). Batting Runs differs in one very important way from Runs Created: in The Hidden Game of Baseball, Thorn and Palmer explain that Batting Runs tells us how many runs a hitter contributed above or below the league average. A score of zero batting runs means that the hitter was average for that season. Thus, Batting Runs may be directly compared across seasons since league-wide influences in number of games played, run-scoring ease, or both, have been controlled. This control of league-wide effects is lacking in James’s Runs Created. The Batting Runs data for catchers were obtained from Total Baseball’s web site (www.totalbaseball.com).

I praise Bill James, Pete Palmer and John Thorn for providing us with excellent standards to compare newer systems against. To say that their work is extremely valuable is an understatement. Accurate measures of hitting and total offensive contributions are clearly needed if sabermetrics is to progress. But because both Runs Created and Batting Runs are not easily computable by the interested fan, I wanted to find an adequate measure of hitting that the average fan could use during the season. I was willing to sacrifice some degree of accuracy to gain a more user-friendly measure. Runs Generated is my solution to this problem. Runs Generated is easy to compute. It involves only 4 variables (total bases, walks, at bats, hits) and 2 constants:

Runs Generated = (.35)(Total Bases + Walks) - (.25)(At Bats - Hits)

With Runs Generated, a hitter gets .35 credit for each base he attains (Total Bases plus Walks) and he is penalized -.25 for each out he makes (At Bats - Hits). The .35 weight for Bases Gained was determined by exploratory studies on team data, 1990-1996, using linear regression. The -.25 cost for making an out I took from the Batting Runs formula (see above).
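The formula is simple enough to fit in one short Python function. This is just a sketch of the arithmetic above; the stat line in the example is a made-up one chosen for round numbers.

```python
def runs_generated(total_bases, walks, at_bats, hits):
    """Runs Generated = .35(Total Bases + Walks) - .25(At Bats - Hits)."""
    bases_gained = total_bases + walks   # each base is worth +.35
    outs_made = at_bats - hits           # each out costs .25
    return 0.35 * bases_gained - 0.25 * outs_made


# Hypothetical hitter: 500 AB, 150 H, 250 TB, 60 BB.
# .35 * 310 = 108.5 and .25 * 350 = 87.5, so RG is about 21.
print(round(runs_generated(total_bases=250, walks=60, at_bats=500, hits=150), 1))
```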

The linear regression studies at the team level indicated that the total number of team runs in a season was best predicted by the equation,

Runs Generated = (.35)(Total Bases + Walks) - (.25)(At Bats - Hits) + K,

where K = the average number of runs scored by a team that season. When Runs Generated is applied to the individual player, K is simply ignored. I interpret Runs Generated as being similar to Batting Runs in that it tells us how many runs above or below average a given hitter’s performance is. As we will see below, RG and BR give very similar numbers for individual hitters. Thus, both Batting Runs and Runs Generated may be directly used for cross-era comparisons, a feature that may or may not be true of Runs Created.

Evaluating the Measures

How can we tell if any system that supposedly measures hitting is any good? The one great advantage of studying measures of hitting ability is that we have a known criterion variable: how many runs the team scored. Any measure of hitting that applies to an individual may be applied to the team as a whole. Thus, we can compare any team measure of hitting against how many runs the team scores to see how well it "tracks" actual runs scored. Obviously, the closer a measure tracks (or "predicts" in the language of statistics) actual runs scored, the better that measure is.

Accuracy of tracking is measured in two ways:

(1) We find the difference between how many runs a team actually scores and how many runs a measure "predicts" it should score. We then compute the standard deviation of those differences. The standard deviation is like earned run average: the smaller the better.

(2) We correlate runs scored with predicted runs. The higher the correlation coefficient, the better is the measure. When we square that correlation coefficient, we get the percentage of variation of team runs that is accounted for (or explained by) that measure. The higher the percentage, the better the measure.
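Both accuracy checks can be sketched in Python using only the standard library. The team run totals in the example are invented for illustration, and the use of the sample (n - 1) standard deviation is my reading of "sample standard deviation" in the results that follow.

```python
from math import sqrt
from statistics import mean, stdev


def tracking_accuracy(actual_runs, predicted_runs):
    """Return (SD of prediction errors, squared correlation coefficient)."""
    # (1) Standard deviation of actual-minus-predicted differences.
    diffs = [a - p for a, p in zip(actual_runs, predicted_runs)]
    error_sd = stdev(diffs)  # sample SD (n - 1 denominator)

    # (2) Pearson correlation between actual and predicted runs, squared.
    ma, mp = mean(actual_runs), mean(predicted_runs)
    sxy = sum((a - ma) * (p - mp) for a, p in zip(actual_runs, predicted_runs))
    sxx = sum((a - ma) ** 2 for a in actual_runs)
    syy = sum((p - mp) ** 2 for p in predicted_runs)
    r = sxy / sqrt(sxx * syy)
    return error_sd, r * r


# Invented team totals for four teams, for illustration only.
actual = [700, 750, 650, 800]
predicted = [710, 740, 660, 790]
sd, r_squared = tracking_accuracy(actual, predicted)
print(f"SD of errors: {sd:.1f}   variance explained: {r_squared:.1%}")
```

A smaller error SD and a larger squared correlation both indicate a measure that tracks team runs more closely.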

It is clear that Runs Generated is the most user-friendly of these three measures, but how accurate is it? To answer this important question, I did a study of how well these measures predicted each team’s runs in each league over a 20-year period, 1970-1989 (team RC and BR data taken from Total Baseball, 5th Edition). Here is what I found:

Average Sample Standard Deviation

Runs Created = 21.1
Batting Runs = 22.8
Runs Generated = 24.4

Average Squared Correlation Coefficient

Runs Created = 87.8%
Batting Runs = 85.3%
Runs Generated = 83.9%

These data mean that Runs Created is the most accurate and Runs Generated is the least accurate measure. This is not a surprise. But what is noteworthy is the finding that Runs Generated accounts for almost 84% of the variance in runs scored, a mere 3.9 percentage points less than the most accurate measure, Runs Created. This means that Runs Generated allows one to compute a hitter’s value easily without sacrificing very much accuracy. Runs Generated is most applicable to those hitters whose offensive contributions do not include stealing bases or getting hit by a pitch, since it ignores those factors. Runs Generated is a measure of pure hitting rather than total offensive contribution.

Putting It All Together

I applied these 3 measures of hitting to the top seasons of (1) all Hall-of-Fame catchers and to those of (2) Mickey Tettleton, Carlton Fisk, Gary Carter, Ted Simmons, Darrell Porter, Thurman Munson, and Todd Hundley. To qualify, the player had to have caught at least 100 games that season. There were 42 seasons by a catcher with Batting Runs of 25 or more. How can we identify which ones are the top 25?

The obvious way of finding the top 25 seasons would be to add the hitter’s RC, BR, and RG scores and rank order the total. However, simple addition is not the correct way to proceed because these measures are on different scales. The correct way to combine scores from different scales is to convert each score to its "standard score" (also called a z score) and then average the standard scores. Using the data in the 42-season sample, I converted each hitter’s RC, BR, and RG score to its standard score and then found the average standard score across the three measures. The top 25 hitting seasons, ranked by their average z score, are shown in Table 1, which also shows the catcher’s raw scores and the sum of the three scores.
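The standard-score procedure can be sketched in Python as follows. Two caveats: the example uses only the top three rows of Table 1 rather than the full 42-season sample, so its z scores will not match the table’s, and the choice of the sample (n - 1) standard deviation is an assumption, since the text does not say which form was used.

```python
from statistics import mean, stdev


def z_scores(values):
    """Convert raw scores to standard scores: (score - mean) / SD."""
    m, s = mean(values), stdev(values)  # sample SD: an assumption
    return [(v - m) / s for v in values]


def average_z(*measures):
    """Average each season's z scores across several measures."""
    z_columns = [z_scores(m) for m in measures]
    return [mean(season) for season in zip(*z_columns)]


# Top three rows of Table 1 only (a 3-season toy sample, not the real 42).
seasons = ["Piazza 1997", "Piazza 1996", "Campanella 1953"]
rc = [137, 119, 120]
br = [62, 47, 43]
rg = [60, 45, 45]

ranked = sorted(zip(seasons, average_z(rc, br, rg)),
                key=lambda pair: pair[1], reverse=True)
for name, z in ranked:
    print(f"{name:<16} {z:+.2f}")
```

Because each measure is rescaled to mean 0 and standard deviation 1 before averaging, a measure with large raw numbers (such as RC) cannot dominate the combined ranking.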

Table 1. Top 25 Hitting Seasons by a Catcher: As Ranked by the Average Standard Score of Runs Created, Batting Runs, and Runs Generated

Rank Name Year RC BR RG Sum Avg z score
1 Mike Piazza 1997 137 62 60 259 3.27
2 Mike Piazza 1996 119 47 45 211 1.53
3 Roy Campanella 1953 120 43 45 208 1.36
4 Bill Dickey 1937 125 37 43 205 1.12
5 Gabby Hartnett 1930 119 35 47 201 1.02
6 Darrell Porter 1979 119 41 38 198 0.97
7 Johnny Bench 1972 110 44 39 193 0.88
8 Mickey Cochrane 1933 105 42 42 189 0.76
9 Roy Campanella 1951 110 42 38 190 0.75
10 Johnny Bench 1970 112 36 36 184 0.47
11 Carlton Fisk 1977 110 40 32 182 0.45
12 Bill Dickey 1936 106 35 40 181 0.39
13 Mickey Cochrane 1932 115 31 36 182 0.30
14 Mickey Cochrane 1931 105 37 34 176 0.23
15 Bill Dickey 1938 106 30 39 175 0.10
16 Roy Campanella 1955 99 37 35 171 0.09
17 Yogi Berra 1950 122 26 29 177 0.01
18 Mickey Cochrane 1930 112 30 31 173 -0.03
19 Johnny Bench 1974 111 33 27 171 -0.06
20 Gary Carter 1982 100 37 28 165 -0.12
21 Mike Piazza 1995 91 37 35 163 -0.13
22.5 Mike Piazza 1993 101 34 30 165 -0.16
22.5 Ted Simmons 1977 103 33 30 166 -0.16
24 Yogi Berra 1956 109 28 29 166 -0.28
25 Ted Simmons 1975 100 36 25 161 -0.29

Table 1 shows that all three measures strongly agree that Mike Piazza’s 1997 season was far and away the best by any catcher in history. No contest. Way to go, Mike!

Table 1 also reveals the serious weakness of Runs Created when it is used for cross-era comparisons. Consider the highlighted RC numbers for Yogi Berra’s 1950 season and Mike Piazza’s 1995 (strike) season. According to Runs Created, Yogi’s RC of 122 in 1950 is the 3rd best hitting season of any catcher and Piazza’s RC of 91 in 1995 is the worst of the 25 seasons shown in Table 1. Both of these results differ greatly from those found by the other two measures, which are both sensitive to league-wide conditions. Both Batting Runs and Runs Generated found that Piazza’s 1995 season was superior to Berra’s 1950 season: Batting Runs sees Piazza’s 1995 season as 11 runs better than Berra’s, while Runs Generated sees Piazza’s 1995 season as 6 runs better than Berra’s 1950 season.

I interpret this discrepancy to mean that Runs Created is invalid for cross-era comparisons. Berra’s 1950 performance was enhanced by the fact that hitters in the American League that year had an easier time of it than in either 1949 or 1951. The average AL team in 1950 scored over 5 runs per game, more than one-third of a run more than in 1949 and 1951. Batting Runs and Runs Generated are sensitive to league-wide averages and so Berra’s fine performance is not as far above the league average that year as the same performance would have been in 1949 or 1951. Runs Created is insensitive to league-wide changes and incorrectly credits Berra with an outstanding (rather than a very good) hitting season.

The same argument holds true when assessing the discrepancy regarding Piazza’s performance in the 1995 strike-shortened season. He created only 91 runs that season because teams played only 144 games that year. But Batting Runs and Runs Generated both correctly identify that Mike was responsible for more runs above his league average that year than Yogi was above his league average in 1950. In fact, Piazza in 1995 had a better season in 144 games than Berra did in 154 games in 1950. But Runs Created fails to detect this and, worse, concludes the opposite, that Berra created 31 runs more than Piazza. Whenever we make cross-era comparisons, we always need to take the league averages into consideration in order to avoid making drastically erroneous conclusions.

So how do these 42 very good seasons by catchers compare to each other if we throw out Runs Created and just use Batting Runs and Runs Generated as our measures of hitting performance? Table 2 shows the top 25 hitting seasons by a catcher as ranked by the average of the standard scores of Batting Runs and Runs Generated. The raw scores of RC, BR, and RG are the same as those in Table 1, but now the sum column refers to the sum of the raw scores of Batting Runs and Runs Generated. It can be seen at a glance that the sum of these two measures and the average of their standard scores correspond very closely.

Table 2. The Top 25 Hitting Seasons by a Catcher: As Ranked by the Average Standard Score of Batting Runs and Runs Generated

Rank Name Year RC BR RG Sum Avg z score
1 Mike Piazza 1997 137 62 60 122 3.54
2 Mike Piazza 1996 119 47 45 92 1.67
3 Roy Campanella 1953 120 43 45 88 1.37
4 Mickey Cochrane 1933 105 42 42 84 1.11
5 Johnny Bench 1972 110 44 39 83 1.08
6 Gabby Hartnett 1930 119 35 47 82 0.92
7 Roy Campanella 1951 110 42 38 80 0.88
8 Darrell Porter 1979 119 41 38 79 0.83
9 Bill Dickey 1937 125 37 43 80 0.81
10 Bill Dickey 1936 106 35 40 75 0.51
11 Carlton Fisk 1977 110 40 32 72 0.44
12 Johnny Bench 1970 112 36 36 72 0.37
13.5 Mike Piazza 1995 91 37 35 72 0.36
13.5 Roy Campanella 1955 99 37 35 72 0.36
15 Mickey Cochrane 1931 105 37 34 71 0.31
16 Bill Dickey 1938 106 30 39 69 0.08
17.5 Mickey Cochrane 1932 115 31 36 67 0.00
17.5 Gary Carter 1982 100 37 28 65 0.00
19 Mike Piazza 1993 101 34 30 64 -0.10
20 Ted Simmons 1977 103 33 30 63 -0.19
21 Carlton Fisk 1972 87 37 24 61 -0.25
22 Ted Simmons 1975 100 36 25 61 -0.26
23 Todd Hundley 1997 86 29 34 63 -0.28
24 Ted Simmons 1978 98 33 27 60 -0.33
25 Mickey Cochrane 1930 112 30 31 61 -0.37

As can be seen by comparing Tables 1 and 2, eliminating Runs Created does not change the top three seasons. Mike Piazza again tops the list for his outstanding 1997 season and his 1996 season again ranks second. Roy Campanella’s 1953 season is third best under both procedures.

There are two major differences between Tables 1 and 2. The most important difference is that three seasons have been replaced. Yogi Berra’s 1950 and 1956 seasons no longer rank in the top 25, nor does Johnny Bench’s 1974 season. They are replaced by three seasons new to Table 2: Carlton Fisk in 1972, Todd Hundley in 1997, and Ted Simmons in 1978.

The reason for these differences is obvious: these new kids on the block did not fare nearly as well in Runs Created as the players they replaced, while the newbies’ BR and RG scores equaled or exceeded those they replaced. (Please note that the RG figures in Tables 1 and 2 are shown as whole numbers but that their standard scores were derived from RGs rounded to the first decimal place. This is why Ted Simmons’s 1978 season is fractionally superior to Johnny Bench’s 1974 season.)

The second major difference in the two tables is the considerable rearrangement of certain hitters’ seasons. The biggest change in evaluation is Mike Piazza’s 1995 season. It jumps from being ranked 21st best all-time in Table 1 to being tied for 13th best with Roy Campanella’s 1955 season in Table 2. It is worth noting that Piazza’s and Campy’s BR are identical, as are their RG scores.

A rearrangement on the down side is seen in Mickey Cochrane’s 1930 season: ranked 18th best when Runs Created was part of the evaluation, Cochrane’s 1930 season ranks 25th best under the combined measurement of Batting Runs and Runs Generated. The reason for this drop in rank, once RC is no longer used, is also obvious. In 1930 the average team in the American League scored a whopping 5.4 runs per game. It must have been a hitter’s paradise! Runs Created, as we saw earlier, does not take such league-wide effects into consideration (unlike Batting Runs and Runs Generated). Thus, it is quite right for Cochrane’s 1930 season not to be rated so highly as RC would have us believe, since part of his excellent hitting can be attributed to the relative ease of hitting that season. Batting Runs and Runs Generated both give a more realistic appraisal of Cochrane’s relative placement among these terrific seasons. Thus, if you want to compare players’ hitting performances from one year to the next, Runs Created may be extremely misleading since it ignores potential seasonal changes in run producing ability for the leagues as a whole.

Batting Runs and Runs Generated do control for league-wide changes and both seem to be reasonably accurate when giving comparative information about different hitters who played in different years. Of these two measures, which should you use? If you have access to Total Baseball (either the book or the web site), the easiest thing is to simply look up the player’s Batting Runs. If this resource is not available, or you wish to determine for yourself how your favorite hitter is doing, all you need to know is the hitter’s current At Bats, Hits, Total Bases, and Walks. Then you just plug those numbers into the formula:

Runs Generated = .35(Total Bases + Walks) - .25(At Bats - Hits)

to get a reasonably accurate evaluation of that hitter’s season in the modern live-ball era (unless the player is a great base stealer--always use BR for such a hitter). Runs Generated is an easy-to-use tool that gives useful information about a hitter’s ability to help his team score runs. Under current investigation is the question of the optimal Runs Generated equation for the dead-ball era. The above equation applies to all hitters from 1920 to the present.

One final topic: an interesting application for either Runs Generated or Batting Runs is toward the construction of "age norms" for hitters who played the same position. By noting how old each player was in a season, we could answer such questions as "What was the best hitting season by a 30-year-old catcher?" Or, "How does Mike Piazza’s career-to-date compare with the best hitting seasons by any catcher at the same age?"

Table 3 shows the best hitting seasons by a catcher at the same age, as measured in Batting Runs, from ages 20 to 37 (their "baseball age" as of July 1 each year).

Table 3. Best Hitting Seasons by a Catcher, Ages 20 to 37.

Age Name Season Batting Runs
20 Johnny Bench 1968 14
21 Johnny Bench 1969 23
22 Johnny Bench 1970 36
23 Gary Carter 1977 24
24 Johnny Bench 1972 44
25 Ted Simmons 1975 36
26 Mike Piazza 1995 37
27 Mike Piazza 1996 47
28 Mike Piazza 1997 62
29 Roy Campanella 1951 42
30 Mickey Cochrane 1933 42
31 Roy Campanella 1953 43
32 Mickey Cochrane 1935 27
33 Roy Campanella 1955 37
34 Gabby Hartnett 1935 30
35 Carlton Fisk 1983 24
36 Gabby Hartnett 1937 31
37 Carlton Fisk 1985 13

As can be seen in Table 3, Johnny Bench was the best hitting catcher at ages 20, 21, 22, and 24. Roy Campanella and Mike Piazza each currently hold the highest mark at three different ages. It may be quite a long while before baseball sees another 28-year-old catcher who has a finer hitting season than Mike Piazza’s in 1997.


1998 Post-Season Update

Who were the best hitting catchers in 1998? As of this writing (mid-November 1998), we still do not have the official 1998 BR and RC numbers for major league hitters. Since Runs Generated is easy to compute, here are the Runs Generated by the only catchers in the major leagues last season to amass at least 502 plate appearances:

Name Team AB Hits Total Bases Walks RG
Mike Piazza Mets 561 184 320 58 38
Ivan Rodriguez Rangers 579 186 297 32 17
Jason Kendall Pirates 535 175 253 51 16
Javy Lopez Braves 489 139 264 30 15
Charles Johnson Dodgers 459 100 175 45 -13
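Readers who want to check the RG column can reproduce it with a short Python script. This is a sketch that simply applies the Runs Generated formula to the AB, Hits, Total Bases, and Walks figures listed in the table; the variable names are my own.

```python
# (name, at bats, hits, total bases, walks), copied from the table above
CATCHERS_1998 = [
    ("Mike Piazza", 561, 184, 320, 58),
    ("Ivan Rodriguez", 579, 186, 297, 32),
    ("Jason Kendall", 535, 175, 253, 51),
    ("Javy Lopez", 489, 139, 264, 30),
    ("Charles Johnson", 459, 100, 175, 45),
]


def runs_generated(at_bats, hits, total_bases, walks):
    """Runs Generated = .35(Total Bases + Walks) - .25(At Bats - Hits)."""
    return 0.35 * (total_bases + walks) - 0.25 * (at_bats - hits)


rg_by_name = {name: runs_generated(ab, h, tb, bb)
              for name, ab, h, tb, bb in CATCHERS_1998}

# Print unrounded and rounded values, best season first.
for name, rg in sorted(rg_by_name.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<16} {rg:7.2f}  ->  {round(rg)}")
```

Rounded to whole numbers, the results agree with the RG column as printed.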

In spite of playing for three teams in 1998 (Dodgers, Marlins, and Mets), Mike Piazza is the best hitting catcher in the major leagues for the sixth straight season. His RG score of 38 ties him for the all-time 10th best hitting season by a catcher with Darrell Porter’s 1979 and Roy Campanella’s 1951 seasons. Now that he has signed the richest contract in baseball history, it should surprise no one if Mike sets a new record next year for the best hitting season by any 30-year-old catcher.



