Streamlining the Pythagorean Theorem of Baseball
Mathematicians test a simplified formula for predicting baseball winning percentages
Is your local Major League Baseball team better than its record suggests? Math researchers are considering alternatives to the Pythagorean Theorem of Baseball, devised by baseball statistician Bill James. Introduced in the 1980s, the "theorem" predicts the winning percentage of a baseball team based on how many runs the team scores and how many runs it allows.
Websites, including ESPN's, often show the Pythagorean prediction of a team's winning percentage during the season. Fans compare the Pythagorean prediction with the actual winning percentage in an effort to determine whether a team is under- or over-achieving.
When a team scores fewer runs than it allows, the Pythagorean model predicts that the team should have a losing record. For the 2001 season, the New York Mets allowed more runs than they scored and had a winning record; they did much better than the Pythagorean model predicted. So they can be considered an overachieving team. Because the Colorado Rockies scored more runs than they allowed but had a losing record, they were possibly an underachieving team.
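The comparison fans make can be sketched in a few lines of Python. This is only an illustration: it uses the squared-runs form of the Pythagorean formula given later in this article, and the function names and the team totals are made-up assumptions, not real statistics.

```python
def pythagorean_pct(runs_scored, runs_allowed):
    """Pythagorean estimate of winning percentage (exponent 2)."""
    return runs_scored**2 / (runs_scored**2 + runs_allowed**2)


def assess(wins, losses, runs_scored, runs_allowed):
    """Compare a team's actual winning percentage with the Pythagorean estimate."""
    actual = wins / (wins + losses)
    predicted = pythagorean_pct(runs_scored, runs_allowed)
    if actual > predicted:
        label = "overachieving"
    elif actual < predicted:
        label = "underachieving"
    else:
        label = "on target"
    return actual, predicted, label


# Hypothetical team: outscored by its opponents, yet finished above .500.
actual, predicted, label = assess(wins=82, losses=80,
                                  runs_scored=650, runs_allowed=700)
print(f"actual {actual:.3f}, predicted {predicted:.3f}: {label}")
```

A team that allows more runs than it scores always gets a prediction below .500, so any winning record in that situation is labeled as overachieving.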
Now, Michael Jones and Linda Tappin of Montclair State University in New Jersey have devised mathematically simpler alternatives to the Pythagorean Theorem of Baseball.
To predict the winning percentage of a team, one new model simply uses a little addition, subtraction, and multiplication. It starts with the total runs scored by the team in all its games (Rs), subtracts the runs the team allows (Ra), and then multiplies the difference by a number called "beta" (B), which is chosen to produce the best results. For the 1969-2003 seasons, the optimal values of B range from 0.00053 to 0.00078, with an average of 0.00065.
Adding 0.5 to the result gives the predicted winning percentage of the team. The resulting formula looks like this:
The estimated winning percentage, P = 0.5 + B*(Rs-Ra)
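As a rough illustration, the linear formula can be written as a one-line Python function. The function name and the sample run totals are hypothetical; the default value of beta is the 1969-2003 average quoted above.

```python
def linear_win_pct(runs_scored, runs_allowed, beta=0.00065):
    """Linear estimate: P = 0.5 + B * (Rs - Ra)."""
    return 0.5 + beta * (runs_scored - runs_allowed)


# Hypothetical team that scores 800 runs and allows 700 in a season.
p = linear_win_pct(800, 700)
print(f"{p:.3f}")  # prints 0.565
```

With the average beta, every 100 runs of difference between runs scored and runs allowed moves the estimate by 0.065, or 6.5 percentage points.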
Because they use only addition, multiplication, and subtraction, these formulas are known as "linear functions," the simplest kind of equations in mathematics.
In contrast, the original Pythagorean Theorem of Baseball is more complex. It uses exponents: runs scored and runs allowed are squared, that is, raised to the second power. The resulting formula is: P = Rs^2/(Ra^2 + Rs^2)
The equation gets its name because of its similarity to the Pythagorean Theorem in geometry, which relates the lengths of the sides in a right triangle as a^2 + b^2 = c^2, where a and b are the shorter sides and c is the longest side (the hypotenuse).
Because both Pythagorean formulas use exponents, they are "nonlinear" equations, which are generally more complex than linear formulas.
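The original Pythagorean formula, P = Rs^2/(Ra^2 + Rs^2), is also short when written in Python. This is a minimal sketch; the function name and the sample run totals are hypothetical.

```python
def pythagorean_win_pct(runs_scored, runs_allowed):
    """Pythagorean estimate: P = Rs^2 / (Ra^2 + Rs^2)."""
    return runs_scored**2 / (runs_allowed**2 + runs_scored**2)


# Hypothetical team that scores 800 runs and allows 700 in a season.
p = pythagorean_win_pct(800, 700)
print(f"{p:.3f}")  # prints 0.566
```

Unlike the linear model, this estimate depends on the ratio of runs scored to runs allowed rather than on their difference, and it can never fall below 0 or rise above 1.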
So was the original Pythagorean Equation of Baseball needlessly complicated? Does the linear equation do just as good a job?
For the baseball seasons from 1969 to 2003, the linear formula works almost as well in its predictions as the original Pythagorean theorem, Jones and Tappin reported at this winter's Joint Mathematics Meetings in Phoenix. The one real exception is the 1981 season, when there was a baseball strike.
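To get a feel for how close the two models are, the sketch below evaluates both on a handful of season totals. The run totals are invented for illustration and are not the researchers' data; beta is set to the 1969-2003 average of 0.00065 quoted earlier, and the function names are hypothetical.

```python
BETA = 0.00065  # average optimal beta quoted in the article for 1969-2003


def linear_pct(rs, ra, beta=BETA):
    """Linear estimate: P = 0.5 + B * (Rs - Ra)."""
    return 0.5 + beta * (rs - ra)


def pythagorean_pct(rs, ra):
    """Pythagorean estimate: P = Rs^2 / (Ra^2 + Rs^2)."""
    return rs**2 / (ra**2 + rs**2)


# Hypothetical (runs scored, runs allowed) totals for a full season.
teams = [(900, 700), (800, 750), (750, 750), (700, 800), (650, 850)]

gaps = []
for rs, ra in teams:
    lin = linear_pct(rs, ra)
    pyt = pythagorean_pct(rs, ra)
    gaps.append(abs(lin - pyt))
    print(f"Rs={rs} Ra={ra}  linear={lin:.3f}  pythagorean={pyt:.3f}")

max_gap = max(gaps)
print(f"largest difference: {max_gap:.4f}")
```

For these made-up totals the two estimates differ by less than one percentage point. Real seasons would need real run totals and a beta fitted to that season to reproduce the researchers' comparison.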
While Tappin and Jones have so far analyzed only whole seasons with their new formula, they are exploring how well it works for seasons in progress. If their formula meets with continued success, you may soon find it on your favorite sports website.
Ben Stein | EurekAlert!