For inferring the biological tree of life, simple is better
UO researchers report the old-fashioned way is more accurate for comparing DNA sequences
A study published this week in Nature (Nov. 21) shows that the most widely used method for constructing the tree of life from DNA sequences is prone to error. However, a simpler method, largely abandoned in recent years, turns out to be far more accurate.
These surprising findings may change the way evolutionary biologists infer the relationships among species - a cornerstone of modern biology - according to researchers at the University of Oregon.
Joe Thornton, a UO assistant professor of biology, and Bryan Kolaczkowski, a graduate student in computer and information science, used a small supercomputer to simulate the evolution of thousands of gene sequences on a hypothetical evolutionary tree. They examined which methods for inferring historical relationships correctly recovered that tree from the simulated data.
They found that a simple logical method known as maximum parsimony is far more accurate under a wide range of conditions than the state-of-the-art technique known as maximum likelihood, which uses a mathematical model of the evolutionary process. "It turns out that the complicated method performs well when reality is simple, but the simpler method is much more accurate when reality is complex," Thornton said.
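For readers unfamiliar with the method, the idea behind maximum parsimony can be sketched in a few lines of Python: among candidate trees, prefer the one that requires the fewest sequence changes. The sketch below is an illustration only, not the software used in the study. It counts changes with the standard Fitch algorithm, and the four taxa (A to D), their six-letter sequences, and the function names are all made up for the example.

```python
def fitch_site(tree, states):
    """Fitch pass for one alignment column.

    tree   -- a taxon name (leaf) or a pair (left_subtree, right_subtree)
    states -- dict mapping taxon name to its nucleotide at this column
    Returns (set of candidate ancestral states, minimum number of changes).
    """
    if isinstance(tree, str):  # leaf
        return {states[tree]}, 0
    left, right = tree
    left_states, left_changes = fitch_site(left, states)
    right_states, right_changes = fitch_site(right, states)
    shared = left_states & right_states
    if shared:  # children can agree: no extra change needed here
        return shared, left_changes + right_changes
    # children disagree: at least one change on this part of the tree
    return left_states | right_states, left_changes + right_changes + 1


def parsimony_score(tree, alignment):
    """Total minimum number of changes over all columns of the alignment."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        column = {taxon: seq[i] for taxon, seq in alignment.items()}
        total += fitch_site(tree, column)[1]
    return total


# Hypothetical toy data, invented for this illustration.
alignment = {
    "A": "ACGTAC",
    "B": "ACGTAT",
    "C": "GCGAAT",
    "D": "GCGAAC",
}

# The three possible unrooted groupings of four taxa.
candidates = {
    "(A,B),(C,D)": (("A", "B"), ("C", "D")),
    "(A,C),(B,D)": (("A", "C"), ("B", "D")),
    "(A,D),(B,C)": (("A", "D"), ("B", "C")),
}

scores = {name: parsimony_score(t, alignment) for name, t in candidates.items()}
best = min(scores, key=scores.get)
print(scores)  # {'(A,B),(C,D)': 4, '(A,C),(B,D)': 6, '(A,D),(B,C)': 5}
print(best)    # (A,B),(C,D)
```

Maximum likelihood, by contrast, would score each candidate tree by the probability of the sequences under an explicit statistical model of how nucleotides change, which is where the model assumptions discussed in the study come in.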
During the past decade, maximum likelihood has eclipsed maximum parsimony as a tool for evolutionary biologists, largely because of studies that found it to be a more accurate and powerful tool.
Thornton and Kolaczkowski were not convinced by these studies, which simulated evolution using a simplistic and unrealistic process in which the various parts of a gene evolve at the same rate in all species. So they evaluated, for the first time, the performance of the methods when the evolutionary process changes over time, as it is known to do. "Maximum likelihood often gets the wrong tree because it assumes evolution can be accurately captured in a statistical model, but the assumptions of those models are usually wrong," Thornton said. "Parsimony makes fewer assumptions, so it can cope with a complex reality better."
The results published in Nature suggest that scientists should bring back the maximum parsimony method and interpret with caution results from the maximum likelihood method, many of which have been accepted as established knowledge, Thornton said.
Why is an accurate evolutionary tree important? Inferences scientists make about human biology from model organisms such as mice or zebrafish are based on an understanding of the evolutionary relationships between species. When scientists compare embryonic development among species, their conclusions all depend upon accurate knowledge of the evolutionary tree.
Thornton and Kolaczkowski performed their experiments on a small supercomputer they assembled in the Thornton laboratory by linking together a cluster of Apple personal computers.
The project was funded in part by a National Science Foundation grant to the UO to train graduate students like Kolaczkowski in interdisciplinary research in evolution, development and genomics. Thornton, who joined the UO faculty in 2002, is an assistant professor of biological sciences in the UO Department of Biology's Center for Ecology and Evolutionary Biology. His laboratory combines computational work on phylogenetic methods with experimental studies to understand how genes evolve new functions. He is particularly interested in the evolution of hormones and the receptor proteins that mediate their effects on the body.
In 2003, Thornton resurrected the ancient gene for the first steroid hormone receptor - which last existed more than 600 million years ago - in the laboratory and expressed it in cultured cells to understand its function, work that was published in the journal Science. He also is an expert in environmental health and policy. He is the author of "Pandora's Poison: Chlorine, Health, and a New Environmental Strategy" (MIT Press), a landmark book on the effects of toxic chemicals on human health and the environment. He pointed out a parallel between this work and his research on evolutionary tree building.
"Many dangerous chemicals are licensed today by calculating how much of each one we can safely be exposed to, using an assumption-laden, statistical method called quantitative risk assessment," he said. "The problem is that the assumptions of risk assessment models are routinely violated, and safe levels often turn out to be hazardous. In many cases, we could prevent damage from these chemicals altogether by using safer alternatives, but these are often overlooked because of governments' reliance on risk assessment. Here again, simpler would lead to a better outcome."
Melody Ward Leslie | EurekAlert!