But the tools used to align genomes from different species have serious quality-control issues, according to a study published online this week in the journal Nature Biotechnology.
"We discovered that there's a disturbingly low level of agreement between genome alignments produced by different tools," said corresponding author Martin Tompa, a UW professor of computer science and engineering and of genome sciences. "What this should suggest to biologists is that they should be very cautious about trusting these alignments in their entirety."
This is especially true when comparing distantly related species, and in regions of the genome that do not code for a protein, he said.
Aligning genomes, while simple in theory, is difficult in practice. Each additional sequence makes the problem much harder. At the scale of a mammal's entire genome (all of its genetic code), finding the optimal alignment of many genomes is far beyond the capabilities of any computer, Tompa said.
Various software tools instead use strategic shortcuts.
"At a high level the tools are very similar," Tompa said. "They make different decisions at the lower, more detailed levels, and those decisions seem to have widespread effect on the outcome."
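To give a rough sense of why exact alignment does not scale, here is a minimal sketch of the textbook dynamic-programming method for two sequences (Needleman-Wunsch), together with a helper that counts how many table cells the same exact approach would need for more sequences. This is not how Pecan, TBA, MLAGAN or Mavid work; the scoring values and the function names `global_alignment_score` and `dp_table_cells` are assumptions chosen for illustration.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Optimal global alignment score of two sequences (Needleman-Wunsch).

    The scoring values are illustrative assumptions, not taken from the study.
    Only the previous row of the table is kept, so memory is O(len(b)).
    """
    # Row 0: aligning an empty prefix of `a` against each prefix of `b`.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: prefix of `a` against an empty prefix of `b`
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap inserted in `b`
            left = curr[j - 1] + gap  # gap inserted in `a`
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


def dp_table_cells(lengths):
    """Number of cells in the exact dynamic-programming table for sequences
    of the given lengths: the product of (length + 1) over all sequences."""
    cells = 1
    for n in lengths:
        cells *= n + 1
    return cells


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "AGT"))
    for k in (2, 3, 4):
        print(k, "sequences of length 100:", dp_table_cells([100] * k), "cells")
```

Because the table size is a product over all sequences, each added sequence multiplies the work by roughly its length, which is why practical tools rely on shortcuts rather than exact optimization.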
The new paper compared the alignments from a previous study in which four research teams each took the same 1 percent of the human genome and aligned it to the genomes of 27 other vertebrate animals, ranging from mouse to elephant.
"This is a marvelous dataset," Tompa said. "It's a very large-scale multiple sequence alignment, done by four expert teams using four different tools, all of them working on the same input sequences."
However, the new study found that the four resulting alignments differed substantially from one another. The authors also compared the coverage of each tool, meaning how much of the human DNA it was able to match to each of the other species, as well as what fraction of each tool's alignments were suspiciously close to a random match.
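The ideas of coverage and agreement can be illustrated with a small sketch. It assumes a simplified, hypothetical representation in which each tool's output is a dictionary mapping a human position to the matched position in another species, with `None` for unaligned positions. The functions `coverage` and `agreement` are illustrative only and are not the metrics defined in the paper.

```python
def coverage(alignment, human_length):
    """Fraction of human positions that the tool aligned to the other species.

    `alignment` maps human position -> position in the other species,
    or None when unaligned (a hypothetical, simplified representation).
    """
    aligned = sum(
        1 for pos in range(human_length) if alignment.get(pos) is not None
    )
    return aligned / human_length


def agreement(alignment_a, alignment_b):
    """Among human positions aligned by both tools, the fraction that both
    tools matched to the same position in the other species."""
    shared = [
        pos
        for pos, target in alignment_a.items()
        if target is not None and alignment_b.get(pos) is not None
    ]
    if not shared:
        return 0.0
    same = sum(1 for pos in shared if alignment_a[pos] == alignment_b[pos])
    return same / len(shared)


if __name__ == "__main__":
    # Toy data: five human positions (0-4) aligned to one other species.
    tool_a = {0: 10, 1: 11, 2: 12, 3: None}
    tool_b = {0: 10, 1: 11, 2: 40, 4: 13}
    print("coverage A:", coverage(tool_a, 5))
    print("coverage B:", coverage(tool_b, 5))
    print("agreement:", agreement(tool_a, tool_b))
```

In this toy case the two tools have similar coverage yet disagree on one of the three positions that both aligned, which is the kind of discrepancy the comparison looks for.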
The best-performing tool was the newest one, Pecan, developed by the European Bioinformatics Institute.
"Our study pretty clearly points to Pecan as being the highest-quality alignment of the four tools we compared," Tompa said. Pecan aligned as much of the human genome to other species as any of the other tools did, and its matches were considerably more reliable, especially between more distantly related species.
The other tools in the study were Threaded Blockset Aligner (or TBA), Multiple Limited Area Global Alignment of Nucleotides (or MLAGAN) and Mavid. All four are free programs developed by academic institutions, Tompa said.
"I'm hoping that the designers of these tools will take a very close look at our paper and might be able to improve their tools as a result," he said. "I think we're all interested in having a better understanding of which methods work the best and how to make them better."
The lead author is Xiaoyu Chen, a UW doctoral student in computer science and engineering. The research was funded by the U.S. National Institutes of Health and the Natural Sciences and Engineering Research Council of Canada.
For more information, contact Tompa at 206-543-9263 or email@example.com.
The article is posted (subscription required) at http://www.nature.com/nbt/journal/vaop/ncurrent/abs/nbt.1637.html
Hannah Hickey | EurekAlert!