Panning for gold in the maize genome
New approaches yield gene-rich regions, accelerate sequencing
Two complementary methods that remove from analysis the vast stretches of DNA that contain no genes could accelerate the decoding of a variety of plant genomes.
The approaches, applied jointly in efforts to determine the gene sequences in maize, are described in the Dec. 19 issue of the journal Science. The evaluation of these methods and the assembly of the resulting sequences were undertaken by two groups led by researchers from The Institute for Genomic Research (TIGR) in Rockville, Md., and Cold Spring Harbor Laboratory in New York.
The research was funded by the National Science Foundation's Plant Genome Research Program.
Only about a quarter of the maize genome codes for genes, and these are found in small clusters scattered through a mixture of non-coding DNA and transposons (mobile DNA segments). Two different methods tested by the TIGR group successfully captured parts of the maize genome containing genes. The gene sequences are of most interest because they provide the specific blueprint for an organism's development, structure and physiology.
With so much non-gene sequence to deal with, it has not been feasible to sequence and assemble the whole maize genome with current technologies. Thus, it is a major shortcut to capture only the portion of the maize sequence containing its genes without having to sequence the entire genome.
"Collecting the maize genes for sequencing is like panning for gold," said Jane Silverthorne, program director for NSF's plant genome program. "Just as gold can be separated from the surrounding rock because it is denser, maize genes can be separated from the surrounding DNA by their chemical and sequence properties."
The first method tested, called methylation filtration, removes sequences that contain a chemical modification (methylation) found on most of the repeated sequences and transposons, leaving behind the proverbial gold of genes. It was developed by a team led by Robert Martienssen and W. Richard McCombie at Cold Spring Harbor Laboratory.
The second method, developed by researchers at the University of Georgia, removes the repeated sequences by separating the DNA into "high-copy," gene-poor segments and "low-copy," gene-rich segments.
Led by Cathy Whitelaw, the research team at TIGR compared sequences obtained by the two methods. About one-fourth of the genes in each collection matched known gene sequences. About 35 percent of the genes were represented in both collections.
Each method was found to enrich for distinct but complementary regions of maize's 10-chromosome genome. Combined, the methods could cut the amount of sequencing necessary to find all of the maize genes to about one-fourth of what it would take to sequence the entire genome.
Because both methods yielded short stretches of sequence, a major challenge was to reassemble these into complete genes. To do this, the Cold Spring Harbor group lined up the sequence pieces from maize along the rice genome sequence, a deep draft of which was completed in 2002 by an international consortium. The researchers then reassembled selected sets of sequence fragments into complete genes. This approach will be an important part of assembling the short pieces of DNA yielded by the two enrichment methods into complete gene clusters.
According to Silverthorne, "Together, these findings suggest that scientists could be able to sift out the approximately 450 million base pairs of DNA containing the genes from the maize genome and then reassemble the sequence. Such a comprehensive genomic resource would provide growers and breeders a wealth of tools to improve maize, as well as other cereal crops."
Other collaborators in the study included the Donald Danforth Plant Science Center and Orion Genomics, both of St. Louis, Mo.
Sean Kearns | NSF