NHGRI selects 13 new targets for large-scale sequencing program
Strategic mix to include gibbon, bats, sloth and mosquitoes
The National Human Genome Research Institute (NHGRI), one of the National Institutes of Health (NIH), announced today that the Large-Scale Sequencing Research Network will target 13 more organisms as part of its ongoing effort to produce genomic data that will expand biological knowledge and improve human health.
The National Advisory Council for Human Genome Research (NACHGR), which is a federally chartered committee that advises NHGRI on program priorities and goals, recently approved a comprehensive plan that adds two new sets of sequencing targets. The strategy includes a mix of whole genome sequencing, genome mapping and sequencing of genomic regions chosen for their scientific merits.
"We want to set the stage for a greater understanding of the major biological innovations that have occurred throughout evolution, with emphasis on learning more about our own genome. Genomic information from a wide array of species already is proving useful in many areas of biomedical research that may benefit both humans and animals," said Mark S. Guyer, Ph.D., director of NHGRI's Division of Extramural Research.
The first group of targets consists of nine mammals. Eight mammals will be sequenced at low-density draft coverage, created by sequencing their genomes two times over. This sequencing strategy, begun last year on another set of mammals, is used primarily to identify features that are similar, or conserved, among the genomes of the human and other mammals. Sequences that have been conserved throughout evolution often reveal important functional regions of the human genome. Initial data show that using low-density sequencing for such comparisons is almost as effective as more costly, high-density sequencing. The eight mammals are: the 13-lined ground squirrel (Spermophilus tridecemlineatus), the megabat (Cynopterus species), the microbat (Microchiroptera species), the tree shrew (Tupaia belangeri), the bushbaby (Otolemur garnettii), the hyrax (Procavia capensis), the pangolin (Manis species) and the sloth (Bradypus or Choloepus species).
The ninth mammal is the Northern white-cheeked gibbon (Nomascus leucogenys). This non-human primate species belongs to a major evolutionary branch that has not yet had the genome of any of its members sequenced. The gibbon genome is unique because it contains many chromosomal rearrangements, which makes it valuable for studying how such rearrangements have contributed to the evolution and speciation of humans and other primates. To identify chromosomal rearrangements, researchers will need to sequence only small portions of the gibbon genome through a process known as Bacterial Artificial Chromosome (BAC)-end sequencing.
Also selected in the latest round were four non-mammalian organisms. Three of the organisms have been targeted for six-fold, or "high-quality draft," sequencing. They are: the M and S strains of a malaria-carrying mosquito (Anopheles gambiae) and a roundworm (Heterorhabditis bacteriophora). Researchers will also construct a physical map of the zebra finch (Taeniopygia guttata) genome, paving the way for later efforts to sequence selected regions of the songbird's genome.
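As a rough illustration of what two-fold and six-fold coverage mean in practice, the sketch below uses the classic Lander-Waterman idealization, which is not mentioned in the announcement and is assumed here only for illustration. It treats reads as landing uniformly at random, so the expected fraction of bases covered at least once is 1 - e^(-c) for fold coverage c. Real genomes have repeats and cloning biases, so these figures are optimistic, and the function name is hypothetical.

```python
import math


def expected_fraction_covered(fold_coverage: float) -> float:
    """Expected fraction of bases hit by at least one read.

    Assumes the Lander-Waterman idealization: reads land uniformly at
    random, so the chance that a given base is missed is exp(-c).
    """
    return 1.0 - math.exp(-fold_coverage)


# Coverage levels named in the announcement: two-fold and six-fold.
for label, c in (("two-fold (low-density draft)", 2.0),
                 ("six-fold (high-quality draft)", 6.0)):
    fraction = expected_fraction_covered(c)
    print(f"{label}: about {fraction:.1%} of bases expected to be covered")
```

Under this simplified model, two-fold coverage leaves roughly one base in seven unsampled, while six-fold coverage leaves well under one percent unsampled.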
"Sequencing the genomes of a diverse set of organisms is a powerful tool to understand the biological processes at work in human health and illness," said NHGRI Director Francis S. Collins, M.D., Ph.D. "Comparative genomics has proven to be one of the most effective strategies for revealing the important structural and functional elements of the human genome sequence."
The mosquito, Anopheles gambiae, is medically significant because it can harbor the malaria parasite, Plasmodium falciparum, and is the primary transmitter of the disease in Africa. Approximately 30 out of 500 Anopheles species are known vectors of human malaria. Malaria afflicts up to 300 million people and kills more than 1 million people a year. In 2002, Celera Genomics completed a draft genome sequence of a laboratory strain of Anopheles gambiae, a hybrid of two naturally occurring species, known as "M" and "S" strains, that can transmit malaria. Researchers expect that sequencing the genomes of the M and S strains, both of which are insecticide-resistant, will help to further efforts to combat malaria. The sequence data will also provide new insights into key structural and functional elements of the Anopheles genome.
The roundworm genome is important because its sequence will aid in comparative annotation of the genomes of other roundworms that have been sequenced, including that of Caenorhabditis elegans, one of the most important model systems for understanding animal development. In addition, Heterorhabditis bacteriophora is widely used as a biological control agent because it kills agricultural insect pests. Understanding its genome will increase insights into biological control, as well as into basic mechanisms of parasitism that may be applicable to parasites of humans.
The zebra finch was chosen because it is a major model system for understanding brain development, learning and memory. Because it is related to the chicken, whose genome has already been sequenced, it will be possible to leverage the chicken genome to get more insight from the zebra finch map.
It has been shown that most segments of the human genome originated long before humans themselves. Consequently, scientists will use the genome sequences of the non-mammalian animals to learn more about how, when and why the genomes of humans and other mammals came to be composed of certain DNA sequences, as well as to gain new insights into the organization of genomes.
Sequencing efforts will be carried out by the NHGRI-supported Large-Scale Sequencing Research Network, which consists of five centers: Agencourt Bioscience Corp., Beverly, Mass.; Baylor College of Medicine, Houston; the Broad Institute of MIT and Harvard, Cambridge, Mass.; The J. Craig Venter Science Institute, Rockville, Md.; and Washington University School of Medicine, St. Louis. Assignment of each organism to a specific center or centers will be determined at a later date.
NHGRI's process for selecting sequencing targets begins with two working groups composed of experts from across the research community. Each of the working groups is responsible for developing a proposal for a set of genomes to sequence that would advance knowledge in one of two important scientific areas: understanding the human genome and understanding the evolutionary biology of genomes. A coordinating committee then reviews the working groups' proposals, helping to fine-tune the suggestions and integrate them into an overarching set of scientific priorities. The recommendations of the coordinating committee are then reviewed and approved by NHGRI's advisory council, which in turn forwards its recommendations regarding sequencing strategy to NHGRI leadership.
The genomes of a number of organisms have been or are being sequenced by the large-scale sequencing capacity developed by the Human Genome Project. These include the dog, the mouse, the rat, the chicken, the honey bee, ten fruit flies, the sea urchin, two puffer fish, two sea squirts, two roundworms, over a dozen fungi, baker's yeast and the bacterium Escherichia coli. Organisms currently in the NHGRI sequencing pipeline include: the macaque, the orangutan, the cow, the platypus, the red flour beetle, several additional species of fungi and the domestic cat. A complete list of organisms and their sequencing status can be viewed at www.genome.gov/10002154.
Geoff Spencer | EurekAlert!