Integrating the world’s scientific databases through ontology

24.10.2011
Ontological systems under development at the RIKEN BioResource Center promise to revolutionize life science research by completely changing the way researchers share data
Hiroshi Masuya
Unit Leader
Technology and Development Unit for Knowledge Base of Mouse Phenotype
RIKEN BioResource Center
Researchers at the RIKEN BioResource Center led by Hiroshi Masuya of the Technology and Development Unit for Knowledge Base of Mouse Phenotype are developing a system that will bring together the information stored in databases around the world and make it accessible from a single terminal. This system will allow scientists to select the information necessary for their research instantaneously from any database in the world, analyze it, and display the results in a readily usable format. The key to the system is ontology, a branch of philosophy that dates back to the time of Aristotle but has technological relevance today. Ontological systems promise to revolutionize the way we share data, and the technology is attracting attention across the globe.

Barriers to research

Researchers must take many steps when drawing up an experimental plan, including checking databases to find and analyze research trends in relevant fields and choosing the appropriate experimental materials. They must also compile papers and other reference materials, and review and interpret their contents. However, different researchers use somewhat different definitions for the terms that describe, for instance, pathologic conditions and other characteristics of laboratory animals. This linguistic vagueness makes it necessary to analyze the experimental methodology and context and reinterpret the terms in every case. These painstaking preparations take a great deal of time before the optimum experimental methodology can be determined.

“In biology, there are numerous databases for genes, proteins, diseases and the like around the world, and they all operate separately. A researcher who wants to investigate a particular subject must search all the databases that seem to be appropriate one by one. In addition, each individual database has its own attributes. Because the databases are designed to be used in distinct ways that are suited to different research areas, it takes a great deal of time for researchers in other areas to become familiar with databases in areas other than their own,” points out Masuya.

Ontology—correlating the essential nature of things

“Ontological technology allows computers to automatically arrange and extract the desired data so that the preparatory work for any investigation becomes very easy,” Masuya explains. “The term ontology has its origin in a Greek philosophical term meaning existence. In bioinformatics, ontology refers to the classification of concepts and terms and how to describe their relationships and systems.”

In 2010, Masuya and his colleagues created the RIKEN Integrated Database of Mammals. The database incorporates YAMATO-GXO (‘Yet Another More Advanced Top-level Ontology-Genetics Ontology’), an ontology tool they developed jointly with Riichiro Mizoguchi at the Institute of Scientific and Industrial Research (ISIR) of Osaka University. “We integrated the 18 major databases of the world using YAMATO-GXO. Our mammalian database is based on RIKEN’s Scientists’ Networking System (SciNetS).” Developed by a team led by Tetsuro Toyoda, director of the RIKEN Bioinformatics And Systems Engineering Division (BASE), SciNetS can accommodate a wide variety of data, including ontological data, facilitating the integration of developed databases. To date, RIKEN’s nine databases in biology have been integrated. The team succeeded in integrating as many as 900,000 data items from 18 databases by incorporating YAMATO-GXO into SciNetS and other databases. “It is quite painstaking for a single researcher to find the data they want from among 900,000 entries. However, the RIKEN Integrated Database of Mammals makes it easy to obtain the data they want in a somewhat automatically analyzed form.” This database is currently under development and expansion.

Rapid integration of databases

Technology for knowing the meanings of words and sentences and forming correlations among them may sound relatively simple, but ontology is in reality a very profound activity. “Ontology is philosophy. It rests on a philosophical system that has been unbroken since the time of the ancient Greek philosopher Aristotle (384–322 BC). It took five years for us to be able to understand information technology based on a philosophy that has been nurtured over such a long historical period.”

According to Masuya, “Ontology is used to teach the computer about this world.” For example, a human being is a primate, a mammal, an animal and an organism. It is characterized by bipedal locomotion, a brain weighing 1,250 grams on average, five digits on each limb, two eyes and so on. By breaking things down like this and systematizing their essential meanings, a more fundamental ‘superordinate concept’ is created. Once systematized in this way, even databases with different logical structures can be combined relatively easily, with ontology serving as a ‘translator’.
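
The idea of superordinate concepts and of ontology as a ‘translator’ can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not YAMATO-GXO itself: the term hierarchy, the two toy databases, their field names and the label mapping are all hypothetical.

```python
# Toy is-a hierarchy: each term points to its more general (superordinate) term.
# All names here are illustrative assumptions, not taken from YAMATO-GXO.
IS_A = {
    "human": "primate",
    "mouse": "rodent",
    "primate": "mammal",
    "rodent": "mammal",
    "mammal": "animal",
    "animal": "organism",
}

def ancestors(term):
    """Return every superordinate concept of a term, nearest first."""
    chain = []
    while term in IS_A:
        term = IS_A[term]
        chain.append(term)
    return chain

# Two hypothetical databases that label the same kind of data differently.
DB_A = [{"species": "Homo sapiens", "trait": "bipedal locomotion"}]
DB_B = [{"organism_name": "house mouse", "phenotype": "quadrupedal locomotion"}]

# The ontology acts as the 'translator': local labels map to shared terms.
LABEL_TO_TERM = {"Homo sapiens": "human", "house mouse": "mouse"}

def unify(records, name_field, trait_field):
    """Rewrite records from one database into the shared vocabulary."""
    return [
        {"term": LABEL_TO_TERM[r[name_field]], "trait": r[trait_field]}
        for r in records
    ]

def query(records, concept):
    """Find records whose term is the concept itself or falls under it."""
    return [
        r for r in records
        if r["term"] == concept or concept in ancestors(r["term"])
    ]

merged = unify(DB_A, "species", "trait") + unify(DB_B, "organism_name", "phenotype")
mammals = query(merged, "mammal")      # matches both the human and mouse records
primates = query(merged, "primate")    # matches only the human record
```

Because both databases are translated into the same terms, a single query on a superordinate concept such as "mammal" reaches records that neither database could have returned on its own.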

Before ontology was integrated into practical applications, databases could not be linked together unless their logical structures, or intrinsic habits, were coordinated in all cases. The need to build other databases to separately connect different meanings was unavoidable. That work is painstaking and time-consuming. “Thanks to YAMATO-GXO, we were able to develop the RIKEN Integrated Database of Mammals, which integrates 18 databases, in just half a year.”

Building on this achievement, in fiscal 2011 RIKEN launched the ‘Biological and Environmental Phenomes Integration Database’, a database integration promotion program sponsored by the Japan Science and Technology Agency, in a joint initiative with Toyoda of BASE. “This program will integrate nationally available data on ‘phenotypes’, which represent the characters manifested by the action of genes, and information on measurement techniques. We are working on developing a database that allows even a measurement technique with use limited to a particular area to be used in other areas, allowing it to contribute to advances in biology at large.”

The attraction of ontology

Ontology research is currently attracting worldwide attention. The concept of gene ontology was first proposed in 1995 by Michael Ashburner of the University of Cambridge in the UK, and gene ontology even now represents a major technical breakthrough for the standardization and massive compilation of biological information. The introduction of this approach resulted in an explosion in research using DNA microarrays, chips that allow investigators to determine at one time which of a large number of genes are expressed and how strongly. Using gene ontology, for example, it is possible to collate all available reports on the functions of the genes expressed. With the spread of DNA microarray technology, a new discipline called transcriptomics emerged to analyze when, where and at what levels the more than 20,000 human genes are expressed, and to determine what that expression means. “The microarray could not have become such a powerful research tool without gene ontology,” says Masuya. The linkage of the two distinct technologies, microarrays and ontology, has been promoting advances in the new research domain of transcriptomics.
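
As a rough illustration of how shared function terms turn a list of expressed genes into something interpretable, the following Python sketch groups genes by their annotations. The gene names, the function terms and the list of expressed genes are made-up placeholders, not real Gene Ontology entries or microarray results.

```python
from collections import defaultdict

# Hypothetical annotations: gene symbol -> set of function terms.
# Gene names and terms are placeholders, not real Gene Ontology entries.
ANNOTATIONS = {
    "geneA": {"cell cycle", "DNA repair"},
    "geneB": {"cell cycle"},
    "geneC": {"lipid metabolism"},
    "geneD": {"DNA repair", "lipid metabolism"},
}

def summarize(expressed_genes, annotations):
    """Group expressed genes by shared function term, largest groups first."""
    by_term = defaultdict(list)
    for gene in expressed_genes:
        for term in annotations.get(gene, ()):
            by_term[term].append(gene)
    # Sort by descending group size, then by term name so the order is stable.
    return sorted(
        ((term, sorted(genes)) for term, genes in by_term.items()),
        key=lambda item: (-len(item[1]), item[0]),
    )

# Genes flagged as strongly expressed in a made-up microarray experiment.
summary = summarize(["geneA", "geneB", "geneD"], ANNOTATIONS)
for term, genes in summary:
    print(term, genes)
```

The grouping only works because every gene is annotated with terms from one shared vocabulary; with free-text descriptions, "cell cycle" in one record and a differently worded equivalent in another would not be counted together.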

The trend of the times is also boosting ontology. It has been shown that in research into genes and proteins, cause and result do not always stand in a one-to-one relationship. This is because many genes and biomolecules are involved in the processes for the generation of each protein. Additionally, techniques for visualizing the behaviors of many genes and biomolecules are already available. “By using an ontology-based integrated database, we can get a listing of the results from the concurrent functioning of multiple genes out of the vast amount of data obtained, rather than the one-to-one matched data on gene functions in conventional databases. Ontology is expected to really lead future research.” Because it is capable of easily identifying disease-causing genes and proteins from among the vast number of biomolecules, ontology is expected to lead to major breakthroughs in the acceleration of new drug development and phase I clinical trials. While information is increasing explosively in the research domain, ontology that links a wide variety of databases can be described as a hidden but powerful tool for guiding research activities that are prone to become chaotic.

Identifying knockout mouse phenotypes within an international framework

“We will proceed to develop ontology to standardize international mouse information,” says Masuya. His laboratory has been requested to join the International Mouse Phenotyping Consortium (IMPC) to clarify the relationships between genes and phenotypes by examining all the phenotypes in knockout mice that have been manipulated to systematically delete each gene in the mouse genome. Mice share a number of similarities (homologies) with humans in terms of the number and kinds of genes, as well as biological events and disease processes. The large project aims to link human diseases and phenotypes of knockout mice. “Currently, laboratories all over the world are working to design knockout mice and utilize them as investigational materials independently. However, a major loss of information resides here.”

In conducting experiments, researchers create knockout mice that fit their research themes. For example, researchers studying limb development may generate a knockout mouse by inactivating a relevant gene. If researchers cannot find any morphological abnormality in the limb, they often give up on investigating that mouse further. However, many genes have multiple functions. For instance, many signaling molecules involved in limb development are also involved in other biological processes in other organs. Such a fact could easily be overlooked, and the discovery might never be published, even though the finding might have made all the difference to a physician struggling to elucidate metabolic disorders in a patient.

“Such occurrences have been prevalent since the birth of the first knockout mouse. The IMPC offers a decisive solution to this situation.” In the large-scale project with its huge budget of nine million dollars, more than 20,000 mouse genes are being knocked out one-by-one to comprehensively analyze basic phenotypes and determine their influences on the mammalian body. The project also includes the development of an ontology-incorporating database and provides free access to information on the associations of the genes with biological phenomena and diseases. RIKEN’s BRC is going to join the IMPC in a collaboration between the Technology and Development Team for Mouse Phenotype Analysis led by Shigeharu Wakana, the Experimental Animal Division headed by Atsushi Yoshiki, and Masuya’s Technology and Development Unit for Knowledge Base of Mouse Phenotype. Once this information network is built, it will be possible to list all knockout mice that help research into a particular human disease from the database. “The network will enable us to select ‘all’ mice serving as disease models that exhibit similar symptoms, and even ‘potential models’ that exhibit near-morbid conditions. This encompassing ‘all’ is of paramount importance, and reducing the unidentified portions will dramatically move forward the whole field of research into disease.”
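
One way to picture the selection of models and ‘potential models’ is as a comparison of phenotype term sets. The Python sketch below is an assumption-laden illustration, not the IMPC's or RIKEN's method: the mouse line names, the phenotype terms, the use of Jaccard overlap and the 0.6 cutoff are all hypothetical choices.

```python
# Hypothetical phenotype annotations; the line names and terms are placeholders.
MOUSE_LINES = {
    "line1": {"obesity", "high blood glucose", "insulin resistance"},
    "line2": {"high blood glucose", "insulin resistance"},
    "line3": {"obesity"},
    "line4": {"short limbs"},
}

def jaccard(a, b):
    """Overlap of two term sets: 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def find_models(disease_terms, lines, model_cutoff=0.6):
    """Split mouse lines into close models and weaker 'potential' models."""
    models, potential = [], []
    for name, terms in sorted(lines.items()):
        score = jaccard(disease_terms, terms)
        if score >= model_cutoff:
            models.append(name)       # phenotype closely matches the disease
        elif score > 0:
            potential.append(name)    # partial match, worth a second look
    return models, potential

# Symptoms of a made-up disease, described with the same shared terms.
disease = {"obesity", "high blood glucose", "insulin resistance"}
models, potential = find_models(disease, MOUSE_LINES)
```

Such a comparison is only meaningful when every laboratory records phenotypes with the same standardized terms, which is the role the ontology plays in the project described above.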

A powerful tool that will lead research activities

Database integration using ontology has the potential to bring many breakthroughs. In a hospital, for example, physicians could download a listing of everything from the names of candidate diseases to the likely progression of the condition, candidate medications and therapeutic guides. Such an integrated database would make it possible to investigate therapeutic approaches to coping with complications from all angles using information from the component databases. “Our ultimate goal is to create a tool that will serve as the guide to researchers’ activities by presenting information even at levels beyond human ponderings, and deducing and displaying potentially useful search results in an easily understandable way,” says Masuya.

About the Researcher

Hiroshi Masuya

Hiroshi Masuya was born in Osaka, Japan, in 1968. He graduated from the Biological Institute at Tohoku University and obtained his PhD in 1996 from the School of Life Science at the Graduate University for Advanced Studies. He joined the National Institute of Genetics in 1997 and became a research fellow of the Japan Society for the Promotion of Science in 1998. In 1999, he joined the RIKEN Genomic Sciences Center as a research scientist. Since 2008, he has been unit leader of the Technology and Development Unit for Knowledge Base of Mouse Phenotype at the RIKEN BioResource Center. His interests include bioinformatics and mammalian genomics.

gro-pr | Research asia research news
Further information:
http://www.riken.jp
http://www.researchsea.com
