Researchers at the Fraunhofer Institute for Algorithms and Scientific Computing (SCAI) and at the Jülich Supercomputing Centre (JSC) of Forschungszentrum Jülich have used their substantial computing grid infrastructures for a new application in scientific computing: the large-scale annotation of biomedical and chemical texts and images in pharmaceutical patents. This enables patent searches of unprecedented power. Queries can now reveal intersections between biology and chemistry, and the analysis of chemical information is truly multimodal: text- and image-based information can be analyzed simultaneously.
More than 50,000 patents describing inventions in pharmaceutical chemistry have been processed on the large-scale computing grid infrastructures at SCAI and JSC. Automated "named entity recognition" services have identified and annotated:
o biological entities in text (e.g. protein names; gene names; gene polymorphisms; cell types),
o medical entities in text (e.g. disease names; pathology terms; risk factor terminology),
o chemical information in text (e.g. drug names; expressions following the naming standards of the International Union of Pure and Applied Chemistry (IUPAC)), and
o chemical information in images (e.g. chemical structure depictions).
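To give a rough idea of what annotating named entities in text means, the following is a minimal, purely illustrative sketch of a dictionary-based tagger. It is not the annotation software used at SCAI and JSC, which the text does not describe in detail. The dictionaries, the label names, and the `annotate` function are hypothetical and far smaller and simpler than real terminologies and recognizers.

```python
import re

# Hypothetical toy dictionaries (assumption for illustration only); real
# annotators rely on large curated terminologies and trained models.
DICTIONARIES = {
    "protein_or_gene": ["BRCA1", "p53", "insulin receptor"],
    "disease": ["breast cancer", "diabetes"],
    "drug": ["aspirin", "tamoxifen"],
}


def annotate(text):
    """Return (start, end, label, surface form) tuples sorted by position."""
    annotations = []
    for label, terms in DICTIONARIES.items():
        for term in terms:
            # Match the whole term only (no partial words), ignoring case.
            pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
            for match in re.finditer(pattern, text, flags=re.IGNORECASE):
                annotations.append(
                    (match.start(), match.end(), label, match.group(0))
                )
    return sorted(annotations)


sentence = "Tamoxifen is claimed for breast cancer patients carrying BRCA1 mutations."
for start, end, label, surface in annotate(sentence):
    print(start, end, label, surface)
```

Each annotation records where an entity occurs and which category it belongs to, so that a later search can combine categories, for example patents that mention a given drug together with a given disease.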
The grid middleware UNICORE (Uniform Interface to Computing Resources) was used to manage the annotation services in the grid infrastructure, to control the streams of input and output data from the patents database to the annotation services, and to monitor the overall progress.
"This large-scale experiment opens new perspectives in scientific computing," says Prof. Dr. Martin Hofmann-Apitius, head of the Department of Bioinformatics at Fraunhofer SCAI. "This type of application goes way beyond the usual simulation applications that we are used to in the scientific computing community."
So far, text mining applications have only been run on bibliographic databases of life sciences and biomedical information such as MEDLINE. Extending this to a multimodal analysis on grid infrastructures, including the annotation of both text- and image-based information in full-text documents, has never been done before.
"We are pleased to see that our institute, which has a strong record in numerical simulation, has contributed to a new field of applications for supercomputers: what we call knowledge computing is likely to become a new discipline on its own," emphasizes Prof. Dr. Ulrich Trottenberg, Director of Fraunhofer SCAI.
"UNICORE made it possible to run this experiment at such a large scale in computing grid infrastructures at SCAI and JSC," says Dr. Achim Streit, head of Distributed Systems and Grid Computing at JSC. "The powerful workflow and data management capabilities of UNICORE allowed the patents to be annotated in a seamless and automated way. A supercomputer connected by UNICORE to the infrastructure of the German Grid Initiative (D-Grid) was used to perform the knowledge extraction. This initial step of the experiment demonstrates what is possible today and shows the potential for more complex production runs in the future, using HPC systems connected in grid infrastructures."
"This is a very good example of how powerful supercomputers at JSC equipped with world-class grid technologies like UNICORE can generate synergies to enable new fields of research. I am proud that JSC is a member of the international UNICORE open source community and leads its development," explains Prof. Dr. Dr. Thomas Lippert, Director of JSC.
The team at SCAI, led by Dr. Marc Zimmermann for the image analysis annotators and by Dr. Juliane Fluck and Dr. Christoph Friedrich for the text analytics part, is currently working on the in-depth analysis of the meta-information generated in the course of this large-scale in silico experiment. Their colleague at JSC in Jülich, Mathilde Romberg, is happy that after weeks of intensive work the first "production runs" have been completed. However, the teams on both sides know that another 1.5 million patents are waiting for them.
Contact:
Michael Krapp | Fraunhofer Gesellschaft