Unidentified spectra detector

28.06.2016

Detecting millions of consistently unidentified spectra in vast tracts of proteomics data is possible with a new algorithm developed at EMBL-EBI

A new algorithm clusters the millions of peptide mass spectra in the PRIDE Archive public database, making it easier to detect millions of consistently unidentified spectra across different datasets. Published in Nature Methods, the new tool is an important step towards fully exploiting data produced in discovery proteomics experiments.


"Creating a sensible subset of spectra to start an in-depth analysis of unidentified spectra has been very challenging," says Juan Antonio Vizcaíno of EMBL-EBI.

Credit: Illustration by Spencer Phillips, EMBL-EBI

On average, almost three quarters of spectra measured in discovery proteomics experiments remain unidentified, regardless of the quality of the experiment, as they cannot be interpreted by standard sequence-based search engines.

Alternative approaches to improve the rate of identification exist, but are fraught with disadvantages including ambiguous results. In today's study, researchers working on the PRIDE Archive public repository of proteomics data present a large-scale 'spectrum clustering' solution that takes advantage of the growing number of mass spectrometry (MS) datasets to systematically study millions of unidentified spectra.

"MS experiments produce huge amounts of data, but identifying meaningful sequences that could be assigned to specific biological functions can be troublesome," says Johannes Griss, formerly at EMBL-EBI in the UK and now at the Medical University of Vienna, Austria.

"Discovery proteomics is a mature technology, and it's crucial that we are able to exploit the data efficiently."

One of the challenges with these technologies is that a large proportion of the spectra generated can't be interpreted, because they correspond to peptides that have not yet been observed and are not available in databases. Such spectra could correspond to peptide variants derived from individual genetic variation, or to peptides containing post-translational modifications, which are essential for the biological functions of proteins.

"What we have now is an algorithm that shows us patterns, or groups of spectra, that we've consistently missed, and helps us figure out which ones are good enough to pursue," adds Johannes. "It's a valuable tool that helps us unpick what's going on in proteomics, so we can better understand basic biological processes."

The team used the approach to recognise 9 million consistently unidentified spectra, which can make post-translational modifications and peptides containing sequence variants more discoverable. They identified three distinct sets of spectra: those that have been incorrectly identified, those that are not of high enough quality to identify properly, and those that are truly unidentified. They also combined their new approach with other methods to identify roughly 20% of the originally unidentified spectra in the public archive.
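As a rough illustration of the general idea of spectrum clustering, and not the algorithm published by the team, the sketch below groups spectra whose peak patterns are similar and then flags clusters in which no member carries an identification. Everything in it is an assumption made for illustration: the function names, the 1.0 m/z bin width, the cosine-similarity threshold of 0.8, the minimum cluster size and the labels are hypothetical and are not taken from the paper or from PRIDE Cluster.

```python
from collections import Counter
from math import sqrt


def bin_peaks(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins (sparse vector as a dict)."""
    vec = {}
    for mz, intensity in peaks:
        key = int(mz / bin_width)
        vec[key] = vec.get(key, 0.0) + intensity
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse binned spectra."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster_spectra(spectra, threshold=0.8):
    """Greedy clustering: a spectrum joins the first cluster whose
    representative (the cluster's first member) is similar enough."""
    clusters = []
    for spectrum in spectra:
        vec = bin_peaks(spectrum["peaks"])
        for cluster in clusters:
            if cosine(vec, cluster["rep"]) >= threshold:
                cluster["members"].append(spectrum)
                break
        else:
            clusters.append({"rep": vec, "members": [spectrum]})
    return clusters


def label_cluster(cluster, min_size=3, min_ratio=0.7):
    """Assign an illustrative label based on the identifications in a cluster."""
    members = cluster["members"]
    if len(members) < min_size:
        return "too small to judge"
    ids = [m["peptide"] for m in members if m["peptide"] is not None]
    if not ids:
        return "consistently unidentified"
    top_count = Counter(ids).most_common(1)[0][1]
    if top_count / len(ids) < min_ratio:
        return "conflicting identifications"
    return "consistently identified"


# Toy data: "peptide" is None when the search engine gave no identification.
spectra = [
    {"peaks": [(100.2, 10), (200.4, 5)], "peptide": None},
    {"peaks": [(300.5, 8), (400.2, 8)], "peptide": "PEPTIDEK"},
    {"peaks": [(100.3, 9), (200.1, 6)], "peptide": None},
    {"peaks": [(300.6, 7), (400.4, 9)], "peptide": "PEPTIDER"},
    {"peaks": [(100.1, 11), (200.3, 4)], "peptide": None},
    {"peaks": [(300.4, 8), (400.3, 7)], "peptide": "PEPTIDEK"},
]

for cluster in cluster_spectra(spectra):
    print(len(cluster["members"]), label_cluster(cluster))
```

In this toy setting, a large cluster with no identifications is a candidate for follow-up, while a cluster whose members disagree on the peptide hints at possible misidentifications. A real pipeline would also need to account for precursor mass, charge state and spectrum quality, which this sketch ignores.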

"Discovery proteomics is a mature technology, and it's crucial that we are able to exploit the data efficiently - but creating a sensible subset of spectra to start an in-depth analysis of unidentified spectra has been very challenging," says Juan Antonio Vizcaíno, who leads the Proteomics team at EMBL-EBI. "We developed a comparatively lightweight computational approach that makes it much easier to detect sequences that have been incorrectly identified, or consistently observed but not identified. These ready-to-use collections of commonly unidentified spectra are a resource for the community, so that we can all pool our efforts to find lasting solutions for proteomics research."

The new algorithm will be used to improve quality control in the PRIDE Archive. The complete spectrum clustering results are available through the PRIDE Cluster resource, which aims to simplify further investigation into unidentified spectra.

###

Source article: Griss J., et al. (2016). Recognizing millions of consistently unidentified spectra across hundreds of shotgun proteomics datasets. Nature Methods (in press). DOI: 10.1038/nmeth.3902

Media Contact

Mary Todd Bergman
mary@ebi.ac.uk
44-788-137-7941

http://www.ebi.ac.uk 

Mary Todd Bergman | EurekAlert!
