Unidentified spectra detector

28.06.2016

Detecting millions of consistently unidentified spectra in vast tracts of proteomics data is possible with a new algorithm developed at EMBL-EBI

A new algorithm clusters the millions of peptide mass spectra in the PRIDE Archive public database, making it easier to detect millions of consistently unidentified spectra across different datasets. Published in Nature Methods, the new tool is an important step towards fully exploiting data produced in discovery proteomics experiments.

"Creating a sensible subset of spectra to start an in-depth analysis of unidentified spectra has been very challenging," says Juan Antonio Vizcaíno of EMBL-EBI.

Credit: Illustration by Spencer Phillips, EMBL-EBI

On average, almost three quarters of spectra measured in discovery proteomics experiments remain unidentified, regardless of the quality of the experiment, as they cannot be interpreted by standard sequence-based search engines.

Alternative approaches to improve the rate of identification exist, but are fraught with disadvantages including ambiguous results. In today's study, researchers working on the PRIDE Archive public repository of proteomics data present a large-scale 'spectrum clustering' solution that takes advantage of the growing number of mass spectrometry (MS) datasets to systematically study millions of unidentified spectra.
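
To illustrate the general idea behind spectrum clustering, the following is a minimal sketch that groups similar spectra by the cosine similarity of their binned peaks. It is not the algorithm described in the study: the function names, the 1.0 m/z bin width, the 0.8 similarity threshold and the toy peak lists are all assumptions chosen for illustration, and the sketch ignores details such as precursor mass that a real tool would take into account.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins and scale to unit length."""
    bins = defaultdict(float)
    for mz, intensity in peaks:
        bins[int(mz / bin_width)] += intensity
    norm = math.sqrt(sum(v * v for v in bins.values()))
    return {b: v / norm for b, v in bins.items()} if norm else {}


def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors (dicts)."""
    if len(a) > len(b):
        a, b = b, a
    return sum(v * b.get(k, 0.0) for k, v in a.items())


def greedy_cluster(spectra, threshold=0.8):
    """Put each spectrum into the first cluster whose representative is similar enough.

    spectra: list of peak lists, each [(mz, intensity), ...].
    Returns a list of clusters, each a list of spectrum indices.
    The threshold is a hypothetical value, not one taken from the study.
    """
    representatives = []  # binned vector of the first member of each cluster
    clusters = []
    for idx, peaks in enumerate(spectra):
        vec = bin_spectrum(peaks)
        for c, rep in enumerate(representatives):
            if cosine(vec, rep) >= threshold:
                clusters[c].append(idx)
                break
        else:
            # No existing cluster is similar enough, so start a new one.
            representatives.append(vec)
            clusters.append([idx])
    return clusters


# Toy data: the first two spectra share peaks, the third does not.
spectra = [
    [(100.1, 10.0), (200.2, 50.0), (300.3, 20.0)],
    [(100.2, 12.0), (200.1, 48.0), (300.4, 22.0)],
    [(150.0, 30.0), (250.0, 30.0), (350.0, 5.0)],
]
clusters = greedy_cluster(spectra)
print(clusters)
```

With this toy input, the first two spectra fall into one cluster and the third forms its own. Comparing every spectrum against every cluster representative would not scale to millions of spectra, which is one reason a purpose-built approach is needed for an archive of this size.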

"MS experiments produce huge amounts of data, but identifying meaningful sequences that could be assigned to specific biological functions can be troublesome," says Johannes Griss, formerly at EMBL-EBI in the UK and now at the Medical University of Vienna, Austria.

One of the challenges with these technologies is that a large proportion of the data generated can't be interpreted, as they correspond to peptides that have not yet been observed and are not available in databases. Such spectra could correspond to peptide variants derived from individual genetic variation, or to peptides containing post-translational modifications, which are essential for the biological functions of proteins.

"What we have now is an algorithm that shows us patterns, or groups of spectra, that we've consistently missed, and helps us figure out which ones are good enough to pursue," adds Johannes. "It's a valuable tool that helps us unpick what's going on in proteomics, so we can better understand basic biological processes."

The team used the approach to recognise 9 million consistently unidentified spectra, which can make post-translational modifications and peptides containing sequence variants more discoverable. They identified three distinct sets of spectra: those that have been incorrectly identified, those that are not of high enough quality to identify properly, and those that are truly unidentified. They also combined their new approach with other methods to identify roughly 20% of the originally unidentified spectra in the public archive.
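
Once spectra are grouped, the identifications attached to each cluster's members hint at which category it belongs to. The sketch below shows one way this could be done, under stated assumptions: the function name, the status labels, the minimum cluster size of 3 and the 70% agreement ratio are all hypothetical and are not taken from the study. Judging spectrum quality would need the peak data itself, so that category is not covered here.

```python
from collections import Counter


def summarise_cluster(labels, min_size=3, min_ratio=0.7):
    """Summarise a cluster from its members' identifications.

    labels: one entry per spectrum, a peptide sequence string or None
    when the spectrum was not identified. Thresholds are illustrative.
    """
    if len(labels) < min_size:
        return {"status": "too_small"}
    identified = [p for p in labels if p is not None]
    if not identified:
        # Many similar spectra, none of them identified.
        return {"status": "consistently_unidentified"}
    peptide, count = Counter(identified).most_common(1)[0]
    if count / len(identified) < min_ratio:
        # No single peptide dominates the identified members.
        return {"status": "ambiguous"}
    return {
        "status": "identified",
        "consensus": peptide,
        # Members identified as something else are candidates for review.
        "conflicting": sum(1 for p in identified if p != peptide),
        # Unidentified members could tentatively inherit the consensus.
        "unidentified": len(labels) - len(identified),
    }


unknown = summarise_cluster([None, None, None, None])
mixed = summarise_cluster(["PEPTIDEK", "PEPTIDEK", "PEPTIDEK", "PEPTIDER", None])
print(unknown)
print(mixed)
```

In this toy example the first cluster is flagged as consistently unidentified, while the second has a clear consensus peptide, one conflicting identification that may be incorrect, and one unidentified spectrum that might be assigned the consensus.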

"Discovery proteomics is a mature technology, and it's crucial that we are able to exploit the data efficiently - but creating a sensible subset of spectra to start an in-depth analysis of unidentified spectra has been very challenging," says Juan Antonio Vizcaíno, who leads the Proteomics team at EMBL-EBI. "We developed a comparatively lightweight computational approach that makes it much easier to detect sequences that have been incorrectly identified, or consistently observed but not identified. These ready-to-use collections of commonly unidentified spectra are a resource for the community, so that we can all pool our efforts to find lasting solutions for proteomics research."

The new algorithm will be used to improve quality control in the PRIDE Archive. The complete spectrum clustering results are available through the PRIDE Cluster resource, which aims to simplify further investigation into unidentified spectra.

###

Source article: Griss J., et al. (2016). Recognizing millions of consistently unidentified spectra across hundreds of shotgun proteomics datasets. Nature Methods (in press). DOI: 10.1038/nmeth.3902

Media Contact

Mary Todd Bergman
mary@ebi.ac.uk
44-788-137-7941

http://www.ebi.ac.uk 

Mary Todd Bergman | EurekAlert!
