SeqScreen can reveal ‘concerning’ DNA

Rice University computer scientists and their collaborators have developed SeqScreen, a program to screen short DNA sequences, whether synthetic or natural, to determine their toxicity.
Credit: Treangen Lab/Rice University

Open-source program IDs synthetic, naturally occurring gene sequences.

It’s a given that certain bacteria and viruses can cause illness and disease, but the real culprits are the sequences of concern that lie within the genomes of these microbes.

Calling them out is about to get easier.

Years of work by Rice University computer scientists and their colleagues have led to an improved platform for screening DNA and characterizing pathogenic sequences, whether naturally occurring or synthetic, before those sequences have the chance to impact public health.

Computer scientist Todd Treangen of Rice’s George R. Brown School of Engineering and genomic specialist Krista Ternus of Signature Science LLC led the study that produced SeqScreen, a program to accurately characterize short DNA sequences, often called oligonucleotides.

Treangen said SeqScreen is intended to improve the detection and tracking of a wide range of pathogenic sequences.

“SeqScreen is the first open-source software toolkit that is available for synthetic DNA screening,” Treangen said. “Our program improves upon the previous state of the art for companies, individuals and government agencies for their DNA screening practices.”

The study, which began as high-risk, high-payoff research funded by the Intelligence Advanced Research Projects Activity (IARPA) in 2017, appears in the journal Genome Biology.

SeqScreen takes advantage of work by partners at Austin, Texas-based company Signature Science to curate a database of thousands of gene sequences representing 32 types of virulence functions. “This curated database took years of biocuration and review to develop, and is at the core of the training data of SeqScreen’s machine learning algorithm,” Treangen said.

The company collaborated with Treangen last year to find SARS-CoV-2 mutations that may have made the omicron variant more resistant to antibodies, including those from vaccinations. “SeqScreen came first, and some of its ideas carried over to the COVID project,” he said. “But SeqScreen is much broader in scope.

“We focus on identifying functions of sequences of concern — which we call FunSoCs — whereas previous screening approaches were more concerned with looking at ‘are you this bacterium?’ or ‘are you this virus?’” Treangen said. “SeqScreen doesn’t focus on the names of which bacteria or viruses are in your sample. Rather, we want to know if there are sequences in that sample that could be harmful, such as toxins that can destroy human cells.”

Focusing on functions of concern is important, he said, because bacteria readily exchange DNA via horizontal gene transfer.

“We have highlighted examples in the publication of bacteria whose genomes are essentially identical, except one has a sequence of concern, such as a toxin, that the other does not,” Treangen said. “SeqScreen really hones in on the presence or absence of functions that represent virulence factors.”
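The contrast Treangen describes, screening by organism name versus screening by function, can be illustrated with a toy sketch. This is not SeqScreen's code or algorithm; the gene identifiers, function labels and sample names below are invented placeholders, and a plain lookup table stands in for real sequence annotation.

```python
from dataclasses import dataclass

# Hypothetical annotation table: gene identifier -> set of function labels.
# All identifiers and labels are invented for illustration only.
FUNCTION_ANNOTATIONS = {
    "gene_A": {"housekeeping"},
    "gene_B": {"housekeeping"},
    "gene_T": {"cytotoxicity"},
}

# Labels treated as functions of concern in this toy example.
FUNCTIONS_OF_CONCERN = {"cytotoxicity"}


@dataclass(frozen=True)
class Sample:
    taxon: str        # organism name assigned to the sample
    genes: frozenset  # identifiers of genes detected in the sample


def screen_by_taxon(sample, flagged_taxa):
    """Name-based screening: flag a sample only if its taxon is on a list."""
    return sample.taxon in flagged_taxa


def screen_by_function(sample):
    """Function-based screening: return genes annotated with a function of concern."""
    hits = set()
    for gene in sample.genes:
        labels = FUNCTION_ANNOTATIONS.get(gene, set())
        if labels & FUNCTIONS_OF_CONCERN:
            hits.add(gene)
    return hits


# Two strains with the same name; only the second carries the extra gene.
strain_1 = Sample("bacterium_X", frozenset({"gene_A", "gene_B"}))
strain_2 = Sample("bacterium_X", frozenset({"gene_A", "gene_B", "gene_T"}))

flagged = {"bacterium_X"}
print("taxon screen:", screen_by_taxon(strain_1, flagged), screen_by_taxon(strain_2, flagged))
print("function screen:", sorted(screen_by_function(strain_1)), sorted(screen_by_function(strain_2)))
```

In this sketch the name-based check cannot tell the two strains apart because they share a taxon, while the function-based check flags only the strain carrying the placeholder gene annotated as a function of concern.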

He said SeqScreen will also aid in the detection of novel or emerging pathogens from the environment.

Rice graduate students Advait Balaji and Bryce Kille are co-lead authors of the paper. Co-authors are Rice postdoctoral fellow Leo Elworth, alumni Zhiqin Qian and Dreycey Albin, and Santiago Segarra, an assistant professor of computer science; Anthony Kappell and Gene Godbold of Signature Science; Madeline Diep of Fraunhofer USA Center Mid-Atlantic, Riverdale, Maryland; and Daniel Nasko, Nidhi Shah and Mihai Pop of the University of Maryland.

The research was supported by The Intelligence Advanced Research Projects Activity (IARPA) via the Army Research Office (W911NF-17-2-0089).

-30-

Read the abstract at https://rdcu.be/cP2NI.

Access SeqScreen at https://gitlab.com/treangenlab/seqscreen.

Video of the SeqScreen poster session: https://www.youtube.com/watch?v=bqkjRxs-lIE

This news release can be found online at https://news.rice.edu/news/2022/seqscreen-can-reveal-concerning-dna.

Follow Rice News and Media Relations via Twitter @RiceUNews.

Related materials:

Omicron mutations may help SARS-CoV-2 evade antibodies: https://news.rice.edu/news/2021/omicron-mutations-may-help-sars-cov-2-evade-antibodies-0

Functional Genomic and Computational Assessment of Threats (IARPA program): https://www.iarpa.gov/research-programs/fun-gcat

Treangen Lab: https://www.treangenlab.com

Department of Computer Science: https://csweb.rice.edu

George R. Brown School of Engineering: https://engineering.rice.edu

Images for download:

https://news-network.rice.edu/news/files/2022/06/0606_SEQSCREEN-1-web.jpg

CAPTION: Rice University computer scientists and their collaborators have developed SeqScreen, a program to screen short DNA sequences, whether synthetic or natural, to determine their toxicity. (Credit: Treangen Lab/Rice University)

https://news-network.rice.edu/news/files/2022/06/0613_SEQSCREEN-2-WEB.jpg

CAPTION: Todd Treangen. (Credit: Rice University)

Located on a 300-acre forested campus in Houston, Rice University is consistently ranked among the nation’s top 20 universities by U.S. News & World Report. Rice has highly respected schools of Architecture, Business, Continuing Studies, Engineering, Humanities, Music, Natural Sciences and Social Sciences and is home to the Baker Institute for Public Policy. With 4,240 undergraduates and 3,972 graduate students, Rice’s undergraduate student-to-faculty ratio is just under 6-to-1. Its residential college system builds close-knit communities and lifelong friendships, just one reason why Rice is ranked No. 1 for lots of race/class interaction and No. 1 for quality of life by the Princeton Review. Rice is also rated as a best value among private universities by Kiplinger’s Personal Finance.

Journal: Genome Biology
DOI: 10.1186/s13059-022-02695-x
Method of Research: Data/statistical analysis
Subject of Research: Not applicable
Article Title: SeqScreen: accurate and sensitive functional screening of pathogenic sequences via ensemble learning
Article Publication Date: 20-Jun-2022
COI Statement: A.K., G.G., and K.T. are full-time employees of Signature Science LLC, a subsidiary of Southwest Research Institute. M.D. is a full-time employee of Fraunhofer USA Center Mid-Atlantic CMA. All other authors declare that they have no competing interests.

Media Contacts

Mike Williams
Rice University
mikewilliams@rice.edu
Office: 713-348-6728

Jeff Falk
Rice University
jfalk@rice.edu
Office: 713-348-6775
