Rice University computer scientist Todd Treangen was awarded a $600,000 contract from the Intelligence Advanced Research Projects Activity (IARPA) to develop new DNA screening tools.
This award is the second phase of a $2.7 million contract from the Functional Genomic and Computational Assessment of Threats (Fun GCAT) program, which is managed by IARPA within the Office of the Director of National Intelligence.
“The goal of this project is to develop a robust software solution for assessing the threat potential of short DNA sequences targeted for rapid yet sensitive screening of oligonucleotides,” the assistant professor of Computer Science said.
“Specifically, we aspire to enhance DNA screening of sequences that might accidentally, or intentionally, be altered to result in a synthetic biological threat,” he said.
To accurately characterize the threat potential of short DNA sequences, Treangen partnered with experts in biology, machine learning, statistics, and software engineering. Christopher Jermaine, professor of Computer Science, explained why machine learning plays an important role in the research.
“Machine learning is an important part of determining the threat level, because when a database is searched for matches to a potentially dangerous sequence, more than one close match can be returned. Each match comes with a rich set of data, including English text, that can provide a clue as to the level of danger. Figuring out how to combine all of this data to synthesize a final threat level is a challenging learning problem,” Jermaine said.
“Our multidisciplinary team is highly motivated to improve bio-threat detection methods through the development of new tools. These tools are divided in two foundational tasks in bioinformatics which are taxonomic assignment and functional characterization,” Treangen said.
Large-scale data analysis in computational biology and bioinformatics presents its own challenges. Treangen’s research focuses on developing thrifty algorithms and data structures that cover the areas of multiple genome alignment, multiple genome assembly and multiple genome annotation.
“A few application areas of my research include microbial forensics, microbial ecology, and rapid pathogen screening. Furthermore, I have prioritized working hand in glove with biologists to develop bioinformatics software and analysis pipelines geared towards facilitating both exploratory and hypothesis-driven biological research,” Treangen said.