The development of computational tools and methods to analyze and interpret DNA methylation has earned Rice University computer scientist Vicky Yao a prestigious National Science Foundation CAREER Award.
The five-year award, in this case worth $790,000, goes each year to fewer than 400 American academics who are expected to make significant contributions to their fields of study.
Yao, an assistant professor of computer science at Rice’s George R. Brown School of Engineering, plans to develop machine learning methods and build open source software to help biomedical researchers analyze DNA methylation, an important biological process by which a methyl group is added to a cytosine, one of DNA’s four bases. These small modifications affect gene expression and show region-specific patterns. Yet they’re dynamic, changing with age and in response to environmental factors such as air quality, diet and exercise.
This interests Yao, who wants to sift through the more than 28 million DNA methylation sites in the genome to find “fingerprints” representative of distinct tissues and cell types, and to learn how these fingerprints translate into essential downstream functions.
“I’m grateful for the NSF award because this is somewhat of a new direction for me,” said Yao, who joined Rice in 2019 with backing from the Cancer Prevention and Research Institute of Texas and has co-authored high-profile papers applying machine learning methods to uncover once-hidden molecular processes responsible for arthritis and neurological disease.
Methylation occurs throughout the body, and gaining a better understanding of this fundamental biological process will help researchers who study development, aging and disease, she said.
“DNA methylation is a natural interface between the environment and what happens on the DNA level, and there can be many downstream effects,” Yao said. “You inherit your DNA from your parents (your A, C, G and Ts), and these are fixed aside from mutations, which can cause disease. But methylation is a natural way to change or reverse things without adjusting the actual genome.
“It plays such a big role in regulation that it is often referred to as the ‘fifth base of DNA,’” she said. “Methylation clearly can change whether a gene is expressed or not, but it’s also relatively stable. This means we can use it as a biomarker to help orient where we are in the body and, interestingly, begin to pinpoint how environmental stimuli affect our cells.”
She said much of her research takes advantage of public genomics data repositories that span a wide variety of conditions and experimental setups. “One of the challenges is to combine different data types that measure methylation marks in different regions of the genome,” Yao said. “We need to first develop computational methods to integrate the data from different platforms to get a more complete picture of DNA methylation across the genome in different cells.”
Another part of the project will be to build software tools that allow biomedical researchers, even those with no programming experience, to explore methylation patterns and put them to use in their own work.
Yao said her group will adapt deep learning methods to infer methylation patterns, find location-specific hallmarks of methylation in healthy tissue and cells, and tie the underlying CpG sites (positions where a cytosine is immediately followed by a guanine along the DNA strand, and where methylation most often occurs) to specific biological functions.
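As a rough illustration of what a CpG site is, the short Python sketch below scans a DNA string for every cytosine directly followed by a guanine. The function name and the toy sequence are hypothetical, chosen only for this example; the sketch is not drawn from Yao's software and says nothing about whether a given site is actually methylated.

```python
def find_cpg_sites(sequence: str) -> list[int]:
    """Return the 0-based position of each C that is immediately followed by a G.

    Hypothetical helper for illustration only: it locates candidate CpG sites
    in a plain DNA string and does not measure methylation.
    """
    seq = sequence.upper()  # accept lowercase input as well
    return [i for i in range(len(seq) - 1) if seq[i] == "C" and seq[i + 1] == "G"]


# Toy sequence (made up for this example), not real genomic data.
toy_sequence = "ACGTCGGCG"
cpg_positions = find_cpg_sites(toy_sequence)
print(cpg_positions)
```

Real analyses work from measured methylation levels at such positions across many samples, which is where the data integration Yao describes comes in.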
“Getting this grant is really exciting for my group,” she said. “This project will open up new research directions that enable us to work on a lot of interesting downstream applications, like how environmental factors can affect individual cells.”
Photo credit: Ruth Dannenfelser.