Interpreting the genome for insights into life and health

By Steven Schultz
April 07, 2020

Olga Troyanskaya uses computing power to glean useful insights from an enduring mystery: How can the genome, the vast set of instructions that lies within every cell, be interpreted to understand, prevent, and treat disease?

Since the human genome was decoded in 2003, scientists have labored to relate its 3 billion bits of information to health and illness. Using machine learning and artificial intelligence, the Troyanskaya lab sheds light on the entire genome and the networks of interactions within it. Their work has produced important insights into cancer, autism, heart disease, and other disorders.

In one recent study, the group analyzed the genomes of nearly 1,800 families and found thousands of mutations that could contribute to autism. These mutations were not in genes themselves, but in the largely uncharted stretches of DNA that regulate how genes function. 

In another area of work, the lab has analyzed more than 10,000 publicly available data sets and created hundreds of maps of cells and circuits that allow scientists to predict how the effects of mutations would ripple through the many processes in various tissues or organs, such as the brain or liver. 

“Data-driven machine learning approaches are absolutely critical for turning the vastness of publicly available data into knowledge for health and medicine,” said Troyanskaya, a professor of computer science who is jointly appointed at Princeton’s Lewis-Sigler Institute for Integrative Genomics and the Simons Foundation in New York.