Match making: Computer Science connects genes to tissues

Olga Troyanskaya and her colleagues are making great strides in dissecting tiny collections of human cells, but they are not using scalpels.

They are using data.

Troyanskaya, working with a postdoctoral fellow and graduate students at Princeton, has developed a system that allows computers to “virtually dissect” a kidney in a way that surgery cannot. The computer uses data from an array of measurements in kidney biopsies to separate cells mathematically and identify genes that are turned on in a specific cell type.

“We call it in-silico nano-dissection,” said Troyanskaya, a professor of computer science and at the Lewis-Sigler Institute for Integrative Genomics.

The method has proven to be far faster and significantly more effective than current techniques. In findings published recently in the journal Genome Research, Troyanskaya’s group and a team of researchers at the University of Michigan led by Matthias Kretzler reported that they had identified 136 genes involved in the creation of a critical kidney cell called a podocyte. In decades of research, only 46 had been previously identified.

“The potential for this is huge,” said Behzad Najafian, a University of Washington assistant professor of pathology who specializes in renal pathology. “I believe this novel technique, which is a significant improvement in cell lineage-specific gene-expression analysis, will not only help us understand the pathophysiology of kidney diseases better through biopsy studies, but also provides a strong tool for discovery or validation of cell-specific urine or plasma biomarkers.”

The researchers focused on the glomerulus, an area of the kidney where the podocyte cells filter waste from the blood that will eventually leave the body as urine. One of the main reasons the researchers chose to track the podocytes is that the tiny cells are frequently involved in kidney disease. The researchers wanted to identify the genes active in the podocytes and thus determine which genes enable the cell to perform its filtering function, differentiating it from other cell types in the kidney.

It is not an easy job: even a biopsy precise enough to sample only the glomerulus leaves doctors with a mix of four cell types including the podocyte. This yields activity measurements for tens of thousands of molecular markers, called RNA.

“It’s a little more complicated than this, but you can think of RNA as the instructions that come from the DNA, and we need to identify which of these instructions are active in the podocytes,” said Casey Greene, who worked on the project as a postdoctoral researcher with Troyanskaya and is now an assistant professor of genetics at Dartmouth College.

Kretzler, a professor of internal medicine and computational medicine, and his team in Michigan first obtained data from the biopsies of 452 patients, with each sample containing RNA from roughly 20,000 genes. The more RNA found in the sample from a particular gene, the more active that gene.

By searching for patterns among the patients’ data, the team identified 136 genes linked to the podocytes. Two of those genes have been shown experimentally to be able to cause kidney disease. The computer’s identification of genes linked to podocytes was verified by staining the cell samples with antibodies – each of which reacts to a specific protein constructed from the RNA instructions. The researchers found that the computer’s predictions were 65 percent accurate. The accuracy of the best existing method, which involves experimentally isolating the podocyte cells in mice and measuring their expression patterns, is only 23 percent.

Troyanskaya said the goal is to train the computer to come up with a mathematical formula that captures what similar patterns have in common and what distinguishes them from other, unrelated patterns. It is essentially the same general approach that companies use to evaluate customers’ buying habits and suggest new movies or purchases.

“The genes that we know are specifically active in podocytes – they are the movies that we like,” Troyanskaya said.
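The recommendation analogy can be made concrete with a small sketch. The article does not spell out the group's actual formula, so the example below swaps in a much simpler stand-in: it averages the expression profiles of a few known marker genes across samples and ranks every other gene by how closely its profile correlates with that average. The gene names, the toy numbers, and the functions `pearson` and `rank_candidates` are all hypothetical and exist only for illustration.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no pattern to compare
    return cov / (var_x * var_y) ** 0.5


def rank_candidates(expression, known_markers):
    """Rank unlabeled genes by similarity to the average known-marker profile.

    expression: dict mapping gene name -> list of expression values,
                one value per biopsy sample (hypothetical layout).
    known_markers: set of gene names already known to be cell-type specific
                   (the "movies that we like").
    """
    n_samples = len(next(iter(expression.values())))
    # Average profile of the known markers across samples.
    centroid = [
        mean(expression[g][i] for g in known_markers) for i in range(n_samples)
    ]
    scores = {
        gene: pearson(profile, centroid)
        for gene, profile in expression.items()
        if gene not in known_markers
    }
    # Highest correlation first: these are the "recommended" genes.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data, invented for illustration: four samples, five genes.
expression = {
    "MARKER_A": [1.0, 2.0, 3.0, 4.0],
    "MARKER_B": [2.0, 4.0, 6.0, 8.5],
    "GENE_X": [0.5, 1.1, 1.4, 2.0],   # rises with the markers
    "GENE_Y": [4.0, 3.0, 2.0, 1.0],   # falls as the markers rise
    "GENE_Z": [2.0, 2.0, 2.0, 2.0],   # flat across samples
}
ranking = rank_candidates(expression, {"MARKER_A", "MARKER_B"})
for gene, score in ranking:
    print(gene, round(score, 3))
```

In this toy ranking, the gene whose activity rises and falls with the known markers comes first and the gene that moves in the opposite direction comes last. A real analysis would involve hundreds of samples, tens of thousands of genes, and a trained statistical model rather than a single correlation.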

Although the researchers used kidney cells, Troyanskaya said the program also will work with other cell types, including other solid tissues that cannot be experimentally microdissected in humans. The program is available free to researchers on Princeton’s website.

In addition to Greene, Troyanskaya’s team at Princeton included Young-suk Lee and Qian Zhu, graduate students in computer science and genomics. The group collaborated with Kretzler’s team at Michigan as well as researchers at the University of Zurich and at Fondazione IRCCS Ca’ Grande Ospedale Maggiore Policlinico in Milan.