How the Cancer Genome Atlas Drives Groundbreaking Research
Bioinformatics is an interdisciplinary field that combines biology, computer science, and statistics to analyze and interpret biological data. But for the field to advance, it needs one critical component: data.
The Cancer Genome Atlas (TCGA) was a landmark research project that aimed to comprehensively understand the genomic alterations associated with various types of cancer. A collaborative effort between the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI), it was above all a massive data-gathering undertaking.
Beginning in 2005, TCGA set out to collect and analyze large-scale genomic data from thousands of tumor samples across multiple cancer types. The project focused on characterizing the genetic mutations, gene expression patterns, and epigenetic modifications present in cancer cells. This comprehensive approach gave researchers valuable insights into the underlying molecular mechanisms driving cancer development and progression.
“The goal was to take the most common tumor types and fully molecularly profile them. It ended up being 33 tumor types. All those tumor types have some level of clinical annotation, DNA sequencing, RNA sequencing, and, later on, methylation analysis. The idea was that if we take all of this and integrate it together, what can we learn holistically about these cancers?” asks Dr. Jeffrey Damrauer, assistant professor and researcher in the Department of Medicine at the University of North Carolina Lineberger Comprehensive Cancer Center.
TCGA was launched shortly after the Human Genome Project was completed in 2003. It was a massive undertaking with lofty goals: “The idea was that although we have the power individually to look at these different tumors, what if we take all of the molecular features, combine them, and then do an analysis to see the variation across tumors? Do all tumors have the same recurrent mutations? Is there a subset that has this mutation versus that mutation? And how does their outcome vary? Or how does that affect the genes that are expressed?” asks Dr. Damrauer.
Meet the Expert: Jeffrey Damrauer, PhD
Dr. Jeffrey Damrauer is an assistant professor and researcher in the Department of Medicine at the University of North Carolina Lineberger Comprehensive Cancer Center. He also serves as a co-leader for the Bladder Cancer Research Program, where he plays a pivotal role in overseeing the group’s bioinformatics endeavors.
Dr. Damrauer has a long track record in bladder cancer research, including identifying and characterizing molecular subtypes of muscle-invasive bladder cancer. He also discovered a unique gene expression signature that predicts response to BCG therapy. His research has relied heavily on the data produced for The Cancer Genome Atlas.
The Importance of TCGA
The importance of TCGA, while little known to the general public, cannot be overstated. The undertaking was huge not only financially but also logistically: “There’s no way this could have been done without the support of the National Institutes of Health and the National Cancer Institute. Sometimes government bureaucracy gets in the way, but sometimes you need that institutional knowledge of how to execute a massive international collaboration to even just get it off the ground,” says Dr. Damrauer. The collaboration fostered as part of this project resulted in significant research and many published papers, some of which had more than 300 authors from different institutions.
TCGA had to clear many logistical hurdles: “How do we collect the tumors? How do we centrally review them? Where are they going to live? Who will ship them to whatever institution is doing a certain analysis? All of that takes money and knowledge,” explains Dr. Damrauer. “If it were left up to a private company like pharma, that data would not become public because they paid for it. Whereas the National Cancer Institute generated it for the community.”
“They knew they were one of the only agencies that could fund a project this large, and recognized the need to do it for the sake of science. They didn’t know the end result, but just having the data would allow researchers not to have to spend their funds doing this,” he adds. “My career to this point has been predicated on the fact that there is publicly available data to answer these questions. It’s been helpful to me not having to invest the money in generating that data. Funding is tight now, so I don’t know how many more of these projects there are in the future.”
Public Access to TCGA Data
One of the most important aspects of TCGA has been continued access to the data once the project's papers were published: “Because it was an NIH-funded project, all the data is publicly available. Following the publication of the papers, the data is available for anyone to use. A real strength of it was not only are they describing the tumor, but you have access to the raw data collected,” explains Dr. Damrauer.
“TCGA has accounted for the publications directly from the working group, but then probably tens of thousands of other publications which have used that data. It’s a massive resource both for the primary generation of hypotheses and answering questions, but also for validation data. When we have our own cohort and see a pattern, we can then go back to that data to benchmark it against it.”
Improvements in Cancer Care Thanks to TCGA
The data gathering accomplished by TCGA has had a measurable impact on cancer treatment: “The major classes of cancer therapy include chemotherapies, which are cytotoxic and are nondiscriminatory cell-killing compounds, immunotherapies, which ramp up the immune system to target specific cell types, and what is commonly known as precision medicine, or targeted therapy. The notion of targeted therapy is we can identify a vulnerability within a cancer cell that is not present within a normal cell. These drugs go in and specifically inhibit a specific protein. It is often referred to as a silver bullet drug because it is one compound with a long, durable response to it. Many of these newer targeted therapies are due to TCGA,” says Dr. Damrauer.
The depth of the data gathered for TCGA has also allowed researchers to compare genes from different cancers and discover similarities, such as two classes of proteins present in both bladder and breast cancer, opening up treatment options. “TCGA launched all of these new alterations that people are now asking how can we target those? Because the TCGA tumor type groups are so large, we can see what proportion of patients have [them] and prioritize them,” shares Dr. Damrauer.
There are two ways to prioritize the findings uncovered by TCGA. “First, you can say, ‘This mutation is occurring in 60 percent of patients, so we should try to target that,’ or you could say, ‘We have this drug that works against this mutation in this cancer type, but because we sequence so many of these cancers in TCGA, we now have uncovered some patients have that mutation in a different cancer type.’ Even though those tissues may not be related, and no one would have ever thought to use breast cancer drugs to treat prostate cancer, we see that there’s a similar alteration, so maybe we can repurpose that drug to do this,” he adds.
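The two prioritization strategies Dr. Damrauer describes can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the patient records, the gene names (GENE_A, GENE_B, GENE_C), the drug mapping, and the function names are hypothetical and are not TCGA data or any TCGA tool. The first function computes what share of patients in each cancer type carry an alteration, the second lists alterations that clear a frequency cutoff, and the third flags alterations that already have a drug in one cancer type and also appear in another.

```python
from collections import defaultdict

# Hypothetical toy records: (patient_id, cancer_type, altered genes).
# Made-up values for illustration only; this is not TCGA data.
SAMPLES = [
    ("P1", "bladder", {"GENE_A", "GENE_B"}),
    ("P2", "bladder", {"GENE_A"}),
    ("P3", "bladder", {"GENE_A", "GENE_C"}),
    ("P4", "bladder", {"GENE_B"}),
    ("P5", "bladder", set()),
    ("P6", "breast", {"GENE_B"}),
    ("P7", "breast", {"GENE_B", "GENE_C"}),
    ("P8", "breast", {"GENE_A"}),
    ("P9", "breast", set()),
]

# Hypothetical mapping: gene -> cancer type where a targeted drug is already used.
EXISTING_DRUGS = {"GENE_B": "breast"}


def alteration_frequencies(samples):
    """Return {cancer_type: {gene: fraction of that type's patients altered}}."""
    totals = defaultdict(int)
    counts = defaultdict(lambda: defaultdict(int))
    for _patient_id, cancer, genes in samples:
        totals[cancer] += 1
        for gene in genes:
            counts[cancer][gene] += 1
    return {
        cancer: {gene: n / totals[cancer] for gene, n in counts[cancer].items()}
        for cancer in totals
    }


def frequent_targets(freqs, cancer, threshold):
    """Strategy 1: genes altered in at least `threshold` of patients, most frequent first."""
    hits = [(g, f) for g, f in freqs.get(cancer, {}).items() if f >= threshold]
    return sorted(hits, key=lambda item: (-item[1], item[0]))


def repurposing_candidates(freqs, existing_drugs):
    """Strategy 2: (gene, drug_used_in, other_cancer, frequency) for each
    drug-targeted gene that is also altered in a different cancer type."""
    candidates = []
    for gene, used_in in existing_drugs.items():
        for cancer, gene_freqs in freqs.items():
            if cancer != used_in and gene in gene_freqs:
                candidates.append((gene, used_in, cancer, gene_freqs[gene]))
    return sorted(candidates)


if __name__ == "__main__":
    freqs = alteration_frequencies(SAMPLES)
    print(frequent_targets(freqs, "bladder", 0.6))
    print(repurposing_candidates(freqs, EXISTING_DRUGS))
```

In this toy cohort, the first strategy surfaces the alteration carried by three of the five bladder patients, while the second notes that an alteration with an existing drug in breast cancer also shows up in a share of bladder patients. A real analysis would need far larger cohorts and clinical validation before any drug was repurposed.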
The Genome Data Analysis Network (GDAN)
While the work for TCGA is complete, many projects and studies have spun off from it, including a direct successor. “There’s a second iteration of TCGA, called the Genome Data Analysis Network (GDAN). What that’s doing is the second step. Now that we’ve done all the sequencing, we are looking for what other things we can extract from that data and what other things that we can use those experiences to help inform,” explains Dr. Damrauer.
He continues, “For example, there’s a subset of lung cancers that didn’t have the classic driver mutations you would expect. So the question is, if we then went into those in the extensive sequencing, can we find other things that account for why those were caused? Other than what’s traditionally thought of because we know that it doesn’t have those traditional mutations.”