Scientists develop faster, smarter way to classify tumors using single-cell technology

Dr. Stephen Lin, CIRM Senior Science Officer

By Dr. Stephen Lin

Single-cell.  It is the new buzzword in biology.  Single-cell biology refers to the in-depth characterization of individual cells in an organ or similar microenvironment.  Every organ, like the brain or heart, is composed of thousands to millions of cells.  Single-cell biology breaks those organs down into their individual cell components to study the diversity within those cells.  For example, the heart is composed of cardiomyocytes, but within that bulk population of cardiomyocytes there are specialized cardiomyocytes for the different chambers of the heart and others that control beating, plus others not even known yet.  Single-cell studies characterize cell-to-cell variability in the body down to this level of detail to gain knowledge of tissues in a way that was not possible before.   

The majority of single-cell studies are based on next-generation sequencing of genetic material such as DNA or RNA.  The cost of sequencing each base of DNA or RNA has dropped precipitously since the first draft of the human genome was announced in 2000, a decline often compared to the trend described by Moore’s Law in computing.  As a result, it is now possible to sequence the full set of genes expressed in an individual cell, called the transcriptome, for many thousands of cells.

The explosion of data coming from these technologies requires new approaches to study and analyze the information.  The volume of sequence data that can be generated is so large that scientists often can no longer interpret it manually, as was traditionally done.  To apply this exciting field to stem cell research and therapies, CIRM funded the Genomics Initiative, which created the Centers of Excellence in Stem Cell Genomics (CESCG).  The goal of the CESCG is to generate novel genomic information and develop new bioinformatics tools (i.e., computer software) specifically for stem cell research, some of which were highlighted in past blog posts.  Some of the earliest single-cell gene expression atlases of the human body were created under the CESCG.

The latest study from CESCG investigators creates both new information and new tools for single-cell genomics.  In work funded by the Genomics Initiative, Stephen Quake and colleagues at Stanford University and the Chan Zuckerberg Biohub studied tumor formation using single-cell approaches.  The work draws on one of the earliest published single-cell studies, in which the team surveyed transcriptome diversity in the human brain, including samples from the brain cancer glioblastoma.

Recognizing that the data from such studies would eventually become too large and numerous for all of the cell types to be classified by hand, the team created a new bioinformatics tool called Northstar, which applies artificial intelligence to automatically classify the cell types found in single-cell studies.  The cell classifications generated by Northstar were similar to the original classifications created manually several years ago, including the identification of specific cancerous cells.
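For readers who want a feel for what "automatically classifying cell types" can mean in practice, here is a minimal sketch.  It is not Northstar's actual algorithm.  It is a much simpler stand-in that compares each cell's gene expression to the average profile of each known cell type in a reference atlas, and leaves the cell unassigned if nothing matches well.  The atlas values, the four-gene profiles, the function names and the 0.8 similarity cutoff are all made up for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no information about similarity
    return cov / sqrt(var_x * var_y)


def classify_cell(expression, atlas, min_similarity=0.8):
    """Return the best-matching atlas cell type, or "unassigned".

    `atlas` maps a cell-type name to its average expression profile.
    `min_similarity` is an arbitrary cutoff chosen for this sketch.
    """
    best_type, best_score = None, -1.0
    for cell_type, profile in atlas.items():
        score = pearson(expression, profile)
        if score > best_score:
            best_type, best_score = cell_type, score
    if best_score < min_similarity:
        return "unassigned"  # possibly a cell type missing from the atlas
    return best_type


# Hypothetical atlas: average expression of four invented genes per cell type.
ATLAS = {
    "neuron": [9.0, 1.0, 0.5, 0.2],
    "astrocyte": [0.5, 8.0, 1.0, 0.3],
}

if __name__ == "__main__":
    print(classify_cell([8.5, 1.2, 0.4, 0.1], ATLAS))
    print(classify_cell([0.2, 0.3, 0.4, 9.0], ATLAS))
```

In this toy example the first cell closely resembles the "neuron" profile and is labeled accordingly, while the second cell resembles neither atlas type and is left unassigned.  Cells that match nothing in the reference are the ones that could belong to a new cell type or to a tumor.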

Northstar is a powerful bioinformatics tool for these studies for several reasons: the software scales to large numbers of cells, it classifies cells very quickly, and it needs relatively little computer processing power to work through millions of data points.

The scalability of the tool was demonstrated on the Tabula Muris data collection, a single-cell compendium of 20 mouse organs with data from over 200,000 cells.  Finally, Northstar was used to classify tumors in new single-cell data generated by the CESCG from samples of 11 pancreatic cancer patients obtained from Stanford Hospital.  Northstar correctly identified the origins of cancerous cells in line with each patient's specific diagnosis, for example finding cancerous cells in the endocrine cell lineage in a patient diagnosed with neuroendocrine pancreatic cancer.  Furthermore, Northstar identified previously unknown origins of cancerous cell clusters in other patients with pancreatic cancer.  These new computational tools demonstrate how big data from genomic studies can become an important contributor to personalized medicine.

The full study was published in Nature.

CIRM & CZI & MOU for COVID-19

Too many acronyms? Not to worry. It is all perfectly clear in the news release we just sent out about this.

A new collaboration between the California Institute for Regenerative Medicine (CIRM) and the Chan Zuckerberg Initiative (CZI) will advance scientific efforts to respond to the COVID-19 pandemic by disseminating single-cell research that scientists can use to better understand the SARS-CoV-2 virus and help develop treatments and cures.

CIRM and CZI have signed a Memorandum of Understanding (MOU) that will combine CIRM’s infrastructure and data collection and analysis tools with CZI’s technology expertise. It will enable CIRM researchers studying COVID-19 to easily share their data with the broader research community via CZI’s cellxgene tool, which allows scientists to explore and visualize measurements of how the virus impacts cell function at a single-cell level. CZI recently launched a new version of cellxgene and is supporting the single-cell biology community by sharing COVID-19 data, compiled by the global Human Cell Atlas effort and other related efforts, in an interactive and scalable way.

“We are pleased to be able to enter into this partnership with CZI,” said Dr. Maria T. Millan, CIRM’s President & CEO. “This MOU will allow us to leverage our respective investments in genomics science in the fight against COVID-19. CIRM has a long-standing commitment to generation and sharing of sequencing and genomic data from a wide variety of projects. That’s why we created the CIRM genomics award and invested in the Stem Cell Hub at the University of California, Santa Cruz, which will process the large complex datasets in this collaboration.”  

“Quickly sharing scientific data about COVID-19 is vital for researchers to build on each other’s work and accelerate progress towards understanding and treating a complex disease,” said CZI Single-Cell Biology Program Officer Jonah Cool. “We’re excited to partner with CIRM to help more researchers efficiently share and analyze single-cell data through CZI’s cellxgene platform.”

In March 2020, the CIRM Board approved $5 million in emergency funding to target COVID-19. To date, CIRM has funded 17 projects, some of which are studying how the SARS-CoV-2 virus impacts cell function at the single-cell level.

Three of CIRM’s early-stage COVID-19 research projects plan to participate in this partnership by sharing data and analysis on cellxgene.

  • Dr. Evan Snyder and his team at Sanford Burnham Prebys Medical Discovery Institute are using induced pluripotent stem cells (iPSCs), a type of stem cell that can be created by reprogramming skin or blood cells, to create lung organoids. These lung organoids will then be infected with the novel coronavirus in order to test two drug candidates for treating the virus.
  • Dr. Brigitte Gomperts at UCLA is studying a lung organoid model made from human stem cells in order to identify drugs that can reduce the number of infected cells and prevent damage in the lungs of patients with COVID-19.
  • Dr. Justin Ichida at the University of Southern California is trying to determine if a drug called a kinase inhibitor can protect stem cells in the lungs and other organs, which the novel coronavirus selectively infects and kills.

“Cumulative data into how SARS-CoV-2 affects people is so powerful to fight the COVID-19 pandemic,” said Stephen Lin, PhD, the Senior CIRM Science Officer who helped develop the MOU. “We are grateful that the researchers are committed to sharing their genomic data with other researchers to help advance the field and improve our understanding of the virus.”

CZI also supports five distinct projects studying how COVID-19 progresses in patients at the level of individual cells and tissues. This work will generate some of the first single-cell biology datasets from donors infected by SARS-CoV-2 and provide critical insights into how the virus infects humans, which cell types are involved, and how the disease progresses. All data generated by these grants will quickly be made available to the scientific community via open access datasets and portals, including CZI’s cellxgene tool.