Already, research teams have made discoveries using these well-curated data sets. One found that the broken gene linked to cystic fibrosis is expressed by a type of cell scientists had never come across before, while another identified the respiratory cells that are most vulnerable to SARS-CoV-2. Others are using the data to find new options for splicing genes to potentially correct disease-causing mutations in specific cells.
These discoveries are the first step in developing treatments for diseases, and we believe that AI can significantly accelerate the pace of discovery going forward.
The compute
To create a virtual cell, we’re building a high-performance computing cluster with 1000+ H100 GPUs that will enable us to develop new AI models trained on various large data sets about cells and biomolecules—including those generated by our scientific institutes. Over time, we hope, this will enable scientists to simulate every cell type in both healthy and diseased states, and query those simulations to see how elusive biological phenomena likely play out—including how cells come into being, how they interact across the body, and how exactly disease-causing changes affect them.
Our computing cluster won’t be as large as those used in the private sector for commercial products, but once it’s up and running, it will be one of the world’s largest AI clusters for nonprofit scientific research. This will be an important resource for academic teams that are ready to use data sets in new ways but are held back by the prohibitive cost of accessing the latest AI technology. Like our other tools, these digital cell models, and their associated data and applications, will be openly accessible to researchers worldwide.
The people
Generating these data sets, building this computing cluster, and applying AI to biology together make up the kind of multidisciplinary, collaborative effort that defines our work.
Our Biohub Network has brought together experts from different disciplines and institutions to tackle some of science’s biggest and riskiest challenges, which couldn’t be solved in traditional academic settings. Through projects like CELLxGENE, researchers around the world have helped build a single-cell data corpus—a testament to how effectively a shared resource for open science can grow with more collaborators contributing resources and brainpower.
When CZI first launched our science work in 2016, we committed to a big goal: to help the scientific community cure, prevent, or manage all disease by the end of this century. We believe this goal is possible and will be significantly advanced if leading scientists and technologists work together to make the most of the opportunities created by AI. We can start by unlocking the mysteries of our cells, and that can lead to work that helps end many diseases as we know them.
Priscilla Chan is cofounder and co-CEO of the Chan Zuckerberg Initiative. Priscilla’s work with patients and students in communities across the Bay Area as a pediatrician and teacher has informed her desire to make learning more personalized, find new paths to manage and cure disease, and expand opportunity for more people. Priscilla earned her BA in biology at Harvard University and her MD at UC San Francisco (UCSF).
Mark Zuckerberg is cofounder and co-CEO of the Chan Zuckerberg Initiative. As the founder, chairman, and chief executive officer of Meta, Mark brings a commitment to empowering people and building communities, and deep technical experience, to CZI’s work. Mark studied computer science at Harvard University before moving to Palo Alto, California, in 2004.