Why spatial transcriptomics demands better data infrastructure
Spatial transcriptomics is more than an exciting new modality—it’s quickly becoming the expected one. As researchers study gene expression, they want to preserve spatial context, integrate with single-cell data and ask more complex biological questions than ever before. But with the power of spatial transcriptomics comes greater complexity and new challenges: bigger files, more modalities, more formats and more demand for real-time access to data. And as today’s data infrastructure struggles to address these challenges, the future of life sciences demands a more scalable approach.
These are the urgent issues we tackled in our latest TileDB Tech Talk on spatial data. We had the pleasure of speaking alongside Max Lombardo from the Chan Zuckerberg Initiative (CZI) and Peng He from the University of California, San Francisco (UCSF) as we explored what it really takes to power spatial transcriptomics at scale and drive discovery.
From single cell to spatial: Why our data models had to evolve
When TileDB-SOMA was built, the goal was to create a flexible, language-agnostic and cloud-native framework for managing single-cell and multiomics data. But spatial experiments changed the game. These complex datasets come with multiscale images, coordinate systems and spatial metadata—none of which fit neatly into the rows-and-columns world of typical single-cell formats.
That’s why we extended our SOMA model. Now, researchers can store not just count matrices, but also spot locations, multi-resolution image pyramids and even derived metadata like cell-type scores across experiments and species. Whether you’re working with Visium output today or planning to use imaging-based assays like Xenium down the line, the groundwork is in place: scalable data infrastructure built for spatial transcriptomics.
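To make that data model concrete, here is a minimal Python sketch of how counts, spot locations and an image pyramid can live side by side in one experiment. This is not the TileDB-SOMA API: the class names (`SpatialExperiment`, `PyramidLevel`), the `build_pyramid` helper, the gene names and the halving-per-level layout are all assumptions made up for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class PyramidLevel:
    """One resolution level of a tissue image (hypothetical structure)."""
    width: int
    height: int
    scale: float  # pixels at this level per full-resolution pixel


@dataclass
class SpatialExperiment:
    """Toy container: a count matrix, spot locations and an image pyramid."""
    genes: list[str]
    counts: dict[str, list[int]]             # spot id -> counts, one per gene
    spot_xy: dict[str, tuple[float, float]]  # spot id -> full-resolution (x, y)
    pyramid: list[PyramidLevel] = field(default_factory=list)

    def spot_on_level(self, spot_id: str, level: int) -> tuple[float, float]:
        """Map a spot's full-resolution coordinate onto a pyramid level."""
        x, y = self.spot_xy[spot_id]
        s = self.pyramid[level].scale
        return (x * s, y * s)


def build_pyramid(width: int, height: int, levels: int) -> list[PyramidLevel]:
    """Assumed layout: level 0 is full resolution, each level halves it."""
    return [
        PyramidLevel(max(1, width >> i), max(1, height >> i), 1.0 / (2 ** i))
        for i in range(levels)
    ]


exp = SpatialExperiment(
    genes=["GeneA", "GeneB"],
    counts={"spot_1": [5, 0], "spot_2": [2, 7]},
    spot_xy={"spot_1": (1024.0, 2048.0), "spot_2": (512.0, 256.0)},
    pyramid=build_pyramid(4096, 4096, levels=3),
)

# The same spot, located on the quarter-resolution image.
print(exp.spot_on_level("spot_1", 2))  # (256.0, 512.0)
```

The point of the sketch is the shared coordinate system: because every pyramid level records its scale relative to full resolution, a spot stored once can be drawn on any level of the image.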
Powering the CZI spatial census with TileDB
What’s remarkable about the census is not just its scale, more than 100 million cells and counting, but how standardized and open the data is. You can filter by tissue type, developmental stage or ontology-mapped cell type. You can query only the metadata or export full experiments for downstream analysis. You can even stream data in batches for ML workflows. That flexibility is exactly what researchers need from their data infrastructure.
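The access patterns described above, filtering on metadata and then streaming the matches in batches, can be sketched in a few lines of plain Python. This is an illustration only, not the census API: the `obs` records, the `filter_obs` and `batched` helpers and the field names are hypothetical.

```python
from itertools import islice
from typing import Any, Iterable, Iterator

# Hypothetical cell-level metadata; a real census holds far more fields.
obs = [
    {"cell_id": "c1", "tissue": "lung", "cell_type": "T cell"},
    {"cell_id": "c2", "tissue": "lung", "cell_type": "B cell"},
    {"cell_id": "c3", "tissue": "liver", "cell_type": "hepatocyte"},
    {"cell_id": "c4", "tissue": "lung", "cell_type": "T cell"},
    {"cell_id": "c5", "tissue": "brain", "cell_type": "neuron"},
]


def filter_obs(rows: Iterable[dict[str, Any]], **criteria: Any) -> Iterator[dict[str, Any]]:
    """Lazily yield the rows whose fields equal every given criterion."""
    return (r for r in rows if all(r.get(k) == v for k, v in criteria.items()))


def batched(rows: Iterable[dict[str, Any]], size: int) -> Iterator[list[dict[str, Any]]]:
    """Yield lists of up to `size` rows without loading everything at once."""
    it = iter(rows)
    while batch := list(islice(it, size)):
        yield batch


# Metadata-only query: lung cells, streamed two at a time.
for batch in batched(filter_obs(obs, tissue="lung"), size=2):
    print([row["cell_id"] for row in batch])
# ['c1', 'c2']
# ['c4']
```

Because both helpers are lazy, only one batch is held in memory at a time, which is the property that matters when the matching set is too large to load whole.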
Unlocking the science that matters with searchable metadata
But infrastructure isn’t the point. Discovery is.
What’s next for spatial data infrastructure?
We’re just getting started. We’re actively expanding our support for additional assay types, working on polygon-based geometries for imaging assays and gathering feedback from users to prioritize what’s next. If you’re in this space and have input, we want to hear from you.
At TileDB, our mission is to give researchers tools that match the ambition of their science. Spatial data requires more than storage—it requires structure, standards, and seamless integration. We’re proud to be building that foundation, in collaboration with amazing partners and an incredible community. Click here to watch the full Tech Talk.
Meet the authors
Devika Garg
Director of product marketing