Single-cell analysis: uses, techniques, steps and more

Originally published: Feb 6, 2026

Table Of Contents:

What is single-cell analysis?

What is single-cell analysis used for?

Why is single-cell analysis important?

What techniques are used in single-cell analysis?

What are the steps involved in single-cell analysis?

What databases are used for single-cell analysis?

What software is used for single-cell analysis?

Single-cell analysis refers to experimental and computational methods used to measure molecular features like gene expression, chromatin accessibility, or protein abundance in individual cells. Single-cell analysis is important because it reveals unique cellular traits that bulk genetic sequencing can miss, enabling researchers to map cell states, identify rare cell types with important functions, and better understand complex biological processes like cancer and other diseases. Single-cell analysis is used across microbial genomics, immunology, developmental biology, and cancer research to characterize cell lineages, tumor microenvironments, host–pathogen interactions, and other dynamics. 

Common single-cell analysis techniques include scRNA-seq, scATAC-seq, spatial transcriptomics, and multimodal single-cell assays. Analyzing single-cell data typically involves isolating individual cells from a population, extracting and amplifying their genetic material, preparing and sequencing a library for each cell, and finally performing computational analysis of the resulting data to gain useful insights. Widely used databases for single-cell analysis include the CZ CELLxGENE Census, the Human Cell Atlas, and Gene Expression Omnibus. Let’s begin this guide to single-cell analysis with its definition and use in life sciences research.

What is single-cell analysis?

Single-cell analysis measures molecular features one cell at a time rather than averaging them across a population. Researchers have applied it to a wide variety of life sciences questions. Examples include dissecting tumor heterogeneity in cancer by mapping malignant and immune cell states, and charting cell-type landscapes across tissues and microbial communities through high-throughput single-cell sequencing of marine organisms.

What is single-cell analysis used for?

Single-cell analysis is used to better understand the molecular and functional diversity within cell populations, helping researchers identify rare cell types, reconstruct developmental lineages, examine disease mechanisms, and understand how cells change in response to environmental changes or therapeutic stimuli. By exploring biology at the resolution of one cell, single-cell analysis provides insights that bulk genetic sequencing cannot capture, especially in heterogeneous systems such as tumors, inflamed tissues, and microbial ecosystems. 

Key uses of single-cell analysis include:

  • Microbial genomics: Characterizing strain diversity and metabolic capabilities within complex microbial communities. This helps researchers understand ecological interactions and identify low-abundance species that have an outsized influence on community behavior.

  • Immunology: Defining immune cell states and activation pathways during infection, vaccination, and/or immunotherapy. This enables precise mapping of immune dynamics to better understand disease mechanisms as well as identify functional subpopulations linked to pathology.

  • Developmental biology: Reconstructing cell-fate trajectories and identifying progenitor populations across developmental stages. This helps resolve lineage relationships to show researchers how these cells drive tissue formation.

  • Cancer research: Profiling tumor heterogeneity and the tumor microenvironment to identify treatment-resistant or metastatic subclones. These insights support biomarker discovery and help clinicians deliver more effective targeted therapies for specific cancers.

Why is single-cell analysis important?

Single-cell analysis is important because it offers a high-resolution view of molecular information from individual cells, revealing hidden heterogeneity, unique cell states, and rare subpopulations. This generates insights that bulk sequencing of millions of cells cannot, showing researchers critical variations in heterogeneous systems like tumors to help develop new drugs and targeted therapies. For example, in the study “Single‑Cell Transcriptomic Analysis of Tumor Heterogeneity,” researchers used high-throughput single-cell RNA sequencing (scRNA-seq) to statistically analyze intratumoral diversity in cancer tissues.

Microbiologists have also used single-cell analysis to research microbial species that cannot be grown in lab conditions. The paper “High‑throughput, single‑microbe genomics with strain resolution” describes how researchers developed a high-throughput technique called Microbe-seq to analyze single bacterial cells from the human gut microbiome. This provided insights into strain-level diversity, revealing a phage association as well as the limits on horizontal gene-transfer events between strains.

By enabling these insights, single-cell analysis helps life sciences research better understand drug and disease mechanisms, develop precise therapeutic targets, and map complex ecosystems at scale. Next, let’s examine key techniques used in single-cell analysis.

What techniques are used in single-cell analysis?

The techniques used in single-cell analysis depend on factors like the biological questions researchers are pursuing, the molecular modality of interest (such as RNA, chromatin, protein or spatial context), the required resolution, and practical factors like throughput, cost, and sample quality. The following approaches each illuminate different layers of cellular biology, and together can enable a holistic view of cell state, identity, and function.

scRNA-seq (single-cell RNA sequencing)
scRNA-seq measures gene expression by capturing the transcriptome of individual cells to define cell types, states, and transcriptional programs. As described in a Frontiers in Genetics study, this technique usually involves isolating cells, reverse-transcribing their mRNA, barcoding transcripts, and then sequencing them at scale. This produces an expression matrix that researchers then filter, normalize, cluster and otherwise examine to identify biologically meaningful populations. scRNA-seq is key for mapping tissues, quantifying cellular responses to gene or drug treatments, reconstructing developmental trajectories, and identifying rare cell types. Because of its high throughput and well-established computational ecosystem, scRNA-seq is one of the most widely used single-cell technologies.
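
As a rough illustration of the filter and normalize steps described above, here is a minimal pure-Python sketch on a tiny invented count matrix. The gene names, counts, and thresholds (`min_counts`, `target_sum`) are assumptions chosen for readability, and the scale-then-log approach is one common convention rather than the method of any particular study. In practice, frameworks such as Seurat or Scanpy handle these steps.

```python
import math

# Toy cell-by-gene count matrix (rows = cells, columns = genes); values are invented.
genes = ["GeneA", "GeneB", "GeneC"]
counts = {
    "cell_1": [10, 0, 90],
    "cell_2": [5, 5, 40],
    "cell_3": [0, 1, 1],  # very few counts: possibly an empty droplet or a dying cell
}

def filter_cells(matrix, min_counts):
    """Keep only cells whose total count reaches min_counts."""
    return {cell: row for cell, row in matrix.items() if sum(row) >= min_counts}

def normalize_log1p(matrix, target_sum=10_000):
    """Scale each cell to target_sum total counts, then apply log(1 + x)."""
    normalized = {}
    for cell, row in matrix.items():
        total = sum(row)
        normalized[cell] = [math.log1p(x / total * target_sum) for x in row]
    return normalized

kept = filter_cells(counts, min_counts=20)
log_norm = normalize_log1p(kept)
for cell, row in log_norm.items():
    print(cell, [round(value, 2) for value in row])
```

Scaling every cell to the same total makes expression values comparable between cells that were sequenced to different depths, and the log transform dampens the influence of a few highly expressed genes before clustering.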

scATAC-seq (single-cell assay for transposase-accessible chromatin sequencing)
scATAC-seq profiles chromatin accessibility at single-cell resolution by using Tn5 transposase to insert sequencing adapters into open regions of the genome. As a 2023 study explains, these accessible sites reveal regulatory elements such as promoters and enhancers, enabling researchers to infer transcription-factor activity and the architecture regulating cell states. After library preparation and sequencing, bioinformatic pipelines quantify accessibility peaks, reduce dimensionality, and cluster cells based on regulatory profiles. scATAC-seq is especially valuable for studying gene-regulatory dynamics, linking regulatory elements to phenotypic outcomes, and complementing transcriptomic analyses to build models of cell identity.
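
To make the "quantify accessibility peaks" step more concrete, the sketch below counts how many sequenced fragments from each cell overlap each peak, producing a small cell-by-peak matrix. The peak coordinates, barcodes, and fragments are hypothetical, and the brute-force overlap loop is an assumption made for clarity; production pipelines use indexed interval lookups instead.

```python
from collections import defaultdict

# Hypothetical peak set: (chromosome, start, end), treated as half-open intervals.
peaks = [("chr1", 100, 200), ("chr1", 500, 650), ("chr2", 50, 120)]

# Hypothetical fragments: (cell barcode, chromosome, start, end).
fragments = [
    ("AAAC", "chr1", 150, 180),
    ("AAAC", "chr1", 190, 260),
    ("AAAC", "chr2", 300, 350),  # falls outside every peak
    ("TTTG", "chr1", 600, 700),
    ("TTTG", "chr2", 60, 90),
]

def overlaps(start_a, end_a, start_b, end_b):
    """Return True when two half-open intervals share at least one base."""
    return start_a < end_b and start_b < end_a

def count_fragments_in_peaks(fragments, peaks):
    """Build a cell-by-peak matrix of overlapping fragment counts.

    Cells with no fragment in any peak do not appear in the result.
    """
    matrix = defaultdict(lambda: [0] * len(peaks))
    for barcode, chrom, start, end in fragments:
        for index, (peak_chrom, peak_start, peak_end) in enumerate(peaks):
            if chrom == peak_chrom and overlaps(start, end, peak_start, peak_end):
                matrix[barcode][index] += 1
    return dict(matrix)

cell_by_peak = count_fragments_in_peaks(fragments, peaks)
for barcode, row in cell_by_peak.items():
    print(barcode, row)
```

The resulting matrix plays the same role for scATAC-seq that the expression matrix plays for scRNA-seq: it is the input to dimensionality reduction and clustering.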

Spatial transcriptomics
Spatial transcriptomics captures gene expression within the intact tissue architecture to preserve the spatial relationships that influence cell behavior. Methods range from slide-based barcoded arrays to in situ hybridization and imaging-based transcriptomics. A 2022 Genome Medicine study describes how pairing expression profiles with physical coordinates enables researchers to map cell–cell interactions, niche environments, and microanatomical structures. Researchers often integrate spatial data with scRNA-seq to refine cell-type annotations and identify region-specific states. This technique is crucial for understanding tissues where the microenvironment plays a defining role such as developing organs and neural circuits.
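
The idea of pairing expression-derived annotations with physical coordinates can be sketched as follows. The spot names, coordinates, cell-type labels, and the `radius` threshold are all invented for illustration, and counting label pairs among nearby spots is only one simple way, assumed here, to summarize potential cell–cell contacts.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical spots: a position on the tissue plus a cell-type label
# that would normally be derived from the spot's expression profile.
spots = {
    "spot_1": {"xy": (0.0, 0.0), "label": "tumor"},
    "spot_2": {"xy": (1.0, 0.0), "label": "T cell"},
    "spot_3": {"xy": (0.0, 1.0), "label": "tumor"},
    "spot_4": {"xy": (5.0, 5.0), "label": "stroma"},
}

def neighbor_pairs(spots, radius):
    """Return pairs of spot names whose Euclidean distance is at most radius."""
    pairs = []
    for a, b in combinations(sorted(spots), 2):
        if math.dist(spots[a]["xy"], spots[b]["xy"]) <= radius:
            pairs.append((a, b))
    return pairs

def label_contacts(spots, radius):
    """Count how often each pair of labels occurs among neighboring spots."""
    contacts = Counter()
    for a, b in neighbor_pairs(spots, radius):
        key = tuple(sorted((spots[a]["label"], spots[b]["label"])))
        contacts[key] += 1
    return contacts

contacts = label_contacts(spots, radius=1.0)
for labels, n in contacts.items():
    print(labels, n)
```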

Multimodal single-cell assays
Multimodal assays measure multiple molecular layers from the same cell, such as combined RNA and chromatin accessibility, RNA and surface protein abundance (also called CITE-seq), or joint epigenomic and transcriptomic readouts. By capturing several modalities at the same time, these methods link regulatory mechanisms to functional outputs within each cell. Analysis typically requires integrative computational tools that align modalities, infer regulatory networks, and resolve mixed cellular states at high resolution. As a 2021 paper describes, this multimodal approach enables deeper mechanistic understanding of cell lineage, immune responses, and other complex processes.
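
Because both readouts come from the same cell, the first integration step is often simply matching measurements by cell barcode. The sketch below assumes a CITE-seq-like setup with invented barcodes and counts; the `pair_modalities` helper is hypothetical and keeps only cells observed in both modalities.

```python
# Hypothetical per-cell RNA counts (gene -> count), keyed by cell barcode.
rna = {
    "AAAC": {"CD3E": 12, "MS4A1": 0},
    "TTTG": {"CD3E": 0, "MS4A1": 9},
    "GGGA": {"CD3E": 3, "MS4A1": 1},  # no protein measurement for this cell
}

# Hypothetical per-cell surface protein counts (antibody tag -> count).
protein = {
    "AAAC": {"CD3": 250, "CD20": 4},
    "TTTG": {"CD3": 6, "CD20": 310},
    "CCCT": {"CD3": 2, "CD20": 1},  # no RNA measurement for this cell
}

def pair_modalities(rna, protein):
    """Join the two modalities on cell barcode, keeping only shared cells."""
    shared = sorted(rna.keys() & protein.keys())
    return {bc: {"rna": rna[bc], "protein": protein[bc]} for bc in shared}

paired = pair_modalities(rna, protein)
for barcode, layers in paired.items():
    print(barcode, layers["rna"], layers["protein"])
```

Once cells are paired this way, downstream tools can ask whether, for example, transcript levels of a gene track the abundance of the protein it encodes within the same cell.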

What are the steps involved in single-cell analysis?

The process of single-cell analysis involves five key steps: isolating individual cells from a population; extracting, processing, and amplifying genetic material from each cell; preparing a sequencing library from that material; sequencing the library on a next-generation sequencer; and performing computational analysis of the resulting single-cell data.

Here are more details on the five steps of single-cell analysis:

  1. Isolating individual cells from a population: Single-cell analysis begins by separating individual cells from a heterogeneous sample so that each can be profiled independently. How researchers isolate the cells depends on sample type, cell viability requirements, and downstream assays. For example, fluorescence-activated cell sorting (FACS) uses fluorescent antibodies and laser detection to sort live cells by surface markers with high precision. Other methods include microfluidic droplet systems that encapsulate single cells in nanoliter droplets alongside reagents for large-scale parallel processing, or laser capture microdissection to extract individual cells or micro-regions directly from fixed tissue sections.

  2. Extracting, processing and amplifying molecular material: After isolation, researchers assess cell integrity via imaging or viability dyes before lysing each cell to release its molecular contents. Lysis can be chemical, enzymatic, or mechanical depending on the protocol and the analyte of interest (RNA, DNA, chromatin, or proteins). Because single cells contain small quantities of nucleic acids, amplification is essential. DNA-focused assays use whole-genome amplification, while RNA-based workflows begin with reverse transcription followed by PCR or in-vitro transcription to generate sufficient cDNA. This amplification step converts extremely small input amounts into workable material for downstream library preparation while preserving the cell-specific molecular signature.

  3. Preparing a sequencing-ready library: To enable accurate downstream profiling, researchers next convert amplified material into a sequencing-ready library. This involves attaching platform-specific adapter sequences along with unique cell-specific barcodes that enable reads to be traced back to their originating cell. Library construction may also include fragmentation, size selection, and quality-control checkpoints to confirm that the library represents each cell’s true molecular state. The result is a barcoded library in which every molecule retains a link to its cell of origin, forming the foundation for high-resolution analysis.

  4. Sequencing the library with high-throughput platforms: Prepared libraries are then loaded onto next-generation sequencing (NGS) platforms such as Illumina or PacBio. These platforms are chosen based on read length, accuracy requirements, and assay type. For example, Illumina platforms immobilize DNA fragments on a flow cell, amplify them into clusters and sequence them via sequencing-by-synthesis using fluorescently labeled nucleotides detected cycle-by-cycle. This delivers reads containing both the molecular sequences and the barcodes needed for cell assignment. The raw data then enters computational pipelines for alignment, quantification, and downstream analyses to reveal cellular heterogeneity, gene-expression patterns, regulatory programs, or mutational landscapes.

  5. Performing computational analysis of single-cell data: Once sequencing is complete, computational analysis transforms raw data into biologically interpretable insights. Pipelines typically begin with demultiplexing, alignment, and quantification to generate a cell-by-feature matrix. Rigorous quality control removes low-quality cells, doublets, and technical artifacts. Normalization, dimensionality reduction, and clustering reveal underlying cell populations and states. Researchers then apply differential expression, trajectory inference, regulatory-network analysis or multimodal integration to characterize cell lineage relationships, identify key drivers of phenotype, and map cellular heterogeneity with high resolution.
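
The barcoding and quantification ideas in the steps above can be sketched in a few lines. This example assumes reads have already been aligned to genes, and it introduces unique molecular identifiers (UMIs) to collapse amplification duplicates; UMIs are not discussed above and are an assumption of this sketch, as are the barcodes, gene names, and the `min_genes` threshold.

```python
from collections import defaultdict

# Hypothetical aligned reads: (cell barcode, UMI, gene).
reads = [
    ("AAAC", "UMI1", "GeneA"),
    ("AAAC", "UMI1", "GeneA"),  # amplification duplicate: same barcode, UMI, and gene
    ("AAAC", "UMI2", "GeneA"),
    ("AAAC", "UMI3", "GeneB"),
    ("TTTG", "UMI1", "GeneB"),
    ("NNNN", "UMI9", "GeneA"),  # barcode not in the allow-list, so it is discarded
]
known_barcodes = {"AAAC", "TTTG"}

def count_umis(reads, known_barcodes):
    """Demultiplex reads by barcode and count distinct UMIs per cell and gene."""
    seen = defaultdict(set)  # (barcode, gene) -> set of UMIs observed
    for barcode, umi, gene in reads:
        if barcode in known_barcodes:
            seen[(barcode, gene)].add(umi)
    matrix = defaultdict(dict)
    for (barcode, gene), umis in seen.items():
        matrix[barcode][gene] = len(umis)
    return dict(matrix)

def qc_filter(matrix, min_genes):
    """Drop cells in which fewer than min_genes genes were detected."""
    return {bc: row for bc, row in matrix.items() if len(row) >= min_genes}

cell_by_gene = count_umis(reads, known_barcodes)
passing = qc_filter(cell_by_gene, min_genes=2)
print(cell_by_gene)
print(passing)
```

Counting distinct UMIs rather than raw reads keeps the amplification in step 2 from inflating expression estimates, and the quality-control filter is a simplified stand-in for the low-quality-cell removal described in step 5.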

What databases are used for single-cell analysis?

A single-cell database is a specialized repository that stores, organizes, and shares access to high-resolution datasets generated from single-cell sequencing, imaging, and multimodal assays. These database platforms help researchers across the world explore cell types, compare datasets, validate findings, and reuse publicly available data for new analyses. They often include standardized metadata, annotation frameworks, and visualization tools to ensure their data follows FAIR principles. This helps scientists query gene expression patterns, examine chromatin accessibility, map spatial organization across tissues, and explore other research questions. 

Some key databases used for single-cell analysis include:

  • Chan Zuckerberg Initiative (CZI) CELLxGENE Census: As the world’s largest public single-cell repository, the CZ CELLxGENE Census contains more than 110 million cells and is considered the highest quality dataset describing healthy human cells. Powered by TileDB, CZ CELLxGENE Census has been used as input to powerful biological large language models, including CZI’s TranscriptFormer model.

  • Human Cell Atlas (HCA): A global consortium that aggregates single-cell data to create comprehensive reference maps of all human cell types across tissues and developmental stages. The HCA recently entered into partnership with UNESCO to help advance open science across the world.

  • Gene Expression Omnibus (GEO): This widely used NCBI repository is an international public database that freely shares bulk and single-cell transcriptomic datasets submitted by the life sciences community. GEO supports metadata-rich uploads and reanalysis to help users easily query, locate, review, and download studies and cell gene expression profiles of interest.

  • Single Cell Portal (Broad Institute): This interactive platform aims to accelerate single-cell omics research by offering built-in visualization and comparison tools across a large repository of single-cell datasets.

What software is used for single-cell analysis?

Because of the massive scale of raw single-cell sequencing data, specialized software is essential for transforming this data into actionable insights. These computational tools help analyze data through normalization, dimensionality reduction, clustering, trajectory inference, and large-scale multimodal integration. These tools also streamline visualization, annotation and reproducibility so scientists can easily compare results across experiments and teams. Widely used tools include: 

  • Seurat – an R-based framework with extensive workflows for scRNA-seq, spatial and multimodal analysis. 

  • Bioconductor – a comprehensive ecosystem of R packages such as SingleCellExperiment and scran for statistical modeling and reproducible pipelines.

  • Scanpy – a Python-based toolkit optimized for large datasets, offering efficient preprocessing and advanced visualization.

TileDB’s solution for single-cell analysis unifies single-cell data in flexible and powerful multi-dimensional arrays. This cloud-native format enables researchers to easily manage RNA-seq data and count matrices by storing any number of samples in a compressed and lossless manner, speeding analysis and saving on storage and compute costs. TileDB also offers language APIs for R, Python and C++ and supports interoperability with Seurat, Bioconductor, and Scanpy, enabling researchers to use their preferred software and tools. Lastly, to optimally structure single-cell data for performance, TileDB offers a novel data model built for single-cell data called TileDB SOMA. This flexible, extensible, and open-source data model is built to represent the annotated matrices commonly used in single cell research.

With the launch of TileDB Carrara, life sciences researchers now have a platform built for the complexity and scale of single-cell datasets. Carrara unifies data organization, computational workflows, and interactive analysis inside one governed environment to streamline the daily research work of a single-cell analyst. To learn more about how TileDB facilitates single-cell research and analysis, take a look at our single-cell solution brief.
