Specialized Population and State Analysis

Finding the cells that matter most

In many experiments, the most biologically interesting cells are not the most abundant ones. Rare progenitor populations, transitional states that exist only briefly during a dynamic process, and stress-response states induced by disease or treatment may represent a small fraction of total cells but carry disproportionate biological significance. Specialized population analysis applies methods designed to detect, characterize, and analyze these populations reliably.

Rare cell type identification

COMET systematically evaluates multi-marker combinations rather than relying on single markers, which improves sensitivity for rare populations that lack a single defining gene.
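The idea of scoring marker combinations can be illustrated with a minimal sketch. It is not COMET's implementation or its statistical test: the function name `rank_marker_pairs`, the gene names, and the fixed expression threshold are hypothetical, and an F1 score stands in for a proper enrichment statistic.

```python
from itertools import combinations

def rank_marker_pairs(expr, in_target, threshold=0.0):
    """Rank gene pairs by how well joint expression marks the target cells.

    expr: dict mapping gene name -> list of expression values (one per cell)
    in_target: list of bools, True where the cell belongs to the population
    """
    n_target = sum(in_target)
    results = []
    for g1, g2 in combinations(sorted(expr), 2):
        # A cell is called positive only if BOTH genes exceed the threshold.
        positive = [a > threshold and b > threshold
                    for a, b in zip(expr[g1], expr[g2])]
        tp = sum(p and t for p, t in zip(positive, in_target))
        called = sum(positive)
        precision = tp / called if called else 0.0
        recall = tp / n_target if n_target else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        results.append(((g1, g2), f1))
    # Best-scoring pair first.
    return sorted(results, key=lambda item: item[1], reverse=True)

# Toy data (hypothetical genes): cells 0 and 1 form the rare population.
expr = {
    "geneA": [3, 2, 4, 0, 0, 5],
    "geneB": [1, 2, 0, 3, 0, 0],
    "geneC": [0, 0, 1, 1, 1, 0],
}
in_target = [True, True, False, False, False, False]

ranking = rank_marker_pairs(expr, in_target)
best_pair, best_f1 = ranking[0]
print(best_pair, best_f1)
```

In this toy data neither geneA nor geneB alone is specific to the rare cells, but the two together isolate them exactly, which is the situation where combination-based marker search helps.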

For experiments where the primary question is which cell types differ most between conditions, Augur ranks cell types by their separability across conditions, allowing computational resources to be focused on the populations most likely to be biologically informative. When experiments include paired or multimodal data (CITE-seq, Multiome), surface protein or chromatin accessibility information provides additional axes of discrimination that can reveal rare populations invisible to transcriptomics alone.
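As a rough illustration of ranking cell types by separability, the sketch below scores each cell type by how well a classifier can tell conditions apart, summarized as an AUC. Augur trains a machine-learning classifier per cell type under cross-validation; here a much simpler leave-one-out nearest-centroid classifier is swapped in, and all names and data are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def centroid(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]

def sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def separability(cells, is_treated):
    """Leave-one-out nearest-centroid score per cell, summarized as an AUC.

    Assumes at least two cells per condition so no centroid is empty.
    """
    scores = []
    for i, cell in enumerate(cells):
        rest = [(c, t) for j, (c, t) in enumerate(zip(cells, is_treated))
                if j != i]
        c_treated = centroid([c for c, t in rest if t])
        c_control = centroid([c for c, t in rest if not t])
        # Higher score = closer to the treated centroid than to the control one.
        scores.append(sqdist(cell, c_control) - sqdist(cell, c_treated))
    return auc(scores, is_treated)

# Toy data: (expression vectors, condition labels) per cell type.
data = {
    "T cell": ([[0, 0], [1, 0], [0, 1], [5, 5], [6, 5], [5, 6]],
               [False, False, False, True, True, True]),
    "B cell": ([[1, 1], [1, 1], [1, 1], [1, 1]],
               [False, False, True, True]),
}

ranked = sorted(((name, separability(cells, labels))
                 for name, (cells, labels) in data.items()),
                key=lambda item: item[1], reverse=True)
print(ranked)
```

An AUC near 1 means the conditions are easy to distinguish within that cell type, while an AUC near 0.5 means the classifier does no better than chance, so that cell type would be a lower priority for follow-up.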

Doublet detection and quality control

DoubletFinder is a widely used tool for doublet detection in droplet-based single-cell data. It simulates artificial doublets from the data itself and flags real cells whose nearest neighbors are enriched for those simulated doublets, using an expected doublet rate to decide how many cells to remove. scDblFinder is a faster Bioconductor alternative that performs well in benchmarks and integrates directly with SingleCellExperiment objects. Proper doublet removal is particularly important in rare cell type analyses, where artifactual doublets are most likely to be mistaken for genuine biological populations.
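The simulation-based approach can be sketched in a few lines. This is a simplified illustration, not DoubletFinder's or scDblFinder's implementation: it builds artificial doublets by summing the raw counts of every pair of cells (real tools sample pairs and work in a normalized, reduced-dimension space), and the function name and toy counts are hypothetical.

```python
from itertools import combinations

def sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def artificial_neighbor_fraction(cells, k=2):
    """For each real cell, the fraction of its k nearest neighbors
    that are simulated doublets."""
    # Simulate doublets by summing the counts of every pair of real cells.
    artificial = [[x + y for x, y in zip(a, b)]
                  for a, b in combinations(cells, 2)]
    scores = []
    for i, cell in enumerate(cells):
        # Candidate neighbors: all other real cells plus all simulated doublets.
        candidates = [(sqdist(cell, other), False)
                      for j, other in enumerate(cells) if j != i]
        candidates += [(sqdist(cell, art), True) for art in artificial]
        nearest = sorted(candidates, key=lambda c: c[0])[:k]
        scores.append(sum(is_art for _, is_art in nearest) / k)
    return scores

# Toy counts for two genes: three cells of type A, three of type B,
# and one cell that looks like an A+B doublet.
cells = [
    [10, 0], [9, 1], [10, 1],   # type A
    [0, 10], [1, 9], [1, 10],   # type B
    [10, 10],                   # suspected doublet
]
scores = artificial_neighbor_fraction(cells, k=2)
print(scores)
```

The last cell sits among simulated A+B doublets and receives the highest score, while the genuine singlets have only real cells as nearest neighbors.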

Ambient RNA correction

Ambient RNA is free-floating mRNA from lysed cells that is captured in droplets alongside real cells. This contamination is a systematic artifact of droplet-based single-cell data: it can make highly abundant transcripts appear to be expressed in cell types that do not actually express them. SoupX and CellBender both estimate and remove ambient RNA contamination. CellBender uses a deep learning model that also distinguishes cell-containing droplets from empty ones. This correction step is increasingly standard for publication-quality single-cell data.
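The basic arithmetic behind ambient correction can be sketched as follows. This is a simplified illustration in the spirit of profile-subtraction methods, not the SoupX or CellBender algorithm: the function names are hypothetical, and the contamination fraction is passed in as an assumed value rather than estimated from the data.

```python
def soup_profile(empty_droplets):
    """Ambient expression profile: each gene's share of all counts
    observed in empty droplets."""
    gene_totals = [sum(col) for col in zip(*empty_droplets)]
    total = sum(gene_totals)
    return [g / total for g in gene_totals]

def remove_ambient(cell_counts, profile, contamination):
    """Subtract the expected ambient counts from one cell.

    contamination: assumed fraction of the cell's counts that are ambient.
    """
    total_umis = sum(cell_counts)
    expected_ambient = [contamination * total_umis * p for p in profile]
    # Counts cannot go below zero.
    return [max(0.0, c - a) for c, a in zip(cell_counts, expected_ambient)]

# Toy counts for three genes. Gene 0 dominates the ambient profile.
empty_droplets = [[8, 2, 0], [7, 3, 0]]
profile = soup_profile(empty_droplets)

cell = [10, 10, 80]
corrected = remove_ambient(cell, profile, contamination=0.1)
print(profile, corrected)
```

Here the cell's apparent expression of gene 0 drops from 10 to roughly 2.5 counts because most of it is explained by the ambient profile, while gene 2, which is absent from empty droplets, is left untouched.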

Imputation for sparse data

Single-cell RNA-seq data is inherently sparse. For some analyses, particularly visualization and trajectory inference in populations with low sequencing depth, imputation can improve the signal-to-noise ratio. MAGIC and SAVER are among the most widely used imputation tools. Imputation should be used selectively. It is not appropriate as a preprocessing step for differential expression analysis, because borrowing information across cells reduces the apparent variance and can inflate statistical confidence.
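To make the trade-off concrete, here is a minimal sketch of neighbor-based smoothing. It is not MAGIC (which uses data diffusion) or SAVER (which uses a statistical model); a plain k-nearest-neighbor average is swapped in, and the function name and toy values are hypothetical.

```python
def sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def knn_smooth(cells, k=2):
    """Replace each cell's profile with the mean of itself and its
    k nearest neighbors."""
    smoothed = []
    for i, cell in enumerate(cells):
        others = sorted((j for j in range(len(cells)) if j != i),
                        key=lambda j: sqdist(cell, cells[j]))
        group = [cell] + [cells[j] for j in others[:k]]
        smoothed.append([sum(col) / len(group) for col in zip(*group)])
    return smoothed

# Toy expression for three genes. Cell 0 has a likely dropout in gene 2:
# its two most similar cells both express that gene at 3.
cells = [
    [5, 5, 0], [5, 4, 3], [4, 5, 3],
    [0, 1, 9], [1, 0, 9], [0, 0, 8],
]
smoothed = knn_smooth(cells, k=2)
print(smoothed[0])
```

The zero in cell 0 is filled in from its neighbors, which helps visualization, but the smoothed values are no longer independent measurements. That loss of independence is why smoothed data should not be fed into differential expression tests.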

Transitional and stress states

Cells under stress, hypoxia, metabolic perturbation, or active signaling often adopt transcriptional states that are distinct from both their resting identity and their differentiated progeny. Identifying and characterizing these states requires careful integration of pseudotime inference, gene ontology analysis, and comparison with published stress response signatures. 3DG annotates these populations with reference to the current literature and distinguishes genuine biological states from technical artifacts where possible.
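Comparison with a published signature is often done by scoring each cell against the signature's gene set. The sketch below shows one simple way to do this, assuming a score defined as mean signature expression minus mean expression of all other genes. The gene names, the signature, and the function name are hypothetical placeholders, and this is not presented as the method 3DG uses.

```python
def signature_score(cell, signature):
    """Mean expression of signature genes minus mean of the remaining genes.

    cell: dict mapping gene name -> normalized expression
    signature: set of gene names in the signature
    """
    sig = [v for g, v in cell.items() if g in signature]
    background = [v for g, v in cell.items() if g not in signature]
    return sum(sig) / len(sig) - sum(background) / len(background)

# Hypothetical two-gene signature and two toy cells.
hypothetical_signature = {"sigA", "sigB"}
resting = {"sigA": 1, "sigB": 1, "hk1": 4, "hk2": 4}
stressed = {"sigA": 6, "sigB": 8, "hk1": 4, "hk2": 4}

resting_score = signature_score(resting, hypothetical_signature)
stressed_score = signature_score(stressed, hypothetical_signature)
print(resting_score, stressed_score)
```

Subtracting a background term keeps the score from simply tracking overall sequencing depth, so a cell scores highly only when the signature genes are elevated relative to the rest of its transcriptome.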
