Anthropology and Human Genetics



The effect of background noise and its removal on the analysis of single-cell expression data

by P. Janssen, Z. Kliesmete, B. Vieth, X. Adiconis, S. Simmons, J. Marshall, C. McCabe, H. Heyn, J. Levin, W. Enard, and I. Hellmann



BACKGROUND: In droplet-based single-cell and single-nucleus RNA-seq experiments, not all reads associated with one cell barcode originate from the encapsulated cell. Such background noise is attributed to spillage from cell-free ambient RNA or to barcode swapping events. Here, we perform an in-depth characterization of this background noise, exemplified by three single-cell RNA-seq (scRNA-seq) and two single-nucleus RNA-seq (snRNA-seq) replicates of mouse kidney cells. For each experiment, kidney cells from two mouse subspecies were pooled, and this genetic variation allows us to identify cross-genotype contaminating molecules and estimate the levels of background noise.
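
The idea of using cross-genotype molecules to estimate background noise can be illustrated with a minimal sketch. This is not the estimator used in the paper; it is a simplified illustration under stated assumptions: UMIs covering subspecies-informative variants have already been counted per cell, and the share of the contaminating pool that stems from the other subspecies is known. The function name and all parameter names are hypothetical.

```python
def estimate_background_fraction(cross_genotype_umis, informative_umis,
                                 other_genotype_share):
    """Rough per-cell background estimate from cross-genotype UMIs.

    Simplifying assumptions (not taken from the paper):
    - informative_umis counts UMIs that cover a subspecies-informative variant,
    - cross_genotype_umis counts those carrying the other subspecies' allele,
    - other_genotype_share is the fraction of the contaminating pool that
      comes from the other subspecies (e.g. estimated from empty droplets).

    Contamination from the cell's own subspecies is invisible, so the
    observed cross-genotype fraction is scaled up by 1 / other_genotype_share.
    """
    if informative_umis <= 0:
        raise ValueError("informative_umis must be positive")
    if not 0 < other_genotype_share <= 1:
        raise ValueError("other_genotype_share must be in (0, 1]")
    if not 0 <= cross_genotype_umis <= informative_umis:
        raise ValueError("cross_genotype_umis must be in [0, informative_umis]")

    observed_fraction = cross_genotype_umis / informative_umis
    # Cap at 1.0: sampling noise can push the scaled value above 100%.
    return min(1.0, observed_fraction / other_genotype_share)


# Hypothetical cell: 20 of 400 informative UMIs carry the other allele,
# and half of the contaminating pool is assumed to be of the other subspecies.
rho = estimate_background_fraction(20, 400, 0.5)
print(f"estimated background fraction: {rho:.2f}")
```

The scaling step is the key design choice in this sketch: only cross-genotype contamination is observable, so the raw fraction is a lower bound that has to be corrected for the unseen same-genotype share.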

RESULTS: We find the degree of background noise to be highly variable across replicates and individual cells, making up on average 3-35% of the total counts (UMIs) per cell, thus affecting the specificity and detectability of cell-type-specific marker genes. In search of the source of the background noise, we compare cell-free droplet profiles, uncontaminated endogenous expression profiles, and contamination profiles, and find that the majority of the contamination most likely originates from ambient RNA. Finally, we use our genotype-based estimates to evaluate the performance of three methods (CellBender, DecontX, SoupX) that are designed to quantify and remove background noise. We find that CellBender provides the most precise estimates of background noise levels and also yields the highest improvement for marker gene detection. By contrast, clustering and classification of cells are fairly robust to background noise, and background removal achieves only small improvements that may come at the cost of distortions in fine structure.

CONCLUSION: Our findings help to better understand the extent, sources and impact of background noise in single-cell experiments and provide guidance on how to deal with it.

Preprint on BioRxiv