TITLE: “scDesign3: an all-in-one statistical framework that generates realistic single-cell omics data and infers cell heterogeneity structure.”
ABSTRACT: Generating realistic synthetic data is essential for benchmarking the many computational tools developed for single-cell omics data. Here we propose an all-in-one statistical framework that generates single-cell omics data from various cell heterogeneity structures, including discrete cell types, continuous cell trajectories, and spatial cell locations. Our framework uses a unified probabilistic model with an accessible likelihood. This probabilistic formulation has a key advantage: by applying the principle of statistical model selection, it makes it straightforward to identify the heterogeneity structure that best fits a given single-cell omics dataset.
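To make the model selection idea concrete, the following is a minimal toy sketch in Python, not the scDesign3 implementation. It assumes a single gene with Poisson-distributed counts, two candidate structures (discrete cell types versus a continuous trajectory), and the Bayesian information criterion (BIC) as the selection criterion. The simulated data, the tertile-based cell-type labels, and all function names are hypothetical choices made for illustration only.

```python
from math import lgamma

import numpy as np


def poisson_loglik(y, mu):
    """Poisson log-likelihood of counts y under per-cell means mu."""
    log_factorial = np.array([lgamma(v + 1.0) for v in y])
    return float(np.sum(y * np.log(mu) - mu - log_factorial))


def fit_cell_types(y, labels):
    """Discrete structure: one Poisson mean per cell type (MLE = group mean)."""
    mu = np.empty_like(y, dtype=float)
    groups = np.unique(labels)
    for g in groups:
        mask = labels == g
        mu[mask] = max(y[mask].mean(), 1e-8)  # guard against log(0)
    return mu, len(groups)


def fit_trajectory(y, t, n_iter=50):
    """Continuous structure: log(mu) = b0 + b1 * t, fitted by Newton's method."""
    X = np.column_stack([np.ones_like(t), t])
    beta = np.array([np.log(y.mean()), 0.0])
    for _ in range(n_iter):
        mu = np.exp(X @ beta)
        grad = X.T @ (y - mu)            # score of the Poisson log-likelihood
        hess = X.T @ (X * mu[:, None])   # Fisher information
        beta = beta + np.linalg.solve(hess, grad)
    return np.exp(X @ beta), X.shape[1]


def bic(loglik, n_params, n_obs):
    """Bayesian information criterion; lower is better."""
    return n_params * np.log(n_obs) - 2.0 * loglik


# Hypothetical data: counts that truly vary smoothly along a pseudotime axis.
rng = np.random.default_rng(0)
n_cells = 1000
pseudotime = rng.uniform(0.0, 1.0, size=n_cells)
counts = rng.poisson(np.exp(0.5 + 2.0 * pseudotime)).astype(float)

# Hypothetical cell-type labels: three clusters from pseudotime tertiles.
cell_type = np.digitize(pseudotime, [1.0 / 3.0, 2.0 / 3.0])

mu_ct, k_ct = fit_cell_types(counts, cell_type)
mu_tr, k_tr = fit_trajectory(counts, pseudotime)

scores = {
    "cell_types": bic(poisson_loglik(counts, mu_ct), k_ct, n_cells),
    "trajectory": bic(poisson_loglik(counts, mu_tr), k_tr, n_cells),
}
best = min(scores, key=scores.get)
print(scores, "-> selected structure:", best)
```

Because both candidates are fitted by maximum likelihood on the same counts, their BIC values are directly comparable, and the structure with the lower score is preferred. A full framework of the kind described here would model many genes jointly and could use other criteria; this sketch only shows why an accessible likelihood makes such a comparison possible.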