המחלקה למדעי המחשב - אוניברסיטת בר-אילן
Department of Computer Science - Bar-Ilan University
קולוקוויום במדעי המחשב
Computer Science Colloquium
ONLY VIA ZOOM - https://us02web.zoom.us/j/83383478356
Dr. Boris Landa
Will lecture on
When Sinkhorn-Knopp meets Marchenko-Pastur: Diagonal Scaling Reveals the Rank of a Count Matrix
An important task in modern data analysis is to determine the rank of a corrupted data matrix. Random matrix theory provides useful insights into this task by assuming a “signal+noise” model, where the goal is to estimate the rank of the underlying signal matrix. If the noise is homoskedastic, i.e., the noise variances are identical across all entries, the spectrum of the noise follows the celebrated Marchenko-Pastur (MP) law, providing a simple approach for rank estimation. However, in many practical situations, such as single-cell RNA sequencing (scRNA-seq), the noise is far from homoskedastic. In this talk, focusing on a Poisson data model, I will present a simple procedure termed biwhitening, which forces the MP law to hold through appropriate diagonal scaling via the Sinkhorn-Knopp algorithm. Beyond the Poisson distribution, the procedure extends to families of distributions with quadratic variance functions. I will demonstrate the procedure on both simulated and experimental data, showcasing accurate rank estimation in simulations and excellent fits to the MP law for real data.
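For readers who would like a concrete picture before the talk, the idea in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the speaker's code. It assumes a Poisson count matrix Y of size m x n with no all-zero rows or columns, uses the counts themselves as variance estimates, scales them with Sinkhorn-Knopp so that every row sums to n and every column sums to m, and then counts the eigenvalues of the scaled matrix that exceed the MP upper edge (1 + sqrt(m/n))^2. The function names, iteration limit, and tolerance are hypothetical choices made for the illustration.

```python
import numpy as np


def sinkhorn_knopp(A, row_sums, col_sums, n_iter=1000, tol=1e-10):
    """Find positive vectors u, v such that diag(u) @ A @ diag(v) has the
    requested row and column sums. Assumes A is nonnegative with no
    all-zero rows or columns (hypothetical helper, for illustration)."""
    m, n = A.shape
    u = np.ones(m)
    v = np.ones(n)
    for _ in range(n_iter):
        u = row_sums / (A @ v)        # match the row sums
        v = col_sums / (A.T @ u)      # match the column sums
        # Column sums are now exact; stop once the row sums agree as well.
        rel_err = np.abs(u * (A @ v) / row_sums - 1.0).max()
        if rel_err < tol:
            break
    return u, v


def estimate_rank(Y):
    """Scale a count matrix and count eigenvalues above the MP upper edge.
    Returns (rank estimate, eigenvalues, MP upper edge)."""
    Y = np.asarray(Y, dtype=float)
    m, n = Y.shape
    # For Poisson data the variance equals the mean, so Y itself serves as
    # an entrywise variance estimate (assumption of this sketch).
    u, v = sinkhorn_knopp(Y, np.full(m, float(n)), np.full(n, float(m)))
    # Scale the data by the square roots so the noise variances are
    # normalized on average across each row and each column.
    Y_scaled = np.sqrt(u)[:, None] * Y * np.sqrt(v)[None, :]
    # Eigenvalues of (1/n) * Y_scaled @ Y_scaled.T via singular values.
    eigvals = np.linalg.svd(Y_scaled, compute_uv=False) ** 2 / n
    mp_edge = (1.0 + np.sqrt(m / n)) ** 2
    return int(np.sum(eigvals > mp_edge)), eigvals, mp_edge


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m, n, r = 200, 400, 3
    # Simulated low-rank Poisson rates (arbitrary illustrative values).
    X = rng.uniform(1.0, 10.0, size=(m, r)) @ rng.uniform(1.0, 10.0, size=(r, n))
    Y = rng.poisson(X)
    k, eigvals, mp_edge = estimate_rank(Y)
    print("estimated rank:", k, "| MP upper edge:", mp_edge)
```

In finite samples the largest noise eigenvalue can fluctuate slightly around the MP edge, so a practical threshold may need a small correction; the sketch uses the asymptotic edge as is.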
Bio: Boris Landa is a Gibbs Assistant Professor in the Program for Applied Mathematics at Yale University. Previously, he completed his Ph.D. in applied mathematics at Tel Aviv University under the guidance of Prof. Yoel Shkolnisky. Boris's research is focused on theory and methods for processing large datasets corrupted by noise and deformations, with applications in the biological sciences.