Speaker | Zhu Ziwei, University of Michigan |
Host | Li Xudong, School of Data Science, Fudan University |
Time | 9:30-11:30, May 18, 2020 |
Zoom Meeting ID | 98285590674 |
Zoom Meeting Code | 509721 |
Abstract | In this talk, I will focus on the effect of missing data in Principal Component Analysis (PCA). In simple, homogeneous missingness settings with a noise level of constant order, we show that an existing inverse-probability weighted (IPW) estimator of the leading principal components can (nearly) attain the minimax optimal rate of convergence, and discover a new phase transition phenomenon along the way. However, deeper investigation reveals both that, particularly in more realistic settings where the missingness mechanism is heterogeneous, the empirical performance of the IPW estimator can be unsatisfactory, and moreover that, in the noiseless case, it fails to provide exact recovery of the principal components. Our main contribution, then, is to introduce a new method for high-dimensional PCA, called "primePCA", that is designed to cope with situations where observations may be missing in a heterogeneous manner. Our numerical studies on both simulated and real data reveal that "primePCA" exhibits very encouraging performance across a wide range of scenarios. |
Bio | Ziwei Zhu is currently an Assistant Professor in the Department of Statistics at the University of Michigan, Ann Arbor. His research interests include distributed statistical inference, robust statistics, low-rank matrix estimation, and missing data. Prior to joining the University of Michigan, he was a research associate at the Statistical Laboratory at the University of Cambridge, hosted by Professor Richard Samworth. He obtained his Ph.D. from the Department of Operations Research and Financial Engineering (ORFE) at Princeton University, advised by Professor Jianqing Fan. |