Xiang Li

I am currently a postdoctoral researcher at the University of Pennsylvania, working with Prof. Qi Long and Prof. Weijie Su. I obtained my Ph.D. in Statistics from the School of Mathematical Sciences, Peking University in 2023, advised by Prof. Zhihua Zhang. Before that, I earned a B.S. in Statistics and a B.A. in Economics from Peking University in 2018.

My research interests center around the intersection of statistics, stochastic optimization, and machine learning. During my Ph.D., I worked on federated learning (communication efficiency and data heterogeneity), stochastic approximation (asymptotic behavior and convergence analysis), online decision-making (sample complexity and robustness), and online statistical inference.

More recently, my research has focused on large language models, where I study statistical questions (such as detection, inference, and robustness) while treating the models as black boxes, without relying on their internal parameters or architecture. One key area is statistical watermarking, where I develop and analyze methods for embedding and detecting watermarks in generated text. My work aims to make these techniques more reliable and robust with provable guarantees, with the broader goal of advancing LLM inference and usage.

I am on the job market in the 2025–2026 cycle. I am mainly considering academic opportunities, and I am also interested in scientific research roles in industry.

Contact Info: lx10077 at upenn dot cn

News

Jun 15, 2025 Attend 2025 ICSA, where I’ll be giving a talk on robust watermark detection and presenting a tutorial on LLMs.
Apr 1, 2025 Excited to receive the IMS New Researcher Travel Award.
Nov 5, 2024 Attend 2024 SLDS. My first time visiting California.
Aug 15, 2024 Chair a session on Federated Learning at 2024 MOPTA.
Jul 12, 2024 Attend 2024 JCSDS. Great to catch up with old friends and meet new ones!

Selected Publications

  1. A statistical framework of watermarks for large language models: Pivot, detection efficiency and optimal rules
    Xiang Li, Feng Ruan, Huiyuan Wang, Qi Long, and Weijie Su
    The Annals of Statistics, 2025
  2. Evaluating the unseen capabilities: How many theorems do LLMs know?
    Xiang Li, Jiayi Xin, Qi Long, and Weijie Su
    arXiv preprint arXiv:2506.02058, 2025
  3. On the convergence of FedAvg on non-iid data
    Xiang Li*, Kaixuan Huang*, Wenhao Yang*, Shusen Wang, and Zhihua Zhang
    In International Conference on Learning Representations, 2020, 🎤 Oral presentation
  4. Variance-aware decision making with linear function approximation with heavy-tailed rewards
    Xiang Li and Qiang Sun
    Transactions on Machine Learning Research, 2024, 🎓 Invited to present at ICLR 2025
  5. Online statistical inference for nonlinear stochastic approximation with Markovian data
    Xiang Li, Jiadong Liang, and Zhihua Zhang
    arXiv preprint arXiv:2302.07690, 2023