Xiang Li

I am a postdoctoral researcher in Statistics at the University of Pennsylvania, working with Prof. Qi Long and Prof. Weijie Su. I received my Ph.D. in Statistics from the School of Mathematical Sciences, Peking University in 2023, advised by Prof. Zhihua Zhang. Before that, I earned double bachelor’s degrees in Statistics and Economics at Peking University in 2018.

My research interests lie broadly at the intersection of statistics, optimization, and machine learning, with applications spanning data science and artificial intelligence. My current research focuses on the statistical and algorithmic foundations of reliable AI, with emphasis on large language models (LLMs). I investigate statistical watermarking to ensure the provenance and robustness of AI-generated content and develop tools to evaluate how LLMs encode and use knowledge.

Earlier, during my Ph.D., I designed methods for learning with heterogeneous and online data, addressing challenges such as communication efficiency in federated learning, robustness under data heterogeneity, and uncertainty quantification in streaming and decision-making problems. These experiences continue to shape my perspective on building scalable and trustworthy data-driven systems.

I am currently on the academic job market for the 2025–2026 cycle, seeking faculty positions in data science, statistics, mathematics, machine learning, and related fields. I am open to discussions about potential opportunities and collaborations.

Contact Info: lx10077 at upenn dot edu
Curriculum Vitae: CV

News

Oct 25, 2025 I’ll attend the 2025 INFORMS Annual Meeting and would be happy to discuss faculty opportunities.
Sep 18, 2025 Two papers accepted to NeurIPS 2025 as spotlights: one on the empirical evaluation of goodness-of-fit tests for watermark detection, and the other on mitigating the privacy–utility trade-off in decentralized federated learning.
Aug 4, 2025 Excited to present my recent work on estimating watermark proportion at JSM 2025.
Jun 15, 2025 I’ll attend 2025 ICSA, where I’ll give a talk on robust watermark detection and present a short course on LLM watermarking. Lecture slides are here.
Apr 1, 2025 Excited to receive the IMS New Researcher Travel Award.

Selected Publications

  1. A statistical framework of watermarks for large language models: Pivot, detection efficiency and optimal rules
    Xiang Li, Feng Ruan, Huiyuan Wang, Qi Long, and Weijie J. Su
    The Annals of Statistics, 2025, 🏛️ Invited talk at AoS invited paper session, JSM 2025
  2. Robust detection of watermarks in large language models under human edits
    Xiang Li, Feng Ruan, Huiyuan Wang, Qi Long, and Weijie J. Su
    Journal of the Royal Statistical Society: Series B, 2025, 🏆 IMS New Researcher Travel Award
  3. Evaluating the unseen capabilities: How many theorems do LLMs know?
    Xiang Li, Jiayi Xin, Qi Long, and Weijie J. Su
    arXiv preprint arXiv:2506.02058, 2025
  4. On the convergence of FedAvg on non-iid data
    Xiang Li*, Kaixuan Huang*, Wenhao Yang*, Shusen Wang, and Zhihua Zhang
    In International Conference on Learning Representations, 2020, 🎤 Oral presentation
  5. Statistical estimation and online inference via Local SGD
    In Conference on Learning Theory, 2022