Lei Li

Carnegie Mellon University

I am an Assistant Professor in the Language Technologies Institute of the School of Computer Science at Carnegie Mellon University. I am also affiliated with CMU CyLab and the Computer Science Department at the University of California, Santa Barbara. I work on generative AI for language and science, including multilingual NLP, machine translation (text and speech), the security of large language models, agentic LLMs, and AI for drug discovery and protein design.

Students at CMU are welcome to email me regarding independent/directed study, internships, and capstone projects.

  • LLM Reasoning (a weak LLM helps a strong LLM). [arxiv]
  • LLM Watermark: Unigram-Watermark to detect AI-generated text, and GINSEW to defend against model extraction attacks.
  • CGMH: A method for controllable text generation from specified keywords. [arxiv]
  • Analyzing anisotropic sentence embeddings from pre-trained language models. [arxiv]
  • Data-efficient methods for many-to-many neural machine translation. [mRASP, mRASP2]
  • Glancing Transformer (GLAT): non-autoregressive translation models are as good as the autoregressive Transformer. [arxiv]
  • VOLT: Learning vocabulary via optimal transport (ACL 2021 Best Paper). [arxiv]
  • SOLO: Fast and accurate object instance segmentation. [SOLO, SOLOv2]