Lei Li

I am developing scalable algorithms to learn and mine knowledge from data, with applications in NLP, machine translation, time series analysis, AI drug discovery, and robot learning.


  • One paper accepted to InterSpeech 2021, about multi-task progressive pretraining for speech translation, achieving new SOTA results on MuST-C benchmarks.
  • 11 papers to appear at ACL 2021 (6 long, 4 findings, and 1 system demo). Strong results in machine translation and speech translation. Other topics include parallel generation, reasoning, summarization and information extraction.
  • 1 paper accepted to ICML 2021, about long-horizon skill learning.
  • 4 papers (1 main and 3 industry) are being presented at NAACL 2021. Check out the long paper about how visual imagination influences machine translation capability.
  • Four papers on object detection and segmentation are accepted to CVPR 2021, including Sparse R-CNN, DenseCL, Locate-Segment, and Auto-Augment. DenseCL is accepted as an Oral.
  • The paper on finding proper molecules for drug discovery is accepted to ICLR 2021 as a spotlight presentation!
  • Six papers are accepted to AAAI 2021, about end-to-end speech translation, knowledge graph completion, optimization, and text generation.
  • One paper about a new method to generate query-relevant bidwords for search advertising is accepted to WSDM 2021.
  • SOLOv2 is out! One paper about faster object instance segmentation in images is accepted to NeurIPS 2020.
  • Winner of 5 tasks in the WMT20 Machine Translation Contest, on the Chinese-English, German-English, French-German, English-Khmer, and English-Pashto language pairs. Winner of the WMT20 parallel data filtering task on Khmer and Pashto.
  • 5 papers accepted to EMNLP 2020! 3 in Long track and 2 in Findings.
  • SOLO paper accepted to ECCV 2020, achieving the SOTA in visual object instance segmentation.
  • 1 paper accepted to ICML 2020, about solving a family of deep latent models (exponential family mixture VAEs).
  • 1 paper and 1 demo accepted to ACL 2020, about tailoring pretrained language models and the robot reporter Xiaomingbot.
  • I am giving a talk at ICLR 2020 about Learning Deep Latent Models for Text Sequences. You may watch it here.
  • 1 paper accepted to AISTATS 2020, about density ratio estimation for text generation.
  • 2 papers accepted at ICLR 2020, about a mirror generative model that unites language modelling and machine translation, and about learning data-to-text generation templates via a variational method even without a parallel corpus.
  • 4 papers accepted at AAAI 2020, about a pretraining method for neural machine translation, text editing, and approximate second-order optimization.
  • 1 paper accepted at NeurIPS 2019, about contextualized embedding for text generation and how we use kernels to model the distribution and variance of word embeddings. See you in Vancouver.
  • EMNLP 2019 Tutorial on Discreteness in NLP
  • 1 paper accepted at INLG 2019, about style transfer for text generation.
  • 1 paper accepted at EMNLP 2019, about linear time neural machine translation.
  • 2 papers accepted at ICCV 2019. One is to be presented as an Oral talk.
  • Dr. Hao Zhou and I are going to give a tutorial on deep generative models for text generation at NLPCC-ADL 2019 in Dunhuang, China.

Media Coverage


Email: <the first part of this website> + gmail server address.