291K Deep Learning for Machine Translation (Fall 2021)

Course Description

This course covers modern deep learning methods and the latest research frontiers for neural machine translation: basics, history, model architectures, training methods, decoding, data, and evaluation. Models for discrete sequences include long short-term memory (LSTM) networks, the Transformer, and encoder-decoder frameworks for generating language. Learning techniques covered include back-translation, knowledge distillation, and self-supervised pre-training. The course introduces cutting-edge research on multilingual machine translation, low-resource translation, and speech translation, as well as engineering techniques that speed up computation for neural models. Industry leaders will also be invited to share their experience building real machine translation products.

Instructor

Lei Li

Time and Location

Monday/Wednesday 11:00am-12:50pm, Phelps 3526 (also on Zoom)

Office Hours

Via Zoom with the instructor, by appointment.

Textbook

Prerequisites

130A or 130B; 165A or 165B.

Grading

Discussion Forum

Piazza: https://piazza.com/class/ksousnwx3cl1ux
Piazza is the main channel of communication; questions can be posted and discussed there.

Policy

Please read the following link carefully!

Syllabus

Lecture 1 (M 9/27): Introduction to MT, History, Probability, Statistical MT
  Reading: Weaver 1949; Brown 1993
  Homework: HW1 out

Lecture 2 (W 9/29): Data, Vocabulary and Evaluation
  Reading: Papineni 2002 BLEU; Post 2018 SacreBLEU; Freitag 2021 Human evaluation; Sennrich 2016 BPE

Lecture 3 (M 10/4): Basic Neural Network Layers, Embedding, Model Training
  Reading: Chapter 6 of the DL book

Lecture 4 (W 10/6): CNN
  Reading: Chapter 9 of the DL book; He 2016 ResNet; Kalchbrenner 2014 CNN Sequence

Lecture 5 (M 10/11): Encoder-Decoder, LSTM
  Reading: Sutskever 2014 Seq2seq; Bahdanau 2015 Attention; Luong 2015 Attention; Gers 2000 LSTM-Forget
  Homework: HW1 due; HW2 out

Lecture 6 (W 10/13): Transformer
  Reading: Vaswani 2017 Transformer

Lecture 7 (M 10/18): Decoding
  Due: Project proposal

Lecture 8 (W 10/20): Pre-training Language Models
  Reading: Devlin 2019 BERT; Peters 2018 ELMo

Lecture 9 (M 10/25): BERT for MT and Learned Metrics
  Reading: CTNMT (BERT-NMT); BERTScore; COMET
  Homework: HW2 due; HW3 out

Lecture 10 (W 10/27): Semi-supervised and Unsupervised MT
  Reading: Back Translation; Semi-supervised MT; Artetxe 2018 Unsupervised MT; Lample 2018 Unsupervised MT

Lecture 11 (M 11/1): Latent Generative Models for MT
  Reading: VAE; Sentence VAE; Mirror Generative NMT

Lecture 12 (W 11/3): Multilingual Neural Machine Translation
  Reading: Google-MNMT; mTransformer; Serial Adapter; Parallel Adapter (CIAT); LaSS; Prune-Tune

Lecture 13 (M 11/8): Speech Representation Learning (guest lecture by Dr. Michael Auli, Facebook)
  Reading: wav2vec; wav2vec 2.0; wav2vec-U
  Homework: HW3 due

Lecture 14 (W 11/10): Seq2seq Pre-training for NMT
  Reading: BART; MASS
  Due: Project midterm report

Lecture 15 (M 11/15): Multilingual Pre-training for NMT
  Reading: mRASP & mRASP2; mBART; Graformer

Lecture 16 (W 11/17): Speech Translation (1)
  Reading: COSTT; CTC

Lecture 17 (M 11/22): Speech Translation (2)
  Reading: Chimera; XSTNet; LUT; FAT-ST

Lecture 18 (W 11/24): Advanced Vocabulary Learning
  Reading: VOLT

Lecture 19 (M 11/29): Parallel Decoding, and Machine Translation: From Research to Industry (guest lecture by Dr. Mingxuan Wang, ByteDance)
  Reading: NAT; CMLM; GLAT; KSTER

Lecture 20 (W 12/1): Research and Development for Customizable Neural Machine Translation of Text and Speech (guest lecture by Dr. Evgeny Matusov, AppTek)

M 12/6: Project Poster Presentation (1-3pm)
  Due: Final project report
Topics not covered in the class: interactive machine translation for computer-aided translation, translation models applied to other areas, and other text generation problems.