291K Deep Learning for Machine Translation (Fall 2021)
Course Description
This course teaches modern deep learning methods and the latest research
frontiers in neural machine translation. It covers the basics, history,
model architectures, training methods, decoding, and data and evaluation
methods for neural machine translation. Models for discrete sequences
include long short-term memory (LSTM) networks, the Transformer, and
encoder-decoder frameworks for generating language. Learning techniques
include back-translation, knowledge distillation, and self-supervised
pre-training. The course introduces cutting-edge research on
multilingual machine translation, low-resource translation, and speech
translation, as well as engineering techniques to speed up computation for
neural models. The course will also invite industry leaders to share their
experience in building real machine translation products.
Instructor
Lei Li
Time and Location
Monday/Wednesday 11:00am-12:50pm, Phelps 3526 (also on Zoom)
Office hour:
Via Zoom with the instructor, by appointment.
Textbook
- (Optional) Neural Machine Translation, Philipp Koehn. Publisher:
  Cambridge University Press. ISBN-10: 1108497322. Preprint version
  available online.
- (Optional) Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron
  Courville. Publisher: MIT Press. Available online.
- (Optional) Dive into Deep Learning, Aston Zhang, Zachary Lipton, Mu
  Li, and Alexander Smola. Available online.
- (Optional) Linguistic Fundamentals for Natural Language Processing:
  100 Essentials from Morphology and Syntax, Emily Bender. Publisher:
  Morgan & Claypool.
Prerequisites
130A or 130B; 165A or 165B.
Grading
- Homework
- Project: 40%
- Participation in active discussion: 5%
Discussion Forum
Piazza:
https://piazza.com/class/ksousnwx3cl1ux
Piazza is the main channel of communication; questions can be posted and
discussed there.
Policy
Please read the following link carefully!
Syllabus
# | Date | Lecture Topic | Reading | Homework
--- | --- | --- | --- | ---
1 | M 9/27 | Introduction to MT, History, Probability, Statistical MT | Weaver 1949; Brown 1993 | HW1
2 | W 9/29 | Data, Vocabulary and Evaluation | Papineni 2002 BLEU; Post 2018 SacreBLEU; Freitag 2021 Human evaluation; Sennrich 2016 BPE | 
3 | M 10/4 | Basic Neural Network Layers, Embedding, Model Training | Chap. 6 of DL book | 
4 | W 10/6 | CNN | Chap. 9 of DL book; He 2016 ResNet; Kalchbrenner 2014 CNN Sequence | 
5 | M 10/11 | Encoder-decoder, LSTM | Sutskever 2014 Seq2seq; Bahdanau 2015 Attention; Luong 2015 Attention; Gers 2000 LSTM-Forget | HW1 due, HW2
6 | W 10/13 | Transformer | Vaswani 2017 Transformer | 
7 | M 10/18 | Decoding | | Project proposal due
8 | W 10/20 | Pre-training Language Models | Devlin 2019 BERT; Peters 2018 ELMo | 
9 | M 10/25 | BERT for MT and Learned Metrics | CTNMT (BERT-NMT); BERTScore; COMET | HW2 due, HW3
10 | W 10/27 | Semi-supervised and Unsupervised MT | Back Translation; Semi-supervised MT; Unsupervised MT, Artetxe 2018; Unsupervised MT, Lample 2018 | 
11 | M 11/1 | Latent Generative Models for MT | VAE; Sentence VAE; Mirror Generative NMT | 
12 | W 11/3 | Multilingual Neural Machine Translation | Google-MNMT; mTransformer; Serial Adapter; Parallel Adapter (CIAT); LaSS; Prune-tune | 
13 | M 11/8 | Speech Representation Learning (Guest Lecture by Dr. Michael Auli, Facebook) | wav2vec; wav2vec 2.0; wav2vec-U | HW3 due
14 | W 11/10 | Seq2seq Pre-training for NMT | BART; MASS | Project midterm report due
15 | M 11/15 | Multilingual Pre-training for NMT | mRASP & mRASP2; mBART; Graformer | 
16 | W 11/17 | Speech Translation (1) | COSTT; CTC | 
17 | M 11/22 | Speech Translation (2) | Chimera; XSTNet; LUT; FAT-ST | 
18 | W 11/24 | Advanced Vocabulary Learning | VOLT | 
19 | M 11/29 | Parallel Decoding, and Machine Translation: From Research to Industry (Guest Lecture by Dr. Mingxuan Wang, ByteDance) | NAT; CMLM; GLAT; KSTER | 
20 | W 12/1 | Research and Development for Customizable Neural Machine Translation of Text and Speech (Guest Lecture by Dr. Evgeny Matusov, AppTek) | | 
 | M 12/6 | Project Poster Presentation (1-3pm) | | Final project report due
Topics not covered in the class: interactive machine translation for
computer-aided translation, translation models applied to other areas, and
other text generation problems.