291K Deep Learning for Machine Translation (Fall 2021)
Course Description
This course teaches modern deep learning methods and the latest research
frontiers in neural machine translation. It covers the basics, history,
model architectures, training methods, decoding, and data and evaluation
methods for neural machine translation. Models for discrete sequences
include long short-term memory (LSTM) networks, the Transformer, and
encoder-decoder frameworks for generating language. Learning techniques
include back-translation, knowledge distillation, and self-supervised
pre-training. The course introduces cutting-edge research on
multilingual machine translation, low-resource translation, and speech
translation, as well as engineering techniques to speed up computation for
neural models. The course will also invite industry leaders to share their
experience in building real machine translation products.
Instructor
Lei Li
Time and Location
Monday/Wednesday 11:00am-12:50pm, Phelps 3526 (also on Zoom)
Office hour:
Via Zoom with the instructor, by appointment.
Textbook
- (Optional) Neural Machine Translation, Philipp Koehn. Publisher:
  Cambridge University Press. ISBN-10: 1108497322. Preprint version
  available online.
- (Optional) Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron
  Courville. Publisher: MIT Press. Available online.
- (Optional) Dive into Deep Learning, Aston Zhang, Zachary Lipton, Mu
  Li, and Alexander Smola. Available online.
- (Optional) Linguistic Fundamentals for Natural Language Processing:
  100 Essentials from Morphology and Syntax, Emily Bender. Publisher:
  Morgan & Claypool.
Prerequisites
130A or 130B; 165A or 165B.
Grading
- Homework
- Project: 40%
- Participation in active discussion: 5%
Discussion Forum
Piazza:
https://piazza.com/class/ksousnwx3cl1ux
Piazza is the main channel of communication; questions can be posted and
discussed there.
Policy
Please read the following link carefully!
Syllabus
# | Date | Lecture Topic | Reading | Homework
--- | --- | --- | --- | ---
1 | M 9/27 | Introduction to MT, History, Probability, Statistical MT | Weaver 1949; Brown 1993 | HW1
2 | W 9/29 | Data, Vocabulary and Evaluation | Papineni 2002 BLEU; Post 2018 SacreBLEU; Freitag 2021 Human evaluation; Sennrich 2016 BPE | 
3 | M 10/4 | Basic Neural Network Layers, Embedding, Model Training | Chap. 6 of DL book | 
4 | W 10/6 | CNN | Chap. 9 of DL book; He 2016 ResNet; Kalchbrenner 2014 CNN Sequence | 
5 | M 10/11 | Encoder-decoder, LSTM | Sutskever 2014 Seq2seq; Bahdanau 2015 Attention; Luong 2015 Attention; Gers 2000 LSTM-Forget | HW1 due, HW2
6 | W 10/13 | Transformer | Vaswani 2017 Transformer | 
7 | M 10/18 | Decoding | | Project proposal due
8 | W 10/20 | Pre-training Language Models | Devlin 2019 BERT; Peters 2018 ELMo | 
9 | M 10/25 | BERT for MT and Learned Metrics | CTNMT (BERT-NMT); BERTScore; COMET | HW2 due, HW3
10 | W 10/27 | Semi-supervised and Unsupervised MT | Back Translation; Semi-supervised MT; Unsupervised MT, Artetxe 2018; Unsupervised MT, Lample 2018 | 
11 | M 11/1 | Latent Generative Models for MT | VAE; Sentence VAE; Mirror Generative NMT | 
12 | W 11/3 | Multilingual Neural Machine Translation | Google-MNMT; mTransformer; Serial Adapter; Parallel Adapter (CIAT); LaSS; Prune-tune | 
13 | M 11/8 | Speech Representation Learning (Guest Lecture by Dr. Michael Auli, Facebook) | wav2vec; wav2vec 2.0; wav2vec-U | HW3 due
14 | W 11/10 | Seq2seq Pre-training for NMT | BART; MASS | Project midterm report due
15 | M 11/15 | Multilingual Pre-training for NMT | mRASP & mRASP2; mBART; Graformer | 
16 | W 11/17 | Speech Translation (1) | COSTT; CTC | 
17 | M 11/22 | Speech Translation (2) | Chimera; XSTNet; LUT; FAT-ST | 
18 | W 11/24 | Advanced Vocabulary Learning | VOLT | 
19 | M 11/29 | Parallel Decoding, and Machine Translation: From Research to Industry (Guest Lecture by Dr. Mingxuan Wang, ByteDance) | NAT; CMLM; GLAT; KSTER | 
20 | W 12/1 | Research and Development for Customizable Neural Machine Translation of Text and Speech (Guest Lecture by Dr. Evgeny Matusov, AppTek) | | 
 | M 12/6 | Project Poster Presentation (1-3pm) | | Final project report due
Topics not covered in the class: interactive machine translation for
computer-aided translation, translation models applied to other areas, and
other text generation problems.