Constructing a vocabulary is a first step for any NLP task. How can we efficiently learn an optimal vocabulary for machine translation? In this blog, I will explain the VOLT algorithm from the paper Vocabulary Learning via Optimal Transport for Neural Machine Translation, which was awarded the best paper at ACL 2021.

Reading time: About 8 minutes

5/17/2022 · Multilingual MT · Vocabulary Learning · Optimal Transport

How to develop a model to verify a natural language statement while explaining its rationale?

Reading time: About 10 minutes

4/8/2022 · Fact Verification · Reasoning · Logic-regularized Neural Network · Interpretable NLP

A high-performance open-source library for training and inference of NLP Transformer models.

12/10/2021 · Transformer · GPU Acceleration · CUDA · High Performance Computing
12/8/2021 · Translation Memory

Self-training is a prevalent semi-supervised method. Its key idea is to augment the original labeled dataset with unlabeled data paired with the model's own predictions (i.e., pseudo-parallel data). Self-training has been widely used in classification tasks, but does it work on sequence generation tasks such as machine translation? If so, how? This blog introduces a work [1] that investigates these questions and provides answers.
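The core loop described above (train on labeled data, pseudo-label the unlabeled pool, retrain on the union) can be sketched in a few lines. The `self_train` function and the toy 1-nearest-neighbour "trainer" below are hypothetical stand-ins to make the loop runnable, not the setup used in the paper:

```python
def self_train(train_fn, labeled, unlabeled, rounds=3):
    """Generic self-training loop: train a model on the labeled set,
    label the unlabeled pool with its predictions (pseudo-parallel
    data), then retrain on the union; repeat for a few rounds."""
    model = train_fn(labeled)
    for _ in range(rounds):
        pseudo = [(x, model(x)) for x in unlabeled]  # pseudo-labels
        model = train_fn(labeled + pseudo)
    return model

# Hypothetical toy "trainer": a 1-nearest-neighbour predictor on
# scalar inputs, standing in for a real sequence model.
def train_1nn(pairs):
    def predict(x):
        return min(pairs, key=lambda p: abs(p[0] - x))[1]
    return predict

model = self_train(train_1nn, [(0.0, "neg"), (10.0, "pos")], [1.0, 9.0])
print(model(2.0))  # prints "neg": nearest (pseudo-)labeled point is 1.0
```

In the sequence-generation setting the same skeleton applies, except `train_fn` fits an MT model and the pseudo-labels are beam-search translations of monolingual source sentences.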

12/5/2021 · Self-training
12/1/2021 · MT Evaluation · Pre-trained Language Model

Speech translation (ST) is in increasing demand in our daily life and work. Applications such as travel assistants, simultaneous conference interpretation, and movie subtitling can greatly reduce translation costs. Building an ST system that understands acoustic speech signals and directly translates them into text in a target language is challenging. For one thing, people do not always plan out what they are going to say, so unlike text translation, the input to ST often lacks complete organization. For another, parallel corpora for ST are scarce compared to those for MT, and most ST methods are limited by the amount of parallel data available.

11/30/2021 · Speech Translation · BERT

Upon its emergence, the Transformer neural network [1] came to dominate sequence-to-sequence tasks, even outperforming the Google Neural Machine Translation model on certain tasks. In particular, the multi-head attention mechanism, built on dot products between query and key vectors, is deemed one of the critical building blocks that make it work. But is it really that important?
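As a refresher on the operation in question, the dot-product attention that multi-head attention is built on computes softmax(QKᵀ/√d)·V. Below is a minimal pure-Python sketch on plain lists of vectors, illustrative only; real implementations use batched matrix operations:

```python
import math

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d)) V for lists of query, key, and
    value vectors; returns one output vector per query."""
    d = len(K[0])
    out = []
    for q in Q:
        # dot-product scores between this query and every key
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in K]
        # numerically stable softmax over the scores
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]
        # output = convex combination of the value vectors
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

# One query attending over two keys: the query matches the first
# key more strongly, so its value gets the larger weight.
out = scaled_dot_product_attention([[1.0, 0.0]],
                                   [[1.0, 0.0], [0.0, 1.0]],
                                   [[1.0], [0.0]])
```

The blog post this teaser introduces asks whether this dot-product scoring can be replaced with something simpler without losing quality.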

11/29/2021 · Transformer · Recurrent Attention

Can one build a neural machine translation model without parallel data?

11/28/2021 · Unsupervised Machine Translation

Lei Li


