Blogs

Transection - Transformers for English to Chinese Translation

This post presents how to train a sequence-to-sequence Transformer model for English to Chinese translation, nicely abbreviated to Transection. We adopt BART's (Lewis, Mike, et al. 2019) architecture for this model and train it in two ways: the first trains it from scratch, and the second fine-tunes it from BART's pre-trained base checkpoint available in 🤗transformers. The training data consist of around 5M English-Chinese sequence pairs, along with a test set of around 40k pairs. Later in this blog, the two approaches are compared on the test set using sacrebleu. In addition, a popular pre-trained model in this domain, Helsinki-NLP/opus-mt-en-zh from 🤗Huggingface's models hub, is used as a baseline for both. Beyond sacrebleu, their performance is also discussed in terms of generalisation, model size, and training cost.
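For a taste of the setup, here is a minimal sketch of how the Helsinki-NLP/opus-mt-en-zh baseline could be loaded from 🤗transformers and scored with sacrebleu; the sample sentence pair is illustrative, not drawn from the actual test set.

```python
# Minimal sketch: score the Helsinki-NLP/opus-mt-en-zh baseline with
# sacrebleu. Assumes transformers and sacrebleu are installed; the
# example sentences below are illustrative only.
from transformers import MarianMTModel, MarianTokenizer
import sacrebleu

model_name = "Helsinki-NLP/opus-mt-en-zh"
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

sources = ["How are you?"]       # English inputs (illustrative)
references = [["你好吗？"]]       # one reference stream (illustrative)

batch = tokenizer(sources, return_tensors="pt", padding=True)
generated = model.generate(**batch)
hypotheses = tokenizer.batch_decode(generated, skip_special_tokens=True)

# sacrebleu's "zh" tokeniser segments Chinese at the character level
bleu = sacrebleu.corpus_bleu(hypotheses, references, tokenize="zh")
print(bleu.score)
```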

Written on September 30, 2020
Read More

Autocoder - Finetuning GPT-2 for Auto Code Completion

TL;DR. This link points to the code repository, which contains two readily downloadable fine-tuned GPT-2 weights, a quick-start guide on customising Autocoder, and a list of future pointers for this project. Although this blog reads like a technical introduction to Autocoder, I also touch on related topics along the way, such as notable work, the status quo, and future directions in NLP.
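For a flavour of what Autocoder does, below is a minimal sketch of code completion with a fine-tuned GPT-2 via 🤗transformers; the checkpoint path is hypothetical and stands in for one of the downloadable weights.

```python
# Minimal sketch: autocomplete code with a fine-tuned GPT-2.
# The checkpoint path is hypothetical, standing in for one of
# Autocoder's downloadable fine-tuned weights.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

checkpoint = "path/to/autocoder-checkpoint"  # hypothetical local path
tokenizer = GPT2Tokenizer.from_pretrained(checkpoint)
model = GPT2LMHeadModel.from_pretrained(checkpoint)

prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_length=64,
    do_sample=True,
    top_p=0.95,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```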

Written on June 21, 2020
Read More

Experimenting with Deep Models for Text Classification

Text classification, an important task in natural language processing (NLP), has been widely studied over the last several decades. The task takes a document as input and outputs the category to which the document belongs. In the literature, both supervised and unsupervised methods have been applied to text classification.
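To make the task concrete, here is a minimal sketch of supervised text classification using a classical TF-IDF plus logistic regression baseline (not one of the deep models the post experiments with); the toy documents and labels are illustrative.

```python
# Minimal sketch: supervised text classification with a classical
# TF-IDF + logistic regression pipeline. Toy data is illustrative.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

documents = ["the match ended in a draw", "stocks fell sharply today"]
categories = ["sports", "finance"]

classifier = make_pipeline(TfidfVectorizer(), LogisticRegression())
classifier.fit(documents, categories)

# Predict the category of an unseen document
print(classifier.predict(["the team won the game"]))
```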

Written on October 15, 2019
Read More

Run AllenNLP on Windows

As I said, I am currently a big fan of AI2. This began when I got to know its work on an open-source NLP research library, AllenNLP. The more I hack on the library, the more attractive it becomes to me. The short paragraph of praise below was written while I was hacking on the tool.

Written on September 17, 2019
Read More