ParlAI Quick-start

Authors: Alexander Holden Miller, Margaret Li

Colab Tutorial

As an alternative to this quick start tutorial, you may also consider our Google Colab tutorial, which takes you through fine-tuning the small version of BlenderBot (90M).

Install

First, make sure you have Python 3 installed. Then open a terminal and run the following commands.

Clone ParlAI Repository:

git clone https://github.com/facebookresearch/ParlAI.git ~/ParlAI

Install ParlAI:

cd ~/ParlAI; python setup.py develop

This will add the parlai command to your system.

Several models have additional requirements, such as PyTorch.

View a task & train a model

Let’s start by printing out the first few examples of the bAbI tasks, task 1.

# display examples from bAbI 10k task 1
parlai display_data --task babi:task10k:1

Now let’s try to train a model on it (even on your laptop, this should train fast).

# train MemNN with batch size 1 for 5 epochs
parlai train_model --task babi:task10k:1 --model-file /tmp/babi_memnn --batchsize 1 --num-epochs 5 --model memnn --no-cuda

Let’s print some of its predictions to make sure it’s working.

# display predictions for the model saved at the specified file on bAbI task 1
parlai display_model --task babi:task10k:1 --model-file /tmp/babi_memnn --eval-candidates vocab

The “eval_labels” and “MemNN” lines should (usually) match!

Let’s try asking the model a question ourselves.

# interact with saved model
parlai interactive --model-file /tmp/babi_memnn --eval-candidates vocab
...
Enter your message: John went to the hallway.\n Where is John?

Hopefully the model gets this right!

Train a Transformer on Twitter

Now let’s try training a Transformer (Vaswani et al., 2017) ranker model. Make sure to complete this section on a GPU with PyTorch installed.

We’ll be training on the Twitter task, which is a dataset of tweets and replies. There’s more information on tasks in these docs, including a full list of tasks and instructions on specifying arguments for training and evaluation (like the --task argument used here, which can also be written as -t <task>).

Let’s begin again by printing the first few examples.

# display first examples from twitter dataset
parlai display_data --task twitter

Now, we’ll train the model. This will take a while to reach convergence.

# train transformer ranker
parlai train_model --task twitter --model-file /tmp/tr_twitter --model transformer/ranker --batchsize 16 --validation-every-n-secs 3600 --candidates batch --eval-candidates batch --data-parallel True

You can modify some of the command line arguments we use here. We set the batch size to 16, run validation every 3600 seconds, and take candidates from the batch for training and evaluation.
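To make "candidates from the batch" concrete, here is a minimal sketch of in-batch candidate ranking. It is a toy illustration in plain Python, not ParlAI's implementation: the function names (`rank_in_batch`, `hits_at_1`) and the hand-written vectors are assumptions made up for this example. Each context is scored against the labels of every example in the same batch, so the correct answer for example i is candidate i.

```python
# Toy sketch of in-batch candidate ranking (hypothetical helper names,
# not ParlAI's actual code).

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rank_in_batch(context_vecs, label_vecs):
    """For each context, return candidate indices sorted best-first.

    The candidate set for every context is the list of labels of all
    examples in the batch.
    """
    rankings = []
    for ctx in context_vecs:
        scores = [dot(ctx, cand) for cand in label_vecs]
        order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
        rankings.append(order)
    return rankings


def hits_at_1(rankings):
    """Fraction of examples whose own label is ranked first."""
    correct = sum(1 for i, order in enumerate(rankings) if order[0] == i)
    return correct / len(rankings)


# Made-up encodings for a batch of three (context, label) pairs.
contexts = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
labels = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.6]]

rankings = rank_in_batch(contexts, labels)
print(rankings[0], hits_at_1(rankings))
```

One consequence of this design is that a larger batch gives each example more negative candidates, so the batch size affects how hard the ranking problem is as well as the training speed.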

By default, the training script saves the model whenever it achieves its best validation result so far. The Twitter task is quite large, and validation runs by default after each epoch (a full pass through the training data). We want to save our model more frequently, so we set validation to run once an hour with --validation-every-n-secs 3600 (short form: -vtim 3600).
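The interaction between timed validation and best-model saving can be sketched as follows. This is a simplified simulation under stated assumptions, not ParlAI's training loop: the function `train_with_timed_validation`, the fixed step durations, and the validation scores are all invented for illustration, and a simulated clock stands in for real elapsed time.

```python
# Simplified simulation of "validate every N seconds, save on improvement".
# All names and numbers here are hypothetical.

def train_with_timed_validation(step_durations, valid_scores, validate_every_secs):
    """Return the (elapsed_secs, score) pairs at which a checkpoint is saved."""
    saved = []
    best = float("-inf")
    elapsed = 0.0
    last_valid = 0.0
    scores = iter(valid_scores)
    for duration in step_durations:
        elapsed += duration  # pretend one chunk of training took this long
        if elapsed - last_valid >= validate_every_secs:
            last_valid = elapsed
            score = next(scores, None)
            if score is None:
                break  # no more simulated validation results
            if score > best:
                # Only a new best validation score triggers a save.
                best = score
                saved.append((elapsed, score))
    return saved


checkpoints = train_with_timed_validation(
    step_durations=[1800] * 8,              # eight half-hour chunks
    valid_scores=[0.10, 0.25, 0.20, 0.30],  # made-up validation results
    validate_every_secs=3600,
)
print(checkpoints)
```

In this simulation the third validation (score 0.20) does not beat the previous best of 0.25, so no checkpoint is written at that point.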

The training script evaluates the model on the valid and test sets at the end of training. To evaluate a saved model separately, for example to compare the results of our newly trained Transformer against the BlenderBot 90M baseline from our Model Zoo, we could do the following:

# Evaluate the BlenderBot 90M model on twitter data
parlai eval_model --task twitter --model-file zoo:blender/blender_90M/model

Finally, let’s print some of our transformer’s predictions with the same display_model script from above.

# display predictions for model saved at specific file on twitter
parlai display_model --task twitter --model-file /tmp/tr_twitter --eval-candidates batch

Add a simple model

Let’s put together a super simple model which will print the parsed version of what is said to it.

First let’s set it up.

mkdir parlai/agents/parrot
touch parlai/agents/parrot/parrot.py

We’ll inherit the TorchAgent parsing code so we don’t have to write it ourselves. Open parrot.py and copy the following:

from parlai.core.torch_agent import TorchAgent, Output

class ParrotAgent(TorchAgent):
    def train_step(self, batch):
        pass

    def eval_step(self, batch):
        # for each row in the batch, convert the tensor back to a text string
        return Output([self.dict.vec2txt(row) for row in batch.text_vec])

    def build_model(self):
        # Our agent doesn't have a real model, so we will return a placeholder
        # here.
        return None

Now let’s test it out:

parlai display_model --task babi:task10k:1 --model parrot

You’ll notice the model always outputs the “unknown” token. We haven’t built a dictionary yet, so the dictionary doesn’t recognize any tokens and maps every one of them to “unknown”. Let’s build a dictionary now.

parlai build_dict --task babi:task10k:1 --dict-file /tmp/parrot.dict

Now let’s try our Parrot agent again.

parlai display_model --task babi:task10k:1 --model parrot --dict-file /tmp/parrot.dict
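To see why an empty dictionary yields only the unknown token, consider this toy dictionary. It is a stand-in written for this example, not ParlAI's dictionary class: `ToyDict`, `add_text`, and the whitespace tokenizer are assumptions, and only the method names `txt2vec` and `vec2txt` are chosen to echo the `vec2txt` call used in the ParrotAgent above.

```python
# Toy dictionary illustrating the unknown-token fallback.
# ToyDict is hypothetical and much simpler than a real tokenizer.

class ToyDict:
    UNK = "__unk__"  # illustrative name for the unknown token

    def __init__(self):
        # Index 0 is reserved for the unknown token.
        self.tok2ind = {self.UNK: 0}
        self.ind2tok = {0: self.UNK}

    def add_text(self, text):
        """Add every whitespace-separated token in `text` to the vocabulary."""
        for tok in text.split():
            if tok not in self.tok2ind:
                idx = len(self.tok2ind)
                self.tok2ind[tok] = idx
                self.ind2tok[idx] = tok

    def txt2vec(self, text):
        """Map tokens to indices; unseen tokens fall back to index 0."""
        return [self.tok2ind.get(tok, 0) for tok in text.split()]

    def vec2txt(self, vec):
        """Map indices back to a space-joined string."""
        return " ".join(self.ind2tok[i] for i in vec)


# Without a built vocabulary, every token round-trips to the unknown token.
empty = ToyDict()
before = empty.vec2txt(empty.txt2vec("Where is John?"))

# After building the vocabulary, the text round-trips unchanged.
built = ToyDict()
built.add_text("John went to the hallway. Where is John?")
after = built.vec2txt(built.txt2vec("Where is John?"))

print(before)
print(after)
```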

This ParrotAgent implements eval_step, one of two abstract functions in TorchAgent. The other is train_step. You can easily and quickly build a model agent by creating a class which implements only these two functions with the most typical custom code for a model, and inheriting vectorization and batching from TorchAgent.

As needed, you can also override any functions to change the default argument values or to override the behavior with your own. For example, you could change the vectorizer to return numpy arrays instead of Torch Tensors.
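The division of labour described above, where a base class owns parsing and batching while a subclass supplies the step functions, can be sketched without ParlAI at all. Everything in this block is hypothetical: `ToyBaseAgent`, `ToyParrot`, `ToyUpperParrot`, and the `act` signature are simplified stand-ins and do not mirror TorchAgent's real interface.

```python
# Toy sketch of "inherit the plumbing, implement the steps".
# These classes are illustrative only and are not ParlAI classes.

class ToyBaseAgent:
    """Owns vectorization and batching; subclasses supply the steps."""

    def vectorize(self, text):
        # Default parsing: lowercase and split on whitespace.
        return text.lower().split()

    def batchify(self, texts):
        return [self.vectorize(t) for t in texts]

    def act(self, texts, training=False):
        batch = self.batchify(texts)
        if training:
            return self.train_step(batch)
        return self.eval_step(batch)

    def train_step(self, batch):
        raise NotImplementedError

    def eval_step(self, batch):
        raise NotImplementedError


class ToyParrot(ToyBaseAgent):
    """Implements only the two step functions."""

    def train_step(self, batch):
        return None  # nothing to learn

    def eval_step(self, batch):
        return [" ".join(tokens) for tokens in batch]


class ToyUpperParrot(ToyParrot):
    """Overrides the inherited vectorizer to change default behavior."""

    def vectorize(self, text):
        return text.upper().split()


replies = ToyParrot().act(["Hello there", "Where is John?"])
loud_replies = ToyUpperParrot().act(["Hello there"])
print(replies, loud_replies)
```

The second subclass changes only `vectorize`, which is the same kind of targeted override as swapping the return type of a vectorizer while leaving the rest of the agent untouched.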

Conclusion

To learn more about ParlAI’s general structure, how tasks and models are set up, or how to use Mechanical Turk, Messenger, Tensorboard, and more, check out the other tutorials.