Model Zoo

This is a list of pretrained ParlAI models. They are organized by task, or placed in a pretraining section when they are meant to be used as initialization for fine-tuning on another task.
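
All of the model files below are referenced with zoo: paths, which ParlAI resolves inside its data directory and downloads automatically the first time a command uses them. As a minimal sketch, such a path can also be resolved from Python (assuming the modelzoo_path helper in parlai.core.build_data, present in recent ParlAI versions):

from parlai.core.build_data import modelzoo_path
from parlai.core.params import ParlaiParser

# Resolve a zoo: alias to its on-disk location; the actual download
# happens lazily the first time an agent loads the file.
opt = ParlaiParser().parse_args([])
print(modelzoo_path(opt['datapath'], 'zoo:drqa/squad/model'))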

Squad models

Drqa Squad Model

[external website]

DrQA Reader trained on SQuAD

Example invocation(s):

parlai eval_model -mf zoo:drqa/squad/model -t squad -dt test

{'exs': 10570, 'accuracy': 0.6886, 'f1': 0.7821, 'hits@1': 0.689, 'hits@5': 0.689, 'hits@10': 0.689, 'hits@100': 0.689, 'bleu': 0.1364, 'train_loss': 0}
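
Recent ParlAI versions also expose each command-line script as a Python class, so the evaluation above can be launched programmatically. A minimal sketch, assuming this API is present in your install:

from parlai.scripts.eval_model import EvalModel

# Equivalent to: parlai eval_model -mf zoo:drqa/squad/model -t squad -dt test
report = EvalModel.main(
    model_file='zoo:drqa/squad/model',
    task='squad',
    datatype='test',
)
print(report)  # metrics dict (accuracy, f1, ...)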

Wikipedia models

Wikipedia Retriever (Used For Open Squad)

[external website]

Retrieval over a Wikipedia dump, used for DrQA on the open-domain SQuAD task. This is the dump from the original paper, used for replicating results.

Example invocation(s):

parlai interactive --model tfidf_retriever -mf zoo:wikipedia_20161221/tfidf_retriever/drqa_docs

Enter Your Message: Yann LeCun
[candidate_scores]: [507.05804682 390.18244433 279.24033928 269.60377042 214.00140589]
[SparseTfidfRetrieverAgent]:
Deep learning (also known as deep structured learning, hierarchical learning or deep machine learning) is a branch of machine learning based on a set of algorithms that attempt to model high level abstractions in data. In a simple case, you could have two sets of neurons: ones that receive an input signal and ones that send an output signal. When the input layer receives an input it passes on a modified version of the input to the next layer. In a deep network, there are many layers between the input and output (and the layers are not made of neurons but it can help to think of it that way), allowing the algorithm to use multiple processing layers, composed of multiple linear and non-linear transformations.

Deep learning is part of a broader family of machine learning methods based on ...
to commonsense reasoning which operates on concepts in terms of production rules of the grammar, and is a basic goal of both human language acquisition and AI. (See also Grammar induction.)

Wikipedia Retriever (Used For Wizard Of Wikipedia)

[related project]

Retrieval over a full Wikipedia dump, used for the Wizard of Wikipedia task.

Example invocation(s):

parlai interactive --model tfidf_retriever -mf zoo:wikipedia_full/tfidf_retriever/model

Enter Your Message: Yann LeCun
[candidate_scores]: [454.74038503 353.88863708 307.31353203 280.4501096  269.89960432]
[SparseTfidfRetrieverAgent]:
Yann LeCun (; born 1960) is a computer scientist with contributions in machine learning, computer vision, mobile robotics and computational neuroscience. He is well known for his work on optical character recognition and computer vision using convolutional neural networks (CNN), and is a founding father of convolutional nets. He is also one of the main creators of the DjVu image compression technology (together with Léon Bottou and Patrick Haffner). He co-developed the Lush programming language with Léon Bottou.

Yann LeCun was born near Paris, France, in 1960. He received a Diplôme d'Ingénieur from the Ecole Superieure d'Ingénieur en Electrotechnique et Electronique (ESIEE), Paris in 1983, and a PhD in Computer Science from Université Pierre et Marie Curie in 1987 during which he ...
of Science and Technology in Saudi Arabia because he was considered a terrorist in the country in view of his atheism.

In 2018 Yann LeCun picked a fight with a robot to support Facebook AI goals.
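
Either retriever can also be queried from Python through the standard observe/act agent API. A minimal sketch (the opt_overrides argument is an assumption about your ParlAI version, mirroring the --model tfidf_retriever flag above):

from parlai.core.agents import create_agent_from_model_file

retriever = create_agent_from_model_file(
    'zoo:wikipedia_full/tfidf_retriever/model',
    opt_overrides={'model': 'tfidf_retriever'},  # mirrors the CLI flag
)
retriever.observe({'text': 'Yann LeCun', 'episode_done': True})
reply = retriever.act()
print(reply.get('text', '')[:300])  # start of the top-ranked passage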

Wizard Of Wikipedia models

Wizard Of Wikipedia (End To End Generator)

[related project]

End2End Generative model for Wizard of Wikipedia

Example invocation(s):

parlai display_model -t wizard_of_wikipedia:generator -mf zoo:wizard_of_wikipedia/end2end_generator/model -n 1 --display-ignore-fields knowledge_parsed

[chosen_topic]: Gardening
[knowledge]: no_passages_used __knowledge__ no_passages_used
Gardening __knowledge__ Gardening is the practice of growing and cultivating plants as part of horticulture.
Gardening __knowledge__ In gardens, ornamental plants are often grown for their flowers, foliage, or overall appearance; useful plants, such as root vegetables, leaf vegetables, fruits, and herbs, are grown for consumption, for use as dyes, or for medicinal or cosmetic use.
Gardening __knowledge__ Gardening is considered by many people to be a relaxing activity.
Gardening __knowledge__ Gardening ranges in scale from fruit orchards, to long boulevard plantings with one or more different types of shrubs, trees, and herbaceous plants, to residential yards including lawns and foundation plantings, to plants in large or small containers ...
there had been several other notable gardening magazines in circulation, including the "Gardeners' Chronicle" and "Gardens Illustrated", but these were tailored more for the professional gardener.

[title]: Gardening
[checked_sentence]: Gardening is considered by many people to be a relaxing activity.
[eval_labels_choice]: I live on a farm, we garden all year long, it is very relaxing.
[checked_sentence_parsed]: Gardening __knowledge__ Gardening is considered by many people to be a relaxing activity.
[WizTeacher]: Gardening
I like Gardening, even when I've only been doing it for a short time.
[eval_labels: I live on a farm, we garden all year long, it is very relaxing.]
[TorchAgent]: i love gardening , it is considered a relaxing activity .

Wizard Of Wikipedia (Full Dialogue Retrieval Model)

[related project]

Full Dialogue Retrieval Model for Wizard of Wikipedia

Example invocation(s):

parlai display_model -t wizard_of_wikipedia -mf zoo:wizard_of_wikipedia/full_dialogue_retrieval_model/model -m projects:wizard_of_wikipedia:wizard_transformer_ranker --n-heads 6 --ffn-size 1200 --embeddings-scale False --delimiter ' __SOC__ ' --n-positions 1000 --legacy True

[chosen_topic]: Gardening
[knowledge]: Gardening Gardening is the practice of growing and cultivating plants as part of horticulture.
Gardening In gardens, ornamental plants are often grown for their flowers, foliage, or overall appearance; useful plants, such as root vegetables, leaf vegetables, fruits, and herbs, are grown for consumption, for use as dyes, or for medicinal or cosmetic use.
Gardening Gardening is considered by many people to be a relaxing activity.
Gardening Gardening ranges in scale from fruit orchards, to long boulevard plantings with one or more different types of shrubs, trees, and herbaceous plants, to residential yards including lawns and foundation plantings, to plants in large or small containers grown inside or outside.
Gardening Gardening may be very specialized, with only one type of plant grown, ...
there had been several other notable gardening magazines in circulation, including the "Gardeners' Chronicle" and "Gardens Illustrated", but these were tailored more for the professional gardener.

[title]: Gardening
[checked_sentence]: Gardening is considered by many people to be a relaxing activity.
[eval_labels_choice]: I live on a farm, we garden all year long, it is very relaxing.
[wizard_of_wikipedia]: Gardening
I like Gardening, even when I've only been doing it for a short time.
[label_candidates: OK what's the history?|Right, thats cool. I had no idea they still did the DVD thing, What is Netflix's highest rated show? do you know? |I will definitely check his first album out as he sounds interesting.|I don't know a whole lot about it. I was raised Catholic but don't practice anything now.|Well , this was a good conversation. |...and 95 more]
[eval_labels: I live on a farm, we garden all year long, it is very relaxing.]
[TorchAgent]: I live on a farm, we garden all year long, it is very relaxing.
parlai interactive -m projects:wizard_of_wikipedia:interactive_retrieval -t wizard_of_wikipedia

[ Your chosen topic is: Teapot ]
Enter Your Message: do you like tea?
[WizardRetrievalInteractiveAgent]: Yes!  I only use teapots that have a little air hole in the lid. That prevents the spout from dripping or splashing when the tea is poured. Most teapots have this though.
Enter Your Message: what about kettles?
[WizardRetrievalInteractiveAgent]: I would think you could use them to heat any type of liquid! I use my teapots with a tea cosy. It's a thermal cover that helps keep the tea hot.
Enter Your Message: do you like earl grey?
[WizardRetrievalInteractiveAgent]: I think I'll try some Lipton, I love their green tea!

Imageseq2Seq Dodecadialogue Wizard Of Wikipedia Ft Model

[related project]

Image Seq2Seq model trained on all DodecaDialogue tasks and fine-tuned on the Wizard of Wikipedia task

Example invocation(s):

parlai interactive -mf zoo:dodecadialogue/wizard_of_wikipedia_ft/model --inference beam --beam-size 10 --beam-min-length 10 --beam-block-ngram 3 --beam-context-block-ngram 3 -t wizard_of_wikipedia

Please choose one of the following topics by typing A, B, C, ..., etc. :

A: Denver Art Museum
B: Joke
C: Sport utility vehicle
D: Chocolate brownie
E: [NO TOPIC]

Enter Your Message: D
[ Your chosen topic is: Chocolate brownie ]
/private/home/kshuster/ParlAI/parlai/core/torch_ranker_agent.py:670: UserWarning: [ Executing eval mode with provided inline set of candidates ]
''.format(mode)
[ Using chosen sentence from Wikpedia ]: A chocolate brownie (commonly referred to as simply brownie) is a square, baked, chocolate dessert.
[ImageSeq2seq]: i love chocolate brownies . they ' re my favorite .
Enter Your Message: me too! do you ever make them?
[ Using chosen sentence from Wikpedia ]: Brownies are typically eaten by hand, often accompanied by milk, served warm with ice cream (a la mode), topped with whipped cream, or sprinkled with powdered sugar and fudge.
[ImageSeq2seq]: i don ' t make them , but i would love to .
parlai eval_model -mf zoo:dodecadialogue/wizard_of_wikipedia_ft/model -t wizard_of_wikipedia:Generator --prepend-gold-knowledge true

[ Finished evaluating tasks ['wizard_of_wikipedia:Generator'] using datatype valid ]
exs  gpu_mem  loss      lr   ppl  token_acc  total_train_updates  tpb
3939    .3823 2.144 7.5e-06 8.532      .5348                22908 2852

Unlikelihood Wizard Of Wikipedia Context And Label Repetition Model

[related project]

Dialogue model finetuned on Wizard of Wikipedia with context and label repetition unlikelihood

Example invocation(s):

python parlai/scripts/interactive.py -mf zoo:dialogue_unlikelihood/rep_wiki_ctxt_and_label/model -m projects.dialogue_unlikelihood.agents:RepetitionUnlikelihoodAgent

Enter Your Message: Hi.
[RepetitionUnlikelihood]: hi .

Unlikelihood Wizard Of Wikipedia Context Repetition Model

[related project]

Dialogue model finetuned on Wizard of Wikipedia with context repetition unlikelihood

Example invocation(s):

python parlai/scripts/interactive.py -mf zoo:dialogue_unlikelihood/rep_wiki_ctxt/model -m projects.dialogue_unlikelihood.agents:RepetitionUnlikelihoodAgent

Enter Your Message: Hi.
[RepetitionUnlikelihood]: hi .

Unlikelihood Wizard Of Wikipedia Label Repetition Model

[related project]

Dialogue model finetuned on Wizard of Wikipedia with label repetition unlikelihood

Example invocation(s):

python parlai/scripts/interactive.py -mf zoo:dialogue_unlikelihood/rep_wiki_label/model -m projects.dialogue_unlikelihood.agents:RepetitionUnlikelihoodAgent

Enter Your Message: Hi.
[RepetitionUnlikelihood]: hi .

Light Dialog models

Light Bert-Biranker Dialogue Model

[related project]

LIGHT Dialogue task, replicating the numbers from the paper.

Example invocation(s):

parlai eval_model -t light_dialog -mf zoo:light/biranker_dialogue/model

{'exs': 6623, 'accuracy': 0.7586, 'f1': 0.7802, 'hits@1': 0.759, 'hits@5': 0.965, 'hits@10': 0.994, 'hits@100': 1.0, 'bleu': 0.7255, 'lr': 5e-05, 'total_train_updates': 15050, 'examples': 6623, 'loss': 5307.0, 'mean_loss': 0.8013, 'mean_rank': 1.599, 'train_accuracy': 0}

Imageseq2Seq Dodecadialogue Light Dialogue Ft Model

[related project]

Image Seq2Seq model trained on all DodecaDialogue tasks and fine-tuned on the LIGHT Dialogue task

Example invocation(s):

parlai interactive -mf zoo:dodecadialogue/light_dialog_ft/model --inference beam --beam-size 5 --beam-min-length 20 --beam-block-ngram 3 --beam-context-block-ngram 3

Enter Your Message: hi how's it going?
[ImageSeq2seq]: i ' m doing well . how about you ? what ' s going on in the world today ?
Enter Your Message: not much, wish it had some more epic battles!
[ImageSeq2seq]: me too . it ' s been so long since i ' ve seen a battle like this . do you have a favorite battle ?
parlai eval_model -mf zoo:dodecadialogue/light_dialog_ft/model -t light_dialog

[ Finished evaluating tasks ['light_dialog'] using datatype valid ]
exs  gpu_mem  loss      lr   ppl  token_acc  total_train_updates   tpb
6623   .07002 2.927 7.5e-06 18.66      .3927                38068 20.81

Convai2 models

Controllable Dialogue Convai2 Model

[related project]

Seq2Seq model with control trained on ConvAI2

Example invocation(s):

parlai eval_model --model projects.controllable_dialogue.controllable_seq2seq.controllable_seq2seq:ControllableSeq2seqAgent --task projects.controllable_dialogue.tasks.agents -mf zoo:controllable_dialogue/convai2_finetuned_baseline

{'exs': 7801, 'accuracy': 0.0006409, 'f1': 0.1702, 'bleu': 0.005205, 'token_acc': 0.3949, 'loss': 3.129, 'ppl': 22.86}

Poly-Encoder Transformer Convai2 Model

[related project]

Poly-encoder pretrained on Reddit and fine-tuned on ConvAI2, scoring 89+ hits@1/20. See the pretrained_transformers directory for a list of other available pretrained transformers.

Example invocation(s):

parlai interactive -mf zoo:pretrained_transformers/model_poly/model -t convai2

hi how are you doing ?
[Polyencoder]: i am alright . i am back from the library .
Enter Your Message: oh, what do you do for a living?
[Polyencoder]: i work at the museum downtown . i love it there .
Enter Your Message: what is your favorite drink?
[Polyencoder]: i am more of a tea guy . i get my tea from china .
parlai eval_model -mf zoo:pretrained_transformers/model_poly/model -t convai2 --eval-candidates inline

[ Finished evaluating tasks ['convai2'] using datatype valid ]
{'exs': 7801, 'accuracy': 0.8942, 'f1': 0.9065, 'hits@1': 0.894, 'hits@5': 0.99, 'hits@10': 0.997, 'hits@100': 1.0, 'bleu': 0.8941, 'lr': 5e-09, 'total_train_updates': 0, 'examples': 7801, 'loss': 3004.0, 'mean_loss': 0.385, 'mean_rank': 1.234, 'mrr': 0.9359}

Bi-Encoder Transformer Convai2 Model

[related project]

Bi-encoder pretrained on Reddit and fine-tuned on ConvAI2, scoring ~87 hits@1/20.

Example invocation(s):

parlai interactive -mf zoo:pretrained_transformers/model_bi/model -t convai2

hi how are you doing ?
[Biencoder]: my mother is from russia .
Enter Your Message: oh cool, whereabouts ?
[Biencoder]: no , she passed away when i was 18 . thinking about russian recipes she taught me ,
Enter Your Message: what do you cook?
[Biencoder]: like meat mostly , me and my dogs love them , do you like dogs ?
parlai eval_model -mf zoo:pretrained_transformers/model_bi/model -t convai2 --eval-candidates inline

[ Finished evaluating tasks ['convai2'] using datatype valid ]
{'exs': 7801, 'accuracy': 0.8686, 'f1': 0.8833, 'hits@1': 0.869, 'hits@5': 0.987, 'hits@10': 0.996, 'hits@100': 1.0, 'bleu': 0.8685, 'lr': 5e-09, 'total_train_updates': 0, 'examples': 7801, 'loss': 28.77, 'mean_loss': 0.003688, 'mean_rank': 1.301, 'mrr': 0.9197}

Imageseq2Seq Dodecadialogue Convai2 Ft Model

[related project]

Image Seq2Seq model trained on all DodecaDialogue tasks and fine-tuned on Convai2

Example invocation(s):

parlai interactive -mf zoo:dodecadialogue/convai2_ft/model -t convai2 --inference beam --beam-size 3 --beam-min-length 10 --beam-block-ngram 3 --beam-context-block-ngram 3

[context]: your persona: i currently work for ibm in chicago.
your persona: i'm not a basketball player though.
your persona: i am almost 7 feet tall.
your persona: i'd like to retire to hawaii in the next 10 years.
Enter Your Message: hi how's it going
[ImageSeq2seq]: i ' m doing well . how are you ?
Enter Your Message: i'm well, i am really tall
[ImageSeq2seq]: that ' s cool . i like simple jokes .
parlai eval_model -mf zoo:dodecadialogue/convai2_ft/model -t convai2

[ Finished evaluating tasks ['convai2'] using datatype valid ]
exs  gpu_mem  loss      lr   ppl  token_acc  total_train_updates   tpb
7801    .2993 2.415 7.5e-06 11.19      .4741                15815 845.8

Unlikelihood Convai2 Context And Label Repetition Model

[related project]

Dialogue model finetuned on ConvAI2 with context and label repetition unlikelihood

Example invocation(s):

python parlai/scripts/interactive.py -mf zoo:dialogue_unlikelihood/rep_convai2_ctxt_and_label/model -m projects.dialogue_unlikelihood.agents:RepetitionUnlikelihoodAgent

Enter Your Message: Hi.
[RepetitionUnlikelihood]: hi , how are you doing today ?

Unlikelihood Convai2 Context Repetition Model

[related project]

Dialogue model finetuned on ConvAI2 with context repetition unlikelihood

Example invocation(s):

python parlai/scripts/interactive.py -mf zoo:dialogue_unlikelihood/rep_convai2_ctxt/model -m projects.dialogue_unlikelihood.agents:RepetitionUnlikelihoodAgent

Enter Your Message: Hi.
[RepetitionUnlikelihood]: hi , how are you doing today ?

Unlikelihood Convai2 Label Repetition Model

[related project]

Dialogue model finetuned on ConvAI2 with label repetition unlikelihood

Example invocation(s):

python parlai/scripts/interactive.py -mf zoo:dialogue_unlikelihood/rep_convai2_label/model -m projects.dialogue_unlikelihood.agents:RepetitionUnlikelihoodAgent

Enter Your Message: Hi.
[RepetitionUnlikelihood]: hi , how are you doing today ?

Unlikelihood Vocab Alpha 1E0 Model

[related project]

Dialogue model finetuned on convai2 with vocab unlikelihood, alpha value 1e0

Example invocation(s):

python parlai/scripts/interactive.py -mf zoo:dialogue_unlikelihood/vocab_alpha1e0/model -m projects.dialogue_unlikelihood.agents:TransformerSequenceVocabUnlikelihoodAgent

Enter Your Message: Hi.
[TransformerSequenceVocabUnlikelihood]: hi there ! how are you ?

Unlikelihood Vocab Alpha 1E1 Model

[related project]

Dialogue model finetuned on convai2 with vocab unlikelihood, alpha value 1e1

Example invocation(s):

python parlai/scripts/interactive.py -mf zoo:dialogue_unlikelihood/vocab_alpha1e1/model -m projects.dialogue_unlikelihood.agents:TransformerSequenceVocabUnlikelihoodAgent

Enter Your Message: Hi.
[TransformerSequenceVocabUnlikelihood]: hi how are you today

Unlikelihood Vocab Alpha 1E2 Model

[related project]

Dialogue model finetuned on convai2 with vocab unlikelihood, alpha value 1e2

Example invocation(s):

python parlai/scripts/interactive.py -mf zoo:dialogue_unlikelihood/vocab_alpha1e2/model -m projects.dialogue_unlikelihood.agents:TransformerSequenceVocabUnlikelihoodAgent

Enter Your Message: Hi.
[TransformerSequenceVocabUnlikelihood]: hello , how are you ?

Unlikelihood Vocab Alpha 1E3 Model

[related project]

Dialogue model finetuned on convai2 with vocab unlikelihood, alpha value 1e3

Example invocation(s):

python parlai/scripts/interactive.py -mf zoo:dialogue_unlikelihood/vocab_alpha1e3/model -m projects.dialogue_unlikelihood.agents:TransformerSequenceVocabUnlikelihoodAgent

Enter Your Message: Hi.
[TransformerSequenceVocabUnlikelihood]: hi there !

Personality Captions models

Transresnet (Resnet 152) Personality-Captions Model

[related project]

Transresnet Model pretrained on the Personality-Captions task

Example invocation(s):

parlai eval_model -t personality_captions -mf zoo:personality_captions/transresnet/model --num-test-labels 5 -dt test

{'exs': 10000, 'accuracy': 0.5113, 'f1': 0.5951, 'hits@1': 0.511, 'hits@5': 0.816, 'hits@10': 0.903, 'hits@100': 0.998, 'bleu': 0.4999, 'hits@1/100': 1.0, 'loss': -0.002, 'med_rank': 1.0}

Pretrained Transformers models

Poly-Encoder Transformer Reddit Pretrained Model

[related project]

Poly-Encoder pretrained on Reddit. Use this model as an --init-model for a poly-encoder when fine-tuning on another task. For more details on how to train, see the project page.

Example invocation(s):

parlai train_model --init-model zoo:pretrained_transformers/poly_model_huge_reddit/model -t convai2 --model transformer/polyencoder --batchsize 256 --eval-batchsize 10 --warmup_updates 100 --lr-scheduler-patience 0 --lr-scheduler-decay 0.4 -lr 5e-05 --data-parallel True --history-size 20 --label-truncate 72 --text-truncate 360 --num-epochs 8.0 --max_train_time 200000 -veps 0.5 -vme 8000 --validation-metric accuracy --validation-metric-mode max --save-after-valid True --log_every_n_secs 20 --candidates batch --fp16 True --dict-tokenizer bpe --dict-lower True --optimizer adamax --output-scaling 0.06 --variant xlm --reduction-type mean --share-encoders False --learn-positional-embeddings True --n-layers 12 --n-heads 12 --ffn-size 3072 --attention-dropout 0.1 --relu-dropout 0.0 --dropout 0.1 --n-positions 1024 --embedding-size 768 --activation gelu --embeddings-scale False --n-segments 2 --learn-embeddings True --polyencoder-type codes --poly-n-codes 64 --poly-attention-type basic --dict-endtoken __start__ --model-file <YOUR MODEL FILE>

(subject to some variance, you may see the following as a result of validation of the model)
{'exs': 7801, 'accuracy': 0.8942 ...}
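
The same fine-tuning run can be launched from Python. A short illustrative sketch passing only a few of the flags above; unspecified options fall back to ParlAI defaults, so this will not reproduce the full recipe or the published numbers:

from parlai.scripts.train_model import TrainModel

# Fine-tune the Reddit-pretrained poly-encoder on ConvAI2 (subset of the CLI flags above).
TrainModel.main(
    init_model='zoo:pretrained_transformers/poly_model_huge_reddit/model',
    task='convai2',
    model='transformer/polyencoder',
    batchsize=256,
    learningrate=5e-05,  # -lr
    validation_metric='accuracy',
    validation_metric_mode='max',
    model_file='/tmp/poly_convai2',  # hypothetical output path
)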

Poly-Encoder Transformer Wikipedia/Toronto Books Pretrained Model

[related project]

Poly-Encoder pretrained on Wikipedia/Toronto Books. Use this model as an --init-model for a poly-encoder when fine-tuning on another task. For more details on how to train, see the project page.

Example invocation(s):

parlai train_model --init-model zoo:pretrained_transformers/poly_model_huge_wikito/model -t convai2 --model transformer/polyencoder --batchsize 256 --eval-batchsize 10 --warmup_updates 100 --lr-scheduler-patience 0 --lr-scheduler-decay 0.4 -lr 5e-05 --data-parallel True --history-size 20 --label-truncate 72 --text-truncate 360 --num-epochs 8.0 --max_train_time 200000 -veps 0.5 -vme 8000 --validation-metric accuracy --validation-metric-mode max --save-after-valid True --log_every_n_secs 20 --candidates batch --fp16 True --dict-tokenizer bpe --dict-lower True --optimizer adamax --output-scaling 0.06 --variant xlm --reduction-type mean --share-encoders False --learn-positional-embeddings True --n-layers 12 --n-heads 12 --ffn-size 3072 --attention-dropout 0.1 --relu-dropout 0.0 --dropout 0.1 --n-positions 1024 --embedding-size 768 --activation gelu --embeddings-scale False --n-segments 2 --learn-embeddings True --polyencoder-type codes --poly-n-codes 64 --poly-attention-type basic --dict-endtoken __start__ --model-file <YOUR MODEL FILE>

(subject to some variance, you may see the following as a result of validation of the model)
{'exs': 7801, 'accuracy': 0.861 ...}

Bi-Encoder Transformer Reddit Pretrained Model

[related project]

Bi-Encoder pretrained on Reddit. Use this model as an --init-model for a bi-encoder when fine-tuning on another task. For more details on how to train, see the project page.

Example invocation(s):

parlai train_model --init-model zoo:pretrained_transformers/bi_model_huge_reddit/model --batchsize 512 -t convai2 --model transformer/biencoder --eval-batchsize 6 --warmup_updates 100 --lr-scheduler-patience 0 --lr-scheduler-decay 0.4 -lr 5e-05 --data-parallel True --history-size 20 --label-truncate 72 --text-truncate 360 --num-epochs 10.0 --max_train_time 200000 -veps 0.5 -vme 8000 --validation-metric accuracy --validation-metric-mode max --save-after-valid True --log_every_n_secs 20 --candidates batch --dict-tokenizer bpe --dict-lower True --optimizer adamax --output-scaling 0.06 --variant xlm --reduction-type mean --share-encoders False --learn-positional-embeddings True --n-layers 12 --n-heads 12 --ffn-size 3072 --attention-dropout 0.1 --relu-dropout 0.0 --dropout 0.1 --n-positions 1024 --embedding-size 768 --activation gelu --embeddings-scale False --n-segments 2 --learn-embeddings True --share-word-embeddings False --dict-endtoken __start__ --fp16 True --model-file <YOUR MODEL FILE>

(subject to some variance, you may see the following as a result of validation of the model)
{'exs': 7801, 'accuracy': 0.8686 ...}

Bi-Encoder Transformer Wikipedia/Toronto Books Pretrained Model

[related project]

Bi-Encoder pretrained on Wikipedia/Toronto Books. Use this model as an --init-model for a bi-encoder when fine-tuning on another task. For more details on how to train, see the project page.

Example invocation(s):

parlai train_model --init-model zoo:pretrained_transformers/bi_model_huge_wikito/model --batchsize 512 -t convai2 --model transformer/biencoder --eval-batchsize 6 --warmup_updates 100 --lr-scheduler-patience 0 --lr-scheduler-decay 0.4 -lr 5e-05 --data-parallel True --history-size 20 --label-truncate 72 --text-truncate 360 --num-epochs 10.0 --max_train_time 200000 -veps 0.5 -vme 8000 --validation-metric accuracy --validation-metric-mode max --save-after-valid True --log_every_n_secs 20 --candidates batch --dict-tokenizer bpe --dict-lower True --optimizer adamax --output-scaling 0.06 --variant xlm --reduction-type mean --share-encoders False --learn-positional-embeddings True --n-layers 12 --n-heads 12 --ffn-size 3072 --attention-dropout 0.1 --relu-dropout 0.0 --dropout 0.1 --n-positions 1024 --embedding-size 768 --activation gelu --embeddings-scale False --n-segments 2 --learn-embeddings True --share-word-embeddings False --dict-endtoken __start__ --fp16 True --model-file <YOUR MODEL FILE>

(subject to some variance, you may see the following as a result of validation of the model)
{'exs': 7801, 'accuracy': 0.846 ...}

Cross-Encoder Transformer Reddit Pretrained Model

[related project]

Cross-Encoder pretrained on Reddit. Use this model as an --init-model for a cross-encoder when fine-tuning on another task. For more details on how to train, see the project page.

Example invocation(s):

parlai train_model --init-model zoo:pretrained_transformers/cross_model_huge_reddit/model -t convai2 --model transformer/crossencoder --batchsize 16 --eval-batchsize 10 --warmup_updates 1000 --lr-scheduler-patience 0 --lr-scheduler-decay 0.4 -lr 5e-05 --data-parallel True --history-size 20 --label-truncate 72 --text-truncate 360 --num-epochs 12.0 --max_train_time 200000 -veps 0.5 -vme 2500 --validation-metric accuracy --validation-metric-mode max --save-after-valid True --log_every_n_secs 20 --candidates inline --fp16 True --dict-tokenizer bpe --dict-lower True --optimizer adamax --output-scaling 0.06 --variant xlm --reduction-type first --share-encoders False --learn-positional-embeddings True --n-layers 12 --n-heads 12 --ffn-size 3072 --attention-dropout 0.1 --relu-dropout 0.0 --dropout 0.1 --n-positions 1024 --embedding-size 768 --activation gelu --embeddings-scale False --n-segments 2 --learn-embeddings True --dict-endtoken __start__ --model-file <YOUR MODEL FILE>

(subject to some variance, you may see the following as a result of validation of the model)
{'exs': 7801, 'accuracy': 0.903 ...}

Cross-Encoder Transformer Wikipedia/Toronto Books Pretrained Model

[related project]

Cross-Encoder pretrained on Wikipedia/Toronto Books. Use this model as an --init-model for a cross-encoder when fine-tuning on another task. For more details on how to train, see the project page.

Example invocation(s):

parlai train_model --init-model zoo:pretrained_transformers/cross_model_huge_wikito/model -t convai2 --model transformer/crossencoder --batchsize 16 --eval-batchsize 10 --warmup_updates 1000 --lr-scheduler-patience 0 --lr-scheduler-decay 0.4 -lr 5e-05 --data-parallel True --history-size 20 --label-truncate 72 --text-truncate 360 --num-epochs 12.0 --max_train_time 200000 -veps 0.5 -vme 2500 --validation-metric accuracy --validation-metric-mode max --save-after-valid True --log_every_n_secs 20 --candidates inline --fp16 True --dict-tokenizer bpe --dict-lower True --optimizer adamax --output-scaling 0.06 --variant xlm --reduction-type first --share-encoders False --learn-positional-embeddings True --n-layers 12 --n-heads 12 --ffn-size 3072 --attention-dropout 0.1 --relu-dropout 0.0 --dropout 0.1 --n-positions 1024 --embedding-size 768 --activation gelu --embeddings-scale False --n-segments 2 --learn-embeddings True --dict-endtoken __start__ --model-file <YOUR MODEL FILE>

(subject to some variance, you may see the following as a result of validation of the model)
{'exs': 7801, 'accuracy': 0.873 ...}

Image Chat models

Transresnet (Resnet152) Image-Chat Model

[related project]

Transresnet Multimodal Model pretrained on the Image-Chat task

Example invocation(s):

parlai eval_model -t image_chat -mf zoo:image_chat/transresnet_multimodal/model -dt test

{'exs': 29991, 'accuracy': 0.4032, 'f1': 0.4432, 'hits@1': 0.403, 'hits@5': 0.672, 'hits@10': 0.779, 'hits@100': 1.0, 'bleu': 0.3923, 'first_round': {'hits@1/100': 0.3392, 'loss': -0.002001, 'med_rank': 3.0}, 'second_round': {'hits@1/100': 0.4558, 'loss': -0.002001, 'med_rank': 2.0}, 'third_round+': {'hits@1/100': 0.4147, 'loss': -0.002001, 'med_rank': 2.0}}

Imageseq2Seq Dodecadialogue Image Chat Ft Model

[related project]

Image Seq2Seq model trained on all DodecaDialogue tasks and fine-tuned on the Image Chat task

Example invocation(s):

parlai eval_model -mf zoo:dodecadialogue/image_chat_ft/model -t image_chat:generation --image-mode no_image_model

[ Finished evaluating tasks ['image_chat:generation'] using datatype valid ]
exs  gpu_mem  loss        lr   ppl  token_acc  total_train_updates  tpb
15000    .2231 4.353 3.125e-07 77.73      .2905               321001 1653

Self Feeding models

Self-Feeding Chatbot

[related project]

The self-feeding chatbot of Hancock et al., 2019 (https://arxiv.org/abs/1901.05415). This model learns from its mistakes when actually talking with users. This particular model corresponds to the variant trained with 131k human-human chats + 60k human-bot chats + 60k feedback chats.

Example invocation(s):

python projects/self_feeding/interactive.py --model-file zoo:self_feeding/hh131k_hb60k_fb60k_st1k/model --no-cuda true

Enter Your Message: hi, my name is stephen. what's yours?
[SelfFeeding]: hi there greg . do you have pets ? i've 2 cats named milo and fio .
Enter Your Message: sadly, i have no pets. my landlord isn't a fan.
[SelfFeeding]: sorry to hear that . i always had bad allergies when i liven on my farm in kansas .
parlai eval_model -mf zoo:self_feeding/hh131k_hb60k_fb60k_st1k/model -t self_feeding:all

[ Finished evaluating tasks ['self_feeding:all'] using datatype valid ]
{'exs': 3500, 'dia_rank': 4.654, 'dia_acc': 0.3525, 'fee_rank': 1.0, 'fee_acc': 1.0, 'fee_exs': 1000, 'sat_re': 0.4607, 'sat_f1': 0.5605, 'sat_acc': 0.724}

Dialogue Safety models

Transformer Classifier Single-Turn Dialogue Safety Model

[related project]

Classifier trained on both the standard and adversarial safety tasks in addition to Wikipedia Toxic Comments.

Example invocation(s):

parlai eval_model -t dialogue_safety:adversarial --round 3 -dt test -mf zoo:dialogue_safety/single_turn/model -bs 40

{'exs': 3000, 'accuracy': 0.9627, 'f1': 0.9627, 'bleu': 9.627e-10, 'lr': 5e-09, 'total_train_updates': 0, 'examples': 3000, 'mean_loss': 0.005441, 'class___notok___recall': 0.7833, 'class___notok___prec': 0.8333, 'class___notok___f1': 0.8076, 'class___ok___recall': 0.9826, 'class___ok___prec': 0.9761, 'class___ok___f1': 0.9793, 'weighted_f1': 0.9621}
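
For filtering messages programmatically, ParlAI ships a small wrapper around this classifier. A sketch, assuming the OffensiveLanguageClassifier helper in parlai.utils.safety (present in recent releases) defaults to this zoo model:

from parlai.utils.safety import OffensiveLanguageClassifier

clf = OffensiveLanguageClassifier()  # loads the single-turn safety model (assumed default)
pred_not_ok, prob = clf.contains_offensive_language('hello friend!')
print(pred_not_ok, prob)  # True means the message was classified __notok__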

Bert Classifier Multi-Turn Dialogue Safety Model

[related project]

Classifier trained on the multi-turn adversarial safety task in addition to both the single-turn standard and adversarial safety tasks and Wikipedia Toxic Comments.

Example invocation(s):

parlai eval_model -t dialogue_safety:multiturn -dt test -mf zoo:dialogue_safety/multi_turn/model --split-lines True -bs 40

{'exs': 3000, 'accuracy': 0.9317, 'f1': 0.9317, 'bleu': 9.317e-10, 'lr': 5e-09, 'total_train_updates': 0, 'examples': 3000, 'mean_loss': 0.008921, 'class___notok___recall': 0.7067, 'class___notok___prec': 0.6444, 'class___notok___f1': 0.6741, 'class___ok___recall': 0.9567, 'class___ok___prec': 0.9671, 'class___ok___f1': 0.9618, 'weighted_f1': 0.9331}

Integration Tests models

Integration Test Models

Model files used to check backwards compatibility and code coverage of important standard models.

Example invocation(s):

parlai eval_model -mf zoo:unittest/transformer_generator2/model -t integration_tests:multiturn_candidate -m transformer/generator

{'exs': 400, 'accuracy': 1.0, 'f1': 1.0, 'bleu-4': 0.2503, 'lr': 0.001, 'total_train_updates': 5000, 'gpu_mem_percent': 9.37e-05, 'loss': 0.0262, 'token_acc': 1.0, 'nll_loss': 7.935e-05, 'ppl': 1.0}

#Dodeca models

Imageseq2Seq Dodecadialogue All Tasks Mt Model

[related project]

Image Seq2Seq model trained on all DodecaDialogue tasks

Example invocation(s):

parlai interactive -mf zoo:dodecadialogue/all_tasks_mt/model --inference beam --beam-size 3 --beam-min-length 10 --beam-block-ngram 3 --beam-context-block-ngram 3

Enter Your Message: hi how are you?
[ImageSeq2seq]: i ' m doing well . how are you ?
Enter Your Message: not much, what do you like to do?
[ImageSeq2seq]: i like to go to the park and play with my friends .
parlai eval_model -mf zoo:dodecadialogue/all_tasks_mt/model -t "#Dodeca" --prepend-personality True --prepend-gold-knowledge True --image-mode no_image_model

[ Finished evaluating tasks ['#Dodeca'] using datatype valid ]
exs  gpu_mem  loss        lr   ppl  token_acc  total_train_updates  tpb
WizTeacher             3939          2.161           8.678      .5325
all                   91526    .3371 2.807 9.375e-07 18.23      .4352               470274 2237
convai2                7801          2.421           11.26      .4721
cornell_movie         13905          3.088           21.93      .4172
dailydialog            8069           2.47           11.82      .4745
empathetic_dialogues   5738          2.414           11.18      .4505
igc                     486          2.619           13.73      .4718
image_chat:Generation 15000          3.195           24.42      .3724
light_dialog           6623          2.944              19      .3918
twitter               10405           3.61           36.98      .3656
ubuntu                19560          3.148            23.3      .4035

Imageseq2Seq Dodecadialogue Base Model

[related project]

Image Seq2Seq base model, from which all DodecaDialogue models were trained

Example invocation(s):

parlai train_model -t "#Dodeca" --prepend-gold-knowledge true --prepend-personality true -mf /tmp/dodeca_model --init-model zoo:dodecadialogue/base_model/model --dict-file zoo:dodecadialogue/dict/dodeca.dict --model image_seq2seq --dict-tokenizer bpe --dict-lower true -bs 32 -eps 0.5 -esz 512 --ffn-size 2048 --fp16 false --n-heads 16 --n-layers 8 --n-positions 512 --text-truncate 512 --label-truncate 128 --variant xlm -lr 7e-6 --lr-scheduler reduceonplateau --optimizer adamax --dropout 0.1 --validation-every-n-secs 3600 --validation-metric ppl --validation-metric-mode min --validation-patience 10 --activation gelu --embeddings-scale true --learn-positional-embeddings true --betas 0.9,0.999 --warmup-updates 2000 --gradient-clip 0.1

A trained model (logs omitted)

Cornell Movie models

Imageseq2Seq Dodecadialogue Cornell Movie Ft Model

[related project]

Image Seq2Seq model trained on all DodecaDialogue tasks and fine-tuned on the Cornell Movie task

Example invocation(s):

parlai interactive -mf zoo:dodecadialogue/cornell_movie_ft/model --inference beam --beam-size 10 --beam-min-length 20 --beam-block-ngram 3 --beam-context-block-ngram 3

Enter Your Message: hi how's it going?
[ImageSeq2seq]: oh , it ' s great . i ' m having a great time . how are you doing ?
Enter Your Message: i'm doing well, what do you like to do?
[ImageSeq2seq]: i like to go to the movies . what about you ? do you have any hobbies ?
parlai eval_model -mf zoo:dodecadialogue/cornell_movie_ft/model -t cornell_movie

[ Finished evaluating tasks ['cornell_movie'] using datatype valid ]
exs  gpu_mem  loss      lr   ppl  token_acc  total_train_updates   tpb
13905   .07094 2.967 2.5e-06 19.43      .4290                29496 15.76

Dailydialog models

Imageseq2Seq Dodecadialogue Dailydialog Ft Model

[related project]

Image Seq2Seq model trained on all DodecaDialogue tasks and fine-tuned on the DailyDialog task

Example invocation(s):

parlai interactive -mf zoo:dodecadialogue/daily_dialog_ft/model --inference beam --beam-size 5 --beam-min-length 10 --beam-block-ngram 3 --beam-context-block-ngram 3

Enter Your Message: hi how's it going
[ImageSeq2seq]: i ' m doing well . how about you ?
Enter Your Message: not much, what do you like to do?
[ImageSeq2seq]: i like to go to the beach and play volleyball .
parlai eval_model -mf zoo:dodecadialogue/daily_dialog_ft/model -t dailydialog

[ Finished evaluating tasks ['dailydialog'] using datatype valid ]
exs  gpu_mem  loss      lr   ppl  token_acc  total_train_updates   tpb
8069   .06787 2.326 7.5e-06 10.24      .5093               150959 15.67

Tbd models

Imageseq2Seq Dodecadialogue Eli5 Ft Model

[related project]

Image Seq2Seq model trained on all DodecaDialogue tasks and fine-tuned on the ELI5 task

Example invocation(s):

parlai interactive -mf zoo:dodecadialogue/eli5_ft/model --inference beam --beam-size 10 --beam-min-length 200 --beam-block-ngram 3 --beam-context-block-ngram 3

Enter Your Message: Hi, can you tell me about quantum physics?
[ImageSeq2seq]: yes , i can . quantum physics is the study of how particles interact with each other , and how they interact with other particles . it ' s important to note that quantum mechanics is n ' t the same as classical physics . classical physics is a study of the properties of particles , and what they do . in classical physics , there are two types of particles : quarks and neutrinos . quarks are made up of quarks , neutrinos , and electrons . neutrinos are made of protons , neutrons , electrons , and neutrons . they ' re all the same thing , but they all have the same properties . so , if you ' re interested in quantum physics , you might want to check out / r / askscience . there ' s a subreddit for that sort of thing . edit : i ' m not sure what you mean by " quantum physics " , but i ' ll let you know if you want to know more . edit 2 : thanks for the gold !

Imageseq2Seq Dodecadialogue Pushshift.Io Reddit Ft Model

[related project]

Image Seq2Seq model trained on all DodecaDialogue tasks and fine-tuned on the pushshift.io Reddit task

Example invocation(s):

parlai interactive -mf zoo:dodecadialogue/reddit_ft/model --inference beam --beam-size 5 --beam-min-length 20 --beam-block-ngram 3 --beam-context-block-ngram 3

Enter Your Message: hi how's it going?
[ImageSeq2seq]: hi , i ' m doing pretty well . how are you ? : ) and yourself ? : d
Enter Your Message: just hanging in there, you up to anything fun?
[ImageSeq2seq]: not really . i just got home from work . i ' ll be back in a few hours .

Empathetic Dialogues models

Imageseq2Seq Dodecadialogue Empathetic Dialogue Ft Model

[related project]

Image Seq2Seq model trained on all DodecaDialogue tasks and fine-tuned on the Empathetic Dialogue task

Example invocation(s):

parlai interactive -mf zoo:dodecadialogue/empathetic_dialogues_ft/model --inference beam --beam-size 5 --beam-min-length 10 --beam-block-ngram 3 --beam-context-block-ngram 3

Enter Your Message: hi, how's it going?
[ImageSeq2seq]: i ' m doing well . how are you ?
Enter Your Message: i'm fine, feeling a little sad
[ImageSeq2seq]: that ' s too bad . what ' s going on ?
parlai eval_model -mf zoo:dodecadialogue/empathetic_dialogues_ft/model -t empathetic_dialogues

[ Finished evaluating tasks ['empathetic_dialogues'] using datatype valid ]
exs  gpu_mem  loss      lr   ppl  token_acc  total_train_updates  tpb
5738    .3278 2.405 7.5e-06 11.08      .4517                20107 1914

Igc models

Imageseq2Seq Dodecadialogue Image Grounded Conversations Ft Model

[related project]

Image Seq2Seq model trained on all DodecaDialogue tasks and fine-tuned on the Image Grounded Conversations task

Example invocation(s):

parlai eval_model -mf zoo:dodecadialogue/igc_ft/model -t igc:responseOnly

[ Finished evaluating tasks ['igc:responseOnly'] using datatype valid ]
exs  gpu_mem  loss    lr   ppl  token_acc  total_train_updates   tpb
162    .0726 2.832 1e-06 16.98      .4405                10215 9.852

Twitter models

Imageseq2Seq Dodecadialogue Twitter Ft Model

[related project]

Image Seq2Seq model trained on all DodecaDialogue tasks and fine-tuned on the Twitter task

Example invocation(s):

parlai interactive -mf zoo:dodecadialogue/twitter_ft/model --inference beam --beam-size 10 --beam-min-length 20 --beam-block-ngram 3 --beam-context-block-ngram 3

Enter Your Message: hi how's it going?
[ImageSeq2seq]: it ' s going well ! how are you ? @ smiling_face_with_heart - eyes @
Enter Your Message: im doing well, what do you like to do
[ImageSeq2seq]: hi ! i ' m doing well ! i like to read , watch movies , play video games , and listen to music . how about you ?
parlai eval_model -mf zoo:dodecadialogue/twitter_ft/model -t twitter

[ Finished evaluating tasks ['twitter'] using datatype valid ]
exs  gpu_mem  loss      lr   ppl  token_acc  total_train_updates  tpb
10405    .3807 3.396 7.5e-06 29.83      .3883               524029 2395

Ubuntu models

Imageseq2Seq Dodecadialogue Ubuntu V2 Ft Model

[related project]

Image Seq2Seq model trained on all DodecaDialogue tasks and fine-tuned on the Ubuntu V2 task

Example invocation(s):

parlai interactive -mf zoo:dodecadialogue/ubuntu_ft/model --inference beam --beam-size 2 --beam-min-length 10 --beam-block-ngram 3 --beam-context-block-ngram 3

Enter Your Message: hi how's it going?
[ImageSeq2seq]: i ' m fine . . . you ? .
Enter Your Message: doing ok, what do you like to do?
[ImageSeq2seq]: i like to read , write , and read .
parlai eval_model -mf zoo:dodecadialogue/ubuntu_ft/model -t ubuntu

[ Finished evaluating tasks ['ubuntu'] using datatype valid ]
exs  gpu_mem  loss      lr   ppl  token_acc  total_train_updates  tpb
19560    .3833 2.844 2.5e-05 17.18      .4389               188076 3130

Blended Skill Talk models

Blendedskilltalk: Blendedskilltalk Single-Task Model

[related project]

Pretrained polyencoder retrieval model fine-tuned on the BlendedSkillTalk dialogue task.

Example invocation(s):

parlai interactive -mf zoo:blended_skill_talk/bst_single_task/model -t blended_skill_talk

Results vary.
parlai eval_model -mf zoo:blended_skill_talk/bst_single_task/model -t blended_skill_talk -dt test

09:51:57 | Finished evaluating tasks ['blended_skill_talk'] using datatype test
accuracy  bleu-4  exs    f1  gpu_mem  hits@1  hits@10  hits@100  hits@5  loss   mrr  rank   tpb
.7920   .7785 5482 .8124    .0370   .7920    .9788         1   .9542 .8251 .8636 1.866 19.76

Blendedskilltalk: Convai2 Single-Task Model

[related project]

Pretrained polyencoder retrieval model fine-tuned on the ConvAI2 dialogue task.

Example invocation(s):

parlai eval_model -mf zoo:blended_skill_talk/convai2_single_task/model -t blended_skill_talk -dt test

10:23:53 | Finished evaluating tasks ['blended_skill_talk'] using datatype test
accuracy  bleu-4  exs    f1  gpu_mem  hits@1  hits@10  hits@100  hits@5  loss   mrr  rank   tpb
.7678   .7553 5482 .7902   .07928   .7678    .9728         1   .9414 .9337 .8451  2.04 19.76

Blendedskilltalk: Empatheticdialogues Single-Task Model

[related project]

Pretrained polyencoder retrieval model fine-tuned on the EmpatheticDialogues dialogue task.

Example invocation(s):

parlai eval_model -mf zoo:blended_skill_talk/ed_single_task/model -t blended_skill_talk -dt test

10:16:47 | Finished evaluating tasks ['blended_skill_talk'] using datatype test
accuracy  bleu-4  exs    f1  gpu_mem  hits@1  hits@10  hits@100  hits@5  loss   mrr  rank   tpb
.6895   .6774 5482 .7219   .07928   .6895    .9509         1   .9051 1.242 .7849  2.79 19.76

Blendedskilltalk: Wizard Of Wikipedia Single-Task Model

[related project]

Pretrained polyencoder retrieval model fine-tuned on the Wizard of Wikipedia dialogue task.

Example invocation(s):

parlai eval_model -mf zoo:blended_skill_talk/wizard_single_task/model -t blended_skill_talk -dt test

10:34:46 | Finished evaluating tasks ['blended_skill_talk'] using datatype test
accuracy  bleu-4  exs    f1  gpu_mem  hits@1  hits@10  hits@100  hits@5  loss   mrr  rank   tpb
.6742   .6616 5482 .7059   .07928   .6742    .9445         1   .8902 1.321 .7706 2.962 19.76

Blendedskilltalk: Mt Single-Skills Model

[related project]

Pretrained polyencoder retrieval model fine-tuned on the ConvAI2, EmpatheticDialogues, and Wizard of Wikipedia dialogue tasks.

Example invocation(s):

parlai eval_model -mf zoo:blended_skill_talk/multi_task/model -t blended_skill_talk -dt test

10:23:35 | Finished evaluating tasks ['blended_skill_talk'] using datatype test
accuracy  bleu-4  exs    f1  gpu_mem  hits@1  hits@10  hits@100  hits@5  loss   mrr  rank   tpb
.8010   .7872 5482 .8204   .07928   .8010    .9779         1   .9564 .8154 .8697 1.908 19.76

Blendedskilltalk: Mt Single-Skills Model Fine-Tuned On Bst

[related project]

Pretrained polyencoder retrieval model fine-tuned on the ConvAI2, EmpatheticDialogues, and Wizard of Wikipedia dialogue tasks, and then further fine-tuned on the BlendedSkillTalk dialogue task.

Example invocation(s):

parlai eval_model -mf zoo:blended_skill_talk/multi_task_bst_tuned/model -t blended_skill_talk -dt test

10:36:01 | Finished evaluating tasks ['blended_skill_talk'] using datatype test
accuracy  bleu-4  exs    f1  gpu_mem  hits@1  hits@10  hits@100  hits@5  loss   mrr  rank   tpb
.8378   .8230 5482 .8543   .07928   .8378    .9872         1   .9704 .5897 .8963 1.604 19.76

Blender 90M

[related project]

90M parameter generative model finetuned on blended_skill_talk tasks.

Example invocation(s):

python parlai/scripts/safe_interactive.py -mf zoo:blender/blender_90M/model -t blended_skill_talk

Enter Your Message: Hi what's up?
[TransformerGenerator]: hello , how are you ? i just got back from working at a law firm , how about you ?

Blender 2.7B

[related project]

2.7B parameter generative model finetuned on blended_skill_talk tasks.

Example invocation(s):

python parlai/scripts/safe_interactive.py -mf zoo:blender/blender_3B/model -t blended_skill_talk

Enter Your Message: Hi how are you?
[TransformerGenerator]: I'm doing well. How are you doing? What do you like to do in your spare time?

Blender 9.4B

[related project]

9.4B parameter generative model finetuned on blended_skill_talk tasks.

Example invocation(s):

python parlai/scripts/safe_interactive.py -mf zoo:blender/blender_9B/model -t blended_skill_talk

Enter Your Message: Hi!
[TransformerGenerator]: What do you do for a living? I'm a student at Miami University.

Pushshift.Io models

Tutorial Transformer Generator

Small (87M parameter) generative transformer, pretrained on pushshift.io Reddit.

Example invocation(s):

parlai interactive -mf zoo:tutorial_transformer_generator/model

Enter Your Message: hi, how are you today?
[TransformerGenerator]: i ' m doing well , how about you ?
Enter Your Message: I'm giving a tutorial on chatbots!
[TransformerGenerator]: that ' s awesome ! what ' s it about ?
Enter Your Message: bots just like you
[TransformerGenerator]: i ' ll be sure to check it out !
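
The same conversation can be reproduced from Python with the standard agent API. A minimal sketch, assuming create_agent_from_model_file is available in your ParlAI version:

from parlai.core.agents import create_agent_from_model_file

# Small enough to run on CPU; downloads on first use.
agent = create_agent_from_model_file('zoo:tutorial_transformer_generator/model')
for message in ['hi, how are you today?', "I'm giving a tutorial on chatbots!"]:
    agent.observe({'text': message, 'episode_done': False})
    print(agent.act()['text'])
agent.reset()  # clear dialogue history before starting a new conversation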

Reddit 2.7B

[related project]

2.7B parameter generative model pretrained on pushshift.io Reddit; the invocation below fine-tunes it on blended_skill_talk and related tasks.

Example invocation(s):

parlai train_model -t blended_skill_talk,wizard_of_wikipedia,convai2:normalized,empathetic_dialogues --multitask-weights 1,3,3,3 -veps 0.25 --attention-dropout 0.0 --batchsize 128 --model transformer/generator --embedding-size 2560 --ffn-size 10240 --variant prelayernorm --n-heads 32 --n-positions 128 --n-encoder-layers 2 --n-decoder-layers 24 --history-add-global-end-token end --delimiter '  ' --dict-tokenizer bytelevelbpe  --dropout 0.1 --fp16 True --init-model zoo:blender/reddit_3B/model --dict-file zoo:blender/reddit_3B/model.dict --label-truncate 128 --log_every_n_secs 10 -lr 7e-06 --lr-scheduler reduceonplateau --lr-scheduler-patience 3 --optimizer adam --relu-dropout 0.0 --activation gelu --model-parallel true --save-after-valid True --text-truncate 128 --truncate 128 --warmup_updates 100 --fp16-impl mem_efficient --update-freq 2 --gradient-clip 0.1 --skip-generation True -vp 10 -vmt ppl -vmm min --model-file /tmp/test_train_27B

Results vary.

Reddit 9.4B

[related project]

9.4B parameter generative model pretrained on pushshift.io Reddit; the invocation below fine-tunes it on blended_skill_talk and related tasks.

Example invocation(s):

parlai train_model -t blended_skill_talk,wizard_of_wikipedia,convai2:normalized,empathetic_dialogues --multitask-weights 1,3,3,3 -veps 0.25 --attention-dropout 0.0 --batchsize 8 --eval-batchsize 64 --model transformer/generator --embedding-size 4096 --ffn-size 16384 --variant prelayernorm --n-heads 32 --n-positions 128 --n-encoder-layers 4 --n-decoder-layers 32 --history-add-global-end-token end --dict-tokenizer bytelevelbpe --dropout 0.1 --fp16 True --init-model zoo:blender/reddit_9B/model --dict-file zoo:blender/reddit_9B/model.dict --label-truncate 128 -lr 3e-06 -dynb full --lr-scheduler cosine --max-lr-steps 9000 --lr-scheduler-patience 3 --optimizer adam --relu-dropout 0.0 --activation gelu --model-parallel true --save-after-valid False --text-truncate 128 --truncate 128 --warmup_updates 1000 --fp16-impl mem_efficient --update-freq 4 --log-every-n-secs 30 --gradient-clip 0.1 --skip-generation True -vp 10 --max-train-time 84600 -vmt ppl -vmm min --model-file /tmp/test_train_94B

Results vary.

Wikipedia Plus Toronto Books models

Bart

[external website]

BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension.

Example invocation(s):

parlai eval_model -mf zoo:bart/bart_large/model -t convai2 -bs 64

Finished evaluating tasks ['convai2'] using datatype valid
accuracy   bleu-4    exs      f1  gpu_mem    loss    ppl  token_acc   tpb
0        .0004641   7801  .02084    .4878   5.041  154.6      .2042  1652
parlai train_model -m bart -mf /tmp/model_file -t convai2 -bs 24 --fp16 true -eps 1 -lr 1e-5 --optimizer adam

valid:
accuracy  bleu-4  exs    f1  gpu_mem  loss    lr   ppl  token_acc  total_train_updates   tpb
.0001282  .01229 7801 .2035    .6361 2.386 1e-05 10.87      .4741                 5478 321.3

Eli5 models

Unlikelihood Eli5 Context And Label Repetition Model

[related project]

Dialogue model finetuned on ELI5 with context and label repetition unlikelihood

Example invocation(s):

python parlai/scripts/interactive.py -mf zoo:dialogue_unlikelihood/rep_eli5_ctxt_and_label/model -m projects.dialogue_unlikelihood.agents:RepetitionUnlikelihoodAgent

Enter Your Message: Hi.
[RepetitionUnlikelihood]: hi .

Unlikelihood Eli5 Context Repetition Model

[related project]

Dialogue model finetuned on ELI5 with context repetition unlikelihood

Example invocation(s):

python parlai/scripts/interactive.py -mf zoo:dialogue_unlikelihood/rep_eli5_ctxt/model -m projects.dialogue_unlikelihood.agents:RepetitionUnlikelihoodAgent

Enter Your Message: Hi.
[RepetitionUnlikelihood]: hi .

Unlikelihood Eli5 Label Repetition Model

[related project]

Dialogue model finetuned on ELI5 with label repetition unlikelihood

Example invocation(s):

python parlai/scripts/interactive.py -mf zoo:dialogue_unlikelihood/rep_eli5_label/model -m projects.dialogue_unlikelihood.agents:RepetitionUnlikelihoodAgent

Enter Your Message: Hi.
[RepetitionUnlikelihood]: hi .

Pretrained Word Embeddings

Some models support initializing with pretrained word embeddings, downloaded automatically via torchtext.

Example invocation:

parlai train_model -t convai2 -m seq2seq -emb fasttext_cc

Appending '-fixed' to the embedding name (e.g. 'twitter-fixed') means backprop will not go through the embeddings, i.e. they remain unchanged during training.
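
A sketch of the same idea from Python, combining the invocation above with the '-fixed' suffix (the embedding_type keyword mirrors -emb/--embedding-type; the output path is hypothetical):

from parlai.scripts.train_model import TrainModel

# Equivalent to: parlai train_model -t convai2 -m seq2seq -emb fasttext_cc-fixed
TrainModel.main(
    task='convai2',
    model='seq2seq',
    embedding_type='fasttext_cc-fixed',  # '-fixed': embeddings stay frozen
    model_file='/tmp/convai2_seq2seq',   # hypothetical output path
)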

BERT

BERT is included in the model zoo and is downloaded automatically to initialize the BERT-based bi-, poly-, and cross-encoder rankers.

Example invocation:

parlai train_model -t convai2 -m bert_ranker/bi_encoder_ranker