Baseline model that always emits the N most common non-punctuation unigrams. Typically these are mostly stopwords. This model is a poor conversationalist, but may achieve a reasonable F1 score.
UnigramAgent has one option, --num-words, which controls how many unigrams are output.
This also makes a nice reference for a simple, minimalist agent.
Adds command line arguments.
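As a rough illustration, registering the single option could look like the sketch below. It uses plain argparse rather than ParlAI's own parser class, and the default of 10 and the help text are assumptions, not values confirmed by this page.

```python
import argparse


def add_cmdline_args(parser: argparse.ArgumentParser) -> argparse.ArgumentParser:
    """Register the agent's only option on the given parser."""
    parser.add_argument(
        '--num-words',
        type=int,
        default=10,  # assumed default, for illustration only
        help='Number of unigrams to output.',
    )
    return parser


# argparse stores '--num-words' under the key 'num_words'.
parser = add_cmdline_args(argparse.ArgumentParser())
opt = vars(parser.parse_args(['--num-words', '5']))
```

Here opt is an ordinary dict, standing in for the options object that the constructor receives.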
Returns the DictionaryAgent used for tokenization.
Construct a UnigramAgent.
opt – ParlAI options
shared – Used to duplicate the model for batching/hogwild.
Basic sharing function.
Stub observe method.
Marks whether a string may be included in the unigram list.
Used to filter punctuation and special tokens.
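A filter of this kind might be sketched as follows. The function name, the set of special tokens, and the exact punctuation rule are all hypothetical here; the real agent may apply a different test.

```python
import string

# Hypothetical special tokens; the actual dictionary may define others.
SPECIAL_TOKENS = {'__null__', '__start__', '__end__', '__unk__'}


def is_valid_word(word: str) -> bool:
    """Return True if the token may appear in the unigram list."""
    if word in SPECIAL_TOKENS:
        return False
    # Reject empty strings and tokens made up entirely of punctuation.
    return not all(ch in string.punctuation for ch in word)
```

Under this rule a contraction such as "don't" is kept, because it is not made up entirely of punctuation.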
Core algorithm, which gathers the most common unigrams into a string.
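A minimal sketch of this step is shown below, assuming the dictionary exposes token frequencies as a plain mapping from token to count. The helper names and the filtering rule are illustrative assumptions, not the agent's actual code.

```python
import string
from collections import Counter
from typing import Mapping


def _keep(word: str) -> bool:
    # Hypothetical filter: drop '__special__' tokens and pure punctuation.
    if word.startswith('__'):
        return False
    return not all(ch in string.punctuation for ch in word)


def most_common_unigrams(freqs: Mapping[str, int], num_words: int) -> str:
    """Join the num_words most frequent surviving tokens into one string."""
    kept = Counter({w: c for w, c in freqs.items() if _keep(w)})
    return ' '.join(word for word, _ in kept.most_common(num_words))


# Toy frequency table, invented for illustration.
freqs = {'the': 50, '.': 40, 'a': 30, '__null__': 100, 'cat': 5}
reply = most_common_unigrams(freqs, 2)
```

Because the result depends only on the dictionary, it can be computed once and reused for every reply.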
Stub act, which always makes the same prediction.
Stub save which dumps options.
Necessary for evaluation scripts to load the model.
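One way such a stub could work is sketched here: the options dict is written as JSON next to the model path, so a later script has something to read back. The ".opt" suffix, the function names, and the JSON format are assumptions made for this illustration.

```python
import json
import os
import tempfile


def save_opt(opt: dict, path: str) -> str:
    """Dump the options as JSON to '<path>.opt' and return that filename."""
    opt_path = path + '.opt'  # assumed naming convention
    with open(opt_path, 'w') as handle:
        json.dump(opt, handle, indent=2, sort_keys=True)
    return opt_path


def load_opt(path: str) -> dict:
    """Read the options back from '<path>.opt'."""
    with open(path + '.opt') as handle:
        return json.load(handle)


# Round trip through a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    model_path = os.path.join(tmp, 'unigram_model')
    written = save_opt({'num_words': 10}, model_path)
    restored = load_opt(model_path)
```

No model weights are written in this sketch, since the agent's behaviour is determined by the options and the dictionary.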
Stub load which ignores the model on disk, as UnigramAgent depends on the dictionary, which is saved elsewhere.