agents.unigram

Baseline model that always emits the N most common non-punctuation unigrams. In practice these are mostly stopwords. This model is a poor conversationalist, but may achieve a reasonable F1 score.

UnigramAgent has one option, --num-words, which controls how many unigrams are output.

This also makes a nice reference for a simple, minimalist agent.
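To illustrate why a fixed string of stopwords can still score a reasonable F1, here is a minimal sketch of a token-overlap F1. The function name unigram_f1 and the example strings are hypothetical, and the whitespace tokenization is a simplification: the F1 metric used in ParlAI may normalize text differently.

```python
from collections import Counter


def unigram_f1(prediction, reference):
    """Token-overlap F1 between two whitespace-tokenized strings.

    Hypothetical helper for illustration only; not ParlAI's metric.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # Multiset intersection: a shared token counts min(pred, ref) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# A fixed stopword reply compared against an arbitrary reference reply.
fixed_reply = "i you the a to"
score = unigram_f1(fixed_reply, "i want to go to the park")
```

In this toy case three of the five predicted tokens overlap with the reference, so the reply earns partial credit although it ignores the conversation entirely.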

class parlai.agents.unigram.unigram.UnigramAgent(opt, shared=None)

Bases: parlai.core.agents.Agent

classmethod add_cmdline_args(parser)

Adds command line arguments.

classmethod dictionary_class()

Returns the DictionaryAgent used for tokenization.

__init__(opt, shared=None)

Construct a UnigramAgent.

Parameters
  • opt – parlai options

  • shared – Used to duplicate the model for batching/hogwild.

share()

Basic sharing function.

observe(obs)

Stub observe method.

is_valid_word(word)

Marks whether a string may be included in the unigram list.

Used to filter punctuation and special tokens.
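A filter of this kind could look like the sketch below. It is written as a standalone function rather than a method, and it assumes that special tokens are marked with a leading double underscore (for example __null__) and that punctuation tokens begin with a non-word character. Both are assumptions made for illustration, not a statement of the actual implementation.

```python
import re


def is_valid_word(word):
    """Return True if `word` looks like an ordinary word token.

    Sketch only: the special-token and punctuation rules are assumptions.
    """
    # Assumed convention: special tokens start with a double underscore.
    if word.startswith("__"):
        return False
    # A bare newline is treated as a non-word token.
    if word == "\n":
        return False
    # Reject tokens whose first character is not a word character.
    return re.match(r"\W", word) is None


valid = [w for w in ["the", "__null__", ".", "you", "\n", "!"] if is_valid_word(w)]
```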

get_prediction()

Core algorithm, which gathers the most common unigrams into a string.
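The idea can be sketched as follows, using a plain dict of token counts in place of the agent's dictionary. The function name top_unigrams, the freqs mapping, and the default str.isalpha filter are hypothetical stand-ins chosen for this example; the real method takes no arguments and reads the word limit from the --num-words option.

```python
from collections import Counter


def top_unigrams(freqs, num_words, is_valid=str.isalpha):
    """Join the `num_words` most frequent valid tokens into one string.

    `freqs` maps token -> count. `is_valid` is a stand-in for the agent's
    word filter; str.isalpha is an assumption made for this sketch.
    """
    chosen = []
    # Walk tokens from most to least frequent, skipping invalid ones.
    for word, _count in Counter(freqs).most_common():
        if len(chosen) >= num_words:
            break
        if is_valid(word):
            chosen.append(word)
    return " ".join(chosen)


# Toy counts: a special token and a punctuation mark rank high but are skipped.
freqs = {"the": 50, ".": 45, "i": 40, "__null__": 1000, "you": 30, "to": 20}
prediction = top_unigrams(freqs, num_words=3)
```

Because the result depends only on the token counts, it can be computed once and reused for every reply.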

act()

Stub act, which always makes the same prediction.

save(path=None)

Stub save which dumps options.

Necessary for evaluation scripts to load the model.

load(path)

Stub load which ignores the model on disk, as UnigramAgent depends on the dictionary, which is saved elsewhere.