File for miscellaneous utility functions and constants.


Return a representable finite number near -inf for a dtype.

parlai.core.utils.maintain_dialog_history(history, observation, reply='', historyLength=1, useReplies='label_else_model', dict=None, useStartEndIndices=True, splitSentences=False)

Keep track of dialog history, up to a truncation length.

Either includes replies from the labels, from the model, or neither, depending on the useReplies parameter.


parlai.core.utils.load_cands(path, lines_have_ids=False, cands_are_replies=False)

Load global fixed set of candidate labels that the teacher provides.

Every example will include these as candidates. The true labels for a specific example are also added to this set, so that it’s possible to get the right answer.
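As a rough sketch (not the actual ParlAI implementation), loading a fixed candidate set from a text file with one candidate per line might look like:

```python
def load_cands(path, lines_have_ids=False, cands_are_replies=False):
    """Load candidate lines from a text file, one candidate per line (sketch)."""
    if path is None:
        return None
    cands = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            if lines_have_ids:
                # Assume "id<TAB>text..." lines; keep the last field when
                # candidates are replies, otherwise the field after the id.
                fields = line.split('\t')
                cands.append(fields[-1] if cands_are_replies else fields[1])
            else:
                cands.append(line)
    return cands
```

The exact handling of ids and replies here is an assumption; the real function's parsing may differ.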

class parlai.core.utils.Opt(*args, **kwargs)

Bases: dict

Class for tracking options.

Functions like a dict, but allows us to track the history of arguments as they are set.

__init__(*args, **kwargs)

Initialize self. See help(type(self)) for accurate signature.


Display the history for an item in the dict.

class parlai.core.utils.Predictor(args=None, **kwargs)

Bases: object

Wrapper to set up running version of model and request predictions.

Note that this maintains no World state (does not use a World), merely providing the observation directly to the model and getting a response.

This is limiting when it comes to certain use cases, but allows for quick model deployment.

__init__(args=None, **kwargs)

Initialize the predictor, setting up opt automatically if needed.

Args is expected to be in the same format as sys.argv: e.g. a list of the form ['--model', 'seq2seq', '-hs', 128, '-lr', 0.5].

kwargs is interpreted by prepending '--' to each key and replacing underscores with hyphens, so 'dict_file=/tmp/dict.tsv' would be interpreted as '--dict-file /tmp/dict.tsv'.
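The kwargs-to-argv conversion described above can be sketched with a hypothetical helper (this is not the Predictor's actual internal code, just an illustration of the rule):

```python
def kwargs_to_argv(**kwargs):
    """Convert keyword args to sys.argv-style flags (hypothetical helper)."""
    argv = []
    for key, value in kwargs.items():
        # Prepend '--' and swap underscores for hyphens.
        argv.append('--' + key.replace('_', '-'))
        argv.append(str(value))
    return argv
```

For example, kwargs_to_argv(dict_file='/tmp/dict.tsv') yields ['--dict-file', '/tmp/dict.tsv'].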


From a ParlAI-standard message dict, get model prediction.

class parlai.core.utils.Timer

Bases: object

Computes elapsed time.


Initialize timer.


Reset timer to zero.


Resume timer.


Pause timer.


Get current timer time.

class parlai.core.utils.TimeLogger

Bases: object

Class for logging time progress against a goal.


Set up timer.


Return time elapsed at last log call.


Return current timer time.

log(done, total, report=None)

Log report, time elapsed, and percentage progress towards goal.

  • done – number of examples completed so far

  • total – total number of elements to be completed. if total > 0, calculates the time remaining and percentage complete.

  • report – dict of pairs to log


Returns a tuple (log string, log dict). The log string contains the time elapsed and a string representation of the log dict; the log dict contains pairs of all items to log, including the percentage complete and the projected time remaining if total > 0.

class parlai.core.utils.AttrDict(*args, **kwargs)

Bases: dict

Helper class to have a dict-like object with dot access.

For example, instead of d = {'key': 'value'} use d = AttrDict(key='value'). To access keys, instead of doing d['key'] use d.key.

While this has some limitations on the possible keys (for example, do not set the key 'items' or you will lose access to the items() method), this can make some code more clear.

__init__(*args, **kwargs)

Initialize AttrDict using input dict.
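A minimal implementation of this pattern (the actual class may differ in details) is:

```python
class AttrDict(dict):
    """Dict subclass whose keys are also available as attributes."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Point the instance __dict__ at the dict itself, so attribute
        # access and item access share the same storage.
        self.__dict__ = self
```

With this, d = AttrDict(key='value') supports both d['key'] and d.key, and attribute assignment also shows up as a dict entry.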

parlai.core.utils.round_sigfigs(x, sigfigs=4)

Round value to specified significant figures.

  • x – input number

  • sigfigs – number of significant figures to return


float number rounded to specified sigfigs
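A common way to implement significant-figure rounding (offered here as a sketch, not necessarily the exact ParlAI code) uses the base-10 magnitude of the value:

```python
import math


def round_sigfigs(x, sigfigs=4):
    """Round x to the given number of significant figures (sketch)."""
    if x == 0:
        return 0
    # The position of the leading digit decides how many decimal
    # places round() should keep.
    magnitude = int(math.floor(math.log10(abs(x))))
    return round(x, -magnitude + (sigfigs - 1))
```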

class parlai.core.utils.NoLock

Bases: object

Empty lock. Does nothing when you enter or exit.


Build a nolock for other classes to use for no-op locking.
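A no-op lock only needs to satisfy the context-manager protocol; a minimal sketch:

```python
class NoLock:
    """No-op lock: enters and exits without doing anything."""

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Returning False propagates any exception, like a real lock would.
        return False
```

Code can then write `with lock:` uniformly, whether lock is a real threading.Lock or a NoLock.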

class parlai.core.utils.PaddingUtils

Bases: object

Helps with padding input and target tensors.


classmethod pad_text(observations, dictionary, end_idx=None, null_idx=0, dq=False, eval_labels=True, truncate=None)

Pad observations to max width.

We check that examples are valid, pad with zeros, and sort by length so that we can use the pack_padded function. The list valid_inds keeps track of which indices are valid and the order in which we sort the examples.

  • dq – whether we should use deque or list

  • eval_labels – whether or not we want to consider eval labels

  • truncate – truncate input and output lengths


classmethod map_predictions(predictions, valid_inds, batch_reply, observations, dictionary, end_idx, report_freq=0.1, labels=None, answers=None, ys=None)

Match predictions to original index in the batch.

Predictions are mapped back to appropriate indices in the batch_reply using valid_inds.

  • report_freq – how often we report predictions


class parlai.core.utils.OffensiveLanguageDetector

Bases: object

Tries to detect offensive language in text.

Detects offensive language using a blacklist of offensive words and phrases.


Get data from external sources and build data representation.


Add a single phrase to the filter.


Add list of custom phrases to the filter.


Determine if text contains any offensive words in the filter.

str_segment(text, dict_agent, k=1, max_length=None)

Segment a word without spaces into the most probable phrase with spaces.

  • text (string) – string to segment

  • dict_agent (DictionaryAgent) – Dictionary we use to look at word frequencies

  • k (int) – top k segmentations of string

  • max_length (int) – max length of a substring (word) in the string. default (None) uses the length of the string.


list of top k segmentations of the given string


Example Usage:

dict_agent = DictionaryAgent using Wiki Toxic Comments data
old = OffensiveLanguageDetector()
split_str = old.str_segment('fucku2', dict_agent)
split_str is 'fuck u 2'

We can then run old.contains_offensive_language(split_str), which yields the offensive word 'fuck'.

parlai.core.utils.clip_text(text, max_len)

Clip text to max length, adding ellipses.
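A straightforward sketch of this behavior (the exact truncation point and ellipsis style are assumptions):

```python
def clip_text(text, max_len):
    """Clip text to at most max_len characters, appending an ellipsis (sketch)."""
    if len(text) > max_len:
        # Reserve three characters for the ellipsis.
        return text[:max_len - 3] + '...'
    return text
```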

parlai.core.utils.display_messages(msgs, prettify=False, ignore_fields='', max_len=1000)

Return a string describing the set of messages provided.

If prettify is true, candidates are displayed using prettytable. ignore_fields provides a list of fields in the msgs which should not be displayed.

parlai.core.utils.str_to_msg(txt, ignore_fields='')

Convert formatted string to ParlAI message dict.

  • txt – formatted string to convert. String format is tab-separated fields, with colon separating field name and contents.

  • ignore_fields – (default ‘’) comma-separated field names to not include in the msg dict even if they’re in the string.

parlai.core.utils.msg_to_str(msg, ignore_fields='')

Convert ParlAI message dict to string.

  • msg – dict to convert into a string.

  • ignore_fields – (default ‘’) comma-separated field names to not include in the string even if they’re in the msg dict.
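Based on the format described above (tab-separated fields, with a colon between field name and contents), a simplified round trip might look like the following; this omits the escaping and type handling the real functions perform:

```python
def msg_to_str(msg, ignore_fields=''):
    """Serialize a message dict to a tab-separated string (sketch)."""
    ignored = set(ignore_fields.split(','))
    return '\t'.join(
        '{}:{}'.format(k, v) for k, v in msg.items() if k not in ignored
    )


def str_to_msg(txt, ignore_fields=''):
    """Parse a tab-separated string back into a message dict (sketch)."""
    ignored = set(ignore_fields.split(','))
    msg = {}
    for field in txt.split('\t'):
        # Split on the first colon only, so contents may contain colons.
        name, _, contents = field.partition(':')
        if name not in ignored:
            msg[name] = contents
    return msg
```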

parlai.core.utils.set_namedtuple_defaults(namedtuple, default=None)

Set all of the fields for a given namedtuple to a single default value.

Additionally removes the default docstring for each field. Modifies the tuple in place, but returns it anyway.

  • namedtuple – A constructed collections.namedtuple

  • default – The default value to set.


the modified namedtuple
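One way to do this (a sketch of the idea, not necessarily the exact implementation) is to set the defaults on the class's __new__:

```python
import collections


def set_namedtuple_defaults(namedtuple, default=None):
    """Give every field of a namedtuple class the same default value (sketch)."""
    # __defaults__ supplies defaults for the trailing parameters of __new__;
    # covering every field makes all of them optional.
    namedtuple.__new__.__defaults__ = (default,) * len(namedtuple._fields)
    return namedtuple
```

After this, the namedtuple can be constructed with no arguments, and every field takes the default.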

parlai.core.utils.padded_tensor(items, pad_idx=0, use_cuda=False, left_padded=False, max_len=None, fp16friendly=False)

Create a right-padded matrix from an uneven list of lists.

Returns (padded, lengths), where padded is the padded matrix, and lengths is a list containing the lengths of each row.

Matrix is right-padded (filled to the right) by default, but can be left padded if the flag is set to True.

Matrix can also be placed on cuda automatically.

  • items (list[iter[int]]) – List of items

  • sort (bool) – If True, orders by the length

  • pad_idx (int) – the value to use for padding

  • use_cuda (bool) – if true, places padded on GPU

  • left_padded (bool) – if True, pads to the left rather than the right

  • max_len (int) – if None, the max length is the maximum item length

  • fp16friendly (bool) – if True, pads the time dimension to be a multiple of 8.


(padded, lengths) tuple

Return type

(Tensor[int64], list[int])
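The core padding logic can be sketched in pure Python (the real function returns a torch tensor and additionally handles the cuda, max_len, and fp16 options); padded_matrix is a hypothetical name for this illustration:

```python
def padded_matrix(items, pad_idx=0, left_padded=False, max_len=None):
    """Pad an uneven list of int lists into a rectangular list of lists."""
    lengths = [len(item) for item in items]
    width = max(lengths) if max_len is None else max_len
    padded = []
    for item, length in zip(items, lengths):
        pad = [pad_idx] * (width - length)
        # Fill to the right by default; to the left if requested.
        padded.append(pad + list(item) if left_padded else list(item) + pad)
    return padded, lengths
```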

parlai.core.utils.padded_3d(tensors, pad_idx=0, use_cuda=0, dtype=torch.int64, fp16friendly=False)

Make 3D padded tensor for list of lists of 1D tensors or lists.

  • tensors – list of lists of 1D tensors (or lists)

  • pad_idx – padding to fill tensor with

  • use_cuda – whether to call cuda() before returning

  • fp16friendly (bool) – if True, pads the final dimension to be a multiple of 8.


3D tensor with the maximum dimensions of the inputs

parlai.core.utils.argsort(keys, *lists, descending=False)

Reorder each list in lists by the (descending) sorted order of keys.

  • keys (iter) – Keys to order by.

  • lists (list[list]) – Lists to be reordered by keys' order. Correctly handles lists and 1-D tensors.

  • descending (bool) – Use descending order if true.


The reordered items.
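For plain lists, the behavior can be sketched as follows (the real function also preserves tensor types for 1-D tensor inputs):

```python
def argsort(keys, *lists, descending=False):
    """Reorder each list in lists by the sorted order of keys (sketch)."""
    # Indices of keys in sorted order.
    ind_sorted = sorted(range(len(keys)), key=lambda i: keys[i],
                        reverse=descending)
    return [[lst[i] for i in ind_sorted] for lst in lists]
```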

parlai.core.utils.warn_once(msg, warningtype=None)

Raise a warning, but only once.

  • msg (str) – Message to display

  • warningtype (Warning) – Type of warning, e.g. DeprecationWarning
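One simple way to achieve warn-once semantics (a sketch under the assumption that deduplication is keyed on the message text) is a module-level set of seen messages:

```python
import warnings

_seen_warnings = set()


def warn_once(msg, warningtype=None):
    """Issue a warning the first time msg is seen, and never again (sketch)."""
    if msg not in _seen_warnings:
        _seen_warnings.add(msg)
        warnings.warn(msg, warningtype, stacklevel=2)
```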

parlai.core.utils.fp16_optimizer_wrapper(optimizer, verbose=False, dynamic_loss_scale=True, loss_initial_scale=131072.0)

Wrap an optimizer with FP16 loss scaling protection.

Requires apex to be installed. Will throw an ImportError if it is not.

  • optimizer – Any torch optimizer

  • verbose (bool) – Enables verbose output in the FP16 optimizer. Turning this on can help debug when FP16 is underperforming.

  • dynamic_loss_scale (bool) – FP16 requires loss scaling to avoid underflows. It is recommended this stays on, but advanced users may want it off.

  • loss_initial_scale (float) – Initial loss scaling. Default chosen empirically, but models with very low or high loss values may need this adjusted. Stick with powers of 2.


An APEX FP16 optimizer. Please note this has different requirements on how backward() and step() are called.