Utilities for downloading and building data. These can be replaced if your particular file system does not support them.

parlai.core.build_data.built(path, version_string=None)

Checks whether the ‘.built’ flag has been set for the task at the given path.

If a version_string is provided, this has to match, or the version is regarded as not built.
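The flag-file convention behind this check can be sketched as follows. This is only an illustration, not the library's code: `write_flag` and `is_built` are hypothetical stand-ins, and the file layout (timestamp on the first line, optional version string on the second) is an assumption based on the descriptions in this section.

```python
import datetime
import os

FLAG_NAME = ".built"  # flag file name taken from the description above


def write_flag(path, version_string=None):
    """Hypothetical stand-in for marking a path as built.

    Assumed layout: timestamp on line 1, optional version string on line 2.
    """
    with open(os.path.join(path, FLAG_NAME), "w") as flag:
        flag.write(str(datetime.datetime.today()))
        if version_string:
            flag.write("\n" + version_string)


def is_built(path, version_string=None):
    """Hypothetical stand-in for the check: flag exists and version matches."""
    flag_path = os.path.join(path, FLAG_NAME)
    if not os.path.isfile(flag_path):
        return False
    if version_string is None:
        # No version requested, so the presence of the flag is enough.
        return True
    with open(flag_path) as flag:
        lines = flag.read().split("\n")
    # A flag written without a version never matches a requested version.
    return len(lines) > 1 and lines[1] == version_string
```

With this layout, bumping the version string in a build script is enough to make an existing data directory count as not built, which triggers a rebuild.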

parlai.core.build_data.mark_done(path, version_string=None)

Marks the path as done by adding a ‘.built’ file containing the current timestamp, plus a version description string if one is specified.

parlai.core.build_data.download(url, path, fname, redownload=False)

Downloads a file using requests. If redownload is False (the default), the file is not downloaded again when it is already present.
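The skip-if-present behaviour can be sketched without any network code. In this minimal sketch, `download_if_needed` and its `fetch` argument are hypothetical names introduced for illustration; `fetch` stands in for the actual HTTP transfer, which the library performs with requests.

```python
import os


def download_if_needed(fetch, url, path, fname, redownload=False):
    """Call fetch(url, outfile) unless the file exists and redownload is False.

    `fetch` is a hypothetical callable standing in for the real transfer.
    Returns True if a transfer happened, False if it was skipped.
    """
    outfile = os.path.join(path, fname)
    if os.path.isfile(outfile) and not redownload:
        # The file is already on disk and no refresh was requested.
        return False
    fetch(url, outfile)
    return True
```

Passing the transfer in as a callable keeps the decision logic easy to exercise in isolation, for example with a fake `fetch` that simply writes a small file.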


parlai.core.build_data.make_dir(path)

Makes the directory and any nonexistent parent directories.

parlai.core.build_data.move(path1, path2)

Renames the given file.


parlai.core.build_data.remove_dir(path)

Removes the given directory, if it exists.
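If these directory and file helpers need to be replaced for a different file system, standard-library equivalents are a reasonable starting point. The function names below (`ensure_dir`, `rename`, `delete_dir`) are hypothetical, and the choice of `os.makedirs`, `shutil.move`, and `shutil.rmtree` is an assumption about suitable substitutes rather than a statement of how the library implements them.

```python
import os
import shutil


def ensure_dir(path):
    """Create the directory and any missing parents; no error if it exists."""
    os.makedirs(path, exist_ok=True)


def rename(src, dst):
    """Move or rename a file."""
    shutil.move(src, dst)


def delete_dir(path):
    """Remove a directory tree; silently do nothing if it is absent."""
    shutil.rmtree(path, ignore_errors=True)
```

All three are safe to call repeatedly, which suits build scripts that may be re-run after a partial failure.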

parlai.core.build_data.untar(path, fname, deleteTar=True)

Unpacks the given archive file to the same directory, then (by default) deletes the archive file.
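The unpack-then-delete behaviour can be approximated with the standard library. This is a sketch under assumptions: `unpack` is a hypothetical name, and using `shutil.unpack_archive` (which infers the format from the file extension) is one possible approach, not necessarily what the library does.

```python
import os
import shutil


def unpack(path, fname, delete_archive=True):
    """Extract path/fname into path, then optionally delete the archive."""
    fullpath = os.path.join(path, fname)
    # The archive format is inferred from the extension (.tar.gz, .zip, ...).
    shutil.unpack_archive(fullpath, path)
    if delete_archive:
        os.remove(fullpath)
```

Deleting the archive by default saves disk space, but it also means a later rebuild has to download the file again.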

parlai.core.build_data.download_from_google_drive(gd_id, destination)

Uses the requests package to download a file from Google Drive.

parlai.core.build_data.download_models(opt, fnames, model_folder, version='v1.0', path='aws', use_model_type=False)

Downloads models into the ParlAI model zoo from a URL.

  • fnames – list of filenames to download

  • model_folder – models will be downloaded into models/model_folder/model_type

  • path – URL for downloading models; defaults to downloading from AWS

  • use_model_type – whether models are categorized by type in AWS

parlai.core.build_data.modelzoo_path(datapath, path)

If path starts with ‘models’, then we remap it to the model zoo path within the data directory (default is ParlAI/data/models). We download models from the model zoo if they are not here yet.
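The remapping step alone can be sketched as below. `remap_zoo_path` is a hypothetical name, the `models:` prefix (with a colon) is an assumption about the exact marker, and the download step described above is deliberately left out.

```python
import os


def remap_zoo_path(datapath, path, prefix="models:"):
    """Map a zoo-style path onto <datapath>/models/...; pass others through.

    The "models:" prefix is an assumption made for this sketch.
    """
    if path is None or not path.startswith(prefix):
        # Ordinary paths are returned unchanged.
        return path
    return os.path.join(datapath, "models", path[len(prefix):])
```

Under these assumptions, a path that carries the prefix resolves to a location under the data directory, while any other path is used exactly as given.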