pimlico.core.modules.map.filter module

class pimlico.core.modules.map.filter.DocumentMapOutputTypeWrapper(*args, **kwargs)

Bases: object

archive_iter(subsample=None, start_after=None)

Provides an iterator just like that of TarredCorpus, but instead of iterating over data read from disk, it produces each document on the fly from the input datatype.
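The idea of producing documents on the fly rather than reading them from disk can be sketched in plain Python. This is a minimal illustration only, not Pimlico's implementation: the class `OnTheFlyCorpus`, its `process` callable, and the `(archive, doc_name, data)` tuple layout are all hypothetical names chosen for the example. The sketch also assumes that `start_after` is an `(archive, doc_name)` pair meaning "skip everything up to and including this document", and it leaves `subsample` unimplemented.

```python
from typing import Callable, Iterable, Iterator, Optional, Tuple

# Hypothetical document record: (archive name, document name, document data)
Doc = Tuple[str, str, str]


class OnTheFlyCorpus:
    """Hypothetical stand-in for a filter output; not a Pimlico class."""

    def __init__(self, input_docs: Iterable[Doc], process: Callable[[str], str]):
        self.input_docs = input_docs  # the upstream (input) documents
        self.process = process        # per-document processing function

    def archive_iter(
        self, start_after: Optional[Tuple[str, str]] = None
    ) -> Iterator[Doc]:
        # Assumed semantics: skip documents up to and including start_after
        skipping = start_after is not None
        for archive, doc_name, data in self.input_docs:
            if skipping:
                if (archive, doc_name) == start_after:
                    skipping = False
                continue
            # Nothing is stored: each result is computed only when requested
            yield archive, doc_name, self.process(data)


docs = [("arch0", "doc1", "x"), ("arch0", "doc2", "y"), ("arch1", "doc3", "z")]
corpus = OnTheFlyCorpus(docs, str.upper)
for archive, doc_name, data in corpus.archive_iter(start_after=("arch0", "doc1")):
    print(archive, doc_name, data)
```

Because `archive_iter` is a generator, a consumer that stops early never triggers processing of the remaining documents, which is the main attraction of a filter over a module that writes its full output to disk.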

data_ready()

The data is ready to be supplied as soon as all of the wrapped module’s inputs are ready to produce their data.
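The readiness rule described here amounts to a conjunction over the inputs. The following sketch shows that logic in isolation; `FakeInput` and `FilterOutputSketch` are hypothetical classes invented for the example, and the real class may gather its inputs quite differently.

```python
class FakeInput:
    """Hypothetical input datatype that just reports a fixed readiness flag."""

    def __init__(self, ready: bool):
        self._ready = ready

    def data_ready(self) -> bool:
        return self._ready


class FilterOutputSketch:
    """Hypothetical filter output: it has no stored data of its own."""

    def __init__(self, inputs):
        self.inputs = list(inputs)

    def data_ready(self) -> bool:
        # Ready exactly when every input can already produce its data
        return all(inp.data_ready() for inp in self.inputs)


print(FilterOutputSketch([FakeInput(True), FakeInput(True)]).data_ready())
print(FilterOutputSketch([FakeInput(True), FakeInput(False)]).data_ready())
```

Since nothing is written to disk, there is no separate "has this module been run" check in the sketch: readiness depends only on the inputs.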

non_filter_datatype = None
output_name = None
wrapped_module_info = None

pimlico.core.modules.map.filter.wrap_module_info_as_filter(module_info_instance)