core

Some basic core datatypes that are commonly used for passing simple data, like strings and dicts, through pipelines.

class Dict(*args, **kwargs)

Bases: pimlico.datatypes.base.PimlicoDatatype

Stores a Python dict, pickled to disk. All content in the dict should be picklable.

datatype_name = 'dict'

datatype_supports_python2 = True

class Reader(datatype, setup, pipeline, module=None)

Bases: pimlico.datatypes.base.Reader

Reader class for Dict

class Setup(datatype, data_paths)

Bases: pimlico.datatypes.base.Setup

Setup class for Dict.Reader

get_required_paths()

May be overridden by subclasses to provide a list of paths (absolute, or relative to the data dir) that must exist for the data to be considered ready.
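As a rough illustration of the readiness check described above, the following sketch resolves each required path against a data directory and reports whether all of them exist. The function `paths_ready` and its arguments are hypothetical names chosen for this example, not part of Pimlico's API, and the check that `Setup` actually performs may differ.

```python
import os
import tempfile


def paths_ready(data_dir, required_paths):
    """Return True if every required path exists.

    Hypothetical helper, not Pimlico code: relative paths are resolved
    against data_dir, absolute paths are used as given.
    """
    for path in required_paths:
        # os.path.join returns the second argument unchanged if it is absolute
        full_path = os.path.join(data_dir, path)
        if not os.path.exists(full_path):
            return False
    return True


with tempfile.TemporaryDirectory() as data_dir:
    # Nothing has been written yet, so the data is not ready
    before = paths_ready(data_dir, ["data"])
    # Create the required file, then check again
    with open(os.path.join(data_dir, "data"), "w") as f:
        f.write("")
    after = paths_ready(data_dir, ["data"])
```

Here `before` is `False` and `after` is `True`: the data only counts as ready once every listed path is present.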

reader_type

alias of Dict.Reader

get_dict()

class Writer(datatype, base_dir, pipeline, module=None, **kwargs)

Bases: pimlico.datatypes.base.Writer

Writer class for Dict

required_tasks = ['dict']

write_dict(d)

metadata_defaults = {}

writer_param_defaults = {}
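To make the storage idea behind `Dict` concrete, here is a minimal sketch of a pickle round trip for a dict. It only mirrors what `write_dict()` and `get_dict()` are described as doing. The module-level functions, their `data_dir` argument and the file name `data` are assumptions made for this example, not Pimlico's actual implementation.

```python
import os
import pickle
import tempfile


def write_dict(data_dir, d, filename="data"):
    # Hypothetical stand-in for Dict.Writer.write_dict: pickle the dict to a file
    with open(os.path.join(data_dir, filename), "wb") as f:
        pickle.dump(d, f)


def get_dict(data_dir, filename="data"):
    # Hypothetical stand-in for Dict.Reader.get_dict: load the pickled dict
    with open(os.path.join(data_dir, filename), "rb") as f:
        return pickle.load(f)


with tempfile.TemporaryDirectory() as data_dir:
    original = {"alpha": 1, "beta": [2, 3], "gamma": {"nested": True}}
    write_dict(data_dir, original)
    restored = get_dict(data_dir)
```

The restored dict compares equal to the original but is a separate object. Values that cannot be pickled (for example open file handles or lambdas) would make `pickle.dump` raise an error, which is why all content in the dict needs to be picklable.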
class StringList(*args, **kwargs)

Bases: pimlico.datatypes.base.PimlicoDatatype

Stores a Python list of strings, written to disk in a human-readable form. This is not the most efficient format, but it is adequate for lists that are not very large (e.g. vocabularies).

datatype_name = 'string_list'

datatype_supports_python2 = True

class Reader(datatype, setup, pipeline, module=None)

Bases: pimlico.datatypes.base.Reader

Reader class for StringList

class Setup(datatype, data_paths)

Bases: pimlico.datatypes.base.Setup

Setup class for StringList.Reader

get_required_paths()

May be overridden by subclasses to provide a list of paths (absolute, or relative to the data dir) that must exist for the data to be considered ready.

reader_type

alias of StringList.Reader

get_list()

class Writer(datatype, base_dir, pipeline, module=None, **kwargs)

Bases: pimlico.datatypes.base.Writer

Writer class for StringList

required_tasks = ['list']

write_list(l)

metadata_defaults = {}

writer_param_defaults = {}
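The readable on-disk form used by `StringList` could look something like the following sketch, which writes one string per line as UTF-8 text and reads the lines back. The one-string-per-line layout, the file name `data` and the standalone functions are assumptions made for this example. They are not taken from Pimlico's source.

```python
import io
import os
import tempfile


def write_list(data_dir, strings, filename="data"):
    # Hypothetical stand-in for StringList.Writer.write_list:
    # one string per line, UTF-8 encoded
    with io.open(os.path.join(data_dir, filename), "w", encoding="utf-8") as f:
        f.write(u"\n".join(strings))


def get_list(data_dir, filename="data"):
    # Hypothetical stand-in for StringList.Reader.get_list
    with io.open(os.path.join(data_dir, filename), "r", encoding="utf-8") as f:
        return f.read().splitlines()


with tempfile.TemporaryDirectory() as data_dir:
    vocab = [u"the", u"cat", u"na\u00efve"]
    write_list(data_dir, vocab)
    loaded = get_list(data_dir)
```

A line-based format like this keeps the file easy to inspect by hand, at the cost that strings containing line breaks would not round-trip. That trade-off suits small lists such as vocabularies.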