pimlico.datatypes.core module

Basic core datatypes that are commonly used for simple data structures, file types, etc.

class SingleTextDocument(base_dir, pipeline, module=None, additional_name=None, use_main_metadata=False, **kwargs)

Bases: pimlico.datatypes.files.NamedFileCollection

datatype_name = 'single_doc'
filenames = ['data.txt']
read_data()
class SingleTextDocumentWriter(base_dir, **kwargs)

Bases: pimlico.datatypes.base.PimlicoDatatypeWriter

class Dict(base_dir, pipeline, module=None, additional_name=None, use_main_metadata=False, **kwargs)

Bases: pimlico.datatypes.files.NamedFileCollection

Simply stores a Python dict, pickled to disk.

datatype_name = 'dict'
filenames = ['data']
data
class DictWriter(base_dir, **kwargs)

Bases: pimlico.datatypes.base.PimlicoDatatypeWriter
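
The idea of a dict pickled to a single file can be sketched with the standard library alone. This is a minimal illustration, not Pimlico's implementation: the helpers write_dict and read_dict are hypothetical names, and the only detail taken from the documentation above is the filename 'data'.

```python
import os
import pickle
import tempfile

# Taken from the `filenames` attribute listed for Dict above.
DATA_FILENAME = "data"


def write_dict(base_dir, data):
    """Hypothetical helper: pickle `data` to <base_dir>/data."""
    os.makedirs(base_dir, exist_ok=True)
    with open(os.path.join(base_dir, DATA_FILENAME), "wb") as f:
        pickle.dump(data, f)


def read_dict(base_dir):
    """Hypothetical helper: load the pickled dict from <base_dir>/data."""
    with open(os.path.join(base_dir, DATA_FILENAME), "rb") as f:
        return pickle.load(f)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        write_dict(tmp, {"the": 0, "cat": 1})
        print(read_dict(tmp))
```

Pickle keeps arbitrary picklable keys and values intact, at the cost of a file that is not human-readable and should only be loaded from trusted sources.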

class StringList(base_dir, pipeline, module=None, additional_name=None, use_main_metadata=False, **kwargs)

Bases: pimlico.datatypes.base.PimlicoDatatype

Simply stores a Python list of strings, written out to disk in a human-readable form. This is not the most efficient format, but it is fine as long as the list is not huge (e.g. for storing vocabularies).

datatype_name = 'string_list'
data_ready()

Check whether the data corresponding to this datatype instance exists and is ready to be read.

Default implementation just checks whether the data dir exists. Subclasses might want to add their own checks, or even override this, if the data dir isn’t needed.
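
The pattern described here, a default directory check that subclasses extend, might look roughly like the sketch below. The classes BaseData and ListData, the "data" subdirectory and the "list.txt" filename are all assumptions made for illustration; they are not Pimlico's actual classes or on-disk layout.

```python
import os


class BaseData:
    """Hypothetical stand-in for a datatype (not Pimlico's base class)."""

    def __init__(self, base_dir):
        self.base_dir = base_dir
        # Assumed layout: data lives in a "data" subdirectory of base_dir.
        self.data_dir = os.path.join(base_dir, "data")

    def data_ready(self):
        # Default check: the data directory exists.
        return os.path.isdir(self.data_dir)


class ListData(BaseData):
    """Hypothetical subclass that adds its own readiness check."""

    @property
    def path(self):
        # Assumed filename, chosen only for this example.
        return os.path.join(self.data_dir, "list.txt")

    def data_ready(self):
        # Keep the default check and also require the file itself.
        return super().data_ready() and os.path.isfile(self.path)
```

With this arrangement, a directory that exists but does not yet contain the expected file is reported as not ready by the subclass, while the base class alone would report it as ready.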

path
data
class StringListWriter(base_dir, **kwargs)

Bases: pimlico.datatypes.base.PimlicoDatatypeWriter
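
One readable on-disk form for a list of strings is one string per line in a UTF-8 text file. The sketch below assumes that format purely for illustration; the documentation above does not specify the exact format StringList uses, and write_string_list and read_string_list are hypothetical helper names.

```python
import os
import tempfile


def write_string_list(path, strings):
    """Hypothetical helper: write one string per line (assumed format)."""
    # newline="\n" disables platform newline translation on write.
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        for s in strings:
            if "\n" in s:
                # A line-based format cannot represent embedded newlines.
                raise ValueError("strings must not contain newlines")
            f.write(s + "\n")


def read_string_list(path):
    """Hypothetical helper: read the list back, one string per line."""
    # newline="\n" splits on "\n" only and leaves other characters as-is.
    with open(path, "r", encoding="utf-8", newline="\n") as f:
        return [line.rstrip("\n") for line in f]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        vocab_path = os.path.join(tmp, "vocab.txt")
        write_string_list(vocab_path, ["the", "cat", "sat"])
        print(read_string_list(vocab_path))
```

A line-based file like this can be inspected or edited with any text editor, which suits small lists such as vocabularies, but it is slower to load and larger than a binary format for very long lists.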