keras

Datatypes for storing and loading Keras models.

class KerasModel(*args, **kwargs)[source]

Bases: pimlico.datatypes.base.PimlicoDatatype

Datatype for both types of Keras models, stored using Keras’ own storage mechanisms. The model architecture is stored as JSON, using Keras’ own serialization, and the weights are stored in HDF5 format.

datatype_name = 'keras_model'
custom_objects = {}
datatype_supports_python2 = True
get_software_dependencies()[source]

Get a list of all software required to read this datatype. This is separate to metadata config checks, so that you don’t need to satisfy the dependencies for all modules in order to be able to run one of them. You might, for example, want to run different modules on different machines. This is called when a module is about to be executed and each of the dependencies is checked.

Returns a list of instances of subclasses of SoftwareDependency (pimlico.core.dependencies.base.SoftwareDependency), representing the libraries that this module depends on.

Take care when providing dependency classes: don’t put import statements at the top of the Python module if they would make loading the dependency type itself depend on runtime dependencies. Instead, run import checks by putting the import statements within this method.

You should call the super method for checking superclass dependencies.

Note that there may be different software dependencies for writing a datatype using its Writer. These should be specified using get_writer_software_dependencies().
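The deferred-import advice above can be sketched as follows. This is a minimal illustration, not Pimlico's implementation: ImportDependency and BaseDatatype are hypothetical stand-ins for a SoftwareDependency subclass and the datatype base class, and the module name surely_missing_pkg_xyz is made up to show a failing check.

```python
import importlib


class ImportDependency:
    """Hypothetical stand-in for a SoftwareDependency subclass (not Pimlico's class)."""

    def __init__(self, name, module_name):
        self.name = name
        self.module_name = module_name

    def available(self):
        # The import is attempted only when the check runs, never at module load time
        try:
            importlib.import_module(self.module_name)
        except ImportError:
            return False
        return True


class BaseDatatype:
    """Hypothetical stand-in for the datatype base class."""

    def get_software_dependencies(self):
        return []


class MyModelDatatype(BaseDatatype):
    def get_software_dependencies(self):
        # Call the super method first so that superclass dependencies are kept
        return super().get_software_dependencies() + [
            ImportDependency("json (stdlib, always present)", "json"),
            ImportDependency("a library that is not installed", "surely_missing_pkg_xyz"),
        ]


deps = MyModelDatatype().get_software_dependencies()
status = {dep.module_name: dep.available() for dep in deps}
```

Because nothing is imported until available() is called, the module defining the datatype can be loaded on a machine where the library is missing, and the failure shows up only when the dependency is checked.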

class Reader(datatype, setup, pipeline, module=None)[source]

Bases: pimlico.datatypes.base.Reader

Reader class for KerasModel

get_custom_objects()[source]
load_model()[source]
class Setup(datatype, data_paths)

Bases: pimlico.datatypes.base.Setup

Setup class for KerasModel.Reader

data_ready(path)

Check whether the data at the given path is ready to be read using this type of reader. It may be called several times with different possible base dirs to check whether data is available at any of them.

Often you will override this for particular datatypes to provide special checks. You may (but don’t have to) check the setup’s parent implementation of data_ready() by calling super(MyDatatype.Reader.Setup, self).data_ready(path).

The base implementation just checks whether the data dir exists. Subclasses will typically want to add their own checks.
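As a rough sketch of the override pattern described above, the following uses a hypothetical BaseSetup stand-in that only mimics the documented base behaviour (checking that the data dir exists). The filenames in required_files and the helper first_ready() are assumptions for illustration and are not taken from Pimlico.

```python
import os
import tempfile


class BaseSetup:
    """Hypothetical stand-in for the base Setup; mimics only the documented behaviour."""

    def data_ready(self, path):
        # Base check: the data dir (assumed to be called "data") exists in the base dir
        return os.path.isdir(os.path.join(path, "data"))


class ModelSetup(BaseSetup):
    # Assumed filenames, for illustration only
    required_files = ["architecture.json", "weights.hdf5"]

    def data_ready(self, path):
        # Run the parent check first, then the datatype-specific one
        if not super().data_ready(path):
            return False
        data_dir = os.path.join(path, "data")
        return all(os.path.isfile(os.path.join(data_dir, f)) for f in self.required_files)


def first_ready(setup, base_dirs):
    """Return the first base dir whose data is ready, or None if none is."""
    for base_dir in base_dirs:
        if setup.data_ready(base_dir):
            return base_dir
    return None


with tempfile.TemporaryDirectory() as root:
    empty_dir = os.path.join(root, "empty")
    full_dir = os.path.join(root, "full")
    os.makedirs(os.path.join(empty_dir, "data"))
    os.makedirs(os.path.join(full_dir, "data"))
    for filename in ModelSetup.required_files:
        open(os.path.join(full_dir, "data", filename), "w").close()
    # The first base dir has a data dir but no files, so the second one is chosen
    chosen_is_full = first_ready(ModelSetup(), [empty_dir, full_dir]) == full_dir
```

The loop in first_ready() mirrors the way data_ready() may be called several times with different candidate base dirs.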

get_base_dir()
Returns: the first of the possible base dir paths at which the data is ready to read. Raises an exception if none is ready. Typically used to get the path from the reader, once we’ve already confirmed that at least one is available.
get_data_dir()
Returns: the path to the data dir within the base dir (typically a dir called “data”)
get_reader(pipeline, module=None)

Instantiate a reader using this setup.

Parameters:
  • pipeline – currently loaded pipeline
  • module – (optional) module name of the module by which the datatype has been loaded. Used for producing intelligible error output
get_required_paths()

May be overridden by subclasses to provide a list of paths (absolute, or relative to the data dir) that must exist for the data to be considered ready.

read_metadata(base_dir)

Read in metadata for a dataset stored at the given path. Used by readers and rarely needed outside them. It may sometimes be necessary to call this from data_ready() to check that required metadata is available.

reader_type

alias of KerasModel.Reader

ready_to_read()

Check whether we’re ready to instantiate a reader using this setup. Always called before a reader is instantiated.

Subclasses may override this, but most of the time you won’t need to. See data_ready() instead.

Returns: True if the reader’s ready to be instantiated, False otherwise
class Writer(datatype, base_dir, pipeline, module=None, **kwargs)[source]

Bases: pimlico.datatypes.base.Writer

Writer class for KerasModel

required_tasks = ['architecture', 'weights']
weights_filename
write_model(model)[source]
write_architecture(model)[source]
write_weights(model)[source]
metadata_defaults = {}
writer_param_defaults = {}
class KerasModelBuilderClass(*args, **kwargs)[source]

Bases: pimlico.datatypes.base.PimlicoDatatype

An alternative way to store Keras models.

Create a class whose init method builds the model architecture. It should take a kwarg called build_params, which is a JSON-encodable dictionary of parameters (hyperparameters) that determine how the model gets built. When you initialize your model for training, create this hyperparameter dictionary and use it to instantiate the model class.

Use the KerasModelBuilderClassWriter to store the model during training. Create a writer, then start model training, storing the weights to the filename given by the weights_filename attribute of the writer. The hyperparameter dictionary will also be stored.

The writer also stores the fully-qualified path of the model-builder class. When we read the datatype and want to rebuild the model, we import the class, instantiate it and then set its weights to those we’ve stored.

The model builder class must have the model stored in an attribute model.
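The workflow described above can be sketched without Keras as follows. Everything here is an illustrative assumption: ToyModel stands in for a Keras model and mimics only get_weights() and set_weights(), and ToyModelBuilder, import_class() and rebuild() are hypothetical names, not part of Pimlico. The sketch shows the three requirements: a build_params kwarg, a model attribute, and rebuilding from stored hyperparameters plus stored weights.

```python
import importlib
import json


class ToyModel:
    """Stand-in for a Keras model; mimics only get_weights() and set_weights()."""

    def __init__(self, hidden_units):
        self.hidden_units = hidden_units
        self._weights = [0.0] * hidden_units

    def get_weights(self):
        return list(self._weights)

    def set_weights(self, weights):
        if len(weights) != self.hidden_units:
            raise ValueError("weights do not match the architecture")
        self._weights = list(weights)


class ToyModelBuilder:
    """Model-builder class: __init__ builds the architecture from build_params."""

    def __init__(self, build_params=None):
        self.build_params = build_params or {}
        # The built model must be available in the attribute `model`
        self.model = ToyModel(self.build_params.get("hidden_units", 4))


def import_class(path):
    """Resolve a fully-qualified class path such as 'package.module.ClassName'."""
    module_name, _, class_name = path.rpartition(".")
    return getattr(importlib.import_module(module_name), class_name)


def rebuild(builder_cls, stored_params_json, stored_weights):
    """Instantiate the builder from stored hyperparameters, then restore the weights."""
    builder = builder_cls(build_params=json.loads(stored_params_json))
    builder.model.set_weights(stored_weights)
    return builder


# "Training": build from hyperparameters, then pretend the weights were learned
trained = ToyModelBuilder(build_params={"hidden_units": 3})
trained.model.set_weights([0.1, 0.2, 0.3])

# What gets stored: JSON-encoded hyperparameters and the weights
params_json = json.dumps(trained.build_params)
weights = trained.model.get_weights()

# Reading: the class would normally be obtained via import_class(stored_path)
restored = rebuild(ToyModelBuilder, params_json, weights)
```

Storing only hyperparameters and weights means the architecture is always rebuilt by the builder class's own code, which is why build_params must be JSON-encodable and why the class must be importable from its stored path at read time.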

datatype_name = 'keras_model_builder_class'
datatype_supports_python2 = True
class Reader(datatype, setup, pipeline, module=None)[source]

Bases: pimlico.datatypes.base.Reader

Reader class for KerasModelBuilderClass

weights_filename
load_build_params()[source]
create_builder_class(override_params=None)[source]
load_model(override_params=None)[source]

Instantiate the model builder class with the stored parameters and set the weights on the model to those stored.

Returns: model builder instance (Keras model in attribute model)
class Setup(datatype, data_paths)

Bases: pimlico.datatypes.base.Setup

Setup class for KerasModelBuilderClass.Reader

data_ready(path)

Check whether the data at the given path is ready to be read using this type of reader. It may be called several times with different possible base dirs to check whether data is available at any of them.

Often you will override this for particular datatypes to provide special checks. You may (but don’t have to) check the setup’s parent implementation of data_ready() by calling super(MyDatatype.Reader.Setup, self).data_ready(path).

The base implementation just checks whether the data dir exists. Subclasses will typically want to add their own checks.

get_base_dir()
Returns: the first of the possible base dir paths at which the data is ready to read. Raises an exception if none is ready. Typically used to get the path from the reader, once we’ve already confirmed that at least one is available.
get_data_dir()
Returns: the path to the data dir within the base dir (typically a dir called “data”)
get_reader(pipeline, module=None)

Instantiate a reader using this setup.

Parameters:
  • pipeline – currently loaded pipeline
  • module – (optional) module name of the module by which the datatype has been loaded. Used for producing intelligible error output
get_required_paths()

May be overridden by subclasses to provide a list of paths (absolute, or relative to the data dir) that must exist for the data to be considered ready.

read_metadata(base_dir)

Read in metadata for a dataset stored at the given path. Used by readers and rarely needed outside them. It may sometimes be necessary to call this from data_ready() to check that required metadata is available.

reader_type

alias of KerasModelBuilderClass.Reader

ready_to_read()

Check whether we’re ready to instantiate a reader using this setup. Always called before a reader is instantiated.

Subclasses may override this, but most of the time you won’t need to. See data_ready() instead.

Returns: True if the reader’s ready to be instantiated, False otherwise
class Writer(*args, **kwargs)[source]

Bases: pimlico.datatypes.base.Writer

Writer class for KerasModelBuilderClass

required_tasks = ['architecture', 'weights']
write_weights(model)[source]
metadata_defaults = {}
writer_param_defaults = {}