pimlico.datatypes.word2vec module

class Word2VecModel(base_dir, pipeline, **kwargs)

Bases: pimlico.datatypes.base.PimlicoDatatype

Datatype for storing Gensim-trained word2vec embeddings.

See also

Datatype pimlico.datatypes.embeddings.Embeddings
Another, more generic way to store the same data, which should generally be used in preference to this one. Embeddings does not depend on Gensim, but can easily be converted to Gensim’s data structure.
shell_commands = [<pimlico.datatypes.word2vec.NearestNeighboursCommand object>, <pimlico.datatypes.word2vec.VectorCommand object>, <pimlico.datatypes.word2vec.SimilarityCommand object>]
data_ready()
load_model()
model
get_software_dependencies()
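
The entries in shell_commands above are named for nearest-neighbour, vector and similarity queries. As a rough illustration of what such queries typically compute over word vectors, here is a library-free Python sketch. It is not Pimlico's or Gensim's implementation: the function names and the toy vectors are hypothetical, and cosine similarity is assumed as the measure because it is the usual choice for word2vec embeddings.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (hypothetical helper)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # A zero vector has no direction; treat it as dissimilar to everything
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_neighbours(vectors, word, n=3):
    """Return the n words most similar to `word`, as (word, similarity) pairs."""
    query = vectors[word]
    scored = [
        (other, cosine_similarity(query, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    # Highest similarity first
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


# Toy 3-dimensional vectors, invented purely for illustration
toy_vectors = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.85, 0.82, 0.15],
    "apple": [0.1, 0.2, 0.9],
}
```

With these toy vectors, nearest_neighbours(toy_vectors, "king", n=1) ranks "queen" above "apple", since their vectors point in nearly the same direction.
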
class Word2VecModelWriter(base_dir, verb_only=False, **kwargs)

Bases: pimlico.datatypes.base.PimlicoDatatypeWriter

Note

It is generally preferable to use pimlico.datatypes.embeddings.Embeddings, which is more generic and therefore easier to connect to general vector/embedding-handling modules.

write_word2vec_model(model)
write_keyed_vectors(vectors)