FastText embedding reader (vec)

Path pimlico.modules.input.embeddings.fasttext_vec
Executable yes

Reads in embeddings from the FastText format, storing them in the format used internally in Pimlico for embeddings.

Can be used, for example, to read the pre-trained embeddings offered by Facebook AI.

Currently only reads the text format (.vec), not the binary format (.bin).

See also

pimlico.modules.input.embeddings.fasttext_gensim:
An alternative reader that uses Gensim’s FastText format reading code and permits reading from the binary format, which contains more information.

Inputs

No inputs

Outputs

Name Type(s)
embeddings embeddings

Options

Name Description Type
limit Limit to the first N words. Since the files are typically ordered from most to least frequent, this limits to the N most common words int
path (required) Path to the FastText embedding file string

Example config

This is an example of how this module can be used in a pipeline config file.

[my_fasttext_vec_embedding_reader_module]
type=pimlico.modules.input.embeddings.fasttext_vec
path=value

This example usage includes more options.

[my_fasttext_vec_embedding_reader_module]
type=pimlico.modules.input.embeddings.fasttext_vec
limit=0
path=value