Word2vec embedding trainer
Path | pimlico.modules.embeddings.word2vec |
Executable | yes |
Word2vec embedding learning algorithm, using Gensim’s implementation.
Find out more about word2vec.
This module is simply a wrapper that calls Gensim’s Python (+C) implementation of word2vec on a Pimlico corpus.
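As a rough illustration of what such a wrapper might do, the sketch below maps this module's option names (listed under Options) onto Gensim's `Word2Vec` keyword arguments. The helper names `gensim_kwargs` and `train` are hypothetical, and the mapping itself is an assumption rather than the module's actual code. Gensim 4 renamed `size` to `vector_size` and `iter` to `epochs`, so the sketch handles both spellings.

```python
# Defaults as documented in this module's Options table.
DEFAULTS = {"iters": 5, "min_count": 5, "negative_samples": 5, "size": 200}


def gensim_kwargs(options=None, gensim_major=4):
    """Translate module options into Word2Vec keyword arguments (hypothetical helper)."""
    opts = dict(DEFAULTS)
    opts.update(options or {})
    # Gensim >= 4 uses vector_size/epochs; earlier releases use size/iter.
    size_key = "vector_size" if gensim_major >= 4 else "size"
    iter_key = "epochs" if gensim_major >= 4 else "iter"
    return {
        size_key: opts["size"],
        iter_key: opts["iters"],
        "min_count": opts["min_count"],
        "negative": opts["negative_samples"],
    }


def train(sentences, options=None):
    """Train a model on an iterable of token lists (requires Gensim to be installed)."""
    import gensim
    from gensim.models import Word2Vec

    major = int(gensim.__version__.split(".")[0])
    return Word2Vec(sentences=sentences, **gensim_kwargs(options, major))
```

Here `sentences` is assumed to be any re-iterable sequence of token lists, for example `train([["a", "b"], ["a", "c"]], {"min_count": 1, "size": 50})`.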
Options
Name | Description | Type |
iters | number of iterations over the data to perform. Default: 5 | int |
min_count | word2vec’s min_count option: prunes the dictionary of words that appear fewer than this number of times in the corpus. Default: 5 | int |
negative_samples | number of negative samples to include per positive. Default: 5 | int |
size | number of dimensions in learned vectors. Default: 200 | int |
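To make the effect of min_count concrete, here is a minimal, self-contained sketch of frequency-based vocabulary pruning. It is a simplified illustration of the idea and not Gensim's actual implementation. The function name `prune_vocab` and the toy corpus are made up for this example.

```python
from collections import Counter


def prune_vocab(sentences, min_count=5):
    """Return {word: count} for words occurring at least min_count times."""
    counts = Counter(token for sentence in sentences for token in sentence)
    return {word: n for word, n in counts.items() if n >= min_count}


# Toy tokenized corpus: "dog" and "ran" each occur only once.
corpus = [
    ["the", "cat", "sat"],
    ["the", "dog", "sat"],
    ["the", "cat", "ran"],
]

vocab = prune_vocab(corpus, min_count=2)
print(sorted(vocab))
```

With `min_count=2`, only "the", "cat" and "sat" remain in the vocabulary. Under this scheme, words that fall below the threshold get no learned vector.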