LDA top words

Path pimlico.modules.gensim.lda_top_words
Executable yes

Extract the top words for each topic from a Gensim LDA model.

Can be used as input to coherence evaluation.

Currently, this outputs only the highest-probability words for each topic, but it could be extended in future to rank words by other measures, such as relevance or lift.
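As a rough illustration of what "highest-probability words" means here, the sketch below ranks the vocabulary by probability within each topic and keeps the first num_words entries. It is not Pimlico's implementation: the function name top_words_per_topic, the toy vocabulary and the probability values are all hypothetical, and the topic-word matrix is supplied by hand rather than read from a trained Gensim model.

```python
from typing import List, Sequence, Tuple


def top_words_per_topic(
    topic_word_probs: Sequence[Sequence[float]],
    vocab: Sequence[str],
    num_words: int = 15,
) -> List[List[Tuple[str, float]]]:
    """Return the num_words highest-probability (word, prob) pairs per topic.

    Hypothetical helper for illustration only; not part of Pimlico or Gensim.
    """
    top_words = []
    for probs in topic_word_probs:
        # Pair each vocabulary item with its probability under this topic,
        # then sort from most to least probable.
        ranked = sorted(zip(vocab, probs), key=lambda wp: wp[1], reverse=True)
        top_words.append(ranked[:num_words])
    return top_words


# Toy data (made-up values): 2 topics over a 4-word vocabulary.
vocab = ["cat", "dog", "stock", "market"]
topic_word_probs = [
    [0.50, 0.40, 0.06, 0.04],
    [0.05, 0.05, 0.30, 0.60],
]

result = top_words_per_topic(topic_word_probs, vocab, num_words=2)
for topic_id, words in enumerate(result):
    print(topic_id, [word for word, _ in words])
```

With a trained Gensim model, LdaModel.show_topic(topic_id, topn=num_words) returns comparable (word, probability) pairs for a single topic, so a loop over topic IDs would play the role of the hand-written matrix above.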

This module does not support Python 2, so it can only be used when Pimlico is run under Python 3.

Inputs

Name   Type(s)
model  lda_model

Outputs

Name       Type(s)
top_words  topics_top_words

Options

Name       Description                                     Type
num_words  Number of words to show per topic. Default: 15  int

Example config

This is an example of how this module can be used in a pipeline config file.

[my_lda_top_words_module]
type=pimlico.modules.gensim.lda_top_words
input_model=module_a.some_output

This example usage includes more options.

[my_lda_top_words_module]
type=pimlico.modules.gensim.lda_top_words
input_model=module_a.some_output
num_words=15

Test pipelines

This module is used by the following test pipelines. They are a further source of examples of the module’s usage.