LDA top words

Path	pimlico.modules.gensim.lda_top_words
Executable	yes

Extract the top words for each topic from a Gensim LDA model.

Can be used as input to coherence evaluation.

Currently, this module outputs only the highest-probability words for each topic. It could be extended in future to rank words by other measures, such as relevance or lift.
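As a rough illustration of the difference between ranking by probability and ranking by another measure such as lift, here is a minimal pure-Python sketch. It is not the module's implementation: the function names, the toy vocabulary, the probabilities and the equal topic weights are all made up for the example, and lift is taken to be p(word | topic) / p(word).

```python
def top_words_by_probability(topic_word_probs, vocab, num_words=15):
    """For each topic, return the num_words words with the highest p(word | topic)."""
    result = []
    for probs in topic_word_probs:
        # Rank word indices by descending probability within this topic
        ranked = sorted(range(len(vocab)), key=lambda i: probs[i], reverse=True)
        result.append([vocab[i] for i in ranked[:num_words]])
    return result


def top_words_by_lift(topic_word_probs, vocab, word_probs, num_words=15):
    """For each topic, return the num_words words with the highest lift,
    assumed here to be p(word | topic) / p(word)."""
    result = []
    for probs in topic_word_probs:
        lift = [p / word_probs[i] for i, p in enumerate(probs)]
        ranked = sorted(range(len(vocab)), key=lambda i: lift[i], reverse=True)
        result.append([vocab[i] for i in ranked[:num_words]])
    return result


# Hypothetical toy data: 2 topics over a 5-word vocabulary
vocab = ["cat", "dog", "bank", "loan", "tree"]
topic_word_probs = [
    [0.40, 0.30, 0.05, 0.05, 0.20],  # topic 0
    [0.30, 0.05, 0.35, 0.29, 0.01],  # topic 1
]
# Assumed equal topic weights, used to get the overall word probabilities
topic_weights = [0.5, 0.5]
word_probs = [
    sum(w * probs[i] for w, probs in zip(topic_weights, topic_word_probs))
    for i in range(len(vocab))
]

top_prob = top_words_by_probability(topic_word_probs, vocab, num_words=2)
top_lift = top_words_by_lift(topic_word_probs, vocab, word_probs, num_words=2)
print(top_prob)
print(top_lift)
```

In this toy data, "cat" is probable under both topics, so it ranks highly by probability in each of them but drops out of the lift ranking, which favours words that are specific to a topic.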

This module does not support Python 2, so it can only be used when Pimlico is run under Python 3.

Inputs

Name	Type(s)
model	`lda_model`

Outputs

Name	Type(s)
top_words	`topics_top_words`

Options

Name	Description	Type
num_words	Number of words to show per topic. Default: 15	int

Example config

This is an example of how this module can be used in a pipeline config file.

[my_lda_top_words_module]
type=pimlico.modules.gensim.lda_top_words
input_model=module_a.some_output

This example also sets the module's options explicitly.

[my_lda_top_words_module]
type=pimlico.modules.gensim.lda_top_words
input_model=module_a.some_output
num_words=15

Test pipelines

This module is used by the following test pipelines. They are a further source of examples of the module’s usage.

lda_top_words