Corpus concatenation

Path pimlico.modules.corpora.concat
Executable no

Concatenate two or more corpora to produce a bigger corpus.

They must have the same data point type, or one must be a subtype of the other.

In theory, we could find the most specific common ancestor and use that as the output type, but this is not currently implemented and may not be worth the trouble. Perhaps we will add this in future.
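The compatibility rule above can be sketched in plain Python. This is only an illustration, not Pimlico's implementation: the class names and the `concat_output_type` helper are hypothetical, and returning the more general of the two types is an assumption that the description above does not confirm.

```python
# Hypothetical data point types, used only for illustration.
class RawDocument:
    """Stand-in for a general data point type."""


class TokenizedDocument(RawDocument):
    """Stand-in for a subtype of RawDocument."""


class Embedding:
    """Stand-in for a type unrelated to the two above."""


def concat_output_type(type_a, type_b):
    """Return the more general of two compatible data point types.

    Two types are compatible if they are the same, or if one is a
    subtype of the other. Choosing the more general type as the result
    is an assumption made for this sketch.
    """
    if issubclass(type_a, type_b):
        return type_b
    if issubclass(type_b, type_a):
        return type_a
    # No common-ancestor search: unrelated types are simply rejected.
    raise TypeError(
        "cannot concatenate corpora of unrelated types %s and %s"
        % (type_a.__name__, type_b.__name__)
    )
```

Under these assumptions, `concat_output_type(TokenizedDocument, RawDocument)` gives `RawDocument`, while pairing either type with `Embedding` raises `TypeError`, since the common-ancestor fallback is not implemented.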

This is a filter module. It is not executable, so it won’t appear in a pipeline’s list of modules that can be run. Instead, it produces its output on the fly whenever the next module needs it.
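The on-the-fly behaviour of a filter can be illustrated with a small, self-contained Python sketch. The `LazyConcatenation` class and the `logged_corpus` helper are hypothetical and do not use Pimlico's API. They only show the general idea: no input corpus is read until a consumer asks for the next document.

```python
from itertools import chain


class LazyConcatenation:
    """Hypothetical wrapper that yields documents from several corpora in order."""

    def __init__(self, corpora):
        # Keep references to the inputs; no documents are read here.
        self.corpora = list(corpora)

    def __iter__(self):
        # chain.from_iterable only starts on the next corpus once the
        # previous one has been exhausted.
        return chain.from_iterable(self.corpora)


def logged_corpus(name, docs, log):
    """Hypothetical corpus: a generator that records when it is first read."""
    log.append(name)
    for doc in docs:
        yield doc


log = []
combined = LazyConcatenation([
    logged_corpus("first", ["a", "b"], log),
    logged_corpus("second", ["c"], log),
])
reader = iter(combined)
nothing_read_yet = list(log)       # no corpus has been touched so far
first_doc = next(reader)           # reading begins with the first corpus
log_after_first = list(log)
remaining_docs = list(reader)      # the second corpus is opened only now
```

In this sketch the second corpus is not opened until the first is exhausted, mirroring how a filter module hands documents to the next module only on demand rather than in a separate run.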

Inputs

Name Type(s)
corpora list of IterableCorpus

Outputs

Name Type(s)
corpus corpus with data point type taken from the input