Malt dependency parser¶
Path | pimlico.modules.malt |
Executable | yes |
Runs the Malt dependency parser.
Malt is a Java tool, so we use a Py4J wrapper.
Input is supplied as word annotations (which are converted to CoNLL format for input to the parser). These must include at least each word (field ‘word’) and its POS tag (field ‘pos’). If a ‘lemma’ field is supplied, that will also be used.
The fields in the output contain all of the word features provided by the
parser’s output. Some may be None
if they are empty in the parser output.
All the fields in the input (which always include word
and pos
at least)
are also output.
Inputs¶
Name | Type(s) |
---|---|
documents | grouped_corpus <WordAnnotationsDocumentType > |
Outputs¶
Name | Type(s) |
---|---|
parsed | AddAnnotationField |
Options¶
Name | Description | Type |
---|---|---|
model | Filename of parsing model, or path to the file. If just a filename, assumed to be Malt models dir (models/malt). Default: engmalt.linear-1.7.mco, which can be acquired by ‘make malt’ in the models dir | string |
Example config¶
This is an example of how this module can be used in a pipeline config file.
[my_malt_module]
type=pimlico.modules.malt
input_documents=module_a.some_output
This example usage includes more options.
[my_malt_module]
type=pimlico.modules.malt
input_documents=module_a.some_output
model=engmalt.linear-1.7.mco
Test pipelines¶
This module is used by the following test pipelines. They are a further source of examples of the module’s usage.