OpenNLP coreference resolution
Path | pimlico.modules.opennlp.coreference |
Executable | yes |
Todo
Document this module
Todo
Replace check_runtime_dependencies() with get_software_dependencies()
Use the local config setting opennlp_memory to limit the Java heap memory available to the OpenNLP processes. When parallelizing, this limit is shared between the processes: each OpenNLP worker gets a memory limit of opennlp_memory / processes. The setting accepts the suffixes g, G, m, M, k and K, as in the corresponding Java setting.
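The division described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Pimlico's actual code: the helper names `parse_memory` and `per_worker_heap` are hypothetical, and the sketch assumes the suffixes denote binary multiples (1k = 1024 bytes), as they do for Java heap options.

```python
import re

# Multipliers for the suffixes named in the text (k/K, m/M, g/G).
# Assumption: binary multiples, as for Java heap size options.
_UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_memory(setting):
    """Convert a string such as '4G' or '512m' to a number of bytes."""
    match = re.fullmatch(r"(\d+)([kKmMgG]?)", setting.strip())
    if match is None:
        raise ValueError("invalid memory setting: %r" % setting)
    number, suffix = match.groups()
    # No suffix means the value is already in bytes.
    return int(number) * _UNITS.get(suffix.lower(), 1)


def per_worker_heap(setting, processes):
    """Return a heap-size string giving each worker an equal share.

    Hypothetical helper: computes opennlp_memory / processes and
    expresses the result in whole kilobytes.
    """
    if processes < 1:
        raise ValueError("processes must be at least 1")
    share = parse_memory(setting) // processes
    return "%dk" % (share // 1024)


if __name__ == "__main__":
    print(per_worker_heap("4G", 4))    # 1048576k, i.e. 1G per worker
    print(per_worker_heap("512m", 2))  # 262144k, i.e. 256M per worker
```

With opennlp_memory set to 4G and four processes, each worker would therefore be limited to 1G of heap.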
Inputs
Name | Type(s) |
---|---|
parses | ConstituencyParseTreeCorpus |
Outputs
Name | Type(s) |
---|---|
coref | CorefCorpus |
Options
Name | Description | Type |
---|---|---|
gzip | If True, each document's output, except annotations, is gzipped. This can reduce the storage occupied by, e.g., parser or coref output. Default: False | bool |
model | Coreference resolution model, full path or directory name. If a filename is given, it is expected to be in the OpenNLP model directory (models/opennlp/). Default: ‘’ (standard English OpenNLP model in models/opennlp/) | string |
readable | If True, pretty-print the JSON output, so it’s human-readable. Default: False | bool |
timeout | Timeout in seconds for each individual coreference resolution task. If it is exceeded, an InvalidDocument is returned for that document. | int |
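
As a rough illustration of how the gzip and readable options could combine, the sketch below serializes a made-up result dictionary to JSON, optionally pretty-printed and optionally gzipped. The function names `encode_output` and `decode_output` and the payload are hypothetical; this is not Pimlico's writer code, and the real coref output structure is not shown here.

```python
import gzip
import json


def encode_output(data, readable=False, gzip_output=False):
    """Serialize data to JSON bytes, mirroring the two boolean options."""
    # readable=True: indent the JSON so a person can inspect it.
    text = json.dumps(data, indent=4 if readable else None)
    raw = text.encode("utf-8")
    # gzip_output=True: compress the bytes to save storage.
    return gzip.compress(raw) if gzip_output else raw


def decode_output(blob, gzip_output=False):
    """Reverse encode_output: decompress if needed, then parse the JSON."""
    raw = gzip.decompress(blob) if gzip_output else blob
    return json.loads(raw.decode("utf-8"))


if __name__ == "__main__":
    # Made-up payload, used only to exercise the functions.
    example = {"doc": "example", "entities": [["she", "Mary"]]}
    blob = encode_output(example, readable=True, gzip_output=True)
    assert decode_output(blob, gzip_output=True) == example
```

Pretty-printing makes each document larger, so the two options pull in opposite directions on storage; gzip typically recovers most of the overhead that indentation adds.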