OpenNLP coreference resolution

Path	pimlico.modules.opennlp.coreference
Executable	yes

Todo

Document this module

Todo

Replace check_runtime_dependencies() with get_software_dependencies()

Use the local config setting opennlp_memory to limit the Java heap memory available to the OpenNLP processes. When the module is parallelized, this limit is shared between the processes: each OpenNLP worker gets a memory limit of opennlp_memory / processes. The value may use the suffixes g, G, m, M, k and K, as in Java's own heap-size settings.
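The division of the memory limit between workers can be sketched as follows. This is an illustrative sketch only, not Pimlico's actual implementation: the function names `parse_java_memory` and `per_worker_heap` are hypothetical, and it assumes the suffixes denote binary multiples (1k = 1024 bytes), as they do for Java's `-Xmx` option.

```python
import re

# Byte multipliers for the size suffixes (assumed binary, as for Java's -Xmx)
_UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_java_memory(value):
    """Convert a Java-style size string such as '4G' or '512m' to bytes.

    Hypothetical helper: not part of Pimlico's API.
    """
    match = re.fullmatch(r"(\d+)([kKmMgG]?)", value.strip())
    if match is None:
        raise ValueError("invalid memory setting: %r" % value)
    number, suffix = match.groups()
    # A value with no suffix is taken to be a plain byte count
    return int(number) * _UNITS.get(suffix.lower(), 1)


def per_worker_heap(opennlp_memory, processes):
    """Return each worker's equal share of the limit, expressed in kilobytes.

    Hypothetical helper: not part of Pimlico's API.
    """
    if processes < 1:
        raise ValueError("processes must be at least 1")
    total_bytes = parse_java_memory(opennlp_memory)
    share_kb = total_bytes // processes // 1024
    return "%dk" % share_kb
```

For example, under these assumptions `per_worker_heap("4G", 4)` gives each of four workers a quarter of the 4 GB limit, expressed in kilobytes so that the result stays an integer.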

Inputs

Name	Type(s)
parses	`ConstituencyParseTreeCorpus`

Outputs

Name	Type(s)
coref	`CorefCorpus`

Options

Name	Description	Type
gzip	If True, each output, except annotations, for each document is gzipped. This can help reduce the storage occupied by e.g. parser or coref output. Default: False	bool
model	Coreference resolution model, full path or directory name. If a filename is given, it is expected to be in the OpenNLP model directory (models/opennlp/). Default: ‘’ (standard English opennlp model in models/opennlp/)	string
readable	If True, pretty-print the JSON output, so it’s human-readable. Default: False	bool
timeout	Timeout in seconds for each individual coreference resolution task. If it is exceeded, an InvalidDocument is returned for that document.	int
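The behaviour of the timeout option can be illustrated with a minimal sketch. It is not the module's real code, which communicates with a Java OpenNLP process. The `InvalidDocument` class here is a hypothetical stand-in for Pimlico's own type, and `resolve_with_timeout` and `slow_resolver` are names invented for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError


class InvalidDocument:
    """Hypothetical stand-in for Pimlico's invalid-document marker."""

    def __init__(self, module_name, error_info):
        self.module_name = module_name
        self.error_info = error_info


def resolve_with_timeout(resolve, parse_trees, timeout):
    """Run resolve(parse_trees); return an InvalidDocument if it takes too long."""
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(resolve, parse_trees)
    try:
        return future.result(timeout=timeout)
    except TimeoutError:
        return InvalidDocument(
            "coref", "coreference resolution exceeded %s seconds" % timeout
        )
    finally:
        # Do not block on the worker; a Python thread cannot be killed, so a
        # real implementation would need to terminate the external process
        executor.shutdown(wait=False)


def slow_resolver(parse_trees):
    """Hypothetical resolver that simulates a task running past its timeout."""
    time.sleep(0.3)
    return []
```

In this sketch a document whose resolution finishes in time yields its normal result, while one that overruns is replaced by an `InvalidDocument`, so the rest of the corpus can still be processed.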