pimlico.datatypes.coref.corenlp module

Datatypes for coreference resolution output. They are based on Stanford CoreNLP’s coref output and therefore include all the information that output provides.

class pimlico.datatypes.coref.corenlp.CorefCorpus(base_dir, pipeline, raw_data=False)

Bases: pimlico.datatypes.tar.TarredCorpus

process_document(data)

datatype_name = 'corenlp_coref'

class pimlico.datatypes.coref.corenlp.CorefCorpusWriter(base_dir, gzip=False, append=False, trust_length=False, encoding='utf-8')

Bases: pimlico.datatypes.tar.TarredCorpusWriter

document_to_raw_data(data)

class pimlico.datatypes.coref.corenlp.Entity(id, mentions)

Bases: object

class pimlico.datatypes.coref.corenlp.Mention(id, sentence_num, start_index, end_index, text, type, position=None, animacy=None, is_representative_mention=None, number=None, gender=None)

Bases: object

static from_json(json)

to_json_dict()
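
The listing above gives only signatures, so the following is a minimal sketch of how an entity, its mentions, and the JSON round trip might fit together. It does not import Pimlico. `MentionSketch` and `EntitySketch` are hypothetical stand-ins whose fields copy the documented constructor parameters. The JSON key names (taken here to be the parameter names), the type of `position`, and the `representative()` helper are assumptions for illustration, not the library's actual behaviour.

```python
import json
from dataclasses import dataclass, asdict
from typing import List, Optional


@dataclass
class MentionSketch:
    """Hypothetical stand-in mirroring the documented Mention constructor."""
    id: int
    sentence_num: int
    start_index: int
    end_index: int
    text: str
    type: str
    position: Optional[List[int]] = None  # assumed type; not stated in the docs
    animacy: Optional[str] = None
    is_representative_mention: Optional[bool] = None
    number: Optional[str] = None
    gender: Optional[str] = None

    @staticmethod
    def from_json(data):
        # Assumption: the dict keys are exactly the constructor parameter names.
        return MentionSketch(**data)

    def to_json_dict(self):
        # Return a plain dict that json.dumps can serialize directly.
        return asdict(self)


@dataclass
class EntitySketch:
    """Hypothetical stand-in for Entity(id, mentions)."""
    id: int
    mentions: List[MentionSketch]

    def representative(self):
        # Illustrative helper (not part of the documented API): return the
        # first mention flagged as representative, or None if there is none.
        for mention in self.mentions:
            if mention.is_representative_mention:
                return mention
        return None


# Build one entity with two coreferent mentions.
obama = MentionSketch(1, 1, 1, 3, "Barack Obama", "PROPER",
                      is_representative_mention=True)
he = MentionSketch(2, 2, 1, 2, "he", "PRONOMINAL",
                   is_representative_mention=False)
entity = EntitySketch(1, [obama, he])

# Serialize the mentions to a JSON string and rebuild them from it.
serialized = json.dumps([m.to_json_dict() for m in entity.mentions])
restored = [MentionSketch.from_json(d) for d in json.loads(serialized)]
```

In this sketch `restored` compares equal to `entity.mentions`, which is the property a `from_json` / `to_json_dict` pair is usually expected to satisfy. Whether the real classes use the same key names should be checked against the Pimlico source.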