Files in this item
Total download size: 1.74 MB
This item is publicly available and licensed under: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
- Name: coref149_corefud_train.conllu
  Size: 1.19 MB
  Format: Unknown
  Description: Labeled coref149 training set in CoNLL-U format.
  MD5: 52637114f35442028eb178647c71a872
- Name: coref149_corefud_test_unlabeled.conllu
  Size: 563.83 KB
  Format: Unknown
  Description: Unlabeled coref149 test set in CoNLL-U format.
  MD5: f72542c02f9149250a19010d0e835428
- Name: README.txt
  Size: 1.91 KB
  Format: Text file
  Description: Description of the resource.
  MD5: bf4de0b48e5d082dac5e4cb71121b209
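
The MD5 values listed above can be used to check that a download is intact. The following is a minimal sketch, assuming the three files have been saved under their listed names in a single directory; the function names `md5_of_file` and `verify_download` are illustrative and not part of the resource.

```python
import hashlib
import os

# Checksums copied from the file listing of this item.
EXPECTED_MD5 = {
    "coref149_corefud_train.conllu": "52637114f35442028eb178647c71a872",
    "coref149_corefud_test_unlabeled.conllu": "f72542c02f9149250a19010d0e835428",
    "README.txt": "bf4de0b48e5d082dac5e4cb71121b209",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(directory="."):
    """Map each listed file name to True/False (checksum match) or None (file missing)."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = os.path.join(directory, name)
        if not os.path.exists(path):
            results[name] = None
        else:
            results[name] = md5_of_file(path) == expected
    return results
```

Calling `verify_download("path/to/download")` returns a dictionary keyed by file name, where `path/to/download` is a placeholder for wherever the files were saved.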
CorefUD conversion of Slovene coreference resolution corpus coref149 v1.0
http://hdl.handle.net/11356/1989
CC BY-NC-SA 4.0

This corpus is the CorefUD conversion of the coref149 corpus for coreference resolution in Slovene (http://hdl.handle.net/11356/1182). It contains 149 documents annotated with coreference information: 100 training and 49 test documents. The test documents were selected according to the underlying cluster distribution: most documents contain a small to medium number of clusters, while a few contain a large number of clusters.

Coreference in Universal Dependencies (CorefUD) is an initiative to collect coreference corpora in various languages and harmonize them to the same scheme and data format (CoNLL-U).

The coreference information is stored in the MISC column. More concretely, the start and end of each coreference mention are marked with the "Entity=" attribute. For example, "Entity=(e0" marks the start of the entity e0 at the current token while "Entity=e0)" marks . . .
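
The "Entity=" convention described above can be read with a short script. The following is a minimal sketch, not the official CorefUD tooling. It assumes that MISC is the tenth tab-separated column of a CoNLL-U token line and that attributes in MISC are separated by "|". It also assumes two forms that are not stated in the description above: that a single-token mention is written as "(e0)", and that several brackets may be concatenated in one value. The sample sentence and the names `entity_brackets`, `mention_spans`, and `token_line` are made up for illustration and are not taken from the corpus.

```python
import re

# One bracket token: "(e0)" is a single-token mention (assumed form),
# "(e0" opens a mention, "e0)" closes it.
BRACKET = re.compile(r"\(([^()]+)\)|\(([^()]+)|([^()]+)\)")


def entity_brackets(misc):
    """Return (action, entity_id) pairs found in the Entity attribute of a MISC field."""
    for attribute in misc.split("|"):
        if attribute.startswith("Entity="):
            events = []
            for single, opened, closed in BRACKET.findall(attribute[len("Entity="):]):
                if single:
                    events.append(("single", single))
                elif opened:
                    events.append(("open", opened))
                else:
                    events.append(("close", closed))
            return events
    return []


def mention_spans(conllu_text):
    """Collect (entity_id, start, end) spans, using 0-based token positions in the text."""
    open_starts = {}  # entity id -> stack of start positions of still-open mentions
    spans = []
    position = 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        columns = line.split("\t")
        if len(columns) != 10 or not columns[0].isdigit():
            continue  # skip multiword-token ranges, empty nodes, malformed lines
        for action, entity_id in entity_brackets(columns[9]):
            if action == "single":
                spans.append((entity_id, position, position))
            elif action == "open":
                open_starts.setdefault(entity_id, []).append(position)
            elif open_starts.get(entity_id):
                spans.append((entity_id, open_starts[entity_id].pop(), position))
        position += 1
    return spans


def token_line(index, form, misc="_"):
    """Build a 10-column CoNLL-U token line with placeholder values (illustration only)."""
    return "\t".join([str(index), form] + ["_"] * 7 + [misc])


# Made-up example sentence, not taken from coref149.
SAMPLE = "\n".join([
    "# text = Ana vidi svojega psa",
    token_line(1, "Ana", "Entity=(e0)"),
    token_line(2, "vidi"),
    token_line(3, "svojega", "Entity=(e1(e0)"),
    token_line(4, "psa", "Entity=e1)"),
])

print(mention_spans(SAMPLE))
```

Token positions here run over the whole input rather than restarting per sentence, which keeps mentions that cross sentence boundaries representable; a per-sentence index would need the sentence number stored alongside it.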