Files in this item
Download all files in item (40.95 MB)
This item is Publicly Available and licensed under: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)





- Name: ssj500k.conllu.zip
- Size: 10 MB
- Format: application/zip
- Description: Corpus in CoNLL-U format: the complete corpus with UD morphology and, separately, the part annotated with UD syntax, which is also split into train/dev/test.
- MD5: f65ae2995a2a7acfe43b1a5aa3140dca
- Contents:
  - ssj500k.conllu
  - ssj500k-ud-morphology.conllu (38 MB)
  - sl_ssj-ud_v2.4-dev.conllu (1 MB)
  - sl_ssj-ud_v2.4.conllu (11 MB)
  - sl_ssj-ud_v2.4-train.conllu (9 MB)
  - sl_ssj-ud_v2.4-test.conllu (1 MB)
  - 00README.txt (147 B)
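
The CoNLL-U files listed above are plain text. Under the CoNLL-U convention each token line carries ten tab-separated fields, comment lines start with `#`, and a blank line ends a sentence. The following is a minimal reading sketch, not the corpus authors' tooling; `read_conllu`, `FIELDS`, and the `SAMPLE` text are illustrative names and invented data, not taken from ssj500k.

```python
# Field names follow the ten columns of the CoNLL-U format.
FIELDS = ("id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc")


def read_conllu(lines):
    """Yield each sentence as a list of token dicts from CoNLL-U lines."""
    sentence = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
        elif line.startswith("#"):
            # Sentence-level comments (sent_id, text, ...) are skipped here.
            continue
        else:
            sentence.append(dict(zip(FIELDS, line.split("\t"))))
    if sentence:
        yield sentence


# Invented placeholder data, used only to exercise the reader.
SAMPLE = (
    "# sent_id = example.1\n"
    "1\tword1\tlemma1\tNOUN\t_\t_\t2\tnsubj\t_\t_\n"
    "2\tword2\tlemma2\tVERB\t_\t_\t0\troot\t_\t_\n"
    "\n"
    "# sent_id = example.2\n"
    "1\tword3\tlemma3\tINTJ\t_\t_\t0\troot\t_\t_\n"
)

sentences = list(read_conllu(SAMPLE.splitlines()))
print(len(sentences), [tok["form"] for tok in sentences[0]])
```

To read one of the files after unpacking the archive, the same function can be given an open file handle, for example `open("sl_ssj-ud_v2.4-train.conllu", encoding="utf-8")`; the exact path after extraction is an assumption. Multiword-token lines (IDs such as `1-2`) are returned like any other token line in this sketch.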

- Name: ssj500k-en.TEI.zip
- Size: 11.92 MB
- Format: application/zip
- Description: Corpus encoded in TEI format with annotations in English
- MD5: 2c5bb4d729bb03dbc2d88d8358196cfa
- Contents:
  - ssj500k-en.TEI
  - ssj500k.back.xml (552 kB)
  - ssj500k-en.xml (51 kB)
  - schema
  - tei_clarin_doc.xml (7 MB)
  - tei_clarin.zip (87 kB)
  - tei_clarin.rnc (282 kB)
  - tei_clarin_schema.xml (3 kB)
  - tei_clarin_example.xml (32 kB)
  - tei_clarin.dtd (229 kB)
  - tei_clarin_doc.html (7 MB)
  - tei_clarin.rng (579 kB)
  - 00README.txt (147 B)
  - ssj500k-en.body.xml (98 MB)

- Name: ssj500k-sl.TEI.zip
- Size: 11.92 MB
- Format: application/zip
- Description: Corpus encoded in TEI format with annotations in Slovene
- MD5: da8d2116b54be5d26ec675e8bb5fc996
- Contents:
  - ssj500k-sl.TEI
  - ssj500k-sl.xml (51 kB)
  - ssj500k-sl.body.xml (98 MB)
  - ssj500k.back.xml (552 kB)
  - schema
  - tei_clarin_doc.xml (7 MB)
  - tei_clarin.zip (87 kB)
  - tei_clarin.rnc (282 kB)
  - tei_clarin_schema.xml (3 kB)
  - tei_clarin_example.xml (32 kB)
  - tei_clarin.dtd (229 kB)
  - tei_clarin_doc.html (7 MB)
  - tei_clarin.rng (579 kB)
  - 00README.txt (147 B)

- Name: ssj500k.vert.zip
- Size: 7.12 MB
- Format: application/zip
- Description: Corpus in derived vertical (Sketch Engine / CQP) format
- MD5: 4c30a74912329a5252f942829f0f4a79
- Contents:
  - ssj500k.vert
  - ssj500k22.vert (44 MB)
  - ssj500k22.regi (4 kB)
  - 00README.txt (147 B)
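
Each archive above is listed with an MD5 checksum, which can be used to confirm that a download arrived intact. The sketch below uses Python's standard `hashlib`; the checksums are copied from this listing, while `md5_of`, `verify`, and the assumption that the archives sit in one local directory are illustrative.

```python
import hashlib
from pathlib import Path

# MD5 values as given in the file listing above.
EXPECTED_MD5 = {
    "ssj500k.conllu.zip": "f65ae2995a2a7acfe43b1a5aa3140dca",
    "ssj500k-en.TEI.zip": "2c5bb4d729bb03dbc2d88d8358196cfa",
    "ssj500k-sl.TEI.zip": "da8d2116b54be5d26ec675e8bb5fc996",
    "ssj500k.vert.zip": "4c30a74912329a5252f942829f0f4a79",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory="."):
    """Map each listed archive found in `directory` to True (match) or False."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        if path.exists():
            results[name] = md5_of(path) == expected
    return results


if __name__ == "__main__":
    for name, ok in verify().items():
        print(f"{name}: {'OK' if ok else 'MISMATCH'}")
```

Archives that are not present in the directory are skipped rather than reported as failures, so the check also works when only one format has been downloaded.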