Files in this item
This item is Publicly Available and licensed under: Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)




- Name: konvNormSl.zip
- Size: 4.57 MB
- Format: application/zip
- Description: Dataset
- MD5: 98a809350431cce453224e842a413212
- konvNormSl
- README.txt (1 kB)
- token
- dev
- goo300k-gaj.token.dev.norm.txt (302 kB)
- tweet-L3.token.dev.norm.txt (57 kB)
- tweet-L1.token.dev.orig.txt (57 kB)
- goo300k-gaj.token.dev.orig.txt (303 kB)
- tweet-L3.token.dev.orig.txt (56 kB)
- goo300k-bohoric.token.dev.norm.txt (82 kB)
- tweet-L1.token.dev.norm.txt (57 kB)
- goo300k-bohoric.token.dev.orig.txt (85 kB)
- train
- goo300k-bohoric.token.train.orig.txt (733 kB)
- tweet-L1.token.train.orig.txt (452 kB)
- goo300k-gaj.token.train.norm.txt (2 MB)
- tweet-L3.token.train.norm.txt (484 kB)
- goo300k-gaj.token.train.orig.txt (2 MB)
- tweet-L3.token.train.orig.txt (471 kB)
- goo300k-bohoric.token.train.norm.txt (705 kB)
- tweet-L1.token.train.norm.txt (454 kB)
- test
- tweet-L3.token.test.orig.txt (58 kB)
- goo300k-gaj.token.test.norm.txt (314 kB)
- goo300k-gaj.token.test.orig.txt (314 kB)
- tweet-L1.token.test.norm.txt (58 kB)
- goo300k-bohoric.token.test.norm.txt (85 kB)
- tweet-L3.token.test.norm.txt (60 kB)
- tweet-L1.token.test.orig.txt (58 kB)
- goo300k-bohoric.token.test.orig.txt (88 kB)
- dev
- segment
- dev
- goo300k-gaj.segment.dev.norm.txt (255 kB)
- tweet-L3.segment.dev.norm.txt (48 kB)
- goo300k-bohoric.segment.dev.norm.txt (69 kB)
- goo300k-gaj.segment.dev.orig.txt (256 kB)
- tweet-L3.segment.dev.orig.txt (47 kB)
- goo300k-bohoric.segment.dev.orig.txt (72 kB)
- tweet-L1.segment.dev.norm.txt (48 kB)
- tweet-L1.segment.dev.orig.txt (48 kB)
- train
- goo300k-gaj.segment.train.norm.txt (1 MB)
- goo300k-bohoric.segment.train.norm.txt (593 kB)
- tweet-L3.segment.train.orig.txt (394 kB)
- tweet-L1.segment.train.orig.txt (385 kB)
- goo300k-gaj.segment.train.orig.txt (1 MB)
- goo300k-bohoric.segment.train.orig.txt (621 kB)
- tweet-L3.segment.train.norm.txt (407 kB)
- tweet-L1.segment.train.norm.txt (386 kB)
- test
- tweet-L3.segment.test.orig.txt (48 kB)
- goo300k-bohoric.segment.test.orig.txt (74 kB)
- tweet-L1.segment.test.norm.txt (49 kB)
- tweet-L1.segment.test.orig.txt (49 kB)
- goo300k-gaj.segment.test.norm.txt (264 kB)
- tweet-L3.segment.test.norm.txt (49 kB)
- goo300k-bohoric.segment.test.norm.txt (71 kB)
- goo300k-gaj.segment.test.orig.txt (264 kB)
- dev
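
The MD5 value listed for konvNormSl.zip can be used to check that a downloaded copy is intact. The following is a minimal sketch of such a check, not part of the dataset itself. The local path `konvNormSl.zip` and the helper names `md5_of_file` and `archive_is_intact` are assumptions made for illustration.

```python
import hashlib
from pathlib import Path

# MD5 value copied from the file listing for konvNormSl.zip.
EXPECTED_MD5 = "98a809350431cce453224e842a413212"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_is_intact(path, expected=EXPECTED_MD5):
    """Compare the file's MD5 digest with the expected hex string."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    archive = Path("konvNormSl.zip")  # assumed local download location
    if archive.exists():
        print("checksum OK" if archive_is_intact(archive) else "checksum mismatch")
    else:
        print(f"{archive} not found; download it first")
```

A mismatch usually points to an incomplete or corrupted download, in which case fetching the archive again is the simplest remedy.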