dc.contributor.author |
Rei, Luis |
dc.contributor.author |
Krek, Simon |
dc.contributor.author |
Mladenić, Dunja |
dc.date.accessioned |
2016-11-28T13:47:36Z |
dc.date.available |
2016-11-28T13:47:36Z |
dc.date.issued |
2016-11-28 |
dc.identifier.uri |
http://hdl.handle.net/11356/1078 |
dc.description |
The xLiMe Twitter Corpus contains tweets in German, Italian, and Spanish, manually annotated with part-of-speech tags, named entities, and message-level sentiment polarity. In total, the corpus contains almost 20K annotated messages and 350K tokens.
The corpus is described in:
Luis Rei, Dunja Mladenić, Simon Krek. A Multilingual Social Media Linguistic Corpus. Proceedings of the 4th Conference on CMC and Social Media Corpora for the Humanities. 27–28 September 2016, Ljubljana, Slovenia. https://nl.ijs.si/janes/cmc-corpora2016/proceedings/ |
dc.language.iso |
spa |
dc.language.iso |
ita |
dc.language.iso |
deu |
dc.publisher |
Jožef Stefan Institute |
dc.relation |
info:eu-repo/grantAgreement/EC/FP7/611346 |
dc.rights |
The MIT License (MIT) |
dc.rights.uri |
https://opensource.org/licenses/mit-license.php |
dc.rights.label |
PUB |
dc.source.uri |
https://github.com/lrei/xlime_twitter_corpus |
dc.subject |
social media |
dc.subject |
computer-mediated communication |
dc.subject |
Twitter |
dc.subject |
part-of-speech tagging |
dc.subject |
named entities |
dc.subject |
sentiment classification |
dc.subject |
multilingual |
dc.subject |
manual annotation |
dc.title |
xLiMe Twitter Corpus XTC 1.0.1 |
dc.type |
corpus |
metashare.ResourceInfo#ContentInfo.mediaType |
text |
hidden |
false |
hasMetadata |
false |
has.files |
yes |
branding |
CLARIN.SI data & tools |
contact.person |
Luis Rei luis.rei@ijs.si Jožef Stefan Institute |
sponsor |
ICT Programme FP7-ICT-611346 xLiMe euFunds info:eu-repo/grantAgreement/EC/FP7/611346 |
size.info |
363994 tokens |
size.info |
19669 texts |
files.count |
2 |
files.size |
6592396 |