dc.contributor.author Mozetič, Igor
dc.contributor.author Grčar, Miha
dc.contributor.author Smailović, Jasmina
dc.date.accessioned 2016-02-23T10:08:53Z
dc.date.available 2016-04-25T21:45:18Z
dc.date.issued 2016-02-23
dc.identifier.uri http://hdl.handle.net/11356/1054
dc.description The dataset contains over 1.6 million tweets (tweet IDs), labeled with sentiment by human annotators. There are 15 Twitter corpora, one for each of 15 European languages. The data can be used to train and evaluate Twitter sentiment classifiers, to compute annotator agreement, or to study the differences between language usage on Twitter. The data analysis is described in the following papers: I. Mozetič, M. Grčar, J. Smailović. Multilingual Twitter sentiment classification: The role of human annotators, PLoS ONE 11(5): e0155036, doi: 10.1371/journal.pone.0155036, 2016. (http://dx.doi.org/10.1371/journal.pone.0155036) I. Mozetič, L. Torgo, V. Cerqueira, J. Smailović. How to evaluate sentiment classifiers for Twitter time-ordered data?, PLoS ONE 13(3): e0194317, doi: 10.1371/journal.pone.0194317, 2018. (https://dx.doi.org/10.1371/journal.pone.0194317)
dc.language.iso sqi
dc.language.iso bos
dc.language.iso bul
dc.language.iso hrv
dc.language.iso eng
dc.language.iso deu
dc.language.iso hun
dc.language.iso pol
dc.language.iso por
dc.language.iso srp
dc.language.iso rus
dc.language.iso slk
dc.language.iso slv
dc.language.iso spa
dc.language.iso swe
dc.publisher Jožef Stefan Institute
dc.relation info:eu-repo/grantAgreement/EC/FP7/610704
dc.relation info:eu-repo/grantAgreement/EC/FP7/317532
dc.relation info:eu-repo/grantAgreement/EC/H2020/640772
dc.relation.isreferencedby https://dx.doi.org/10.1371/journal.pone.0155036
dc.relation.isreferencedby https://dx.doi.org/10.1371/journal.pone.0194317
dc.rights Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
dc.rights.uri https://creativecommons.org/licenses/by-sa/4.0/
dc.rights.label PUB
dc.subject sentiment classification
dc.subject Twitter
dc.subject inter-annotator agreement
dc.subject annotator self-agreement
dc.subject multilingual
dc.title Twitter sentiment for 15 European languages
dc.type corpus
metashare.ResourceInfo#ContentInfo.mediaType text
has.files yes
branding CLARIN.SI data & tools
contact.person Igor Mozetic igor.mozetic@ijs.si Jožef Stefan Institute
sponsor EC 610704 SIMPOL euFunds info:eu-repo/grantAgreement/EC/FP7/610704
sponsor EC 317532 MULTIPLEX euFunds info:eu-repo/grantAgreement/EC/FP7/317532
sponsor EC 640772 DOLFINS euFunds info:eu-repo/grantAgreement/EC/H2020/640772
sponsor ARRS (Slovenian Research Agency) P2-103 Knowledge Technologies nationalFunds
size.info 1643735 items
files.count 16
files.size 51781021


 Files in this item (total download size: 49.38 MB)
Name                              Size       Format     Description  MD5
README.txt                        665 bytes  Text file  Unknown      8b2d34f643b73d8a44557dc8d2ba6d2f

File preview (README.txt):
There are 15 files for the corresponding 15 European languages:
Albanian, Bosnian, Bulgarian, Croatian, English, German, Hungarian,
Polish, Portuguese, Russian, Serbian, Slovak, Slovenian, Spanish, and Swedish.

Files are in the standard csv format, each line has the following form:
TweetID,HandLabel,AnnotatorID

TweetID is assigned by Twitter and can be used to retrieve the tweet.
HandLabel is the sentiment label as assigned by the human annotator
(Negative, Neutral, or Positive).
AnnotatorID is a 3-digit integer assigned to anonymous annotators,
and can be used to identify tweets annotated several times by the
same or by different annotators. . . .
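
The row format described in the README can be read with Python's standard csv module. The following is a minimal sketch, not the authors' tooling: the function names and the sample rows are hypothetical, and it assumes that a header row, if one is present, can be recognized by a non-numeric first field (the README preview does not say whether the files have a header). It groups rows by TweetID to find tweets that were annotated more than once, which is the starting point for any agreement computation.

```python
import csv
import io
from collections import defaultdict


def read_annotations(lines):
    """Yield (tweet_id, label, annotator_id) tuples from CSV lines."""
    for row in csv.reader(lines):
        # Skip blank rows and a possible header row (assumption: a header,
        # if present, has a non-numeric first field).
        if len(row) != 3 or not row[0].strip().isdigit():
            continue
        tweet_id, label, annotator_id = (field.strip() for field in row)
        yield tweet_id, label, annotator_id


def repeated_annotations(annotations):
    """Map each TweetID labeled more than once to its (annotator, label) pairs."""
    by_tweet = defaultdict(list)
    for tweet_id, label, annotator_id in annotations:
        by_tweet[tweet_id].append((annotator_id, label))
    return {t: pairs for t, pairs in by_tweet.items() if len(pairs) > 1}


# Hypothetical sample rows, invented for illustration only.
sample = io.StringIO(
    "TweetID,HandLabel,AnnotatorID\n"
    "1001,Positive,101\n"
    "1002,Neutral,101\n"
    "1001,Negative,102\n"
    "1002,Neutral,101\n"
)
repeats = repeated_annotations(read_annotations(sample))
```

In this sample, tweet 1001 was labeled by two different annotators and tweet 1002 twice by the same annotator. For a downloaded file, an open handle such as `open("English_Twitter_sentiment.csv", newline="")` can be passed in place of `sample`. TweetIDs are kept as strings here so that they are never reformatted by numeric conversion.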
                                            
Name                              Size       Format     Description  MD5
German_Twitter_sentiment.csv      3.27 MB    Unknown    CSV file     b6b766a80454a928ce0b90211dd60bab
English_Twitter_sentiment.csv     3.1 MB     Unknown    CSV file     8407a2302f20336a8809ba74f6d0112a
Croatian_Twitter_sentiment.csv    2.95 MB    Unknown    CSV file     d28be685aa56adf237a3d59e6043ddd7
Bulgarian_Twitter_sentiment.csv   2.02 MB    Unknown    CSV file     c1248f10e9130b70036a994a11c44018
Albanian_Twitter_sentiment.csv    1.65 MB    Unknown    CSV file     aaebf885a823be2e941a7bf58e1aeb5b
Russian_Twitter_sentiment.csv     3.14 MB    Unknown    CSV file     199a3dd666abbb193d5541aac35eb9d6
Bosnian_Twitter_sentiment.csv     1.35 MB    Unknown    CSV file     8dcfbeb77c8ae28b1f3211831705e14a
Portuguese_Twitter_sentiment.csv  4.6 MB     Unknown    CSV file     446fe4c9be94b69b419cc8c81aea284e
Polish_Twitter_sentiment.csv      6.77 MB    Unknown    CSV file     647396610cce71dda658924111a3833b
Hungarian_Twitter_sentiment.csv   2.07 MB    Unknown    CSV file     cb7301ef7a528cf4360c8a7303d5d723
Swedish_Twitter_sentiment.csv     1.77 MB    Unknown    CSV file     7e7f6885784b4e195bcf02a25d4881dd
Spanish_Twitter_sentiment.csv     8.31 MB    Unknown    CSV file     8b0a56106e3764d17787eecc502484f3
Slovenian_Twitter_sentiment.csv   4.03 MB    Unknown    CSV file     5717253537f2241c80be514ed135c612
Slovak_Twitter_sentiment.csv      2.14 MB    Unknown    CSV file     8889f24b7fbf7662324612440fa8d723
Serbian_Twitter_sentiment.csv     2.22 MB    Unknown    CSV file     faeefe277de414e78e27ef679864c0e9
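
The MD5 values listed for each file can be used to check that a download is intact. The following is a minimal sketch using Python's standard hashlib module; the helper names `md5_of_file` and `matches_listing` are hypothetical, and the file path in the usage comment assumes the file was saved under its listed name in the current directory.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, expected_md5):
    """Compare a file's MD5 digest with a value copied from the listing."""
    return md5_of_file(path) == expected_md5.strip().lower()


# Usage (assumes the file has been downloaded to the working directory):
# matches_listing("German_Twitter_sentiment.csv",
#                 "b6b766a80454a928ce0b90211dd60bab")
```

MD5 is adequate here for detecting a corrupted or truncated download; it is not a defense against deliberate tampering.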
