Show simple item record

 
dc.contributor.author Erjavec, Tomaž
dc.contributor.author Derzhanski, Ivan
dc.contributor.author Divjak, Dagmar
dc.contributor.author Feldman, Anna
dc.contributor.author Kopotev, Mikhail
dc.contributor.author Kotsyba, Natalia
dc.contributor.author Krstev, Cvetana
dc.contributor.author Petrovski, Aleksandar
dc.contributor.author QasemiZadeh, Behrang
dc.contributor.author Radziszewski, Adam
dc.contributor.author Sharoff, Serge
dc.contributor.author Sokolovsky, Paul
dc.contributor.author Vitas, Duško
dc.contributor.author Zdravkova, Katerina
dc.date.accessioned 2015-06-15T08:50:05Z
dc.date.available 2015-06-15T08:50:05Z
dc.date.issued 2010-05-14
dc.identifier.uri http://hdl.handle.net/11356/1042
dc.description The MULTEXT-East morphosyntactic lexicons have a simple structure, where each line is a lexical entry with three tab-separated fields: (1) the word-form, the inflected form of the word; (2) the lemma, the base-form of the word; (3) the MSD, the morphosyntactic description of the word-form, i.e., its fine-grained PoS tag, as defined in the MULTEXT-East morphosyntactic specifications. This submission contains the non-commercial MULTEXT-East lexicons, while a separate submission (http://hdl.handle.net/11356/1041) gives those that are freely available.
dc.language.iso fas
dc.language.iso mkd
dc.language.iso pol
dc.language.iso rus
dc.language.iso srp
dc.publisher Jožef Stefan Institute
dc.relation info:eu-repo/grantAgreement/EC/FP7/211938
dc.relation.isreferencedby https://doi.org/10.1007/s10579-011-9174-8
dc.relation.replaces http://hdl.handle.net/11372/LRT-675
dc.rights MULTEXT-East licence
dc.rights.uri https://nl.ijs.si/ME/mte-licence.txt
dc.rights.label ACA
dc.source.uri http://nl.ijs.si/ME/Vault/V4/
dc.subject lemmatisation
dc.subject inflection
dc.subject part-of-speech tagging
dc.subject multilingual
dc.title MULTEXT-East non-commercial lexicons 4.0
dc.type lexicalConceptualResource
metashare.ResourceInfo#ContentInfo.detailedType lexicon
metashare.ResourceInfo#ContentInfo.mediaType text
has.files yes
branding CLARIN.SI data & tools
demo.uri http://nl.ijs.si/ME/Vault/V4/doc/index.html#sec-lex
contact.person Tomaž Erjavec tomaz.erjavec@ijs.si Jožef Stefan Institute
sponsor EU Copernicus COP-106 MULTEXT-East: Multilingual Text Tools and Corpora for Central and Eastern European Languages Other
sponsor EU Copernicus CONCEDE Consortium for Central European Dictionary Encoding Other
sponsor FP7 Capacities MONDILEX Conceptual Modelling of Networking of Centres for High-Quality Research in Slavic Lexicography and Their Digital Resources euFunds info:eu-repo/grantAgreement/EC/FP7/211938
size.info 2288228 entries
files.count 6
files.size 12635404
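
The dc.description field above defines the lexicon format as one entry per line with three tab-separated fields (word-form, lemma, MSD). A minimal sketch of reading such a file could look like the following. The names LexEntry, parse_line and read_lexicon are hypothetical, and UTF-8 encoding is an assumption that the record does not state; check 00README.txt before relying on it.

```python
import gzip
from typing import Iterator, NamedTuple


class LexEntry(NamedTuple):
    """One lexical entry: the three fields named in dc.description."""
    wordform: str  # inflected form of the word
    lemma: str     # base form of the word
    msd: str       # morphosyntactic description (fine-grained PoS tag)


def parse_line(line: str) -> LexEntry:
    """Split one lexicon line into its three tab-separated fields."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != 3:
        raise ValueError(f"expected 3 tab-separated fields, got {len(fields)}: {line!r}")
    return LexEntry(*fields)


def read_lexicon(path: str, encoding: str = "utf-8") -> Iterator[LexEntry]:
    """Yield entries from a gzipped lexicon file, skipping blank lines.

    The encoding default is an assumption, not taken from the record.
    """
    with gzip.open(path, "rt", encoding=encoding) as handle:
        for line in handle:
            if line.strip():
                yield parse_line(line)
```

For instance, iterating over read_lexicon("wfl-pl.txt.gz") would yield one LexEntry per line of the Polish lexicon, which can then be grouped by lemma or by MSD as needed.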


 Files in this item

Total size of all files: 12.05 MB
This item is for Academic Use and is licensed under the MULTEXT-East licence (Attribution Required, Noncommercial).
Name           Size       Format            Description                  MD5
wfl-mk.txt.gz  7.48 MB    application/gzip  Macedonian, 1323572 entries  5c031243e17ba3846e2b9109c621b1dc
wfl-fa.txt.gz  140.81 KB  application/gzip  Persian, 13006 entries       d6671a35667f5647101dd1dde8aa5760
wfl-pl.txt.gz  1.45 MB    application/gzip  Polish, 337695 entries       80e8d92da2b884cc1ac26db658676dd6
wfl-ru.txt.gz  1.61 MB    application/gzip  Russian, 243759 entries      681d2a333b91f1cfd021456341b25055
wfl-sr.txt.gz  1.37 MB    application/gzip  Serbian, 370196 entries      e2390479bbe67c8fa0a6c74b246fc542
00README.txt   4.34 KB    Text file         Unknown                      4690f923c2297de2d158134ddf159779
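
The MD5 values listed for each file can be used to check a download for corruption. The following is a minimal sketch under the assumption that the files were saved locally under their listed names; md5_of_file and verify_download are hypothetical helper names, and only the checksums themselves come from this record.

```python
import hashlib

# MD5 checksums as listed in this record, keyed by file name.
EXPECTED_MD5 = {
    "wfl-mk.txt.gz": "5c031243e17ba3846e2b9109c621b1dc",
    "wfl-fa.txt.gz": "d6671a35667f5647101dd1dde8aa5760",
    "wfl-pl.txt.gz": "80e8d92da2b884cc1ac26db658676dd6",
    "wfl-ru.txt.gz": "681d2a333b91f1cfd021456341b25055",
    "wfl-sr.txt.gz": "e2390479bbe67c8fa0a6c74b246fc542",
    "00README.txt": "4690f923c2297de2d158134ddf159779",
}


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: str, name: str) -> bool:
    """Compare a local file's MD5 with the value listed for `name`."""
    return md5_of_file(path) == EXPECTED_MD5[name]
```

A call such as verify_download("wfl-ru.txt.gz", "wfl-ru.txt.gz") returns True only when the local copy matches the listed checksum. MD5 is adequate here for detecting transfer errors, not as a security guarantee.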
