The MULTEXT-East morphosyntactic lexicons have a simple structure. Each line is a lexical entry with three tab-separated fields: (1) the word-form, i.e., the inflected form of the word; (2) the lemma, i.e., the base form of the word; (3) the MSD, i.e., the morphosyntactic description of the word-form (its fine-grained PoS tag), as defined in the MULTEXT-East morphosyntactic specifications.
This submission contains the non-commercial MULTEXT-East lexicons, while a separate submission (http://hdl.handle.net/11356/1041) gives those that are freely available.
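The three-field entry format described above can be read with a few lines of code. The following is a minimal sketch, not part of the distribution: the names `Entry` and `read_lexicon` are hypothetical, the two sample lines are made up for illustration rather than copied from a lexicon, and UTF-8 encoding is assumed when opening a real file.

```python
import io
from typing import Iterable, Iterator, NamedTuple


class Entry(NamedTuple):
    """One lexical entry: word-form, lemma, MSD (hypothetical helper type)."""
    wordform: str
    lemma: str
    msd: str


def read_lexicon(lines: Iterable[str]) -> Iterator[Entry]:
    """Yield one Entry per non-empty line of tab-separated text."""
    for line_no, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line:
            continue  # tolerate blank lines, e.g. a trailing newline
        fields = line.split("\t")
        if len(fields) != 3:
            raise ValueError(
                f"line {line_no}: expected 3 tab-separated fields, got {len(fields)}"
            )
        yield Entry(*fields)


# Illustrative sample only; these lines are assumptions, not lexicon data.
SAMPLE = "walks\twalk\tVmip3s\nwalked\twalk\tVmis\n"
entries = list(read_lexicon(io.StringIO(SAMPLE)))
for entry in entries:
    print(entry.wordform, entry.lemma, entry.msd)
```

For a real lexicon file, the same function could be applied to an open file handle, for example `read_lexicon(open(path, encoding="utf-8"))`, where `path` is whichever lexicon file was downloaded.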
Tomaž Erjavec, tomaz.erjavec@ijs.si, Jožef Stefan Institute
Sponsor: EU Copernicus, COP-106, MULTEXT-East: Multilingual Text Tools and Corpora for Central and Eastern European Languages (funding type: Other)
Sponsor: EU Copernicus, CONCEDE, Consortium for Central European Dictionary Encoding (funding type: Other)
Sponsor: FP7 Capacities, MONDILEX, Conceptual Modelling of Networking of Centres for High-Quality Research in Slavic Lexicography and Their Digital Resources (funding type: euFunds; info:eu-repo/grantAgreement/EC/FP7/211938)