What's New
corpus
Description:
The Mići Princ "text and speech" dialectal dataset is a word-aligned version of the translation of The Little Prince into various Chakavian micro-dialects, released by the Udruga Calculus and the Peek&Poke museum ...
This item contains 6 files (1.04
GB).
Publicly Available
lexicalConceptualResource
Description:
The database of the Collocations Dictionary of Modern Slovene 2.0 contains 4,491,958 collocations in 81,443 entries. Collocations occur in 81 different syntactic relations. Collocations are labelled according to their ...
This item contains 1 file (100.09
MB).
Publicly Available
corpus
Description:
The Bosnian web corpus CLASSLA-web.bs 1.0 is based on the MaCoCu-bs 1.0 web corpus crawl (http://hdl.handle.net/11356/1808), which was additionally cleaned and enriched with linguistic and genre information. The CLASSLA-web.bs ...
This item contains 2 files (6.36
GB).
Publicly Available
Most Viewed Items
Top Last Week
lexicalConceptualResource
Description:
Frequency lists of collocations were extracted from the Gigafida 2.1 Corpus of Written Standard Slovene (https://www.clarin.si/noske/run.cgi/corp_info?corpname=gfida21) using specialised scripts for extraction of data from ...
This item contains 1 file (139.56
MB).
Publicly Available
corpus
Description:
The Mići Princ "text and speech" dialectal dataset is a word-aligned version of the translation of The Little Prince into various Chakavian micro-dialects, released by the Udruga Calculus and the Peek&Poke museum ...
This item contains 6 files (1.04
GB).
Publicly Available
corpus
Description:
ParlaMint 4.0 is a set of comparable corpora containing transcriptions of parliamentary debates of 29 European countries and autonomous regions, mostly starting in 2015 and extending to mid-2022. The individual corpora ...
This item contains 30 files (5.67
GB).
Publicly Available