What's New
lexicalConceptualResource
Description:
ArboSloleks is a dataset containing Slovene word formation trees that have been automatically constructed from word relations (http://hdl.handle.net/11356/1986) extracted from Sloleks 2.0 (http://hdl.handle.net/11356/1230). ...
This entry contains 1 file (2.53 MB).
Publicly Available
corpus
Description:
This corpus consists of editions of three volumes of sermons written by Ignatius Holzapfel (1799-1866) when he was active as parish priest in Črnomelj and Ribnica. The bulk of Holzapfel's manuscript legacy remained ...
This entry contains 1 file (278.19 KB).
Publicly Available
corpus
Description:
The document contains a diplomatic transcription of over 285 pages of manuscript documents about the Slovenian mystic Magdalena Gornik (1835-1896) from the village of Gora near Sodražica. The vast majority of the documents ...
This entry contains 1 file (866.85 KB).
Publicly Available
Most Viewed
In the past week
corpus
Description:
ParlaMint 4.1 is a set of comparable corpora containing transcriptions of parliamentary debates of 29 European countries and autonomous regions, mostly starting in 2015 and extending to mid-2022. The individual corpora ...
This entry contains 30 files (5.87 GB).
Publicly Available
corpus
Description:
The hr500k training corpus contains about 500,000 tokens manually annotated on the levels of tokenisation, sentence segmentation, morphosyntactic tagging, lemmatisation and named entities. About half of the corpus is also ...
This entry contains 3 files (91.53 MB).
Publicly Available
corpus
Description:
The ssj500k training corpus is based on two training corpora built within the JOS project (https://nl.ijs.si/jos/). It contains the jos100k corpus and additional material from the jos1M corpus forming a training corpus ...
This entry contains 3 files (17.7 MB).
Publicly Available