What's New
corpus
Description:
GaMS-Instruct-DH is an instruction-following dataset designed to fine-tune Slovene large language models to follow instructions. It consists of pairs of prompts and responses, some of which contain an additional context ...
This entry contains 1 file (888.96 KB).
Publicly Available
corpus
Description:
GaMS-Instruct-GEN is an instruction-following dataset designed to fine-tune Slovene large language models to follow instructions. It consists of pairs of prompts and responses, some of which contain an additional input ...
This entry contains 1 file (3.12 MB).
Publicly Available
toolService
Description:
The X-GENRE classifier is a text classification model that can be used for automatic genre identification. The model classifies texts into one of 9 genre labels: Information/Explanation, News, Instruction, Opinion/Argumentation, ...
This entry contains 1 file (779.93 MB).
Publicly Available
Most Viewed
In the past week
corpus
Description:
ParlaMint 4.1 is a set of comparable corpora containing transcriptions of parliamentary debates of 29 European countries and autonomous regions, mostly starting in 2015 and extending to mid-2022. The individual corpora ...
This entry contains 30 files (5.87 GB).
Publicly Available
corpus
Description:
The CVET corpus contains 230 texts (around 175 thousand words) of varying length, published in the religious journal "Cvetje z vertov sv. Frančiška" between 1887 and 1916, when the magazine was edited by the linguist Fr. ...
This entry contains 4 files (15.02 MB).
Publicly Available
lexicalConceptualResource
Description:
Sloleks is the reference morphological lexicon for the Slovenian language, developed to be used in NLP applications and language manuals. Encoded in LMF XML, the lexicon contains approx. 100,000 of the most frequent Slovenian lemmas, ...
This entry contains 2 files (85.8 MB).
Publicly Available