What's New
corpus

Description:
Gos 2.1 is the reference speech corpus of the Slovenian language. This edition contains about 300 hours of speech, or 2.4 million words, 127 thousand utterances and 1,500 texts. It is composed from three different ...
This item contains 4 files (100.88 GB).
Restricted Use


toolService

Description:
This model for lemmatisation of spoken Slovenian was built with the CLASSLA-Stanza tool (https://github.com/clarinsi/classla) by training on the SST treebank of spoken Slovenian (https://github.com/UniversalDependencies/ ...
This item contains 1 file (2.09 MB).
Publicly Available



toolService

Description:
This model for morphosyntactic annotation of spoken Slovenian was built with the CLASSLA-Stanza tool (https://github.com/clarinsi/classla) by training on the SST treebank of spoken Slovenian (https://github.com/Universal ...
This item contains 2 files (514.74 MB).
Publicly Available



Most Viewed Items
Top Last Week
corpus

Description:
The corpus contains 256,567 documents from the Slovenian news portals 24ur, Dnevnik, Finance, Rtvslo, and Žurnal24. These portals contain political, business, economic and financial content. The submission contains 7 files: ...
This item contains 8 files (616.88 MB).
Publicly Available



corpus

Description:
ParlaMint 4.1 is a set of comparable corpora containing transcriptions of parliamentary debates of 29 European countries and autonomous regions, mostly starting in 2015 and extending to mid-2022. The individual corpora ...
This item contains 30 files (5.87 GB).
Publicly Available


corpus

Description:
ParlaMint 4.1 is a set of comparable corpora containing transcriptions of parliamentary debates of 29 European countries and autonomous regions, mostly starting in 2015 and extending to mid-2022. The individual corpora ...
This item contains 31 files (65.97 GB).
Publicly Available

