What's New
corpus

Description:
Gos 2.1 is the reference speech corpus of the Slovenian language. This edition contains about 300 hours of speech, or 2.4 million words, 127 thousand utterances and 1,500 texts. It is composed of three different ...
This entry contains 4 files (100.88 GB).
Restricted Use


toolService

Description:
This model for lemmatisation of spoken Slovenian was built with the CLASSLA-Stanza tool (https://github.com/clarinsi/classla) by training on the SST treebank of spoken Slovenian (https://github.com/UniversalDependencies/ ...
This entry contains 1 file (2.09 MB).
Publicly Available



toolService

Description:
This model for morphosyntactic annotation of spoken Slovenian was built with the CLASSLA-Stanza tool (https://github.com/clarinsi/classla) by training on the SST treebank of spoken Slovenian (https://github.com/Universal ...
This entry contains 2 files (514.74 MB).
Publicly Available



Most Viewed
In the past week
corpus

Description:
The FRENK dataset consists of comments on Facebook posts (news articles) of mainstream media outlets from Croatia, Great Britain, and Slovenia, on the topics of migrants and LGBT. The dataset contains whole discussion ...
This entry contains 1 file (4.17 MB).
Academic Use



corpus

Description:
The corpus contains 256,567 documents from the Slovenian news portals 24ur, Dnevnik, Finance, Rtvslo, and Žurnal24. These portals contain political, business, economic and financial content. The submission contains 7 files: ...
This entry contains 8 files (616.88 MB).
Publicly Available



corpus

Description:
The LiLaH-HAG dataset (HAG is short for hate-age-gender) consists of metadata on Facebook comments on Facebook posts of mainstream media in Great Britain, Flanders, Slovenia and Croatia. The metadata available in the dataset ...
This entry contains 1 file (128.23 KB).
Publicly Available



