What's New
toolService

Description:
STARK is a highly customizable tool designed for extracting different types of syntactic structures (trees) from parsed corpora (treebanks), aimed at corpus-driven linguistic investigations of syntactic and lexical phenomena ...
This item contains 1 file (3.17 MB).
Publicly Available
corpus

Description:
The Frenk-MRW dataset contains French and Slovene socially unacceptable Facebook comments that are manually annotated for metaphor and metonymy based on the observed incongruity between the basic and contextual meaning. ...
This item contains 1 file (1.82 MB).
Academic Use



lexicalConceptualResource

Description:
ILS is a dataset of Slovene word forms that contain a single lC bigram, i.e. an "l" grapheme preceding a consonant grapheme ("l" + C(onsonant) = lC bigram). This combination is one of the less predictable ...
This item contains 1 file (1.05 MB).
Publicly Available



Most Viewed Items
Top Last Week
lexicalConceptualResource

Description:
A lexicon of 751 emoji characters with automatically assigned sentiment. The sentiment is computed from 70,000 tweets, labeled by 83 human annotators in 13 European languages. The process and analysis of emoji sentiment ...
This item contains 3 files (93.95 KB).
Publicly Available



toolService

Description:
This Conformer CTC BPE E2E Automated Speech Recognition model was trained following the NVIDIA NeMo Conformer-CTC fine-tuning recipe (for details see the official NVIDIA NeMo NMT documentation, https://docs.nvidia.com/de ...
This item contains 1 file (430.87 MB).
Publicly Available
lexicalConceptualResource

Description:
A list of headwords from the collection "Besede slovenskega jezika" (Words of Slovenian Language).
This item contains 1 file (997.48 KB).
Publicly Available


