What's New

corpus
Description:
The dataset was created from a large number of Serbian legislation texts gathered from the https://www.pravno-informacioni-sistem.rs/ website. The gathered texts were used for fine-tuning a neural network called SRBerta ...
 This item contains 5 files (66.42 MB).
 
Publicly Available, Distributed under Creative Commons, Attribution Required
toolService
Description:
SloNER is a model for Slovenian Named Entity Recognition. It is a PyTorch neural network model intended for use with the HuggingFace transformers library (https://github.com/huggingface/transformers). The model ...
 This item contains 1 file (387.44 MB).
 
Publicly Available, Distributed under Creative Commons, Attribution Required, Share Alike
corpus
Description:
Stanford Question Answering Dataset (SQuAD) is a reading comprehension dataset, consisting of questions posed by crowdworkers on a set of Wikipedia articles, where the answer to every question is a segment of text, or span, ...
 This item contains 2 files (128.43 MB).
 
Publicly Available, Distributed under Creative Commons, Attribution Required

Most Viewed Items

Top Last Week
corpus
Description:
The ParlaSpeech-HR dataset is built from parliamentary proceedings available in the Croatian part of the ParlaMint corpus and the parliamentary recordings available from the Croatian Parliament's YouTube channel. The corpus ...
 This item contains 5 files (117.25 GB).
 
Publicly Available, Distributed under Creative Commons, Attribution Required, Share Alike