The evaluation framework contains the public evaluation scripts. Each script is accompanied by a Dockerfile that enables platform-independent evaluation and exact comparison of results. Pre-built Docker images are available in the slobench/eval DockerHub repository. The evaluation framework is used and maintained by the SloBENCH leaderboard web site team. SloBENCH submitters can check the compliance of their submissions and evaluate their models on the training/validation data prior to submission.
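For illustration, the sketch below mimics the kind of pre-submission check a submitter might run locally. The file names, the one-prediction-per-line format, and the toy accuracy metric are assumptions made for this example; the actual interface and metrics are defined by the published SloBENCH evaluation scripts and Docker images.

```python
# Hypothetical pre-submission check: validates that a submission file lines up
# with the gold validation data and reports a toy per-line accuracy. The real
# checks and metrics are those implemented in the official evaluation scripts.
from pathlib import Path

def check_compliance(pred_path: Path, gold_path: Path) -> None:
    """Verify that the submission has one non-empty prediction per gold line."""
    preds = pred_path.read_text(encoding="utf-8").splitlines()
    gold = gold_path.read_text(encoding="utf-8").splitlines()
    if len(preds) != len(gold):
        raise ValueError(f"Expected {len(gold)} prediction lines, found {len(preds)}")
    for i, line in enumerate(preds, start=1):
        if not line.strip():
            raise ValueError(f"Empty prediction on line {i}")

def accuracy(pred_path: Path, gold_path: Path) -> float:
    """Toy per-line accuracy, standing in for the task-specific metric."""
    preds = pred_path.read_text(encoding="utf-8").splitlines()
    gold = gold_path.read_text(encoding="utf-8").splitlines()
    return sum(p == g for p, g in zip(preds, gold)) / len(gold)

if __name__ == "__main__":
    pred, gold = Path("predictions.txt"), Path("validation_gold.txt")
    check_compliance(pred, gold)
    print(f"Validation score: {accuracy(pred, gold):.3f}")
```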
The initial version of SloBENCH contains evaluation scripts with examples of training and testing datasets for nine different tasks: named entity recognition, part-of-speech tagging, lemmatization, dependency parsing, semantic role labeling, translation (ENG-SLO and SLO-ENG), summarization, and question answering.
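As an example of what such an evaluation script computes, the following snippet sketches a span-level micro-averaged F1 for named entity recognition. It is an illustrative assumption about the metric, not the official implementation, which is provided by the SloBENCH scripts themselves.

```python
# Illustrative span-level micro F1 for NER: an entity counts as correct only if
# its span boundaries and label exactly match a gold entity.
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start token, end token, entity label)

def span_f1(gold: List[Set[Span]], pred: List[Set[Span]]) -> float:
    """Micro-averaged F1 over exact entity-span matches across sentences."""
    tp = fp = fn = 0
    for gold_spans, pred_spans in zip(gold, pred):
        tp += len(gold_spans & pred_spans)
        fp += len(pred_spans - gold_spans)
        fn += len(gold_spans - pred_spans)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# Example: one correct entity, one spurious prediction, one missed entity -> F1 = 0.5
gold = [{(0, 1, "PER"), (4, 5, "LOC")}]
pred = [{(0, 1, "PER"), (2, 3, "ORG")}]
print(f"F1 = {span_f1(gold, pred):.3f}")
```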