dc.contributor.author Vasić, Daniel
dc.contributor.author Žitko, Branko
dc.contributor.author Gašpar, Angelina
dc.contributor.author Ljubešić, Nikola
dc.contributor.author Štrkalj Despot, Kristina
dc.contributor.author Merkler, Danijela
dc.date.accessioned 2020-11-20T16:41:42Z
dc.date.available 2020-11-20T16:41:42Z
dc.date.issued 2020-11-20
dc.identifier.uri http://hdl.handle.net/11356/1377
dc.description This corpus can be used to build and evaluate methods for extracting and presenting knowledge based on a semantic hypergraph. The corpus consists of 184 simple, complex and dependently complex sentences. All sentences are annotated at the levels of tokenisation, sentence segmentation, morphosyntactic tagging, lemmatisation, syntactic dependencies, named entities, and semantic roles. The resource also includes a representation of a subset of 176 sentences in the form of a semantic hypergraph, which can be used to evaluate knowledge extraction methods for Croatian. The sentences in this corpus are taken from the textbook: Hudeček, L., Mihaljević, M., Sršen, J. and Čamagajevac, S. (2017). Hrvatska Školska Gramatika. Zagreb: Institut za hrvatski jezik i jezikoslovlje. https://gramatika.hr/impresum/
dc.language.iso hrv
dc.publisher University of Mostar
dc.publisher University of Split
dc.publisher Jožef Stefan Institute
dc.rights Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
dc.rights.uri https://creativecommons.org/licenses/by-sa/4.0/
dc.rights.label PUB
dc.source.uri https://www.acnltutor.net/
dc.subject knowledge extraction
dc.subject knowledge representation
dc.subject semantics
dc.subject semantic role labeling
dc.title Semantic hypergraph corpus SemCRO 1.0
dc.type corpus
metashare.ResourceInfo#ContentInfo.mediaType text
has.files yes
branding CLARIN.SI data & tools
demo.uri https://bitbucket.org/danielvasic/croatiangraphbrain
contact.person Daniel Vasić daniel.vasic@fpmoz.sum.ba University of Mostar
sponsor Office of Naval Research N00014-20-1-2066 Enhancing Adaptive Courseware based on Natural Language Processing nationalFunds
size.info 184 sentences
size.info 176 semanticUnits
files.count 1
files.size 22175


 Files in this item

This item is publicly available and licensed under: Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
Name: semcro.v1.zip
Size: 21.66 KB
Format: application/zip
Description: Test corpus in extended CoNLL-U (184 sentences) and gold hypergraph (subset of 176 sentences).
MD5: a48538e5aa3ddfe57c5bf9e2282c164d
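
The MD5 value listed above can be used to check that a downloaded copy of the archive is intact. The following is a minimal sketch, assuming the archive was saved in the current directory under the name given in this record; the function names md5_of_file and matches_record are illustrative and not part of the resource.

```python
import hashlib
from pathlib import Path

# MD5 value as listed in this record for semcro.v1.zip
EXPECTED_MD5 = "a48538e5aa3ddfe57c5bf9e2282c164d"


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected=EXPECTED_MD5):
    """Compare the digest of a local file with the expected hex digest."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    archive = Path("semcro.v1.zip")  # assumed local download location
    if archive.exists():
        print("checksum OK" if matches_record(archive) else "checksum mismatch")
    else:
        print(f"{archive} not found")
```

A mismatch usually indicates an incomplete or corrupted download rather than a different version of the corpus, so downloading the file again is the first thing to try.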
Archive contents:
  • semcro.v1
    • semcro.test.conll (133 kB)
    • README.txt (1 kB)
    • semcro.hypergraph.hg (9 kB)
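
The record describes the test corpus as extended CoNLL-U but does not list the additional columns, so the sketch below makes no claim about them. It assumes the file is tab-separated, that the first ten columns follow the standard CoNLL-U order, and that any further columns are the extension; these are kept in a generic "extra" list. The names read_conllu and row_to_dict, and the sample rows, are invented for illustration and are not taken from the corpus. README.txt in the archive is the authoritative description of the actual layout.

```python
# Standard CoNLL-U column order; columns beyond these are treated as extensions.
CONLLU_COLUMNS = ["id", "form", "lemma", "upos", "xpos",
                  "feats", "head", "deprel", "deps", "misc"]


def read_conllu(lines):
    """Yield sentences as lists of rows; each row is a list of column strings."""
    sentence = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
        elif line.startswith("#"):
            # Comment lines (e.g. sentence metadata) are skipped in this sketch.
            continue
        else:
            sentence.append(line.split("\t"))
    if sentence:
        yield sentence


def row_to_dict(row):
    """Map the first ten columns to their names; keep the rest under 'extra'."""
    token = dict(zip(CONLLU_COLUMNS, row))
    token["extra"] = row[len(CONLLU_COLUMNS):]
    return token


# Invented sample, NOT from the corpus; the 11th column is a placeholder.
SAMPLE = (
    "# text = Ivan spava.\n"
    "1\tIvan\tIvan\tPROPN\t_\t_\t2\tnsubj\t_\t_\tEXTRA_A\n"
    "2\tspava\tspavati\tVERB\t_\t_\t0\troot\t_\t_\tEXTRA_B\n"
    "\n"
    "1\tDa\tda\tINTJ\t_\t_\t0\troot\t_\t_\n"
)

if __name__ == "__main__":
    for sentence in read_conllu(SAMPLE.splitlines()):
        print([row_to_dict(row)["form"] for row in sentence])
```

To read the real file, the same generator can be fed an open file object, for example open("semcro.v1/semcro.test.conll", encoding="utf-8"), where the path is an assumption based on the archive listing above.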
