dc.contributor.author |
Klemen, Matej |
dc.contributor.author |
Kosem, Iztok |
dc.contributor.author |
Arhar Holdt, Špela |
dc.contributor.author |
Pollak, Senja |
dc.contributor.author |
Huber, Damjan |
dc.contributor.author |
Lutar, Mateja |
dc.date.accessioned |
2022-11-14T09:50:02Z |
dc.date.available |
2022-11-14T09:50:02Z |
dc.date.issued |
2022-11-14 |
dc.identifier.uri |
http://hdl.handle.net/11356/1696 |
dc.description |
The KUUS corpus comprises 17 textbooks for Slovenian as a second and foreign language, published between 2002 and 2022 at the Centre for Slovene as a Second and Foreign Language (Faculty of Arts, University of Ljubljana). At the time the corpus was created, these textbooks were widely used to teach Slovenian as a second and foreign language to children, adolescents and adults in Slovenia and abroad. The corpus contains 520,796 words. It was linguistically annotated with the CLASSLA v1.1.1 pipeline (https://github.com/clarinsi/classla/) at the levels of tokenization, sentence segmentation, lemmatization, MULTEXT-East v6 MSD tags (https://nl.ijs.si/ME/V6/msd/html/msd-sl.html), JOS dependency syntax (https://nl.ijs.si/jos/bib/jos-skladnja-navodila.pdf), and named entities (https://nl.ijs.si/janes/wp-content/uploads/2017/09/SlovenianNER-eng-v1.1.pdf). The metadata for each textbook include the title, subtitle, authors, year of publication, publisher, CEFR level, target audience, and the estimated number of lessons.
The corpus is presented in more detail in: KLEMEN, Matej, ARHAR HOLDT, Špela, POLLAK, Senja, KOSEM, Iztok, HUBER, Damjan, LUTAR, Mateja, 2022: Korpus učbenikov za učenje slovenščine kot drugega in tujega jezika. Nataša Pirih Svetina, Ina Ferbežar (eds.): Na stičišču svetov: slovenščina kot drugi in tuji jezik. Obdobja 41. Ljubljana: Založba Univerze v Ljubljani. 165–174. DOI: https://doi.org/10.4312/Obdobja.41.2784-7152 |
dc.language.iso |
slv |
dc.publisher |
Centre for Slovene as a Second and Foreign Language, University of Ljubljana |
dc.publisher |
Centre for Language Resources and Technologies, University of Ljubljana |
dc.relation.isreferencedby |
https://doi.org/10.4312/Obdobja.41.2784-7152 |
dc.relation.isreplacedby |
http://hdl.handle.net/11356/1877 |
dc.rights |
CLARIN.SI Licence ACA ID-BY-NC-INF-NORED 1.0 |
dc.rights.uri |
https://clarin.si/repository/xmlui/page/licence-aca-id-by-nc-inf-nored-1.0 |
dc.rights.label |
ACA |
dc.source.uri |
https://centerslo.si/KUUS |
dc.subject |
textbook corpus |
dc.subject |
L2 |
dc.subject |
language learning |
dc.title |
Corpus of textbooks for learning Slovenian as L2 KUUS 1.0 |
dc.type |
corpus |
metashare.ResourceInfo#ContentInfo.mediaType |
text |
has.files |
yes |
branding |
CLARIN.SI data & tools |
contact.person |
Matej Klemen matej.klemen@ff.uni-lj.si Centre for Slovene as a Second and Foreign Language, University of Ljubljana |
sponsor |
Jožef Stefan Institute CLARIN CLARIN.SI nationalFunds |
sponsor |
ARRS (Slovenian Research Agency) J7-3159 Empirical foundations for digitally-supported development of writing skills nationalFunds |
sponsor |
ARRS (Slovenian Research Agency) P6-0411 Language Resources and Technologies for Slovene nationalFunds |
sponsor |
ARRS (Slovenian Research Agency) P2-0103 Knowledge Technologies nationalFunds |
sponsor |
Centre for Slovene as a Second and Foreign Language, University of Ljubljana - KUUS ownFunds |
size.info |
520796 words |
files.count |
1 |
files.size |
56636904 |