Linguistically annotated multilingual comparable corpora of parliamentary debates ParlaMint.ana 4.1

Erjavec, Tomaž; Kopp, Matyáš; Ogrodniczuk, Maciej; Osenova, Petya; Agerri, Rodrigo; Agirrezabal, Manex; Agnoloni, Tommaso; Aires, José; Albini, Monica; Alkorta, Jon; Antiba-Cartazo, Iván; Arrieta, Ekain; Barcala, Mario; Bardanca, Daniel; Barkarson, Starkaður; Bartolini, Roberto; Battistoni, Roberto; Bel, Nuria; Bonet Ramos, Maria del Mar; Calzada Pérez, María; Cardoso, Aida; Çöltekin, Çağrı; Coole, Matthew; Darģis, Roberts; de Does, Jesse; de Libano, Ruben; Depoorter, Griet; Depuydt, Katrien; Diwersy, Sascha; Dodé, Réka; Fernandez, Kike; Fernández Rei, Elisa; Frontini, Francesca; Garcia, Marcos; García Díaz, Noelia; García Louzao, Pedro; Gavriilidou, Maria; Gkoumas, Dimitris; Grigorov, Ilko; Grigorova, Vladislava; Haltrup Hansen, Dorte; Iruskieta, Mikel; Jarlbrink, Johan; Jelencsik-Mátyus, Kinga; Jongejan, Bart; Kahusk, Neeme; Kirnbauer, Martin; Kryvenko, Anna; Ligeti-Nagy, Noémi; Ljubešić, Nikola; Luxardo, Giancarlo; Magariños, Carmen; Magnusson, Måns; Marchetti, Carlo; Marx, Maarten; Meden, Katja; Mendes, Amália; Mochtak, Michal; Mölder, Martin; Montemagni, Simonetta; Navarretta, Costanza; Nitoń, Bartłomiej; Norén, Fredrik Mohammadi; Nwadukwe, Amanda; Ojsteršek, Mihael; Pančur, Andrej; Papavassiliou, Vassilis; Pereira, Rui; Pérez Lago, María; Piperidis, Stelios; Pirker, Hannes; Pisani, Marilina; Pol, Henk van der; Prokopidis, Prokopis; Quochi, Valeria; Rayson, Paul; Regueira, Xosé Luís; Rii, Andriana; Rudolf, Michał; Ruisi, Manuela; Rupnik, Peter; Schopper, Daniel; Simov, Kiril; Sinikallio, Laura; Skubic, Jure; Tamper, Minna; Tungland, Lars Magne; Tuominen, Jouni; van Heusden, Ruben; Varga, Zsófia; Vázquez Abuín, Marta; Venturi, Giulia; Vidal Miguéns, Adrián; Vider, Kadri; Vivel Couso, Ainhoa; Vladu, Adina Ioana; Wissik, Tanja; Yrjänäinen, Väinö; Zevallos, Rodolfo; Fišer, Darja

dc.contributor.author	Erjavec, Tomaž
dc.contributor.author	Kopp, Matyáš
dc.contributor.author	Ogrodniczuk, Maciej
dc.contributor.author	Osenova, Petya
dc.contributor.author	Agerri, Rodrigo
dc.contributor.author	Agirrezabal, Manex
dc.contributor.author	Agnoloni, Tommaso
dc.contributor.author	Aires, José
dc.contributor.author	Albini, Monica
dc.contributor.author	Alkorta, Jon
dc.contributor.author	Antiba-Cartazo, Iván
dc.contributor.author	Arrieta, Ekain
dc.contributor.author	Barcala, Mario
dc.contributor.author	Bardanca, Daniel
dc.contributor.author	Barkarson, Starkaður
dc.contributor.author	Bartolini, Roberto
dc.contributor.author	Battistoni, Roberto
dc.contributor.author	Bel, Nuria
dc.contributor.author	Bonet Ramos, Maria del Mar
dc.contributor.author	Calzada Pérez, María
dc.contributor.author	Cardoso, Aida
dc.contributor.author	Çöltekin, Çağrı
dc.contributor.author	Coole, Matthew
dc.contributor.author	Darģis, Roberts
dc.contributor.author	de Does, Jesse
dc.contributor.author	de Libano, Ruben
dc.contributor.author	Depoorter, Griet
dc.contributor.author	Depuydt, Katrien
dc.contributor.author	Diwersy, Sascha
dc.contributor.author	Dodé, Réka
dc.contributor.author	Fernandez, Kike
dc.contributor.author	Fernández Rei, Elisa
dc.contributor.author	Frontini, Francesca
dc.contributor.author	Garcia, Marcos
dc.contributor.author	García Díaz, Noelia
dc.contributor.author	García Louzao, Pedro
dc.contributor.author	Gavriilidou, Maria
dc.contributor.author	Gkoumas, Dimitris
dc.contributor.author	Grigorov, Ilko
dc.contributor.author	Grigorova, Vladislava
dc.contributor.author	Haltrup Hansen, Dorte
dc.contributor.author	Iruskieta, Mikel
dc.contributor.author	Jarlbrink, Johan
dc.contributor.author	Jelencsik-Mátyus, Kinga
dc.contributor.author	Jongejan, Bart
dc.contributor.author	Kahusk, Neeme
dc.contributor.author	Kirnbauer, Martin
dc.contributor.author	Kryvenko, Anna
dc.contributor.author	Ligeti-Nagy, Noémi
dc.contributor.author	Ljubešić, Nikola
dc.contributor.author	Luxardo, Giancarlo
dc.contributor.author	Magariños, Carmen
dc.contributor.author	Magnusson, Måns
dc.contributor.author	Marchetti, Carlo
dc.contributor.author	Marx, Maarten
dc.contributor.author	Meden, Katja
dc.contributor.author	Mendes, Amália
dc.contributor.author	Mochtak, Michal
dc.contributor.author	Mölder, Martin
dc.contributor.author	Montemagni, Simonetta
dc.contributor.author	Navarretta, Costanza
dc.contributor.author	Nitoń, Bartłomiej
dc.contributor.author	Norén, Fredrik Mohammadi
dc.contributor.author	Nwadukwe, Amanda
dc.contributor.author	Ojsteršek, Mihael
dc.contributor.author	Pančur, Andrej
dc.contributor.author	Papavassiliou, Vassilis
dc.contributor.author	Pereira, Rui
dc.contributor.author	Pérez Lago, María
dc.contributor.author	Piperidis, Stelios
dc.contributor.author	Pirker, Hannes
dc.contributor.author	Pisani, Marilina
dc.contributor.author	Pol, Henk van der
dc.contributor.author	Prokopidis, Prokopis
dc.contributor.author	Quochi, Valeria
dc.contributor.author	Rayson, Paul
dc.contributor.author	Regueira, Xosé Luís
dc.contributor.author	Rii, Andriana
dc.contributor.author	Rudolf, Michał
dc.contributor.author	Ruisi, Manuela
dc.contributor.author	Rupnik, Peter
dc.contributor.author	Schopper, Daniel
dc.contributor.author	Simov, Kiril
dc.contributor.author	Sinikallio, Laura
dc.contributor.author	Skubic, Jure
dc.contributor.author	Tamper, Minna
dc.contributor.author	Tungland, Lars Magne
dc.contributor.author	Tuominen, Jouni
dc.contributor.author	van Heusden, Ruben
dc.contributor.author	Varga, Zsófia
dc.contributor.author	Vázquez Abuín, Marta
dc.contributor.author	Venturi, Giulia
dc.contributor.author	Vidal Miguéns, Adrián
dc.contributor.author	Vider, Kadri
dc.contributor.author	Vivel Couso, Ainhoa
dc.contributor.author	Vladu, Adina Ioana
dc.contributor.author	Wissik, Tanja
dc.contributor.author	Yrjänäinen, Väinö
dc.contributor.author	Zevallos, Rodolfo
dc.contributor.author	Fišer, Darja
dc.date.accessioned	2024-06-04T18:47:44Z
dc.date.available	2024-06-04T18:47:44Z
dc.date.issued	2024-06-03
dc.identifier.uri	http://hdl.handle.net/11356/1911
dc.description	ParlaMint 4.1 is a set of comparable corpora containing transcriptions of parliamentary debates of 29 European countries and autonomous regions, mostly starting in 2015 and extending to mid-2022. The individual corpora comprise between 9 and 126 million words and the complete set contains over 1.2 billion words. The transcriptions are divided by days with information on the term, session and meeting, and contain speeches marked by the speaker and their role (e.g. chair, regular speaker). The speeches also contain marked-up transcriber comments, such as gaps in the transcription, interruptions, applause, etc. The corpora have extensive metadata, most importantly on speakers (name, gender, MP and minister status, party affiliation), on their political parties and parliamentary groups (name, coalition/opposition status, Wikipedia-sourced left-to-right political orientation, and CHES variables, https://www.chesdata.eu/). Note that some corpora have further metadata, e.g. the year of birth of the speakers, links to their Wikipedia articles, their membership in various committees, etc. The transcriptions are also marked with the subcorpora they belong to ("reference", until 2020-01-30, "covid", from 2020-01-31, and "war", from 2022-02-24). An overview of the statistics of the corpora is available on GitHub in the folder Build/Metadata, in particular for the release 4.1 at https://github.com/clarin-eric/ParlaMint/tree/v4.1/Build/Metadata. The corpora are encoded according to the ParlaMint encoding guidelines (https://clarin-eric.github.io/ParlaMint/) and schemas (included in the distribution). The ParlaMint.ana linguistic annotation includes tokenization; sentence segmentation; lemmatisation; Universal Dependencies part-of-speech, morphological features, and syntactic dependencies; and the 4-class CoNLL-2003 named entities.
Some corpora also have further linguistic annotations, in particular PoS tagging according to a language-specific scheme, with their corpus TEI headers giving further details on the annotation vocabularies and tools used. This entry contains the ParlaMint.ana TEI-encoded linguistically annotated corpora; the derived CoNLL-U files along with TSV metadata of the speeches; and the derived vertical files (with their registry file), suitable for use with CQP-based concordancers, such as CWB, noSketch Engine or KonText. Also included is the 4.1 release of the sample data and scripts available at the GitHub repository of the ParlaMint project at https://github.com/clarin-eric/ParlaMint and the log files produced in the process of building the corpora for this release. The log files show e.g. known errors in the corpora, while more information about known problems is available in the open issues at the GitHub repository of the project. This entry contains the linguistically marked-up version of the corpus, while the text version, i.e. without the linguistic annotation, is also available at http://hdl.handle.net/11356/1912. Another related resource, namely the ParlaMint corpora machine translated to English, ParlaMint-en.ana 4.1, can be found at http://hdl.handle.net/11356/1910. As opposed to the previous version 4.0, this version fixes a number of bugs and restructures the ParlaMint GitHub repository. The DK corpus has been linguistically re-annotated to remove bugs, while its speeches are now also marked with topics. The PT corpus has been extended to 2024-03 and the UA corpus to 2023-11; the UA corpus also has improved language marking (uk vs. ru) on segments.
dc.language.iso	bos
dc.language.iso	bul
dc.language.iso	cat
dc.language.iso	hrv
dc.language.iso	ces
dc.language.iso	dan
dc.language.iso	nld
dc.language.iso	eng
dc.language.iso	est
dc.language.iso	fra
dc.language.iso	glg
dc.language.iso	deu
dc.language.iso	hun
dc.language.iso	isl
dc.language.iso	ita
dc.language.iso	lav
dc.language.iso	ell
dc.language.iso	nor
dc.language.iso	pol
dc.language.iso	por
dc.language.iso	rus
dc.language.iso	srp
dc.language.iso	slv
dc.language.iso	spa
dc.language.iso	swe
dc.language.iso	tur
dc.language.iso	ukr
dc.language.iso	fin
dc.language.iso	eus
dc.publisher	CLARIN ERIC
dc.relation.isreferencedby	https://doi.org/10.1007/s10579-024-09798-w
dc.relation.replaces	http://hdl.handle.net/11356/1860
dc.relation.isreplacedby	http://hdl.handle.net/11356/2005
dc.rights	Creative Commons - Attribution 4.0 International (CC BY 4.0)
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/
dc.rights.label	PUB
dc.source.uri	https://www.clarin.eu/content/parlamint
dc.subject	Parla-CLARIN
dc.subject	parliamentary debates
dc.subject	COVID-19
dc.subject	TEI
dc.subject	Bulgarian Parliament
dc.subject	Croatian Parliament
dc.subject	Polish Parliament
dc.subject	Slovenian Parliament
dc.subject	Czech Parliament
dc.subject	Icelandic Parliament
dc.subject	Belgian Parliament
dc.subject	Danish Parliament
dc.subject	Spanish Parliament
dc.subject	Dutch Parliament
dc.subject	Turkish Parliament
dc.subject	Italian Parliament
dc.subject	Hungarian Parliament
dc.subject	Latvian Parliament
dc.subject	French Parliament
dc.subject	Bosnian Parliament
dc.subject	Catalonian Parliament
dc.subject	Galician Parliament
dc.subject	Greek Parliament
dc.subject	Norwegian Parliament
dc.subject	Serbian Parliament
dc.subject	Swedish Parliament
dc.subject	Ukrainian Parliament
dc.subject	Finnish Parliament
dc.subject	Estonian Parliament
dc.subject	Basque Parliament
dc.subject	Portuguese Parliament
dc.subject	Austrian Parliament
dc.subject	UK Parliament
dc.title	Linguistically annotated multilingual comparable corpora of parliamentary debates ParlaMint.ana 4.1
dc.type	corpus
metashare.ResourceInfo#ContentInfo.mediaType	text
has.files	yes
branding	CLARIN.SI data & tools
demo.uri	https://github.com/clarin-eric/ParlaMint/
contact.person	Tomaž Erjavec tomaz.erjavec@ijs.si Jožef Stefan Institute
contact.person	Matyáš Kopp kopp@ufal.mff.cuni.cz Charles University
sponsor	CLARIN ERIC - ParlaMint: Towards Comparable Parliamentary Corpora Other
sponsor	Austrian Academy of Sciences - ÖAW nationalFunds
sponsor	European Commission POIR.04.02.00-00C002/19 European Regional Development Fund as a part of the 2014-2020 Smart Growth Operational Programme, CLARIN – Common Language Resources and Technology Infrastructure Other
sponsor	Dutch Language Institute - - nationalFunds
sponsor	Ministry of Education, Youth and Sports of the Czech Republic LM2023062 LINDAT/CLARIAH-CZ: Digital Research Infrastructure for Language Technologies, Arts and Humanities nationalFunds
sponsor	Department of Nordic Studies and Linguistics (NorS), University of Copenhagen CLARIN-DK CLARIN-DK nationalFunds
sponsor	Galician Language Institute, University of Santiago de Compostela - - ownFunds
sponsor	Xunta de Galicia - University of Santiago de Compostela 2021-CP080 Nós: Galician in the society and economy of artificial intelligence (2021-CP080), agreement between Xunta de Galicia and the University of Santiago de Compostela nationalFunds
sponsor	Hungarian Research Centre for Linguistics - - nationalFunds
sponsor	National Library of Norway - - nationalFunds
sponsor	Institute of Computer Science, Polish Academy of Sciences - - nationalFunds
sponsor	Polish Ministry of Education and Science 2022/WK/09 National contribution to CLARIN ERIC – European Research Infrastructure Consortium: Common Language Resources and Technology Infrastructure 2022–2023 (CLARIN Q) nationalFunds
sponsor	Fundação para a Ciência e a Tecnologia UIDP/00214/2020 - nationalFunds
sponsor	Jožef Stefan Institute CLARIN CLARIN.SI nationalFunds
sponsor	ARRS (Slovenian Research Agency) P6-0411 Language Resources and Technologies for Slovene nationalFunds
sponsor	Nederlandse Organisatie voor Wetenschappelijk Onderwijs CISC.CC.016 Access to City Councils using Exploratory Search Systems nationalFunds
sponsor	Bulgarian Ministry of Education and Science DO1-301/17.12.21 Bulgarian National Interdisciplinary Research e-Infrastructure for Resources and Technologies in favor of the Bulgarian Language and Cultural Heritage, part of the EU infrastructures CLARIN and DARIAH nationalFunds
sponsor	Institute for Language and Speech Processing / ATHENA RC - - nationalFunds
sponsor	ARRS (Slovenian Research Agency) J7-4642 MEZZANINE nationalFunds
sponsor	The Árni Magnússon Institute for Icelandic Studies - - ownFunds
sponsor	Slovenian Research Agency (ARRS) P6-0436 Basic national research program 'Digital Humanities' (2022-2027) nationalFunds
sponsor	ARRS (Slovenian Research Agency) N6-0099 Flemish-Slovenian bilateral basic research project ‘Linguistic landscape of hate speech online’ (2019-2023) nationalFunds
sponsor	ARRS (Slovenian Research Agency) N6-0288 the MSCA Seal of Excellence postdoctoral project 'The Changing Discursive Semantics of EU Representations' (2022-2024) nationalFunds
sponsor	Ministry of Science and Innovation of Spain - - nationalFunds
sponsor	HiTZ - Ixa Group (UPV/EHU) - - Other
sponsor	Spanish Ministry of Science and Innovation PID2019-108866RB-I0 / AEI / 10.13039/501100011033 Original, translated and interpreted representations of the refugee cris(e)s: methodological triangulation within corpus-based discourse studies nationalFunds
size.info	8132022 utterances
size.info	1231036093 words
files.count	31
files.size	70832756094
featuredService.noske	Joint 4.1 corpus|https://www.clarin.si/ske/#dashboard?corpname=parlamint41_xx
featuredService.noske	Austrian corpus|https://www.clarin.si/ske/#dashboard?corpname=parlamint41_at
featuredService.noske	Bosnian corpus|https://www.clarin.si/ske/#dashboard?corpname=parlamint41_ba
featuredService.noske	Belgian corpus|https://www.clarin.si/ske/#dashboard?corpname=parlamint41_be
featuredService.noske	Bulgarian corpus|https://www.clarin.si/ske/#dashboard?corpname=parlamint41_bg
featuredService.noske	Czech corpus|https://www.clarin.si/ske/#dashboard?corpname=parlamint41_cz
featuredService.noske	Danish corpus|https://www.clarin.si/ske/#dashboard?corpname=parlamint41_dk
featuredService.noske	Estonian corpus|https://www.clarin.si/ske/#dashboard?corpname=parlamint41_ee
featuredService.noske	Spanish corpus|https://www.clarin.si/ske/#dashboard?corpname=parlamint41_es
featuredService.noske	Catalan corpus|https://www.clarin.si/ske/#dashboard?corpname=parlamint41_es_ct
featuredService.noske	Galician corpus|https://www.clarin.si/ske/#dashboard?corpname=parlamint41_es_ga
featuredService.noske	Basque corpus|https://www.clarin.si/ske/#dashboard?corpname=parlamint41_es_pv
featuredService.noske	Finnish corpus|https://www.clarin.si/ske/#dashboard?corpname=parlamint41_fi
featuredService.noske	French corpus|https://www.clarin.si/ske/#dashboard?corpname=parlamint41_fr
featuredService.noske	British corpus|https://www.clarin.si/ske/#dashboard?corpname=parlamint41_gb
featuredService.noske	Greek corpus|https://www.clarin.si/ske/#dashboard?corpname=parlamint41_gr
featuredService.noske	Croatian corpus|https://www.clarin.si/ske/#dashboard?corpname=parlamint41_hr
featuredService.noske	Hungarian corpus|https://www.clarin.si/ske/#dashboard?corpname=parlamint41_hu
featuredService.noske	Icelandic corpus|https://www.clarin.si/ske/#dashboard?corpname=parlamint41_is
featuredService.noske	Italian corpus|https://www.clarin.si/ske/#dashboard?corpname=parlamint41_it
featuredService.noske	Latvian corpus|https://www.clarin.si/ske/#dashboard?corpname=parlamint41_lv
featuredService.noske	Dutch corpus|https://www.clarin.si/ske/#dashboard?corpname=parlamint41_nl
featuredService.noske	Norwegian corpus|https://www.clarin.si/ske/#dashboard?corpname=parlamint41_no
featuredService.noske	Polish corpus|https://www.clarin.si/ske/#dashboard?corpname=parlamint41_pl
featuredService.noske	Portuguese corpus|https://www.clarin.si/ske/#dashboard?corpname=parlamint41_pt
featuredService.noske	Serbian corpus|https://www.clarin.si/ske/#dashboard?corpname=parlamint41_rs
featuredService.noske	Swedish corpus|https://www.clarin.si/ske/#dashboard?corpname=parlamint41_se
featuredService.noske	Slovenian corpus|https://www.clarin.si/ske/#dashboard?corpname=parlamint41_si
featuredService.noske	Turkish corpus|https://www.clarin.si/ske/#dashboard?corpname=parlamint41_tr
featuredService.noske	Ukrainian corpus|https://www.clarin.si/ske/#dashboard?corpname=parlamint41_ua
featuredService.teitok	ParlaMint 4.1|https://lindat.mff.cuni.cz/services/teitok/parlamint-41/
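
The subcorpus boundaries given in the description can be expressed as a small date check. The sketch below is illustrative Python, not the ParlaMint build script; the function name `subcorpora_for` is hypothetical, and it assumes the ranges apply exactly as stated, so that a date on or after 2022-02-24 falls under both "covid" and "war" because no end date is given for "covid".

```python
from datetime import date

# Boundary dates copied from the description of this entry.
REFERENCE_END = date(2020, 1, 30)
COVID_START = date(2020, 1, 31)
WAR_START = date(2022, 2, 24)


def subcorpora_for(day: date) -> list[str]:
    """Return the subcorpus labels whose stated date range contains `day`.

    Assumption: the ranges are read literally, so "covid" has no end date
    and overlaps with "war".
    """
    labels = []
    if day <= REFERENCE_END:
        labels.append("reference")
    if day >= COVID_START:
        labels.append("covid")
    if day >= WAR_START:
        labels.append("war")
    return labels
```

Under these assumptions, `subcorpora_for(date(2019, 5, 1))` yields only the "reference" label, while a sitting held on 2022-03-01 would carry both "covid" and "war". The corpus files themselves remain the authority on how each transcription is labelled.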

Files in this item

This item is Publicly Available and licensed under:
Creative Commons - Attribution 4.0 International (CC BY 4.0)

Name: ParlaMint-AT.ana.tgz
Size: 3.27 GB
Format: Unknown
Description: Austrian corpus
MD5: 78f9130e33733d805e75da9049c5b863

Name: ParlaMint-BA.ana.tgz
Size: 996.43 MB
Format: Unknown
Description: Bosnian corpus
MD5: 1f05074a95668ff0b96acd43b7283f74

Name: ParlaMint-BE.ana.tgz
Size: 2.78 GB
Format: Unknown
Description: Belgian corpus
MD5: 9b8ee3913b35d0dca9a20934d55b93ed

Name: ParlaMint-BG.ana.tgz
Size: 1.5 GB
Format: Unknown
Description: Bulgarian corpus
MD5: ae44c0153df36efabbfcc783717dd37a

Name: ParlaMint-CZ.ana.tgz
Size: 1.85 GB
Format: Unknown
Description: Czech corpus
MD5: c2784ebdafd94b188f4eae73af8223fc

Name: ParlaMint-DK.ana.tgz
Size: 1.85 GB
Format: Unknown
Description: Danish corpus
MD5: 1abbd96121881f18af9bf45a426b2689

Name: ParlaMint-EE.ana.tgz
Size: 1.25 GB
Format: Unknown
Description: Estonian corpus
MD5: 6d71a849b3d395f988f7300db04c6fc7

Name: ParlaMint-ES.ana.tgz
Size: 966.93 MB
Format: Unknown
Description: Spanish corpus
MD5: 4f6972487be30a83c1fe6105870752c3

Name: ParlaMint-ES-CT.ana.tgz
Size: 967.05 MB
Format: Unknown
Description: Catalan corpus
MD5: 40a5e16dd61d6cda1d572249f7c41d16

Name: ParlaMint-ES-GA.ana.tgz
Size: 916.22 MB
Format: Unknown
Description: Galician corpus
MD5: 8f4c727d94705143f51b5fad52e769b9

Name: ParlaMint-ES-PV.ana.tgz
Size: 859.77 MB
Format: Unknown
Description: Basque corpus
MD5: 7c2a100171a99640656fca936b13cecf

Name: ParlaMint-FI.ana.tgz
Size: 845.5 MB
Format: Unknown
Description: Finnish corpus
MD5: 61921c5cb7283f1af91739b05fe5ee2b

Name: ParlaMint-FR.ana.tgz
Size: 2.41 GB
Format: Unknown
Description: French corpus
MD5: 0e65bd7a72b31b0c39ecfac66043e200

Name: ParlaMint-GB.ana.tgz
Size: 5.59 GB
Format: Unknown
Description: British corpus
MD5: c48c4e0481ddc5a5715723193bf909e1

Name: ParlaMint-GR.ana.tgz
Size: 2.94 GB
Format: Unknown
Description: Greek corpus
MD5: c008600bca2cca1e268a64f36223b1b4

Name: ParlaMint-HR.ana.tgz
Size: 4.57 GB
Format: Unknown
Description: Croatian corpus
MD5: c66e01eb321ae299f28d383416a64025

Name: ParlaMint-HU.ana.tgz
Size: 1.68 GB
Format: Unknown
Description: Hungarian corpus
MD5: 59e72bc21f4f24b84daa24b02a69eed5

Name: ParlaMint-IS.ana.tgz
Size: 1.48 GB
Format: Unknown
Description: Icelandic corpus
MD5: 4aa2bae3200dccfe49e9d3772b30a3eb

Name: ParlaMint-IT.ana.tgz
Size: 1.65 GB
Format: Unknown
Description: Italian corpus
MD5: b8a25860b9018ab9083a727f66819778

Name: ParlaMint-LV.ana.tgz
Size: 562.82 MB
Format: Unknown
Description: Latvian corpus
MD5: 71d5b15cc94a1b2997833a12c655b423

Name: ParlaMint-NL.ana.tgz
Size: 3.04 GB
Format: Unknown
Description: Dutch corpus
MD5: 3653aa4375200958e8e886ec57dc99f1

Name: ParlaMint-NO.ana.tgz
Size: 4.08 GB
Format: Unknown
Description: Norwegian corpus
MD5: b6232de904b63db366863f890446f28f

Name: ParlaMint-PL.ana.tgz
Size: 2.07 GB
Format: Unknown
Description: Polish corpus
MD5: c5c3e8fd5a15308fb63facc8a04e6b6f

Name: ParlaMint-PT.ana.tgz
Size: 1.15 GB
Format: Unknown
Description: Portuguese corpus
MD5: 6e6e45bc6ace03fd6b7d67292e0df6a8

Name: ParlaMint-RS.ana.tgz
Size: 4.39 GB
Format: Unknown
Description: Serbian corpus
MD5: 805610f0126213cf92811be4bd128077

Name: ParlaMint-SE.ana.tgz
Size: 2.32 GB
Format: Unknown
Description: Swedish corpus
MD5: 5fd1647243c23e078bc6c6b8f0204dbb

Name: ParlaMint-SI.ana.tgz
Size: 3.86 GB
Format: Unknown
Description: Slovenian corpus
MD5: bb483408e241c364e5f0d975cc2bc587

Name: ParlaMint-TR.ana.tgz
Size: 2.82 GB
Format: Unknown
Description: Turkish corpus
MD5: 16a86ab87bb9762f87c565b11d54c70f

Name: ParlaMint-UA.ana.tgz
Size: 3.39 GB
Format: Unknown
Description: Ukrainian corpus
MD5: d5fe45f11bbebf05883a66c60e690f89

Name: ParlaMint-4.1.tgz
Size: 18.77 MB
Format: Unknown
Description: https://github.com/clarin-eric/ParlaMint/releases/tag/v4.1 (samples, schemas, scripts)
MD5: 91929b37c965a5c6591b1cf2eda271ea

Name: ParlaMint-4.1-Logs.tgz
Size: 23.36 MB
Format: Unknown
Description: Build log files of the corpora
MD5: 4c2f2b7d5394eceab9f7dbf5a217b55a
