Files in this item
This item is Publicly Available and licensed under:
Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
- Name
- ParlaSpeech-HR.v2.0.jsonl.gz
- Size
- 362.17 MB
- Format
- application/gzip
- Description
- Corpus text in gzipped JSON Lines format
- MD5
- bfdad5b7a3fc1a5f42e2e00b6fdd999f
- Name
- ParlaSpeech-HR.v2.0.part1.tgz
- Size
- 30.48 GB
- Format
- Unknown
- Description
- Speech in FLAC format, part 1
- MD5
- 065b28dab675a9fa7b96e4aa2f37418b
- Name
- ParlaSpeech-HR.v2.0.part2.tgz
- Size
- 42.37 GB
- Format
- Unknown
- Description
- Speech in FLAC format, part 2
- MD5
- 53a37542cfe6e860eefee48caf180d66
- Name
- ParlaSpeech-HR.v2.0.part3.tgz
- Size
- 37.61 GB
- Format
- Unknown
- Description
- Speech in FLAC format, part 3
- MD5
- e41cc3aa0d8b54c82b3250021ed4bf88
- Name
- ParlaSpeech-HR.v2.0.part4.tgz
- Size
- 41.48 GB
- Format
- Unknown
- Description
- Speech in FLAC format, part 4
- MD5
- 5b618ca214c3f846f4d1d46386253c18
- Name
- ParlaSpeech-HR.v2.0.part5.tgz
- Size
- 50.13 GB
- Format
- Unknown
- Description
- Speech in FLAC format, part 5
- MD5
- 9cbc3155cde96d8b9e0359745820febc
- Name
- ParlaSpeech-HR.v2.0.part6.tgz
- Size
- 4.91 GB
- Format
- Unknown
- Description
- Speech in FLAC format, part 6
- MD5
- ea859bacdbbb236c5b13f4bba6a4122f
- Name
- README.txt
- Size
- 1023 bytes
- Format
- Text file
- Description
- Description of the corpus format
- MD5
- 7baa432c16d1480a961fd52ab5a95e97
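Because the archives are large (up to about 50 GB each), it may be worth checking each download against the MD5 values listed above before unpacking. The following is a minimal sketch, assuming the files were saved under their listed names; the helper names `md5_of_file` and `verify` are illustrative and not part of the dataset.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file listing above.
EXPECTED_MD5 = {
    "ParlaSpeech-HR.v2.0.jsonl.gz": "bfdad5b7a3fc1a5f42e2e00b6fdd999f",
    "ParlaSpeech-HR.v2.0.part1.tgz": "065b28dab675a9fa7b96e4aa2f37418b",
    "ParlaSpeech-HR.v2.0.part2.tgz": "53a37542cfe6e860eefee48caf180d66",
    "ParlaSpeech-HR.v2.0.part3.tgz": "e41cc3aa0d8b54c82b3250021ed4bf88",
    "ParlaSpeech-HR.v2.0.part4.tgz": "5b618ca214c3f846f4d1d46386253c18",
    "ParlaSpeech-HR.v2.0.part5.tgz": "9cbc3155cde96d8b9e0359745820febc",
    "ParlaSpeech-HR.v2.0.part6.tgz": "ea859bacdbbb236c5b13f4bba6a4122f",
    "README.txt": "7baa432c16d1480a961fd52ab5a95e97",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path):
    """Compare a downloaded file against the listed checksum for its file name.

    Raises KeyError if the file name is not one of the listed files.
    """
    expected = EXPECTED_MD5[Path(path).name]
    return md5_of_file(path) == expected
```

Calling `verify("ParlaSpeech-HR.v2.0.part1.tgz")` returns `True` only if the local file matches the published checksum; a `False` result most likely indicates an incomplete or corrupted download.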
ASR training dataset for Croatian ParlaSpeech-HR v2.0
http://hdl.handle.net/11356/1914

The ParlaSpeech-HR.v2.0.jsonl (JSON Lines) file consists of entries with the following attributes:

- id: ParlaMint utterance ID with zero-based character offsets pointing to the specific part of the utterance
- words: list of character and millisecond offsets to specific words in the transcript, especially useful for further segmentation of each entry
- audio: path to the FLAC file (available from the part*.tgz files), the folder name corresponding to the YouTube video ID
- audio_length: length of the recording in seconds
- text: transcript of the audio
- text_start: starting character position in the original ParlaMint 4.0 utterance
- text_end: ending character position in the original ParlaMint 4.0 utterance
- audio_start: starting millisecond position in the original YouTube video
- audio_end: ending millisecond position in the original YouTube video
- speaker_info: full information on the speaker (and speech) from th . . .
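The attributes described in the README can be read directly from the gzipped JSON Lines file without unpacking it first. The sketch below assumes each line is one JSON object with the attribute names listed in the README; the function names are illustrative, and the internal structure of `words` and `speaker_info` is not assumed here.

```python
import gzip
import json


def iter_entries(path):
    """Yield one dict per non-empty line of a gzipped JSON Lines file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def total_audio_hours(entries):
    """Sum the `audio_length` attribute (seconds, per the README) in hours."""
    return sum(entry["audio_length"] for entry in entries) / 3600.0


def span_in_source_video_seconds(entry):
    """Length of the span in the original video, from the millisecond offsets.

    Assumption: this should be close to `audio_length`, but the README
    excerpt does not state that explicitly.
    """
    return (entry["audio_end"] - entry["audio_start"]) / 1000.0
```

For example, `total_audio_hours(iter_entries("ParlaSpeech-HR.v2.0.jsonl.gz"))` streams the file once and reports the total duration without loading all entries into memory.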