dc.contributor.author |
Jelovšek, Tjaša |
dc.contributor.author |
Lebar Bajec, Iztok |
dc.contributor.author |
Bajec, Marko |
dc.contributor.author |
Bajec, Žan |
dc.contributor.author |
Cvek, Jernej |
dc.date.accessioned |
2022-12-02T10:50:58Z |
dc.date.available |
2022-12-02T10:50:58Z |
dc.date.issued |
2022-12-01 |
dc.identifier.uri |
http://hdl.handle.net/11356/1742 |
dc.description |
The Slovene Text Normalizator converts Slovene text from its written form into its spoken form. Text normalisation is traditionally an essential preprocessing step before text-to-speech (TTS) synthesis. The tool accepts the input text as a string and returns a dictionary with the fields "input_text", "normalized_text", "status" and "logs". Example:
normalize_text("Sodobna definicija Celzijeve temperaturne lestvice, ki velja od leta 1954, je, da je temperatura trojne točke vode enaka 0,01 °C.")
{'input_text': 'Sodobna definicija Celzijeve temperaturne lestvice, ki velja od leta 1954, je, da je temperatura trojne točke vode enaka 0,01 °C.', 'normalized_text': 'Sodobna definicija Celzijeve temperaturne lestvice, ki velja od leta tisoč devetsto štiriinpetdeset, je, da je temperatura trojne točke vode enaka nič celih nič ena stopinje Celzija.', 'status': 1, 'logs': [('1954', 'tisoč devetsto štiriinpetdeset'), ('0,01', 'nič celih nič ena'), ('°C', 'stopinje Celzija')]}
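The returned dictionary can be inspected with ordinary Python. The following is a minimal sketch, not part of the tool: it copies the result from the example above as a literal (a real run would obtain it from normalize_text, whose import path is not given in this record), and the helpers describe_replacements and logs_are_consistent are hypothetical names introduced only for illustration. It assumes, as the example output suggests, that "logs" holds (written, spoken) pairs.

```python
# Result copied from the example above. In real use it would be returned by
# normalize_text(...); the import path is not given here, so it is not called.
result = {
    "input_text": "Sodobna definicija Celzijeve temperaturne lestvice, ki velja od leta 1954, je, da je temperatura trojne točke vode enaka 0,01 °C.",
    "normalized_text": "Sodobna definicija Celzijeve temperaturne lestvice, ki velja od leta tisoč devetsto štiriinpetdeset, je, da je temperatura trojne točke vode enaka nič celih nič ena stopinje Celzija.",
    "status": 1,
    "logs": [
        ("1954", "tisoč devetsto štiriinpetdeset"),
        ("0,01", "nič celih nič ena"),
        ("°C", "stopinje Celzija"),
    ],
}


def describe_replacements(result):
    """Hypothetical helper: format each (written, spoken) pair from 'logs'."""
    return [f"{written} -> {spoken}" for written, spoken in result["logs"]]


def logs_are_consistent(result):
    """Hypothetical helper: check that every logged written form occurs in
    'input_text' and every logged spoken form occurs in 'normalized_text'."""
    return all(
        written in result["input_text"] and spoken in result["normalized_text"]
        for written, spoken in result["logs"]
    )


for line in describe_replacements(result):
    print(line)
print("logs consistent:", logs_are_consistent(result))
```

A check like logs_are_consistent is only a sanity test on the sample shown here; the meaning of the "status" values is not stated in this description, so the sketch reads the field without interpreting it.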
For further details see README.md. |
dc.language.iso |
slv |
dc.publisher |
Faculty of Computer and Information Science, University of Ljubljana |
dc.relation.isreferencedby |
https://rsdo.slovenscina.eu/en/speech-technologies |
dc.rights |
Apache License 2.0 |
dc.rights.uri |
https://opensource.org/licenses/Apache-2.0 |
dc.rights.label |
PUB |
dc.source.uri |
https://github.com/clarinsi/Slovene_normalizator |
dc.subject |
text normalisation |
dc.title |
Slovene Text Normalizator RSDO-DS2-NORM 1.0 |
dc.type |
toolService |
metashare.ResourceInfo#ContentInfo.detailedType |
tool |
metashare.ResourceInfo#ResourceComponentType#ToolServiceInfo.languageDependent |
true |
has.files |
yes |
branding |
CLARIN.SI data & tools |
demo.uri |
https://www.slovenscina.eu/en/razpoznavalnik |
contact.person |
Iztok Lebar Bajec ilb@fri.uni-lj.si Faculty of Computer and Information Science, University of Ljubljana |
sponsor |
Ministry of Culture C3340-20-278001 Development of Slovene in a Digital Environment Other |
files.count |
1 |
files.size |
145991680 |