
dc.contributor: Universitat Ramon Llull. La Salle
dc.contributor.author: Aragó, Joan
dc.contributor.author: Freixes, Marc
dc.date.accessioned: 2026-03-17T19:34:48Z
dc.date.available: 2026-03-17T19:34:48Z
dc.date.created: 2024
dc.date.issued: 2024-09-25
dc.identifier.isbn: 9781643685434 [ca]
dc.identifier.issn: 1879-8314 [ca]
dc.identifier.uri: http://hdl.handle.net/20.500.14342/6074
dc.description.abstract: The growing demand for high-quality speech synthesis systems in minority languages presents a notable challenge for researchers. In response, this study focuses on synthesizing Valencian speech to develop an effective text-to-speech system for this linguistic variety. A meticulously recorded corpus, comprising 7 hours of speech data, was utilised to train a model based on a conditional variational autoencoder with adversarial learning, specifically Variational Inference with adversarial learning for end-to-end Text-to-Speech (VITS). Additionally, a pretrained multispeaker model was fine-tuned using 30 minutes, and the entire corpus. Perceptual testing was conducted to evaluate the synthesised speech quality, revealing promising results. Notably, the proposed model demonstrated competitiveness compared to the recently released Valencian model by the Aina project, indicating its efficacy in generating natural and fluent Valencian speech. These findings contribute to advancing the field of Valencian text-to-speech synthesis and carry implications for the development of speech synthesis systems in other minority languages. [ca]
dc.format.extent: 4 p. [ca]
dc.language.iso: eng [ca]
dc.publisher: IOS Press [ca]
dc.relation.ispartof: Artificial Intelligence Research and Development - Proceedings of the 26th International Conference of the Catalan Association for Artificial Intelligence [ca]
dc.rights: © L'autor/a [ca]
dc.rights: Attribution-NonCommercial 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by-nc/4.0/ [*]
dc.subject.other: Human-machine communication [ca]
dc.subject.other: AI applications [ca]
dc.subject.other: Speech synthesis [ca]
dc.title: Speech synthesis of Valencian using a conditional variational autoencoder with adversarial learning [ca]
dc.type: info:eu-repo/semantics/article [ca]
dc.rights.accessLevel: info:eu-repo/semantics/openAccess
dc.embargo.terms: cap [ca]
dc.subject.udc: 00 [ca]
dc.subject.udc: 004 [ca]
dc.subject.udc: 81 [ca]
dc.identifier.doi: https://doi.org/10.3233/FAIA240427 [ca]
dc.description.version: info:eu-repo/semantics/publishedVersion [ca]




© The author
Except where otherwise noted, this item's license is described as http://creativecommons.org/licenses/by-nc/4.0/