Speech synthesis of Valencian using a conditional variational autoencoder with adversarial learning

Aragó, Joan; Freixes, Marc

Publication date

2024-09-25

URI http://hdl.handle.net/20.500.14342/6074

DOI

https://doi.org/10.3233/FAIA240427

ISBN

9781643685434

ISSN

1879-8314

Abstract

The growing demand for high-quality speech synthesis systems in minority languages presents a notable challenge for researchers. In response, this study focuses on synthesising Valencian speech to develop an effective text-to-speech system for this linguistic variety. A carefully recorded corpus comprising 7 hours of speech data was used to train a model based on a conditional variational autoencoder with adversarial learning, specifically Variational Inference with adversarial learning for end-to-end Text-to-Speech (VITS). Additionally, a pretrained multispeaker model was fine-tuned using 30 minutes of speech and using the entire corpus. Perceptual testing was conducted to evaluate the quality of the synthesised speech, with promising results. Notably, the proposed model was competitive with the recently released Valencian model by the Aina project, indicating that it can generate natural and fluent Valencian speech. These findings contribute to advancing Valencian text-to-speech synthesis and carry implications for the development of speech synthesis systems in other minority languages.

Document type

Article

Document version

Published version

Language

English

Subjects (UDC)

00 - Science and knowledge. Research. Culture. Humanities

004 - Computer science

81 - Linguistics and languages

Keywords

Human-machine communication

AI applications

Speech synthesis

Pages

4 p.

Published by

IOS Press

Published in

Artificial Intelligence Research and Development - Proceedings of the 26th International Conference of the Catalan Association for Artificial Intelligence

Rights

Except where otherwise noted, this item's license is described as http://creativecommons.org/licenses/by-nc/4.0/