Bibliographic Metadata
- Title: Learning How to Speak: Imitation-Based Refinement of Syllable Production in an Articulatory-Acoustic Model
- Author
- Published
- Language: English
- Document type: Conference Proceedings
- URN
- DOI
Restriction-Information
- The document is publicly available on the WWW
Abstract
This paper proposes an efficient neural network model for learning the articulatory-acoustic forward and inverse mappings of consonant-vowel sequences, including coarticulation effects. It is shown that the learned models can generalize vowels as well as consonants to other contexts, and that the need for supervised training examples can be reduced by refining initial forward and inverse models using acoustic examples only. The models are initially trained on smaller sets of examples and then improved by presenting auditory goals that are imitated. The acoustic outcomes of the imitations, together with the executed actions, provide new training pairs. It is shown that this unsupervised, imitation-based refinement significantly decreases the error of the forward as well as the inverse model. Using a state-of-the-art articulatory speech synthesizer, our approach makes it possible to reproduce the acoustics from learned articulatory trajectories, i.e., we can listen to the results and rate their quality both by error measures and by perception.
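The refinement loop described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the neural networks are replaced here by linear least-squares models, and `synthesize` is a hypothetical two-dimensional stand-in for the articulatory speech synthesizer. All function names, dimensions, and sample counts are assumptions chosen for illustration.

```python
import numpy as np

def synthesize(actions):
    """Hypothetical stand-in for an articulatory synthesizer (an assumption,
    not the synthesizer used in the paper): maps actions to acoustics."""
    a = np.atleast_2d(actions)
    return np.tanh(a @ np.array([[1.0, 0.5], [-0.3, 0.8]]))

def fit_linear(X, Y):
    """Least-squares fit of Y ~ [X, 1] @ W (swapped in for a neural network)."""
    X1 = np.hstack([X, np.ones((len(X), 1))])
    W, *_ = np.linalg.lstsq(X1, Y, rcond=None)
    return W

def predict(W, X):
    """Apply a fitted linear model, including its bias term."""
    X = np.atleast_2d(X)
    return np.hstack([X, np.ones((len(X), 1))]) @ W

def refine_by_imitation(actions, sounds, goals, rounds=3):
    """Grow the training set by imitating acoustic goals, then refit."""
    actions, sounds = actions.copy(), sounds.copy()
    for _ in range(rounds):
        inverse = fit_linear(sounds, actions)      # acoustics -> action
        attempts = predict(inverse, goals)         # imitate each auditory goal
        outcomes = synthesize(attempts)            # what was actually produced
        actions = np.vstack([actions, attempts])   # executed actions and their
        sounds = np.vstack([sounds, outcomes])     # outcomes form new pairs
    forward = fit_linear(actions, sounds)          # action -> acoustics
    inverse = fit_linear(sounds, actions)
    return forward, inverse, actions, sounds

rng = np.random.default_rng(0)
seed_actions = rng.uniform(-1, 1, size=(5, 2))     # small supervised set
seed_sounds = synthesize(seed_actions)
goals = synthesize(rng.uniform(-1, 1, size=(20, 2)))  # acoustic examples only
forward, inverse, actions, sounds = refine_by_imitation(
    seed_actions, seed_sounds, goals)
```

Note that the goals enter the loop only as acoustic targets. Each new training pair consists of the action that was actually executed and the sound it actually produced, so the added pairs are always consistent with the synthesizer even when an imitation misses its goal.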
