Phone duration modeling of the Serbian language: Comparative evaluation of different models

Sovilj-Nikić S.; Sovilj-Nikić I.

Please use this identifier to cite or link to this item: https://open.uns.ac.rs/handle/123456789/5427

Title:	Phone duration modeling of the Serbian language: Comparative evaluation of different models
Authors:	Sovilj-Nikić S.; Sovilj-Nikić I.
Issue Date:	1-Jan-2015
Journal:	Recent Advances in Language and Communication
Abstract:	© 2015 by Nova Science Publishers, Inc. All rights reserved. Segmental duration strongly affects how speech is perceived, so a dedicated module for modeling segmental duration is an important component of any text-to-speech (TTS) system that aims to produce natural-sounding, high-quality synthetic speech. This chapter presents phone duration prediction models for the Serbian language built with several machine learning techniques: linear regression, tree-based algorithms, and meta-learning algorithms such as additive regression, bagging, and stacking. Models were developed for the full phoneme set of Serbian as well as for vowels and consonants separately. Segmental duration was predicted from a feature set of 21 parameters describing phones and their contexts, extracted from a large Serbian speech database. The features include the current segment identity, preceding and following segment types, manner of articulation (for consonants) and voicing of neighboring phones, lexical stress, part of speech, word length, the position of the segment in the syllable, the position of the syllable in the word, the position of the word in the phrase, and phrase break level, among others. The model obtained with additive regression outperformed all other models developed in this study. For the full phoneme set, it outperforms the second-best model by approximately 1.3% and 1% in terms of relative reduction of the root-mean-squared error and the mean absolute error, respectively.
URI:	https://open.uns.ac.rs/handle/123456789/5427
ISBN:	9781634828130
Appears in Collections:	Naučne i umetničke publikacije (Scientific and Artistic Publications)
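
The abstract compares models by relative reduction of the root-mean-squared error (RMSE) and the mean absolute error (MAE) but does not spell out the formulas. The sketch below assumes the standard definitions and takes relative reduction to mean (baseline error - improved error) / baseline error, expressed as a percentage. The function names and all sample values are hypothetical and are not taken from the chapter.

```python
import math


def rmse(actual, predicted):
    """Root-mean-squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    absolute = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)


def relative_reduction(baseline_error, improved_error):
    """Percentage by which improved_error is lower than baseline_error."""
    return 100.0 * (baseline_error - improved_error) / baseline_error


if __name__ == "__main__":
    # Hypothetical phone durations in milliseconds (illustrative only).
    observed = [80.0, 100.0, 60.0, 120.0]
    second_best = [90.0, 85.0, 70.0, 105.0]  # hypothetical model output
    best = [88.0, 90.0, 68.0, 110.0]         # hypothetical model output

    gain_rmse = relative_reduction(rmse(observed, second_best), rmse(observed, best))
    gain_mae = relative_reduction(mae(observed, second_best), mae(observed, best))
    print(f"Relative RMSE reduction: {gain_rmse:.1f}%")
    print(f"Relative MAE reduction: {gain_mae:.1f}%")
```

Under this reading, the reported figure of about 1.3% would mean that the best model's RMSE is roughly 0.987 times that of the second-best model.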

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
