Please use this identifier to cite or link to this item: https://open.uns.ac.rs/handle/123456789/12233
Title: On the use of higher frame rate in the training phase of ASR
Authors: Pekar, Darko
Jakovljević, Nikša
Janev, M.
Mišković, Dragiša
Delić, Vlado
Issue Date: 1-Dec-2010
Journal: International Conference on Computers - Proceedings
Abstract: The number of observations available for parameter estimation plays an important role in the quality of acoustic models. HMM-based automatic speech recognition (ASR) systems generally have to cope with too few observations for a good estimate. One way of tackling this problem is the well-known procedure of state tying, which pools observations so that a reasonable estimate can be obtained for a large number of models. This procedure introduces an additional bias into the estimates, often leading to poor recognition results. In this paper a simple alternative to that solution is offered. Most existing ASR systems use the same frame step size of 10 ms in the training of acoustic models, justified by the fact that speech signals exhibit quasi-stationary behavior over short durations. We claim that it is fully acceptable to adopt a much smaller frame step size in acoustic training, thus providing the estimators with a significantly higher number of observations than in the standard 10 ms case. This results in better parameter estimates and consequently better recognition results. Besides being justifiable from a phonetic point of view, this claim is also supported by the results of an experiment on a real ASR system.
URI: https://open.uns.ac.rs/handle/123456789/12233
ISBN: 9789604742011
Appears in Collections: FTN Publikacije/Publications
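
The abstract's central claim, that a smaller frame step gives the estimators more observations, can be illustrated with simple arithmetic. The sketch below is not taken from the paper: the function name `num_frames`, the 25 ms analysis window, the 1-second signal, and the 2 ms step are all assumptions chosen for illustration; only the 10 ms baseline step comes from the abstract.

```python
def num_frames(signal_ms: int, window_ms: int, step_ms: int) -> int:
    """Count the full analysis windows that fit in a signal.

    All arguments are in milliseconds. Only complete windows are counted
    (no padding at the end of the signal), which is an assumption here.
    """
    if step_ms <= 0:
        raise ValueError("step_ms must be positive")
    if signal_ms < window_ms:
        return 0
    return (signal_ms - window_ms) // step_ms + 1


# Hypothetical setup: 1 s of speech, 25 ms window (assumed, not from the paper).
baseline = num_frames(1000, 25, 10)  # standard 10 ms frame step
denser = num_frames(1000, 25, 2)     # assumed smaller step of 2 ms

print(baseline, denser)
```

Under these assumptions the smaller step yields roughly five times as many feature vectors from the same audio. Neighbouring frames overlap more heavily at the smaller step, so the additional observations are not independent of one another.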