Please use this identifier to cite or link to this item: https://open.uns.ac.rs/handle/123456789/3624
Title: Language model optimization for a deep neural network based speech recognition system for Serbian
Authors: Pakoci, Edvin 
Popović, Boris
Pekar, Darko 
Issue Date: 1-Jan-2017
Journal: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Abstract: This paper presents the results obtained using several variants of trigram language models in a large vocabulary continuous speech recognition (LVCSR) system for the Serbian language, based on the deep neural network (DNN) framework implemented within the Kaldi speech recognition toolkit. This training approach allows parallelization across several threads on either multiple GPUs or multiple CPUs, and provides a natural-gradient modification to the stochastic gradient descent (SGD) optimization method. Acoustic models are trained over a fixed number of training epochs, with parameter averaging at the end. The paper discusses recognition using different language models trained with the Kneser-Ney or Good-Turing smoothing methods, as well as several pruning parameter values. Results on a test set containing more than 120,000 words and different utterance types are examined and compared to reference results obtained with speaker-adapted GMM-HMM models for the same speech database. Online and offline recognition results are also compared to each other. Finally, the effect of additional discriminative training using a language model prior to the DNN stage is explored. (© Springer International Publishing AG 2017)
URI: https://open.uns.ac.rs/handle/123456789/3624
ISBN: 978-3-319-66428-6
ISSN: 0302-9743
DOI: 10.1007/978-3-319-66429-3_48
Appears in Collections:FTN Publikacije/Publications
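
Note on the techniques mentioned in the abstract: the two n-gram smoothing methods compared in the paper are standard. As a minimal sketch in LaTeX, assuming the commonly used interpolated form of Kneser-Ney with absolute discount D and the classical Good-Turing count re-estimate (the exact variants, discount values, and pruning thresholds used in the paper may differ):

    % Interpolated Kneser-Ney for a trigram history h = (w_{i-2}, w_{i-1}):
    P_{KN}(w_i \mid h) = \frac{\max\bigl(c(h, w_i) - D,\, 0\bigr)}{c(h)} + \lambda(h)\, P_{KN}(w_i \mid w_{i-1}),
    \qquad \lambda(h) = \frac{D}{c(h)}\, \bigl|\{\, w : c(h, w) > 0 \,\}\bigr|

    % Good-Turing re-estimated count for an n-gram observed c times,
    % where N_c is the number of distinct n-grams observed exactly c times:
    c^{*} = (c + 1)\, \frac{N_{c+1}}{N_{c}}

Pruning in this context typically removes n-grams whose omission changes the model's perplexity by less than a chosen threshold, so larger thresholds yield smaller but less accurate language models.

The abstract also mentions parallel SGD training with parameter averaging. The Python fragment below is a hypothetical illustration of that general idea only: it is not Kaldi code, linear regression is used purely to keep the example self-contained and runnable, and averaging is applied after every epoch here for illustration, whereas it may instead be applied only at the end of training.

    import numpy as np

    def sgd_epoch(theta, X, y, lr=0.01):
        # One plain-SGD pass over a data shard, squared-error loss for y ~ X @ theta.
        theta = theta.copy()
        for xi, yi in zip(X, y):
            grad = 2.0 * (xi @ theta - yi) * xi   # gradient of (xi . theta - yi)^2
            theta -= lr * grad
        return theta

    def epoch_with_parameter_averaging(theta, shards):
        # Train an independent copy per shard from the same starting point, then average.
        copies = [sgd_epoch(theta, X, y) for X, y in shards]
        return np.mean(copies, axis=0)

    # Toy usage: two shards of synthetic data generated with true weights [1.0, -2.0].
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 2))
    y = X @ np.array([1.0, -2.0]) + 0.01 * rng.normal(size=200)
    shards = [(X[:100], y[:100]), (X[100:], y[100:])]
    theta = np.zeros(2)
    for _ in range(5):          # a fixed number of epochs, with averaging after each
        theta = epoch_with_parameter_averaging(theta, shards)
    print(theta)                # approaches [1.0, -2.0]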
