A language model for highly inflective non-agglutinative languages

Ostrogonac S.; Mišković, Dragiša; Sečujski, Milan; Pekar, Darko; Delić, Vlado

Please use this identifier to cite or link to this item: https://open.uns.ac.rs/handle/123456789/11482

Title:	A language model for highly inflective non-agglutinative languages
Authors:	Ostrogonac S.; Mišković, Dragiša; Sečujski, Milan; Pekar, Darko; Delić, Vlado
Issue date:	12-Dec-2012
Journal:	2012 IEEE 10th Jubilee International Symposium on Intelligent Systems and Informatics, SISY 2012
Abstract:	This paper proposes a method of creating language models for highly inflective non-agglutinative languages. Three types of language models were considered: a common n-gram model, an n-gram model of lemmas, and a class n-gram model. The last two types were designed specifically for the Serbian language and reflect its grammatical structure. All the language models were trained on a carefully collected data set incorporating several literary styles and a wide variety of domain-specific textual documents in Serbian. Language models of the three types were created for different sets of textual corpora and evaluated by their perplexity on the test data. A log-linear combination of the common, lemma-based, and class n-gram models was also created and shows promising results in overcoming the data sparsity problem. However, this combined model has yet to be evaluated within a large vocabulary continuous speech recognition (LVCSR) system to establish the improvement in terms of word error rate (WER). © 2012 IEEE.
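The log-linear combination and the perplexity measure mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's exact formulation, so everything here is an assumption made for illustration: the function names `log_linear_combine` and `perplexity`, the weights, the toy probabilities, and the premise that the lemma and class models have already been mapped to probabilities over the same word vocabulary for one fixed history.

```python
import math


def log_linear_combine(distributions, weights):
    """Combine distributions over one vocabulary: p(w) proportional to prod_i p_i(w)**lambda_i.

    Assumes every distribution assigns a strictly positive probability
    to every word (otherwise log(0) would fail).
    """
    vocab = distributions[0].keys()
    scores = {
        w: math.exp(sum(lam * math.log(d[w])
                        for d, lam in zip(distributions, weights)))
        for w in vocab
    }
    z = sum(scores.values())  # normalization constant
    return {w: s / z for w, s in scores.items()}


def perplexity(token_probs):
    """Perplexity of a test sequence, given the model probability of each token."""
    avg_log = sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(-avg_log)


# Hypothetical next-word distributions for one fixed history (made-up numbers).
word_model = {"kuća": 0.5, "kuće": 0.3, "kući": 0.2}
lemma_model = {"kuća": 0.4, "kuće": 0.4, "kući": 0.2}
class_model = {"kuća": 0.3, "kuće": 0.3, "kući": 0.4}

# Hypothetical weights; in practice they would be tuned on held-out data.
combined = log_linear_combine([word_model, lemma_model, class_model],
                              [0.6, 0.2, 0.2])
```

Unlike linear interpolation, the weighted product has to be renormalized over the vocabulary, which is why `log_linear_combine` divides by the sum `z`. A lower value from `perplexity` on held-out text indicates a model that predicts that text better.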
URI:	https://open.uns.ac.rs/handle/123456789/11482
ISBN:	9781467347518
DOI:	10.1109/SISY.2012.6339510
Appears in collections:	FTN Publikacije/Publications

SCOPUS citations: 5 (checked 09.09.2023)

Page views: 37 (past week: 4; past month: 12; checked 10.05.2024)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
