A data-centric approach for portuguese speech recognition: language model and its implications

Alvarenga, João Paulo Reis; Merschmann, Luiz Henrique de Campos; Luz, Eduardo José da Silva

Use this identifier to cite or link to this item: http://repositorio.ufla.br/jspui/handle/1/59506

Full metadata record

DC Field	Value	Language
dc.creator	Alvarenga, João Paulo Reis	-
dc.creator	Merschmann, Luiz Henrique de Campos	-
dc.creator	Luz, Eduardo José da Silva	-
dc.date.accessioned	2024-09-26T16:36:34Z	-
dc.date.available	2024-09-26T16:36:34Z	-
dc.date.issued	2023	-
dc.identifier.citation	ALVARENGA, J. P. R.; MERSCHMANN, L. H. de C.; LUZ, E. J. da S. A data-centric approach for portuguese speech recognition: language model and its implications. IEEE Latin America Transactions, [S.l.], v. 21, n. 4, p. 546-556, 2023.	pt_BR
dc.identifier.uri	https://latamt.ieeer9.org/index.php/transactions/article/view/7464	pt_BR
dc.identifier.uri	http://repositorio.ufla.br/jspui/handle/1/59506	-
dc.description.abstract	Recent advances in Automatic Speech Recognition have achieved unprecedented transcription quality, both for languages with abundant data and many studies, such as English, and for Portuguese, which has fewer resources and studies. The most recent advances address speech recognition with Transformer-based models, which can perform the task directly from the raw signal, without manual feature extraction. Some studies have already shown that using language models in the decoding stage can further improve the transcription quality of these models. However, the real impact of such language models is still unclear, especially for Brazilian Portuguese. It is also known that the quality of the training data is of paramount importance, yet few works in the literature address this issue. This work takes a data-centric approach to explore the impact of language models applied to Portuguese speech recognition, in terms of both data quality and computational performance. We propose an approach to measure similarity between datasets and thus assist decision-making during training. The approach indicates paths for advancing the state of the art in Portuguese speech recognition, showing that it is possible to reduce the size of the language model by 80% and still achieve error rates around 7.17% on the Common Voice dataset. The source code is available at https://github.com/joaoalvarenga/language-model-evaluation.	pt_BR
dc.language	en_US	pt_BR
dc.publisher	Institute of Electrical and Electronics Engineers	pt_BR
dc.rights	restrictAccess	pt_BR
dc.source	IEEE Latin America Transactions	pt_BR
dc.subject	Automatic speech recognition	pt_BR
dc.subject	Language model	pt_BR
dc.subject	Brazilian Portuguese	pt_BR
dc.subject	Wav2vec2	pt_BR
dc.subject	KenLM	pt_BR
dc.title	A data-centric approach for portuguese speech recognition: language model and its implications	pt_BR
dc.type	Article	pt_BR
Appears in collections:	DCC - Artigos publicados em periódicos (Articles published in journals)

Files associated with this item:

There are no files associated with this item.
