Please use this identifier to cite or link to this item: http://repositorio.ufla.br/jspui/handle/1/59596
Title: Seleção de variáveis em modelos de regressão: uma avaliação do uso de redes de probabilidades condicionais
Other Titles: Variable selection in regression models: an evaluation of the use of conditional probability networks
Authors: Oliveira, Izabela Regina Cardoso de
Bueno Filho, Júlio Sílvio de Sousa
Cerqueira, Pedro Henrique Ramos
Fernandes, Tales Jesus
Dorneles, Elaine Maria Seles
Keywords: Redes bayesianas
Grafo acíclico direcionado
Modelo probabilístico
Métodos stepwise
Bayesian networks
Directed acyclic graph
Probabilistic model
Stepwise methods
Issue Date: 23-Oct-2024
Publisher: Universidade Federal de Lavras
Citation: MELO, Roger Almeida Pereira. Seleção de variáveis em modelos de regressão: uma avaliação do uso de redes de probabilidades condicionais. 2024. 167 p. Tese (Doutorado em Estatística e Experimentação Agropecuária) - Universidade Federal de Lavras, Lavras, 2024.
Abstract: The Bayesian network, presented by Judea Pearl in 1985, is a probabilistic graphical model that represents a set of variables and their conditional dependencies by means of a directed acyclic graph. The vertices (or nodes) represent propositions (or variables), and the directed edges (or arcs) represent the probabilistic dependencies between these variables. The objective of this study is to evaluate the use of Bayesian networks for variable selection in regression models. This technique is compared with stepwise methods in simulation scenarios that consider different sample sizes, different correlations among the variables (response and covariates), and different numbers of variables. In addition to the simulation study, we present a practical application of Bayesian networks in this context. For this purpose, data from a study conducted between 2018 and 2019 involving veterinarians in Minas Gerais were used to identify the most important risk factors associated with accidental exposure to vaccines against Brucella abortus (brucellosis). One of the results of interest in this study was the prevalence of brucellosis among these professionals, which was estimated using a logistic regression model. According to the Bayesian network, the most important covariates associated with accidental exposure to vaccines were knowledge of the symptoms of brucellosis, whether the professional had performed premature childbirth procedures or abortions in the previous six months, and the frequency with which the professional used personal protective equipment. All analyses were performed in R using the bnlearn package. We recommend combining stepwise methods with Bayesian networks: stepwise methods are effective for automatic variable selection, while Bayesian networks excel at visualizing and understanding indirect associations between variables. This combined approach enriches the analysis, providing a more comprehensive and detailed view of the results.
URI: http://repositorio.ufla.br/jspui/handle/1/59596
Appears in Collections: Estatística e Experimentação Agropecuária - Doutorado (Teses)



This item is licensed under a Creative Commons License.