Use this identifier to cite or link to this item: http://repositorio.ufla.br/jspui/handle/1/59902
Title: Avaliação de modelos de aprendizado de máquina para predição do diabetes mellitus
Alternative title(s): Evaluation of machine learning models for predicting diabetes mellitus
Authors: Guimaraes, Paulo Henrique Sales
Pereira, Geraldo Magela da Cruz
Oliveira, Anderson Castro Soares de
Paixão, Crysttian Arantes
Keywords: Vigitel
Aprendizado de Máquina
Machine learning
Predição de Diabetes
Prediction of Diabetes
Issue date: 10-Apr-2025
Publisher: Universidade Federal de Lavras
Citation: MACÁRIO, Noé Osório. Avaliação de modelos de aprendizado de máquina para predição do diabetes mellitus. 2025. 92 p. Dissertação (Mestrado em Estatística e Experimentação Agropecuária) - Universidade Federal de Lavras, Lavras, 2025.
Abstract: This work evaluates the performance of different machine learning (ML) models in predicting diabetes, a chronic condition of great relevance to public health. Using VIGITEL (2023) data, which comprise more than 21 thousand observations, a complete preprocessing pipeline was carried out, involving variable selection, class balancing, treatment of missing values, and data standardization. The algorithms analyzed were Decision Trees, Random Forests, Naive Bayes, Artificial Neural Networks, and XGBoost. Model performance was evaluated using metrics such as sensitivity and area under the ROC curve, which are fundamental for identifying positive cases and discriminating efficiently between groups. The XGBoost model stood out as the most efficient, presenting the best sensitivity, specificity, and area under the ROC curve in almost all approaches (using all variables, MIC [Maximal Information Coefficient], and PCA [Principal Component Analysis]), for both balanced and unbalanced data, which demonstrates its superior predictive capacity. In contrast, the Decision Tree model had the worst performance, highlighting its limitations when applied to unbalanced data. The results reinforce the potential of machine learning for the early detection of chronic diseases such as diabetes, underlining its relevance for improving medical diagnosis, optimizing costs, and providing crucial support for more efficient clinical interventions.
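The evaluation metrics named in the abstract (sensitivity, specificity, and area under the ROC curve) can be illustrated with a minimal sketch. This is not the dissertation's code: the function names, the 0.5 decision threshold, and the toy labels and scores below are all assumptions chosen for illustration, and the AUC is computed with the standard pairwise (Mann-Whitney) formulation using only the Python standard library.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels, where 1 marks a positive case.

    Assumes both classes occur in y_true; otherwise a division by zero is raised.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)  # share of true positives that were detected
    specificity = tn / (tn + fp)  # share of true negatives that were detected
    return sensitivity, specificity


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve as the probability that a randomly chosen
    positive case scores higher than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 positive and 5 negative cases (an unbalanced sample).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]

# Assumed decision threshold of 0.5 to turn scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} auc={auc:.3f}")
```

Sensitivity and specificity depend on the chosen threshold, while the AUC summarizes ranking quality across all thresholds, which is one reason the two kinds of metric are typically reported together for unbalanced data.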
URI: http://repositorio.ufla.br/jspui/handle/1/59902
Appears in collections: Estatística e Experimentação Agropecuária - Mestrado (Dissertações)



This item is licensed under a Creative Commons License