The Effectiveness of Supervised Machine Learning Algorithms in Predicting Software Refactoring

Aniche, Mauricio; Maziero, Erick; Durelli, Rafael; Durelli, Vinicius

dc.creator	Aniche, Mauricio
dc.creator	Maziero, Erick
dc.creator	Durelli, Rafael
dc.creator	Durelli, Vinicius
dc.date.accessioned	2021-07-16T16:42:43Z
dc.date.available	2021-07-16T16:42:43Z
dc.date.issued	2020
dc.description.abstract	Refactoring is the process of changing the internal structure of software to improve its quality without modifying its external behavior. Empirical studies have repeatedly shown that refactoring has a positive impact on the understandability and maintainability of software systems. However, before carrying out refactoring activities, developers need to identify refactoring opportunities. Currently, refactoring opportunity identification heavily relies on developers' expertise and intuition. In this paper, we investigate the effectiveness of machine learning algorithms in predicting software refactorings. More specifically, we train six different machine learning algorithms (i.e., Logistic Regression, Naive Bayes, Support Vector Machine, Decision Trees, Random Forest, and Neural Network) with a dataset comprising over two million refactorings from 11,149 real-world projects from the Apache, F-Droid, and GitHub ecosystems. The resulting models predict 20 different refactorings at the class, method, and variable levels with an accuracy often higher than 90%. Our results show that (i) Random Forests are the best models for predicting software refactoring, (ii) process and ownership metrics seem to play a crucial role in the creation of better models, and (iii) models generalize well in different contexts.	pt_BR
dc.description.provenance	Submitted by Daniele Faria (danielefaria@ufla.br) on 2021-07-16T15:38:58Z No. of bitstreams: 0	en
dc.description.provenance	Approved for entry into archive by André Calsavara (andre.calsavara@biblioteca.ufla.br) on 2021-07-16T16:42:43Z (GMT) No. of bitstreams: 0	en
dc.description.provenance	Made available in DSpace on 2021-07-16T16:42:43Z (GMT). No. of bitstreams: 0 Previous issue date: 2020	en
dc.identifier.citation	ANICHE, M. et al. The Effectiveness of Supervised Machine Learning Algorithms in Predicting Software Refactoring. IEEE Transactions on Software Engineering, [S. l.], 2020. DOI: 10.1109/TSE.2020.3021736.	pt_BR
dc.identifier.uri	https://repositorio.ufla.br/handle/1/46769
dc.identifier.uri	https://doi.ieeecomputersociety.org/10.1109/TSE.2020.3021736	pt_BR
dc.language	en	pt_BR
dc.publisher	Institute of Electrical and Electronics Engineers - IEEE	pt_BR
dc.rights	restrictAccess	pt_BR
dc.source	IEEE Transactions on Software Engineering	pt_BR
dc.subject	Biological system modeling	pt_BR
dc.subject	Predictive models	pt_BR
dc.subject	Context modeling	pt_BR
dc.subject	Prediction algorithms	pt_BR
dc.subject	Machine learning algorithms	pt_BR
dc.subject	Software refactoring	pt_BR
dc.title	The Effectiveness of Supervised Machine Learning Algorithms in Predicting Software Refactoring	pt_BR
dc.type	Article	pt_BR

Collections

DCC - Articles published in journals