Discovering the Clinical Knowledge about Breast Cancer Diagnosis Using Rule-Based Machine Learning Algorithms. Health Educ Health Promot 2022; 10(1): 89-97. URL: http://hehp.modares.ac.ir/article-4-55598-en.html
Aims: Breast cancer is one of the most prevalent cancers and the leading cause of cancer-related death in women worldwide. This study aimed to construct several rule-based machine learning algorithms for predicting breast cancer and to compare their performance.

Instrument & Methods: Data were collected from the Breast Cancer Registry database of Ayatollah Taleghani Hospital, Abadan, Iran, from December 2017 to January 2021, and comprised 949 non-breast cancer and 554 breast cancer cases. Missing quantitative fields were replaced with mean values, and missing qualitative fields were imputed with the K-nearest neighbors algorithm. The Chi-square test and binary logistic regression were then used for feature selection. Finally, the best rule-based machine learning algorithm was identified by comparing several evaluation criteria. RapidMiner Studio 7.1.1 and Weka 3.9 were used for the analysis.

Findings: Feature selection identified nine variables as the most important for data mining. Among the rule-based algorithms compared, J-48 performed best, with an accuracy of 0.991, an F-measure of 0.987, and an AUC of 0.9997.

Conclusion: J-48 provides a reasonable level of accuracy for breast cancer (BC) risk prediction. We believe it would be beneficial for designing intelligent decision support systems for the early detection of high-risk patients, which could inform appropriate interventions by clinicians.
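
The abstract names three evaluation criteria (accuracy, F-measure, and AUC) without defining them. The following is a minimal sketch of their standard textbook definitions for a binary classifier, written in plain Python. It is not the computation performed by Weka or RapidMiner in the study; the function names, the 0.5 decision threshold, and the toy labels and scores are all illustrative assumptions and are unrelated to the registry data.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of cases whose predicted class matches the true class."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f_measure(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(y_true: Sequence[int], scores: Sequence[float], positive: int = 1) -> float:
    """Probability that a random positive case is scored above a random
    negative case, counting ties as one half (rank-based definition)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (invented values): 1 = breast cancer, 0 = non-breast cancer.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]   # hypothetical model scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]      # assumed 0.5 threshold

print(f"accuracy  = {accuracy(y_true, y_pred):.4f}")   # 0.7500
print(f"F-measure = {f_measure(y_true, y_pred):.4f}")  # 0.7500
print(f"AUC       = {auc(y_true, scores):.4f}")        # 0.9375
```

Accuracy and F-measure depend on the chosen threshold, whereas AUC is computed from the ranking of scores alone, which is one reason the three criteria can disagree when algorithms are compared.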