Enhanced Breast Cancer Classification through Data Fusion Modeling
DOI:
https://doi.org/10.53469/jtpes.2024.04(01).11
Keywords:
Breast cancer, Intelligent diagnosis, Adaboost, BP network, RBF network, Naïve Bayes, Classifier
Abstract
This study addresses classifier instability and poor adaptability to sample distribution in intelligent breast cancer diagnosis. We propose a classifier construction algorithm based on AdaBoost that integrates BP (backpropagation) networks, RBF (radial basis function) networks, and Naïve Bayes classifiers. First, multiple weak classifiers are trained using the different classification algorithms. Next, a weight allocation strategy adjusts the sample distribution: the weight of diseased samples misclassified as healthy is increased, and the weight of healthy samples misclassified as diseased is decreased. Finally, the adjusted weights are used to recombine the weak classifiers into a strong classifier. Experimental validation on the Wisconsin Breast Cancer Dataset (WBCD) from the UCI (University of California, Irvine) Machine Learning Repository indicates that the proposed model outperforms the individual algorithms. The algorithm is expected to improve the accuracy and stability of breast cancer diagnosis and to inform the further development of intelligent diagnostic systems.
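The reweighting and recombination steps described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's implementation: the abstract gives no formulas, so the exponential scaling factors, the choice to leave correctly classified samples unchanged before renormalization, and the names `asymmetric_reweight` and `strong_vote` are all assumptions made for illustration. Labels are assumed to be 1 for diseased and 0 for healthy.

```python
import math


def asymmetric_reweight(weights, y_true, y_pred):
    """One boosting round of sample reweighting (illustrative assumption).

    weights: current sample weights, expected to sum to 1.
    y_true, y_pred: labels with 1 = diseased, 0 = healthy.
    Returns (new_weights, alpha), where alpha is the classifier's vote weight.
    """
    # Weighted error of the weak classifier, clipped to keep the log finite.
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    err = min(max(err, 1e-10), 1.0 - 1e-10)
    # Standard AdaBoost vote weight for a classifier with weighted error err.
    alpha = 0.5 * math.log((1.0 - err) / err)

    updated = []
    for w, t, p in zip(weights, y_true, y_pred):
        if t == 1 and p == 0:
            # Diseased predicted healthy: increase the weight (assumed factor).
            w *= math.exp(alpha)
        elif t == 0 and p == 1:
            # Healthy predicted diseased: decrease the weight (assumed factor).
            w *= math.exp(-alpha)
        # Correctly classified samples are left unchanged (assumption).
        updated.append(w)

    total = sum(updated)
    return [w / total for w in updated], alpha


def strong_vote(rounds):
    """Combine weak classifiers by an alpha-weighted vote.

    rounds: list of (alpha, predictions) pairs, one per weak classifier,
    where predictions is a list of 0/1 labels over the same samples.
    A sample is labeled diseased (1) only if its weighted score is positive.
    """
    n_samples = len(rounds[0][1])
    result = []
    for i in range(n_samples):
        score = sum(a * (1 if preds[i] == 1 else -1) for a, preds in rounds)
        result.append(1 if score > 0 else 0)
    return result
```

In a full pipeline, each round would train one of the weak learners (for example a BP network, an RBF network, or Naïve Bayes) on the current weights, call `asymmetric_reweight` on its predictions, and pass the collected `(alpha, predictions)` pairs to `strong_vote`.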
License
Copyright (c) 2024 Fei Zhang, Mingxuan Xiao, Weimin Wang, Yufeng Li, Xu Yan
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.