Enhanced Breast Cancer Classification through Data Fusion Modeling

Authors

  • Fei Zhang, Department of Computer and Science, Trine University, Phoenix 85201, US
  • Mingxuan Xiao, Department of Computer and Science, Southwest Jiao Tong University, Chengdu 610000, China
  • Weimin Wang, Department of Computer and Science, Hong Kong University of Science and Technology, Hong Kong 999077, Hong Kong
  • Yufeng Li, Department of Electronics and Computer Science, University of Southampton, Southampton SO19, UK
  • Xu Yan, Department of Computer and Science, Trine University, Phoenix 85201, US

DOI:

https://doi.org/10.53469/jtpes.2024.04(01).11

Keywords:

Breast cancer, Intelligent diagnosis, AdaBoost, BP network, RBF network, Naïve Bayes, Classifier

Abstract

This study addresses classifier instability and poor adaptability to sample distribution in intelligent breast cancer diagnosis. We propose a novel classifier construction algorithm based on AdaBoost that integrates BP (backpropagation), RBF (radial basis function), and Naïve Bayes networks. First, multiple weak classifiers are trained using the different classification algorithms. Next, a weight allocation strategy adjusts the data distribution: the weight of diseased samples misclassified as healthy is increased, and the weight of healthy samples misclassified as diseased is decreased. Finally, the adjusted weights are used to recombine the weak classifiers into a strong classifier. Experimental validation on the Wisconsin Breast Cancer Dataset (WBCD) from the UCI (University of California, Irvine) database shows that the proposed classification model outperforms each of the individual algorithms. Applying this algorithm is expected to improve the accuracy and stability of breast cancer diagnosis and to inform the further development of intelligent diagnostic systems.
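
The three steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the update formula, so the `reweight` rule below (multiply by exp(alpha) for diseased samples predicted healthy, by exp(-alpha) for healthy samples predicted diseased, leave correct samples unchanged, then normalize) is only one assumed reading. Simple decision stumps stand in for the BP, RBF, and Naïve Bayes weak classifiers, and the label encoding, function names, and toy data are all hypothetical.

```python
import math

DISEASED, HEALTHY = 1, -1  # label encoding assumed for this sketch


def stump_predict(stump, row):
    """Threshold rule on one feature: `polarity` if row[feature] >= threshold."""
    feature, threshold, polarity = stump
    return polarity if row[feature] >= threshold else -polarity


def train_stump(X, y, w):
    """Exhaustively pick the stump with the lowest weighted error."""
    best_err, best_stump = float("inf"), None
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for polarity in (1, -1):
                stump = (feature, threshold, polarity)
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(stump, row) != yi)
                if err < best_err:
                    best_err, best_stump = err, stump
    return best_stump, best_err


def reweight(w, y, preds, alpha):
    """Assumed asymmetric update: raise missed-diseased weights, lower false alarms."""
    new_w = []
    for wi, yi, pi in zip(w, y, preds):
        if pi != yi and yi == DISEASED:
            wi *= math.exp(alpha)    # diseased sample predicted healthy
        elif pi != yi and yi == HEALTHY:
            wi *= math.exp(-alpha)   # healthy sample predicted diseased
        new_w.append(wi)
    total = sum(new_w)
    return [wi / total for wi in new_w]  # renormalize to a distribution


def fit(X, y, rounds=3):
    """Train `rounds` weak classifiers, reweighting the samples after each one."""
    w = [1.0 / len(X)] * len(X)
    ensemble = []
    for _ in range(rounds):
        stump, err = train_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)    # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)  # standard AdaBoost classifier weight
        ensemble.append((alpha, stump))
        preds = [stump_predict(stump, row) for row in X]
        w = reweight(w, y, preds, alpha)
    return ensemble


def predict(ensemble, row):
    """Strong classifier: sign of the alpha-weighted vote of the weak classifiers."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return DISEASED if score >= 0 else HEALTHY


# Toy, linearly separable data (hypothetical; not drawn from WBCD).
X = [[1, 5], [2, 4], [3, 6], [7, 5], [8, 4], [9, 6]]
y = [HEALTHY, HEALTHY, HEALTHY, DISEASED, DISEASED, DISEASED]
model = fit(X, y)
train_preds = [predict(model, row) for row in X]
```

In this sketch a missed diseased case gains weight while a false alarm loses weight, so later weak classifiers concentrate on the costlier error. Swapping `train_stump` for real BP, RBF, or Naïve Bayes learners would require each learner to accept sample weights.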

Published

2024-01-30

How to Cite

Zhang, F., Xiao, M., Wang, W., Li, Y., & Yan, X. (2024). Enhanced Breast Cancer Classification through Data Fusion Modeling. Journal of Theory and Practice of Engineering Science, 4(01), 79–85. https://doi.org/10.53469/jtpes.2024.04(01).11