Integrating DEA with Machine Learning for Predictive Modeling in Breast Cancer

Authors

  • Zahra Bagheri*, Department of Mathematics, Khomeinishahr Branch, Islamic Azad University, Isfahan, Iran.
  • Faranak Taghizadeh Firoozabadi, Department of Mathematics, Kerman Branch, Islamic Azad University, Kerman.

https://doi.org/10.48314/ijorai.v1i1.53

Abstract

This study proposes an integrated methodology that combines Data Envelopment Analysis (DEA) with Machine Learning (ML) to improve predictive modeling of healthcare data, specifically breast cancer datasets. The methodology begins with data preprocessing (cleaning, normalization, and outlier detection) to ensure the dataset's quality and consistency. DEA is then applied to calculate efficiency scores for Decision-Making Units (DMUs), such as hospitals or clinics, which assess each unit's resource utilization and performance. These efficiency scores are added to the dataset as a new feature that captures the performance of each DMU. Several ML models are trained on the augmented dataset, and their predictive accuracy is compared with that of models trained on the original dataset. Including the DEA-derived efficiency scores improves model performance and interpretability. The results suggest that integrating DEA efficiency scores with ML models enhances the accuracy and transparency of predictions, offering a promising approach to decision-making in complex domains such as healthcare. Future research could explore deep learning techniques or extend the methodology to other sectors, such as energy management or financial analysis.
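
The pipeline described in the abstract (preprocess, score each DMU with DEA, append the score as a feature) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say which DEA model is used, so the sketch assumes the input-oriented CCR envelopment model solved with `scipy.optimize.linprog`, and the DMU data, the feature matrix, and the helper name `ccr_efficiency` are all hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def ccr_efficiency(X, Y):
    """Input-oriented CCR efficiency score for each DMU.

    X: (n_dmus, n_inputs) positive input quantities.
    Y: (n_dmus, n_outputs) positive output quantities.
    For DMU o, solves: min theta  s.t.  sum_j lam_j * x_j <= theta * x_o,
                                        sum_j lam_j * y_j >= y_o,  lam >= 0.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n, m = X.shape
    s = Y.shape[1]
    scores = np.empty(n)
    for o in range(n):
        # Decision variables: [theta, lam_1, ..., lam_n]; minimize theta.
        c = np.r_[1.0, np.zeros(n)]
        # Input rows:  -theta * x_o + X^T lam <= 0
        A_in = np.hstack([-X[o].reshape(-1, 1), X.T])
        # Output rows: -Y^T lam <= -y_o
        A_out = np.hstack([np.zeros((s, 1)), -Y.T])
        res = linprog(
            c,
            A_ub=np.vstack([A_in, A_out]),
            b_ub=np.r_[np.zeros(m), -Y[o]],
            bounds=[(0, None)] * (n + 1),
            method="highs",
        )
        if not res.success:
            raise RuntimeError(f"LP failed for DMU {o}: {res.message}")
        scores[o] = res.fun
    return scores


# Hypothetical DMUs (e.g. clinics): inputs = [staff, beds], output = [patients treated].
inputs = np.array([[20, 30], [25, 45], [40, 50], [30, 30], [50, 80]], dtype=float)
outputs = np.array([[100], [110], [190], [160], [200]], dtype=float)

# Hypothetical feature matrix, one row per DMU, min-max normalized to [0, 1].
features = np.array([[1.2, 300], [0.8, 250], [1.5, 410], [1.1, 380], [0.9, 500]])
features_scaled = (features - features.min(axis=0)) / (
    features.max(axis=0) - features.min(axis=0)
)

scores = ccr_efficiency(inputs, outputs)

# Append the DEA efficiency score as one extra feature column.
augmented = np.column_stack([features_scaled, scores])
print(np.round(scores, 3))
print(augmented.shape)
```

Any classifier can then be trained once on `features_scaled` and once on `augmented` to compare predictive accuracy, which is the comparison the abstract describes. DEA runs on the raw positive inputs and outputs rather than on the normalized features, because min-max scaling produces zeros that a ratio-based efficiency model handles poorly.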

Keywords:

Data envelopment analysis, Machine learning, Breast cancer dataset, Feature selection

Published

2025-03-14

How to Cite

Bagheri, Z., & Taghizadeh Firoozabadi, F. (2025). Integrating DEA with Machine Learning for Predictive Modeling in Breast Cancer. International Journal of Operations Research and Artificial Intelligence, 1(1), 20-28. https://doi.org/10.48314/ijorai.v1i1.53
