Integrating DEA with Machine Learning for Predictive Modeling in Breast Cancer

Zahra Bagheri; Faranak Taghizadeh Firoozabadi

doi:10.48314/ijorai.v1i1.53

Authors

Zahra Bagheri *, Department of Mathematics, Khomeinishahr Branch, Islamic Azad University, Isfahan, Iran.
Faranak Taghizadeh Firoozabadi, Department of Mathematics, Kerman Branch, Islamic Azad University, Kerman.


Abstract

This study proposes an integrated methodology that combines Data Envelopment Analysis (DEA) with Machine Learning (ML) to improve predictive modeling in healthcare data analysis, specifically for breast cancer datasets. The methodology begins with data preprocessing (cleaning, normalization, and outlier detection) to ensure the quality and consistency of the dataset. DEA is then applied to calculate efficiency scores for Decision-Making Units (DMUs), such as hospitals or clinics, assessing their resource utilization and performance. These efficiency scores are added to the dataset as a new feature that captures the relative performance of each DMU. Several ML models are trained on the augmented dataset, and their predictive accuracy is compared with that of models trained on the original dataset. The results suggest that including DEA-derived efficiency scores improves both the accuracy and the interpretability of the predictions, offering a promising approach for decision-making in complex domains such as healthcare. Future research could explore deep learning techniques or extend the methodology to other sectors, such as energy management or financial analysis.
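The feature-augmentation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the simplest DEA setting, a CCR (constant returns to scale) model with exactly one input and one output, where the efficiency score reduces to each unit's output/input ratio divided by the best ratio observed. The DMU records, the field names, and the helper names `ccr_efficiency_single` and `augment` are hypothetical and chosen only for illustration. The general multi-input, multi-output case requires solving one linear program per DMU.

```python
from typing import Dict, List

# Hypothetical DMU records with one input and one output.
# These values are illustrative only and do not come from the study.
dmus: List[Dict] = [
    {"id": "A", "input": 100.0, "output": 80.0},
    {"id": "B", "input": 200.0, "output": 120.0},
    {"id": "C", "input": 50.0, "output": 50.0},
]


def ccr_efficiency_single(records: List[Dict]) -> List[float]:
    """CCR efficiency for the single-input, single-output special case.

    Under constant returns to scale with one input and one output, the DEA
    score of a unit equals its output/input ratio divided by the largest
    ratio among all units, so the best unit scores 1.0.
    """
    ratios = [r["output"] / r["input"] for r in records]
    best = max(ratios)
    return [ratio / best for ratio in ratios]


def augment(records: List[Dict]) -> List[Dict]:
    """Return copies of the records with the DEA score added as a feature."""
    scores = ccr_efficiency_single(records)
    return [dict(r, dea_efficiency=s) for r, s in zip(records, scores)]


augmented = augment(dmus)
for row in augmented:
    print(row["id"], round(row["dea_efficiency"], 3))
```

The augmented records could then be passed to any ML model alongside the original features, which allows the comparison between the original and augmented datasets that the abstract describes.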

Keywords:

Data envelopment analysis, Machine learning, Breast cancer dataset, Feature selection

