Using Modified SBM and Decision Tree to Find Targets in Breast Cancer Data
Abstract
This study presents a combined approach using a modified Slacks-Based Measure (SBM) and a Decision Tree algorithm to identify target patients within breast cancer data. Traditional Data Envelopment Analysis (DEA) models often classify multiple patients as equally efficient, which limits the ability to distinguish between them. By modifying the SBM model to better handle input and output slacks, we aim to capture more accurate efficiency levels. We apply this method to the Scikit-learn breast cancer dataset, treating each patient as a Decision-Making Unit (DMU). The Decision Tree algorithm is used to identify the most significant features influencing efficiency. These key features are assigned higher weights in the SBM model to refine the analysis. The results allow for the identification of biologically significant target patients who demonstrate distinct efficiency profiles. This approach offers a useful tool for discovering hidden patterns in medical data and supports data-driven decision-making in cancer diagnosis and treatment planning.
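The feature-ranking step described above can be illustrated with a minimal sketch. It assumes scikit-learn's `load_breast_cancer` loader and a `DecisionTreeClassifier` with impurity-based importances. The tree depth, the `top_k` cutoff, and the rule that normalizes importances into weights are hypothetical choices made for illustration, not settings reported by the authors, and the modified SBM model itself is not reproduced here.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier

# Load the breast cancer dataset bundled with scikit-learn.
# Each row (patient) plays the role of one DMU.
data = load_breast_cancer()
X, y = data.data, data.target

# Fit a decision tree. max_depth and random_state are assumptions,
# fixed here only to keep the sketch small and reproducible.
tree = DecisionTreeClassifier(max_depth=4, random_state=0)
tree.fit(X, y)

# Impurity-based importances: non-negative, summing to 1 over all features.
importances = tree.feature_importances_

# Hypothetical weighting rule: keep the top_k features and rescale their
# importances to sum to 1, so they could serve as slack weights in an SBM model.
top_k = 5
top_idx = np.argsort(importances)[::-1][:top_k]
weights = importances[top_idx] / importances[top_idx].sum()

for name, w in zip(data.feature_names[top_idx], weights):
    print(f"{name}: {w:.3f}")
```

The resulting `weights` vector is one possible way to give the most influential features more emphasis in a weighted efficiency model; other normalizations would be equally defensible.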
Keywords:
Modified slacks-based measure model, Decision tree, Data envelopment analysis, Decision-making unit, Efficiency in cancer patients