Rejection Threshold Optimization using 3D ROC Curves: Novel Findings on Biomedical Datasets
DOI:
https://doi.org/10.18201/ijisae.2021167933
Keywords:
Decision threshold optimization, rejection threshold optimization, 3D ROC curves, Naive Bayes
Abstract
The reject option is introduced in classification tasks to prevent potential misclassifications. Although optimization of the error-reject trade-off has been widely investigated, the error rate itself has been shown to be an inappropriate performance measure when misclassification costs are unequal or class distributions are imbalanced. ROC analysis has been proposed as an alternative approach that evaluates performance in terms of true positives (TP) and false positives (FP). For classification with a reject option, the trade-off among TP, FP and rejection rates must be represented. In this paper, we propose 3D ROC analysis to determine the optimal rejection threshold, by analogy with decision threshold optimization on 2D ROC curves. We demonstrate the proposed method with a Naive Bayes classifier on the Heart Disease dataset and validate its efficiency on the Pima Indians Diabetes dataset. Our experiments reveal that classification with an optimized rejection threshold significantly improves true positive rates on medical datasets. Furthermore, false positive rates remain the same, with rejection rates below 10%.
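As a rough illustration of the idea in the abstract, the sketch below sweeps a rejection threshold over classifier posteriors and records one (TP rate, FP rate, rejection rate) point per threshold, which is the kind of data a 3D ROC curve is built from. It is not the authors' implementation. The function names `roc3d_points` and `pick_threshold`, the rule of rejecting a sample when its largest posterior falls below the threshold, the choice to compute rates over accepted samples only, and the selection criterion (largest TP rate minus FP rate under a rejection cap) are all assumptions made for this example.

```python
import numpy as np


def roc3d_points(p_pos, y_true, reject_thresholds, decision_threshold=0.5):
    """Return one (threshold, TPR, FPR, rejection rate) tuple per threshold.

    Assumption: a sample is rejected when max(p, 1 - p) < threshold, and
    TPR/FPR are computed over the accepted samples only.
    """
    p_pos = np.asarray(p_pos, dtype=float)
    y_true = np.asarray(y_true, dtype=int)
    confidence = np.maximum(p_pos, 1.0 - p_pos)  # largest class posterior

    points = []
    for t in reject_thresholds:
        accepted = confidence >= t
        y_acc = y_true[accepted]
        pred_acc = (p_pos[accepted] >= decision_threshold).astype(int)

        n_pos = int(np.sum(y_acc == 1))
        n_neg = int(np.sum(y_acc == 0))
        tp = int(np.sum((pred_acc == 1) & (y_acc == 1)))
        fp = int(np.sum((pred_acc == 1) & (y_acc == 0)))

        # Rates are undefined (NaN) if a class vanishes among accepted samples.
        tpr = tp / n_pos if n_pos else float("nan")
        fpr = fp / n_neg if n_neg else float("nan")
        rejection_rate = float(np.mean(~accepted))
        points.append((float(t), tpr, fpr, rejection_rate))
    return points


def pick_threshold(points, max_reject=0.10):
    """Hypothetical criterion: maximize TPR - FPR with rejection <= max_reject.

    Returns None when no point satisfies the rejection cap.
    """
    feasible = [
        p for p in points
        if p[3] <= max_reject and not (np.isnan(p[1]) or np.isnan(p[2]))
    ]
    if not feasible:
        return None
    return max(feasible, key=lambda p: p[1] - p[2])


if __name__ == "__main__":
    # Made-up posteriors and labels, for illustration only.
    scores = [0.95, 0.90, 0.55, 0.45, 0.60, 0.10, 0.20, 0.52, 0.85, 0.05]
    labels = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
    pts = roc3d_points(scores, labels, [0.5, 0.53, 0.7])
    for point in pts:
        print(point)
    print("selected:", pick_threshold(pts))
```

In practice the posteriors would come from a fitted classifier such as Naive Bayes, and the selection criterion would be replaced by whatever cost model suits the application.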
License
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Under this license, users must give appropriate credit, provide a link to the license, and indicate if changes were made; if they remix, transform, or build upon the material, they must distribute their contributions under the same license as the original.