Predictive Modelling of Kidney Stones Using Clinical Urine and Blood Parameters
Keywords:
kidney stone prediction, nephrolithiasis, machine learning, XGBoost, LightGBM, CatBoost, SMOTE, Optuna, stacking ensemble, SHAP, explainable AI, feature engineering, clinical decision support.
Abstract
This paper presents a comprehensive machine learning (ML) framework for non-imaging kidney stone prediction from clinical urine and blood parameters. A dataset of 350 patient records with 35 features was processed through a systematic pipeline comprising KNN-based missing value imputation, RobustScaler normalization, and SMOTE class balancing. Eight clinically motivated engineered features (including volume-corrected oxalate and calcium concentrations, a pH-uric acid risk index, and a composite clinical risk score) were derived and validated through mutual information feature selection. Nineteen ML models spanning traditional classifiers, tree-based ensembles, and gradient boosting frameworks (XGBoost, LightGBM, CatBoost) were trained and systematically compared. With Bayesian hyperparameter optimization via Optuna, the stacking ensemble achieved a cross-validation ROC-AUC of 0.9315, while K-Nearest Neighbors attained the best test-set accuracy (80.0%, AUC 0.8435). SHAP-based explainable AI analysis confirmed alignment between model predictions and established nephrology risk factors. Probability calibration reduced the expected calibration error from 0.087 to 0.041, and threshold optimization was employed to maximize clinical sensitivity. The proposed framework demonstrates that rigorous preprocessing, domain-driven feature engineering, and ensemble optimization can yield clinically useful predictive performance from routine laboratory data in moderate-sized patient cohorts.
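The two evaluation steps named at the end of the abstract, expected calibration error (ECE) and sensitivity-oriented threshold selection, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the binning scheme or the sensitivity target, so the equal-width bins, the function names `expected_calibration_error` and `threshold_for_sensitivity`, and the toy data are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def expected_calibration_error(
    y_true: Sequence[int], y_prob: Sequence[float], n_bins: int = 10
) -> float:
    """One common binary ECE variant (assumed, not taken from the paper):
    sum over equal-width bins of (n_bin / N) * |positive rate - mean probability|."""
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("y_true and y_prob must be non-empty and equally long")
    # Per bin: [sample count, sum of predicted probabilities, positive count]
    bins = [[0, 0.0, 0] for _ in range(n_bins)]
    for label, prob in zip(y_true, y_prob):
        idx = min(int(prob * n_bins), n_bins - 1)  # prob == 1.0 goes to the last bin
        bins[idx][0] += 1
        bins[idx][1] += prob
        bins[idx][2] += label
    total = len(y_true)
    ece = 0.0
    for count, prob_sum, positives in bins:
        if count:
            ece += (count / total) * abs(positives / count - prob_sum / count)
    return ece


def threshold_for_sensitivity(
    y_true: Sequence[int], y_prob: Sequence[float], target: float
) -> Tuple[float, float, float]:
    """Return (threshold, sensitivity, specificity) for the highest threshold
    whose sensitivity reaches `target`; a case is positive when prob >= threshold."""
    if not 0.0 < target <= 1.0:
        raise ValueError("target must be in (0, 1]")
    positives = sum(y_true)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    # Scan candidate thresholds from strictest to most permissive.
    for t in sorted(set(y_prob), reverse=True):
        tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= t)
        tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < t)
        sensitivity = tp / positives
        if sensitivity >= target:
            return t, sensitivity, tn / negatives
    # Unreachable: the lowest threshold labels every case positive (sensitivity 1.0).
    raise RuntimeError("no threshold reached the target sensitivity")


# Toy labels and predicted probabilities, invented purely for illustration.
labels = [0, 0, 1, 0, 1, 1, 0, 1]
probs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
print(expected_calibration_error(labels, probs, n_bins=2))
print(threshold_for_sensitivity(labels, probs, target=0.75))
```

Scanning thresholds from high to low returns the strictest cut-off that still meets the sensitivity target, which keeps specificity as high as that target allows. In a screening setting the target itself is a clinical choice rather than something the code can determine.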
License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Under this license, readers must give appropriate credit, provide a link to the license, and indicate if changes were made; if they remix, transform, or build upon the material, they must distribute their contributions under the same license as the original.


