Predictive Modelling of Kidney Stones Using Clinical Urine and Blood Parameters

Authors

  • Priyamvad Ranjan, Ritik Kumar, Ronit Baweja

Keywords:

kidney stone prediction, nephrolithiasis, machine learning, XGBoost, LightGBM, CatBoost, SMOTE, Optuna, stacking ensemble, SHAP, explainable AI, feature engineering, clinical decision support.

Abstract

This paper presents a comprehensive machine learning (ML) framework for non-imaging kidney stone prediction from clinical urine and blood parameters. A dataset of 350 patient records with 35 features was processed through a systematic pipeline comprising KNN-based missing value imputation, RobustScaler normalization, and SMOTE class balancing. Eight clinically motivated engineered features (including volume-corrected oxalate and calcium concentrations, a pH-uric acid risk index, and a composite clinical risk score) were derived and validated through mutual information feature selection. Nineteen ML models spanning traditional classifiers, tree-based ensembles, and gradient boosting frameworks (XGBoost, LightGBM, CatBoost) were trained and systematically compared. With Bayesian hyperparameter optimization via Optuna, the stacking ensemble achieved a cross-validation ROC-AUC of 0.9315, while K-Nearest Neighbors attained the best test-set accuracy (80.0%, AUC 0.8435). SHAP-based explainable AI analysis confirmed alignment between model predictions and established nephrology risk factors. Probability calibration reduced the expected calibration error from 0.087 to 0.041, and threshold optimization was employed to maximize clinical sensitivity. The proposed framework demonstrates that rigorous preprocessing, domain-driven feature engineering, and ensemble optimization can yield clinically useful predictive performance from routine laboratory data in moderate-sized patient cohorts.
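The abstract reports expected calibration error (ECE) before and after probability calibration but does not state how it was computed. As a rough illustration only, the sketch below assumes the common equal-width binning definition with 10 bins: predictions are grouped by predicted probability, and ECE is the sample-weighted mean gap between the observed positive rate and the mean predicted probability in each bin. The function name, the bin count, and the toy data are assumptions for illustration and are not taken from the paper.

```python
from typing import Sequence


def expected_calibration_error(
    y_true: Sequence[int],
    y_prob: Sequence[float],
    n_bins: int = 10,  # assumed bin count; the paper does not specify one
) -> float:
    """Equal-width-bin ECE for binary labels (0/1) and predicted probabilities."""
    if len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must have the same length")
    if len(y_true) == 0:
        raise ValueError("inputs must be non-empty")

    # Per-bin sample counts, positive-label counts, and summed probabilities.
    counts = [0] * n_bins
    positives = [0] * n_bins
    prob_sums = [0.0] * n_bins
    for label, prob in zip(y_true, y_prob):
        # Clamp the index so that prob == 1.0 falls into the last bin.
        idx = min(int(prob * n_bins), n_bins - 1)
        counts[idx] += 1
        positives[idx] += label
        prob_sums[idx] += prob

    total = len(y_true)
    ece = 0.0
    for count, pos, prob_sum in zip(counts, positives, prob_sums):
        if count == 0:
            continue  # empty bins contribute nothing
        observed_rate = pos / count
        mean_predicted = prob_sum / count
        ece += (count / total) * abs(observed_rate - mean_predicted)
    return ece


if __name__ == "__main__":
    # Hypothetical toy data, not from the study.
    labels = [0, 0, 1, 1]
    probs = [0.1, 0.1, 0.9, 0.9]
    print(round(expected_calibration_error(labels, probs), 3))
```

A lower value means predicted probabilities track observed stone frequencies more closely, which is the sense in which the reported drop from 0.087 to 0.041 indicates better calibration. The result depends on the binning scheme, so values computed this way are comparable only when the same scheme is used before and after calibration.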


Published

31.05.2026

How to Cite

Priyamvad Ranjan. (2026). Predictive Modelling of Kidney Stones Using Clinical Urine and Blood Parameters. International Journal of Intelligent Systems and Applications in Engineering, 14(1s), 1676 –. Retrieved from https://www.ijisae.org/index.php/IJISAE/article/view/8400

Section

Research Article