Machine Learning Predictions
of Postoperative Outcomes
in Adult Male Circumcision

Leonid Shpaner, M.S.1
Giuseppe Saitta, M.D.2

GitHub Repo


Background

  • Male circumcision is one of the world’s most common surgical procedures, with well-established benefits (e.g., reduced infection and inflammation) but variable outcomes in adults due to anatomical variation and comorbidities.
  • Accurate preoperative risk prediction could guide individualized planning and improve patient counseling.
  • We retrospectively analyzed 194 adult circumcisions (Milan, 2023–24) to train logistic regression, random forest, and support vector machine models.

Introduction

  • Adult male circumcision presents unique clinical challenges due to comorbidities and anatomical variability.
  • The procedure offers public health benefits, including reduced infections and STDs.
  • CO₂ laser circumcision has gained popularity for its advantages: less bleeding, faster healing, and shorter recovery.
  • Prior studies (e.g., Leonardi & Saitta) support the clinical efficacy of laser techniques.
  • This study uses supervised machine learning to predict short-term complications based on surgical method.
  • It introduces a data-driven approach for individualized preoperative risk assessment.

Distribution of Data For All Numeric Values

Cohort Inclusion, Preprocessing, and Modeling Flow

Patients Evaluated (n = 202)
  • Excluded: under 18 (n = 8)
Final Cohort: Adult Males ≥ 18 (n = 194; 100% of final sample used)
Data Preprocessing
  • Feature engineering
  • Comorbidity filtering
  • No missing data
Surgical Modality
  • Traditional (n = 132)
  • Laser (n = 62)
Modeling & Evaluation
  • LR, RF, SVM
  • 10-fold CV
  • Balanced class weights
Model Calibration
  • Platt scaling
  • Threshold tuning (β = 1, 2)
Outcome
  • Bleeding, edema, pain, or infection within 7 days

Feature Space

194 patients | 13 columns

  1. Age (years)
  2. BMI
  3. Surgical Technique
  4. Intraop. Blood Loss (ml)
  5. Intraop. Heart Rate (bpm)
  6. Intraop. Pulse Ox (%)
  7. Surgical Time (min)
  8. BMI (Obese)
  9. BMI (Overweight)
  10. BMI (Underweight)
  11. Intraop. SBP
  12. Intraop. DBP
  13. Diabetes

Dataset Preparation

  • All coding done in Python 3.11.11 using the following libraries:
      • Pandas, NumPy, Model_Tuner, Matplotlib, Seaborn, SHAP, and Scikit-Learn
  • Performed modeling via stratified 10-fold cross-validation (i.e. StratifiedKFold with k=10), ensuring each fold preserves our outcome’s class proportions.
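The cross-validation setup described above can be sketched as follows. This is a minimal illustration using scikit-learn directly (the study's actual pipeline used the Model_Tuner library), with a synthetic stand-in for the 194-patient cohort rather than the real data:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for the cohort: 194 rows, 13 features, ~30% event rate.
X, y = make_classification(n_samples=194, n_features=13,
                           weights=[0.70, 0.30], random_state=42)

# Stratified 10-fold CV preserves the outcome's class proportions per fold.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
model = LogisticRegression(class_weight="balanced", max_iter=1000)

scores = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
print(f"Mean AUC across folds: {scores.mean():.3f}")
```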

Age-Related Distributions

Surgical Techniques

Establishing Linear Relationships

Table 1: Continuous Variables With Respect To Surgical Technique
Variable  Mean  SD  Median  Min  Max  Mode  Count  Traditional (n = 132), Mean (SD)  Laser (n = 62), Mean (SD)  P-value
Age_years 43.13 21.88 34.00 18.00 93.00 18.00 194 37.34 (22.29) 55.47 (14.86) < 0.001
BMI 24.07 3.00 23.68 17.34 36.57 21.63 194 24.08 (3.03) 24.06 (2.97) 0.97
Intraop_DBP 82.94 15.87 90.00 10.00 100.00 90.00 194 86.97 (11.12) 74.35 (20.54) < 0.001
Intraop_Mean_Heart_Rate_bpm 75.93 5.68 80.00 60.00 88.00 80.00 194 76.48 (5.66) 74.76 (5.57) 0.05
Intraop_Mean_Pulse_Ox_Percent 96.69 1.78 97.00 91.00 99.00 98.00 194 96.70 (1.80) 96.66 (1.76) 0.87
Intraop_SBP 122.37 10.61 120.00 100.00 150.00 120.00 194 121.82 (11.84) 123.55 (7.26) 0.21
Intraoperative_Blood_Loss_ml 7.90 15.51 0.00 0.00 100.00 0.00 194 11.59 (17.64) 0.03 (0.25) < 0.001
Surgical_Time_min 28.18 5.64 27.50 15.00 40.00 28.00 194 28.70 (5.44) 27.06 (5.94) 0.07
Table 1: Categorical Variables With Respect To Surgical Technique
Variable Count Proportion (%) Traditional (n = 132) Laser (n = 62) P-value
Age Group 194 100.00 132 62 < 0.001
age_group = 18-29 85 43.81 79 (59.85%) 6 (9.68%)
age_group = 60-69 30 15.46 9 (6.82%) 21 (33.87%)
age_group = 30-39 22 11.34 16 (12.12%) 6 (9.68%)
age_group = 70-79 20 10.31 10 (7.58%) 10 (16.13%)
age_group = 50-59 17 8.76 4 (3.03%) 13 (20.97%)
age_group = 40-49 10 5.15 7 (5.30%) 3 (4.84%)
age_group = 80-89 7 3.61 4 (3.03%) 3 (4.84%)
age_group = 90-99 3 1.55 3 (2.27%) 0 (0.00%)
BMI_Category_Obese 194 100.00 132 62 0.9918
BMI_Category_Obese = 0 183 94.33 124 (93.94%) 59 (95.16%)
BMI_Category_Obese = 1 11 5.67 8 (6.06%) 3 (4.84%)
BMI_Category_Overweight 194 100.00 132 62 0.5894
BMI_Category_Overweight = 0 141 72.68 98 (74.24%) 43 (69.35%)
BMI_Category_Overweight = 1 53 27.32 34 (25.76%) 19 (30.65%)
Diabetes 194 100.00 132 62 < 0.001
Diabetes = 0 164 84.54 121 (91.67%) 43 (69.35%)
Diabetes = 1 30 15.46 11 (8.33%) 19 (30.65%)
Complications: Bleeding, Edema, Pain, or Infection 194 100.00 57 (43.18%) 1 (1.61%) < 0.001
Complications = 0 136 70.10 75 (56.82%) 61 (98.39%)
Complications = 1 58 29.90 57 (43.18%) 1 (1.61%)
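The p-values in Table 1 reflect between-group comparisons of continuous and categorical variables. The exact tests used are not stated on the slide, so the sketch below shows two plausible choices: a Welch t-test for a continuous variable (with synthetic values drawn to mimic the reported age means/SDs) and a chi-square test on the diabetes counts actually reported above:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Continuous variable: two-sample comparison (e.g. age by technique).
# Synthetic draws mimicking the reported group means/SDs, for illustration.
age_traditional = rng.normal(37.3, 22.3, size=132)
age_laser = rng.normal(55.5, 14.9, size=62)
t_stat, p_cont = stats.ttest_ind(age_traditional, age_laser, equal_var=False)

# Categorical variable: chi-square on a 2x2 table (diabetes by technique),
# using the counts reported in Table 1.
table = np.array([[121, 11],   # traditional: no diabetes, diabetes
                  [43, 19]])   # laser: no diabetes, diabetes
chi2, p_cat, dof, _ = stats.chi2_contingency(table)
print(f"Welch t-test p = {p_cont:.3g}, chi-square p = {p_cat:.3g}")
```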

Outcome by Risk Factors and Surgical Techniques

Table 2. Hyperparameter Tuning Configurations

Logistic Regression (Linear Classifier)

  • Resampling methods tried: None, SMOTE, ROS
  • Hyperparameters tuned: Penalty (penalty): L2; Inverse Regularization Strength (C): 0.0001, 1.0
  • Final configuration: resampling = SMOTE, penalty = L2, C = 1.0

Random Forest (Ensemble Classifier)

  • Resampling methods tried: None, SMOTE, ROS
  • Hyperparameters tuned: Number of Estimators (n_estimators): 10, 50; Maximum Depth (max_depth): None, 10; Minimum Samples Split (min_samples_split): 2, 5
  • Final configuration: resampling = SMOTE, n_estimators = 50, max_depth = None, min_samples_split = 5

Support Vector Machines (Kernel-based Classifier)

  • Resampling methods tried: None, SMOTE, ROS
  • Hyperparameters tuned: Kernel Type (kernel): linear, rbf, poly, sigmoid; Cost Parameter (C): 0.0001 to 100 (log scale); Gamma (gamma): 0.001, 0.01, 0.05, 0.1, 0.2, 0.5, scale, auto
  • Final configuration: resampling = None, kernel = rbf, C = 100, gamma = auto
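A grid search over the SVM configuration in Table 2 can be sketched with plain scikit-learn. Note the assumptions: the study tuned with Model_Tuner and also searched over resampling methods (SMOTE/ROS, which require the separate imbalanced-learn package); this sketch omits resampling, uses a reduced grid for brevity, and runs on synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=194, n_features=13,
                           weights=[0.70, 0.30], random_state=42)

# Subset of the Table 2 grid (full grid would include more C/gamma values).
param_grid = {
    "svc__kernel": ["linear", "rbf", "poly", "sigmoid"],
    "svc__C": [0.01, 1.0, 100],
    "svc__gamma": [0.1, "scale", "auto"],
}
pipe = Pipeline([("scale", StandardScaler()),
                 ("svc", SVC(probability=True, class_weight="balanced"))])

search = GridSearchCV(
    pipe, param_grid,
    cv=StratifiedKFold(n_splits=10, shuffle=True, random_state=42),
    scoring="roc_auc", n_jobs=-1)
search.fit(X, y)
print(search.best_params_)
```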

Model Calibration
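The two calibration steps named in the pipeline (Platt scaling, then Fβ-based threshold tuning) can be sketched as below. This is an illustrative scikit-learn version on synthetic data, not the study's exact code; Platt scaling corresponds to `CalibratedClassifierCV(method="sigmoid")`:

```python
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.metrics import brier_score_loss, fbeta_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=194, n_features=13,
                           weights=[0.70, 0.30], random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=42)

# Platt scaling: fit a sigmoid to cross-validated SVM decision scores.
calibrated = CalibratedClassifierCV(SVC(kernel="rbf", C=100, gamma="auto"),
                                    method="sigmoid", cv=5)
calibrated.fit(X_tr, y_tr)
proba = calibrated.predict_proba(X_te)[:, 1]
print(f"Brier score: {brier_score_loss(y_te, proba):.3f}")

# Threshold tuning: pick the cutoff maximizing F-beta (beta=2 favors recall).
thresholds = np.linspace(0.05, 0.95, 19)
f2 = [fbeta_score(y_te, proba >= t, beta=2, zero_division=0)
      for t in thresholds]
best = thresholds[int(np.argmax(f2))]
print(f"Best F2 threshold: {best:.2f}")
```

Tuning the operating threshold on calibrated probabilities (rather than the default 0.5) is what yields the model-specific thresholds reported in the performance table.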

Performance Assessment

Metrics (Point Estimate, 95% CI) Logistic Regression Random Forest Classifier Support Vector Machines
Precision/PPV 0.571 (0.469–0.663) 0.706 (0.596–0.810) 0.725 (0.623–0.826)
Average Precision 0.809 (0.704–0.899) 0.737 (0.613–0.875) 0.832 (0.735–0.913)
Sensitivity/Recall 0.897 (0.817–0.966) 0.828 (0.719–0.917) 0.862 (0.765–0.939)
Specificity 0.713 (0.632–0.783) 0.853 (0.786–0.910) 0.860 (0.797–0.915)
F1-Score 0.698 (0.603–0.778) 0.762 (0.667–0.837) 0.787 (0.706–0.857)
AUC ROC 0.900 (0.849–0.943) 0.887 (0.826–0.940) 0.907 (0.855–0.950)
Brier Score 0.137 (0.117–0.160) 0.105 (0.077–0.136) 0.105 (0.077–0.134)
Threshold (Point Estimate) 0.439 0.318 0.238
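The 95% confidence intervals above are per-metric interval estimates; the slide does not state how they were computed, but a percentile bootstrap over patients is one common approach, sketched here on illustrative (not study) predictions:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)

# Illustrative outcomes and predicted probabilities for 194 "patients".
y_true = rng.integers(0, 2, size=194)
y_prob = np.clip(y_true * 0.4 + rng.normal(0.3, 0.2, size=194), 0, 1)

# Percentile bootstrap: resample patients with replacement, recompute AUC.
aucs = []
for _ in range(2000):
    idx = rng.integers(0, len(y_true), size=len(y_true))
    if len(np.unique(y_true[idx])) < 2:   # need both classes in a resample
        continue
    aucs.append(roc_auc_score(y_true[idx], y_prob[idx]))
lo, hi = np.percentile(aucs, [2.5, 97.5])
print(f"AUC {roc_auc_score(y_true, y_prob):.3f} (95% CI {lo:.3f}-{hi:.3f})")
```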

Support Vector Machines (SVM)

\(\displaystyle \min_{w,b,\{\xi_i\}}\) \(\displaystyle \left(\tfrac12\|w\|^2 + C\sum_i \xi_i\right)\) \(\mathrm{s.t.}\) \(\displaystyle y_i\left(w^\top\phi(x_i)+b\right)\ge1-\xi_i,\) \(\displaystyle \xi_i\ge0\)
\(\displaystyle K\left(x,x'\right)=\) \(\displaystyle \exp\!\left(-\gamma\,\|x - x'\|^2\right),\) \(\displaystyle \gamma = \tfrac{1}{2\sigma^2}\)
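The RBF kernel formula above can be verified numerically against scikit-learn's implementation (the kernel actually used by the final SVM configuration):

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

# Verify K(x, x') = exp(-gamma * ||x - x'||^2) by hand vs. scikit-learn.
x = np.array([[1.0, 2.0, 3.0]])
x_prime = np.array([[2.0, 0.0, 1.0]])
gamma = 0.1

manual = np.exp(-gamma * np.sum((x - x_prime) ** 2))   # ||x - x'||^2 = 9
via_sklearn = rbf_kernel(x, x_prime, gamma=gamma)[0, 0]
print(manual, via_sklearn)  # both ≈ 0.4066
```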

SHAP (SHapley Additive exPlanations)
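In practice SHAP values are computed with the shap library (reference 2). To show the idea without depending on that package, the sketch below uses the analytic special case: for a linear model with independent features, the SHAP value of feature j is φⱼ = wⱼ(xⱼ − E[xⱼ]), and the values satisfy local accuracy (base value + Σφⱼ = prediction):

```python
import numpy as np

rng = np.random.default_rng(0)

# Linear model f(x) = w.x + b on synthetic data; with independent features,
# the SHAP value of feature j for instance x is w_j * (x_j - mean(x_j)).
X = rng.normal(size=(194, 4))
w = np.array([0.8, -0.5, 0.0, 1.2])
b = 0.3
f = X @ w + b

x = X[0]                        # instance to explain
phi = w * (x - X.mean(axis=0))  # per-feature SHAP values

# Local accuracy: base value + sum of SHAP values recovers the prediction.
base = f.mean()
print(np.isclose(base + phi.sum(), x @ w + b))  # True
```

Note that a feature with zero weight gets a zero SHAP value, which is why blood loss and surgical technique dominating the attributions (Conclusions) is meaningful.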

Conclusions

  • Promising Clinical Signals:
    Even with a relatively modest dataset, the support vector machine (SVM) model uncovered a few compelling and clinically aligned trends:
    • Laser circumcision was consistently associated with lower predicted complication risk, matching our clinical impressions.
    • Intraoperative blood loss and surgical technique emerged as the most influential features across all patients, validating their relevance in predicting postoperative outcomes.
    • High-risk predictions tended to cluster around patients with traditional technique, older age, diabetes, and higher BMI, which is consistent with established risk profiles.
    • The model also demonstrated strong calibration (low Brier scores), meaning its risk estimates appear reliable, not just accurate in terms of classification.
    • Importantly, the model was highly sensitive to complication risk in traditional cases, showing excellent alignment with expectations.

References

  1. Leonardi R, Saitta G. Laser Circumcision in Adult Males: A Modern Approach for Improved Outcomes. In: Surgical Advances in Urology. IntechOpen; 2022. https://doi.org/10.5772/intechopen.106084
  2. Lundberg SM, Lee S-I. A unified approach to interpreting model predictions. Advances in Neural Information Processing Systems. 2017;30:4765–4774.
  3. Demas CP, Khan S, Mandava SH, et al. The effect of diabetes on postoperative outcomes following male urethral sling placement. Can Urol Assoc J. 2016;10(7–8):E251–E254. https://doi.org/10.5489/cuaj.3613
  4. Talini C, Antunes LA, de Carvalho BCN, et al. Circumcision: postoperative complications that required reoperation. Einstein (São Paulo). 2018;16(3):eAO4241. https://doi.org/10.1590/S1679-45082018AO4241
  5. Van Calster B, McLernon DJ, van Smeden M, et al. Calibration: The Achilles heel of predictive analytics. BMC Med. 2019;17:230. https://doi.org/10.1186/s12916-019-1466-7
  6. Funnell A, Shpaner L, Petousis P. Model Tuner (v0.0.31b) [Software]. Zenodo. https://doi.org/10.5281/zenodo.12727322
  7. Shpaner L, Gil O. EDA Toolkit (v0.0.16) [Software]. Zenodo. https://doi.org/10.5281/zenodo.13162633
  8. Shpaner L. Model Metrics (v0.0.3a) [Software]. Zenodo. https://doi.org/10.5281/zenodo.14879819
  9. Harris CR, Millman KJ, van der Walt SJ, et al. Array programming with NumPy. Nature. 2020;585(7825):357–362. https://doi.org/10.1038/s41586-020-2649-2
  10. Pedregosa F, Varoquaux G, Gramfort A, et al. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research. 2011;12:2825–2830. https://jmlr.csail.mit.edu/papers/v12/pedregosa11a.html
Thank You

Questions?

Leonid Shpaner, M.S.
email: Lshpaner@ucla.edu

Giuseppe Saitta, M.D.
email: gsaitta@hotmail.it