Machine Learning Predictions
of Postoperative Outcomes
in Adult Male Circumcision

Leonid Shpaner, M.S.1
Giuseppe Saitta, M.D.2

GitHub Repo


Background

  • Male circumcision is one of the world’s most common surgical procedures, with well-established benefits (e.g., reduced infection and inflammation) but variable outcomes in adults due to anatomical variation and comorbidities.
  • Accurate preoperative risk prediction could guide individualized planning and improve patient counseling.
  • We retrospectively analyzed 194 adult circumcisions (Milan, 2023–24) to train logistic regression, random forest, and support vector machine models.

Introduction

  • Adult male circumcision presents unique clinical challenges due to comorbidities and anatomical variability.
  • The procedure offers public health benefits, including reduced infections and STDs.
  • CO₂ laser circumcision has gained popularity for its advantages: less bleeding, faster healing, and shorter recovery.
  • Prior studies (e.g., Leonardi & Saitta) support the clinical efficacy of laser techniques.
  • This study uses supervised machine learning to predict short-term complications based on surgical method.
  • It introduces a data-driven approach for individualized preoperative risk assessment.

Distribution of Data For All Numeric Values

Cohort Inclusion, Preprocessing, and Modeling Flow

Patients Evaluated (n = 202)
  • Excluded: under 18 (n = 8)
Final Cohort: Adult Males ≥ 18 (n = 194; 100% of final sample used)
Data Preprocessing
  • Feature engineering
  • Comorbidity filtering
  • No missing data
Surgical Modality
  • Traditional (n = 132)
  • Laser (n = 62)
Modeling & Evaluation
  • LR, RF, SVM
  • 10-fold CV
  • Balanced class weights
Model Calibration
  • Platt scaling
  • Threshold tuning (β = 1, 2)
Outcome
  • Bleeding, edema, pain, or infection within 7 days

Feature Space

194 patients | 13 columns

  1. Age (years)
  2. BMI
  3. Surgical Technique
  4. Intraop. Blood Loss (ml)
  5. Intraop. Heart Rate (bpm)
  6. Intraop. Pulse Ox (%)
  7. Surgical Time (min)
  8. BMI (Obese)
  9. BMI (Overweight)
  10. BMI (Underweight)
  11. Intraop. SBP
  12. Intraop. DBP
  13. Diabetes

Dataset Preparation

  • All coding done in Python 3.11.11 using the following libraries:
      • Pandas, NumPy, Model_Tuner, Matplotlib, Seaborn, SHAP, and Scikit-Learn
  • Performed modeling via stratified 10-fold cross-validation (i.e. StratifiedKFold with k=10), ensuring each fold preserves our outcome’s class proportions.
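The cross-validation setup described above can be sketched as follows. This is a minimal illustration using scikit-learn directly (the study's actual pipeline used the Model_Tuner library), with a synthetic stand-in for the 194-patient cohort rather than the real data:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for the cohort: 194 rows, 13 features, ~30% event rate.
X, y = make_classification(n_samples=194, n_features=13,
                           weights=[0.70, 0.30], random_state=42)

# Stratified 10-fold CV preserves the outcome's class proportions per fold.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
model = LogisticRegression(class_weight="balanced", max_iter=1000)

scores = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
print(f"Mean AUC across folds: {scores.mean():.3f}")
```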

Age-Related Distributions

Surgical Techniques

Establishing Linear Relationships

Table 1: Continuous Variables With Respect To Surgical Technique
Variable  Mean  SD  Median  Min  Max  Mode  Count  Traditional (n = 132), Mean (SD)  Laser (n = 62), Mean (SD)  P-value
Age_years 43.13 21.88 34.00 18.00 93.00 18.00 194 37.34 (22.29) 55.47 (14.86) < 0.001
BMI 24.07 3.00 23.68 17.34 36.57 21.63 194 24.08 (3.03) 24.06 (2.97) 0.97
Intraop_DBP 82.94 15.87 90.00 10.00 100.00 90.00 194 86.97 (11.12) 74.35 (20.54) < 0.001
Intraop_Mean_Heart_Rate_bpm 75.93 5.68 80.00 60.00 88.00 80.00 194 76.48 (5.66) 74.76 (5.57) 0.05
Intraop_Mean_Pulse_Ox_Percent 96.69 1.78 97.00 91.00 99.00 98.00 194 96.70 (1.80) 96.66 (1.76) 0.87
Intraop_SBP 122.37 10.61 120.00 100.00 150.00 120.00 194 121.82 (11.84) 123.55 (7.26) 0.21
Intraoperative_Blood_Loss_ml 7.90 15.51 0.00 0.00 100.00 0.00 194 11.59 (17.64) 0.03 (0.25) < 0.001
Surgical_Time_min 28.18 5.64 27.50 15.00 40.00 28.00 194 28.70 (5.44) 27.06 (5.94) 0.07
Table 1: Categorical Variables With Respect To Surgical Technique
Variable Count Proportion (%) Traditional (n = 132) Laser (n = 62) P-value
Age Group 194 100.00 132 62 < 0.001
age_group = 18-29 85 43.81 79 (59.85%) 6 (9.68%)
age_group = 60-69 30 15.46 9 (6.82%) 21 (33.87%)
age_group = 30-39 22 11.34 16 (12.12%) 6 (9.68%)
age_group = 70-79 20 10.31 10 (7.58%) 10 (16.13%)
age_group = 50-59 17 8.76 4 (3.03%) 13 (20.97%)
age_group = 40-49 10 5.15 7 (5.30%) 3 (4.84%)
age_group = 80-89 7 3.61 4 (3.03%) 3 (4.84%)
age_group = 90-99 3 1.55 3 (2.27%) 0 (0.00%)
BMI_Category_Obese 194 100.00 132 62 0.9918
BMI_Category_Obese = 0 183 94.33 124 (93.94%) 59 (95.16%)
BMI_Category_Obese = 1 11 5.67 8 (6.06%) 3 (4.84%)
BMI_Category_Overweight 194 100.00 132 62 0.5894
BMI_Category_Overweight = 0 141 72.68 98 (74.24%) 43 (69.35%)
BMI_Category_Overweight = 1 53 27.32 34 (25.76%) 19 (30.65%)
Diabetes 194 100.00 132 62 < 0.001
Diabetes = 0 164 84.54 121 (91.67%) 43 (69.35%)
Diabetes = 1 30 15.46 11 (8.33%) 19 (30.65%)
Complications: Bleeding, Edema, Pain, or Infection 194 100.00 57 (43.18%) 1 (1.61%) < 0.001
Complications = 0 136 70.10 75 (56.82%) 61 (98.39%)
Complications = 1 58 29.90 57 (43.18%) 1 (1.61%)
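The p-values in Table 1 reflect between-group comparisons of continuous and categorical variables. The exact tests used are not stated on the slide, so the sketch below shows two plausible choices: a Welch t-test for a continuous variable (with synthetic values drawn to mimic the reported age means/SDs) and a chi-square test on the diabetes counts actually reported above:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Continuous variable: two-sample comparison (e.g. age by technique).
# Synthetic draws mimicking the reported group means/SDs, for illustration.
age_traditional = rng.normal(37.3, 22.3, size=132)
age_laser = rng.normal(55.5, 14.9, size=62)
t_stat, p_cont = stats.ttest_ind(age_traditional, age_laser, equal_var=False)

# Categorical variable: chi-square on a 2x2 table (diabetes by technique),
# using the counts reported in Table 1.
table = np.array([[121, 11],   # traditional: no diabetes, diabetes
                  [43, 19]])   # laser: no diabetes, diabetes
chi2, p_cat, dof, _ = stats.chi2_contingency(table)
print(f"Welch t-test p = {p_cont:.3g}, chi-square p = {p_cat:.3g}")
```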

Outcome by Risk Factors and Surgical Techniques

Table 2. Hyperparameter Tuning Configurations

Logistic Regression (Linear Classifier)

  • Resampling methods tried: None, SMOTE, ROS
  • Hyperparameters tuned: Penalty (penalty): L2; Inverse Regularization Strength (C): 0.0001, 1.0
  • Final configuration: resampling = SMOTE, penalty = L2, C = 1.0

Random Forest (Ensemble Classifier)

  • Resampling methods tried: None, SMOTE, ROS
  • Hyperparameters tuned: Number of Estimators (n_estimators): 10, 50; Maximum Depth (max_depth): None, 10; Minimum Samples Split (min_samples_split): 2, 5
  • Final configuration: resampling = SMOTE, n_estimators = 50, max_depth = None, min_samples_split = 5

Support Vector Machines (Kernel-based Classifier)

  • Resampling methods tried: None, SMOTE, ROS
  • Hyperparameters tuned: Kernel Type (kernel): linear, rbf, poly, sigmoid; Cost Parameter (C): 0.0001 to 100 (log scale); Gamma (gamma): 0.001, 0.01, 0.05, 0.1, 0.2, 0.5, scale, auto
  • Final configuration: resampling = None, kernel = rbf, C = 100, gamma = auto
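A grid search over the SVM configuration in Table 2 can be sketched with plain scikit-learn. Note the assumptions: the study tuned with Model_Tuner and also searched over resampling methods (SMOTE/ROS, which require the separate imbalanced-learn package); this sketch omits resampling, uses a reduced grid for brevity, and runs on synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=194, n_features=13,
                           weights=[0.70, 0.30], random_state=42)

# Subset of the Table 2 grid (full grid would include more C/gamma values).
param_grid = {
    "svc__kernel": ["linear", "rbf", "poly", "sigmoid"],
    "svc__C": [0.01, 1.0, 100],
    "svc__gamma": [0.1, "scale", "auto"],
}
pipe = Pipeline([("scale", StandardScaler()),
                 ("svc", SVC(probability=True, class_weight="balanced"))])

search = GridSearchCV(
    pipe, param_grid,
    cv=StratifiedKFold(n_splits=10, shuffle=True, random_state=42),
    scoring="roc_auc", n_jobs=-1)
search.fit(X, y)
print(search.best_params_)
```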

Model Calibration
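The two calibration steps named in the pipeline (Platt scaling, then Fβ-based threshold tuning) can be sketched as below. This is an illustrative scikit-learn version on synthetic data, not the study's exact code; Platt scaling corresponds to `CalibratedClassifierCV(method="sigmoid")`:

```python
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.metrics import brier_score_loss, fbeta_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=194, n_features=13,
                           weights=[0.70, 0.30], random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=42)

# Platt scaling: fit a sigmoid to cross-validated SVM decision scores.
calibrated = CalibratedClassifierCV(SVC(kernel="rbf", C=100, gamma="auto"),
                                    method="sigmoid", cv=5)
calibrated.fit(X_tr, y_tr)
proba = calibrated.predict_proba(X_te)[:, 1]
print(f"Brier score: {brier_score_loss(y_te, proba):.3f}")

# Threshold tuning: pick the cutoff maximizing F-beta (beta=2 favors recall).
thresholds = np.linspace(0.05, 0.95, 19)
f2 = [fbeta_score(y_te, proba >= t, beta=2, zero_division=0)
      for t in thresholds]
best = thresholds[int(np.argmax(f2))]
print(f"Best F2 threshold: {best:.2f}")
```

Tuning the operating threshold on calibrated probabilities (rather than the default 0.5) is what yields the model-specific thresholds reported in the performance table.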

Performance Assessment

Metrics (Point Estimate, 95% CI) Logistic Regression Random Forest Classifier Support Vector Machines
Precision/PPV 0.571 (0.469–0.663) 0.706 (0.596–0.810) 0.725 (0.623–0.826)
Average Precision 0.809 (0.704–0.899) 0.737 (0.613–0.875) 0.832 (0.735–0.913)
Sensitivity/Recall 0.897 (0.817–0.966) 0.828 (0.719–0.917) 0.862 (0.765–0.939)
Specificity 0.713 (0.632–0.783) 0.853 (0.786–0.910) 0.860 (0.797–0.915)
F1-Score 0.698 (0.603–0.778) 0.762 (0.667–0.837) 0.787 (0.706–0.857)
AUC ROC 0.900 (0.849–0.943) 0.887 (0.826–0.940) 0.907 (0.855–0.950)
Brier Score 0.137 (0.117–0.160) 0.105 (0.077–0.136) 0.105 (0.077–0.134)
Threshold (Point Estimate) 0.439 0.318 0.238
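The 95% confidence intervals above are per-metric interval estimates; the slide does not state how they were computed, but a percentile bootstrap over patients is one common approach, sketched here on illustrative (not study) predictions:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)

# Illustrative outcomes and predicted probabilities for 194 "patients".
y_true = rng.integers(0, 2, size=194)
y_prob = np.clip(y_true * 0.4 + rng.normal(0.3, 0.2, size=194), 0, 1)

# Percentile bootstrap: resample patients with replacement, recompute AUC.
aucs = []
for _ in range(2000):
    idx = rng.integers(0, len(y_true), size=len(y_true))
    if len(np.unique(y_true[idx])) < 2:   # need both classes in a resample
        continue
    aucs.append(roc_auc_score(y_true[idx], y_prob[idx]))
lo, hi = np.percentile(aucs, [2.5, 97.5])
print(f"AUC {roc_auc_score(y_true, y_prob):.3f} (95% CI {lo:.3f}-{hi:.3f})")
```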

Support Vector Machines (SVM)

\(\displaystyle \min_{w,b,\{\xi_i\}}\) \(\displaystyle \left(\tfrac12\|w\|^2 + C\sum_i \xi_i\right)\) \(\mathrm{s.t.}\) \(\displaystyle y_i\left(w^\top\phi(x_i)+b\right)\ge1-\xi_i,\) \(\displaystyle \xi_i\ge0\)
\(\displaystyle K\left(x,x'\right)=\) \(\displaystyle \exp\!\left(-\gamma\,\|x - x'\|^2\right),\) \(\displaystyle \gamma = \tfrac{1}{2\sigma^2}\)
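The RBF kernel formula above can be verified numerically against scikit-learn's implementation (the kernel actually used by the final SVM configuration):

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

# Verify K(x, x') = exp(-gamma * ||x - x'||^2) by hand vs. scikit-learn.
x = np.array([[1.0, 2.0, 3.0]])
x_prime = np.array([[2.0, 0.0, 1.0]])
gamma = 0.1

manual = np.exp(-gamma * np.sum((x - x_prime) ** 2))   # ||x - x'||^2 = 9
via_sklearn = rbf_kernel(x, x_prime, gamma=gamma)[0, 0]
print(manual, via_sklearn)  # both ≈ 0.4066
```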

SHAP (SHapley Additive exPlanations)
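In practice SHAP values are computed with the shap library (reference 2). To show the idea without depending on that package, the sketch below uses the analytic special case: for a linear model with independent features, the SHAP value of feature j is φⱼ = wⱼ(xⱼ − E[xⱼ]), and the values satisfy local accuracy (base value + Σφⱼ = prediction):

```python
import numpy as np

rng = np.random.default_rng(0)

# Linear model f(x) = w.x + b on synthetic data; with independent features,
# the SHAP value of feature j for instance x is w_j * (x_j - mean(x_j)).
X = rng.normal(size=(194, 4))
w = np.array([0.8, -0.5, 0.0, 1.2])
b = 0.3
f = X @ w + b

x = X[0]                        # instance to explain
phi = w * (x - X.mean(axis=0))  # per-feature SHAP values

# Local accuracy: base value + sum of SHAP values recovers the prediction.
base = f.mean()
print(np.isclose(base + phi.sum(), x @ w + b))  # True
```

Note that a feature with zero weight gets a zero SHAP value, which is why blood loss and surgical technique dominating the attributions (Conclusions) is meaningful.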

Conclusions

  • Promising Clinical Signals:
    Even with a relatively modest dataset, the support vector machine (SVM) model uncovered a few compelling and clinically aligned trends:
    • Laser circumcision was consistently associated with lower predicted complication risk, matching our clinical impressions.
    • Intraoperative blood loss and surgical technique emerged as the most influential features across all patients, validating their relevance in predicting postoperative outcomes.
    • High-risk predictions tended to cluster around patients with traditional technique, older age, diabetes, and higher BMI, which is consistent with established risk profiles.
    • The model also demonstrated strong calibration (low Brier scores), meaning its risk estimates appear reliable, not just accurate in terms of classification.
    • Importantly, the model was highly sensitive to complication risk in traditional cases, showing excellent alignment with expectations.

References

  1. Leonardi R, Saitta G. Laser Circumcision in Adult Males: A Modern Approach for Improved Outcomes. In: Surgical Advances in Urology. IntechOpen; 2022. https://doi.org/10.5772/intechopen.106084
  2. Lundberg SM, Lee S-I. A unified approach to interpreting model predictions. Advances in Neural Information Processing Systems. 2017;30:4765–4774.
  3. Demas CP, Khan S, Mandava SH, et al. The effect of diabetes on postoperative outcomes following male urethral sling placement. Can Urol Assoc J. 2016;10(7–8):E251–E254. https://doi.org/10.5489/cuaj.3613
  4. Talini C, Antunes LA, de Carvalho BCN, et al. Circumcision: postoperative complications that required reoperation. Einstein (São Paulo). 2018;16(3):eAO4241. https://doi.org/10.1590/S1679-45082018AO4241
  5. Van Calster B, McLernon DJ, van Smeden M, et al. Calibration: The Achilles heel of predictive analytics. BMC Med. 2019;17:230. https://doi.org/10.1186/s12916-019-1466-7
  6. Funnell A, Shpaner L, Petousis P. Model Tuner (v0.0.31b) [Software]. Zenodo. https://doi.org/10.5281/zenodo.12727322
  7. Shpaner L, Gil O. EDA Toolkit (v0.0.16) [Software]. Zenodo. https://doi.org/10.5281/zenodo.13162633
  8. Shpaner L. Model Metrics (v0.0.3a) [Software]. Zenodo. https://doi.org/10.5281/zenodo.14879819
  9. Harris CR, Millman KJ, van der Walt SJ, et al. Array programming with NumPy. Nature. 2020;585(7825):357–362. https://doi.org/10.1038/s41586-020-2649-2
  10. Pedregosa F, Varoquaux G, Gramfort A, et al. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research. 2011;12:2825–2830. https://jmlr.csail.mit.edu/papers/v12/pedregosa11a.html
Thank You

Questions?

Leonid Shpaner, M.S.
email: Lshpaner@ucla.edu

Giuseppe Saitta, M.D.
email: gsaitta@hotmail.it