PROJECT RADAR: Performance Indicators During Nerve-sparing Robotic Radical Prostatectomy Simulation
By: Leelakrishna Channa, BS, University of Connecticut School of Medicine, Farmington; Jared Bieniek, MD, Tallwood Urology & Kidney Institute, Hartford HealthCare, Connecticut | Posted on: 03 Aug 2023
Nerve-sparing (NS) robot-assisted radical prostatectomy (RARP) is a highly technical procedure that must strike a delicate balance between maximal preservation of the neurovascular bundle (NVB) and achievement of negative tumor margins. Postoperative potency recovery rates following NS-RARP among skilled surgeons range between 8% and 49% regardless of preservation technique.1-3 With 288,300 new prostate cancer diagnoses projected in the United States for 2023,4-5 focused work on improving skills assessment and feedback for urologists performing NS-RARP offers the potential to improve performance and thus benefit a significant patient population. Traditional methods of skills assessment using global rating systems like GEARS (Global Evaluative Assessment of Robotic Skills) are subjective, video-based assessments that depend on the reviewer and require the full attention of an expert surgeon for what could amount to hours of time.6 Machine learning (ML) offers a promising alternative approach to classification of technical skills and has successfully classified surgeons in both simulated and clinical settings.7-9 Our team has previously developed and validated a procedural hydrogel simulation model for NS-RARP embedded with integrated force sensors in the simulated NVB capable of measuring retraction and tension on the NVB (Figure 1, A and B).10 During AUA2022 in New Orleans, 50 board-certified urologists with varying experience in robotic surgery each completed the NS portion of our RARP model. Force sensor data from each simulation model were supplemented with video and kinematic data collected directly from the da Vinci console via an integrated data recorder (ie, “black box”).
Kinematic objective performance indicators (OPIs) were calculated, force data were analyzed, videos were annotated for 10 surgical gestures identified through hierarchical task analysis, and compound gesture motifs were input into several deep-learning ML methodologies to evaluate and predict the surgical expertise of users.11 Data analysis used a 3-component Gaussian mixture clustering algorithm to initially segregate participant data into 3 groups according to robotic and RARP-specific caseload: super users (SU), high-volume (HV), and low-volume (LV; Figure 2). OPIs fell into 3 broad categories: robotic efficiency, tool motion, and operative fluidity. A significant difference (P < .05) across groups was seen in 31 OPIs: 8 across all 3 groups, 10 when comparing SU vs HV and HV vs LV, and 6 when comparing HV vs LV only (Figure 2). Force metrics showed significant differences when comparing SU and HV to LV (Figure 3). A significant negative correlation was found between force metrics and robotic volume, RARP volume, and 13 OPIs: 5 robotic efficiency, 4 tool motion, and 4 operative fluidity OPIs (Figure 3). Following initial analysis, all 3 data sources were combined to generate supervised classification algorithms for prediction of surgical expertise. The expertise cutoff was based on a literature-derived learning curve showing stabilization of functional RARP outcomes at 250 cases.12 K-nearest neighbors, support vector machine, and logistic regression (LR) architectures were evaluated for predictive capability with both single (ie, force, gestures, or OPIs alone) and combined (eg, force and gestures, gestures and OPIs) data modalities. Initial classification was performed with LR and further optimized via recursive feature elimination.
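The initial unsupervised grouping step can be illustrated with a minimal scikit-learn sketch. The caseload numbers below are synthetic placeholders, not the study's data; the cluster structure (3 Gaussian components over robotic and RARP-specific case counts) mirrors the approach described above.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Hypothetical caseloads per surgeon: [total robotic cases, RARP-specific cases].
# Values are illustrative only, not the study's actual data.
rng = np.random.default_rng(0)
caseloads = np.vstack([
    rng.normal([50, 20], [15, 8], size=(17, 2)),      # low-volume-like surgeons
    rng.normal([400, 150], [60, 30], size=(17, 2)),   # high-volume-like surgeons
    rng.normal([1500, 700], [200, 90], size=(16, 2)), # super-user-like surgeons
])

# 3-component Gaussian mixture, mirroring the study's SU/HV/LV segregation.
gmm = GaussianMixture(n_components=3, random_state=0).fit(caseloads)
labels = gmm.predict(caseloads)  # component assignment per surgeon
```

On well-separated synthetic caseloads like these, the mixture recovers 3 distinct groups; on real data, component membership would then be mapped to the SU/HV/LV labels by inspecting the fitted means.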
Performance was evaluated via accuracy, precision, recall, F-score, and receiver operating characteristic area under the curve (AUC) scores, and supplemented with variable permutation importance calculations to assess individual variable contributions to model decision-making.
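The evaluation suite above maps directly onto standard scikit-learn utilities. This sketch uses synthetic features as stand-ins for the combined force/gesture/OPI data (all names and values here are illustrative assumptions), computing the same five metrics plus permutation importance.

```python
import numpy as np
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split

# Synthetic stand-ins for combined force/gesture/OPI features (illustrative only).
rng = np.random.default_rng(1)
X = rng.normal(size=(120, 6))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=120) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = LogisticRegression().fit(X_tr, y_tr)
pred = clf.predict(X_te)
proba = clf.predict_proba(X_te)[:, 1]

metrics = {
    "accuracy": accuracy_score(y_te, pred),
    "precision": precision_score(y_te, pred),
    "recall": recall_score(y_te, pred),
    "f1": f1_score(y_te, pred),
    "auc": roc_auc_score(y_te, proba),
}

# Permutation importance: score drop when each feature is shuffled,
# quantifying that feature's contribution to model decision-making.
imp = permutation_importance(clf, X_te, y_te, n_repeats=20, random_state=0)
```

Features carrying real signal (here, the first column by construction) show a clear positive mean importance, which is how a Figure 4-style ranking of contributory variables would be produced.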
For single-modality analysis, LR performed best with force (64% accuracy, 0.64 AUC), and support vector machines performed best with gestures (84% accuracy, 0.94 AUC) and OPIs (80% accuracy, 0.89 AUC). LR outperformed the other architectures in accuracy (80%, 82%, 85%, 84%) and AUC (0.90, 0.93, 0.94, 0.94) for all multimodal combinations of streams (force+OPI, force+gesture, gesture+OPI, force+gesture+OPI). The highest-performing predictive model utilized all 3 modalities and achieved 86% accuracy with an AUC of 0.96, higher than any previously published work in this field. Variable permutation importance for this model placed features from all 3 modalities among the most contributory variables, indicating that OPIs, gestures, and force each contribute meaningfully toward accurate characterization of surgical experience (Figure 4).
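The head-to-head comparison of architectures and the recursive feature elimination step can be sketched as follows. The data are again synthetic placeholders; the structure (cross-validated AUC per model, then RFE pruning of an LR model) follows the workflow described above.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic multimodal feature matrix (illustrative only); columns 0 and 2
# carry the expert/non-expert signal by construction.
rng = np.random.default_rng(2)
X = rng.normal(size=(100, 8))
y = (X[:, 0] - X[:, 2] > 0).astype(int)

# Cross-validated AUC for the 3 architectures evaluated in the study.
models = {
    "knn": KNeighborsClassifier(),
    "svm": SVC(),
    "lr": LogisticRegression(),
}
scores = {name: cross_val_score(m, X, y, cv=5, scoring="roc_auc").mean()
          for name, m in models.items()}

# Recursive feature elimination: iteratively drop the weakest LR coefficients.
rfe = RFE(LogisticRegression(), n_features_to_select=4).fit(X, y)
selected = rfe.support_  # boolean mask of retained features
```

On real data, the retained feature mask would indicate which force, gesture, and OPI variables survive pruning, and the per-model AUCs would reproduce the comparison table reported above.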
This implementation of ML algorithms for highly accurate classification of surgical expertise offers promising implications for future objective assessment of overall proficiency within the NS portion of the RARP procedure. Our simulated platform offered the advantage of a standardized environment free of confounding patient factors but lacked true clinical data. We aim to apply this methodology to clinical performance data directly linked to potency outcomes, identifying the expert surgical patterns in NS-RARP most predictive of postoperative potency recovery.
1. Kowalczyk KJ, Huang AC, Hevelone ND, et al. Stepwise approach for nerve sparing without countertraction during robot-assisted radical prostatectomy: technique and outcomes. Eur Urol. 2011;60(3):536-547.
2. Vickers A, Savage C, Bianco F, et al. Cancer control and functional outcomes after radical prostatectomy as markers of surgical quality: analysis of heterogeneity between surgeons at a single cancer center. Eur Urol. 2011;59(3):317-322.
3. Capogrosso P, Vertosick EA, Benfante NE, et al. Are we improving erectile function recovery after radical prostatectomy? Analysis of patients treated over the last decade. Eur Urol. 2019;75(2):221-228.
4. Siegel RL, Miller KD, Wagle NS, Jemal A. Cancer statistics, 2023. CA Cancer J Clin. 2023;73(1):17-48.
5. Dehn T. Prostate cancer treatment. Ann R Coll Surg Engl. 2006;88(5):439-444.
6. Goh AC, Goldfarb DW, Sander JC, Miles BJ, Dunkin BJ. Global evaluative assessment of robotic skills: validation of a clinical assessment tool to measure robotic surgical skills. J Urol. 2012;187(1):247-252.
7. Lam K, Chen J, Wang Z, et al. Machine learning for technical skill assessment in surgery: a systematic review. NPJ Digit Med. 2022;5(1):24.
8. Hung AJ, Oh PJ, Chen J, et al. Experts vs super-experts: differences in automated performance metrics and clinical outcomes for robot-assisted radical prostatectomy. BJU Int. 2019;123(5):861-868.
9. Hung AJ, Chen J, Ghodoussipour S, et al. A deep-learning model using automated performance metrics and clinical features to predict urinary continence recovery after robot-assisted radical prostatectomy. BJU Int. 2019;124(3):487-495.
10. Witthaus MW, Farooq S, Melnyk R, et al. Incorporation and validation of clinically relevant performance metrics of simulation (CRPMS) into a novel full-immersion simulation platform for nerve-sparing robot-assisted radical prostatectomy (NS-RARP) utilizing three-dimensional printing and hydrogel casting technology. BJU Int. 2020;125(2):322-332.
11. Vedula SS, Malpani AO, Tao L, et al. Analysis of the structure of surgical activity for a suturing and knot-tying task. PLoS One. 2016;11(3):e0149174.
12. Secin FP, Savage C, Abbou C, et al. The learning curve for laparoscopic radical prostatectomy: an international multicenter study. J Urol. 2010;184(6):2291-2296.