Autonomous Soft Tissue Surgical Systems with Learned Dexterity

1. Clinical Need

Approximately 50 million surgical procedures are performed in the United States each year. The Agency for Healthcare Research and Quality estimates the annual cost of preventable adverse events in hospitalized patients at $17.1 billion, with post-surgical complications constituting the largest component (Van Den Bos et al., “The $17.1 Billion Problem: The Annual Cost of Measurable Medical Errors,” Health Affairs, 2011). A study published in Annals of Surgery found that 1 in 10 patients who died within 90 days of surgery did so because of a preventable medical error.

Surgeon-dependent variability is a root cause. In laparoscopic cholecystectomy — the most common intra-abdominal operation in the US, with over 1.2 million cases performed annually (StatPearls, NCBI Bookshelf) — bile duct injury rates have not declined significantly over the past 30 years despite advances in instrumentation. Conversion from laparoscopic to open procedure occurs in 1–10% of cases, increasing hospital charges from an average of $23,946 to $32,446 per case (Giger et al., Surgical Endoscopy, 2012). Suture consistency — spacing, depth, and tension — directly affects anastomotic leak rates and healing outcomes, yet varies substantially between surgeons and across procedures performed by the same surgeon under fatigue.

The fundamental limitation of current teleoperated surgical robots is that they amplify the capabilities of a present, attentive surgeon but cannot compensate for human variability, fatigue, or the global shortage of experienced surgeons. The World Health Organization estimates that 5 billion people lack access to safe, affordable surgical care, with the disparity most acute in regions where surgeon-to-population ratios are 100-fold lower than in high-income countries.

The unmet clinical need is a surgical system capable of executing defined procedural phases — tissue manipulation, suturing, clipping, cutting — autonomously under human supervision, with mechanical consistency that equals or exceeds expert surgeon performance regardless of operator fatigue, geographic location, or institutional surgical volume.

2. State of the Art

Three successive lines of research have established the technical feasibility of Level 4 autonomous surgery: systems that execute surgical tasks independently while a supervising surgeon monitors and can intervene.

Autonomous suturing with near-infrared tracking

The Smart Tissue Autonomous Robot (STAR) program at Johns Hopkins University, led by Axel Krieger (Associate Professor of Mechanical Engineering), has progressed through three generations of autonomous soft tissue surgery systems. The foundational work (Shademan et al., Science Translational Medicine, 2016) demonstrated supervised autonomous intestinal anastomosis in live porcine models using plenoptic 3D imaging and near-infrared fluorescent markers for real-time tissue tracking. The system’s suture consistency — spacing and bite depth — exceeded expert surgeon performance on multiple metrics. The work was supported by NIH NIBIB awards R01EB020610 and R21EB024707.

Autonomous laparoscopic surgery

Saeidi et al. (Science Robotics, 2022) advanced STAR to laparoscopic operation, the clinically dominant approach for abdominal surgery. The system completed intestinal anastomosis in live porcine models with 83% of sutures placed autonomously. The coefficient of variation of suture spacing was 0.08 (autonomous) versus 0.14 (expert manual laparoscopic), with comparable leak pressures. Animals survived one week post-procedure without anastomotic complications. This was the first demonstration of autonomous minimally invasive soft tissue surgery in a living animal model.

Hierarchical imitation learning for multi-step procedures

The SRT-H framework (Krieger et al., Science Robotics, 2025) demonstrated autonomous execution of the clipping-and-cutting phase of laparoscopic cholecystectomy — a multi-step sequence of 17 distinct surgical tasks including duct identification, clip placement, and vessel transection — in ex vivo porcine gallbladders. The system achieved 100% task completion across 8 unseen specimens without human intervention. Trained on approximately 18,000 demonstrations collected from over 30 porcine procedures, SRT-H uses a hierarchical architecture combining a language-conditioned high-level planner with a low-level motion policy, enabling real-time adaptation to anatomical variation and response to spoken corrections during execution.

Gap between research and clinical deployment

No commercial surgical robot operates at Level 4 autonomy. A systematic review (Attanasio et al., npj Digital Medicine, 2024) found that all 53 FDA-cleared surgical robots function at Level 1–3, with zero at Level 4 or Level 5. The da Vinci 5 system, FDA-cleared in 2024, introduces force feedback and enhanced imaging but remains a teleoperated platform requiring continuous surgeon control. The gap between what has been demonstrated in research and what is commercially available is wide — and defined by manufacturing, regulatory, and safety engineering challenges rather than algorithmic limitations.

3. Foundational Research

Shademan A, Decker RS, Opfermann JD, Leonard S, Krieger A, Kim PCW (2016). “Supervised autonomous robotic soft tissue surgery.” Science Translational Medicine, 8(337), 337ra64. DOI: 10.1126/scitranslmed.aad9398. PubMed: 27147588.

First in vivo supervised autonomous soft tissue surgery using the STAR system. Used plenoptic 3D and near-infrared fluorescent (NIRF) imaging to track tissue deformation during intestinal anastomosis in porcine models. The autonomous system produced more consistent suture spacing (coefficient of variation 0.07 vs. 0.15 for expert manual technique) and higher leak pressure tolerance than both manual laparoscopic and robot-assisted approaches. This established that autonomous surgical systems can achieve greater mechanical consistency than expert human performance, which is the quantitative basis for the clinical argument that autonomy improves outcomes rather than merely automating labor.
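The consistency metric quoted above can be illustrated with a minimal sketch. It assumes the coefficient of variation is the sample standard deviation of the gaps between neighbouring sutures divided by their mean; the cited papers may define or measure it differently, and the positions below are hypothetical values, not study data.

```python
from statistics import mean, stdev

def spacing_cov(positions_mm):
    """Coefficient of variation of the gaps between consecutive sutures.

    positions_mm: suture positions along the suture line, in millimetres.
    Assumption: CoV = sample standard deviation of the gaps / mean gap.
    """
    ordered = sorted(positions_mm)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    if len(gaps) < 2:
        raise ValueError("need at least three suture positions")
    return stdev(gaps) / mean(gaps)

# Hypothetical suture positions (illustration only, not measured data).
even = [0.0, 3.0, 6.1, 9.0, 12.1, 15.0]
uneven = [0.0, 2.4, 6.3, 8.9, 13.0, 15.0]

print(f"even spacing CoV:   {spacing_cov(even):.3f}")
print(f"uneven spacing CoV: {spacing_cov(uneven):.3f}")
```

A lower value means more uniform spacing, which is why the metric is useful for comparing autonomous and manual technique on the same footing.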

Saeidi H, Opfermann JD, Kam M, Wei S, Leonard S, Hsieh MH, Kang JU, Krieger A (2022). “Autonomous robotic laparoscopic surgery for intestinal anastomosis.” Science Robotics, 7(62), eabj2908. DOI: 10.1126/scirobotics.abj2908. PubMed: 35080901.

Advanced STAR to laparoscopic operation. The system performed end-to-end intestinal anastomosis in live porcine models, placing 83% of sutures autonomously with the remaining 17% requiring minor human adjustment for needle re-grasping. The coefficient of variation of suture spacing was 0.08 (autonomous) versus 0.14 (expert manual laparoscopic). Anastomoses survived one week in vivo without complication. This transition from open to minimally invasive autonomous surgery is significant because laparoscopic surgery introduces additional constraints (limited workspace, restricted instrument degrees of freedom, and remote manipulation through trocar ports) that magnify the difficulty of autonomous tissue interaction.

Krieger A et al. (2025). “SRT-H: A hierarchical framework for autonomous surgery via language-conditioned imitation learning.” Science Robotics, 10(104), eadt5254. DOI: 10.1126/scirobotics.adt5254. PubMed: 40632876.

Demonstrated autonomous execution of a multi-step surgical phase: clipping and cutting of the cystic duct and artery during laparoscopic cholecystectomy across 17 distinct tasks on 8 unseen ex vivo porcine gallbladders with 100% task completion. The hierarchical architecture combines a language-conditioned high-level planner with a low-level imitation learning policy, trained on 18,000 demonstrations from 34 porcine procedures. The framework accepts spoken corrections during execution. This represents the highest level of demonstrated surgical autonomy in a clinically realistic multi-step procedure — the transition from single-task autonomy (suturing) to multi-task procedural autonomy.
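The two-level structure described above can be sketched as a small control loop. This is not the published SRT-H implementation: the class, the task strings, and the "task_done" observation flag are all hypothetical, and the real high-level planner is a learned language-conditioned model rather than a fixed task list.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

@dataclass
class HierarchicalController:
    """Toy stand-in for a high-level planner driving a low-level policy."""
    tasks: List[str]                            # ordered task instructions (hypothetical)
    low_level: Callable[[str, dict], str]       # (instruction, observation) -> action
    step: int = 0
    log: list = field(default_factory=list)

    def next_instruction(self, correction: Optional[str]) -> Optional[str]:
        # A spoken correction from the supervisor overrides the plan for this tick.
        if correction is not None:
            return correction
        if self.step >= len(self.tasks):
            return None                         # plan finished
        return self.tasks[self.step]

    def tick(self, observation: dict, correction: Optional[str] = None) -> Optional[str]:
        instruction = self.next_instruction(correction)
        if instruction is None:
            return None
        action = self.low_level(instruction, observation)
        self.log.append((instruction, action))
        # Advance only when following the plan and the current task reports completion.
        if correction is None and observation.get("task_done", False):
            self.step += 1
        return action

# Usage with a placeholder low-level policy that just echoes the instruction.
ctrl = HierarchicalController(
    tasks=["grasp gallbladder", "clip cystic duct", "cut cystic duct"],
    low_level=lambda instruction, obs: f"do:{instruction}",
)
print(ctrl.tick({"task_done": False}))
print(ctrl.tick({}, correction="move left"))
```

Keeping the correction path separate from the plan index means a supervisor's instruction can redirect one step without losing the system's place in the procedure.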

Attanasio A, Scaglioni B, De Momi E, Fiorini P, Valdastri P (2024). “Levels of autonomy in FDA-cleared surgical robots: a systematic review.” npj Digital Medicine, 7, 104. DOI: 10.1038/s41746-024-01102-y. PubMed: 38671232.

Analyzed 53 FDA-cleared surgical robotic systems using the six-level autonomy framework (Level 0–5). Results: 86% operate at Level 1 (robot assistance under continuous surgeon control), 14% at Level 2–3 (task or conditional autonomy), and zero at Level 4 or Level 5. Clearance was predominantly via the 510(k) pathway (83%), with a growing number authorized via De Novo. This systematic review quantifies the regulatory and commercial gap: no pathway for Level 4+ has been established, and the first entrant to define it sets the classification standard for all subsequent competitors.

ARPA-H ALISS Program Award (2024). PI: Robert J. Webster III, Vanderbilt University. Award: up to $12 million. Consortium: Vanderbilt, Johns Hopkins, University of Utah, University of Tennessee. Project period: 2024–2027.

ARPA-H’s Autonomy at a Less Invasive Scale in Surgery (ALISS) program targets fully autonomous tumor resection from trachea and prostate within three years, initially in simulated conditions. The program funds placement of Virtuoso Surgical Systems at consortium sites for AI/ML development. This federal investment of up to $12M establishes autonomous surgery as a named federal research priority with explicit commercialization expectations: ARPA-H’s mandate is to accelerate health research toward practical applications, not to fund basic science.

4. Competitive Landscape

Intuitive Surgical (Sunnyvale, CA). Market capitalization exceeding $180 billion. Da Vinci platform holds approximately 80% robotic surgery market share globally. The da Vinci 5, FDA-cleared in 2024, remains Level 1 teleoperation. Intuitive’s business model depends on surgeon operators and per-procedure instrument sales, creating organizational inertia against autonomy that would reduce surgeon dependence.

Medtronic Hugo RAS and CMR Surgical Versius are pursuing Level 1 teleoperation market share through price competition and modular design, not autonomy development.

No commercial entity offers a Level 4 autonomous surgical system. The academic programs demonstrating Level 4 capabilities — Johns Hopkins STAR/SRT-H, Vanderbilt ALISS consortium — are research-stage without commercial translation vehicles. The gap between demonstrated research capability and commercial availability is defined by manufacturing, regulatory, and safety engineering challenges rather than algorithmic limitations.

5. Addressable Scope

Bottom-up calculation (US laparoscopic surgery)

Annual laparoscopic cholecystectomy (US): 1,200,000 (StatPearls, NCBI Bookshelf)
Procedures at robotics-equipped hospitals (~40%): 480,000
Autonomous system per-procedure fee premium: $3,500 (instrument kit + platform usage, at the low end of the current $3,500–$4,500 robotic surgery premium)
Initial SAM (cholecystectomy): 480,000 × $3,500 = $1.68 billion annually
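Expansion to appendectomy (300,000/yr), hernia repair (800,000/yr), colorectal (170,000/yr): 1,270,000 procedures in total, of which +700,000 are assumed at equipped hospitals (about 55%, a higher share than the ~40% applied to cholecystectomy)
Expanded US SAM: (480,000 + 700,000) × $3,500 = $4.13 billion annually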
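The bottom-up figures can be reproduced with a short script, which also makes the underlying assumptions easy to vary. The variable names are illustrative; the inputs are the ones stated in the list above, and the script additionally reports what share of expansion procedures the 700,000 figure implies.

```python
FEE_USD = 3_500            # per-procedure fee premium stated above
EQUIPPED_SHARE = 0.40      # share of cholecystectomies at robotics-equipped hospitals

cholecystectomy = 1_200_000
initial_procedures = round(cholecystectomy * EQUIPPED_SHARE)
initial_revenue = initial_procedures * FEE_USD

# Annual US volumes for the expansion indications, as stated above.
expansion_volumes = {"appendectomy": 300_000, "hernia repair": 800_000, "colorectal": 170_000}
expansion_total = sum(expansion_volumes.values())
expansion_at_equipped = 700_000                      # figure used in the text
implied_share = expansion_at_equipped / expansion_total

expanded_revenue = (initial_procedures + expansion_at_equipped) * FEE_USD

print(f"initial:  {initial_procedures:,} procedures -> ${initial_revenue / 1e9:.2f}B")
print(f"expansion share implied by 700,000: {implied_share:.0%} of {expansion_total:,}")
print(f"expanded: ${expanded_revenue / 1e9:.2f}B")
```

Changing EQUIPPED_SHARE or FEE_USD shows how sensitive the estimate is to those two assumptions, which are the least certain inputs.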

Top-down cross-check

The global surgical robotics market was valued at $8.31 billion in 2025 and is projected to reach $12.83 billion by 2030 at a 9.07% CAGR (Mordor Intelligence, 2025). If autonomous surgical systems capture 15–20% of the 2030 market, that share is worth $1.9–$2.6 billion, which falls between the initial ($1.68 billion) and expanded ($4.13 billion) bottom-up estimates.
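A quick check of the top-down arithmetic, assuming simple annual compounding over the five years from 2025 to 2030; the market values and growth rate are the ones cited above, and the variable names are illustrative.

```python
market_2025_busd = 8.31      # USD billions, as cited
cagr = 0.0907
years = 5                    # 2025 -> 2030

# Compound growth: value_n = value_0 * (1 + rate) ** n
market_2030_busd = market_2025_busd * (1 + cagr) ** years

stated_2030_busd = 12.83
share_low, share_high = 0.15, 0.20
capture_low = share_low * stated_2030_busd
capture_high = share_high * stated_2030_busd

print(f"projected 2030 market: ${market_2030_busd:.2f}B")
print(f"15-20% capture: ${capture_low:.2f}B to ${capture_high:.2f}B")
```

The compounded value agrees with the cited 2030 projection to within rounding, so the capture range follows directly from the two share assumptions.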

Reimbursement pathway

Robotic-assisted laparoscopic cholecystectomy is reimbursed under CPT 47562 (laparoscopic cholecystectomy, Medicare average: $652) and CPT 47563 (with cholangiography, $709). No separate code exists for robotic assistance. Autonomous systems would initially bill under the same codes with potential for a new technology add-on payment (NTAP), transitioning to dedicated reimbursement codes as utilization data accumulates.

6. Research Gaps and HHA Contribution

Three specific gaps separate published results from a deployable autonomous surgical product. Each maps to a specific HHA team member’s expertise.

Gap 1: Regulatory strategy for Level 4 autonomous surgical systems

No regulatory precedent exists for Level 4 surgical autonomy. All 53 FDA-cleared surgical robots operate at Level 1–3 (Attanasio et al., 2024). The FDA’s 2023 AI/ML guidance addresses algorithm updates but does not specifically address surgical autonomy levels. The entity that establishes the regulatory template — whether via De Novo classification, PMA, or a novel framework — sets the standard all subsequent entrants must follow.

HHA contribution: Hass Dhia’s biomedical sciences background and experimental design expertise enable framing autonomous surgery in clinical language that FDA reviewers expect. The regulatory submission requires defining clinical endpoints (anastomotic integrity, complication rates, time-to-recovery) that map engineering metrics (suture spacing CoV, force profiles) to patient outcomes — a translation between engineering and clinical domains that requires deep understanding of both surgical physiology and AI system behavior.

Why the originating labs haven’t closed this gap: Krieger’s lab at Johns Hopkins is a mechanical engineering research group. Their publications advance the algorithmic and systems engineering frontier, but they have neither regulatory affairs staff nor the institutional mandate to pursue FDA submissions. Universities publish papers; companies file regulatory submissions. The ALISS consortium explicitly targets research demonstration, not commercialization.

Gap 2: Sensor-rich end-effectors manufactured at clinical volumes

SRT-H uses custom instrumentation designed for research environments. Clinical deployment of 150,000–480,000 procedures per year requires single-use or limited-reuse sterile instruments manufactured with consistent quality under ISO 13485. The engineering challenge includes material selection for sterilization compatibility, sensor integration in a cost-constrained form factor, and automated assembly and packaging.

HHA contribution: Ahmed’s manufacturing engineering expertise directly addresses the lab-to-production transition. Design for manufacturability analysis of research instrument architectures identifies production-incompatible design choices before they are locked in. Tolerance analysis ensures sensor placement consistency across production volumes. Process validation establishes manufacturing repeatability for quality system compliance.
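As an illustration of the tolerance analysis mentioned above, the sketch below compares a worst-case stack-up with a root-sum-square (RSS) estimate for a chain of dimensions that together locate a sensor. The component names and tolerance values are hypothetical, and RSS assumes independent, roughly normally distributed variation.

```python
import math

def stack_up(tolerances_mm):
    """Return (worst_case, rss) for a list of symmetric +/- tolerances in mm."""
    worst_case = sum(abs(t) for t in tolerances_mm)
    rss = math.sqrt(sum(t * t for t in tolerances_mm))
    return worst_case, rss

# Hypothetical contributors to sensor placement error on an instrument jaw.
contributors = {
    "jaw machining": 0.05,
    "sensor die placement": 0.03,
    "adhesive bond line": 0.02,
    "housing fit": 0.04,
}

worst, rss = stack_up(list(contributors.values()))
print(f"worst case: +/-{worst:.3f} mm, RSS: +/-{rss:.3f} mm")
```

Worst case guarantees every assembly is in tolerance but drives up part cost; RSS is the usual basis for volume production when the process is under statistical control.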

Why the originating labs haven’t closed this gap: Academic robotics labs hand-fabricate instruments. Manufacturing at clinical volumes requires industrial engineering capabilities — automated assembly lines, incoming material inspection, statistical process control — that are outside the scope and expertise of mechanical engineering research groups. This is a manufacturing problem, not a research problem.

Gap 3: Safety architectures for human-supervised autonomous operation

Current demonstrations use research-grade safety monitoring. Clinical deployment requires validated failure detection, graceful degradation to human control, real-time surgical scene understanding, and audit trails satisfying FDA quality system requirements. The safety architecture must be demonstrated across statistically significant patient populations and validated against defined failure modes.
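One way to picture graceful degradation with an audit trail is a small mode state machine. This is a sketch under assumptions, not a validated safety design: the three modes, the confidence floor, and the rule that only the surgeon can resume autonomy are illustrative choices.

```python
from enum import Enum

class Mode(Enum):
    AUTONOMOUS = "autonomous"
    PAUSED = "paused"      # robot holds position and waits for the surgeon
    MANUAL = "manual"      # surgeon has direct control

class SupervisedAutonomy:
    def __init__(self, confidence_floor=0.9):
        self.mode = Mode.AUTONOMOUS
        self.confidence_floor = confidence_floor   # hypothetical threshold
        self.audit_log = []                        # every transition is recorded

    def _transition(self, new_mode, reason, t):
        self.audit_log.append(
            {"t": t, "from": self.mode.value, "to": new_mode.value, "reason": reason}
        )
        self.mode = new_mode

    def update(self, t, confidence, fault=False,
               surgeon_takeover=False, surgeon_resume=False):
        healthy = not fault and confidence >= self.confidence_floor
        if surgeon_takeover and self.mode is not Mode.MANUAL:
            # A takeover request always wins over any other input.
            self._transition(Mode.MANUAL, "surgeon takeover", t)
        elif self.mode is Mode.AUTONOMOUS and not healthy:
            self._transition(Mode.PAUSED, "fault" if fault else "low confidence", t)
        elif self.mode is not Mode.AUTONOMOUS and surgeon_resume and healthy:
            # Autonomy never resumes on its own, only on explicit surgeon approval.
            self._transition(Mode.AUTONOMOUS, "surgeon resume", t)
        return self.mode
```

Because recovery requires both a healthy system and an explicit resume command, a transient confidence dip cannot silently return control to the autonomous policy.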

HHA contribution: Haedar Hadi’s ML expertise maps to the safety-critical algorithm design: anomaly detection (out-of-distribution anatomy identification), confidence calibration (knowing when the system’s uncertainty exceeds safe operating thresholds), and evaluation methodology for comparing autonomous versus surgeon-performed procedures. Building robust evaluation frameworks for safety-critical AI systems is a systems engineering challenge that requires both ML depth and benchmark design rigor.
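Confidence calibration is commonly summarized with the expected calibration error (ECE), which compares stated confidence with observed accuracy within confidence bins. The sketch below uses equal-width bins and hypothetical inputs; it is one standard metric, not necessarily the one the team would adopt.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE with equal-width bins over [0, 1].

    confidences: predicted probabilities that each decision is right.
    correct: booleans, whether each decision was actually right.
    """
    bins = [[] for _ in range(n_bins)]
    for c, ok in zip(confidences, correct):
        idx = min(int(c * n_bins), n_bins - 1)   # a confidence of 1.0 goes in the top bin
        bins[idx].append((c, ok))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(avg_conf - accuracy)
    return ece

# Hypothetical example: the system reports 95% confidence but is right 3 times in 4.
print(expected_calibration_error([0.95] * 4, [True, True, True, False]))
```

An overconfident system shows up as a large gap between confidence and accuracy in the high-confidence bins, which is exactly the region that matters when setting a handoff threshold.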

Why the originating labs haven’t closed this gap: Research demonstrations operate under controlled conditions with researchers monitoring every step. The transition from controlled research to robust clinical operation requires systematic failure mode analysis, adversarial testing, and formal verification methods that are engineering disciplines separate from the algorithm research that produced the demonstrated capabilities.

7. Comparable Funded Projects

Program	PI / Institution	Amount	Year
ARPA-H ALISS	Robert J. Webster III, Vanderbilt University	Up to $12M	2024
NIH NIBIB R01EB020610	Axel Krieger, Johns Hopkins University	R01 (est. $1.5–2.5M)	2016–
NIH NIBIB R21EB024707	Axel Krieger, Johns Hopkins University	R21 (est. $275K)	2017–
NSF NRI-3.0	Multiple PIs, multiple institutions	Multiple awards	2021–

The ARPA-H ALISS award is the defining signal: a federal commitment of up to $12M specifically naming “Autonomy at a Less Invasive Scale in Surgery” validates autonomous surgery as a funded research priority with explicit clinical translation expectations. ARPA-H’s mandate, distinct from NIH’s basic science mission, is to accelerate health research toward practical deployment. The ALISS program targets demonstration within 3 years, a compressed timeline that creates urgency for complementary commercialization-focused efforts.

NIH NIBIB’s support of the STAR program through both an R01 (est. $1.5–2.5M, from 2016) and an R21 (exploratory, est. $275K, from 2017) reflects sustained institutional confidence in autonomous surgery as a viable research direction. The R01 funded the work that produced three landmark publications in Science Translational Medicine and Science Robotics.

8. Opportunity Assessment

TRL evidence chain: TRL 4 for the integrated system. Autonomous suturing: TRL 5 (in vivo, live porcine, Saeidi 2022). Multi-step procedural autonomy: TRL 4 (ex vivo, 100% completion, SRT-H 2025). Manufacturing-ready instrumentation: TRL 2–3. Safety architecture: TRL 2. Rate-limiting component: manufacturing and safety engineering, not algorithms.

Risk	Level	Mitigation
Anatomical variation beyond training distribution (adhesions, cystic duct variants in 12–18% of patients)	Moderate	Initial deployment restricted to elective cases screened by preoperative imaging. Anomaly detection flags out-of-distribution anatomy for human takeover. Go/no-go at M12: if anomaly detector achieves <2% false negative rate on a held-out set of 200 variant anatomy cases, proceed to expanded indication.
Regulatory uncertainty for Level 4 classification (no precedent; could require PMA)	Moderate-high	Pre-submission with CDRH framing system as “surgeon-supervised autonomous assistance” (analogous to Level 2–3 ADAS in automotive). ARPA-H endorsement strengthens narrative. Locked algorithm for initial submission; PCCP for adaptive capability via supplement.
Manufacturing cost for sensor-rich single-use instruments	Moderate	Initial launch as limited-reuse instruments (validated 10 procedures). Component cost reduction via ASIC integration, roll-to-roll sensor fabrication, automated assembly. DFM analysis begins month 1.
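The M12 go/no-go criterion in the first row of the table can be made concrete. The sketch below evaluates the false negative rate on a held-out set and, as an added assumption not stated in the table, also reports a one-sided 95% exact (Clopper-Pearson) upper bound so that sampling uncertainty on 200 cases is visible. Function names are illustrative.

```python
from math import comb

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))

def fnr_upper_bound(misses, n, alpha=0.05):
    """One-sided exact upper confidence bound on the miss rate, found by bisection."""
    if misses >= n:
        return 1.0
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if binom_cdf(misses, n, mid) > alpha:
            lo = mid      # rate still plausible, so the bound lies higher
        else:
            hi = mid
    return hi

def go_no_go(misses, n=200, threshold=0.02):
    """Evaluate the criterion on the point estimate and on the upper bound."""
    point = misses / n
    upper = fnr_upper_bound(misses, n)
    return {"fnr": point, "upper95": upper,
            "pass_point": point < threshold, "pass_upper": upper < threshold}

for m in (0, 1, 3, 4):
    print(m, go_no_go(m))
```

Under these assumptions the point estimate stays below 2% with up to 3 missed variant cases out of 200, while the 95% upper bound stays below 2% only with zero misses, so the choice of statistic should be fixed before the M12 review.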

Regulatory pathway

De Novo classification is the most probable pathway, as no predicate exists for Level 4 surgical autonomy. The system would be classified as Class II with special controls. The AI/ML algorithm falls under FDA’s 2023 guidance on AI/ML-enabled Device Software Functions. A locked algorithm (trained then frozen) is recommended for initial clearance, simplifying the submission. A Predetermined Change Control Plan (PCCP) for adaptive capability can be added via supplemental submission after establishing a safety track record. The regulatory precedent for AI-controlled medical devices includes NeuroPace RNS System (responsive neurostimulation for epilepsy, PMA P100026) as a Class III example, and multiple De Novo AI/ML radiology tools as Class II examples.

Estimated timeline: 4–5.5 years to De Novo authorization. The classification, once granted, creates a regulatory moat: subsequent entrants can file 510(k) referencing this De Novo as predicate, but only after the first entrant has established the classification.

Proposed experimental approach (first 6 months)

Months 1–3: Reproduce SRT-H architecture on a standardized surgical robot platform (da Vinci Research Kit or Virtuoso). Validate baseline autonomous suturing performance on phantom tissue (target: CoV ≤ 0.10 for suture spacing). Begin DFM analysis of research instruments.
Months 4–6: Extend to ex vivo porcine cholecystectomy, targeting 90% autonomous task completion across 20 specimens. Develop anomaly detection module for out-of-distribution anatomy identification. Submit pre-submission request to CDRH.

9. Team Capabilities

Co-Principal Investigator

Hass Dhia

MS Biomedical Sciences, medical school background (anatomy TA). Deep knowledge of laparoscopic surgical anatomy — abdominal wall layers, Calot’s triangle, cystic duct variants, hepatic hilum anatomy — and surgical physiology including tissue response to manipulation and electrocautery effects. Maps to: clinical endpoint definition for autonomous system evaluation, preclinical validation protocol design, anatomical variant identification for anomaly detection training, and regulatory submission framing in clinical language. AI infrastructure architect with experience building multi-agent orchestration systems, directly applicable to designing the hierarchical control and safety monitoring architecture.

Lead Principal Investigator

Haedar Hadi

MS Computer Science (Boston University, Information Systems focus). ML model development and evaluation methodology expertise. Maps to: imitation learning policy architecture (behavioral cloning, DAgger), anomaly detection for out-of-distribution surgical scene identification, confidence calibration for safety-critical autonomy thresholds, and benchmark design for comparing autonomous versus surgeon-performed procedures. Scalable compute infrastructure experience supports training policies on large-scale surgical video datasets (SRT-H used 18,000 demonstrations).

Key Team Member — Director of Manufacturing

Ahmed

Director of Manufacturing with expertise in design for manufacturability (DFM), production scaling, quality systems, and process optimization. Maps to: sensor-integrated surgical instrument production at clinical volumes, sterilization-compatible material selection, automated assembly process design, and ISO 13485 quality system implementation.

Most research proposals in surgical robotics end at “it works in the lab.” This proposal includes explicit DFM milestones at every phase, ensuring that prototype instrument decisions consider production scaling, tolerance analysis, sterilization compatibility, and quality systems from day one. This addresses the valley of death between TRL 4–5 research prototypes and TRL 7+ deployable medical devices — the specific gap where most funded surgical robotics research stalls. Without manufacturing engineering from project inception, research instrument designs propagate into production-incompatible architectures requiring costly ground-up redesign.

10. Recommended Next Steps

Funder programs to target

ARPA-H Open BAA — Autonomous surgical systems with manufacturing-integrated development. Estimated ask: $3–5M over 3 years (aligned with ALISS program scope).
NIH NIBIB R01 — Adaptive surgical autonomy with safety-constrained learning. Estimated ask: $1.5–2.5M over 5 years.
NSF SBIR/STTR Phase II — Manufacturable sensor-integrated surgical instruments. Estimated ask: $1M over 2 years.
NIH NCATS CTSA — Clinical translation pathway for autonomous surgical systems. Support for preclinical validation and pre-submission meetings.

Proposed 24-month milestone timeline

M1–3 R&D: Reproduce SRT-H on standardized platform. Baseline suturing validation (CoV ≤ 0.10). DFM: Instrument architecture analysis, material selection for sterilization. Regulatory: Literature review for De Novo justification.
M4–6 R&D: Ex vivo cholecystectomy (20 specimens, target 90% autonomous). Anomaly detection module v1. DFM: First prototype production-intent instrument. Regulatory: Pre-submission request to CDRH.
M7–12 R&D: In vivo porcine cholecystectomy (10 animals, target 85% autonomous, 1-week survival). Safety architecture v1 with validated failure modes. DFM: Limited-reuse instrument validation (10 cycles). Regulatory: CDRH pre-submission meeting and classification feedback.
M13–18 R&D: Expanded in vivo validation (30 animals, variant anatomy subset). Evaluation benchmark against expert surgeons. DFM: Pilot production line (50 instruments/month). Regulatory: De Novo submission preparation, design controls documentation.
M19–24 R&D: Pivotal study design and protocol development. Multi-site preclinical validation. DFM: Scale to 200 instruments/month. ISO 13485 audit preparation. Regulatory: De Novo submission filing or pivotal study IDE (Investigational Device Exemption).