Reinforcement Learning-Adaptive Rehabilitation Exoskeletons for Post-Stroke Gait Recovery

1. Clinical Need

Stroke is a leading cause of serious long-term disability in the United States. The CDC reports more than 795,000 strokes per year, and 50–60% of survivors retain chronic lower-limb motor deficits. Applied to a survivor population of roughly 7.6 million (the base implied by these figures), this translates to approximately 3.8–4.6 million Americans living with stroke-related gait impairment. The CDC estimates $56 billion in annual stroke-related costs, encompassing direct medical expenditure, rehabilitation services, and lost productivity.

Current rehabilitation follows a standard protocol: physical therapist-supervised gait training, 3–5 sessions per week, at $150–$300 per session. The therapist-to-patient ratio is the fundamental throughput bottleneck — there are not enough trained rehabilitation therapists to serve the existing patient population at the frequency and duration required for measurable neuroplastic recovery.

Commercially available lower-limb exoskeletons (ReWalk, Ekso Bionics EksoNR, Indego) use fixed pre-programmed gait trajectories that do not adapt to the patient’s current motor capacity, fatigue state, or recovery progress. The challenge point framework (Guadagnoli & Lee, Journal of Motor Behavior, 2004) establishes that optimal motor learning requires assistance calibrated to the learner’s current skill level: too much assistance prevents active motor engagement, and too little causes failure and discouragement. A 34-RCT meta-analysis (1,166 patients, Frontiers in Neurology, 2024) reports that exoskeleton-assisted gait training improves outcomes relative to conventional therapy, but the unresolved question is how to optimize the assistance strategy across the recovery trajectory. Fixed trajectories cannot solve this problem. Adaptive control can.

2. State of the Art

Four converging research threads define the current frontier in adaptive exoskeleton control.

Human-in-the-loop optimization

Zhang et al. (Science, 2017, Carnegie Mellon University) applied covariance matrix adaptation evolution strategy (CMA-ES) to optimize ankle exoskeleton assistance in real time, achieving 24.2±7.4% metabolic cost reduction in healthy subjects. Slade et al. (Nature, 2022, Stanford University) advanced this with Bayesian optimization for ankle exoskeleton control in outdoor walking, demonstrating 23±8% metabolic reduction and 9±4% increase in self-selected walking speed, with 4x faster convergence than CMA-ES. Both studies validate the principle that data-driven optimization outperforms hand-tuned exoskeleton control, but both still require a subject-specific optimization session with every new user (approximately 1 hour of walking in Zhang et al.).

Sim-to-real reinforcement learning

Luo et al. (Nature, 2024, NC State University / University of Michigan) trained RL policies on a 50 degree-of-freedom musculoskeletal model with 208 muscles, then deployed them zero-shot on a 3.2 kg hip exoskeleton. Results: 24.3% metabolic reduction in walking, 13.1% in running, and 15.4% on stairs (n=8 subjects per condition). Training required 8 hours on a single RTX 3090 GPU. Zero human experiments were needed during the optimization phase — the simulation environment replaced months of human-in-the-loop data collection.

Task-agnostic neural control

Molinaro et al. (Nature, 2024, Georgia Institute of Technology) used a deep neural network to estimate biological joint moments from wearable sensor data (R²=0.83 prediction accuracy), enabling a clothing-integrated exoskeleton to reduce metabolic cost by 5.3–19.7% across 10 different locomotion activities without task-specific tuning. This demonstrates that a single learned controller can generalize across walking, running, stair climbing, and incline traversal.

Pathological gait adaptation

Chavarrias et al. (arXiv:2503.11433, 2025, Spanish National Research Council) implemented a TD3 reinforcement learning agent for knee exoskeleton control under spastic conditions. Using a digital twin with differentiable spastic reflex models, the agent achieved 10.6% torque reduction and 8.9% settling time improvement in simulated hemiparetic gait. This is the first published work applying deep RL to exoskeleton control explicitly for pathological movement patterns.

All four threads have been validated on human subjects or high-fidelity computational models. None have been combined into a rehabilitation product. Every FDA-cleared exoskeleton on the market still uses fixed gait trajectories.

3. Foundational Research

Zhang J, Fiers P, Witte KA, Jackson RW, Poggensee KL, Atkeson CG, Collins SH (2017). “Human-in-the-loop optimization of exoskeleton assistance during walking.” Science 356(6344):1280–1284. DOI: 10.1126/science.aal5054. PMID: 28642437.

Applied CMA-ES optimization to a tethered ankle exoskeleton, iteratively adjusting torque profiles based on real-time metabolic cost measurements via respirometry. Achieved 24.2±7.4% reduction in metabolic cost of walking in healthy subjects (n=11) compared to walking in the device with no torque applied. Optimization required approximately 1 hour of walking per subject. Established that data-driven optimization of exoskeleton assistance parameters produces results superior to hand-tuned control. This is the foundational result that motivates the subsequent learning-based approaches, including the proposed rehabilitation-specific extension.

Slade P, Kochenderfer MJ, Delp SL, Collins SH (2022). “Personalizing exoskeleton assistance while walking in the real world.” Nature 610(7931):277–282. DOI: 10.1038/s41586-022-05191-1. PMID: 36224415.

Deployed Bayesian optimization on a portable ankle exoskeleton during outdoor walking on varied terrain. Achieved 23±8% metabolic reduction and 9±4% faster self-selected walking speed (n=10 healthy subjects), with 4x faster convergence than CMA-ES. The outdoor deployment demonstrated robustness to real-world variability such as uneven surfaces, turns, and speed changes. For the proposed rehabilitation application, the Bayesian optimization framework provides the rapid personalization mechanism needed to adapt to heterogeneous stroke impairment profiles without requiring hours of per-patient data collection.

Luo S, Jiang M, Zhang S, Zhu J, Yu S, Dominguez Silva I, Wang T, Rouse E, Zhou B, Yuk H, Su H (2024). “Experiment-free exoskeleton assistance via learning in simulation.” Nature 630(8016):353–359. DOI: 10.1038/s41586-024-07382-4. PMID: 38867127.

Trained reinforcement learning policies entirely in simulation using a 50 degree-of-freedom musculoskeletal model with 208 muscles, then transferred zero-shot to a physical 3.2 kg hip exoskeleton. Achieved 24.3% metabolic reduction in walking, 13.1% in running, and 15.4% on stairs (n=8 subjects per condition). Training time: 8 hours on a single RTX 3090 GPU. This result eliminates the need for human-in-the-loop optimization during policy development. For the proposed work, extending the simulation environment with hemiparetic musculoskeletal models enables training rehabilitation-specific RL policies without requiring stroke patients during the optimization phase.

Molinaro DD, Scherpereel KL, Schonhaut EB, Evangelopoulos G, Shepherd MK, Young AJ (2024). “Task-agnostic exoskeleton control via biological joint moment estimation.” Nature 635(8038):337–344. DOI: 10.1038/s41586-024-08157-7. PMID: 39537888.

Developed a deep neural network that estimates biological joint moments from IMU and EMG sensor data with R²=0.83 prediction accuracy, deployed on a clothing-integrated hip exoskeleton. Demonstrated 5.3–19.7% metabolic cost reduction across 10 locomotion activities (walking, running, stair ascent/descent, inclines, load carrying) without activity-specific tuning (n=10 subjects). The task-agnostic generalization is significant for rehabilitation: stroke patients do not walk on treadmills in labs — they navigate homes, curbs, stairs, and uneven sidewalks. A controller that generalizes across activities addresses this clinical reality.

Chavarrias A, Torricelli D, Rocon E (2025). “Reinforcement Learning-Based Adaptive Control for Knee Exoskeletons: Managing Spasticity in Neurological Conditions.” arXiv:2503.11433.

Implemented a Twin Delayed Deep Deterministic Policy Gradient (TD3) reinforcement learning agent for knee exoskeleton control in the presence of velocity-dependent spastic reflexes. Built a digital twin incorporating a differentiable spastic reflex model calibrated to clinical Modified Ashworth Scale grades. Achieved 10.6% torque reduction and 8.9% settling time improvement in simulated hemiparetic gait. This is the first published work explicitly addressing RL-based exoskeleton control for pathological movement. The differentiable spasticity model provides a starting point for the proposed hemiparetic simulation environment.

Meta-analysis: Exoskeleton-assisted gait training after stroke (2024). Frontiers in Neurology, 2024. 34 randomized controlled trials, 1,166 participants.

Systematic review and meta-analysis of exoskeleton-assisted gait rehabilitation in stroke survivors. Results showed statistically significant improvements in motor control (Fugl-Meyer Assessment), gait velocity, step length, cadence, and functional independence (Functional Ambulation Category) compared to conventional therapy alone. Effect sizes varied by device type and training intensity but consistently favored exoskeleton-assisted training. This meta-analysis confirms that the hardware platform — powered lower-limb exoskeletons — is a validated rehabilitation modality. The research question is not whether exoskeletons help, but how to optimize the assistance strategy.

4. Competitive Landscape

Ekso Bionics (San Rafael, CA). 17% market share in rehabilitation exoskeletons. Primary products: EksoNR for institutional rehabilitation and Indego Personal for home use. All devices use fixed gait trajectories with manual therapist adjustment of assistance levels. Proposed merger with Applied Digital announced December 2025. No disclosed adaptive control research program.

Lifeward / ReWalk Robotics (Yokneam, Israel). ReWalk 7 launched April 2025 with Medicare HCPCS K1007 reimbursement at $94,617 per device. Fixed gait patterns tuned by clinician during fitting. No adaptive control.

Indego (developed by Parker Hannifin, Macedonia, OH; acquired by Ekso Bionics in December 2022). Fixed trajectory control, primarily targeting spinal cord injury rather than stroke rehabilitation. FDA-cleared for institutional and personal use.

The gap is software, not hardware. Zero commercial exoskeletons use reinforcement learning-adaptive control. The mechanical platforms exist, and the actuators, sensors, and structural components are commercially available. What does not exist is a control system that adapts assistance in real time based on the patient’s neuromotor state and recovery trajectory. The barrier to entry is therefore research expertise, not capital equipment.

5. Addressable Scope

Bottom-up calculation (US, post-stroke gait rehabilitation)

US strokes per year: 795,000 (CDC)
Survivors with chronic gait impairment eligible for exoskeleton rehabilitation: ~260,000
Per-patient rehabilitation cost (CPT 97116 gait training + CPT 97530 therapeutic activities, 36-session course): $7,200
Annual rehabilitation services scope: 260,000 × $7,200 = $1.87 billion
Institutional device procurement (hospitals, rehabilitation centers): ~$225 million
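Personal exoskeleton market (home use, K1007 reimbursement at $94,617): potential $4.73 billion (equivalent to about 50,000 devices)
Conservative total addressable scope (rehabilitation services plus institutional devices): $2.1 billion
Full addressable scope including personal devices: $3.2 billion (adds about $1.1 billion, a little under a quarter of the $4.73 billion personal-device potential; counting the full potential would give about $6.8 billion)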
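The bottom-up figures can be re-derived with a few lines of arithmetic. The sketch below is a consistency check only, written in Python with illustrative variable names that are not part of the proposal; every number is copied from the list above. It also exposes two quantities the list leaves implicit: the device count behind the $4.73 billion personal-device potential, and the share of that potential that the $3.2 billion full-scope figure actually counts.

```python
# Consistency check for the bottom-up estimate; all inputs come from the list above.
ELIGIBLE_PATIENTS = 260_000
COST_PER_COURSE_USD = 7_200        # 36-session course
SESSIONS_PER_COURSE = 36
INSTITUTIONAL_USD = 225e6
K1007_PRICE_USD = 94_617
PERSONAL_POTENTIAL_USD = 4.73e9
FULL_SCOPE_USD = 3.2e9


def bottom_up() -> dict:
    """Recompute the scope figures and the quantities they imply."""
    services = ELIGIBLE_PATIENTS * COST_PER_COURSE_USD
    conservative = services + INSTITUTIONAL_USD
    return {
        "per_session_usd": COST_PER_COURSE_USD / SESSIONS_PER_COURSE,
        "services_usd": services,
        "conservative_usd": conservative,
        # Device count implied by the personal-device potential.
        "implied_devices": PERSONAL_POTENTIAL_USD / K1007_PRICE_USD,
        # Fraction of the personal potential counted in the full-scope figure.
        "personal_share_counted": (FULL_SCOPE_USD - conservative) / PERSONAL_POTENTIAL_USD,
    }


if __name__ == "__main__":
    for name, value in bottom_up().items():
        print(f"{name}: {value:,.2f}")
```

Under these inputs the services line is $1.872 billion, the conservative total is about $2.10 billion, the implied per-session cost is $200, and the personal-device potential corresponds to roughly 50,000 devices.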

Top-down cross-check

Global rehabilitation robotics market: $1.5 billion (2024), projected $4.2 billion by 2033 at 12.5% CAGR (Verified Market Reports). RL-adaptive exoskeletons capturing 15–25% of the rehabilitation robotics market by 2033 yields $630 million–$1.05 billion globally, the same order of magnitude as the bottom-up estimate for the US institutional segment (~$225 million).
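As a rough check on the top-down figures, the sketch below recomputes the compound growth and the share range. It is illustrative Python with hypothetical helper names; the inputs are the market figures quoted above, and the nine-year horizon (2024 to 2033) is the only derived assumption.

```python
def project(value: float, cagr: float, years: int) -> float:
    """Compound a starting value forward at a constant annual growth rate."""
    return value * (1 + cagr) ** years


def implied_cagr(start: float, end: float, years: int) -> float:
    """Annual growth rate that takes `start` to `end` in `years` years."""
    return (end / start) ** (1 / years) - 1


MARKET_2024_BN = 1.5               # quoted 2024 market size, USD billions
MARKET_2033_BN = 4.2               # quoted 2033 projection, USD billions
YEARS = 2033 - 2024                # assumed compounding horizon

# What 12.5% growth from the 2024 base would actually reach by 2033.
projected_2033_bn = project(MARKET_2024_BN, 0.125, YEARS)
# Growth rate implied by the two quoted endpoints.
rate_for_quoted_endpoints = implied_cagr(MARKET_2024_BN, MARKET_2033_BN, YEARS)
# Revenue band for a 15-25% share of the 2033 market.
share_range_bn = (0.15 * MARKET_2033_BN, 0.25 * MARKET_2033_BN)
```

Compounding $1.5 billion at 12.5% for nine years gives about $4.3 billion, and the quoted endpoints imply a rate near 12.1%, so the quoted figures agree to within rounding. The 15–25% share range reproduces the $630 million to $1.05 billion band.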

Reimbursement

CPT 97116 (gait training), CPT 97530 (therapeutic activities), HCPCS K1007 ($94,617 for personal exoskeleton). Medicare coverage determination for K1007 in 2024 validates the reimbursement pathway for exoskeleton devices in the US market.

6. Research Gaps and HHA Contribution

Three specific gaps separate published results from a deployable RL-adaptive rehabilitation exoskeleton. Each maps to specific HHA team expertise.

Gap 1: RL adaptation for neurological impairment, not healthy users

Every human-subject study cited above (Zhang 2017, Slade 2022, Luo 2024, Molinaro 2024) optimized assistance for healthy subjects, and only Luo 2024 among them used reinforcement learning. Hemiparetic gait is fundamentally different: asymmetric step lengths, reduced hip and knee flexion on the paretic side, circumduction, foot drop, velocity-dependent spasticity, and rapid fatigue. To our knowledge, no laboratory has demonstrated sim-to-real RL transfer on stroke patients. Chavarrias et al. (2025) addressed spasticity in simulation only, without human subject validation.

Why this gap persists: Academic labs optimize for metabolic cost reduction in healthy subjects because it is a clean, publishable result with straightforward measurement (portable respirometry). Stroke rehabilitation requires clinical expertise to define what “better gait” means for a patient with left-side hemiparesis and Modified Ashworth Scale grade 2 ankle plantarflexor spasticity. RL researchers are not rehabilitation scientists.

HHA contribution: Hass Dhia’s biomedical and clinical background (MS Biomedical Sciences, medical school anatomy TA) provides the domain expertise to define stroke-specific reward functions. Which muscles to target, how to weight gait symmetry versus velocity, how to incorporate EMG-based voluntary activation metrics, how to stratify patients by impairment severity — these are clinical questions that determine whether the RL agent optimizes for the right objective. Haedar Hadi’s RL expertise implements the reward function as a trainable policy.

Gap 2: Progressive challenge calibration (rehabilitation-specific reward design)

The human-subject exoskeleton studies cited above optimize for making walking easier: minimizing metabolic cost, maximizing walking speed, or reducing joint loading. Rehabilitation requires the opposite: progressively withdrawing assistance to force active motor engagement. The challenge point framework (Guadagnoli & Lee, 2004) establishes that motor learning is maximized when task difficulty is calibrated to the learner’s current ability. A rehabilitation exoskeleton that makes walking too easy prevents neuroplastic recovery. One that provides too little support causes falls and discouragement.
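The challenge point idea can be made concrete with a simple assist-as-needed rule. The sketch below is a hand-written heuristic, not the learned RL policy this proposal targets, and serves only to show the direction of adaptation. The function name, the 70–90% target success band, the step size, and the assistance floor are all hypothetical values chosen for illustration.

```python
def update_assistance(
    level: float,
    success_rate: float,
    target_band: tuple = (0.7, 0.9),   # hypothetical "challenge point" band
    step: float = 0.05,                # hypothetical adjustment per update
    floor: float = 0.2,                # hypothetical minimum assistance
    ceiling: float = 1.0,              # full support
) -> float:
    """Return the next assistance level (0..1) given recent step success.

    Above the band the task is too easy, so assistance is withdrawn.
    Below the band the task is too hard, so assistance is restored.
    Inside the band the level is left unchanged.
    """
    low, high = target_band
    if success_rate > high:
        level -= step
    elif success_rate < low:
        level += step
    # Keep the result inside the allowed assistance range.
    return min(ceiling, max(floor, level))
```

A learned policy would replace the fixed step and band with state-dependent decisions, but the sign of the adjustment would follow the same logic.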

Why this gap persists: The reward function for rehabilitation is not metabolic cost. It must incorporate gait symmetry (paretic vs. non-paretic step length ratio), voluntary EMG activation on the paretic side, therapist-defined recovery milestones (Functional Ambulation Category progression), and fatigue monitoring — a multi-objective optimization problem that requires both clinical rehabilitation knowledge and RL algorithm design expertise. These disciplines do not coexist in any single research group.
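One way such a multi-objective reward could be composed is sketched below. This is an illustrative scalarization under stated assumptions, not the proposal's final reward design: the field names, the weights, the 0.8 target success rate, and the normalization of EMG activation and fatigue to the range 0 to 1 are all hypothetical, and therapist-defined milestones are left out for brevity. Symmetry is computed as the shorter step length divided by the longer one, so that 1.0 means symmetric steps regardless of which side takes the longer step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GaitObservation:
    paretic_step_m: float          # paretic-side step length, metres
    nonparetic_step_m: float       # non-paretic step length, metres
    paretic_emg_activation: float  # voluntary activation, assumed normalized to 0..1
    fatigue_index: float           # assumed normalized to 0..1 (1 = exhausted)
    success_rate: float            # fraction of recent steps completed unaided, 0..1
    fell: bool                     # True if a fall or near-fall was detected


@dataclass(frozen=True)
class RewardWeights:               # hypothetical weights, to be set with clinicians
    symmetry: float = 1.0
    activation: float = 1.0
    challenge: float = 0.5
    fatigue: float = 0.5
    fall_penalty: float = 10.0


def _clamp01(x: float) -> float:
    return max(0.0, min(1.0, x))


def symmetry_ratio(obs: GaitObservation) -> float:
    """Shorter step / longer step: 1.0 is symmetric, lower is more asymmetric."""
    longer = max(obs.paretic_step_m, obs.nonparetic_step_m)
    if longer <= 0.0:
        return 0.0
    return max(0.0, min(obs.paretic_step_m, obs.nonparetic_step_m)) / longer


def rehab_reward(obs, weights=RewardWeights(), target_success=0.8):
    """Scalar reward: symmetry + activation + challenge - fatigue, or a fall penalty."""
    if obs.fell:
        return -weights.fall_penalty
    # Challenge term peaks at 1.0 when the success rate sits on the target and
    # falls linearly to 0.0 at the success rate farthest from the target.
    span = max(target_success, 1.0 - target_success)
    challenge = 1.0 - abs(_clamp01(obs.success_rate) - target_success) / span
    return (
        weights.symmetry * symmetry_ratio(obs)
        + weights.activation * _clamp01(obs.paretic_emg_activation)
        + weights.challenge * challenge
        - weights.fatigue * _clamp01(obs.fatigue_index)
    )
```

With the default weights, a symmetric step pattern at the target success rate with 0.6 activation and 0.2 fatigue scores 2.0, and halving the symmetry ratio lowers the score to 1.5. A weighted sum is the simplest choice; a constrained formulation that treats safety as a hard limit rather than a penalty is an alternative.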

HHA contribution: Hass frames the clinical problem — defining the multi-objective reward function that balances assistance, challenge, safety, and recovery progression. Haedar designs the RL architecture — constrained multi-objective policy optimization with safety bounds, curriculum learning for progressive difficulty, and evaluation frameworks for comparing adaptive versus fixed assistance in clinical trials.

Gap 3: Manufacturing at clinical volumes

Research exoskeletons are hand-assembled, 3D-printed, single-purpose prototypes with bill-of-materials costs of $15,000–$25,000. Clinical deployment requires ISO 13485-certified manufacturing, IEC 60601 electrical safety compliance, 50,000-hour mean time between failures, and production capacity of 1,000–10,000 units per year. No academic lab has addressed these requirements because they are manufacturing engineering problems.

HHA contribution: Ahmed (Director of Manufacturing) integrates Design for Manufacturability from day one. This covers motor selection (brushless DC vs. series elastic actuators), compliant mechanism design for joint interfaces, medical-grade sensor integration (strain gauges, IMUs, EMG electrodes), tolerance analysis for actuated joints, and quality system documentation. His manufacturing expertise ensures that every prototype decision considers production scaling, supplier qualification, and batch consistency testing.

Most research proposals end at “it works in the lab.” This proposal includes explicit DFM milestones at every phase, ensuring that prototype decisions consider production scaling, tolerance analysis, and quality systems from day one. This addresses the valley of death between TRL 4–5 prototypes and TRL 7+ deployable systems — the gap where most funded rehabilitation robotics research stalls.

Why originating labs have not closed these gaps

The learning-based control groups (Luo at NC State, Slade at Stanford, Molinaro at Georgia Tech) are biomechanics and robotics labs without clinical rehabilitation expertise or manufacturing infrastructure. The rehabilitation robotics groups (clinical trial sites using Ekso, ReWalk) use fixed-trajectory commercial devices and do not develop control algorithms. Exoskeleton manufacturers (Ekso, Lifeward, Parker) have no disclosed adaptive control R&D program. The gaps persist because closing them requires integrating adaptive RL algorithms, clinical rehabilitation science, and manufacturing engineering, disciplines that rarely coexist in a single academic lab or commercial entity.

7. Comparable Funded Projects

Source	PI / Entity	Amount	Focus
NSF CAREER CMMI-1944655	Hao Su, NC State University	~$500K	Funded the foundational work behind Luo et al. (Nature, 2024) — sim-to-real RL for hip exoskeleton control. Demonstrated experiment-free optimization via musculoskeletal simulation. Directly validates NSF appetite for RL-based exoskeleton research.
NSF Future of Work 2231419	Multiple PIs	~$750K	Adaptive exoskeleton control using reinforcement learning for workplace ergonomics and injury prevention. Validates NSF funding for RL-adaptive wearable robotics outside the rehabilitation context.
NIH Intramural	Thomas C. Bulea, NIH Clinical Center	Intramural	Pediatric knee exoskeleton with sensor-driven assistance for children with cerebral palsy. Demonstrates NIH interest in adaptive exoskeleton control for neurological conditions — adjacent to post-stroke rehabilitation.
NIDILRR Center Grant	Multiple PIs	$4.625M	Rehabilitation Engineering Research Center (2015–2020) for wearable rehabilitation robots. Five-year center grant funding development and clinical testing of powered exoskeletons for gait rehabilitation. Validates NIDILRR (part of the Administration for Community Living, not NIH) as a primary funder for this research area.
CMS Medicare	K1007 Determination (2024)	$94,617/device	Medicare coverage determination establishing HCPCS K1007 reimbursement code for personal exoskeletons. Sets the reimbursement ceiling for home-use devices and validates the commercial pathway for exoskeleton rehabilitation technology.

Combined, these funding sources demonstrate sustained government investment in exoskeleton rehabilitation technology across NSF (algorithm development), NIH (clinical application), NIDILRR (rehabilitation engineering centers), and CMS (reimbursement infrastructure). The RL-adaptive control layer — the specific gap this proposal addresses — sits at the intersection of all four funding streams.

8. Opportunity Assessment

TRL evidence chain: the integrated RL-adaptive rehabilitation exoskeleton system is assessed at TRL 5 on the basis of component maturity, although the integration itself has not yet been demonstrated. Sim-to-real RL for healthy-user exoskeletons: TRL 5–6 (validated on human subjects in outdoor environments, Slade 2022; Luo 2024). RL for pathological gait: TRL 3–4 (simulation only, Chavarrias 2025). Rehabilitation exoskeleton hardware: TRL 7–8 (FDA-cleared devices on market). Because the pathological-gait RL component remains at TRL 3–4, it is the limiting factor for the integrated system and the first target of the proposed work.

Research Question 1: Can RL policies generalize across heterogeneous stroke impairments?

Stroke impairment varies by lesion location, severity, chronicity, spasticity pattern, and comorbidities. A single RL policy trained on one impairment profile may not generalize to the clinical population.

Mitigation: Extend the Luo et al. musculoskeletal simulation environment with parameterized hemiparetic models spanning Modified Ashworth Scale grades 0–4, asymmetry ratios from 0.5–0.9, and variable fatigue profiles. Train a population of policies across impairment parameters, then use rapid Bayesian personalization (Slade approach) to fine-tune to individual patients in under 10 minutes of walking.
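The parameterized impairment space described in the mitigation could be sampled along the following lines. The sketch assumes a hypothetical ImpairmentProfile record and a placeholder fatigue_rate parameter; only the Modified Ashworth Scale range (0–4), the asymmetry range (0.5–0.9), and the count of 50 profiles come from this document. Grades are assigned round-robin so that every grade is equally represented, which is a design choice for illustration rather than a claim about clinical prevalence.

```python
import random
from dataclasses import dataclass

MAS_GRADES = (0, 1, 2, 3, 4)       # range quoted above; the clinical 1+ grade is not modeled
ASYMMETRY_RANGE = (0.5, 0.9)       # step-length asymmetry ratio range quoted above


@dataclass(frozen=True)
class ImpairmentProfile:
    mas_grade: int                 # Modified Ashworth Scale grade
    asymmetry_ratio: float         # step-length asymmetry ratio
    fatigue_rate: float            # hypothetical placeholder parameter, 0..1


def sample_profiles(n: int = 50, seed: int = 0) -> list:
    """Draw n simulated impairment profiles, reproducibly for a given seed."""
    rng = random.Random(seed)      # local generator, so global random state is untouched
    profiles = []
    for i in range(n):
        profiles.append(ImpairmentProfile(
            mas_grade=MAS_GRADES[i % len(MAS_GRADES)],   # round-robin over grades
            asymmetry_ratio=rng.uniform(*ASYMMETRY_RANGE),
            fatigue_rate=rng.random(),
        ))
    return profiles
```

Each profile would parameterize one simulated patient in the musculoskeletal environment; how the parameters map onto muscle and reflex models is outside the scope of this sketch.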

Risk rating: Moderate

Research Question 2: Can progressive-challenge RL safely operate in patients with impaired balance?

Withdrawing exoskeleton assistance to promote motor learning creates fall risk in patients with compromised balance and proprioception. An RL agent that reduces support too aggressively could cause injury.

Mitigation: Safety-constrained RL (constrained policy optimization) with hard firmware-level torque limits that cannot be overridden by the learned policy. IMU-based fall detection with automatic transition to full support mode within 50 ms. Progressive challenge bounded by therapist-defined safety envelope updated at each clinical visit.
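The layering described above, in which the learned policy proposes a command and a fixed safety layer has the final say, can be sketched as follows. The SafetyEnvelope fields, the shield function, and any numeric values used with them are hypothetical. The sketch covers only the decision logic; the 50 ms transition requirement is a real-time property of the firmware and is not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SafetyEnvelope:
    max_torque_nm: float           # hard torque limit the policy cannot exceed
    min_assistance: float          # therapist-defined lower bound on assistance, 0..1
    full_support_torque_nm: float  # torque commanded in full support mode


def shield(policy_torque_nm: float, policy_assistance: float,
           fall_detected: bool, env: SafetyEnvelope) -> tuple:
    """Return the (torque, assistance) actually sent to the actuator.

    A detected fall overrides the policy entirely. Otherwise the policy
    output is clamped to the torque limit and the therapist envelope.
    """
    if fall_detected:
        # Full support mode, still bounded by the hard torque limit.
        torque = min(env.full_support_torque_nm, env.max_torque_nm)
        return torque, 1.0
    torque = max(-env.max_torque_nm, min(env.max_torque_nm, policy_torque_nm))
    assistance = max(env.min_assistance, min(1.0, policy_assistance))
    return torque, assistance
```

Keeping the shield outside the learned policy means the safety bounds hold regardless of what the policy has learned, which is the property the firmware-level limit is meant to guarantee.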

Risk rating: Moderate

Research Question 3: Regulatory pathway for AI-adaptive Class II medical device?

An exoskeleton with a locked (non-adaptive) algorithm is a candidate for 510(k) clearance with Ekso/ReWalk devices as predicates. An adaptive algorithm that changes behavior based on patient data falls under FDA’s guidance on predetermined change control plans (PCCP) for AI-enabled device software functions (draft issued in 2023, finalized in December 2024).

Mitigation: The NeuroPace RNS System (responsive neurostimulation for epilepsy) provides regulatory precedent for an implanted device with adaptive, closed-loop control that reached the US market through FDA premarket approval. The exoskeleton application is lower risk (external, non-implanted). The PCCP framework allows pre-specified algorithm updates within validated bounds. The initial submission uses a locked algorithm; a later submission with a PCCP adds adaptive capability after clearance.

Risk rating: Moderate

Proposed first 6 months

Build hemiparetic gait musculoskeletal simulation environment extending the Luo et al. 50-DOF model with parameterized spasticity, asymmetry, and fatigue modules (Haedar, Hass)
Design rehabilitation-specific multi-objective reward function incorporating gait symmetry, voluntary EMG, therapist milestones, and fatigue — with input from a therapist advisory panel (Hass)
Begin training RL policies in simulation across 50 impairment profiles spanning the clinical population, with training continuing through month 9 (Haedar)
Prepare the validation study on able-bodied subjects wearing asymmetric gait-constraining orthoses to simulate hemiparetic patterns, to run in months 6–12 before any stroke patient involvement (Hass, Haedar)

9. Team Capabilities

Co-Principal Investigator

Hass Dhia

MS Biomedical Sciences with medical school background (anatomy TA). AI infrastructure architect with experience in multi-agent orchestration and evaluation framework design. Maps to: clinical problem identification (which muscles to target — tibialis anterior for foot drop, gluteus medius for Trendelenburg gait, hip flexors for swing phase initiation), gait biomechanics analysis, patient stratification by impairment severity, rehabilitation-specific reward function design incorporating Fugl-Meyer motor assessment milestones, experimental protocol design for human subjects studies, and anatomical/physiological domain knowledge for validating the hemiparetic musculoskeletal simulation model against clinical gait analysis data.

Lead Principal Investigator

Haedar Hadi

MS Computer Science (Boston University, Information Systems focus) with cloud and database architecture expertise. Maps to: RL algorithm design (PPO and TD3 policy networks for continuous exoskeleton torque control), sim-to-real transfer methodology extending the Luo et al. simulation framework, reward function engineering for multi-objective optimization (gait symmetry, metabolic cost, voluntary activation, fatigue), evaluation frameworks and benchmark design for comparing adaptive versus fixed-trajectory assistance in controlled clinical trials, and scalable compute infrastructure for training RL policies on simulated patient populations spanning the impairment parameter space.

Key Team Member — Director of Manufacturing

Ahmed

Director of Manufacturing with expertise in Design for Manufacturability of actuated lower-limb devices. Maps to: motor selection (brushless DC actuators, series elastic elements, compliant mechanisms for joint interfaces), medical-grade sensor integration (strain gauges, 9-axis IMUs, surface EMG electrodes), production scaling from prototype to 1,000–10,000 units per year, quality systems (ISO 13485, IEC 60601 electrical safety), tolerance analysis for actuated hip/knee/ankle joints, and supplier qualification for medical-grade components.

Acknowledged capability gaps

Lab access for human subjects testing: HHA does not operate a gait analysis laboratory with motion capture, force plates, or portable respirometry. Human subjects validation will be conducted via subcontract with a university rehabilitation engineering lab equipped with instrumented treadmills, 3D motion capture (Vicon/OptiTrack), and IRB infrastructure for stroke patient recruitment. Grant funds cover subcontract costs.

Electrode and sensor fabrication: Surface EMG electrodes, strain gauge arrays, and IMU modules will be sourced from established medical-grade suppliers (Delsys, Analog Devices, Bosch Sensortec) rather than fabricated in-house. This is a procurement decision, not a capability gap — no exoskeleton company fabricates its own sensors.

10. Recommended Next Steps

Target funder programs

NSF CBET (Biomedical Engineering) — $500K–$1.5M, 3-year awards for rehabilitation technology development. RL-adaptive exoskeleton control falls within the Neural Engineering and Rehabilitation Engineering program areas.
NIH NINDS R01 (Neurological Disorders and Stroke) — $500K–$2M per year for rehabilitation neuroscience. Post-stroke motor recovery with adaptive technology is a core NINDS priority.
NIH NICHD R21 (Rehabilitation Research) — $275K for exploratory/developmental rehabilitation technology. R21 mechanism for proof-of-concept validation of the RL-adaptive control approach before a full R01 application.
NIDILRR RERC (Rehabilitation Engineering Research Center) — $5M over 5 years for center grants in rehabilitation engineering. Exoskeleton rehabilitation is a documented RERC priority area (comparable: $4.625M wearable rehabilitation robots center, 2015–2020).
SBIR/STTR Phase I — $275K for feasibility studies of medical devices. Appropriate for initial validation of the adaptive control algorithm on a commercial exoskeleton platform.

Estimated funding range

Based on comparable awards: $1.5M–$3M for initial 24-month program (R01/NSF CAREER scale). Comparable: NSF CAREER CMMI-1944655 (Hao Su, NC State) funded the Luo et al. Nature 2024 work. NIDILRR center grants fund $4.625M over 5 years for rehabilitation robotics programs. Full translation through first-in-human clinical trial: $5M–$8M over 36–48 months.

Proposed 24-month milestone timeline

M1–3 Simulation: Hemiparetic musculoskeletal simulation environment extending 50-DOF model with parameterized spasticity, asymmetry, and fatigue (Haedar, Hass). Clinical: Rehabilitation-specific reward function design with therapist advisory panel (Hass).
M1–6 Reward design: Multi-objective reward function incorporating gait symmetry ratio, paretic-side voluntary EMG activation, Functional Ambulation Category milestones, and fatigue monitoring. Iterative refinement with 3–5 rehabilitation therapists (Hass).
M3–9 RL training: Policy training across 50 simulated impairment profiles spanning Modified Ashworth Scale grades 0–4, step asymmetry ratios 0.5–0.9, and variable fatigue curves. Comparison of PPO, TD3, and SAC architectures (Haedar).
M4–12 DFM: Design for Manufacturability analysis of research exoskeleton platform. Motor selection, sensor integration, compliant joint mechanism design, production-path feasibility assessment (Ahmed).
M6–12 Validation: Able-bodied validation study (n=20) with asymmetric gait-constraining orthoses. IRB protocol development for subsequent stroke patient study. Bayesian personalization convergence testing (Hass, Haedar).
M12–18 Clinical pilot: Stroke patient pilot study (n=10) comparing RL-adaptive assistance versus fixed-trajectory control. Primary endpoints: Fugl-Meyer motor score change, 10-meter walk test, gait symmetry ratio. Crossover design (all participants).
M12–24 Manufacturing: Prototype with medical-grade components, ISO 13485 quality system documentation initiation, accelerated wear testing for actuated joints (Ahmed).
M18–24 Regulatory: Pre-submission meeting with FDA CDRH. 510(k) strategy development with locked algorithm as initial submission, PCCP amendment pathway for adaptive capability (Hass).