Juo-Tung (Justin) Chen
I am a PhD student at Johns Hopkins University in Mechanical Engineering, advised by Axel Krieger in the Intelligent Medical Robotic Systems and Equipment Lab (IMERSE). I am currently interning at NVIDIA (since May 2026).
I received my Master's degree in Robotics from Johns Hopkins University and my Bachelor's degree in Biomechatronics Engineering from National Taiwan University. During my master's, I worked in the Intuitive Computing Lab advised by Chien-Ming Huang, and during my undergrad, in the Robots and Medical Mechatronics Lab (RMML) advised by Ping-Lang Yen.
Research Interest
My research sits at the intersection of surgical robotics and robot learning, with a focus on building autonomous surgical systems. I am driven by the following questions:
- How can we train surgical foundation models that generalize across procedures, robot platforms, and institutions by scaling multi-modal data collection?
- How can hierarchical imitation learning enable robots to complete long-horizon surgical tasks that require both high-level reasoning and precise low-level control?
- How can surgical tool pose estimation from monocular video unlock large-scale pretraining data for surgical foundation models — by recovering kinematics from the vast amounts of existing surgical footage that lack robot state recordings?
News
- [May, 2026] Started my internship at NVIDIA in Santa Clara, CA!
- [Apr, 2026] Our paper Open-H-Embodiment is on arXiv!
- [Feb, 2026] Our paper Cosmos-Surg-dVRK was accepted to RA-L!
- [Jan, 2026] ImitateCholec dataset paper published in Nature Scientific Data!
- [Sep, 2025] Our paper SutureBot was accepted to NeurIPS 2025!
- [Jul, 2025] Our paper SRT-H was featured on the cover of Science Robotics!
- [Jun, 2025] Our paper SurgiPose was accepted to IROS 2025!
Publications
* Equal contribution
Open-H-Embodiment: A Large-Scale Dataset for Enabling Foundation Models in Medical Robotics
Nigel Nelson*, Juo-Tung Chen*, Jesse Haworth*, Xinhao Chen*, Lukas Zbinden*, Dianye Huang*, ..., Nassir Navab, Mahdi Azizian, Sean D. Huver, Axel Krieger
Preprint 2026 Featured
Paper | Website | Dataset
We present Open-H-Embodiment, a large-scale multimodal dataset collected from multiple robotic surgical platforms to enable training of foundation models for medical robotics. The dataset spans diverse surgical procedures and provides synchronized video, robot kinematics, and tool state annotations to support the development of generalizable surgical AI systems.
ImitateCholec: A Multimodal Dataset for Long-Horizon Imitation Learning in Robotic Cholecystectomy
Pascal Hansen, Ji Woong Brian Kim, Antony Goldenberg, Juo-Tung Chen, Yuanzhe Amos Li, Anton Deguet, Brandon White, De Ru Tsai, Richard Cha, Jeffrey Jopling, Paul Maria Scheikl, Axel Krieger
Nature Scientific Data 2026
Paper | Dataset
ImitateCholec is a multimodal dataset of expert robotic demonstrations for laparoscopic cholecystectomy, designed to support long-horizon imitation learning. It provides synchronized video, robot kinematics, and task annotations spanning full surgical procedures, enabling robots to learn from human surgical expertise at scale.
Cosmos-Surg-dVRK: World Foundation Model-based Automated Online Evaluation of Surgical Robot Policy Learning
Lukas Zbinden, Nigel Nelson, Juo-Tung Chen, Xinhao Chen, Ji Woong Kim, Mahdi Azizian, Axel Krieger, Sean Huver
RA-L 2025
Paper | Website
We leverage world foundation models to enable automated online evaluation of surgical robot policies without requiring physical execution. Cosmos-Surg-dVRK generates realistic video rollout predictions that correlate with real-world task performance, enabling efficient policy iteration and reducing the time and cost of surgical robot learning.
SutureBot: A Precision Framework & Benchmark For Autonomous End-to-End Suturing
Juo-Tung Chen*, Jesse Haworth*, Nigel Nelson, Ji Woong Kim, Masoud Moghani, Chelsea Finn, Axel Krieger
NeurIPS 2025 Featured
Paper | Website | Dataset
We present SutureBot, an end-to-end framework for autonomous robotic suturing that integrates visual perception, motion planning, and force-sensitive control into a unified imitation learning pipeline. We also introduce a comprehensive suturing benchmark with standardized metrics for evaluating performance across diverse tissue types and task configurations.
SRT-H: A Hierarchical Framework for Autonomous Surgery via Language Conditioned Imitation Learning
Ji Woong Kim, Juo-Tung Chen, Pascal Hansen, Lucy Shi, Antony Goldenberg, Samuel Schmidgall, Paul Scheikl, Anton Deguet, Brandon White, De Ru Tsai, Richard Cha, Jeffrey Jopling, Chelsea Finn, Axel Krieger
Science Robotics 2025 Featured
Paper | Website
SRT-H introduces a hierarchical framework for autonomous surgery combining a high-level language-conditioned planner with a low-level visuomotor policy trained via imitation learning. Featured on the cover of Science Robotics, the system successfully executes complex long-horizon surgical tasks by decomposing them into subtask sequences guided by natural language instructions.
SurgiPose: Estimating Surgical Tool Kinematics from Monocular Video for Surgical Robot Learning
Juo-Tung Chen, Xinhao Chen, Ji Woong Kim, Paul Maria Scheikl, Richard Jaepyeong Cha, Axel Krieger
IROS 2025
Paper
SurgiPose estimates 6-DoF kinematics of surgical tools from monocular endoscopic video by combining visual feature extraction with robot kinematics priors. This enables recovering tool pose from existing surgical footage that lacks robot state recordings, unlocking large-scale pretraining data for surgical foundation models.
Reducing Performance Variability and Overcoming Limited Spatial Ability: Targeted Training for Remote Robot Teleoperation
Juo-Tung Chen*, Tsung-Chi Lin*, Chien-Ming Huang
IROS 2024
Paper
We investigate how targeted training can reduce performance variability in remote robot teleoperation and help users with limited spatial ability achieve expert-level proficiency. Our study identifies key perceptual and motor skills underlying teleoperation performance and proposes structured training protocols to address individual deficiencies.
Forgetful Large Language Models: Lessons Learned from Using LLMs in Robot Programming
Juo-Tung Chen, Chien-Ming Huang
AAAI Symposium 2023
Paper | Slides
We present lessons learned from deploying large language models for robot programming, focusing on the "forgetfulness" problem where LLMs fail to maintain consistent context across long interaction sequences. Our analysis surfaces practical guidelines for LLM-robot integration and highlights open challenges for the community.
Alchemist: LLM-Aided End-User Development of Robot Applications
Ulas Berk Karli, Juo-Tung Chen, Victor Nikhil Antony, Chien-Ming Huang
HRI 2024
Paper | Website
Alchemist is an LLM-powered end-user development platform that enables non-experts to create robot applications through natural language interaction. By abstracting low-level robot APIs into conversational interfaces, Alchemist lowers the barrier to robot programming while maintaining flexibility for complex task specifications.