Field

Robotics

Embodied systems, control, manipulation, and navigation.

8 papers · latest 2026-04-23

Common topics in this field

Embodied Agents (4) · Robot Manipulation (3) · Vision-Language Models (2) · Navigation (1)

JoyAI-RA 0.1: A Foundation Model for Robotic Autonomy

Tianle Zhang, Zhihao Yuan, Dafeng Chi et al.

breakthrough · 🔴 Advanced · Robotics · Embodied Agents

cs.RO

Introduces JoyAI-RA, a vision-language-action foundation model that enhances robotic autonomy through improved generalization across diverse robotic embodiments and tasks.

PokeVLA: Empowering Pocket-Sized Vision-Language-Action Model with Comprehensive World Knowledge Guidance

Yupeng Zheng, Xiang Li, Songen Gu et al.

significant · 🔴 Advanced · Robotics · Embodied Agents · Vision-Language Models

cs.RO

Presents a lightweight vision-language-action (VLA) model that integrates world knowledge for efficient robot manipulation, improving spatial reasoning and task execution on compact robotic systems.

Quadruped Parkour Learning: Sparsely Gated Mixture of Experts with Visual Input

Michael Ziegltrum, Jianhao Jiao, Tianhu Peng et al.

breakthrough · 🔴 Advanced · Robotics · Navigation

cs.RO

Applies a sparsely gated mixture-of-experts (MoE) policy with visual input to quadruped parkour, which this summary describes as a first. It reports efficient, high-performance locomotion on extreme terrain with 40% less compute than MLPs, making complex robotics more feasible on edge hardware.
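
As a rough illustration of why sparse gating can save compute, the sketch below runs only the top-k of several experts for each input. It is a generic top-k MoE layer written for this listing, not the paper's architecture: the expert shape (one tanh layer), the sizes, and all function names are assumptions, and the noise term used in some MoE gating schemes is omitted.

```python
import math
import random


def make_linear(n_in, n_out, rng):
    """Return a random n_out x n_in weight matrix as a list of rows."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]


def matvec(w, x):
    """Multiply matrix w (list of rows) by vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def top_k_gate(logits, k):
    """Keep the k largest gate logits and apply a softmax over them only."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    m = max(logits[i] for i in top)  # subtract the max for numerical stability
    exps = {i: math.exp(logits[i] - m) for i in top}
    z = sum(exps.values())
    return {i: e / z for i, e in exps.items()}


def moe_forward(x, gate_w, experts, k):
    """Evaluate only the k selected experts and mix outputs by gate weight."""
    weights = top_k_gate(matvec(gate_w, x), k)
    out = [0.0] * len(experts[0])
    for i, w in weights.items():
        # Hypothetical expert: a single linear layer followed by tanh.
        y = [math.tanh(v) for v in matvec(experts[i], x)]
        out = [o + w * yi for o, yi in zip(out, y)]
    return out, sorted(weights)


# Illustrative sizes only (assumed, not taken from the paper).
rng = random.Random(0)
N_EXPERTS, K, N_IN, N_OUT = 4, 2, 8, 3
gate_w = make_linear(N_IN, N_EXPERTS, rng)
experts = [make_linear(N_IN, N_OUT, rng) for _ in range(N_EXPERTS)]
x = [rng.uniform(-1.0, 1.0) for _ in range(N_IN)]
action, used = moe_forward(x, gate_w, experts, K)
```

With these assumed sizes, each forward pass evaluates 2 of the 4 experts, so expert compute per step scales with k rather than with the total number of experts.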

$π_{0.7}$: a Steerable Generalist Robotic Foundation Model with Emergent Capabilities

Physical Intelligence, Bo Ai, Ali Amin et al.

breakthrough · 🔴 Advanced · Robotics · Robot Manipulation

cs.LG · cs.RO

$π_{0.7}$ is a steerable generalist foundation model that reports emergent zero-shot capabilities, including complex multi-stage tasks in unseen environments and generalization across tasks and embodiments in real-world settings.

SIM1: Physics-Aligned Simulator as Zero-Shot Data Scaler in Deformable Worlds

Yunsong Zhou, Hangxu Liu, Xuekun Jiang et al.

significant · 🔴 Advanced · Robotics · Embodied Agents

cs.RO · cs.AI · cs.CV

SIM1 builds physics-aligned real-to-sim twins for deformable manipulation, letting purely synthetic training reach real-data parity at a fraction of collection cost and making sim-scaled robotics learning much more practical.

PhyEdit: Towards Real-World Object Manipulation via Physically-Grounded Image Editing

Ruihang Xu, Dewei Zhou, Xiaolong Shen et al.

significant · 🔴 Advanced · Robotics · Robot Manipulation

cs.CV

Adds 3D geometry and physical constraints to image editing, plus a new benchmark, making object manipulation edits far more reliable for world-model, simulation, and synthetic-data workflows.

E-VLA: Event-Augmented Vision-Language-Action Model for Dark and Blurred Scenes

Jiajun Zhai, Hao Shi, Shangwei Guo et al.

breakthrough · 🔴 Advanced · Robotics · Embodied Agents · Vision-Language Models

cs.CV · cs.MM · cs.RO

E-VLA augments a vision-language-action model with event-camera input so that robots can see and act in near-total darkness or under motion blur, where conventional frame-based cameras fail. This lets real-world robotic systems operate reliably in challenging environments such as smoke-filled rooms or fast-moving scenes.
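
For readers unfamiliar with the sensor: an event camera reports per-pixel brightness changes asynchronously, each with a timestamp and a polarity, instead of full frames. The sketch below shows one common, generic way to turn an event stream into an image-like array by summing polarities over a time window. It is not E-VLA's method; the `Event` type, the function name, and the sample values are hypothetical.

```python
from collections import namedtuple

# Hypothetical event record: pixel coordinates, timestamp in seconds,
# and polarity (+1 = pixel got brighter, -1 = pixel got darker).
Event = namedtuple("Event", ["x", "y", "t", "polarity"])


def accumulate_events(events, width, height, t_start, t_end):
    """Sum event polarities per pixel over the window [t_start, t_end)."""
    frame = [[0] * width for _ in range(height)]
    for ev in events:
        if t_start <= ev.t < t_end:
            frame[ev.y][ev.x] += ev.polarity
    return frame


# Made-up sample stream for illustration.
events = [
    Event(1, 0, 0.001, +1),
    Event(1, 0, 0.002, +1),
    Event(2, 1, 0.003, -1),
    Event(0, 0, 0.020, +1),  # falls outside the 10 ms window below
]
frame = accumulate_events(events, width=3, height=2, t_start=0.0, t_end=0.010)
```

Because only changes are recorded, a short window still captures fast motion, which is one reason such sensors are considered for blurred or low-light scenes.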

AnyUser: Translating Sketched User Intent into Domestic Robots

Songyuan Yang, Huibin Tan, Kailun Yang et al.

significant · 🔴 Advanced · Robotics · Robot Manipulation

cs.RO · cs.CV · cs.HC

Programming domestic robots is often too complex for non-experts. This system lets users instruct a robot by sketching on a camera feed, removing the need for coding or pre-existing maps and making home robotics more accessible.
