Topic
3D Vision
3D perception, reconstruction, neural rendering, and spatial reasoning.
7 papers · latest 2026-04-23
Yuxuan Xue, Ruofan Liang, Egor Zakharov et al.
Presents GeoRelight, a unified framework for joint geometrical relighting and 3D reconstruction using diffusion transformers, improving physical consistency and reducing error accumulation in single-image relighting.
Geunyoung Jung, Soohong Kim, Inseok Kong et al.
APC introduces a lightweight, transferable counterattack module that boosts 3D point cloud robustness without sacrificing accuracy, which matters for real-time systems facing adversarial inputs in robotics or autonomous driving.
Zonghai Yao, Zhipeng Tang, Chengtao Lin et al.
Reframes patient education as dynamic multi-modal interaction rather than static QA. Enables systems to guide users through images and respond to distress, both of which matter for real-world medical AI interfaces.
Haoyu Zheng, Tianwei Lin, Wei Wang et al.
IAD-Unify unifies defect segmentation, explanation, and generation in one model, enabling end-to-end industrial inspection. Aimed at AI-driven manufacturing quality control with real-time interpretability.
Rong Wang, Ruyi Zha, Ziang Cheng et al.
Uses 3D foundation priors to generate geometrically consistent orbital videos from single images, targeting long-range view synthesis for AR/VR and robotics perception systems.
Haoxuan Han, Weijie Wang, Zeyu Zhang et al.
DDP shows that deliberately blurring images can make AI answer visual questions more accurately by forcing it to focus on core structures instead of distracting details. This flips conventional wisdom: less visual detail can mean better performance, and the method is easy to plug into existing VQA systems.
Ahan Shabanov, Peter Hedman, Ethan Weber et al.
This paper changes how 3D scenes are built by removing the need for a rigid grid structure, allowing for more efficient and detailed models from just a few photos. It solves the problem of missing data in unobserved areas by generating plausible details rather than leaving gaps. Practitioners can use this to create lighter, faster 3D assets for games or VR without needing extensive camera rigs.