Topic

3D Vision

3D perception, reconstruction, neural rendering, and spatial reasoning.

7 papers · latest 2026-04-23

Most active fields for this topic

Computer Vision · 7

GeoRelight: Learning Joint Geometrical Relighting and Reconstruction with Flexible Multi-Modal Diffusion Transformers

Yuxuan Xue, Ruofan Liang, Egor Zakharov et al.

significant · 🔴 Advanced · Computer Vision · Diffusion Models · 3D Vision

cs.CV

Presents GeoRelight, a unified framework for joint geometrical relighting and 3D reconstruction using diffusion transformers, improving physical consistency and reducing error accumulation in single-image relighting.


APC: Transferable and Efficient Adversarial Point Counterattack for Robust 3D Point Cloud Recognition

Geunyoung Jung, Soohong Kim, Inseok Kong et al.

significant · 🔴 Advanced · Computer Vision · 3D Vision

cs.CV

APC introduces a lightweight, transferable counterattack module that improves the robustness of 3D point cloud recognition without sacrificing accuracy, which matters for real-time systems facing adversarial inputs in robotics or autonomous driving.


Rethinking Patient Education as Multi-turn Multi-modal Interaction

Zonghai Yao, Zhipeng Tang, Chengtao Lin et al.

breakthrough · 🔴 Advanced · Computer Vision · 3D Vision

cs.AI · cs.CL · cs.CV

Reframes patient education as a dynamic, multi-turn, multi-modal interaction rather than static QA. The framing lets systems guide users through images and respond to signs of distress, both of which matter for real-world medical AI interfaces.


IAD-Unify: A Region-Grounded Unified Model for Industrial Anomaly Segmentation, Understanding, and Generation

Haoyu Zheng, Tianwei Lin, Wei Wang et al.

breakthrough · 🔴 Advanced · Computer Vision · 3D Vision

cs.CV · cs.AI

IAD-Unify handles defect segmentation, explanation, and generation in a single region-grounded model, enabling end-to-end industrial inspection. It targets AI-driven manufacturing quality control, where interpretability is needed in real time.


Towards Realistic and Consistent Orbital Video Generation via 3D Foundation Priors

Rong Wang, Ruyi Zha, Ziang Cheng et al.

breakthrough · 🔴 Advanced · Computer Vision · Video Generation · 3D Vision

cs.CV

Uses 3D foundation priors to generate geometrically consistent orbital videos from a single image, addressing long-range view synthesis, with applications in AR/VR and robotics perception systems.


Less Detail, Better Answers: Degradation-Driven Prompting for VQA

Haoxuan Han, Weijie Wang, Zeyu Zhang et al.

breakthrough · 🟡 Intermediate · Computer Vision · 3D Vision

cs.CV

DDP shows that deliberately blurring images can make a model answer visual questions more accurately by forcing it to focus on core structures instead of distracting details. This runs against conventional wisdom: less visual detail can mean better performance, and the method is easy to plug into existing VQA systems.
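The idea of degrading an image before asking a question about it can be sketched in a few lines. This is only an illustration under stated assumptions, not the paper's method: the box blur is a stand-in for whatever degradation DDP actually applies, and `box_blur` and `build_degraded_query` are hypothetical helper names. The call to a VQA model is deliberately left out.

```python
def box_blur(image, radius=1):
    """Return a blurred copy of a grayscale image given as a list of rows.

    Assumption: a simple box blur stands in for the paper's degradation step.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    blurred = []
    for y in range(height):
        row = []
        for x in range(width):
            # Average all pixels in the (2*radius+1) square window,
            # clipped at the image borders.
            total = 0.0
            count = 0
            for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                    total += image[ny][nx]
                    count += 1
            row.append(total / count)
        blurred.append(row)
    return blurred


def build_degraded_query(image, question, radius=1):
    """Pair a blurred image with the question (hypothetical helper).

    Sending this pair to a VQA model is out of scope for the sketch.
    """
    return {"image": box_blur(image, radius), "question": question}


# A single bright pixel is spread over its neighbours after blurring.
sharp = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
query = build_degraded_query(sharp, "What shape is in the image?")
```

In practice the blur radius would be a tunable parameter, and a library filter would replace the hand-written loop.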


Free-Range Gaussians: Non-Grid-Aligned Generative 3D Gaussian Reconstruction

Ahan Shabanov, Peter Hedman, Ethan Weber et al.

significant · 🔴 Advanced · Computer Vision · 3D Vision

cs.CV

This paper changes how 3D scenes are built by removing the need for a rigid grid structure, allowing for more efficient and detailed models from just a few photos. It solves the problem of missing data in unobserved areas by generating plausible details rather than leaving gaps. Practitioners can use this to create lighter, faster 3D assets for games or VR without needing extensive camera rigs.
