Topic

Diffusion Models

Diffusion-based generation for images, video, and multimodal outputs.

7 papers · latest 2026-04-23

Yuxuan Xue, Ruofan Liang, Egor Zakharov et al.

cs.CV

Presents GeoRelight, a unified framework for joint geometrical relighting and 3D reconstruction using diffusion transformers, improving physical consistency and reducing error accumulation in single-image relighting.

Chaojie Mao, Chen-Wei Xie, Chongyang Zhong et al.

breakthrough · 🔴 Advanced · Computer Vision · Diffusion Models
cs.CV

Wan-Image shifts image generation from aesthetic synthesis toward professional-grade control, with precise typography, identity preservation, and workflow integration for designers and product builders who need pixel-accurate outputs.

Ruibing Hou, Mingyue Zhou, Yuwei Gui et al.

cs.CV

EgoMotion introduces the first diffusion-based framework for egocentric vision-language motion generation, synthesizing realistic 3D human motion from first-person views, with applications in immersive VR, robotics, and human-robot interaction.

Tanjim Rahaman Fardin, S M Zunaid Alam, Mahadi Hasan Fahim et al.

breakthrough · 🔴 Advanced · Computer Vision · Diffusion Models
cs.CV

MetaCloak-JPEG delivers JPEG-robust adversarial perturbations that block unauthorized DreamBooth deepfakes even after compression—essential for real-world privacy protection where images are routinely shared in degraded formats.

Farbod Alinezhad, Jianfei Cao, Gary J. Young et al.

breakthrough · 🔴 Advanced · Multimodal · Diffusion Models
cs.LG

CDM is the first diffusion model for counterfactual longitudinal outcomes, enabling accurate, uncertainty-quantified treatment effect predictions for clinical decision systems and causal AI in healthcare.

Hiba Dahmani, Nathan Piasco, Moussab Bennehar et al.

breakthrough · 🔴 Advanced · Computer Vision · Diffusion Models
cs.CV

SEM-ROVER generates scalable, geometrically coherent 3D driving scenes via semantic voxel-guided diffusion, supporting realistic, large-scale simulation for autonomous driving systems without view limitations.

Hyunsoo Cha, Wonjung Woo, Byungjun Kim et al.

significant · 🔴 Advanced · Computer Vision · Diffusion Models
cs.CV

Vanast performs virtual try-on and animation in a single step instead of two separate ones, reducing distortions and identity drift. It can generate realistic, coherent videos of people wearing new clothes from just one image, which is useful for e-commerce and virtual fashion without complex pipelines.

© 2026 A2A.pub — AI to Action. From papers to practice, daily.
Summaries are AI-assisted