AI Research Highlights

Tuesday, April 21, 2026

SafeAnchor: Preventing Cumulative Safety Erosion in Continual Domain Adaptation of Large Language Models

Dongxin Guo, Jikun Wu, Siu Ming Yiu

breakthrough | 🟡 Intermediate | NLP | LLM Reasoning | Alignment & Safety

cs.LG, cs.AI

SafeAnchor shows that LLM safety is fragile and erodes cumulatively over successive rounds of domain adaptation. Practitioners need to preserve safety actively across updates, and this is presented as the first method to do so systematically in continual settings.

How Adversarial Environments Mislead Agentic AI?

Zhonghao Zhan, Huichi Zhou, Zhenhao Li et al.

breakthrough | 🔴 Advanced | Reasoning & Agents | AI Agents

cs.AI

Introduces the 'Trust Gap' in agentic AI: tools can be weaponized to mislead the agents that call them. Safe real-world deployment therefore needs evaluation standards that test an agent's skepticism, not just its competence.

Tool Learning Needs Nothing More Than a Free 8B Language Model

Chenming Tang, Hsiu-Yuan Huang, Weijie Liu et al.

breakthrough | 🟡 Intermediate | NLP | LLM Reasoning

cs.LG, cs.CL

TRUSTEE trains tool-calling agents without labeled data or commercial models, using dynamic environment synthesis with only an 8B LLM. This puts agent training within reach of builders with minimal resources.

WISV: Wireless-Informed Semantic Verification for Distributed Speculative Decoding in Device-Edge LLM Inference

Zixuan Liu, Zhiyong Chen, Nan Xue et al.

breakthrough | 🔴 Advanced | Machine Learning | Efficient Inference

cs.IT, cs.AI

WISV adapts speculative decoding verification to wireless conditions using semantic rather than token-level checks, improving latency and throughput for edge LLM inference in real-world mobile deployments.
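For context, the token-level check that this summary contrasts against can be sketched as below. This is a minimal illustration of generic greedy speculative-decoding verification, not WISV's semantic verifier. The function name `accept_draft_prefix` and the list-of-token-ids inputs are assumptions made for the example.

```python
def accept_draft_prefix(draft_tokens, target_argmax):
    """Return the tokens kept after greedy token-level verification.

    draft_tokens:  token ids proposed by the small draft model.
    target_argmax: the large model's top choice at each draft position,
                   plus one extra entry for the position after the draft.
    (Both argument shapes are assumptions made for this sketch.)
    """
    if len(target_argmax) != len(draft_tokens) + 1:
        raise ValueError("target_argmax needs one more entry than draft_tokens")
    accepted = []
    for drafted, verified in zip(draft_tokens, target_argmax):
        if drafted != verified:
            # First mismatch: drop the rest of the draft and emit the
            # target model's token for this position instead.
            accepted.append(verified)
            return accepted
        accepted.append(drafted)
    # Every draft token matched, so the target's extra prediction is kept too.
    accepted.append(target_argmax[-1])
    return accepted
```

Under exact matching, one differing token discards everything the draft proposed after it, even if the remaining text would have meant the same thing. That is the behaviour a semantic check would presumably relax.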

Using large language models for embodied planning introduces systematic safety risks

Tao Zhang, Kaixian Qu, Zhibin Li et al.

breakthrough | 🔴 Advanced | Reasoning & Agents | Alignment & Safety | Embodied Agents

cs.AI, cs.LG, cs.RO

DESPITE reveals that even highly accurate LLM planners can systematically fail safety-critical tasks, exposing a gap between planning accuracy and real-world safety that matters for deploying robots in human environments.

AIT Academy: Cultivating the Complete Agent with a Confucian Three-Domain Curriculum

Jiaqi Li, Lvyang Zhang, Yang Zhao et al.

breakthrough | 🔴 Advanced | Reasoning & Agents | AI Agents

cs.AI

AIT Academy proposes the first principled curriculum for holistic agent development, addressing systemic gaps in current agent training. This matters for builders aiming at general-purpose AI agents.

Beyond Explicit Refusals: Soft-Failure Attacks on Retrieval-Augmented Generation

Wentao Zhang, Yan Zhuang, ZhuHang Zheng et al.

breakthrough | 🔴 Advanced | NLP | RAG

cs.CR, cs.AI

DEJA exposes stealthy RAG failures that mimic valid responses. Security evaluations for RAG systems therefore need to detect subtle, non-obvious degradation rather than only explicit refusals.

Human-Guided Harm Recovery for Computer Use Agents

Christy Li, Sky CH-Wang, Andi Peng et al.

breakthrough | 🔴 Advanced | Reasoning & Agents | AI Agents

cs.AI, cs.CL

Human-Guided Harm Recovery introduces the first formal framework for correcting harmful agent actions after execution, supporting safer real-world deployment of AI agents through human-aligned recovery protocols.

Stability Implies Redundancy: Delta Attention Selective Halting for Efficient Long-Context Prefilling

Yujie Chen, Tailai Chen, Yifeng Gao et al.

breakthrough | 🔴 Advanced | Machine Learning | Model Compression

cs.AI

Introduces delta attention halting, which detects semantic fixing points in order to skip redundant token processing. This yields hardware-compatible efficiency gains for long-context LLM prefilling without sacrificing accuracy, which matters for scalable inference.

From Natural Language to Executable Narsese: A Neuro-Symbolic Benchmark and Pipeline for Reasoning with NARS

Mina Gabriel, Pei Wang

significant | 🔴 Advanced | Reasoning & Agents | LLM Reasoning

cs.AI

Presents a neuro-symbolic pipeline that translates natural language into Narsese, enabling interpretable, uncertainty-aware reasoning. Useful for trustworthy AI systems that need explicit logic rather than hallucination-prone LLM output.

MoE-nD: Per-Layer Mixture-of-Experts Routing for Multi-Axis KV Cache Compression

Libo Sun, Peixiong He, Po-Wei Harn et al.

significant | 🔴 Advanced | Machine Learning | Model Compression | Efficient Inference

cs.LG, cs.CL

MoE-nD tailors KV cache compression per layer, boosting accuracy over uniform methods. Practitioners should care because it enables longer-context inference with minimal memory overhead and no retraining.

Copy-as-Decode: Grammar-Constrained Parallel Prefill for LLM Editing

Ziyang Liu

breakthrough | 🔴 Advanced | NLP | LLM Reasoning

cs.CL, cs.AI

Copy-as-Decode replaces full regeneration in LLM editing with grammar-constrained copy-gen operations, cutting latency and improving precision for real-time code and text editing systems.

MetaCloak-JPEG: JPEG-Robust Adversarial Perturbation for Preventing Unauthorized DreamBooth-Based Deepfake Generation

Tanjim Rahaman Fardin, S M Zunaid Alam, Mahadi Hasan Fahim et al.

breakthrough | 🔴 Advanced | Computer Vision | Diffusion Models

cs.CV

MetaCloak-JPEG produces adversarial perturbations that keep blocking unauthorized DreamBooth deepfakes after JPEG compression. This matters for real-world privacy protection, where images are routinely shared in degraded formats.

Toward Zero-Egress Psychiatric AI: On-Device LLM Deployment for Privacy-Preserving Mental Health Decision Support

Eranga Bandara, Asanga Gunaratna, Ross Gore et al.

breakthrough | 🔴 Advanced | NLP | LLM Reasoning

cs.AI

First on-device LLM deployment for psychiatric decision support that eliminates cloud egress, preserving patient privacy in high-risk settings. Enables real-time, compliant mental health AI without data leakage risks.

Too Correct to Learn: Reinforcement Learning on Saturated Reasoning Data

Zhenwen Liang, Yujun Zhou, Sidi Lu et al.

breakthrough | 🔴 Advanced | Reasoning & Agents | LLM Reasoning

cs.LG

CUTS addresses RL mode collapse on saturated reasoning data by sampling from constrained top-K outputs, so learning can continue even when the model already answers correctly. This improves LLM reasoning robustness without manual data curation.
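As background for the top-K constraint mentioned here, the following is a minimal sketch of plain top-K sampling over a list of logits. It illustrates the generic technique only and is not the CUTS procedure, whose details are not given in this summary. The function name `top_k_sample` and its parameters are assumptions made for the example.

```python
import math
import random


def top_k_sample(logits, k, rng=None):
    """Sample one index from the k highest-scoring entries of `logits`.

    Hypothetical helper for illustration; `rng` is an optional
    random.Random instance so results can be made reproducible.
    """
    if not 1 <= k <= len(logits):
        raise ValueError("k must be between 1 and len(logits)")
    rng = rng or random.Random()
    # Keep only the indices of the k largest logits.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax numerators over the kept logits, shifted by the max for stability.
    peak = max(logits[i] for i in top)
    weights = [math.exp(logits[i] - peak) for i in top]
    # random.choices normalizes the weights internally.
    return rng.choices(top, weights=weights, k=1)[0]
```

With `k=1` this reduces to greedy decoding. Larger `k` lets lower-ranked candidates be drawn, which is what keeps sampled outputs varied.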
