Researcher Mode

Explore by field and topic

Jump straight into the slice of AI research you care about. The taxonomy layer turns the daily feed into a browsable map: fields for broad domains, topics for recurring research questions, and paper-level tags for faster triage.

Landmark guides

Long-arc reading paths for understanding a field, not just today’s feed.

Active fields: 5

Tracked topics: 15

Papers in latest available release: 15

Fields

Broad domains for navigating the archive at a glance.

Reasoning & Agents

Reasoning, planning, tool use, and agentic workflows.

45 papers

Recent picks

UniToolCall: Unifying Tool-Use Representation, Data, and Evaluation for LLM Agents

FM-Agent: Scaling Formal Methods to Large Systems via LLM-Based Hoare-Style Reasoning

OccuBench: Evaluating AI Agents on Real-World Professional Tasks via Language World Models

NLP

Language understanding, generation, extraction, and evaluation.

23 papers

Recent picks

Relax: An Asynchronous Reinforcement Learning Engine for Omni-Modal Post-Training at Scale

Retrieval as Generation: A Unified Framework with Self-Triggered Information Planning

Synthius-Mem: Brain-Inspired Hallucination-Resistant Persona Memory Achieving 94.4% Memory Accuracy and 99.6% Adversarial Robustness on LoCoMo

Machine Learning

Core modeling, optimization, inference, and systems efficiency.

11 papers

Recent picks

Three Roles, One Model: Role Orchestration at Inference Time to Close the Performance Gap Between Small and Large Agents

MEMENTO: Teaching LLMs to Manage Their Own Context

KV Cache Offloading for Context-Intensive Tasks

Computer Vision

Image, video, and 3D perception plus visual generation.

7 papers

Recent picks

AVGen-Bench: A Task-Driven Benchmark for Multi-Granular Evaluation of Text-to-Audio-Video Generation

INSPATIO-WORLD: A Real-Time 4D World Simulator via Spatiotemporal Autoregressive Modeling

SEM-ROVER: Semantic Voxel-Guided Diffusion for Large-Scale Driving Scene Generation

Robotics

Embodied systems, control, manipulation, and navigation.

4 papers

Recent picks

SIM1: Physics-Aligned Simulator as Zero-Shot Data Scaler in Deformable Worlds

PhyEdit: Towards Real-World Object Manipulation via Physically-Grounded Image Editing

E-VLA: Event-Augmented Vision-Language-Action Model for Dark and Blurred Scenes

Topics

Recurring problems and methods worth following over time.

AI Agents

Agentic systems, multi-agent coordination, and task planning.

37 papers

Recent picks

UniToolCall: Unifying Tool-Use Representation, Data, and Evaluation for LLM Agents

FM-Agent: Scaling Formal Methods to Large Systems via LLM-Based Hoare-Style Reasoning

OccuBench: Evaluating AI Agents on Real-World Professional Tasks via Language World Models

LLM Reasoning

Structured reasoning, proof solving, and long-chain problem solving.

23 papers

Recent picks

Relax: An Asynchronous Reinforcement Learning Engine for Omni-Modal Post-Training at Scale

Decomposing and Reducing Hidden Measurement Error in LLM Evaluation Pipelines

Large Language Models Generate Harmful Content Using a Distinct, Unified Mechanism

Efficient Inference

Latency, serving, cache efficiency, and practical inference speed.

10 papers

Recent picks

Three Roles, One Model: Role Orchestration at Inference Time to Close the Performance Gap Between Small and Large Agents

MEMENTO: Teaching LLMs to Manage Their Own Context

KV Cache Offloading for Context-Intensive Tasks

Alignment & Safety

Alignment, preference learning, robustness, and safe deployment.

8 papers

Recent picks

ProGAL-VLA: Grounded Alignment through Prospective Reasoning in Vision-Language-Action Models

Personalized RewardBench: Evaluating Reward Models with Human Aligned Personalization

BRIDGE: Multimodal-to-Text Retrieval via Reinforcement-Learned Query Alignment

RAG

Retrieval-augmented generation systems, evaluation, and retrieval-heavy workflows.

6 papers

Recent picks

Retrieval as Generation: A Unified Framework with Self-Triggered Information Planning

Synthius-Mem: Brain-Inspired Hallucination-Resistant Persona Memory Achieving 94.4% Memory Accuracy and 99.6% Adversarial Robustness on LoCoMo

RecaLLM: Addressing the Lost-in-Thought Phenomenon with Explicit In-Context Retrieval

Tool Use

Function calling, API integration, and tool-augmented model behavior.

4 papers

Recent picks

EE-MCP: Self-Evolving MCP-GUI Agents via Automated Environment Generation and Experience Learning

Act Wisely: Cultivating Meta-Cognitive Tool Use in Agentic Multimodal Models

A Formal Security Framework for MCP-Based AI Agents: Threat Taxonomy, Verification Models, and Defense Mechanisms

Embodied Agents

Reasoning and action grounded in the physical world.

3 papers

Recent picks

ProGAL-VLA: Grounded Alignment through Prospective Reasoning in Vision-Language-Action Models

SIM1: Physics-Aligned Simulator as Zero-Shot Data Scaler in Deformable Worlds

E-VLA: Event-Augmented Vision-Language-Action Model for Dark and Blurred Scenes

Video Generation

Video synthesis, editing, and temporal generation systems.

3 papers

Recent picks

AVGen-Bench: A Task-Driven Benchmark for Multi-Granular Evaluation of Text-to-Audio-Video Generation

INSPATIO-WORLD: A Real-Time 4D World Simulator via Spatiotemporal Autoregressive Modeling

Physics-Aware Video Instance Removal Benchmark

Latest highlighted papers

Today's arXiv batch has not landed yet. Showing the latest available release from Tuesday, April 14, 2026.

Yijuan Liang, Xinghao Chen, Yifan Ge et al.

breakthrough · 🟡 Intermediate · Reasoning & Agents · AI Agents
cs.AI

A unified 22k-tool, 390k-example tool-use stack that standardizes data and evaluation and lets an 8B model beat major commercial models on hard distractor-heavy calling.

Haoran Ding, Zhaoguo Wang, Haibo Chen

breakthrough · 🔴 Advanced · Reasoning & Agents · AI Agents
cs.SE · cs.AI

This brings Hoare-style reasoning to 143k-line systems by inferring specs from caller intent, surfacing 522 new bugs in already-tested codebases.

Xiaomeng Hu, Yinger Zhang, Fei Huang et al.

breakthrough · 🟡 Intermediate · Reasoning & Agents · AI Agents · World Models
cs.CL

OccuBench is a 100-scenario benchmark for professional agents across 65 domains that also injects hidden environment faults, exposing how brittle frontier models still are in real work settings.

Jinhua Wang, Biswa Sengupta

breakthrough · 🟡 Intermediate · Reasoning & Agents · AI Agents
cs.SE · cs.AI

This benchmark-driven translation of a production AI coding agent from Rust to Python shows how LLMs can migrate large systems continuously while staying competitive on real agent benchmarks.

CocoaBench Team, Shibo Hao, Zhining Zhang et al.

significant · 🟡 Intermediate · Reasoning & Agents · AI Agents
cs.CL · cs.AI

CocoaBench is a strong reality check for unified digital agents, with long-horizon tasks that force systems to combine vision, search, and coding in one workflow.

Liujie Zhang, Benzhe Ning, Rui Yang et al.

significant · 🔴 Advanced · NLP · LLM Reasoning
cs.CL

Relax is an open asynchronous RL engine for omni-modal post-training that doubles throughput on Qwen3-Omni-scale runs without sacrificing convergence.

© 2026 A2A.pub — AI to Action. From papers to practice, daily.
Summaries are AI-assisted · Privacy · Terms