Mind DeepResearch Technical Report

MindDR Team, Li Auto Inc

Recommendation Score

breakthrough, Advanced, Reasoning & Agents, Alignment & Safety, Benchmark, Useful for both

Research context

Primary field

Reasoning & Agents

Reasoning, planning, tool use, and agentic workflows.

Topics

Alignment & Safety

Paper type

Benchmark

Best for

Useful for both

arXiv categories

cs.AI

Why It Matters

Demonstrates leading deep research performance with ~30B-parameter models via a three-agent architecture and a specialized multi-stage training pipeline. The results suggest that strong capability does not require trillion-parameter models, which matters for the cost-efficiency of autonomous AI systems.

Abstract

We present Mind DeepResearch (MindDR), an efficient multi-agent deep research framework that achieves leading performance with only ~30B-parameter models through a meticulously designed data synthesis and multi-stage training pipeline. The core innovation of MindDR lies in a collaborative three-agent architecture (Planning Agent, DeepSearch Agent, and Report Agent) and a four-stage agent-specialized training pipeline comprising SFT cold-start, Search-RL, Report-RL, and preference alignment. With this regime, MindDR demonstrates competitive performance even at the ~30B scale. Specifically, MindDR achieves 45.7% on BrowseComp-ZH, 42.8% on BrowseComp, 46.5% on WideSearch, 75.0% on xbench-DS, and 52.5 on DeepResearch Bench, outperforming comparable-scale open-source agent systems and rivaling larger-scale models. MindDR has been deployed as an online product at Li Auto. Furthermore, we introduce MindDR Bench, a curated benchmark of 500 real-world Chinese queries drawn from our internal product user interactions, evaluated through a comprehensive multi-dimensional rubric system rather than relying on a single RACE metric. On MindDR Bench, MindDR achieves a state-of-the-art score of 51.8.
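
The abstract names three collaborating agents but does not describe how they exchange data. The sketch below is one plausible reading of that division of labor, not the authors' implementation: a planner splits the query, a search agent gathers evidence per sub-question, and a report agent assembles the output. Every name here (`planning_agent`, `deepsearch_agent`, `report_agent`, `run_pipeline`, `fake_search`, `Finding`) is hypothetical, and the model calls are replaced by simple stubs.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Finding:
    """Evidence collected for one sub-question (hypothetical structure)."""
    sub_question: str
    evidence: str


def planning_agent(query: str) -> List[str]:
    # Assumption: a real planner would call a language model.
    # This stub just splits the query on " and ".
    parts = [p.strip() for p in query.split(" and ") if p.strip()]
    return parts or [query]


def deepsearch_agent(sub_question: str, search: Callable[[str], str]) -> Finding:
    # Assumption: one search call per sub-question; a real agent
    # would likely iterate over several tool calls.
    return Finding(sub_question, search(sub_question))


def report_agent(query: str, findings: List[Finding]) -> str:
    # Assemble a plain-text report, one numbered line per finding.
    lines = [f"Report: {query}"]
    for i, f in enumerate(findings, 1):
        lines.append(f"{i}. {f.sub_question}: {f.evidence}")
    return "\n".join(lines)


def run_pipeline(query: str, search: Callable[[str], str]) -> str:
    # Plan, then search each sub-question, then write the report.
    sub_questions = planning_agent(query)
    findings = [deepsearch_agent(q, search) for q in sub_questions]
    return report_agent(query, findings)


def fake_search(q: str) -> str:
    # Placeholder search tool so the sketch runs offline.
    return f"placeholder result for '{q}'"


if __name__ == "__main__":
    print(run_pipeline("battery range and charging speed", fake_search))
```

Keeping each agent behind its own function boundary mirrors the agent-specialized training the abstract describes, in which search and report behavior are tuned in separate stages (Search-RL and Report-RL). The sequential hand-off shown here is an assumption.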


Published April 16, 2026