The ATOM Report: Measuring the Open Language Model Ecosystem
Nathan Lambert, Florian Brand
Topics: LLM Reasoning
Paper type: Method
Best for: Builders
Why It Matters
Maps the open-model ecosystem across downloads, derivatives, inference share, and performance — useful for identifying which model families are winning real adoption rather than just benchmark scores.
Abstract
We present a comprehensive adoption snapshot of the leading open language models and the organizations building them, focusing on the ~1.5K mainline open models from developers such as Alibaba (Qwen), DeepSeek, and Meta (Llama) that form the foundation of an ecosystem crucial to researchers, entrepreneurs, and policy advisors. We document a clear trend in which Chinese models overtook their U.S.-built counterparts in the summer of 2025 and have since widened the gap. We combine Hugging Face downloads, model derivatives, inference market share, performance metrics, and more to build a comprehensive picture of the ecosystem.