MiMo-V2-Flash has a Mixture-of-Experts (MoE) architecture. (Image: Wikimedia Commons)
Xiaomi has rolled out a new open-weight AI model called MiMo-V2-Flash that is designed to handle complex reasoning, coding, and agentic AI tasks.
It is also capable of serving as a general-purpose assistant for everyday use, the Chinese smartphone and electric vehicle (EV) maker said. MiMo-V2-Flash delivers inference speeds of up to 150 tokens per second and costs $0.10 per million input tokens and $0.30 per million output tokens, according to Xiaomi. It has a total of 309 billion parameters; a model’s parameter count reflects its size and serves as a rough indicator of its processing capability.
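As a rough illustration of what those rates mean in practice, the sketch below estimates the cost of a single request. The request sizes are hypothetical examples, not real usage figures or Xiaomi’s actual billing logic.

```python
# Hypothetical cost estimate at the quoted MiMo-V2-Flash rates.
# The request sizes below are made-up examples for illustration only.
INPUT_PRICE_PER_M = 0.10   # USD per million input tokens (as reported)
OUTPUT_PRICE_PER_M = 0.30  # USD per million output tokens (as reported)

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a single request at the quoted rates."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M + \
           (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# Example: a 4,000-token prompt that produces a 1,000-token reply
print(f"${request_cost(4_000, 1_000):.6f}")  # -> $0.000700
```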
The model has been made publicly available for download via Xiaomi’s developer portal MiMo Studio, Hugging Face, and the company’s API platform. It is the latest open-weight release in Xiaomi’s MiMo family of models. The launch not only signals Xiaomi’s foray beyond hardware into foundational AI models but also positions the Chinese tech giant as a rival to prominent AI players such as DeepSeek, Anthropic, and OpenAI.
The introduction of MiMo-V2-Flash also comes at a time when Xiaomi is looking to bring AI agent-driven features across its phones, tablets, and EVs. “MiMo-V2-Flash is live. It’s just step 2 on our AGI roadmap, but I wanted to dump some notes on the engineering choices that actually moved the needle,” Luo Fuli, a former DeepSeek researcher who recently joined Xiaomi’s MiMo team, said in a post on X on December 17.
MiMo-V2-Flash is live. It’s just step 2 on our AGI roadmap, but I wanted to dump some notes on the engineering choices that actually moved the needle.
Architecture: We settled on a Hybrid SWA. It’s simple, elegant, and in our internal benchmarks, it outperformed other Linear…
— Fuli Luo (@luo_fuli14427) December 16, 2025
“Our progress in AI large models and applications has far exceeded our expectations,” Xiaomi president Lu Weibing said earlier this month, adding that the Beijing-based company believed that the deep integration of AI with the physical world could be the next frontier of technology.
MiMo-V2-Flash uses a Mixture-of-Experts (MoE) architecture, which splits the network into specialised expert sub-networks and activates only a subset of them for each token, allowing the model to balance performance and efficiency. It also aims to reduce the cost of processing long prompts by limiting how much past context the model needs to re-evaluate at each step.
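To make the MoE idea concrete, here is a generic sketch of top-k expert routing. It illustrates the technique in general, not Xiaomi’s actual MiMo-V2-Flash implementation; the expert count, top_k value, and dimensions are made-up illustrative numbers.

```python
# Generic top-k Mixture-of-Experts routing sketch (PyTorch).
# Not MiMo-V2-Flash's real code; sizes are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, d_model=512, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)  # scores each expert per token
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(d_model, 4 * d_model),
                           nn.GELU(),
                           nn.Linear(4 * d_model, d_model))
             for _ in range(n_experts)]
        )

    def forward(self, x):                                  # x: (tokens, d_model)
        scores = self.router(x)                            # (tokens, n_experts)
        weights, chosen = scores.topk(self.top_k, dim=-1)  # pick top-k experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = chosen[:, slot] == e                # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * expert(x[mask])
        return out

# Only top_k of n_experts run for each token, which is how a very large total
# parameter count can coexist with a much smaller per-token compute cost.
layer = MoELayer()
tokens = torch.randn(16, 512)
print(layer(tokens).shape)  # torch.Size([16, 512])
```

The long-context saving works on a similar principle of doing less work per token: rather than re-attending to the full prompt history at every step, the model restricts how much past context it revisits, in line with the hybrid sliding-window attention design Luo described in her post.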
On benchmark tests, Xiaomi said MiMo-V2-Flash scored on par with Moonshot AI’s Kimi K2 Thinking and DeepSeek V3.2 across most reasoning evaluations, and surpassed Kimi K2 in long-context evaluations.
The model outperformed all of its open-weight rivals on SWE-Bench Verified, scoring 73.4 per cent. Xiaomi further claimed that its coding performance matched Anthropic’s Claude Sonnet 4.5 while being built at a fraction of the cost.