Xiaomi MiMo Reasoning Model

by thm

4651d ago

179 comments

Comments (179)

Arcuru21h ago

From the paper, I was intrigued by how they handled their RL step for Code Data. They trained against hard but solvable code generation tasks by running unit testing. Is that training step done by the other models?

> Code Data For coding problems, we curate a high-quality training set comprising open-source datasets and our newly collected problem set. We remove problems without test cases. For problems with golden solutions, we exclude those where the golden solution failed to pass all test cases. For problems without golden solution, we discard problems where no test case can be solved in 16 rollouts of advanced reasoning models. Similar to math data, we utilize an SFT version of MiMo-7B to filter out easy problems that are perfectly solved in all 16 rollouts. This rigorous cleaning process yields 30K code problems.

> During each RL iteration, we evaluate thousands of problems to compute the rewards, with each problem potentially containing hundreds of test cases. To improve reward computing efficiency and eliminate GPU idle time, we developed an online judge environment that enables parallel execution of extremely high-volume unit tests.

lvl1551d ago

Why are there so many English-first AI models from China? Are they not interested in serving their own population? Or is it that if they publish Chinese-first models it won't get publicity in the West?

siliconc0w1d ago

This is incredibly strong coding performance for a 7b. I use Gemini Pro 2.5 which got 67.8 and this got 57.8, very close to Gemini 2.5 Flash which got 60.6.

I've become pretty skeptical about eval results given what we've heard about llama4 so we'll see where this lands on the closed evals but very impressive to see.

badmonster16h ago

MiMo-7B claims to outperform larger models like Qwen-32B and match OpenAI o1-mini on math/code benchmarks — all with a 7B model trained from scratch. Is this a sign that pretraining + RLHF optimization is finally outpacing scale? Or are we just getting better at benchmarking narrow capabilities?

rahimnathwani1d ago

When you guys use gguf files in ollama, do you normally create a modelfile to go with it, or just hope that whatever default ollama has work with the new model?

https://github.com/ollama/ollama/blob/main/docs%2Fmodelfile....

gizmodo591d ago

Its funny to see benchmarks where they omit the top performing models like O3 (Which is the best model in many benchmarks currently) and Gemini Pro/Claude 3.7.

xpe18h ago

The README says "RL" without specifying what kind of RL is used. Researchers: I know you are busy, and I know good writing takes time, but please don't skip this kind of detail.

jedisct11d ago

GGUF version (for LM Studio, Ollama, etc): https://huggingface.co/jedisct1/MiMo-7B-RL-GGUF

Jotalea1d ago

I wonder if they will use this model for their AI assistant on their Xiaomi 15 series phones. They most likely will. I'm not really sure what to expect from it.

Havoc18h ago

Been testing it a bit and overall pretty solid. The lengthy think times means one waits quite a while though. Longer than much larger models like say the recent qwen moe

That moe strikes me as the better overall tradeoff

vessenes1d ago

Umm wow. Great benchmarks. I’m looking forward to chatting with this one.

A couple things stand out to me — first is that the 7B model is trained on 25T tokens(!). This is Meta-scale training; Llama 4 Maverick was trained on 22T or so. (Scout, the smaller model: 40T).

Second, this is an interesting path to take - not a distilled model or an RL layer to get reasoning out of another model, but a from-scratch RL model with reasoning baked in; the claims seem to indicate you get a lot of extra efficiency per-parameter doing this.

I don’t have experience with Xiaomi models, so I’m cautious about this one until I play with it, but it looks like a super viable local reasoning model from the stats.

userbinator1d ago

...and searching for things related to multiple antennae just got harder.

They could've called it Xiaomimo.

mobilio1d ago

Waiting for GGUF or MLX models.

Probably within few hours will be released.

m4r1k1d ago

My Chinese friend told me MiMo doesn’t have a meaning in Chinese (of course Mi 米 = rice). Anybody have a clue for what it stands for?

CodeCompost1d ago

Open Source or Open Weights?

ramesh311d ago

These benchmark numbers cannot be real for a 7b model

w4yai1d ago

Anyone tried it ?

xmorse1d ago

Xiaomi is an amazing company

sida1d ago

Xiaomi in Chinese translates to "Little Rice"

Here is the meaning of the name

Described here: https://finance.sina.cn/tech/2020-11-26/detail-iiznctke33979...

在后来的讨论中，我突然想到了我最喜欢的一句话——“佛观一粒米，大如须弥山”。

Translated into English, it means:

“In the later discussions, I suddenly thought of one of my favorite sayings — ‘A Buddha sees a single grain of rice as vast as Mount Sumeru.’”

This expression emphasizes the idea that even something seemingly small (like a grain of rice) can hold immense significance or value when viewed from a different perspective.

Thanks to chatgpt for translating this