The Stealth Test — Extra Small

A mysterious AI model appeared on OpenRouter. Everyone assumed it was DeepSeek V4. It was Xiaomi. The misattribution tells a story about how we evaluate intelligence.

On March 11th, an unnamed model called Hunter Alpha appeared on OpenRouter. No attribution. No paper. No press release. Just a free, powerful model with a 1-trillion-parameter architecture and a million-token context window.

When asked who made it, the chatbot said: “I only know my name, my parameter scale and my context window length.”

Within days, the consensus was clear: this must be DeepSeek V4. The reasoning capabilities matched. The knowledge cutoff (May 2025) matched. The Chinese-language fluency matched. Even the free, anonymous release seemed to fit. Hunter Alpha climbed the OpenRouter leaderboard and hit one trillion tokens in total usage.

Then on March 19th, Xiaomi revealed that Hunter Alpha was MiMo-V2-Pro — an internal test build from their AI team MiMo, led by former DeepSeek researcher Luo Fuli.

Not DeepSeek. Xiaomi. The smartphone company. The EV maker. The one everyone associates with affordable hardware, not frontier AI.

The misattribution is the story.

When a strong model appears anonymously from China, the global AI community defaults to one assumption: DeepSeek. This makes sense. DeepSeek-V3 and R1 triggered a global stock market selloff. DeepSeek proved that frontier AI doesn’t require frontier budgets. The name carries weight.

But the instant attribution reveals something about how we process intelligence. We don’t evaluate the output on its own terms. We evaluate it through the lens of who we expect to produce it. A powerful anonymous Chinese model must be DeepSeek, because DeepSeek is the only Chinese lab we’ve assigned frontier status to.

This is the same cognitive shortcut that leads readers to assume a breakthrough paper must come from Google Brain or OpenAI. The model is good, therefore it must be from the people we’ve already decided are good.

Luo Fuli’s post-reveal comment is telling: “I call this a quiet ambush — not because we planned it, but because the shift from chat to agent paradigm happened so fast, even we barely believed it.”

She adds: “People ask why we move so fast. I saw it firsthand building DeepSeek R1.”

This matters on two levels. The MiMo team inherits much of DeepSeek’s DNA: Luo Fuli left DeepSeek to lead Xiaomi’s AI effort, and the institutional knowledge came with her. The philosophical approach (train efficiently, release aggressively, test in the wild) transferred too.

But the shell changed. This isn’t a pure AI lab releasing a model. This is a consumer electronics company with $40+ billion in annual revenue, a smartphone empire, and a growing EV business. Xiaomi doesn’t need MiMo-V2-Pro to sell API access. It needs MiMo-V2-Pro to be the brain of agents that control Xiaomi devices, Xiaomi cars, Xiaomi smart homes.

The model is a means, not a product. And that changes the economics entirely.

The stealth testing methodology deserves attention. OpenRouter has become the dark pool of AI — a neutral platform where models can be launched anonymously, gather real-world usage data, and get unbiased feedback before the brand name introduces evaluation bias.

Zhipu AI did this in February with Pony Alpha (later revealed as GLM-5). Xiaomi did it in March with Hunter Alpha. The pattern is establishing itself: launch anonymously → let the model’s performance speak → reveal identity after the data is clean.

This is the model-evaluation equivalent of a blind taste test. The Pepsi Challenge showed decades ago that people who prefer Pepsi in blind tastings often favor Coke once they can see the label. AI companies are figuring out the same thing about benchmarks. When developers know a model is from DeepSeek, they test it differently. When they think it might be DeepSeek, they test it with the hopeful intensity of believers.

The irony is that Hunter Alpha benefited from the DeepSeek misattribution. The belief that it was V4 drew attention, drove adoption, and created usage volume that Xiaomi would have struggled to generate under its own brand. One trillion tokens of usage in about a week: would MiMo-V2-Pro have achieved that if it had launched with Xiaomi’s name on it?

Probably not. Brand is evaluation bias, and stealth testing is the closest thing to an honest antidote.

The broader implication is about the structure of Chinese AI development. The narrative has been: DeepSeek is special, a unique exception to the rule that Chinese AI trails American AI. But Hunter Alpha suggests a different story: the capabilities are proliferating. The talent (Luo Fuli, among others) is spreading. The techniques (efficient training, MoE architectures, reasoning-focused optimization) are becoming institutional knowledge across multiple Chinese companies.

Xiaomi. Not an AI lab. A hardware company that now happens to produce frontier-class models. If Xiaomi can do this, so can ByteDance (already doing it with Doubao), Alibaba (Qwen), Tencent (Hunyuan), and probably a dozen others we haven’t heard of yet.

The DeepSeek moment wasn’t a one-time shock. It was a seed. And Hunter Alpha is proof that the seed has germinated.

Xiaomi is partnering with five major agent frameworks, including OpenClaw, to offer a week of free developer access to MiMo-V2-Pro. Xiaomi’s stock jumped 5.8% on the reveal.

For a model that spent a week being mistaken for someone else, that’s a remarkable debut. The mystery was the marketing. The misattribution was the validation. And the reveal (it was us all along) is the kind of corporate theater that only works when the underlying product is genuinely good.

The model that nobody expected, from the company nobody expected, using the playbook everybody now expects. The stealth test passed.