The Qwen team from Alibaba has just released a new version of its open-source reasoning AI model with some impressive benchmarks.
Meet Qwen3-235B-A22B-Thinking-2507. Over the past three months, the Qwen team has been hard at work scaling up what they call the “thinking capability” of their AI, aiming to improve both the quality and depth of its reasoning.
The result of their efforts is a model that excels at the really tough stuff: logical reasoning, complex maths, science problems, and advanced coding. In areas that typically require human expertise, the new Qwen model now sets the standard among open-source models.
On reasoning benchmarks, Qwen’s latest open-source AI model achieves 92.3 on AIME25 and 74.1 on LiveCodeBench v6 for coding. It also holds its own in more general capability tests, scoring 79.7 on Arena-Hard v2, which measures how well it aligns with human preferences.

At its heart, this is a massive reasoning AI model from the Qwen team with 235 billion parameters in total. However, it uses Mixture-of-Experts (MoE), which means it only activates a fraction of those parameters – about 22 billion – at any one time. Think of it like having a huge team of 128 specialists on call, but only the eight best-suited for a specific task are brought in to actually work on it.
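The "team of specialists" idea can be made concrete with a toy routing sketch. The numbers below (128 experts, 8 active) match the figures reported for this model, but the scoring and function names are purely illustrative, not the model's actual router:

```python
import math
import random

# Toy Mixture-of-Experts routing: a router scores every expert for a
# token, then only the top-k highest-scoring experts are activated.
NUM_EXPERTS = 128
TOP_K = 8

def route(token_scores: list[float], k: int = TOP_K) -> list[tuple[int, float]]:
    """Pick the k highest-scoring experts and softmax-normalise their scores."""
    top = sorted(range(len(token_scores)),
                 key=lambda i: token_scores[i], reverse=True)[:k]
    exps = [math.exp(token_scores[i]) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

random.seed(0)
scores = [random.gauss(0, 1) for _ in range(NUM_EXPERTS)]
active = route(scores)  # only 8 of the 128 experts do any work
```

Each token's output is then a weighted sum of just those eight experts' outputs, which is why only ~22 billion of the 235 billion parameters are exercised per token.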
Perhaps one of its most impressive features is its massive memory. Qwen's open-source reasoning AI model has a native context length of 262,144 tokens, a huge advantage for tasks that involve understanding vast amounts of information.
For the developers and tinkerers out there, the Qwen team has made it easy to get started. The model is available on Hugging Face. You can deploy it using tools like SGLang or vLLM to create your own API endpoint. The team also points to their Qwen-Agent framework as the best way to make use of the model's tool-calling skills.
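Once a server is running (for example, `vllm serve Qwen/Qwen3-235B-A22B-Thinking-2507`), you talk to it like any OpenAI-compatible endpoint. A minimal sketch using only the standard library, assuming vLLM's default local URL and port (adjust for your setup):

```python
import json
import urllib.request

# Assumed defaults for a local vLLM deployment; your host/port may differ.
API_URL = "http://localhost:8000/v1/chat/completions"
MODEL = "Qwen/Qwen3-235B-A22B-Thinking-2507"

def chat(prompt: str, max_tokens: int = 32768) -> str:
    """Send one chat turn to the local endpoint and return the reply text."""
    body = json.dumps({
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }).encode()
    req = urllib.request.Request(
        API_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]
```

The same request shape works against an SGLang deployment, since both expose the OpenAI-compatible chat completions route.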
To get the best performance from their open-source AI reasoning model, the Qwen team has shared a few tips. They suggest an output length of around 32,768 tokens for most tasks, but for really complex challenges, you should boost that to 81,920 tokens to give the AI enough room to "think". They also recommend giving the model specific instructions in your prompt, like asking it to "reason step-by-step" for maths problems, to get the most accurate and well-structured answers.
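Those recommendations translate directly into request parameters. A small sketch, where the helper name and the exact prompt suffix are my own choices rather than an official API:

```python
# Recommended output budgets from the Qwen team's usage tips.
DEFAULT_MAX_TOKENS = 32768     # enough for most tasks
HARD_TASK_MAX_TOKENS = 81920   # extra room to "think" on complex problems

def generation_params(prompt: str, hard_task: bool = False) -> dict:
    """Build chat-request parameters following the recommended settings,
    appending an explicit step-by-step instruction to the prompt."""
    return {
        "messages": [{
            "role": "user",
            "content": f"{prompt}\nPlease reason step-by-step.",
        }],
        "max_tokens": HARD_TASK_MAX_TOKENS if hard_task else DEFAULT_MAX_TOKENS,
    }

params = generation_params("Solve x^2 - 5x + 6 = 0.", hard_task=True)
```

Merging these parameters into the body of a chat completions call applies both tips at once.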
The release of this new Qwen model provides a powerful yet open-source reasoning AI that can rival some of the best proprietary models out there, especially when it comes to complex, brain-bending tasks. It will be exciting to see what developers ultimately build with it.
(Image by Tung Lam)

The post Alibaba’s new Qwen reasoning AI model sets open-source records appeared first on AI News.