Alibaba Qwen2.5-Max officially released, surpassing GPT-4o and DeepSeek-V3
2025-01-29 16:02:28

According to Tongyi's official microblog, Qwen2.5-Max was officially released on January 29. The model posted world-leading results on mainstream benchmarks covering knowledge (MMLU-Pro, which tests university-level knowledge), programming (LiveCodeBench), general capabilities (LiveBench), and human preference alignment (Arena-Hard). The Tongyi team evaluated the instruction-tuned and base versions of Qwen2.5-Max separately. The instruction-tuned model is the version that users can try directly through chat. On benchmarks such as Arena-Hard, LiveBench, LiveCodeBench, GPQA-Diamond, and MMLU-Pro, Qwen2.5-Max is on par with Claude-3.5-Sonnet and outperforms GPT-4o, DeepSeek-V3, and Llama-3.1-405B on almost all of them.