Alibaba Qwen2.5-Max officially released, surpassing GPT-4o and DeepSeek-V3

2025-01-29 16:02:28

According to Tongyi's official microblog, Qwen2.5-Max was officially released on January 29. Qwen2.5-Max has demonstrated world-leading performance on mainstream benchmarks covering knowledge (MMLU-Pro, which tests university-level knowledge), programming (LiveCodeBench), overall capability (LiveBench), and human preference alignment (Arena-Hard). The Tongyi team evaluated the instruction model version and the base model version of Qwen2.5-Max separately. The instruction model is the version that users can try directly through dialogue. On Arena-Hard, LiveBench, LiveCodeBench, GPQA-Diamond, and MMLU-Pro, Qwen2.5-Max is on par with Claude-3.5-Sonnet and outperforms GPT-4o, DeepSeek-V3, and Llama-3.1-405B on almost all of them.

