Alibaba releases updated AI model that “surpasses DeepSeek”

Alibaba's new AI model set to rival ChatGPT, amidst AI bans

Chinese tech giant Alibaba has announced the release of a new version of its Qwen 2.5 artificial intelligence (AI) model. According to the company, the new model outperforms the highly acclaimed DeepSeek-V3 as well as leading models from OpenAI and Meta.

“Qwen 2.5-Max outperforms … almost across the board GPT-4o, DeepSeek-V3 and Llama-3.1-405B,” Alibaba’s cloud unit said yesterday in an announcement posted on its official WeChat account. The release came on the first day of the Lunar New Year, when most Chinese people are off work and with their families.

Industry watchers believe that the timing of Alibaba’s announcement reflects the pressure that Chinese startup DeepSeek is putting not only on global rivals but also on its domestic competitors.

DeepSeek-V2 was released in May 2024 and quickly disrupted the Chinese AI market due to its aggressively low pricing. As a result, major Chinese tech companies such as ByteDance, Tencent, Baidu, and Alibaba have been compelled to lower their pricing structures in response to DeepSeek’s strategy.


“The Jan. 10 release of DeepSeek’s AI assistant, powered by the DeepSeek-V3 model, as well as the Jan. 20 release of its R1 model, has shocked Silicon Valley and caused tech shares to plunge, with the Chinese startup’s purportedly low development and usage costs prompting investors to question huge spending plans by leading AI firms in the United States,” an analyst told Reuters today.


Recall that two days after the release of DeepSeek-R1, TikTok owner ByteDance released an update to its flagship AI model. In its announcement, ByteDance claimed that the model outperformed Microsoft-backed OpenAI’s o1 on AIME. This echoed DeepSeek’s claim that its R1 model rivalled OpenAI’s o1 on several performance benchmarks.

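AIME, the American Invitational Mathematics Examination, is a mathematics competition whose problems are widely used as a benchmark for how well AI models reason through difficult problems. Other benchmarks commonly cited in such comparisons include MATH-500, which also tests mathematics, and SWE-bench Verified, which tests software engineering tasks.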

Alibaba AI update: The DeepSeek effect lingers

DeepSeek was founded in May 2023 by Liang Wenfeng, a prominent figure in both the hedge fund and AI sectors. Its innovative approach to AI development has captured the world’s attention.

Its training methodology sets it apart from traditional approaches. The company relies on reinforcement learning, a trial-and-error process in which a model improves through feedback on its own outputs, much as humans learn.
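To make the trial-and-error idea concrete, here is a toy sketch of learning from feedback. It is not DeepSeek's training code; it is a classic epsilon-greedy "multi-armed bandit", and the function name `run_bandit`, the success probabilities, and the step count are all assumptions chosen for illustration. The learner tries options, observes a reward, and gradually favours whatever has worked best so far.

```python
import random


def run_bandit(success_probs, steps=5000, epsilon=0.1, seed=0):
    """Toy trial-and-error learner (epsilon-greedy bandit).

    success_probs: hypothetical chance that each option yields a reward.
    Returns the learned value estimates and how often each option was tried.
    """
    rng = random.Random(seed)
    counts = [0] * len(success_probs)
    estimates = [0.0] * len(success_probs)

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: occasionally try a random option.
            arm = rng.randrange(len(success_probs))
        else:
            # Exploit: pick the option that currently looks best.
            arm = max(range(len(estimates)), key=lambda i: estimates[i])

        # Feedback: reward of 1 on success, 0 on failure.
        reward = 1.0 if rng.random() < success_probs[arm] else 0.0

        # Update the running average reward for the chosen option.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return estimates, counts


estimates, counts = run_bandit([0.2, 0.5, 0.8])
print("estimated values:", [round(e, 2) for e in estimates])
print("times tried:", counts)
```

Training a large language model this way is far more involved, but the loop is the same in spirit: act, receive feedback, adjust.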

Its use of Mixture-of-Experts (MoE) architecture enables the models to activate only a fraction of their parameters at any given time. This reduces computational costs without compromising performance, allowing DeepSeek’s models to operate efficiently even on less powerful hardware.
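The routing idea behind Mixture-of-Experts can be sketched in a few lines. This is a generic top-k routing illustration, not DeepSeek's implementation; the names `moe_forward` and `make_expert`, the toy "experts" and the gate weights are hypothetical. A small router scores every expert for a given input, and only the highest-scoring few are actually run.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    """Hypothetical toy expert: multiplies every input value by `scale`."""
    return lambda x: [scale * v for v in x]


def moe_forward(x, gate_weights, experts, top_k=2):
    """Run only the top_k highest-scoring experts and blend their outputs."""
    # Router: one score per expert (dot product of gate row and input).
    scores = [sum(w * v for w, v in zip(row, x)) for row in gate_weights]

    # Keep the top_k experts; the rest are skipped entirely.
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    weights = softmax([scores[i] for i in chosen])

    output = [0.0] * len(x)
    for w, i in zip(weights, chosen):
        y = experts[i](x)  # only the chosen experts do any work
        output = [o + w * yi for o, yi in zip(output, y)]
    return output, chosen


# Eight toy experts, of which only two are activated for this input.
experts = [make_expert(s) for s in range(1, 9)]
gate_weights = [[0.1 * i, 0.0] for i in range(8)]
output, chosen = moe_forward([1.0, 0.0], gate_weights, experts, top_k=2)
print("experts used:", chosen, "out of", len(experts))
```

With eight experts and `top_k=2`, three-quarters of the expert parameters sit idle for any single input, which is where the compute saving comes from.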

While large Chinese tech companies like Alibaba have hundreds of thousands of employees, DeepSeek operates like a research lab, staffed mainly by young graduates and doctoral students from top Chinese universities.


The affordability and efficiency of DeepSeek’s models have sparked a reevaluation of the resources needed for AI development. DeepSeek’s pricing strategy has been a game-changer. While its chatbot is free to use, the company charges just $0.55 per million input tokens and $2.19 per million output tokens for its API services. 

In contrast, OpenAI’s API services cost $15 per million input tokens and $60 per million output tokens for similar capabilities. This significant cost advantage has made DeepSeek an attractive option for developers and businesses looking to integrate AI into their operations without incurring exorbitant costs.
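A quick calculation shows how the gap plays out in practice. The per-million-token prices below are the ones quoted above; the workload of 2 million input tokens and 500,000 output tokens is a hypothetical example, and `api_cost` is an illustrative helper, not part of either company's API.

```python
# Prices in US dollars per million tokens, as quoted in this article.
PRICES = {
    "deepseek": {"input": 0.55, "output": 2.19},
    "openai": {"input": 15.00, "output": 60.00},
}


def api_cost(provider, input_tokens, output_tokens):
    """Return the cost in US dollars for a given number of tokens."""
    price = PRICES[provider]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload: 2 million input tokens, 500,000 output tokens.
for provider in PRICES:
    cost = api_cost(provider, 2_000_000, 500_000)
    print(f"{provider}: ${cost:.2f}")
```

At these prices, the same hypothetical workload costs roughly $2.20 on DeepSeek and $60 on OpenAI, a difference of about 27 times.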

The company’s use of Nvidia’s less advanced H800 chips for training has challenged the prevailing narrative that AI requires substantial financial and computational investments. This approach has led industry observers to question whether the massive expenditures by U.S. tech giants are justified or sustainable in the long term.

However, the chatbot lacks some advanced features offered by competitors, such as AI-generated images and videos, as well as tools like Canvas and customised GPTs.


