Over the weekend, the Chinese AI company DeepSeek made headlines when it overtook OpenAI’s ChatGPT as the most downloaded app on the Apple App Store. Its commercial success came after DeepSeek published a series of technical reports showing that its latest R1 models, which are substantially cheaper for the company to train and for customers to use, match and sometimes beat OpenAI’s best publicly available models.
What did DeepSeek do that OpenAI, with its much larger budget, did not? Because OpenAI has been somewhat secretive about how it trained its o1 model, the previous leader on a number of benchmark tests, it is difficult to say for sure. However, there are some obvious distinctions between the two companies’ strategies, as well as areas where DeepSeek appears to have made remarkable strides.
The main distinction, and the one that undoubtedly caused chip manufacturers’ stocks to plummet on Monday, is that DeepSeek is producing competitive models far more cheaply than its larger rivals.
DeepSeek’s R1 and R1-Zero achievement
The company’s most recent R1 and R1-Zero “reasoning” models are built on DeepSeek’s V3 base model, which the company says was trained for less than $6 million in computing costs on older NVIDIA hardware that, unlike NVIDIA’s cutting-edge chips, is still legal for Chinese companies to purchase. By contrast, training GPT-4 cost more than $100 million, according to OpenAI CEO Sam Altman.
“U.S. policies, such as the recent ban on advanced chip sales to China, have forced companies like DeepSeek to improve by optimizing the architecture of their models instead of throwing money at better hardware and Manhattan-sized data centers,” said Karl Freund, founder of the industry analysis firm Cambrian AI Research.
“You can build a model quickly or you can do the hard work to build it efficiently,” Freund said. “The impact on Western companies will be that they’ll be forced to do the hard work that they’ve not been willing to undertake.”
Most of the optimization methods DeepSeek employed were not invented by the company; some had been proposed by its larger rivals, such as the use of data formats that consume less memory. Even non-technical readers of DeepSeek’s publications will come away with the impression that the team used every tool it could find to cut the memory needed for training and designed its model architecture to run as efficiently as possible on the older hardware it was using.
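DeepSeek’s papers describe, among other techniques, training with reduced-precision number formats. The snippet below is a minimal, generic illustration of that idea in PyTorch, not DeepSeek’s actual code or its exact format: storing a layer in bfloat16 instead of the default float32 halves the memory per parameter, and formats such as FP8 shrink it further on hardware that supports them.

```python
import torch
from torch import nn

# Rough illustration (not DeepSeek's code): the same layer stored in 32-bit
# vs. 16-bit floating point. Lower-precision formats cut the bytes needed
# per parameter, which reduces memory use during training and inference.
layer_fp32 = nn.Linear(4096, 4096)                      # default float32
layer_bf16 = nn.Linear(4096, 4096).to(torch.bfloat16)   # half the bytes per weight

def param_bytes(module: nn.Module) -> int:
    """Total memory occupied by a module's parameters, in bytes."""
    return sum(p.numel() * p.element_size() for p in module.parameters())

print(f"float32 layer:  {param_bytes(layer_fp32) / 1e6:.1f} MB")
print(f"bfloat16 layer: {param_bytes(layer_bf16) / 1e6:.1f} MB")
```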
OpenAI was the first developer to release so-called reasoning models, which tackle hard tasks, especially in math and coding, using a mechanism known as chain-of-thought that mimics humans’ trial-and-error approach to problem solving. The company has not disclosed how it accomplished that.
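As a hypothetical illustration of what chain-of-thought prompting looks like in practice (the prompt text below is invented for this example, not taken from OpenAI or DeepSeek), the model is asked to write out intermediate steps before committing to a final answer:

```python
# A direct question gives the model one shot at the answer.
DIRECT_PROMPT = "What is 17 * 24?"

# A chain-of-thought prompt asks for visible intermediate reasoning,
# which tends to improve accuracy on math and coding problems.
COT_PROMPT = (
    "What is 17 * 24?\n"
    "Think through the problem step by step before giving the final answer.\n"
    "Example of the expected style:\n"
    "  Step 1: 17 * 20 = 340\n"
    "  Step 2: 17 * 4 = 68\n"
    "  Step 3: 340 + 68 = 408\n"
    "  Answer: 408"
)
```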
Historically, generative AI models have been improved with reinforcement learning from human feedback (RLHF): humans label a set of AI responses for their good and bad traits, and the model is encouraged to imitate the good ones, such as accuracy and coherence.
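A minimal sketch of the preference-modeling step behind RLHF, assuming a PyTorch setup and illustrative reward values rather than any lab’s actual training code: human annotators mark which of two replies is better, a reward model is trained so the preferred reply scores higher, and the generative model is then fine-tuned to maximize that reward.

```python
import torch
import torch.nn.functional as F

# Hypothetical scores a reward model assigned to pairs of replies,
# where each pair was compared by a human annotator.
chosen_rewards = torch.tensor([1.2, 0.4, 2.0])     # replies humans preferred
rejected_rewards = torch.tensor([0.3, 0.9, -0.5])  # replies humans rejected

# Standard pairwise (Bradley-Terry) ranking loss: push the preferred reply's
# score above the rejected one's. The policy is later tuned against this signal.
loss = -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
print(f"reward-model ranking loss: {loss.item():.3f}")
```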
For R1-Zero, DeepSeek instead relied almost entirely on reinforcement learning, rewarding the model only when it produced verifiably correct, properly formatted answers (sketched below). That pure reinforcement learning approach did not yield flawless results: R1-Zero’s outputs sometimes switched between languages and were hard to read. DeepSeek therefore built a new training pipeline that combines multiple rounds of reinforcement learning with a very small amount of labeled data to nudge the model in the desired direction. The final model, R1, outperformed OpenAI’s o1 model on a number of human-designed math and coding problem sets.
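DeepSeek’s papers describe rule-based rewards, for example checking whether a math answer matches the known solution and whether the output follows the required format, in place of human preference labels. The function below is a hypothetical sketch of that idea, with made-up reward values, not DeepSeek’s published implementation:

```python
import re

def rule_based_reward(model_output: str, reference_answer: str) -> float:
    """Score an output with automatically checkable rules (illustrative values)."""
    reward = 0.0
    # Format reward: the reasoning must be wrapped in <think>...</think> tags.
    if re.search(r"<think>.*?</think>", model_output, flags=re.DOTALL):
        reward += 0.5
    # Accuracy reward: the final answer after the reasoning must be correct.
    final = model_output.split("</think>")[-1].strip()
    if final == reference_answer.strip():
        reward += 1.0
    return reward

print(rule_based_reward("<think>17*24 = 340 + 68</think> 408", "408"))  # 1.5
```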
China closely monitors the technological innovations and practices of Western companies, according to Bill Hannas and Huey-Meei Chang, experts on Chinese technology and policy at the Georgetown Center for Security and Emerging Technology. That monitoring has helped Chinese companies find ways around U.S. policies, such as chip embargoes, that are intended to give American companies an advantage.
They said that while DeepSeek’s achievement is “a wake-up call to U.S. AI companies fixated with enormous (and expensive) solutions,” it is not necessarily a bad thing for the domestic industry. The strategy at a number of Chinese state-funded laboratories is built around “doing more with less.”