Chinese artificial intelligence firm DeepSeek’s new AI model, R1, has sparked intense debate in the AI community. The company claims that its large language model cost just $5.6 million to train, a fraction of what major tech companies have paid to develop similar models. The claim has raised questions about the high costs of training and running advanced AI workloads, as well as the potential for Chinese firms to outcompete their Western counterparts.
R1 is a reasoning model: it breaks a prompt down into smaller pieces and considers multiple approaches before generating a response, an approach designed to work through complex problems much as a human would. According to DeepSeek, the model performs comparably to OpenAI’s o1 on reasoning benchmarks including AIME 2024, Codeforces, and MATH-500.
The company’s claims have been met with skepticism by some experts, who argue that the figures may be exaggerated or even fabricated. OpenAI, creator of the popular chatbot ChatGPT, has been quick to point out that its own models were not developed using the same approach as DeepSeek’s.
The issue of chip availability has also been raised, with some claiming that DeepSeek may have used export-controlled AI chips to develop its model. DeepSeek denies this, stating that it used mature Nvidia GPUs, including H800 and A100 chips, which are less advanced than Nvidia’s cutting-edge H100 chips.
Despite the controversy, many experts believe that DeepSeek’s achievement marks a positive step for the AI industry. Yann LeCun, chief AI scientist at Meta, sees it as a victory for open-source AI models rather than a win for China over the US. “To people who see the performance of DeepSeek and think: ‘China is surpassing the US in AI.’ You are reading this wrong. The correct reading is: ‘Open source models are surpassing proprietary ones,’” he said.
Ultimately, whether DeepSeek’s claims hold up remains to be seen, but the model’s release has shaken the tech industry and raised important questions about the future of AI development.