A Little-Known Chinese AI Lab Stirs Panic in Silicon Valley with Cheaper, More Efficient AI Models
A Chinese AI lab called DeepSeek has released a free, open-source large language model that outperforms some of the best AI models from the US, sparking concerns about America's global lead in artificial intelligence. The model was reportedly built using reduced-capability Nvidia chips and took only two months and less than $6 million to develop.
DeepSeek's model outperformed Meta's Llama 3.1, OpenAI's GPT-4o, and Anthropic's Claude 3.5 Sonnet on accuracy benchmarks spanning complex problem-solving, math, and coding. The lab also released a reasoning model, R1, which outperformed OpenAI's latest o1 model in many of the same tests.
The development has raised alarms about whether that lead is shrinking and has called into question the massive spending by big tech companies on AI models and data centers. Microsoft CEO Satya Nadella has warned that the developments from China should be taken seriously.
DeepSeek's model was reportedly built using a process called distillation, in which a large "teacher" model is used to train a smaller "student" model to perform well at a specific task. The approach is cost-efficient and helped the lab achieve impressive results with less powerful chips. A minimal sketch of how distillation works in general appears below.
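Distillation is a general machine-learning technique rather than anything specific to DeepSeek, whose exact training pipeline is not public. The PyTorch sketch below shows the standard form of the idea: a frozen teacher produces softened probability distributions that a smaller student learns to match, alongside ordinary training on true labels. The architectures, temperature, and loss weighting here are illustrative assumptions, not details of DeepSeek's process.

```python
# Minimal knowledge-distillation sketch (PyTorch). Illustrative only:
# the model sizes, temperature T, and loss mix alpha are assumptions,
# not details of DeepSeek's actual training pipeline.
import torch
import torch.nn as nn
import torch.nn.functional as F

# A larger "teacher" (assumed already trained) and a smaller "student".
teacher = nn.Sequential(nn.Linear(128, 1024), nn.ReLU(), nn.Linear(1024, 10))
student = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))
teacher.eval()  # teacher weights stay frozen during distillation

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
T, alpha = 2.0, 0.5  # softmax temperature; weight of soft vs. hard loss

def distill_step(x, hard_labels):
    with torch.no_grad():
        teacher_logits = teacher(x)  # no gradients through the teacher
    student_logits = student(x)
    # Soft loss: match the teacher's temperature-softened distribution.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # standard rescaling so gradients keep their magnitude
    # Hard loss: ordinary cross-entropy against the true labels.
    hard_loss = F.cross_entropy(student_logits, hard_labels)
    loss = alpha * soft_loss + (1 - alpha) * hard_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# One toy step on random data, just to show the call shape.
x = torch.randn(32, 128)
y = torch.randint(0, 10, (32,))
print(distill_step(x, y))
```

The appeal of the approach is exactly what the article describes: most of the expensive learning has already been done by the teacher, so the student can reach strong performance with far less compute and on less powerful hardware.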
Little is known about the lab and its founder, Liang Wenfeng, who is reportedly also the founder of a Chinese hedge fund called High-Flyer Quant. DeepSeek is not the only Chinese company making inroads in AI, however: others, such as 01.ai and ByteDance, have also released AI models that they claim outperform those from the US.
Experts say that the need to work around strict US semiconductor export restrictions has pushed Chinese labs toward building more efficient AI models. "Necessity is the mother of invention," said Perplexity CEO Aravind Srinivas. "Because they had to figure out work-arounds, they actually ended up building something a lot more efficient."