There is a new artificial intelligence (AI) model in town—DeepSeek.
The Chinese-made model, which was first released on January 20, has drawn attention from around the world and sent shock waves through the AI industry because of its use of less technologically advanced chips.
Moreover, the company’s AI Assistant, which is powered by DeepSeek-V3, has already become the top-rated free application on Apple’s App Store in the US. The Chinese startup’s latest release also sent Nvidia’s stock price plummeting 17 percent on Monday.
But what is DeepSeek all about?
DeepSeek, the company founded in December 2023 by Liang Wenfeng, offers a new and free AI-powered chatbot that works in much the same way as ChatGPT, developed by the AI research company OpenAI.
It is used for the same set of tasks, and for coding and mathematics the AI model is reportedly as powerful as OpenAI’s o1 model, which was released last year.
“We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrates remarkable reasoning capabilities,” the company said in a statement on January 20, adding, however, that the model faces challenges such as poor readability and language mixing.
To address these challenges, the company said it introduced DeepSeek-R1, which incorporates “multi-stage training and cold-start data before RL. DeepSeek-R1 achieves performance comparable to OpenAI-o1-1217 on reasoning tasks. To support the research community, we open-source DeepSeek-R1-Zero, DeepSeek-R1, and six dense models (1.5B, 7B, 8B, 14B, 32B, 70B) distilled from DeepSeek-R1 based on Qwen and Llama,” the statement added.
However, this new AI model, like many other Chinese AI models, is reportedly trained to avoid politically sensitive questions.
The model has already drawn attention from OpenAI’s Sam Altman, who called it an “impressive model, particularly around what they’re able to deliver for the price,” in a post on X.
“We will obviously deliver much better models and also it’s legit invigorating to have a new competitor! we will pull up some releases,” Altman added.
Why did US companies like Nvidia take a hit?
According to reports, the privately owned company’s lower costs disrupted financial markets on January 27.
The Nasdaq fell 3 percent in a broad sell-off that included global chip makers and data centre companies, while Nvidia’s stock price plunged 17 percent on Monday before slowly recovering on Tuesday, when it was up roughly 4 percent by midday.
Nvidia fell to third place in market value, behind Apple and Microsoft, on Monday, when its market value shrank to $2.9 trillion from $3.5 trillion, Forbes reported.
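As a rough check, the market value figures and the 17 percent share price drop can be compared with a few lines of Python. This is only a sketch using the rounded numbers quoted above; the variable and function names are illustrative and not taken from any report.

```python
# Rounded figures as reported in the article, in trillions of US dollars.
value_before = 3.5
value_after = 2.9

def percent_drop(before, after):
    """Return the percentage decline from `before` to `after`."""
    return (before - after) / before * 100

drop = percent_drop(value_before, value_after)
print(f"Implied decline in market value: {drop:.1f} percent")
```

With these rounded inputs the implied decline comes out at about 17 percent, in line with the reported fall in the share price.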
DeepSeek hit by ‘large-scale malicious attacks’, faces website outage
The platform’s founder reportedly “built up a store” of Nvidia A100 chips, which have been banned from export to China since September 2022.
“Some experts believe this collection—which some estimates put at 50,000—led him to build such a powerful AI model, by pairing these chips with cheaper, less sophisticated ones,” the BBC said.
DeepSeek was also hit by “large-scale malicious attacks” on the same day it became the most-downloaded free app, and it suffered a website outage on Monday.
However, in a research paper explaining how it built its technology, the company revealed that it used only a fraction of the computer chips that leading AI companies rely on to train their systems.
While global companies train their AI models with supercomputers using 16,000 chips or more, DeepSeek’s team of engineers revealed they required only 2,000 Nvidia chips.
Moreover, the company’s engineers revealed the DeepSeek-V3 model required less than $6 million in computing power through the use of lower-capability Nvidia H800 chips.
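To put the chip figures in perspective, the short Python sketch below works out what share of a typical training cluster DeepSeek reportedly used. It relies only on the two chip counts quoted above, treating 16,000 as the lower bound for other companies; the variable names are illustrative assumptions.

```python
# Chip counts as reported in the article.
typical_chips = 16_000   # lower bound cited for leading AI companies
deepseek_chips = 2_000   # Nvidia chips DeepSeek's engineers said they needed

share = deepseek_chips / typical_chips    # fraction of a typical cluster
factor = typical_chips / deepseek_chips   # how many times fewer chips

print(f"DeepSeek used {share:.1%} of the chips, or {factor:.0f} times fewer.")
```

On these numbers, DeepSeek needed at most one-eighth of the chips that leading companies are said to use.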