International Finance

IF Insights: What DeepSeek’s emergence means for AI industry

What sets DeepSeek on a collision course with the entire American tech ecosystem is that its AI model is cheaper to run and works on less advanced chips

On January 27, news broke that caused a bloodbath in Western stock markets. DeepSeek, a Chinese artificial intelligence (AI) startup, is making waves in the American tech circuit with its latest AI model.

The AI model was reportedly developed at a cost of just USD 5.6 million. Unlike its American counterparts, DeepSeek relies on open-source technology and lower-end chips, negating the need for high-end hardware restricted by US export controls.

Established in 2023 by Liang Wenfeng, a hedge fund and artificial intelligence industry veteran, DeepSeek has so far been funded solely by Liang’s quantitative hedge fund, High-Flyer. This unique structure allows the company to focus on long-term research without the constraints of external investors.

Will US Tech Dominance Be Disrupted?

Despite DeepSeek’s innovative artificial intelligence model having the potential to reshape market dynamics toward a “cost-friendly future,” it still faces challenges, most notably the disadvantage imposed by Washington’s export restrictions on advanced chips.

Still, the timing of DeepSeek’s product release cannot be taken lightly, as many American tech firms are set to report their earnings this week. Analysts already expect slow profit growth for companies like Apple and Microsoft, intensifying concerns about inflated valuations in the AI sector.

Take Nvidia, for example. The chipmaker’s stock has skyrocketed on the AI boom in recent months. With DeepSeek’s entry, demand for Jensen Huang-led Nvidia’s pricey hardware could plummet, dragging down its stock price and market value.

According to reports, Nvidia lost USD 589 billion in market capitalisation on January 27, which is by far the greatest single-day value wipeout of any company in history, more than doubling the USD 279 billion market cap loss the chipmaker experienced on September 3, 2024. The slide has now knocked Nvidia from its position as the world’s most valuable company, sending its valuation from USD 3.5 trillion to USD 2.9 trillion—less than Apple’s and Microsoft’s.

Apart from Nvidia, Netherlands-based chip companies ASML and ASM International also dropped between 10% and 14% in European trading. Nvidia and ASML, which manufactures immersion deep ultraviolet lithography systems used to produce high-end chips, have both benefited from the artificial intelligence spending boom. Incidentally, these are the same companies whose products have been barred from export to China under Washington’s tech war doctrine.

What sets DeepSeek on a collision course with the entire American tech ecosystem is that its AI model is cheaper to run and works on less advanced chips. The open-sourced product has now moved to the top of Apple’s App Store rankings. According to the Chinese startup’s researchers, the DeepSeek-V3 model was trained using Nvidia’s “less advanced” H800 chips, with training costs under USD 6 million.

How DeepSeek Came Into Being

The artificial intelligence model’s key features include its cost-effectiveness and ability to run on reduced-capability chips. While Washington’s trade restrictions have kept the most cutting-edge chips out of China’s hands, DeepSeek has changed the tech war playbook by using easily accessible open-source technology.

“To create R1, DeepSeek had to rework its training process to reduce the strain on its GPUs, a variety released by Nvidia for the Chinese market that has its performance capped at half the speed of its top products,” according to Zihan Wang, a former DeepSeek employee and current PhD student in computer science at Northwestern University.

DeepSeek R1 has won researchers’ praise for its ability to tackle complex reasoning tasks, particularly in mathematics and coding. The model employs a “chain of thought” approach similar to that used by ChatGPT, allowing it to solve problems step by step.

DeepSeek has also released six distilled versions of R1 compact enough to run locally on laptops, and claims that one of them even outperforms OpenAI’s o1-mini on certain benchmarks. Training large language models (LLMs) ordinarily requires a team of highly trained researchers and substantial computing power.

Long before the anticipated US sanctions, the startup acquired a substantial stockpile of Nvidia A100 chips, a type now banned from export to China. The Chinese media outlet 36Kr estimates that the company has over 10,000 units in stock, but Dylan Patel, founder of the AI research consultancy SemiAnalysis, estimates that it has at least 50,000.

“Recognising the potential of this stockpile for AI training is what led Liang to establish DeepSeek, which was able to use them in combination with lower-power chips to develop its models,” MIT Technology Review stated.

DeepSeek-R1 and its variants, such as DeepSeek-R1-Zero, employ large-scale reinforcement learning (RL) techniques and multi-stage training to achieve their capabilities. DeepSeek has not only open-sourced its flagship models but also six smaller distilled variants, ranging from 1.5 billion to 70 billion parameters. These models are MIT-licensed, enabling researchers and developers to freely distill, fine-tune, and commercialise their work.

While both OpenAI and DeepSeek have built their own LLMs, DeepSeek-R1-Zero is claimed to have developed robust reasoning abilities after training solely with RL, unlike traditional models that depend on supervised fine-tuning. To enhance readability and address language inconsistencies, DeepSeek then introduced DeepSeek-R1, which matches OpenAI’s o1 model in performance on reasoning tasks.

DeepSeek has also advanced technical designs such as multi-head latent attention (MLA) and a mixture-of-experts (MoE) architecture, making its models more cost-effective. According to a report by Epoch AI, the latest DeepSeek model required just one-tenth of the computing power used by Meta’s comparable Llama 3.1 model.
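In a mixture-of-experts layer, a small router picks only a few expert sub-networks for each token, so most of the model’s parameters sit idle on any given pass; that sparsity is where the cost savings come from. Below is a toy Python sketch of top-k routing illustrating the idea, not DeepSeek’s actual implementation (the scoring function, expert count, and sizes are invented for illustration):

```python
import math

def softmax(scores):
    # Convert raw router scores into a probability distribution.
    exps = [math.exp(s - max(scores)) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route(score_fn, num_experts, top_k):
    """Pick the top_k experts for one token and renormalise their weights.

    score_fn is a stand-in for the learned router; in a real MoE layer
    it would be a small neural network scoring each expert per token.
    """
    probs = softmax([score_fn(i) for i in range(num_experts)])
    ranked = sorted(range(num_experts), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:top_k]
    weight_sum = sum(probs[i] for i in chosen)
    return [(i, probs[i] / weight_sum) for i in chosen]

# Toy setup: 8 experts, but each token activates only 2 of them,
# so only a fraction of the layer's parameters do work per token.
selected = route(lambda i: float(i % 3), num_experts=8, top_k=2)
active = len(selected)
```

Because only `top_k` of the `num_experts` sub-networks run per token, total parameter count can grow without a proportional rise in per-token compute, which is the efficiency argument behind MoE designs.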

It has also adopted a range of efficiency-focused strategies to refine its model architecture, reducing requirements without compromising performance: improved data exchange between chips to save memory, reduced field sizes to maximise efficiency, and the combining of smaller models through a “Mix-of-Models” approach to achieve superior results.

DeepSeek And Open-Source AI

In the words of the startup’s founder Liang Wenfeng, “An additional challenge Chinese companies face on top of chip sanctions is that their AI engineering techniques tend to be less efficient. We [most Chinese companies] have to consume twice the computing power to achieve the same results. Combined with data efficiency gaps, this could mean needing up to four times more computing power. Our goal is to continuously close these gaps.”

However, DeepSeek has found ways to reduce memory usage and speed up calculations without significantly sacrificing accuracy. As of January 2025, the Chinese tech ecosystem is quickly adopting open-source principles. Alibaba Cloud has released over 100 new open-source AI models, supporting 29 languages and catering to various applications, including coding and mathematics. Similarly, startups like Minimax and 01.AI have open-sourced their models.

According to a 2024 white paper by the China Academy of Information and Communications Technology, the number of AI large language models worldwide has reached 1,328, with 36% originating in China. This now makes China the second-largest contributor to AI, behind the United States.

Alibaba Cloud has partnered with the Beijing-based startup 01.AI, founded by Kai-Fu Lee, to merge research teams and establish an “industrial large model laboratory.”

DeepSeek’s lower-cost models could provide a huge advantage, especially for companies and researchers looking for cheaper alternatives to high-cost AI systems. DeepSeek’s approach is making waves not only because of its efficiency but also because it open-sources its AI, making the technology free and accessible to everyone, unlike OpenAI and other American businesses that charge for access to their artificial intelligence technologies.

DeepSeek has also teamed up with AMD, a competitor to Nvidia, to boost its position in the AI sector. AMD’s expertise in high-performance processing could help DeepSeek create even more powerful and efficient AI models, putting them on par with US-based enterprises. This collaboration may enable DeepSeek to scale its technologies more quickly.

While DeepSeek-R1 still faces challenges with Chinese censorship, it has, no doubt, caught its Western rivals completely off guard. Will cheap, energy-efficient, open-sourced yet performance-oriented artificial intelligence become the new normal in the tech space? Only time will tell.
