DeepSeek: The Chinese ChatGPT


Introduction

Since its launch in December 2024, DeepSeek-V3, an artificial intelligence model developed by the Chinese startup 深度求索 (DeepSeek), has been challenging global AI giants such as OpenAI, Google, and Meta. Open-source, high-performing, and cost-effective, the model has been dubbed the "Chinese ChatGPT" by international media. But what lies behind this innovation? How did a relatively unknown company manage to compete with industry leaders? This article traces a technological success story with geopolitical implications.


1. The Origins of DeepSeek: An Ambition Born in the Shadow of a Hedge Fund

Founded in July 2023 by Liang Wenfeng, an engineer who graduated from Zhejiang University, DeepSeek emerged from an internal project of the hedge fund High-Flyer. The fund, which specializes in quantitative analysis, had accumulated a strategic reserve of NVIDIA A100 and H800 chips in anticipation of American sanctions on key technology exports to China. Unlike its Chinese competitors (such as Alibaba and ByteDance), DeepSeek prioritized fundamental research over commercial applications. Its goal? To develop an artificial general intelligence (AGI) capable of rivaling Western models despite hardware restrictions. "We aim to solve the world's most complex problems," stated Liang Wenfeng in an interview with 36Kr.


2. DeepSeek-V3: A Technical Feat at Low Cost

The architecture of DeepSeek-V3 is based on a Mixture of Experts (MoE) approach: the model holds 671 billion parameters, but only 37 billion are activated for each token it processes. Because most of the network stays idle on any given token, this design sharply reduces the computation, and therefore the energy, needed per response while maintaining high performance. The model was trained on 14.8 trillion tokens of multilingual data, at a reported cost of about $5.576 million, a small fraction of the budgets of OpenAI or Meta.
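
To make the idea of sparse activation concrete, here is a minimal sketch of top-k expert routing in plain Python. It is an illustration only, not DeepSeek's implementation: the four toy experts, the gate scores, the softmax gate, and the function names are all assumptions chosen for readability. The only figures taken from the article are the 37 billion active and 671 billion total parameters.

```python
import math


def softmax(scores):
    """Convert raw gate scores into probabilities."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(gate_scores, k):
    """Pick the k highest-scoring experts and renormalize their weights."""
    probs = softmax(gate_scores)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


def moe_forward(x, experts, gate_scores, k):
    """Run only the selected experts and mix their outputs."""
    return sum(w * experts[i](x) for i, w in top_k_route(gate_scores, k))


def make_expert(scale):
    """Hypothetical expert: a real one would be a feed-forward network."""
    return lambda x: scale * x


experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
gate_scores = [0.1, 2.0, 0.3, 1.5]  # assumed router output for one token

routes = top_k_route(gate_scores, k=2)
output = moe_forward(10.0, experts, gate_scores, k=2)

# Share of parameters active per token, using the article's figures.
active_fraction = 37 / 671

print("selected experts:", [i for i, _ in routes])
print(f"active parameter share: {active_fraction:.1%}")
```

Only two of the four toy experts run for this input, which mirrors, at a very small scale, how roughly 5.5% of DeepSeek-V3's parameters are used at a time.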

Engineers optimized every step through key innovations such as FP8 Mixed Precision Training, a reduced-precision calculation method that accelerates training with little loss in quality, and Load Balancing, an algorithm that spreads the workload evenly among the model's "experts" so that none is overloaded or left idle. They also employed knowledge distillation to transfer reasoning capabilities from specialized models like DeepSeek R1.
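
The article does not describe how the load-balancing algorithm works, so the following is only a sketch of one simple possibility: each expert carries a bias that is nudged down when the expert receives more than its fair share of a batch and nudged up when it receives less. The skew values, batch size, step size, and function names are all hypothetical.

```python
import random


def route(scores, bias, k):
    """Select the k experts with the highest biased score."""
    ranked = sorted(
        range(len(scores)), key=lambda i: scores[i] + bias[i], reverse=True
    )
    return ranked[:k]


def simulate_balancing(steps=200, batch=64, k=1, step_size=0.05, seed=0):
    """Toy simulation: expert 0 is naturally favored, the bias corrects it."""
    rng = random.Random(seed)
    skew = [1.0, 0.5, 0.0, -0.5]  # assumed built-in preference per expert
    num_experts = len(skew)
    bias = [0.0] * num_experts
    history = []

    for _ in range(steps):
        load = [0] * num_experts
        for _ in range(batch):
            scores = [skew[i] + rng.gauss(0.0, 1.0) for i in range(num_experts)]
            for i in route(scores, bias, k):
                load[i] += 1

        # Nudge each bias toward an even split of the batch.
        target = batch * k / num_experts
        for i in range(num_experts):
            if load[i] > target:
                bias[i] -= step_size
            elif load[i] < target:
                bias[i] += step_size
        history.append(load)

    return bias, history


final_bias, history = simulate_balancing()
print("first batch load:", history[0])
print("last batch load: ", history[-1])
print("final bias:      ", [round(b, 2) for b in final_bias])
```

The bias affects only which expert is selected, not how its output is weighted, so in this toy setup the favored expert ends up with a lower bias than the least favored one and the per-expert load drifts toward an even split.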

In terms of performance, tests by the benchmarking firm Artificial Analysis place DeepSeek-V3 ahead of other open-source models (Qwen2.5-72B, Llama-3.1-405B) and on par with GPT-4o and Claude-3.5-Sonnet on complex tasks. The model particularly excels at decoding encrypted messages, providing exact responses where others fail, and at explaining code, with outputs enriched by detailed comments and practical guides.


3. An Open-Source Model Shaking Up the Market

Unlike OpenAI, DeepSeek adopted an open-source strategy, making its model freely accessible. This decision propelled its application to the top of the American App Store in January 2025, dethroning ChatGPT.

The economic advantage is twofold. First, API costs are reduced to $0.48 per million tokens (compared to $3 for Claude-3.5). Second, public access through the app is entirely free, whereas ChatGPT charges $20/month for its paid plan. This aggressive pricing appeals to independent developers and emerging economies, while strengthening Chinese technological influence globally.
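
To put the API price gap in concrete terms, the short calculation below applies the two per-million-token prices quoted above to a workload. The 50 million tokens per month is a hypothetical figure chosen for illustration, and the helper name is an assumption.

```python
def api_cost(tokens, price_per_million_usd):
    """Cost in USD for a given number of tokens at a per-million price."""
    return tokens / 1_000_000 * price_per_million_usd


DEEPSEEK_PRICE = 0.48  # USD per million tokens, as quoted in the article
CLAUDE_PRICE = 3.00    # USD per million tokens, as quoted in the article

monthly_tokens = 50_000_000  # hypothetical workload

deepseek_cost = api_cost(monthly_tokens, DEEPSEEK_PRICE)
claude_cost = api_cost(monthly_tokens, CLAUDE_PRICE)
price_ratio = CLAUDE_PRICE / DEEPSEEK_PRICE

print(f"DeepSeek-V3: ${deepseek_cost:,.2f} per month")
print(f"Claude-3.5:  ${claude_cost:,.2f} per month")
print(f"Price ratio: {price_ratio:.2f}x")
```

At these quoted prices the same workload costs a little over six times more on the pricier API, which is the gap that makes the offer attractive to small teams.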


4. Controversies and Challenges

One of the most intriguing incidents concerns DeepSeek-V3's confusion about its own identity. During tests, the model sometimes claimed to be "a ChatGPT developed by OpenAI." According to experts like Thomas G. Dietterich, this behavior likely stems from training data contaminated by publicly available outputs of other AI systems. Although DeepSeek denies using synthetic data, the affair raises ethical questions about transparency in AI training.

Furthermore, DeepSeek's success calls into question the effectiveness of American sanctions on NVIDIA chips. The company demonstrates that software innovation, through optimized architectures like MoE, can compensate for hardware limitations. "It's a lesson in efficiency for the West," emphasizes Chris McKay of Maginative.


5. Geopolitical Implications and Future

DeepSeek embodies a turning point in the technological rivalry between China and the United States. By open-sourcing its models, DeepSeek lets China present them as global public goods, gaining soft power while circumventing Western restrictions.

In the short term, the link between DeepSeek and the hedge fund High-Flyer could lead to applications in algorithmic trading, while other models in the family (such as compact versions of DeepSeek R1) are already being tested in universities for research in healthcare and education.


Conclusion

DeepSeek-V3 is not just a clone of ChatGPT: it is a symbol of China's rising power in AI. By combining technical innovation, an open-source strategy, and resilience against sanctions, DeepSeek redefines the rules of the game. It remains to be seen whether the model can maintain its trajectory against established giants and address the ethical challenges it raises.