Chinese AI company DeepSeek made waves in the technology industry with its claims of achieving performance comparable to leading AI models while dramatically reducing training infrastructure requirements.
While the news led to a historic sell-off of technology stocks, the implications are far from negative, even for core technology providers like Nvidia. If DeepSeek’s claims are true – and that hasn’t yet been validated – the revelations remove the almost insurmountable cost barriers to AI training, opening the door to much broader adoption and competition in the market.
The DeepSeek Announcement
DeepSeek unveiled its DeepSeek-R1 model, which it claims rivals leading AI systems like OpenAI’s GPT-4 and Meta’s Llama. The significance of the news lies not in the model itself or how it may be used, but in the techniques DeepSeek employed to build an LLM that is competitive with other leading large language models.
According to DeepSeek, its model was trained on just 2,048 Nvidia H800 GPUs, costing approximately $5.58 million — a fraction of the infrastructure and cost typically associated with such efforts.
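A quick back-of-the-envelope check shows what those figures imply. The sketch below assumes a rental rate of roughly $2 per GPU-hour — an assumption for illustration, not a number taken from the article — and derives the implied training duration from the reported cluster size and cost:

```python
# Back-of-the-envelope: implied training duration from DeepSeek's reported figures.
# The $2/GPU-hour rental rate is an assumption for illustration.
NUM_GPUS = 2_048            # Nvidia H800s, per DeepSeek's claim
TOTAL_COST_USD = 5_580_000  # reported training cost
RATE_PER_GPU_HOUR = 2.0     # assumed rental rate (USD)

gpu_hours = TOTAL_COST_USD / RATE_PER_GPU_HOUR  # total GPU-hours purchased
wall_clock_hours = gpu_hours / NUM_GPUS         # hours if all GPUs run in parallel
wall_clock_days = wall_clock_hours / 24

print(f"{gpu_hours:,.0f} GPU-hours, roughly {wall_clock_days:.0f} days on {NUM_GPUS} GPUs")
```

Under these assumptions, the reported cost works out to roughly two months of wall-clock training on the cluster — modest by frontier-model standards.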
By employing advanced techniques such as FP8 precision, modular architecture, and proprietary communication optimizations like DualPipe, DeepSeek has purportedly streamlined AI training to a level previously thought unattainable.
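To see why FP8 precision matters, consider the memory footprint of the model weights alone. This sketch is illustrative only — the byte widths are standard, and the 671-billion-parameter count matches DeepSeek's published model size, but the comparison is a simplification of real mixed-precision training:

```python
# Illustrative only: memory needed just to hold model weights at different precisions.
# 671B parameters matches DeepSeek's published model size; treat this as an example.
PARAMS = 671e9

BYTES_PER_PARAM = {"FP32": 4, "FP16/BF16": 2, "FP8": 1}

for precision, nbytes in BYTES_PER_PARAM.items():
    gib = PARAMS * nbytes / 2**30  # bytes -> GiB
    print(f"{precision:>9}: {gib:,.0f} GiB of weights")
```

Halving the bytes per weight roughly halves both the memory required and the data moved between GPUs on every training step, which is where much of the claimed efficiency gain comes from.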
Democratizing AI Training
One of the challenges of current AI training techniques is that the resources are prohibitive, requiring investment that’s only feasible for the largest hyperscalers. This has led to a cloud-first market dominated by a handful of players.
DeepSeek’s approach, however, promises to disrupt that model by making AI training accessible to most enterprises. While this might negatively impact companies like OpenAI, it has the potential to broaden the market for nearly everyone else.
Despite reducing reliance on high-end GPUs, DeepSeek’s approach does not eliminate the need for robust supporting infrastructure. Key infrastructure requirements for AI training, like high-performance storage, low-latency networking, and strong data management frameworks, remain critical:
- Storage: High-throughput storage systems are essential to manage the large datasets used in AI training.
- Networking: Advanced networking solutions minimize bottlenecks during model training and ensure efficient communication across nodes.
- Data Governance: Compliance, security, and checkpointing challenges remain pressing concerns for enterprises adopting AI.
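On the checkpointing point, the core pattern is simple even if doing it at the scale of a large training run is not. Below is a minimal sketch, using hypothetical helper names and plain `pickle` in place of a framework-native checkpoint API, of periodic checkpoints with bounded retention:

```python
import os
import pickle
import tempfile

def save_checkpoint(state, step, ckpt_dir, keep_last=3):
    """Write the training state for `step`, then prune older checkpoints."""
    path = os.path.join(ckpt_dir, f"ckpt_{step:08d}.pkl")
    with open(path, "wb") as f:
        pickle.dump(state, f)
    # Keep only the newest `keep_last` checkpoints to bound storage use.
    ckpts = sorted(p for p in os.listdir(ckpt_dir) if p.startswith("ckpt_"))
    for old in ckpts[:-keep_last]:
        os.remove(os.path.join(ckpt_dir, old))

# Toy training loop: checkpoint every 100 "steps".
ckpt_dir = tempfile.mkdtemp()
state = {"weights": [0.0], "step": 0}
for step in range(1, 501):
    state["step"] = step  # stand-in for a real optimizer step
    if step % 100 == 0:
        save_checkpoint(state, step, ckpt_dir)

remaining = sorted(os.listdir(ckpt_dir))
print(remaining)  # only the most recent checkpoints survive
```

Real training stacks layer failure recovery, sharded writes, and high-throughput storage on top of this pattern, which is exactly why the storage and data-management requirements above persist regardless of GPU count.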
DeepSeek’s approach enables smaller enterprises to participate in AI development by significantly reducing the hardware and costs required for training. It’s a moment that mirrors historic IT transformations, like the transition from mainframes to mini-computers and, ultimately, PCs, where decentralization unlocked new opportunities at every stage.
If DeepSeek’s claims hold, rack-level training clusters may now be possible. This is a tremendous opportunity for storage providers, server OEMs, and networking companies to deliver higher-value products to the enterprise market and take share from the traditional cloud-first AI players.
Storage companies like NetApp and Pure Storage and leading server manufacturers like Dell Technologies, HPE, and Lenovo all stand to benefit as the demand for scalable and cost-effective AI infrastructure grows. This is good news for enterprise customers and the market overall.
The Impact on Nvidia
Nvidia’s stock price sharply dropped following DeepSeek’s announcement, reflecting market fears about potential disruptions to its dominance in the AI GPU market. While DeepSeek’s use of mid-tier GPUs like the H800 highlights an alternative path, Nvidia remains well-positioned due to its entrenched ecosystem, which includes its CUDA platform and investments in AI systems like DGX and Mellanox networking solutions.
Nvidia has built a hedge around its GPU dominance with a robust data center business that extends beyond its GPU offerings. While the company doesn’t break out specific revenue numbers for networking, software, and services, it does offer hints.
In its most recent earnings, the company reported that networking revenue increased 20% year over year, with significant growth in platforms like Spectrum-X, up 3x year over year.
Software revenue is annualizing at $1.5 billion, roughly 4% of total revenue, and Nvidia expects it to exceed $2 billion by year-end, driven by offerings like NVIDIA AI Enterprise, Omniverse, and AI microservices. Software and services contribute higher-margin revenues, enhancing overall profitability, and shouldn’t be impacted by DeepSeek’s announcement.
Overall, reducing the requirements for AI training has an impact on Nvidia’s approach to the market, but it’s one for which the company is prepared:
- Broader Market Adoption: The democratization of training may expand the overall GPU market, even if individual customers purchase fewer high-end units. Increased demand for mid-tier GPUs like the H800 could offset reduced reliance on flagship models.
- Resilience Through Ecosystem Control: Nvidia’s software lock-in, including CUDA, and its investments in system-level solutions like DGX and Mellanox position the company to adapt to changing market dynamics.
- Strategic Adjustments: Nvidia’s ability to manage product flow and pricing, along with its control over the GPU supply chain, ensures its continued relevance despite emerging competition.
What About Broadcom and Marvell?
The stock market reacted negatively to the news, pummeling the share price of custom silicon companies like Broadcom and Marvell. DeepSeek’s approach, however, may prove to be a net positive for these companies.
Marvell and Broadcom each derive significant revenue from delivering custom silicon to public cloud providers. DeepSeek’s announcement, which focuses on AI training, should have minimal impact on this business: the need for accelerators for inference, where much of this custom work is focused, remains unchanged. Likewise, the custom silicon revenue of both companies is heavily influenced by non-AI projects, such as custom Arm-based processors for the CSPs.
The demand for low-latency, high-throughput networking solutions remains essential in DeepSeek’s framework. Broadcom’s dominance in Ethernet switching and Marvell’s strength in energy-efficient, high-bandwidth interconnects position both companies to benefit from the need for advanced interconnects in decentralized AI training environments.
For both Broadcom and Marvell, DeepSeek’s innovation represents less of a disruption and more of a realignment of market dynamics:
- Increased Demand for Networking: DeepSeek’s modular, distributed AI training will likely drive demand for efficient networking solutions, benefiting both companies.
- Expansion Beyond Hyperscalers: Broadening AI training beyond major cloud providers introduces new customers requiring scaled-down yet high-performance infrastructure. This aligns with both Broadcom’s and Marvell’s strategies.
- Limited Downside: Unlike GPU vendors, whose margins might compress with reduced high-end hardware demand, Broadcom’s and Marvell’s products remain critical to AI workflows, positioning them as net beneficiaries of a more decentralized AI market.
Analyst’s Take
If reproducible, DeepSeek’s results will drive a shift in the AI training landscape by lowering costs and democratizing access to advanced model training. While this disrupts traditional reliance on hyperscalers, it introduces opportunities for infrastructure providers and enterprises to innovate.
While DeepSeek’s methods may directly threaten companies like OpenAI and force incumbents like Nvidia to evolve, the overall net impact on the IT industry is positive. Enterprise server, storage, and networking providers will all benefit, as will companies like IBM, which is driving the enterprise-safe AI agenda.
For Nvidia, the news reinforces the need to adapt to evolving market dynamics by leveraging its ecosystem and diversifying its product offerings. Though margins on flagship GPUs may be pressured, the overall expansion of the AI market and its successful diversification into system-level and software-driven differentiation will likely sustain its long-term growth.
Ultimately, DeepSeek’s announcement highlights the ongoing evolution of AI infrastructure toward efficiency, accessibility, and decentralization. Stakeholders across the ecosystem should prepare for an increasingly dynamic and competitive market as the next phase of AI infrastructure reshapes how businesses approach artificial intelligence.
Disclosure: Steve McDowell is an industry analyst, and NAND Research is an industry analyst firm, that engages in, or has engaged in, research, analysis and advisory services with many technology companies; the author has provided paid services to every company named in this article — except for DeepSeek and OpenAI — in the past and may again in the future. Oracle provided technical fact-checking for this article. Mr. McDowell does not hold any equity positions with any company mentioned.