Just as Apple (AAPL) has built an ecosystem that makes consumer electronics as ubiquitous in households as laundry detergent and toothpaste, NVIDIA (NVDA) is working to build a similar ecosystem for enterprises, one designed to generate recurring revenue in the era of AI inference. This shift in business model is expected to become a key turning point for investors by adding balance to NVIDIA’s investment thesis.
In recent years, NVIDIA has achieved explosive profit growth as one major hyperscale customer after another built data centers reliant on its GPUs. The data center business has grown so large that it dwarfs other segments such as professional visualization, gaming, automotive, and robotics. In fiscal year 2026, it accounted for nearly 90% of total revenue. This concentration puts pressure on NVIDIA to keep selling GPUs to hyperscale customers in order to sustain its rapid growth.
NVIDIA’s latest Rubin architecture has begun to address this dependence. It comprises six co-designed chips aimed at improving rack-level efficiency in data centers. Many of Rubin’s advances target AI inference rather than training. AI models serve as knowledge bases that intelligent agents and tools apply to real-world tasks, and running those models for inference requires immense computing power.
The “tokenization” of AI inference gives NVIDIA a path to recurring revenue. The idea is that hyperscale customers will charge their clients based on the number of AI inference tokens they use. As generative AI, AI agents, and physical AI applications become widespread, the number of tokens required will grow accordingly. NVIDIA’s hardware and software are designed specifically to process these tokens quickly, which should make them highly appealing to hyperscale customers. In short, NVIDIA aims to build an ecosystem of specialized AI chips, networking hardware, and inference software that scales with token demand, since AI agents and physical AI will require far more tokens than simple chat-based generative AI.
NVIDIA’s roadmap to recurring revenue through AI inference mirrors Apple’s blueprint. The iPhone sits at the core of Apple’s product ecosystem, with all of its other products, from the Mac and iPad to the Apple Watch and AirPods, complementing one another. This is analogous to how NVIDIA, building on its GPU business, uses “extreme co-engineering” to capture revenue across AI data center computing, networking, and storage.
Apple also offers services that directly support these products (such as iCloud), as well as services that can be integrated with and used through them (such as Apple Music, Apple TV+, and Apple Card). NVIDIA’s recurring revenue stream from AI inference tokens will follow the same business model.
Apple has evolved into a stable, high-margin, moderately growing company that generates substantial free cash flow for share buybacks and steadily rising dividends. Similarly, NVIDIA is expected to allocate 50% of its free cash flow this year to share repurchases and dividends. NVIDIA currently pays a quarterly dividend of only $0.01 per share, but it may follow the example of Apple, which has raised its dividend for 14 consecutive years, by announcing a substantial dividend increase followed by modest annual growth.
In conclusion, NVIDIA’s move to build recurring revenue streams is a prudent strategy to mitigate potential future downturns in hyperscale capital expenditures, thereby making the investment thesis for its stock more attractive to long-term investors.