The GPU Monopoly Is Now A Systemic Risk

We built an AI revolution on chips we do not control. That is not hyperbole; it is a design flaw. Orders for Nvidia’s new Blackwell processors reportedly exceed 3.6 million units, many of them locked up by the largest clouds. Nvidia controls about ninety-four percent of the GPU market and recently disclosed that two unnamed customers made up thirty-nine percent of its quarterly revenue. OpenAI’s finance chief said the quiet part out loud this year: demand is voracious and capacity is not keeping up. The AI grid is more private than public, more concentrated than diversified, and more brittle than investors want to admit. Everyone talks about open models, but the bottleneck is not code. It is the physics of chips, the politics of allocation, and the economics of a single-supplier architecture. When compute becomes the choke point, the invisible hand turns into a very visible queue.

The Compute Cartel Risk

A handful of firms now act as de facto gatekeepers of AI progress. They decide who rents GPUs, at what price, and on what terms. That is not a cartel in the legal sense, but it rhymes with one in outcome. Prices stay elevated, supply stays tight, and those without scale stay shut out. This shift has real second-order effects. It determines which research paths get funded and who can build at the frontier. It also increases the blast radius of operational mistakes. We have already seen how centralized tooling can amplify risk. Microsoft’s own security experts warned that AI copilots often inherit pass-through permissions, exposing data to employees who should not see it. Concentrate compute and data in the same few places, and you are engineering common-mode failures. One permissions error becomes a firm-wide breach vector. The president of Signal summed up the power shift clearly last year: only a handful of companies can train and deploy large-scale systems, and that gives them inordinate power over institutions. Investors like to believe this concentration is temporary. History says otherwise. In electricity, oil, and telecoms, infrastructure monopolies are sticky. They do not unwind on their own. It takes shocks, regulation, or a substitution wave.

Hoarding As A Dominant Strategy

In game theory, hoarding becomes a dominant strategy when the resource is scarce, the payoff to control is high, and rivals are seen as credible threats. That describes the GPU market. Cloud providers forward-buy inventory, lock in long-dated supply, and announce new data centers as deterrents to each other. Startups prepay for capacity they cannot fully use, hoping to secure their survival option. The rational individual move creates the irrational collective outcome: shortages at the peak and waste in the trough. This is the bullwhip effect with silicon. The psychology is familiar. In a gold rush, you do not ask if the pickaxe is overpriced; you hope the next miner will pay more. But compute is not software. It is steel, land, power, and logistics. It moves slowly and fails in clusters. A supply chain that runs through a handful of fabs, a few memory providers, and a narrow set of advanced packaging lines has fat-tailed risks. One outage does not add to risk; it multiplies it. Even the demand side is correlated. If one model breakthrough raises inference loads, everyone’s workloads spike at once. When correlation is high, redundancy is a myth. Multi-cloud strategies still rely on the same hardware vendor and the same upstream bottlenecks. The probability math here is not comforting: common dependencies correlate failure modes, so the backup tends to fail at the same moment as the system it was meant to protect. The upshot is simple and uncomfortable. The market is optimizing for speed, not resilience. The bill comes due when the cycle turns and capacity sits unutilized or when a geopolitical event reroutes the pipeline overnight.
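The point about correlated failure can be made concrete with a back-of-the-envelope calculation. The sketch below is illustrative only: the function name, the two-parameter model, and every probability in it are assumptions chosen for the example, not measurements of any real provider. It treats each cloud as failing either for its own independent reasons or because a shared upstream dependency fails and takes all of them down together.

```python
def joint_outage_probability(p_local: float, p_shared: float, providers: int = 2) -> float:
    """Probability that every provider is down at the same time.

    p_local:  chance a single provider fails for its own reasons
              (assumed independent across providers).
    p_shared: chance the common upstream dependency fails,
              which takes every provider down at once.
    """
    all_fail_locally = p_local ** providers
    # Either the shared dependency fails, or it holds and all providers
    # happen to fail independently.
    return p_shared + (1.0 - p_shared) * all_fail_locally


if __name__ == "__main__":
    p_local, p_shared = 0.05, 0.02  # made-up inputs for illustration

    independent = joint_outage_probability(p_local, 0.0)
    correlated = joint_outage_probability(p_local, p_shared)
    three_clouds = joint_outage_probability(p_local, p_shared, providers=3)

    print(f"two clouds, no shared dependency:   {independent:.4%}")
    print(f"two clouds, shared dependency:      {correlated:.4%}")
    print(f"three clouds, shared dependency:    {three_clouds:.4%}")
```

Under these assumed inputs, a second cloud cuts the joint outage risk to 0.25 percent only if the two are truly independent. Add a 2 percent shared dependency and the figure is about 2.2 percent, and a third cloud barely moves it, because the shared term sets a floor that no amount of downstream redundancy can get under.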

Sovereignty, Energy, and the Anti-Stack

Compute is now statecraft. Governments in Europe, Asia, and Africa talk about AI sovereignty because they understand the lever is not datasets or model weights; it is who controls the megawatts and the chips. China is pursuing self-sufficiency in AI compute with targets as high as ninety percent within five years, a goal that, if even partially met, rewires the global allocation map. When great powers build separate stacks, markets fragment and hedges get pricier. The default response from the private market is to double down on scale. Yet scale without diversity is fragility. A resilient system looks different. It mixes hardware types, diversifies suppliers, pushes workloads to the edge where possible, and uses algorithms that are parsimonious with compute. That last point is taboo in the current narrative. We celebrate parameter counts and training tokens like they are GDP, but sometimes the antifragile choice is smaller, on-device, or domain-specific. The investor lens should shift from total model size to unit economics per useful task and energy per correct output. Energy is the rate limiter. Bitcoin miners operate tens of gigawatts today. That scale was built through transparent incentives and ruthless cost discipline. The analogy to AI is tempting, and some teams are trying to port it.
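One way to read the proposed investor lens, "unit economics per useful task and energy per correct output," is as a pair of ratios: divide what each request costs by the fraction of requests that come back correct. The sketch below assumes that reading. The class, its field names, and all the numbers are hypothetical and exist only to show the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class ModelProfile:
    name: str
    energy_wh_per_query: float  # assumed energy drawn per request, in watt-hours
    cost_usd_per_query: float   # assumed all-in serving cost per request
    accuracy: float             # fraction of outputs judged correct on the task

    def _per_correct(self, per_query: float) -> float:
        if not 0.0 < self.accuracy <= 1.0:
            raise ValueError("accuracy must be in (0, 1]")
        # Wrong answers still consume resources, so spread the total
        # over the correct ones only.
        return per_query / self.accuracy

    def energy_per_correct_output(self) -> float:
        return self._per_correct(self.energy_wh_per_query)

    def cost_per_useful_task(self) -> float:
        return self._per_correct(self.cost_usd_per_query)


if __name__ == "__main__":
    # Illustrative figures only, not benchmarks of any real system.
    frontier = ModelProfile("large general model", 3.0, 0.0100, 0.92)
    compact = ModelProfile("small domain model", 0.4, 0.0012, 0.85)

    for m in (frontier, compact):
        print(f"{m.name}: "
              f"{m.energy_per_correct_output():.2f} Wh per correct output, "
              f"${m.cost_per_useful_task():.4f} per useful task")
```

On these made-up inputs the smaller model is less accurate yet several times cheaper per correct answer, which is the kind of comparison a parameter count hides.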

The Limits Of Decentralized Compute

Decentralized networks like the one Gonka proposes try to turn idle GPUs into a community grid. The idea borrows from Bitcoin’s open market for hashing. The ambition is noble and, if it worked at scale, it could be a release valve on concentration. But we should be honest about the design challenges. Galaxy Research found that decentralized networks can win on certain workloads, but verification and reliability are hard. The nonprofit Epoch AI called it the paradox of distributed systems: the more open they get, the more coordination and verification they require. In other words, you trade monopoly risk for coordination risk. That is not automatically better. It is a different risk profile. The Libermans say they avoid features like delegation to blunt centralization pressure. That helps on paper, but capital intensity has its own gravity. Where there is money to be made, pools appear, rules get gamed, and the efficient frontier narrows around the well-capitalized and the cheap-energy players. Technology alone will not save the governance. In open systems, incentive design and auditability are not add-ons; they are the core product. Without them, you get a thin veneer of openness on top of the same old hierarchy.

What The Market Is Pricing Wrong

The market is treating GPUs like oil in 2007: high prices forever and a single path to growth. It is underpricing two risks. The first is the security externality of centralization. The more code and data you run through a few platforms, the more you invite systemic breaches. The current generation of AI copilots already shows how easy it is to overexpose internal information by accident. The second is the elasticity of demand with respect to better algorithms. The assumption that only more chips drive progress is a convenient story for those selling capacity. But history in computing shows periodic shocks where software efficiency resets the curve. Compiler breakthroughs, pruning, distillation, retrieval, and specialized models can reduce the need for bulk training runs. If that happens on the same timeline as new supply, the market flips from famine to glut faster than the capex payback period. Investors should not bet against optimization. They should model it.
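Modeling it does not require much. The toy model below is a sketch under loud assumptions: constant annual growth rates, a single demand-to-capacity ratio standing in for the whole market, and a glut threshold picked for illustration. None of the parameter names or values come from any dataset. It asks one question: given growth in useful work demanded, gains in work per chip, and growth in installed chips, in which year does the ratio fall below the threshold?

```python
from typing import Optional


def first_glut_year(
    demand_growth: float,
    efficiency_gain: float,
    supply_growth: float,
    start_ratio: float = 1.0,
    glut_threshold: float = 0.7,
    horizon: int = 10,
) -> Optional[int]:
    """Return the first year the demand-to-capacity ratio drops below
    glut_threshold, or None if it never does within the horizon.

    start_ratio above 1.0 means the market starts in shortage.
    All growth rates are annual fractions (0.5 means 50 percent).
    """
    ratio = start_ratio
    for year in range(1, horizon + 1):
        # Demand for useful work grows; capacity grows through both
        # more chips and more work extracted per chip.
        ratio *= (1 + demand_growth) / ((1 + efficiency_gain) * (1 + supply_growth))
        if ratio < glut_threshold:
            return year
    return None


if __name__ == "__main__":
    payback_years = 4  # assumed capex payback period

    # Scenario A: hardware supply grows, software efficiency stands still.
    a = first_glut_year(0.6, 0.0, 0.4, start_ratio=1.2)
    # Scenario B: same supply growth, plus 50 percent yearly efficiency gains.
    b = first_glut_year(0.6, 0.5, 0.4, start_ratio=1.2)

    print("no efficiency gains -> glut year:", a)
    print("with efficiency gains -> glut year:", b)
    if b is not None and b < payback_years:
        print("glut arrives before the assumed payback period ends")
```

With these illustrative inputs, a market that starts 20 percent short never reaches glut when efficiency is flat, but crosses the threshold in year two once efficiency compounds alongside new supply, well inside an assumed four-year payback.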

The Antifragile Path To Open Compute

If compute is the new grid, we should learn from grid engineering. You do not rely on a single plant for base load. You mix sources, build fail-safes, and design for partial failure. For AI, that means three practical shifts. Diversify the hardware stack beyond a single vendor and a single memory tech. Push verification to the protocol level when using decentralized networks, so work can be audited without trusting the worker. And align incentives with reliability, not just contribution, so the network rewards what users actually need. None of this is as headline-friendly as a new model size or a large GPU order. It is slower, and it is less glamorous. But systems that survive are rarely glamorous. They are boring and redundant. The open compute movement will not succeed by reciting open slogans. It will succeed by absorbing the lessons of markets, engineering, and history: concentration is efficient until it is not, scarcity creates perverse behavior, and resilience is built in the small decisions long before the crisis. The hard truth remains. Whoever owns the chips today owns the pace of innovation. The smart money is not just asking how to buy more. It is asking how to need less, prove more, and depend on fewer single points of failure.
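The second and third shifts, auditing work without trusting the worker and paying for reliability rather than raw contribution, can be sketched in a few lines. What follows is a simplified illustration, not the design of Gonka or any other network: the function names, the sampling rate, and the payout rule are all hypothetical. It also assumes jobs are bit-for-bit deterministic so results can be compared directly, which real GPU workloads often are not.

```python
import hashlib
import random
from typing import Optional, Sequence


def run_task(task: bytes) -> bytes:
    """Stand-in for a deterministic compute job (hypothetical)."""
    return hashlib.sha256(task).digest()


def spot_check(
    tasks: Sequence[bytes],
    claimed_results: Sequence[bytes],
    sample_rate: float = 0.2,
    rng: Optional[random.Random] = None,
) -> float:
    """Recompute a random sample of tasks and return the fraction whose
    claimed result matches. The verifier never has to trust the worker."""
    if not tasks or len(tasks) != len(claimed_results):
        raise ValueError("tasks and claimed_results must be non-empty and equal length")
    rng = rng or random.Random()
    sample_size = min(len(tasks), max(1, int(len(tasks) * sample_rate)))
    indices = rng.sample(range(len(tasks)), sample_size)
    matches = sum(run_task(tasks[i]) == claimed_results[i] for i in indices)
    return matches / sample_size


def reward(base_payment: float, verified_fraction: float, uptime: float,
           min_verified: float = 0.99) -> float:
    """Pay for reliability, not just contribution: nothing if audits fail,
    otherwise the base payment scaled by observed uptime."""
    if verified_fraction < min_verified:
        return 0.0
    return base_payment * uptime


if __name__ == "__main__":
    tasks = [f"job-{i}".encode() for i in range(50)]
    honest = [run_task(t) for t in tasks]
    lazy = [b"\x00" * 32 for _ in tasks]  # worker that returns junk

    rng = random.Random(7)
    for label, results in (("honest", honest), ("lazy", lazy)):
        verified = spot_check(tasks, results, rng=rng)
        print(label, "verified:", verified, "payout:", reward(100.0, verified, uptime=0.95))
```

The design choice worth noticing is that the audit cost scales with the sampling rate rather than with the total work, and that a worker who contributes a great deal of unverifiable output earns nothing. Whether a scheme like this holds up against collusion or non-deterministic workloads is exactly the coordination risk the decentralized approach has to price in.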