What if the cheap part of AI is the model, and the expensive part is the error? We are treating prediction machines like free leverage when, in fact, they create a new class of liabilities. The paradox of AI adoption is simple: the more we automate, the more one wrong output can cascade into many wrong actions. The market is pricing AI as if it were a linear productivity tool. It is not. It is a non-linear amplifier of both insight and mistake.
A recent survey shows nearly a third of firms have already eaten a negative consequence from AI inaccuracy. That is not a rounding error. In risk terms, this is not a white swan you can budget for. It is normal accident theory, applied to probabilistic software. Knight Capital lost roughly $440 million in about 45 minutes in 2012 from a deterministic code glitch. Now swap deterministic for stochastic and pipe those outputs into customer service, pricing, credit, inventory, and marketing. Small error rates, once looped through automated processes, change scale. A 1 percent hallucination rate is not 1 percent of the business at risk; it is the weak bolt in the bridge that transfers force to the wrong beams during stress. Firms rarely model the propagation. They report accuracy percentages and call it governance. Accuracy is a vanity metric. Loss distribution is the real KPI.
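To make the propagation point concrete, here is a minimal sketch in Python. It assumes each automated step errs independently at the same rate and that a single error anywhere spoils the whole chain. Both are simplifying assumptions, and the function name is illustrative rather than taken from any survey or library.

```python
def chain_failure_probability(step_error_rate: float, steps: int) -> float:
    """Probability that at least one of `steps` independent steps errs.

    Assumption: errors are independent and equally likely at every step.
    Real pipelines are often correlated, which can make things better or worse.
    """
    if not 0.0 <= step_error_rate <= 1.0:
        raise ValueError("step_error_rate must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    # P(no error in any step) = (1 - p) ** k, so P(at least one) is the complement.
    return 1.0 - (1.0 - step_error_rate) ** steps


if __name__ == "__main__":
    for k in (1, 10, 50):
        print(k, round(chain_failure_probability(0.01, k), 4))
```

Under those assumptions, a 1 percent per-step error rate becomes roughly a 9.6 percent chance of a spoiled outcome across ten chained steps, and about 39.5 percent across fifty. The headline accuracy number never changed; the exposure did.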
Insurers are getting there first. Legal scholars note that policies are adding exclusions or limits for AI-related losses. Directors and Officers coverage may not pick up liabilities tied to AI-driven decisions. Translate that to capital markets: more risk sits on the balance sheet and the equity holder pays. This is a duration problem too. AI deployments promise long-run margin expansion, but exclusions pull forward the cost of failure to the first big incident. The valuation story assumes decades of compounding benefits. The liability story recognizes that one material misfire can impair brand, trigger regulatory actions, or force a restatement. We do not book AI risk reserves the way banks book loan-loss provisions. Maybe we should. The absence of a reserve does not mean the risk is gone; it means it is unfunded.
Companies report explainability issues at roughly half the rate of inaccuracy. That sounds manageable until you remember game theory. You are not just deploying a tool; you are entering a contest with adversaries who study your tool’s failure modes. Black boxes are not neutral. They are attack surfaces. Finance has decades of model risk management doctrine. Supervisory guidance like SR 11-7 exists for a reason: you cannot control what you cannot explain. If your procurement, HR, or underwriting stack is making calls you cannot decompose, you are outsourcing governance to weights and training data you do not own. In a stress event, you will not be able to tell regulators or courts why a decision was made. That is not a technical inconvenience. It is a control failure.
Ask executives about top AI risks, and data privacy and security outrank hallucinations. They are right. Surveys show about 80 percent of companies put data risk at the top, and nearly half have already had unintended data exposure during early deployments. The immediate threat is not a rogue superintelligence. It is sensitive data walking out through logs, prompts, or vendor fine-tuning. The next threat is adversarial misuse. Cybersecurity experts warn that newer models are more capable, and that includes being better at bad tasks. Dual-use is not an edge case. It is the default. If you integrate third-party models into your workflow, you inherit their supply chain, their incentives, and their vulnerabilities. Most firms have not mapped this dependency stack. They have an SBOM for software, but not for models and data. You cannot patch what you have not inventoried.
Why are firms pushing AI so hard, so fast? Because competition punishes caution. This is a classic prisoner’s dilemma. If you slow down, your rival captures the operating leverage. Both race, both underinvest in safety, and the system gets more fragile. Insiders have already raised flags about weak oversight inside leading labs. That should tell boards something about the safety budget curve under pressure. In markets, speed without slack is hidden leverage. When the unexpected arrives, it unwinds quickly. Couple that with Goodhart’s Law. Optimize for deployment milestones or demo metrics and behavior will drift away from true safety. A handful of demos, a pilot ROI, and system owners convince themselves that what works in the sandbox will work at scale. It might, until it does not. Complex systems do not fail linearly. They fail by surprise.
We treat model error like an isolated bug. It is not. It is a systemic input that travels through your OODA loop. Mistakes retrain staff behavior, miscalibrate dashboards, and nudge strategy. A wrong forecast becomes a wrong hire becomes an overbuilt capacity that needs discounting to move product. A wrong eligibility call becomes a regulator’s exhibit. The loss does not show up as an accuracy delta. It shows up as churn, charge-offs, or write-downs months later. If you are measuring AI with hit rates instead of expected loss and tail risk, you are managing the theater, not the war.
The fix is not a soothing slogan like humans in the loop. Humans rubber-stamp when the queue grows. Antifragile deployment is more boring and more effective. Build circuit breakers. Create slow lanes for high-stakes decisions with enforced dwell time. Cap the financial exposure per model per day. Use canary cohorts and randomized holdouts so you can see counterfactuals. Run red teams that get paid to break prompts and drain data. Do chaos engineering for decision systems: scheduled failure days where the model is down and the business must operate. Treat the model like a talented junior analyst who is confident and sometimes wrong. Verify, then trust. The point is friction. Small, controlled losses now prevent large, uncontrolled losses later. Think controlled burns in forestry. Skip them and you do not save trees; you store fuel.
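A circuit breaker with a per-model daily exposure cap can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a production control: the class name, the `allow` method, and the idea of measuring exposure as a single dollar amount per decision are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ExposureBreaker:
    """Caps the financial exposure any one model may commit per calendar day."""

    daily_cap: float
    # Running exposure keyed by (model_id, day); hypothetical in-memory store.
    _spent: dict = field(default_factory=dict)

    def allow(self, model_id: str, amount: float, day: date) -> bool:
        """Return True and record the exposure if it fits under the cap.

        Return False (breaker open) if this decision would exceed the cap;
        the caller is then expected to route the decision to a human.
        """
        if amount < 0:
            raise ValueError("amount must be non-negative")
        key = (model_id, day)
        spent = self._spent.get(key, 0.0)
        if spent + amount > self.daily_cap:
            return False
        self._spent[key] = spent + amount
        return True


if __name__ == "__main__":
    breaker = ExposureBreaker(daily_cap=1000.0)
    today = date(2024, 1, 1)
    print(breaker.allow("pricing", 600.0, today))  # fits under the cap
    print(breaker.allow("pricing", 500.0, today))  # would breach the cap
```

The design choice worth noting is that a refused decision is not silently dropped. The breaker only says no; what happens next (a human queue, a slower lane, a hard stop) is a business decision that should be made before the breaker ever trips.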
Accuracy is not the target. Calibration, expected shortfall, and cost-weighted error are better. You do not run a trading desk on win rate; you run it on PnL distribution and drawdown. Apply that discipline to AI. Demand base-rate adjustments and pre-mortems before rollout. Track model drift like you track FX risk. Set explicit thresholds for automatic de-escalation to human-only decisions when drift or anomaly scores spike. And make a rule: the higher the decision stakes, the slower the automation pathway. This is not anti-AI. It is pro-survivorship. When AI succeeds, it will be because we constrained it where it could hurt us most and let it run where errors are cheap and reversible.
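Two of those metrics are easy to compute, which makes their absence from most AI dashboards harder to excuse. The Python sketch below is a simplified illustration: the function names, the binary-label setup, the cost figures, and the empirical tail-average definition of expected shortfall are assumptions chosen for clarity, not a standard taken from any regulator or library.

```python
def cost_weighted_error(y_true, y_pred, fp_cost: float, fn_cost: float) -> float:
    """Average cost per decision, pricing false positives and false negatives differently."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    total = 0.0
    for truth, pred in zip(y_true, y_pred):
        if pred and not truth:
            total += fp_cost  # acted when we should not have
        elif truth and not pred:
            total += fn_cost  # failed to act when we should have
    return total / len(y_true)


def expected_shortfall(losses, tail_fraction: float = 0.05) -> float:
    """Mean of the worst `tail_fraction` of observed losses (empirical estimate)."""
    if not losses:
        raise ValueError("losses must be non-empty")
    if not 0.0 < tail_fraction <= 1.0:
        raise ValueError("tail_fraction must be in (0, 1]")
    ordered = sorted(losses)
    # At least one observation is always included in the tail.
    k = max(1, round(len(ordered) * tail_fraction))
    tail = ordered[-k:]
    return sum(tail) / len(tail)


if __name__ == "__main__":
    # Hypothetical daily losses: mostly small, with two bad days.
    losses = [1.0] * 18 + [50.0, 150.0]
    print("mean loss:", sum(losses) / len(losses))
    print("expected shortfall (worst 10%):", expected_shortfall(losses, 0.10))
```

In the made-up loss series, the mean loss is about 10.9 while the average of the worst 10 percent of days is 100. A dashboard that shows only the first number is reporting the win rate, not the drawdown.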
Investors are underwriting AI productivity as if the downside is bounded. It is not. The more firms converge on similar models, vendors, and datasets, the more correlated their failure modes become. That is a market factor waiting to be priced. In a macro shock, a shared misclassification could translate into synchronized missteps across credit, hiring, pricing, and supply chains. That is hidden negative convexity. The equity market rarely discounts operational risk until it blows a hole in cash flows. The contrarian view is simple: AI delivers gains, but it also mints a new class of liabilities that do not sit neatly in current disclosures. If boards want the upside, they should budget for the downside, in cash and in time. Treat AI not as magic, but as leverage with a variable margin call.