The Gravity of Free
On What Enterprise AI Actually Costs, Who Set the Price, and Why the Bill Arrived All at Once
The engineers at Microsoft loved Claude Code — that much is documented, and worth sitting with for a moment, because it is the detail that makes everything else coherent. They loved it enough that when the company announced in May 2026 that their licenses would be terminated by June 30, the internal reaction was not indifference but something closer to grief: a tool that had genuinely changed how people worked, withdrawn not because it failed but because the invoice had arrived and no one in finance could account for it. Uber had burned through its entire 2026 AI budget by April. Walmart had quietly capped its in-house AI agent after unlimited access proved, in practice, unlimited in cost. Amazon, Meta, Cisco, and AT&T followed with their own restrictions before the end of June. What is visible in that list of names — some of the best-resourced technology organizations in the world — is not a collection of companies that failed to appreciate what they were buying. It is a collection of companies that bought exactly what was being sold, at the price that was advertised, and then discovered that the advertised price and the actual price were two entirely different numbers.
What the Headline Price Actually Was
Anthropic marketed Claude Code Max at $100 to $200 per month. That number was designed to be read as a software seat cost — comparable to a SaaS subscription, predictable, budgetable, benign. What it actually was, at the level of enterprise agentic adoption, was a loss-leader priced to create dependency before the real bill presented itself.
Here is the arithmetic that matters. One developer documented $41,952 in API-equivalent token consumption in March 2026 alone, at a subscription price of $200. At full agentic workflows, the kind Anthropic was actively selling enterprises on, per-engineer monthly costs landed between $500 and $2,000 in API terms for heavy users. That is not the $13-per-active-day figure Anthropic revised upward from its initial $6 estimate; that figure is a median, and medians hide the tail that destroys budgets. One unnamed organization, reportedly running an internal gamification leaderboard around AI usage, failed to install any usage controls and received a bill attributed to Claude consumption of approximately $500 million in a single month. Anthropic’s own commercial officer has stated that the government’s supply-chain-risk designation put “multiple billions of dollars” of 2026 revenue at risk, which implies the company’s total enterprise revenue expectation was large enough that multiple billions represented a meaningful fraction of it. At $200 per seat, that arithmetic doesn’t work. The enterprise pricing structure, a $20 base seat plus API consumption billed at full rates, is where the real numbers live.
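The gap between the sticker price and the metered price can be made concrete in a few lines. The sketch below uses only figures quoted above (the $200 subscription, the $41,952 in API-equivalent consumption, the $20 enterprise base seat, and the $500 to $2,000 heavy-user range). The function names are illustrative, and treating an enterprise seat as base price plus undiscounted metered usage is a simplifying assumption.

```python
def subsidy_multiple(api_equivalent_usd: float, subscription_usd: float) -> float:
    """How many times the subscription price the same usage would cost at API rates."""
    return api_equivalent_usd / subscription_usd


def enterprise_seat_cost(api_usage_usd: float, base_seat_usd: float = 20.0) -> float:
    """Monthly cost of one enterprise seat: flat base plus usage billed at full rates.

    Assumes no volume discount, which is a simplification.
    """
    return base_seat_usd + api_usage_usd


multiple = subsidy_multiple(41_952, 200)
print(f"Metered value was roughly {multiple:.0f} times the subscription price")

for usage in (500, 2_000):
    cost = enterprise_seat_cost(usage)
    print(f"Heavy user at ${usage:,}/month in API usage: ${cost:,.0f} per seat")
```

On these inputs the documented month of usage was worth more than two hundred times what the subscriber paid, which is the sense in which the subscription functions as a subsidy.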
The Wharton and industry narratives of AI as a productivity multiplier that pays for itself have been running ahead of this reality for roughly eighteen months. The ServiceNow CEO publicly suggested that AI agents are replacing entry-level work at a rate that could push graduate unemployment into the mid-30 percent range. Tech layoffs hit nearly 40,000 in a single month by mid-2026, with AI cited as the most frequent reason. And simultaneously, the tools that recently displaced engineers most need in order to remain competitive (Claude Code, Cursor, GitHub Copilot) were priced in a way that looked accessible at $100 to $200 per month but generated bills of $5,000 to $40,000 per month at serious agentic usage levels. One power user documented $840 per week. These are San Francisco engineer salaries, denominated in tokens rather than equity, flowing to Anthropic rather than to the person doing the work.
The Krugman Mechanism
Paul Krugman’s Nobel Prize in 2008 was awarded for formalizing something economists had long suspected but couldn’t fully model: that in industries with increasing returns to scale, the competitive equilibrium that classical economics predicts never actually materializes. Instead, whoever gets large first gets cheaper first, which attracts more volume, which makes them larger and cheaper still. The market converges not on the efficient outcome but on the incumbent, and the incumbent’s pricing power is structural rather than transient. Krugman’s home market effect further showed that the larger market attracts the industry’s production base — which then exports from that position of scale advantage rather than yielding it.
Apply this to AI inference directly. The correct question is not “which model benchmarks best this month” — it is “who accumulates enough inference volume to drive marginal cost toward zero first, and what do they do with that advantage once they have it.” Anthropic’s answer to that question has been to use venture capital — the company is burning approximately $10 billion per year more than it earns — to price below cost in the individual developer segment, with the explicit intention of converting those developers into institutional advocates who drag their employers onto the enterprise contracts where the subsidy disappears and the real pricing begins. The $200 Max plan is not a sustainable product. It is a funnel. And once the enterprise is dependent, the funnel closes.
This is the structural logic that Microsoft’s CTO apparently grasped when the May invoice arrived. The company was paying a competitor for the intelligence layer inside its own products, such as GitHub, which Microsoft owns, with Copilot, which Microsoft controls. That is a strategic incoherence that becomes visible the moment the bill is large enough to generate a line item in a quarterly review. The engineers are not wrong that Claude Code is a better tool (that appears to be the consensus), but “better tool, controlled by a competitor, priced at a rate we cannot govern” is a different calculation from “best AI coding tool.”
The Untoward Impact of Subscription Pricing Models
There is a dimension of this that the productivity narrative actively suppresses: a class of recently unemployed software engineers, disproportionately located in or displaced from San Francisco, finds that the AI tools they need to stay competitive are priced at $100 to $200 per month at the point of sale, and at $5,000 to $40,000 per month at the level of serious agentic use. The companies building those tools have raised billions of dollars on the argument that the tools are displacing the engineers’ jobs. The political infrastructure protecting those companies from pricing regulation was funded by the same investors who funded the tools. This is not a coincidence of timing. It is the structure of the situation.
Meanwhile, Meta suspended its Model Capability Initiative — an internal program launched in April 2026 that recorded employees’ keystrokes, mouse movements, and screen content for AI training data — after a security incident on June 22 left those databases exposed to every employee at the company. The program had previously been storing captured data in unencrypted format. This is Zuckerberg’s organization, which has made $200 billion in revenue and $60 billion in profit in a single year, being unable to manage the data governance of its own internal AI training pipeline. The idea that average enterprises — without Meta’s security budget, engineering depth, or legal infrastructure — can safely deploy frontier AI agents at scale without similar incidents is not supported by the available evidence.
The Fable Episode as Symptom, Not Cause
Claude Fable 5 generated significant commentary, but the governance risk it illustrates — that model access can be interrupted by regulatory action without notice or compensation — is a second-order problem relative to pricing. An enterprise that cannot afford to run Claude Code at agentic scale has already decided to leave before the government weighs in. The organizations watching the Fable shutdown are largely the same organizations that have already capped or terminated their Claude contracts due to cost.
Sakana AI, launched by former Google Brain researchers in Tokyo, announced its Fugu multi-agent orchestration system on June 22 — ten days after the shutdown. The timing is deliberate and the pitch is sovereignty: Fugu routes tasks across a swappable pool of models, so that no single government or vendor can terminate your workflow by withdrawing one model from the pool. The architectural claim is more durable than the performance claim: an orchestration layer that abstracts over models is structurally less exposed to single-vendor risk than a workflow built on one provider’s API.
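The architectural claim can be illustrated without reference to how Fugu itself is built, which is not described here. What follows is a minimal sketch of a provider-agnostic routing layer with failover. Every name in it (`Provider`, `ModelPool`, `ProviderUnavailable`) is hypothetical, and the providers are stand-in functions rather than real vendor clients.

```python
from dataclasses import dataclass, field
from typing import Callable, List


class ProviderUnavailable(Exception):
    """Raised by a provider that has been withdrawn or cannot be reached."""


@dataclass
class Provider:
    name: str
    complete: Callable[[str], str]  # stand-in for a real vendor client call


@dataclass
class ModelPool:
    providers: List[Provider] = field(default_factory=list)

    def run(self, prompt: str) -> str:
        """Try each provider in order and fall through to the next on failure."""
        failures = []
        for provider in self.providers:
            try:
                return provider.complete(prompt)
            except ProviderUnavailable as exc:
                failures.append(f"{provider.name}: {exc}")
        raise RuntimeError("all providers failed: " + "; ".join(failures))


def withdrawn(prompt: str) -> str:
    # Simulates a model pulled from service by a vendor or regulator.
    raise ProviderUnavailable("model withdrawn")


def local_model(prompt: str) -> str:
    # Simulates a locally hosted open-weight model.
    return f"[local] {prompt}"


pool = ModelPool([Provider("vendor-a", withdrawn), Provider("open-weight", local_model)])
print(pool.run("summarize the Q2 variance report"))
```

The workflow calls `pool.run` and never names a vendor, so withdrawing one model changes which provider answers but does not stop the work. Whether the substitute answers as well is the separate performance question.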
The IPO as the Final Act of the Pricing Theater
The valuation figures for the two companies preparing to go public in 2026 require a specific kind of reading. Anthropic filed confidentially for an IPO in June 2026 at an implied valuation approaching $965 billion, following a $65 billion Series H-1 round in May. OpenAI is targeting an $852 billion valuation at a listing currently projected for September 2026. Both companies are projecting significant losses through at least 2027: OpenAI at approximately $14 billion in 2026 with profitability not arriving until 2029 at the earliest; Anthropic projecting its first operating profit of $559 million in Q2 2026, a 5% margin on $10.9 billion in quarterly revenue.
A 5% operating margin on a $965 billion valuation implies a price-to-earnings multiple, at maturity, that requires sustained revenue growth at rates that would make the 2010s software boom look conservative. Anthropic’s revenue run rate crossed $47 billion annualized in May 2026 and is projected to reach $70 billion by 2028. That trajectory is real, and the revenue is primarily enterprise. The enterprise revenue was substantially generated by pricing a product below its cost of production.
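The multiple this paragraph gestures at can be computed directly from the figures already quoted. This is a back-of-envelope sketch resting on two assumptions the text does not make: that the Q2 operating profit can be annualized by multiplying by four, and that operating profit stands in for earnings, with taxes and interest ignored.

```python
def operating_margin(profit_usd: float, revenue_usd: float) -> float:
    """Operating profit as a fraction of revenue."""
    return profit_usd / revenue_usd


def implied_multiple(valuation_usd: float, quarterly_profit_usd: float) -> float:
    """Valuation over annualized operating profit (quarter times four, an assumption)."""
    return valuation_usd / (quarterly_profit_usd * 4)


valuation = 965e9          # implied IPO valuation
quarterly_profit = 559e6   # projected Q2 2026 operating profit
quarterly_revenue = 10.9e9 # Q2 2026 revenue

print(f"Operating margin: {operating_margin(quarterly_profit, quarterly_revenue):.1%}")
print(f"Implied multiple: {implied_multiple(valuation, quarterly_profit):.0f}x")
```

On these assumptions the multiple comes out above 400, which is the scale of growth the valuation has to justify.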
GLM-5.2 has existed for ten days. Nous Hermes 4 reached 40,000 GitHub stars in six weeks. The open-weight inference ecosystem that will compete for the revenue base supporting those IPO projections is not a future threat. It is a present one.
The public investor who buys Anthropic or OpenAI shares at near-trillion-dollar valuations is making a specific bet: that the moat created by training data, RLHF investment, and brand loyalty is durable enough to sustain pricing power as open-weight models close the capability gap and local inference becomes economically viable at enterprise scale. That bet may be correct.
But the evidence of June 2026 (Microsoft canceling licenses, Uber capping spend, enterprise after enterprise discovering that the subscription price and the actual cost were different numbers, and a 7-billion-parameter orchestration model matching frontier benchmarks without any new training) is precisely the evidence that a skeptic would identify as the early signal that the moat is shallower than the prospectus implies. Roughly 200,000 tech workers have lost their jobs in 2026 to date, with AI cited as the leading reason for layoffs.
Those workers were displaced, in part, by tools whose pricing was held artificially below the cost of production.
The tools that displaced them are now seeking public valuations that depend on the ability to raise that pricing to sustainable levels. That is a negative-sum transaction dressed as a growth story, and the only question is how many quarters elapse before the public market performs the arithmetic that the enterprise CFO class finished in April.
The Accounting Problem Nobody Is Solving
The Brookings Institution’s analysis — produced by Anton Korinek and Lee Lockwood — is more structurally serious than anything the AI vendor class has offered on the fiscal consequences of the technology it is selling. Their framework recommends shifting the tax base from labor toward consumption as AI displaces human labor, with specific attention to digital and AI service consumption as a logical tax base. The logic is straightforward: if the economic value created by AI accrues primarily to capital owners rather than to labor, and if the tax base has historically been weighted toward labor income, then AI-driven labor displacement creates a structural fiscal crisis independent of GDP growth. Taxing the AI services that generate the economic activity — rather than the workers who used to perform it — is the mechanism that keeps the fiscal system solvent. The Brookings model is the application of basic public finance logic to a new distribution of economic activity.
What Enterprise Finance Should Actually Do
There are three decisions that compound here, and none of them involve choosing a better model.
The first is governance before adoption, not after. Every company on the list above — Uber, Microsoft, Walmart, Amazon, Meta — installed usage caps after the bill arrived. That is governance as damage control. The organizations that survived the 2026 AI cost crisis in reasonable condition are those that required usage instrumentation — dashboards, per-engineer attribution, approval workflows for agentic tasks — as a precondition for deployment rather than a remediation of it. This is a finance operations decision, not a technology decision.
The second is accounting classification. Token consumption is not a software seat, not a cloud compute commitment, and not a professional services engagement. Treating it as any of these produces the wrong governance model. The correct classification for ongoing API and subscription inference spend is metered operational expense — analogous to electricity or bandwidth — with the same controls that any utility-like expense carries: budgeted limits, variance reporting, and escalation procedures when actuals exceed plan. Capital expenditure treatment is appropriate only for infrastructure that the organization owns: GPU clusters, fine-tuned models, training compute.
The mistake of classifying inference as a productivity tool (and therefore embedding it in an unmetered seat budget) is precisely how Uber burned a year’s budget in four months.
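The utility-style controls described above (per-engineer attribution, budgeted limits, variance reporting, escalation) are simple enough to sketch. The example below is illustrative only: the record type, the 80 percent warning threshold, and the $500 monthly budget are assumptions chosen for the demonstration, not figures from any company named here.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class UsageRecord:
    engineer: str
    cost_usd: float


def spend_by_engineer(records: List[UsageRecord]) -> Dict[str, float]:
    """Per-engineer attribution: total metered spend for each person."""
    totals: Dict[str, float] = {}
    for record in records:
        totals[record.engineer] = totals.get(record.engineer, 0.0) + record.cost_usd
    return totals


def variance_status(actual_usd: float, budget_usd: float, warn_at: float = 0.8) -> str:
    """Classify month-to-date spend against plan."""
    if actual_usd > budget_usd:
        return "escalate"  # actuals exceed plan: trigger the escalation procedure
    if actual_usd >= warn_at * budget_usd:
        return "warn"      # approaching the limit: flag in variance reporting
    return "ok"


records = [
    UsageRecord("alice", 120.0),
    UsageRecord("bob", 900.0),
    UsageRecord("alice", 310.0),
]
monthly_budget_per_engineer = 500.0  # illustrative figure

for engineer, total in spend_by_engineer(records).items():
    print(engineer, total, variance_status(total, monthly_budget_per_engineer))
```

The point of the design is that the check runs against metered actuals during the month, the way a utility bill is monitored, rather than against a seat count reviewed once a year.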
The third is the distinction between opex reduction and capability investment — and being honest about which one is actually happening. If AI tools reduce headcount, they reduce opex. If AI tools make existing headcount more productive without reducing it — which is what Uber, Microsoft, and the majority of enterprises have actually observed so far — then AI spend is additive to the existing cost base, and calling it opex reduction is a planning error that will produce a variance conversation with the CFO in Q2. Generative AI, at current adoption levels, is predominantly a capability investment: it expands what the engineering organization can do per unit time, without proportionally contracting what that organization costs. That is not necessarily a bad investment. It is a different kind of investment than the one being sold.
Alan Eyzaguirre is a Silicon Valley corporate strategist, with a keen sense of the numbers.






