Field Note

AI Economics Is Broken, Part I

The public price is not carrying the full cost.

Entity: InflectAI, Inc.
Published: May 30, 2026
Status: current

AI economics are broken. Until recently, the true cost of serving AI was hidden behind byzantine circular financing deals and opaque leasing arrangements. The $200/month Max plan for Claude Code. Even the API pricing for the Codex platform. These have all been subsidized by one of the biggest capital flows in history, and no one outside the industry could truly analyze the cost. Until recently.

The SpaceX/xAI S-1 has changed that. When SpaceX filed to go public on May 20, 2026, it put several pieces of the frontier AI machine in the same document: AI segment revenue, AI segment operating loss, AI segment capex, compute infrastructure, data-center capacity, and a major third-party compute contract. We can now compare what it may cost to produce a token with what xAI is charging for it.

SpaceX/xAI S-1, filed May 20, 2026

True Cost vs. The Rate Card

Anchored to Anthropic's disclosed Colossus 1 capacity of over 220,000 GPUs and more than 300 MW.

True cost: $11.12/M
Rate card: $1.69/M
Cost / price: 6.6x

Assumptions

Inference fleet: 1x = 220K GPU. How many Colossus-scale clusters are serving paid inference.
Utilization: 40%. Share of GPU-hours actually serving billable tokens.
Output tokens / GPU / sec: 45 tok/s. Effective sustained decode throughput per GPU under serving load.
Output share: 35%. Output tokens are more expensive than input tokens.
All-in capex / GPU: $35K. GPU, networking, data-center build, and supporting infrastructure.
Depreciation life: 3 yr. Shorter life increases annual cost.
Cost of capital: 12%. What the money costs when equity no longer behaves like free fuel.
Power price: $70/MWh. The physical layer underneath the interface.
Other opex: 1.6x power. Cooling, staffing, bandwidth, reliability, and serving infrastructure.

Annual cost stack

Depreciation: $2.57B
Capital charge: $924.00M
Power: $183.96M
Other opex: $294.34M
Total true cost / yr: $3.97B
Total tokens / yr: 356.8T
Revenue at card: $602.11M
Deployed capex: $7.70B
Disclosed 2025 AI operating loss: $6.36B
Anthropic capacity cross-check: $42.04/M tokens

At your throughput setting, the $1.25B/month Anthropic agreements imply this price for serious compute capacity.

Method caveat: this is a top-down stress test. Utilization and tokens per GPU-second multiply together, so errors in either compound and no single output should be treated as the final number. The model does not include training-run amortization or contingent IP liability.

Interactive Model

Move the sliders. The base case is ugly for xAI.

At the model's starting assumptions, fully loaded cost is about $11.12 per million tokens. The current Grok 4.3 public rate card, blended at 35% output tokens and 65% input tokens, is about $1.69 per million tokens. The gap is 6.6x.
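The base-case arithmetic can be reproduced in a few lines. The sketch below is a reconstruction from the stated assumptions, not the code behind the interactive model. The function and field names are illustrative, and it assumes that the 45 tok/s figure counts output tokens only, that total billable tokens equal output tokens divided by the output share, that depreciation is straight-line, and that power is billed on the full 300 MW around the clock. Under those assumptions it lands on the published figures.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600
HOURS_PER_YEAR = 365 * 24


def cost_stack(gpus=220_000, utilization=0.40, out_tok_per_gpu_sec=45.0,
               output_share=0.35, capex_per_gpu=35_000.0, life_years=3.0,
               cost_of_capital=0.12, power_mw=300.0, power_price_mwh=70.0,
               other_opex_multiple=1.6):
    """Rebuild the annual cost stack from the article's stated assumptions."""
    capex = gpus * capex_per_gpu
    depreciation = capex / life_years          # assumed straight-line
    capital_charge = capex * cost_of_capital   # assumed charged on gross capex
    power = power_mw * HOURS_PER_YEAR * power_price_mwh  # assumed full draw, all year
    other_opex = other_opex_multiple * power
    total_cost = depreciation + capital_charge + power + other_opex

    # Assumption: the throughput figure counts output tokens only, so total
    # billable tokens are output tokens divided by the output share.
    output_tokens = gpus * utilization * out_tok_per_gpu_sec * SECONDS_PER_YEAR
    total_tokens = output_tokens / output_share

    return {
        "capex": capex,
        "depreciation": depreciation,
        "capital_charge": capital_charge,
        "power": power,
        "other_opex": other_opex,
        "total_cost": total_cost,
        "total_tokens": total_tokens,
        "cost_per_m": total_cost / (total_tokens / 1e6),
    }


base = cost_stack()
blended_rate = 0.35 * 2.50 + 0.65 * 1.25   # Grok 4.3 list prices at a 35% output mix
print(f"total cost / yr : ${base['total_cost'] / 1e9:.2f}B")
print(f"tokens / yr     : {base['total_tokens'] / 1e12:.1f}T")
print(f"cost per M      : ${base['cost_per_m']:.2f}")
print(f"cost / price    : {base['cost_per_m'] / blended_rate:.1f}x")
```

Calling cost_stack with different keyword arguments plays the role of moving a slider, for example cost_stack(utilization=0.6).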

The model starts with disclosed anchors. Then it lets the reader move the assumptions that matter:

Inference fleet: how many Colossus-scale clusters are serving paid inference.
Utilization: how much of the fleet is actually producing billable tokens.
Output tokens per GPU per second: how much sustained decode throughput the hardware produces under real serving load.
Output share: how much of the billable token mix is expensive output rather than cheaper input.
Capex per GPU: hardware, networking, data-center build, and supporting infrastructure.
Depreciation life: how quickly frontier hardware loses economic value.
Cost of capital: what the money costs once equity stops behaving like free fuel.
Power and opex: the physical layer underneath the interface.

The model asks a plain question: what has to be true for the public rate card to carry the full cost stack?
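One way to make that question concrete is to hold the cost stack fixed and solve for the serving performance the rate card would need. The sketch below assumes the base-case annual cost of about $3.97B and the $1.6875 blended price, and it collapses utilization times output tokens per GPU-second into a single "effective throughput" figure. The function name and that framing are illustrative, not part of the published model.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def breakeven_effective_throughput(annual_cost, blended_price_per_m,
                                   gpus, output_share):
    """Output tokens per GPU-second, averaged over every hour of the year
    (utilization x throughput), at which rate-card revenue equals annual cost."""
    tokens_needed = annual_cost / blended_price_per_m * 1e6  # total billable tokens
    output_tokens_needed = tokens_needed * output_share
    return output_tokens_needed / (gpus * SECONDS_PER_YEAR)


needed = breakeven_effective_throughput(
    annual_cost=3.97e9,           # base-case total true cost per year
    blended_price_per_m=1.6875,   # 35% output at $2.50, 65% input at $1.25
    gpus=220_000,
    output_share=0.35,
)
base_effective = 0.40 * 45        # base case: 40% utilization x 45 tok/s

print(f"break-even effective throughput: {needed:.1f} tok/s per GPU")
print(f"base-case effective throughput : {base_effective:.1f} tok/s per GPU")
print(f"required improvement           : {needed / base_effective:.1f}x")
```

Because cost is held fixed here, the required improvement is the same 6.6x gap, restated as a throughput target rather than a price ratio.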

A proponent would say the base case is too harsh. Utilization will rise. Decode throughput will improve. Capex per GPU will fall. Hardware may last longer in inference than in training. Capital may remain cheap if investors keep believing in the category. Opex will improve as these facilities become more standardized.

Fair enough. Push enough assumptions in the right direction and the public rate card can be made to work.

But that defense asks for several things to go right at the same time: high utilization, high sustained throughput, lower capex, longer hardware life, cheap capital, and low opex. The S-1 does not show a machine already operating in that clean future. It shows $12.727 billion of AI capital expenditures in 2025, another $7.723 billion in the first quarter of 2026, and AI operating losses in the billions, including $6.36 billion for 2025.
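The "several things at once" point can be checked by moving one lever at a time and then all of them together. In the sketch below, the base values come from the model's starting assumptions, while every value in OPTIMISTIC is hypothetical and chosen only for illustration; none of them comes from the S-1. The cost function is a compact reconstruction under the same assumptions as the base case (straight-line depreciation, power on the full 300 MW all year).

```python
SECONDS_PER_YEAR = 365 * 24 * 3600
POWER_COST = 300 * 365 * 24 * 70.0       # 300 MW at $70/MWh, all year
RATE_CARD = 0.35 * 2.50 + 0.65 * 1.25    # blended Grok 4.3 price per M tokens

BASE = dict(utilization=0.40, tps=45.0, capex_per_gpu=35_000.0,
            life_years=3.0, cost_of_capital=0.12, opex_multiple=1.6)

# Hypothetical "things go right" values, chosen only for illustration.
OPTIMISTIC = dict(utilization=0.70, tps=90.0, capex_per_gpu=25_000.0,
                  life_years=5.0, cost_of_capital=0.08, opex_multiple=1.0)


def cost_per_m(utilization, tps, capex_per_gpu, life_years, cost_of_capital,
               opex_multiple, gpus=220_000, output_share=0.35):
    """Fully loaded cost per million tokens for one set of assumptions."""
    capex = gpus * capex_per_gpu
    annual_cost = (capex * (1 / life_years + cost_of_capital)
                   + POWER_COST * (1 + opex_multiple))
    tokens = gpus * utilization * tps * SECONDS_PER_YEAR / output_share
    return annual_cost / (tokens / 1e6)


# Improve one assumption at a time, leaving the rest at the base case.
single = {name: cost_per_m(**{**BASE, name: value})
          for name, value in OPTIMISTIC.items()}
combined = cost_per_m(**OPTIMISTIC)

for name, cost in single.items():
    print(f"{name:16s} alone: ${cost:.2f}/M")
print(f"all six together: ${combined:.2f}/M vs rate card ${RATE_CARD:.2f}/M")
```

With these particular illustrative values, no single lever brings cost below the blended rate card, while all six together land at roughly $1.53 per million tokens. Different optimistic values would move that result, which is the reason to test them jointly.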

The Rate Card Is The Developer Price

The xAI pricing page currently lists Grok 4.3 at $1.25 per million input tokens and $2.50 per million output tokens. Those are the numbers a developer sees when deciding whether to build against the API.

The model treats that number as the rate card. Then it rebuilds the cost stack below it.
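The single rate-card number is a weighted average of the two list prices. A minimal sketch follows, assuming the blend is weighted by output share as in the model's 35% setting; the function name and the 20% and 50% comparison points are illustrative.

```python
def blended_rate(input_price_per_m, output_price_per_m, output_share):
    """Weighted average price per million tokens for a given output share."""
    return (output_share * output_price_per_m
            + (1 - output_share) * input_price_per_m)


# Grok 4.3 list prices from the pricing page: $1.25 input, $2.50 output.
for share in (0.20, 0.35, 0.50):
    rate = blended_rate(1.25, 2.50, share)
    print(f"output share {share:.0%}: ${rate:.4f} per M tokens")
```

At the 35% output share this gives $1.6875, which the model rounds to $1.69. The blend can only range between $1.25 and $2.50, so the token mix alone cannot close a multiple-fold gap.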

Start with the physical pieces. GPUs have to be bought. Data centers have to be built. Power has to be secured. Cooling, networking, staff, reliability, and serving infrastructure have to operate. The hardware then has to be depreciated over a useful economic life that is probably shorter than anyone wants to admit when the next chip cycle arrives.

Then add the cost of capital.

The cost of capital is where the bubble enters the spreadsheet. During the funding phase, capital behaves as if it can be carried indefinitely. Equity arrives. Debt arrives. Strategic commitments arrive. Everyone focuses on growth, capacity, and market share.

Capital still has a cost. If revenue does not recover that cost, the deficit is being financed. It may be financed willingly. It may be financed for strategic reasons. It may be financed because everyone believes the next round, next model, next chip, or next enterprise wave will make the old math look small.
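In the base case, the size of that financed deficit falls out of two model outputs. The sketch below uses the model's rounded totals ($3.97B annual cost, $602.11M revenue at the rate card, $924M capital charge); the function name is illustrative, and the figures are model outputs under the starting assumptions, not disclosed results.

```python
def financed_gap(annual_cost, revenue):
    """Return the annual shortfall and the share of cost that revenue covers."""
    gap = annual_cost - revenue
    coverage = revenue / annual_cost
    return gap, coverage


annual_cost = 3.97e9        # total true cost per year, base case
revenue_at_card = 602.11e6  # tokens per year times blended rate card
capital_charge = 924.00e6   # 12% on $7.70B of deployed capex

gap, coverage = financed_gap(annual_cost, revenue_at_card)
print(f"financed gap        : ${gap / 1e9:.2f}B per year")
print(f"cost covered by card: {coverage:.1%}")
print(f"capital charge alone: {capital_charge / revenue_at_card:.2f}x revenue")
```

On these numbers, rate-card revenue does not cover even the capital charge, before depreciation, power, or other opex are counted.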

The math is still there.

The Anthropic Cross-Check

The model has a second test: the Anthropic contract.

Anthropic announced on May 6, 2026 that it had signed an agreement with SpaceX to use all compute capacity at Colossus 1: more than 300 megawatts of new capacity and over 220,000 NVIDIA GPUs. The later SpaceX S-1 disclosed the financial terms of the broader Cloud Services Agreements with Anthropic: $1.25 billion per month through May 2029, with a reduced ramp fee in May and June 2026, and a 90-day termination right.

The contract is not a retail API price. It is a capacity price.

Take the same throughput assumptions from the model. Convert $1.25 billion per month into an implied price per million tokens. Now compare that implied capacity price against the public Grok rate card.

At the base case, that capacity price is around $42 per million tokens. Against a blended Grok 4.3 public rate card around $1.69 per million tokens, the distance is enormous.
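The conversion can be sketched as follows. It assumes the full $1.25B monthly fee maps onto a single Colossus 1-scale fleet of 220,000 GPUs running at the model's base-case utilization, throughput, and output share, and it ignores the reduced ramp fee in May and June 2026. The function name is illustrative.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def implied_capacity_price(monthly_fee, gpus, utilization,
                           out_tok_per_gpu_sec, output_share):
    """Annualized contract fee divided by the tokens the fleet would serve,
    expressed in dollars per million tokens."""
    output_tokens = gpus * utilization * out_tok_per_gpu_sec * SECONDS_PER_YEAR
    total_tokens = output_tokens / output_share
    return monthly_fee * 12 / (total_tokens / 1e6)


capacity_price = implied_capacity_price(
    monthly_fee=1.25e9, gpus=220_000, utilization=0.40,
    out_tok_per_gpu_sec=45.0, output_share=0.35,
)
rate_card = 0.35 * 2.50 + 0.65 * 1.25

print(f"implied capacity price: ${capacity_price:.2f} per M tokens")
print(f"multiple of rate card : {capacity_price / rate_card:.1f}x")
```

If the agreements cover more capacity than Colossus 1 alone, the implied price per token would be lower than this sketch suggests, which is one more assumption worth moving.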

A proponent would object again: Anthropic is buying strategic capacity, probably with service guarantees, priority access, and option value during a compute shortage. True. The contract should not be treated as the same product as a developer API call.

But it is still a market price for serious frontier compute capacity. If the public rate card can only be reconciled with that capacity price by assuming much higher throughput and much better utilization, the rate card is doing a different job. It is creating adoption while the capacity market shows the harder number underneath.

The public rate card tells developers what they pay at the surface. The Anthropic contract tells us what serious capacity can cost when another frontier lab needs the machine underneath the surface.

The two numbers do not reconcile cleanly unless the model is pushed toward optimistic operating assumptions.

Why SpaceX Lets Us See It

OpenAI and Anthropic can look cleaner because their economics are distributed.

A lab signs a large compute commitment. A hyperscaler builds capex against the commitment. The hyperscaler may also invest in the lab. The lab spends money back on the hyperscaler's cloud. Cloud revenue rises. The lab shows growth. The hyperscaler shows demand for AI infrastructure. The equity stake carries strategic value. The losses, risks, and depreciation schedules are spread across institutions large enough to absorb them.

The token looks cheaper because the financing stack is doing work outside the rate card.

SpaceX/xAI puts more of that stack in one filing. The AI segment loss is visible. The AI capex line is visible. The compute infrastructure is visible. The Anthropic capacity contract is visible. The orbital AI compute plan is visible too. A normal filing does not need to tell the SEC it expects to begin deploying orbital AI compute satellites as early as 2028. Pressure is showing up in the architecture.

SpaceX/xAI may simply be the structure where the economics are harder to hide.

What The Model Does Not Include

The model leaves out the full cost of training runs, contingent IP liability, and the legal, safety, governance, security, and research costs sitting around frontier model development.

It focuses on inference economics because that is the question the public rate card pretends to answer.

Even there, the model is a stress test rather than a precision instrument. Utilization and tokens per GPU-second multiply together, so small changes in both can move the result materially. One number is not the point.
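The compounding is easiest to see on a grid. The sketch below holds the base-case annual cost fixed at about $3.97B (reasonable in this model because capex and full-draw power do not vary with utilization) and recomputes cost per million tokens across a few utilization and throughput settings. The grid points are illustrative, not disclosed figures.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600
ANNUAL_COST = 3.97e9   # base-case total from the cost stack, held fixed


def cost_per_m(utilization, tps, gpus=220_000, output_share=0.35):
    """Cost per million tokens when only serving performance changes."""
    tokens = gpus * utilization * tps * SECONDS_PER_YEAR / output_share
    return ANNUAL_COST / (tokens / 1e6)


utilizations = (0.40, 0.60, 0.80)
throughputs = (45, 90, 135)   # illustrative grid points

# Rows are utilization, columns are output tokens per GPU-second.
print("util \\ tok/s" + "".join(f"{t:>9d}" for t in throughputs))
for u in utilizations:
    print(f"{u:>12.0%}" + "".join(f"{cost_per_m(u, t):>9.2f}" for t in throughputs))
```

Doubling both inputs cuts the cost to a quarter, so a modest error in each assumption produces a large error in the headline number.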

The public rate card is an adoption price. It is not evidence of cost recovery.

A proponent can still defend the price. The defense is not stupid. It says the system is young, utilization is low, hardware costs will fall, software will improve, and the economics will look better at scale.

Maybe. But that is a future-state argument. The present-state filing shows large AI operating losses, enormous capex, and a capacity contract that prices serious compute far above the public rate card under ordinary throughput assumptions. The exact gap moves with assumptions. The burden placed on those assumptions is the thing to test.

What This Means

AI economics are broken because the public price of frontier tokens has been engineered for adoption while the full cost is being carried elsewhere.

The gap cannot remain invisible forever.

Part II will look at the next layer: developer tools and platform multipliers. VS Code, Copilot, Replit, Lovable, Cursor, and similar products wrap frontier-token economics in premium requests, multipliers, bundles, limits, throttles, and plan design. The API rate card is already a fiction. The application layer adds another one.

Part III will look at what happens when the subsidies stop.

The unwind will not move every price in the same direction. Older commodity tokens can get cheaper as sunk hardware gets liquidated or pushed into lower-margin use. Frontier tokens have a different problem. They need the newest chips, new data centers, fresh power, new training runs, and a capital structure willing to carry the gap.

Architectures built on casual frontier calls inherit the subsidy.

Architectures that reserve frontier calls for judgment, exception-handling, and high-value reasoning have a better chance of surviving the repricing.

For now, start with the model. Move the assumptions. Make the rate card carry the cost stack. If it only works after several assumptions become heroic at the same time, that tells us something.

The pricing page is where the developer starts. It is also where the subsidy is hiding.

Sources

SpaceX S-1 filing index, filed May 20, 2026: SEC filing index.
SpaceX S-1 filing text: Space Exploration Technologies S-1.
Anthropic announcement on SpaceX compute capacity: Higher limits through SpaceX.
xAI pricing page: xAI developer pricing.