Let’s be honest. We’ve all been in that meeting. Someone puts up a slide that says “AI will cut costs by 40%” and everyone nods like they just heard a prophecy from the burning bush. The real challenge, though, is understanding what true transformation looks like beyond the buzzwords.
Two years later, the invoices are in. And surprise — AI is not cheap.
The Bill Nobody Wanted to Open
In 2024 and 2025, companies worldwide made a bet: replace headcount with AI. Cut people. Add tokens. Profit.
By 2026, the math stopped working.
The companies that moved fastest — Uber, Microsoft, Starbucks, even Nvidia — are now saying the same quiet part out loud: AI can cost more than the humans it replaced.
This isn’t a doom narrative. It’s a correction. And honestly? It was overdue.
Jensen Huang Said the Quiet Part Loud
The whole “tokens as a performance metric” thing started with one podcast.
Nvidia CEO Jensen Huang floated the idea that an engineer earning $500K/year who isn’t spending $250K/year on tokens is basically using a pencil when there’s a computer right there. He even suggested giving employees a token budget equal to half their salary.
Noble idea. Terrible unintended consequence.
Within weeks, companies were building internal token usage leaderboards. Meta had one. Uber had one. Suddenly “how many tokens did you burn?” became a performance review question.
Here’s the problem: tokens are an input, not an outcome.
Salesforce — to their credit — called this out immediately, labeling token counts a “vanity metric.” Spending more doesn’t mean doing better. And now the invoices prove it.
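To make the input-versus-outcome point concrete, here is a minimal sketch that ranks two teams both ways. Everything in it is an assumption for illustration: the `TeamUsage` fields, the choice of "tickets resolved" as the outcome, and the numbers themselves are made up, not taken from any company mentioned here.

```python
from dataclasses import dataclass


@dataclass
class TeamUsage:
    name: str
    tokens_used: int        # input: what a usage leaderboard ranks on
    token_cost_usd: float   # what the invoice says
    tickets_resolved: int   # hypothetical outcome metric


def cost_per_outcome(team: TeamUsage) -> float:
    """Dollars of token spend per resolved ticket (infinite if nothing was resolved)."""
    if team.tickets_resolved == 0:
        return float("inf")
    return team.token_cost_usd / team.tickets_resolved


# Made-up numbers, for illustration only.
teams = [
    TeamUsage("alpha", tokens_used=900_000_000, token_cost_usd=9_000.0, tickets_resolved=150),
    TeamUsage("beta", tokens_used=200_000_000, token_cost_usd=2_000.0, tickets_resolved=100),
]

by_usage = sorted(teams, key=lambda t: t.tokens_used, reverse=True)
by_value = sorted(teams, key=cost_per_outcome)

print("Leaderboard by tokens:", [t.name for t in by_usage])
print("Ranking by cost per outcome:", [t.name for t in by_value])
```

In this toy data the team that tops the token leaderboard pays three times as much per resolved ticket, which is exactly the gap a usage metric hides.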
Tokenmaxxing: The New Corporate Theater
Meet Tokenmaxxing — the art of generating impressive AI usage numbers without generating impressive results.
It’s the AI equivalent of leaving your car engine running all night to hit your mileage targets. The number goes up. Nothing useful happens. You just paid for gas.
When usage becomes the goal, cost becomes decoupled from value. And in a pay-per-token world, that’s genuinely dangerous.
Why AI Got So Expensive So Fast
Two things happened simultaneously, and they compounded each other.
First: usage-based pricing. Unlike your old SaaS tools (flat monthly fee, sleep easy), AI bills look like your electricity bill after you installed seventeen smart home devices. The more you run, the more you pay. Fintech company Ramp’s payment data shows their customers’ monthly AI token costs grew 13x in just over a year since early 2025. The heaviest users saw costs spike 50%+ in one out of every four months.
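It helps to translate "13x in just over a year" into a monthly rate. The sketch below assumes the growth was smooth and treats the period as exactly 12 months, which is a simplification of the figure above; the $10,000 starting bill is hypothetical.

```python
GROWTH_MULTIPLE = 13.0   # the 13x figure quoted above
ASSUMED_MONTHS = 12      # assumption: "just over a year" rounded to 12 months


def implied_monthly_growth(multiple: float, months: int) -> float:
    """Constant month-over-month factor that compounds to `multiple`."""
    return multiple ** (1.0 / months)


def project_bill(start_usd: float, monthly_factor: float, months: int) -> float:
    """Monthly bill after `months` of compounding at `monthly_factor`."""
    return start_usd * monthly_factor ** months


monthly_factor = implied_monthly_growth(GROWTH_MULTIPLE, ASSUMED_MONTHS)
print(f"Implied growth: {(monthly_factor - 1) * 100:.1f}% per month")

# Hypothetical starting bill of $10,000 per month.
for month in (0, 3, 6, 12):
    print(f"month {month:>2}: ${project_bill(10_000, monthly_factor, month):,.0f}")
```

Under those assumptions the bill compounds at roughly 24% a month. That is the kind of curve a flat-fee SaaS budget never had to plan for.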
Second: the agentic AI era. AI isn’t just answering questions anymore. It’s reading files, writing code, running tests, catching errors, fixing them, running tests again… Stanford research puts the token consumption of agentic tasks at up to 1,000x more than simple chat interactions.
One wrong prompt template? Your costs triple overnight. One junior engineer experimenting on a Friday afternoon? Quarterly budget gone by Monday. One looping agent? That’ll be $50,000, please.
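A looping agent is the easiest of these to guard against. Here is a minimal sketch of a hard spending cap wrapped around an agent loop. The names `TokenBudget`, `BudgetExceeded`, and `run_agent` are hypothetical, the price per 1,000 tokens is an assumption, and a real agent would report actual token usage from its provider rather than a fixed per-step estimate.

```python
from typing import Callable


class BudgetExceeded(RuntimeError):
    """Raised when the next step would push spend past the cap."""


class TokenBudget:
    def __init__(self, max_usd: float, usd_per_1k_tokens: float) -> None:
        self.max_usd = max_usd
        self.usd_per_1k_tokens = usd_per_1k_tokens  # assumed blended price
        self.spent_usd = 0.0
        self.steps_charged = 0

    def charge(self, tokens: int) -> None:
        """Reserve the cost of one step, or refuse if it would break the cap."""
        cost = tokens / 1000 * self.usd_per_1k_tokens
        if self.spent_usd + cost > self.max_usd:
            raise BudgetExceeded(
                f"cap ${self.max_usd:.2f} reached after {self.steps_charged} steps"
            )
        self.spent_usd += cost
        self.steps_charged += 1


def run_agent(step: Callable[[int], bool], budget: TokenBudget,
              tokens_per_step: int, max_steps: int) -> int:
    """Run `step` until it reports done; charge the budget before every step."""
    for i in range(max_steps):
        budget.charge(tokens_per_step)
        if step(i):
            return i + 1
    raise RuntimeError("step limit reached without finishing")


def looping_step(i: int) -> bool:
    """Simulates an agent that never decides it is finished."""
    return False


budget = TokenBudget(max_usd=5.0, usd_per_1k_tokens=0.01)
try:
    run_agent(looping_step, budget, tokens_per_step=75_000, max_steps=1_000)
    outcome = "finished"
except BudgetExceeded as exc:
    outcome = f"stopped: {exc}"

print(outcome)
print(f"spent ${budget.spent_usd:.2f}")
```

The design choice worth noting: the budget is charged before each step runs, so the loop stops under the cap instead of discovering the overrun on the invoice.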
The Emerging Market Reality Check
Here’s what makes this even more interesting — and urgent — for companies scaling AI across emerging markets.
In India, where IT services firms are aggressively deploying AI to remain globally competitive, the token economics hit differently. Labor arbitrage — once the foundation of the industry’s value proposition — is being squeezed from both sides. AI is competing with human workers on cost and speed, but only for about 11.7% of tasks (more on that number shortly). For the rest? Humans in Bengaluru and Hyderabad are still cheaper.
In China, enterprise AI deployments, particularly in manufacturing and logistics, are facing a similar reckoning. The push toward domestic AI models (Baidu’s ERNIE, DeepSeek, Alibaba’s Qwen) creates cost advantages, but the agentic compute problem remains universal. More tasks delegated to AI = far more tokens = surprise bills.
Across Southeast Asia — Vietnam, Indonesia, Thailand — startups and enterprises riding the digital economy wave are treating AI as a growth lever. But many are adopting US enterprise AI pricing models without US-scale IT budgets. The result: the token paradox hits harder when your cloud budget is measured in thousands, not millions.
The lesson is the same whether you’re in Mumbai, Shenzhen, or Jakarta: AI ROI math doesn’t care about your timezone.
Four Companies That Learned This the Hard Way
Uber: Burned the Annual Budget by April
Uber gave AI coding tools to ~5,000 engineers. Built leaderboards. Cheered adoption. By mid-2026, 95% of engineers were monthly active users and ~70% of committed code was AI-written.
They also burned their entire annual AI coding budget in four months.
Worse? When asked what they got for it, Uber’s President and COO Andrew Macdonald couldn’t give a clean answer. That’s the real headline. Not “company spends a lot on AI.” But “company spends a lot on AI and doesn’t know if it worked.”
Microsoft: Pulled the Plug Because It Worked Too Well
This one is genuinely funny. Microsoft revoked Claude Code licenses for thousands of internal developers — not because it underperformed, but because devs used it so enthusiastically that costs became unmanageable.
Their solution? Switch to GitHub Copilot’s flat-rate plan ($39/user/month, no overages) instead of Claude Code’s usage-based pricing.
They didn’t pick the better tool. They picked the more predictable bill. Welcome to enterprise procurement in 2026.
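The flat-versus-usage decision comes down to a break-even point. The sketch below uses the $39 per user per month figure from above; the usage price of $10 per million tokens is a hypothetical blended rate chosen only to make the arithmetic visible, not a published price for any tool.

```python
FLAT_FEE_USD = 39.0                     # per user per month, the figure quoted above
ASSUMED_USD_PER_MILLION_TOKENS = 10.0   # hypothetical blended usage price


def usage_cost(tokens_per_month: float) -> float:
    """Monthly bill for one user under usage-based pricing."""
    return tokens_per_month / 1_000_000 * ASSUMED_USD_PER_MILLION_TOKENS


def break_even_tokens() -> float:
    """Monthly tokens at which usage-based pricing costs the same as the flat fee."""
    return FLAT_FEE_USD / ASSUMED_USD_PER_MILLION_TOKENS * 1_000_000


def cheaper_plan(tokens_per_month: float) -> str:
    """Which plan costs the buyer less at this volume."""
    return "flat" if usage_cost(tokens_per_month) > FLAT_FEE_USD else "usage"


print(f"Break-even: {break_even_tokens():,.0f} tokens per user per month")
for tokens in (1_000_000, 50_000_000):
    print(f"{tokens:>12,} tokens -> ${usage_cost(tokens):,.2f} on usage, "
          f"cheaper plan: {cheaper_plan(tokens)}")
```

Light users come out ahead on usage pricing. Enthusiastic users blow past the break-even quickly, which is the situation described above.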
(Insight Bridge AI notes: no judgment here — predicting costs matters. A lot.)
Starbucks: $0 ROI After 9 Months
Starbucks deployed an AI inventory system (LiDAR + tablet cameras, made by startup NomadGo) across 11,000 North American stores. The pitch: 8x faster than humans, 99% accurate.
The reality: it couldn’t distinguish between similar milk types. It missed products sitting right in front of it. In Starbucks’ own promotional video, the system failed to recognize a peppermint syrup bottle.
Staff ended up recounting everything the AI counted. That’s not automation. That’s creating extra work and calling it innovation.
Nine months later, it was gone.
Nvidia: The Supplier Admits the Price
The most credible testimony came from the company profiting most from AI spending.
Nvidia VP of Applied Deep Learning Bryan Catanzaro confirmed that on his team, compute costs now exceed labor costs. The company that sells AI infrastructure admitted AI infrastructure is expensive.
When your dealer tells you the habit is getting pricey, maybe listen.
The Jevons Paradox: Cheaper Tokens Won’t Save You
“But tokens will get cheaper!” Yes. They will. Gartner projects inference costs to drop 10x by 2030.
Here’s the trap: cheaper access drives higher consumption.
This isn’t a new problem. In 1865, economist William Stanley Jevons noticed that more efficient coal engines didn’t reduce coal consumption — they increased it. Because efficiency lowers cost, which expands use, which increases total spend.
AI tokens are following the same curve. Price per token falls. Capability per token rises. Use cases multiply. Total spend climbs.
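The arithmetic behind that curve is just price times volume. In this sketch the 10x price drop is the projection cited above, while the 30x usage growth and the baseline figures are hypothetical numbers picked to show the effect.

```python
def total_spend(usd_per_million_tokens: float, million_tokens: float) -> float:
    """Total spend is simply unit price times volume."""
    return usd_per_million_tokens * million_tokens


def spend_multiplier(price_drop: float, usage_growth: float) -> float:
    """How total spend changes when price falls by `price_drop`x
    and usage grows by `usage_growth`x."""
    return usage_growth / price_drop


# Hypothetical baseline: $10 per million tokens, 1,000 million tokens a month.
baseline = total_spend(10.0, 1_000)

PRICE_DROP = 10.0            # the projected 10x drop cited above
ASSUMED_USAGE_GROWTH = 30.0  # hypothetical

future = total_spend(10.0 / PRICE_DROP, 1_000 * ASSUMED_USAGE_GROWTH)

print(f"baseline spend: ${baseline:,.0f}")
print(f"future spend:   ${future:,.0f}")
print(f"multiplier:     {spend_multiplier(PRICE_DROP, ASSUMED_USAGE_GROWTH):.1f}x")
```

The rule of thumb falls out directly: total spend only drops if usage grows more slowly than price falls. Any usage growth above 10x wipes out a 10x price cut.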
The companies that get comfortable because “AI is getting cheaper” are the ones who’ll be most surprised by next year’s invoice.
The 11.7% Number You Need to Know
MIT’s “Iceberg Index” study (published November 2025) asked a sharper question than “what can AI do?”
It asked: what can AI do cheaper than a human, right now?
The answer: 11.7% of U.S. labor market tasks.
That’s ~$1.2 trillion in wage-equivalent work where AI has a cost advantage today. The other 88.3%? Humans are still cheaper.
For visual inspection tasks specifically (quality control, shelf audits, etc.), AI automation is cost-effective in only about 23% of cases. Which explains Starbucks. And a lot of other failed deployments nobody is talking about publicly.
“AI can do this” and “AI should do this” are very different sentences.
5 Ways to Actually Do AX Right
AX (AI Transformation) isn’t dead. It just needs to grow up. Here’s how to stop performative AI and start productive AI:
1. Measure outcomes, not usage. Tokens consumed = input. Revenue protected, time saved, errors reduced = output. Stop reporting one and calling it the other.
2. Tie costs to specific workflows. “Our AI spend went up” tells you nothing. “Our customer support AI costs $X/month and reduced response time by Y days” tells you everything. If you can’t make that sentence, you don’t know if you’re investing or wasting.
3. Know the 11.7% boundary. Not every task should be automated. Start with what AI can do demonstrably cheaper than your team. Expand from there, with data.
4. Manage usage-based billing like an adult. Set department-level caps. Build cost alerts. Seriously consider flat-rate plans for high-volume users (Microsoft did). Treating AI billing like SaaS will ruin your quarter.
5. Define your kill switch. Set success criteria before deployment. If the AI doesn’t hit them in a defined window, shut it down. Starbucks took nine months. Faster is better.
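Points 4 and 5 are the easiest to turn into something mechanical. Below is a minimal sketch of a department-level cap with an alert threshold, plus a kill-switch check against criteria set before launch. The class names, the 80% alert threshold, and all the numbers are assumptions for illustration, not a description of how any company above actually does it.

```python
from dataclasses import dataclass


@dataclass
class DepartmentBudget:
    name: str
    monthly_cap_usd: float
    alert_fraction: float = 0.8  # assumed: warn at 80% of the cap


def budget_status(budget: DepartmentBudget, spent_usd: float) -> str:
    """Classify month-to-date spend as 'ok', 'alert', or 'over_cap'."""
    if spent_usd >= budget.monthly_cap_usd:
        return "over_cap"
    if spent_usd >= budget.alert_fraction * budget.monthly_cap_usd:
        return "alert"
    return "ok"


@dataclass
class KillSwitch:
    metric_name: str        # the outcome agreed on before deployment
    target: float           # minimum acceptable value of that outcome
    review_after_days: int  # the defined window

    def decide(self, days_live: int, observed: float) -> str:
        """Keep running inside the window; afterwards, keep only if the target is met."""
        if days_live < self.review_after_days:
            return "keep_running"
        return "keep_running" if observed >= self.target else "shut_down"


# Hypothetical department and deployment.
support = DepartmentBudget(name="customer_support", monthly_cap_usd=10_000)
print(budget_status(support, 4_000))
print(budget_status(support, 8_500))
print(budget_status(support, 10_200))

switch = KillSwitch(metric_name="count_accuracy", target=0.95, review_after_days=60)
print(switch.decide(days_live=30, observed=0.80))
print(switch.decide(days_live=60, observed=0.80))
```

The important part is not the code but the ordering: the target and the review window are written down before deployment, so the shutdown decision is a lookup rather than a debate.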
The Real Competitive Advantage in 2026
The first wave of AI competition rewarded speed of adoption. Move fast, deploy everything, figure it out later.
That phase is over.
The next wave rewards something less glamorous: knowing whether your AI spend is working.
Uber spent the budget and couldn’t explain the return. Microsoft calculated the cost and made a rational switch. Starbucks pulled the plug. Nvidia admitted the limits.
The companies winning aren’t necessarily using the most AI. They’re the ones who can answer: “We spent X on AI this quarter, and here’s exactly what we got for it.”
Most companies can’t answer that question today.
The ones who learn to answer it first? That’s your next moat.