
Is AI Even Worth It?
Is it delivering tangible value, or is this just another case of the shovel-maker claiming there's a gold rush? It's a valid question, but we've never argued that AI is here to replace all humans or drive labor costs to zero. Quite the opposite: our belief has always been that AI kills mediocrity, elevates expertise, and accelerates time-to-production.
So if AI isn’t about replacing people or chasing hype, how do we measure its real impact? The answer is the same metric every business ultimately cares about: return on investment (ROI).
And the numbers behind GenAI adoption are starting to speak for themselves. Google reports that 74% of organizations are already seeing measurable ROI from their AI investments. Snowflake found that 92% of early adopters said GenAI paid for itself, delivering an average 41% return. An IDC report sponsored by Microsoft estimates that GenAI produces 3.7x returns per dollar spent, while BCG research shows productivity gains of 15–30% — and in some cases up to 80%.
These results are translating directly into growth and efficiency. LinkedIn data shows that over half of companies integrating AI are already seeing a 10% bump in revenue. Gartner reports early adopters achieving 15.8% revenue increases, 15.2% cost savings, and 22.6% productivity improvements on average. Morgan Stanley projects even greater potential at scale, estimating that widespread AI adoption could save S&P 500 firms $920 billion annually. Realistic expectations have been met for revenue, cost savings, and productivity. It's those with unrealistic expectations (and investments) who seem to be panicking.
These numbers look reasonably encouraging even while people are understandably antsy about the possibility of a bubble. Both can be true at the same time:
- AI projects relying on closed-source providers are facing an impending bubble.
- AI projects utilizing open-source providers are insulated from this bubble collapse.
Inference Economics & ROI
When builders weigh AI stack choices, the efficacy of the end product rarely hinges on which model is slightly better. The definitive business question is: what does each token cost to serve reliably at scale? Closed‑source APIs like OpenAI’s GPT or Anthropic’s Claude may deliver marginally higher benchmark scores, but the business economics collapse when you consider the cost differences.
Cost per million tokens (inference, approximate):
- GPT‑5: ~$10.00
- Claude Sonnet‑4: ~$9.00
- Gemini: ~$5.60
- DeepSeek V3.1 (via GMI Cloud): ~$0.90
Even at conservative volumes, the difference is stark: a mid‑sized AI builder consuming 500M tokens/month would spend about $5,000/month at GPT‑5 rates, $4,500/month on Claude Sonnet‑4, $2,800/month for Gemini, but only $450/month on DeepSeek V3.1 through GMI Cloud. To put it in perspective:
- $60k/year for GPT
- $54k/year for Claude
- $34k/year for Gemini
- $5.4k/year for DeepSeek V3.1
That's an 84–91% price difference (and for Claude and Gemini, we chose the cheapest realistic configurations to contain costs) with only a minor difference in intelligence. Scale that up another 10x and the gap is large enough to fund a small engineering team.
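The arithmetic above is easy to sanity-check in a few lines. The per-million-token rates below are back-derived from the monthly figures in this article (500M tokens/month), not official price sheets, so treat them as illustrative assumptions:

```python
# Back-of-the-envelope inference cost comparison.
# Rates ($ per 1M tokens) are derived from this article's monthly figures
# (500M tokens/month), not from official price sheets -- illustrative only.
RATES_PER_M_TOKENS = {
    "GPT-5": 10.00,
    "Claude Sonnet-4": 9.00,
    "Gemini": 5.60,
    "DeepSeek V3.1 (GMI Cloud)": 0.90,
}

def annual_cost(tokens_per_month: float, rate_per_m: float) -> float:
    """Annual spend for a given monthly token volume and $/1M-token rate."""
    return (tokens_per_month / 1_000_000) * rate_per_m * 12

if __name__ == "__main__":
    volume = 500_000_000  # 500M tokens/month
    baseline = annual_cost(volume, RATES_PER_M_TOKENS["GPT-5"])
    for model, rate in RATES_PER_M_TOKENS.items():
        cost = annual_cost(volume, rate)
        savings = 1 - cost / baseline
        print(f"{model:28s} ${cost:>9,.0f}/yr  ({savings:.0%} cheaper than GPT-5)")
```

Swap in your own volume and negotiated rates; the linear scaling is the whole point — the gap only widens as usage grows.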
It's entirely possible these prices make sense to AI builders who absolutely require a specific closed-source model for their unique use case. But we're forward-looking: models will converge and consolidate features over time, making it easier to swap one model for another if your tech stack allows it. Closed-source providers must eventually raise prices, become far more efficient, or, as Sam Altman keeps suggesting, raise another trillion dollars.
And if I’m being honest, does anyone really think giving OpenAI another trillion dollars staves off the inevitable price hike?
The real AI bubble isn't in the technology but in the closed-source stacks propped up by unsustainable economics. Right now, those costs are subsidized by investor money. As prices rise and flexibility shrinks, builders locked into proprietary APIs and walled gardens will be the ones to feel the painful pop.
What does it look like?
Real-world workloads highlight where builders are placing their bets. One major area is agent copilots — teams are rolling out code assistants that query knowledge bases in real time, often achieving sub-second latency on open checkpoints. These systems not only deliver a responsive developer experience but also offer predictable inference costs that scale linearly with usage, making them more sustainable for long-term adoption.
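The retrieval step behind such a copilot can be sketched in miniature. The toy below scores knowledge-base snippets by keyword overlap with the query; real deployments use embedding search over a vector store, but the request shape is the same. Every name here is a hypothetical stand-in:

```python
# Toy retrieval step for a code-assistant copilot: score knowledge-base
# snippets by keyword overlap with the query and return the top matches.
# Hypothetical sketch -- production systems would use embedding search.
from collections import Counter

KNOWLEDGE_BASE = [
    "How to configure retry backoff for the billing API client",
    "Deploying the inference service behind a load balancer",
    "Rotating API keys without downtime",
]

def tokenize(text: str) -> Counter:
    """Lowercased bag-of-words representation of a string."""
    return Counter(text.lower().split())

def top_k(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank docs by shared-token count with the query, highest first."""
    q = tokenize(query)
    ranked = sorted(docs, key=lambda d: sum((q & tokenize(d)).values()), reverse=True)
    return ranked[:k]

print(top_k("rotate api keys", KNOWLEDGE_BASE, k=1))
```

The retrieved snippets are then stuffed into the model's prompt; since every query costs tokens, the per-token rate directly sets the unit economics of the whole copilot.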
Another area of momentum is multimodal pipelines. Media companies, for example, are using GPU clusters to process workflows like video-to-text-to-translation at scale. By shifting to specialized infrastructure, they’ve reduced processing costs by as much as 60% compared to hyperscaler equivalents — all while maintaining SLA-grade throughput.
Meanwhile, open-source models paired with reinforcement learning (OSS+RL) are becoming the foundation for meaningful differentiation. Instead of fine-tuning closed models through expensive APIs (which becomes a costly context window expense), some businesses are training domain‑specific models on their own proprietary data using reinforcement feedback loops. This approach reduces dependence on proprietary vendors, enables rapid iteration on real‑world outcomes, and transforms AI from a cost center into an asset that compounds over time. The result is a bespoke AI model that fully understands the context of your business and desired outcomes, improving adoption rates and providing true business value.
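The feedback-loop idea can be illustrated with a simple rejection-sampling variant: sample several candidates from an open model, score them with a domain-specific reward, and keep the winners as training data for the next fine-tuning round. The `generate` and `reward` functions below are stand-in stubs, not a real training pipeline:

```python
# Minimal sketch of a reinforcement feedback loop over an open model:
# sample candidates, score them with a domain reward, keep the best as
# fine-tuning data. generate() and reward() are hypothetical stubs.
import random

def generate(prompt: str, n: int = 4) -> list[str]:
    """Stub for sampling n candidate completions from an open model."""
    return [f"{prompt} -> draft {i} ({random.random():.2f})" for i in range(n)]

def reward(prompt: str, completion: str) -> float:
    """Stub domain reward: in practice a learned scorer or business metric."""
    return len(completion)  # placeholder signal

def collect_finetune_data(prompts: list[str]) -> list[tuple[str, str]]:
    """One best-of-n rejection-sampling pass over a batch of prompts."""
    dataset = []
    for p in prompts:
        best = max(generate(p), key=lambda c: reward(p, c))
        dataset.append((p, best))
    return dataset

data = collect_finetune_data(["summarize invoice", "classify ticket"])
```

Because the loop runs against weights you control, each iteration improves an asset you own rather than feeding usage data back to a proprietary vendor.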
The common theme: inference demand keeps climbing, and the workloads that thrive are the ones most cost‑sensitive — exactly where open models on specialized GPU clouds outperform.
The real reason change hasn’t happened yet
The obvious question arises: why aren't companies shifting away from closed-source providers? The answer is simple: they currently have no incentive to change. Rearchitecting an entire tech stack to be model-agnostic is painful, and slotting in a new model is rarely as simple as changing the API.
Eventually, closed-source providers will be forced to either raise prices, improve efficiency, or secure additional funding.
For users of closed-source providers, each option carries trade-offs, but history shows the market tends to choose the fastest, simplest route. The options aren't mutually exclusive, but raising prices is the path of least resistance for the providers themselves, and that is the event horizon where part of the AI bubble pops.
The question I’m always asking our partners at GMI Cloud: where does your stack grant you leverage instead of lock-in? And the ones who have a good answer – well, they’re winning the market.



