
Can Generative AI Transform Finance? Why the Answer Lies in Compute Infrastructure Constraints

By Satayan Mahajan, Chief Executive Officer, Datalign Advisory

Over the next couple of decades, an estimated $84.4 trillion will transfer from Baby Boomers to Gen X and Millennials. This is more than just money changing hands. It creates a massive wave of decisions about inheritance, taxes, investments, and financial security for millions of families.

At the same time, Jensen Huang, CEO of NVIDIA, warned that while we are moving toward an AI-driven world, the compute infrastructure powering it is not scaling fast enough to meet demand. For most industries, this is a challenge to watch closely. For wealth tech, it is an urgent problem.

The traditional way of finding financial advice through referrals and long relationships is fading. Ficomm’s 2024 Consumer Insights Study shows only 29% of advice seekers require a referral, and that number drops to 17% for people under 44. Nearly half of all advice seekers now find financial advisors through digital channels.

AI offers a powerful way to help people find trusted guidance amid growing financial complexity and shifting behaviors. But the kind of AI that develops—and who it ultimately serves—depends on critical infrastructure decisions companies are making right now. These choices will shape not only the future of wealth tech but also how AI transforms our lives more broadly. The rest of this article explores two very different paths emerging from those decisions and what they mean for the industry and the people it aims to help.

Rethinking the Infrastructure Choice

NVIDIA powers the majority of global AI systems, from large language models to real-time recommendation engines. When Huang says infrastructure isn’t scaling, he’s referring to the core resources behind all of it: GPUs, data centers, and the energy to run them. His warning carries particular weight because NVIDIA isn’t just observing this trend—they’re one of the key makers of hardware that powers it.

At a high level, every firm shares the same ambition: to harness AI for more personalized, intelligent, and effective financial services. But they’re making different bets on how to get there.

The scale-first approach sticks with known and proven techniques—training larger models, acquiring more GPUs, and pushing toward real-time, always-on intelligence. The goal: extract sharper insights, deliver faster decisions, and gain competitive edge at machine speed. The ambition is clear. But so is the risk.

Fine-tuning a foundation model for financial time series can run $500K to $2M per iteration. Real-time portfolio optimization for 10,000 clients may demand more than 100 A100 GPUs. These systems don’t just push compute limits—they burn through energy at rates up to 50x higher than traditional methods.

Infrastructure cost and capacity aren’t abstract concerns—they are structural boundaries. If they don’t improve quickly enough, this approach will favor resource-rich incumbents and deepen the digital divide in financial access. But if they do improve, the payoff could be massive: a platform for widespread innovation, openness, and intelligence at scale.

The efficiency-first approach takes a different tack. Rather than chase ever-larger systems, some firms are optimizing for cost, agility, and durability. They’re using quantization techniques that cut memory needs by 4x with minimal accuracy loss. Knowledge distillation reduces model size by 10x while retaining market-predictive power. Hybrid architectures split compute between local and cloud systems—reducing latency by 70% and slashing costs by up to 80%.
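To make the quantization idea concrete, here is a minimal sketch of symmetric int8 post-training quantization—an illustrative toy, not any firm's production system. Mapping float32 weights to int8 with a single per-tensor scale cuts their memory footprint by 4x, and the reconstruction error is bounded by half the quantization step:

```python
# Illustrative sketch of post-training weight quantization (hypothetical
# example, not a production risk engine). Float32 weights are mapped to
# int8 with a per-tensor scale, cutting memory roughly 4x.
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor quantization of float32 weights to int8."""
    scale = np.abs(weights).max() / 127.0  # map the largest weight to 127
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float32 weights from the int8 representation."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(1024, 1024)).astype(np.float32)

q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

memory_ratio = w.nbytes / q.nbytes  # 4.0: 4-byte floats -> 1-byte ints
max_err = np.abs(w - w_hat).max()   # bounded by scale / 2 (rounding error)
print(f"memory reduction: {memory_ratio:.0f}x, max error: {max_err:.5f}")
```

The "minimal accuracy loss" claim above comes from exactly this property: the per-weight error never exceeds half a quantization step, which for well-scaled layers is small relative to the weights themselves.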

These aren’t academic experiments. They’re live systems—powering risk engines, recommendation logic, and portfolio tools today. Specialized AI chips are also gaining traction. Andrew Feldman, CEO of Cerebras, puts it plainly: GPUs, originally designed for graphics, have become the industry standard for machine learning, but “Cerebras is changing that by designing a chip specifically for AI.”

Lean architectures like MobileNets and linear attention transformers are cutting overhead while preserving predictive strength. The message? Smaller can still be smart—and in many cases, better suited to financial services.

But this path has risks too. Over-rotating on cleverness or control can obscure the point: to serve real customers, not just build elegant systems. And in fast-moving sectors like AI, complexity can become a distraction rather than a differentiator.

The tradeoff is now clear:

  • The scale bet leans on familiar methods, but gambles that infrastructure constraints won’t throttle growth—and that new entrants will still have a shot.
  • The efficiency bet embraces experimentation and opens doors to challengers, but risks over-indexing on engineering rather than outcomes.

What Actually Matters

Both approaches may work—but they should be judged by the same core metric: do they lead people to better financial outcomes? A powerful model that’s too expensive to scale won’t reach the people who need it most. Infrastructure is only as valuable as the clarity, trust, and utility it enables.

That’s the deeper strategic lens. The infrastructure bet a company makes isn’t just about performance—it reveals what it values. Is speed the priority? Accessibility? Control? Differentiation? Every decision reflects a point of view about the future.

Morgan Stanley’s Head of AI, Jeff McMillan, captures the tension succinctly: “The challenge is not the math or the models. It’s how do you actually get these things into production at scale, in a way that’s reliable and safe.”

A more grounded approach is emerging: assume compute capacity and model cost will improve. Focus on delivering value. Only optimize deeply when there’s a clear, customer-facing reason to do so. In other words—don’t chase efficiency as an end, but as a means to serve.

The Alignment Bet

The most resilient companies don’t start with infrastructure. They start with purpose. Amazon’s leadership principles, like Customer Obsession and Think Big, don’t just guide product—they shape compute investments. Every dollar spent on infrastructure is measured against mission clarity.

For financial services—and increasingly, for AI across all sectors—that kind of discipline is table stakes. It’s not about being faster for the sake of speed. Or cheaper for the sake of efficiency. It’s about building systems that meet people in the real world, with real needs.

This piece focuses on compute because it’s where the divergence in strategy is most visible. But this is part of a broader belief: AI must be built in alignment—with the people it serves, the values it reflects, and the long-term outcomes it enables.

That’s the real infrastructure bet: Not just power. Not just polish.

Perfect alignment.
