
How AI Became a Construction Project
For years, the AI sector has operated under a simple mantra: more compute equals more intelligence. That belief has driven an extraordinary build-out of data centers optimized for artificial intelligence, fueling a construction frenzy that reached $61bn in 2025 and is expected to surpass $1 trillion by 2030. Some forecasts suggest these facilities could consume 3% of global electricity by 2030, more than double 2024 levels. And it doesn’t stop there: today’s hyperscale facilities already draw as much electricity as tens of thousands of households and strain water supplies in regions already plagued by shortages and droughts.
Hyperscalers often frame expanding hardware capacity as progress, leading us to believe that the only way forward is brute-force expansion. But simply adding more GPUs concentrates power among cloud providers without addressing deeper inefficiencies in system design or compute allocation. Large tech companies lock in long-term energy and GPU supply to manage scarcity, yet these measures do little to address the underlying compute problem. Much enterprise AI infrastructure remains underutilized, and many organizations are frustrated by poor GPU scheduling and fragmented resources. This points to an overlooked truth: today’s AI infrastructure is built for static, worst-case provisioning rather than dynamic, efficient allocation of capacity.
Engram: Rethinking the Compute Shortage
Engram, an architectural innovation developed by DeepSeek, represents a shift in thinking about AI compute. It uses conditional, memory-based lookup techniques that let models fetch information rather than recompute it repeatedly. Retrieval-centric designs ease pressure on high-bandwidth memory and specialized accelerator cycles by replacing repeated model runs with efficient data lookups. As a result, marginal GPU efficiency gains compound into meaningful cost savings, lower energy use, and faster experimentation cycles.
By shifting the heavy lifting toward RAM and building efficient lookup tables, Engram addresses one of the most significant bottlenecks in modern AI: the constant, wasteful re-computation of simple data. Its contribution lies in driving algorithmic efficiency and in acknowledging that performance gains do not depend solely on scaling hardware. As AI investment spirals out of control and skeptics sound the alarm about an “AI bubble,” Engram demonstrates a step toward smarter infrastructure that relieves the burden on scarce GPUs and respects both the cost and the environmental impact of AI development.
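To make the idea concrete, here is a minimal Python sketch of the retrieval-over-recompute pattern: a lookup table held in ordinary RAM that returns cached results before falling back to expensive computation. This illustrates the general technique, not DeepSeek's actual implementation; the cache policy, key scheme, and the stand-in compute function are assumptions for illustration.

```python
# Illustrative sketch only: a host-RAM lookup table that returns cached
# results before falling back to recomputation. Engram's real design is
# more involved; the LRU policy, byte-hash keys, and expensive_compute()
# stand-in below are assumptions, not DeepSeek's implementation.
from collections import OrderedDict
import numpy as np

class LookupCache:
    """LRU table kept in ordinary RAM, keyed by the input's raw bytes."""

    def __init__(self, max_entries: int = 100_000):
        self._table = OrderedDict()
        self._max_entries = max_entries

    def get_or_compute(self, x: np.ndarray, compute_fn) -> np.ndarray:
        key = x.tobytes()
        if key in self._table:                 # hit: cheap RAM lookup
            self._table.move_to_end(key)
            return self._table[key]
        result = compute_fn(x)                 # miss: pay for the compute once
        self._table[key] = result
        if len(self._table) > self._max_entries:
            self._table.popitem(last=False)    # evict least-recently-used entry
        return result

def expensive_compute(x: np.ndarray) -> np.ndarray:
    # Stand-in for work that would otherwise be re-run on an accelerator.
    return np.tanh(x @ x.T).sum(axis=0)

cache = LookupCache()
x = np.random.default_rng(0).standard_normal((64, 64))
first = cache.get_or_compute(x, expensive_compute)   # computed
second = cache.get_or_compute(x, expensive_compute)  # fetched from RAM
assert np.array_equal(first, second)
```

In this toy setting, every repeated input becomes a cheap memory fetch instead of another accelerator pass, which is the lever that Engram-style, retrieval-centric designs aim to pull at much larger scale.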
The Real Challenge is Coordination
Despite improvements at the software level, the underlying compute scarcity remains structural. GPU supply continues to be constrained relative to demand for training and inference. Procurement lead times for advanced accelerators remain long, and many organizations, including juggernauts like OpenAI, struggle to secure capacity at predictable cost. Research has also shown that, in practice, GPUs often operate well below their theoretical capacity: fragmented workloads and static scheduling leave resources underutilized even when demand is high. Peak-oriented, rigid infrastructure leaves GPUs idle while urgent workloads wait, limiting the value extracted from existing hardware.
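The toy calculation below shows why static provisioning hurts: the same 16 GPUs serve the same hourly demand far better when they are pooled than when they are carved into fixed per-team slices. The team names and numbers are hypothetical, chosen only to make the effect visible.

```python
# Toy utilization comparison (hypothetical numbers): the same 16 GPUs and the
# same hourly demand, under a static per-team split versus a shared pool.
teams = {"research": 8, "product": 4, "platform": 4}      # static allocation
demand = {"research": 3, "product": 7, "platform": 2}     # GPUs needed this hour

# Static: each team is capped at its own slice, so spare GPUs elsewhere sit idle.
static_used = sum(min(demand[t], teams[t]) for t in teams)

# Pooled: any idle GPU can serve any queued workload, up to total capacity.
total = sum(teams.values())
pooled_used = min(sum(demand.values()), total)

print(f"static utilization: {static_used}/{total} = {static_used/total:.0%}")
print(f"pooled utilization: {pooled_used}/{total} = {pooled_used/total:.0%}")
# static utilization: 9/16 = 56%   (product queues while research GPUs idle)
# pooled utilization: 12/16 = 75%
```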
Centralized provisioning amplifies these inefficiencies. Hyperscalers like AWS, Google Cloud, and Microsoft Azure continually upgrade hardware and sell fixed package sizes, creating excess capacity in some places and shortages in others. When demand spikes against that fixed capacity, prices rise because supply cannot be reallocated quickly. Reducing GPU strain helps, but as long as models remain locked in hyperscaler silos, today’s “memory wall” will simply give way to the next infrastructure bottleneck.
Unlocking Global Underused Capacity
Decentralization reframes the problem by treating underused global compute as liquidity that marketplace mechanisms can activate. A decentralized network matches supply and demand in real time, aligning pricing and allocation with actual workloads rather than fixed tiers, so idle GPUs can come online cost-effectively and with lower energy use. Decentralized compute marketplaces let developers deploy workloads across distributed providers, support GPUs from multiple vendors, and deliver flexibility and resilience that centralized models struggle to match.
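A minimal sketch of what marketplace-style matching could look like: each job is routed to the cheapest provider that still has enough free GPUs. The provider names, prices, and greedy policy are illustrative assumptions; a real marketplace would also weigh locality, interconnect, and reliability.

```python
# Minimal sketch of marketplace-style matching (all names and prices are
# hypothetical): route each job to the cheapest provider that still has
# enough free GPUs, instead of pinning jobs to a single fixed vendor.
from dataclasses import dataclass

@dataclass
class Provider:
    name: str
    free_gpus: int
    price_per_gpu_hour: float

@dataclass
class Job:
    name: str
    gpus_needed: int

def match(jobs: list[Job], providers: list[Provider]) -> dict[str, str]:
    """Greedy, price-ordered allocation of jobs (largest first) to providers."""
    placements: dict[str, str] = {}
    for job in sorted(jobs, key=lambda j: j.gpus_needed, reverse=True):
        candidates = [p for p in providers if p.free_gpus >= job.gpus_needed]
        if not candidates:
            placements[job.name] = "unscheduled"
            continue
        best = min(candidates, key=lambda p: p.price_per_gpu_hour)
        best.free_gpus -= job.gpus_needed
        placements[job.name] = best.name
    return placements

providers = [
    Provider("regional-colo", free_gpus=12, price_per_gpu_hour=1.10),
    Provider("university-cluster", free_gpus=4, price_per_gpu_hour=0.80),
    Provider("hyperscaler", free_gpus=64, price_per_gpu_hour=2.40),
]
jobs = [Job("finetune", 8), Job("batch-inference", 4), Job("eval", 2)]
print(match(jobs, providers))
# {'finetune': 'regional-colo', 'batch-inference': 'university-cluster', 'eval': 'regional-colo'}
```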
Distributed compute also supports resilience. Centralized infrastructure carries systemic risk: an outage at a single provider can cascade across sectors. Decentralized networks diffuse that risk by spreading workloads across providers and regions, so an outage in one area is unlikely to take down the entire system; other providers can step in to make up the shortfall. Through better utilization, decentralized allocation can lower overall costs and broaden access. And as usage scales, the network effect reinforces itself: more providers attract more demand, which in turn draws more capacity into the network.
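As a rough illustration of that substitution, the sketch below re-dispatches workloads from a failed region to healthy providers with spare headroom. The region names, capacities, and one-GPU-per-job simplification are hypothetical.

```python
# Hypothetical failover sketch: if a provider in one region reports an outage,
# its workloads are re-dispatched to healthy providers with spare capacity.
def reroute(assignments: dict[str, str], capacity: dict[str, int],
            load: dict[str, int], failed: str) -> dict[str, str]:
    healthy = {p: capacity[p] - load.get(p, 0) for p in capacity if p != failed}
    updated = dict(assignments)
    for job, provider in assignments.items():
        if provider != failed:
            continue
        # Send each displaced job to the healthy provider with the most headroom.
        target = max(healthy, key=healthy.get)
        if healthy[target] <= 0:
            updated[job] = "queued"            # no spare capacity anywhere
            continue
        updated[job] = target
        healthy[target] -= 1                   # simplification: one GPU per job
    return updated

assignments = {"train-a": "eu-west", "serve-b": "us-east", "serve-c": "eu-west"}
capacity = {"eu-west": 8, "us-east": 8, "ap-south": 4}
load = {"eu-west": 6, "us-east": 5, "ap-south": 1}
print(reroute(assignments, capacity, load, failed="eu-west"))
# {'train-a': 'us-east', 'serve-b': 'us-east', 'serve-c': 'ap-south'}
```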
From Brute Force to Coordination
The AI industry must move beyond the assumption that more hardware alone drives progress. Efficiency gains like Engram’s demonstrate a shift toward smarter computation, but they must be paired with infrastructure that treats compute as a fluid, market-driven resource. Decentralization reframes scarcity as a coordination challenge and unlocks global capacity. Sustainable AI growth will favor systems that combine algorithmic advances with open, distributed resource allocation. Capital expenditure is critical, yet coordination and efficiency will determine whether that expenditure ever delivers real returns. As AI pushes into new frontiers, the enduring advantage will belong to those who can harness collective compute intelligently.


