
Most AI infrastructure conversations focus on raw performance and the user experience that follows from it: how fast tokens come back, how much throughput a model can sustain. What gets discussed less is the thing that actually limits both: thermal management. AI workloads generate enormous amounts of heat, and these systems can only operate within a narrow temperature window. So how are AI data center operators handling the problem today, and where is the architecture going?
Traditionally, data centers have leaned on air conditioning and water-evaporative cooling. But as power demand and water demand both rise, communities are pushing back with stricter regulations and tighter caps on what a single facility can consume. Operators are now searching for ways to make their existing infrastructure more efficient, and to adopt cooling architectures that can sustainably support continuous growth.
Tackling the resource challenge
In a typical data center today, cooling consumes 30 to 40 percent of total facility energy, depending on architecture and climate [1]. At the industry average PUE of 1.56 [2], about 36 percent of facility power goes to non-IT overhead, and cooling accounts for most of it, roughly 28 to 30 percent of facility power. Even modern direct-to-chip (DTC) systems still spend 10 to 20 percent of facility power on thermal management [3], largely because most current deployments require facility water at roughly 30°C or below to stay within ASHRAE W2 / W3 envelopes. In hot climates, that means running mechanical chillers for much of the year.
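The link between PUE and overhead is simple arithmetic, and a short sketch makes it concrete. The function name and the 80 percent cooling fraction below are illustrative assumptions, not figures from the cited sources; only the PUE of 1.56 comes from the text.

```python
def overhead_share(pue: float) -> float:
    """Fraction of total facility power spent on non-IT overhead.

    PUE = total facility power / IT power, so IT equipment receives
    1/PUE of the total and everything else (cooling, power conversion,
    lighting) receives the remainder.
    """
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return 1.0 - 1.0 / pue


# Industry-average PUE cited in the text.
share = overhead_share(1.56)
print(f"Overhead share at PUE 1.56: {share:.1%}")  # 35.9%

# ASSUMPTION: cooling is about 80 percent of overhead. This split is
# illustrative only and varies by facility and climate.
ASSUMED_COOLING_FRACTION_OF_OVERHEAD = 0.80
cooling = share * ASSUMED_COOLING_FRACTION_OF_OVERHEAD
print(f"Cooling share under that assumption: {cooling:.1%}")  # 28.7%
```

Under that assumed split, the result lands inside the 28 to 30 percent range quoted above. A facility with a different overhead mix would land elsewhere.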
Water consumption is becoming an equally critical sustainability issue. A small-sized data center of 10 to 20 MW can consume up to 110 million gallons of water annually for cooling, equivalent to the yearly water use of about 1,000 households [4]. Cooling towers and evaporative chiller loops are the main culprits, and they do not go away simply by moving to cold plate cooling, because heat still has to be rejected to the atmosphere somewhere.
A useful analogy: think of an AI data center like a high-performance car engine. Combustion generates the energy that moves the car, but it also generates heat. If that heat is not removed properly, efficiency drops, components degrade, and performance falls off. Automotive engineers have spent decades optimizing engine cooling to extract more useful work from the same fuel, and AI data centers are now in the same race; even a 10 to 15 percent gain in compute output per watt translates into multi-million-dollar swings in operating cost at the megawatt scale.
Where conventional cooling hits the wall
Today’s data centers spend a significant share of their power and water budget just keeping the facility cool. PUE, the ratio of total facility energy to IT equipment energy, is the industry’s standard yardstick. The further it sits above 1.0, the more energy is burned on overhead rather than compute.
Traditional air-cooled facilities depend on an extensive support stack: chillers, cooling towers, CRAC and CRAH units, raised floors, ducting, and airflow management. That infrastructure carries real cost, both up front and across operating life. Direct-to-chip (DTC) removes some of it, including raised floors and most ducting, but still requires chillers and cooling towers, because most current DTC deployments operate at 30°C inlet coolant or below. As chip TDPs continue to climb, those temperatures get squeezed further, water consumption rises, and chiller energy costs scale with chip power.
This is why NVIDIA’s January 2026 announcement of the Vera Rubin platform is so consequential. Jensen Huang stated at CES that Vera Rubin NVL72 systems are designed around a 45°C supply temperature, and that no water chillers are necessary for data centers at that operating point [5]. In a single line, NVIDIA validated what we and a handful of others have been arguing for years: the future of AI cooling is warm water, not cold water. The entire industry is being pulled toward the temperature regime where mechanical chillers become optional and dry-cooler-only, water-free architectures become viable.
Two-phase immersion has been one early answer to that pull, and it does deliver excellent thermal performance. But its form factor is a serious deployment problem. Large horizontal tanks of dielectric fluid break compatibility with standard 19-inch racks, require reinforced floor loading to support hundreds of gallons of coolant per tank, demand new servicing procedures, and force operators to rebuild around the technology rather than retrofit it. For most colocation and existing enterprise facilities, that is a non-starter.
A different approach: Adaptive Phase Cooling
At Ferveret, we took a different path. We invented Adaptive Phase Cooling (APC), built around a mechanism borrowed from nuclear reactor thermal hydraulics called subcooled nucleate boiling. APC submerges servers in dielectric fluid, but the boiling regime is fundamentally different from that of a typical two-phase system. Bubbles form on the heated surface and collapse back into the surrounding liquid before they ever reach a vapor space. The system operates at near-atmospheric pressure, which means no pressure vessels, no exotic plumbing, and no special floor loading.
The result is an architecture that combines the thermal performance of two-phase cooling with the deployment simplicity of direct-to-chip. APC eliminates water consumption entirely and significantly lowers PUE: roughly 17 percent more facility power reaches the racks compared to a typical DTC deployment. On top of that, APC delivers about 15 percent more TFLOPS per kilowatt at the server level than state-of-the-art DTC, and around 55 percent more than air cooling, while operating at the same 45°C inlet coolant temperature that Vera Rubin is designed around. Compounded together, the two gains over DTC translate to roughly 35 percent more compute output within the same facility power envelope.
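The 35 percent figure follows from multiplying the two DTC-relative gains rather than adding them. A minimal sketch of that arithmetic follows; the helper name is hypothetical, and it assumes the facility-level and server-level gains are independent.

```python
def compound_gains(*gains: float) -> float:
    """Combine fractional gains multiplicatively.

    compound_gains(0.17, 0.15) means +17% followed by +15%,
    returned as a single fractional gain.
    """
    total = 1.0
    for gain in gains:
        total *= 1.0 + gain
    return total - 1.0


facility_gain = 0.17  # more facility power reaching the racks vs. DTC (from the text)
server_gain = 0.15    # more compute per kW at the server vs. DTC (from the text)

combined = compound_gains(facility_gain, server_gain)
print(f"Combined gain vs. DTC: {combined:.0%}")  # 35%
```

Simple addition would give 32 percent. The compounded value is 1.17 × 1.15 ≈ 1.35, because the extra power delivered to the racks is itself converted to compute more efficiently.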
Unlocking new geographies
Cooling has quietly become the binding constraint on where new AI data centers can be built. Permitting and site selection in water-scarce regions like Texas, Arizona, and the Middle East are increasingly difficult. At the same time, sovereignty and data residency rules are forcing more workloads to stay inside specific national borders. Operators are caught between two opposing pressures: build where the water and power are, or build where the data is allowed to live.
Back to the engine analogy. As the world has demanded less reliance on traditional gas-powered vehicles, automotive engineers have rethought how to extract more useful work from less fuel. Data center cooling is going through the same transition. The objective is to multiply compute output without multiplying environmental burden.
APC eliminates or simplifies infrastructure at multiple levels. Because it operates at 45°C inlet, well above ambient in most climates, dry coolers can handle heat rejection year round. Mechanical chillers, cooling towers, and their associated piping, controls, and maintenance disappear. CRAC and CRAH units, raised floors, and air management systems are also unnecessary, since heat is captured directly at the chip inside a sealed, rack-compatible chassis.
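The reason a higher inlet temperature removes the chiller can be sketched in a few lines. A dry cooler can only bring the loop down to ambient dry-bulb temperature plus its approach temperature, so it is sufficient whenever that sum stays at or below the required supply temperature. The sample temperatures and the 6°C approach below are hypothetical illustration values, not measured data or vendor specifications.

```python
from typing import Iterable


def chiller_free_fraction(ambient_temps_c: Iterable[float],
                          supply_temp_c: float,
                          approach_c: float) -> float:
    """Fraction of samples where a dry cooler alone meets the supply target.

    The dry cooler suffices when ambient + approach <= supply temperature.
    """
    temps = list(ambient_temps_c)
    if not temps:
        raise ValueError("need at least one ambient temperature sample")
    sufficient = sum(1 for t in temps if t + approach_c <= supply_temp_c)
    return sufficient / len(temps)


# HYPOTHETICAL dry-bulb samples (deg C) for a hot-climate day.
sample_day = [26, 25, 24, 24, 25, 27, 30, 33, 35, 37, 38, 39]
ASSUMED_APPROACH_C = 6.0  # illustrative dry-cooler approach temperature

for supply in (30.0, 45.0):
    frac = chiller_free_fraction(sample_day, supply, ASSUMED_APPROACH_C)
    print(f"{supply:.0f} C supply: dry cooler sufficient for {frac:.0%} of samples")
# 30 C supply: dry cooler sufficient for 17% of samples
# 45 C supply: dry cooler sufficient for 100% of samples
```

With these assumed inputs, a 30°C loop needs mechanical cooling for most of the day, while a 45°C loop never does. Real site assessments would use a full year of local weather data and the actual equipment approach.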
What remains is a dramatically simplified closed loop: the APC chassis at the rack, a coolant distribution unit, and dry coolers. That reduction in mechanical complexity drives substantial savings in capital, energy, water, and maintenance, and it opens entire geographies that were previously off the table.
Breaking the thermal ceiling
Today’s AI data centers are limited by three thermal walls: heat flux density at the die, thermal resistance from chip to coolant, and persistent hotspots that force throttling. For most of the industry’s history, cooling was treated as a secondary concern, a constraint to engineer around rather than a lever to optimize. As workloads scale and hardware becomes more power dense, cooling is now the lever that determines performance, scalability, cost, and even where a data center can physically be built.
The Ford Model T was a remarkable car for 1908. We do not drive Model Ts today because every layer of the system, including how engines are cooled, has been reinvented many times since. AI data centers are due for the same kind of reinvention. The cooling architectures that brought us this far were never designed for the heat flux of a 1,400-watt accelerator.
The next generation of AI infrastructure will be built around cooling, not in spite of it, and the operators that get this right will pull meaningfully ahead on both performance and environmental footprint.
References
[1] U.S. Congressional Research Service, “Data Centers and Their Energy Consumption: Frequently Asked Questions,” Report R48646.
[2] Uptime Institute, 2024 Global Data Center Survey.
[3] NVIDIA and Vertiv, “Power Usage Effectiveness Analysis of a High-Density Air-Liquid Hybrid Cooled Data Center,” ASME, summarized in Data Center Dynamics.
[4] Environmental and Energy Study Institute, “Data Centers and Water Consumption.”
[5] NVIDIA Developer Blog, “Inside the NVIDIA Vera Rubin Platform,” January 2026.



