For the past few years, AI infrastructure has been defined by a single, powerful image: massive training clusters rising in remote locations where power is abundant, land is cheap, and scale is king. That model isn’t wrong, but it isn’t the endgame.
A second phase of AI infrastructure is already forming, one that looks less like a handful of mega-sites and more like a distributed network of facilities designed for speed, proximity, and adaptability. When that shift fully materializes, it won’t just change where data centers are built. It will fundamentally change what scale means and how cooling must be designed.
Phase One: Training-Driven, Power-First Infrastructure
The first wave of AI infrastructure has been shaped almost entirely by training workloads.
Training favors:
- Large, contiguous blocks of power
- Rapid construction at scale
- High utilization of dense compute
- Locations optimized for economics, not proximity
Training data centers are optimized for power and cooling capacity, not for proximity to users or low network latency. Unlike edge cloud deployments, they do not need to sit close to where applications are consumed. Training jobs can run wherever the power and economics work, because no user is waiting on an immediate response.
That’s why today’s training facilities often sit far from population centers and network hubs. The design assumptions made sense for the problem being solved.
Phase Two: Inference Changes the Rules
The real impact of AI won’t come from models sitting idle, it will come from AI embedded into products and systems that operate in real time. This shift is happening sooner than you may think: per JLL’s 2026 Data Center Research Outlook Report, it is expected to play out over the coming year and materialize fully in 2027. Inference models are already operating live across:
- Search and recommendations
- Customer support and copilots
- Fraud detection and risk analysis
- Industrial automation and robotics
- Media, gaming, and personalization
This is where inference takes over as the dominant workload, and inference has very different constraints. Training tolerates distance from the user or application. Inference does not.
When AI is part of an interactive system, latency becomes a hard limit. Network connectivity suddenly matters as much as power availability. Compute has to move closer to users, devices, and aggregation points in the network.
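To put rough numbers on that, here is a simple back-of-envelope sketch. Light in fiber travels at roughly 200,000 km/s, so every kilometer of distance adds about 10 microseconds of round-trip delay before any switching, queuing, or model inference time is counted. The 50 ms interactive budget below is an illustrative assumption, not a benchmark.

```python
# Back-of-envelope: round-trip fiber propagation delay vs. distance.
# Assumption: light in fiber travels ~200,000 km/s (~200 km per millisecond).
# Real deployments add switching, queuing, and inference time on top of this floor.

FIBER_SPEED_KM_PER_MS = 200.0

def round_trip_ms(distance_km: float) -> float:
    """Minimum round-trip propagation delay over fiber, ignoring all other overhead."""
    return 2 * distance_km / FIBER_SPEED_KM_PER_MS

# Hypothetical end-to-end interactive latency budget.
LATENCY_BUDGET_MS = 50.0

for distance_km in (50, 500, 2000, 4000):
    rtt = round_trip_ms(distance_km)
    remaining = LATENCY_BUDGET_MS - rtt
    print(f"{distance_km:>5} km away: >= {rtt:5.1f} ms network floor, "
          f"{remaining:5.1f} ms left for inference and everything else")
```

Even under these generous assumptions, a facility thousands of kilometers away surrenders a large slice of the budget to the network alone, which is why inference compute gravitates toward users.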
This is how infrastructure shifts happen, not because the old model collapses, but because the application evolves. AI infrastructure is no longer just a destination, it’s becoming a network.
Scale Isn’t Shrinking, It’s Fragmenting
This shift is often framed as “hyperscale versus edge,” but that binary misses what’s actually happening. Scale isn’t disappearing, it’s fragmenting.
The future will likely include:
- Massive, centralized training hubs
- Regional inference facilities in the 10-50 MW range
- Highly localized edge deployments
All three will coexist.
That middle tier, regional inference, may become one of the most important layers in the AI stack. These facilities must balance density, latency, speed of deployment, and real estate constraints. They’re large enough to demand infrastructure-grade solutions, but small enough that traditional hyperscale assumptions no longer apply.
And this is where many designs begin to strain, especially around cooling.
Facility-Scale Cooling Is a Philosophy, Not a Size
One of the most persistent misconceptions in edge discussions is that facility-scale cooling only applies to very large data centers.
In reality, facility-scale cooling is not about megawatts, it’s about architecture.
Once rack densities climb and space becomes constrained, the same questions apply whether a facility is 10 MW or 100 MW:
- Where does the cooling infrastructure live?
- How does it scale without consuming compute space?
- How easily can it adapt to future hardware generations?
Edge and regional facilities don’t eliminate the need for infrastructure-grade cooling. They make it unavoidable sooner.
As densities rise, row-by-row approaches increasingly compete with revenue-generating white space. Maintenance paths shrink. Expansion becomes harder. At a certain point, cooling stops being an accessory and becomes a backbone.
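To make that trade-off concrete, consider a simplified, hypothetical sketch: if each in-row cooling unit occupies a rack-width position for every handful of IT racks it serves, a growing share of the floor goes to cooling rather than compute. The ratios below are illustrative assumptions, not vendor figures.

```python
# Hypothetical illustration: floor positions consumed by in-row cooling units.
# Assumption: one in-row cooling unit occupies one rack-width position
# for every N IT racks it serves. These ratios are illustrative, not measured.

def white_space_lost(total_positions: int, racks_per_cooler: int) -> float:
    """Fraction of floor positions occupied by in-row coolers instead of IT racks."""
    # Each group of (racks_per_cooler + 1) positions holds one cooler plus the IT racks.
    coolers = total_positions // (racks_per_cooler + 1)
    return coolers / total_positions

POSITIONS = 200  # hypothetical data hall with 200 rack positions

for ratio in (6, 4, 2):  # denser racks tend to need more cooling capacity per row
    lost = white_space_lost(POSITIONS, ratio)
    print(f"1 cooler per {ratio} IT racks -> ~{lost:.0%} of positions lost to cooling")
```

As density pushes the ratio down, the share of floor lost to cooling climbs from the mid-teens toward a third of the hall, which is the point where cooling stops being an accessory and starts dictating the floor plan.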
Cooling Becomes Infrastructure Again
From our perspective, this shift isn’t theoretical.
When you’ve actually run high-density, liquid-cooled environments, you stop thinking about cooling as a collection of devices and start treating it as infrastructure, something that must behave predictably across an entire facility, not just a rack or row.
That experience changes how you design for the future. You assume:
- Densities will continue to rise
- Hardware will change faster than buildings
- Cooling systems must survive multiple compute generations
- White space must be preserved, not consumed
Once cooling is treated as infrastructure, facility size becomes secondary. What matters is whether the system can adapt without forcing a rebuild.
The Hidden Pressure: Silicon Moves Faster Than Buildings
There’s a deeper force accelerating all of this: the widening gap between hardware cycles and facility lifecycles. Data centers are 12-15 year assets. AI accelerators are evolving on 12-18 month cycles.
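The arithmetic from those two figures alone makes the gap clear; the sketch below is purely illustrative.

```python
# Rough arithmetic from the figures above: how many accelerator generations
# does a single facility live through? Purely illustrative.

FACILITY_LIFE_YEARS = (12, 15)        # facility lifespan cited above
ACCELERATOR_CYCLE_MONTHS = (12, 18)   # accelerator refresh cadence cited above

fewest = (FACILITY_LIFE_YEARS[0] * 12) // ACCELERATOR_CYCLE_MONTHS[1]
most = (FACILITY_LIFE_YEARS[1] * 12) // ACCELERATOR_CYCLE_MONTHS[0]

print(f"A facility designed today may host roughly {fewest} to {most} "
      f"accelerator generations before the building itself is retired.")
```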
Upgrading compute is relatively straightforward. Upgrading mechanical systems is not.
Facilities designed around today’s assumptions can become constrained far earlier than expected, not because they failed, but because the silicon outpaced the infrastructure. This is where flexibility stops being a “nice to have” and becomes a survival trait.
Designing cooling as adaptable infrastructure is one of the few ways to hedge against that mismatch.
Retrofits Will Define the Near Term
Much of the inference capacity needed over the next several years won’t come from pristine greenfield builds. It will come from retrofits and expansions, facilities adapting to densities they were never originally designed to support.
Cooling is the hardest part of that transition. Pipes, pressure, flow, and heat rejection aren’t easily abstracted. But they are also where the fastest capacity gains can be unlocked when the work is done thoughtfully.
Operators who can navigate that complexity, without tearing facilities apart, will move faster than the market.
Final Thoughts
AI isn’t just increasing demand for infrastructure, it’s changing the shape of demand.
The next phase will reward facilities that are:
- Close enough to serve real-time workloads
- Dense enough to be efficient
- Flexible enough to adapt
- Designed for change, not static peak assumptions
This is the lens we bring to our work at Nautilus.
Our focus on facility-scale liquid cooling didn’t come from forecasts or market trends, it came from operating real, high-density, liquid-cooled environments and seeing firsthand where traditional assumptions break.
As AI infrastructure becomes more distributed and more latency-sensitive, infrastructure-first, experience-led design is no longer a differentiator. It’s becoming a requirement.
Nautilus designs and delivers infrastructure-grade liquid cooling shaped by real-world AI operations. If you’re considering a retrofit or new build, let us help you with your liquid cooling design and implementation.