
Artificial Intelligence (AI) is shifting the way we work and create, and that transformative force is only just beginning to flex its muscles. Once enterprises and consumers work through AI's early-stage kinks – hallucinations being one – the world as we know it will once again transform.
But that transformative power is not limited to the result of a prompt entered into a chatbot; beyond user-facing applications, AI is fundamentally transforming the underlying infrastructure of the network.
The rapid expansion and distribution of AI compute infrastructure is driving global investments in ultra-scalable, high-performance networks operated by cloud and service providers.
According to a Censuswide survey commissioned by Ciena, the rapid growth of AI workloads is driving a major transformation in data center network infrastructure. It's also changing the business models of operators looking to keep up with this new demand.
The survey queried more than 1,300 data center decision makers across 13 countries, and 53% of respondents identified AI workloads as the most significant driver of data center interconnect (DCI) demand over the next two to three years, surpassing traditional cloud computing (51%) and big data analytics (44%).
This is leading to increased investment in infrastructure – but it will also pose additional challenges that need addressing along the way.
Why Large Language Model Training Will Become Distributed
To realize the potential of AI, we need to understand how it does what it does – and that comes down to large language model (LLM) training and inferencing (i.e., using AI in the real world). Training involves creating an LLM that can statistically identify patterns and make decisions based on input data; the model can then learn from its mistakes over time and improve its accuracy through retraining. Inference refers to when the trained AI model makes decisions and predictions based on new data. Once an AI model is properly trained, it is optimized and "pruned" so it can deliver acceptably accurate decisions using far fewer resources than the full LLM required during training.
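To make the training/inference distinction concrete, here is a minimal, illustrative sketch in Python using PyTorch: a training loop, a pruning step, and an inference call. The toy model, placeholder data, and pruning amount are hypothetical and chosen only for illustration – a real LLM is vastly larger, but the division of labor is the same.

```python
# Illustrative sketch only: a toy model standing in for an LLM.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Toy "model" (hypothetical sizes, not an actual LLM).
model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 4))

# --- Training: iteratively adjust weights so the model learns patterns in data ---
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
inputs = torch.randn(256, 16)            # placeholder training data
labels = torch.randint(0, 4, (256,))     # placeholder targets

model.train()
for epoch in range(5):                   # retraining over time improves accuracy
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), labels)
    loss.backward()                      # learn from mistakes via gradient descent
    optimizer.step()

# --- Optimization/"pruning": shrink the trained model so it needs fewer resources ---
for module in model:
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.5)  # zero out 50% of weights

# --- Inference: apply the trained, pruned model to new data it has never seen ---
model.eval()
with torch.no_grad():                    # no gradients needed; far cheaper than training
    prediction = model(torch.randn(1, 16)).argmax(dim=1)
print(prediction)
```

The key point the sketch makes is resource asymmetry: training repeatedly processes large datasets and updates every weight, while inference is a single cheap forward pass over a slimmed-down model.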
When it comes to AI, the overwhelming compute, storage, and bandwidth requirements of training are driving network operators and data centers to look at how their infrastructure must evolve to cater to demand.
The survey found that, as requirements for AI compute continue to increase, the training of LLMs will increasingly occur across geographically distributed data centers: 81% of respondents believe LLM training will take place over some level of distributed data center facilities.
And when asked about the key factors shaping where AI will be deployed, respondents ranked AI energy utilization over time as their top priority (63%), followed by reducing latency by placing inference compute closer to users at the edge (56%), data sovereignty requirements (54%), and offering strategic locations for key customers (54%).
What does all this mean? In sum, the AI ecosystem of tomorrow will be a network of interconnected data centers, all with unique requirements. Edge data centers will handle inferencing but also offer strategic locations for various customers to improve performance of latency-sensitive applications, such as security applications using facial recognition from high-resolution streaming video feeds.
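To see why inference placement matters for those latency-sensitive applications, a rough back-of-the-envelope calculation helps. The sketch below assumes light propagates through fiber at roughly 200 km per millisecond and uses hypothetical example distances; real round-trip times also include switching, queuing, and processing delays not modeled here.

```python
# Propagation-delay sketch: why inference at the edge reduces latency.
SPEED_IN_FIBER_KM_PER_MS = 200.0  # ~200 km per millisecond in fiber (assumed)

def round_trip_ms(distance_km: float) -> float:
    """Propagation-only round-trip time over a fiber path of the given length."""
    return 2 * distance_km / SPEED_IN_FIBER_KM_PER_MS

for label, km in [("edge data center (~50 km away)", 50),
                  ("distant regional data center (~2,000 km away)", 2000)]:
    print(f"{label}: ~{round_trip_ms(km):.1f} ms round trip")

# edge data center (~50 km away): ~0.5 ms round trip
# distant regional data center (~2,000 km away): ~20.0 ms round trip
```

Under these assumptions, moving inference from a distant facility to a nearby edge site cuts the fiber round trip from tens of milliseconds to well under one – a meaningful difference for real-time video analytics.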
Harnessing Connectivity to Bring AI to the Masses
The key to connecting the network will be its arteries – the fiber-optic network – and increasing the capacity along pre-existing and new routes. When asked what fiber-optic capacity they will need for DCI, 87% of the survey's participants said they expect to require 800 Gb/s per wavelength or higher.
As you can see, network operators and data center experts are already thinking about what is needed to bring AI to the masses – consumers and businesses alike.
The energy-intensive nature of AI has raised significant sustainability concerns, underscoring the importance of solutions like high-capacity pluggable optics to minimize power consumption and physical footprint. The International Energy Agency (IEA), for one, predicted that data center electricity demand will more than double globally from 2022 to 2026 because of power-hungry AI infrastructure — and it’s something the industry is taking seriously. The study found that 98% of data center experts believe pluggable optics are important for reducing power consumption and the physical footprint of their network infrastructure.
And part of the sustainability play for network operators comes back to being more efficient, particularly with the transmission of data.
As mentioned previously, the overwhelming majority of data center experts believe they will need 800 Gb/s or higher per wavelength – which means they're not only pushing more data through each fiber, but doing so more efficiently per bit.
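As a rough illustration of that efficiency gain, consider how per-wavelength rate and channel width translate into fiber capacity. The channel widths and usable C-band spectrum in the sketch below are assumed example figures chosen for simple arithmetic, not claims about any particular platform or about the survey itself.

```python
# Illustrative arithmetic: higher per-wavelength rates improve spectral efficiency.
C_BAND_GHZ = 4800  # ~4.8 THz of usable C-band spectrum (assumed example value)

scenarios = {
    "400 Gb/s wavelength in a 75 GHz channel": (400, 75),
    "800 Gb/s wavelength in a 100 GHz channel": (800, 100),
}

for name, (rate_gbps, width_ghz) in scenarios.items():
    channels = C_BAND_GHZ // width_ghz
    fiber_capacity_tbps = channels * rate_gbps / 1000
    spectral_eff = rate_gbps / width_ghz  # bits per second per Hz
    print(f"{name}: {spectral_eff:.1f} b/s/Hz, "
          f"{channels} channels, ~{fiber_capacity_tbps:.1f} Tb/s per fiber")
```

Under these assumed figures, doubling the per-wavelength rate lifts spectral efficiency from roughly 5.3 to 8 b/s/Hz and total capacity per fiber from about 25.6 Tb/s to 38.4 Tb/s – more data carried over the same strand of glass.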
But it's important to note that there is no "one-size-fits-all" architecture – no template to borrow from, no off-the-shelf design that data center operators can simply plug and play. Data center operators know that they must proactively adopt scalable and customizable network architectures to accommodate AI's evolving demands. Cloud providers and data center operators will adopt custom network strategies tailored to their specific business needs and customers.
However, regardless of the approach taken, the key will be network connectivity. Without the right foundation – driven by connectivity between and within data centers as well as out to the network edge – AI’s full potential will not be realized. Operators must ensure their overall network infrastructure is ready for an AI-centric future – and if the research is anything to go by, they’re already considering the architecture necessary to make the mass adoption of AI a success.