How memory helps data centres accommodate the demands of AI

By Iwona Zalewska, Regional Director for UK & Ireland, DRAM Business Manager, EMEA Region, Kingston Technology

It’s hard to believe that it has been only two and a half years since ChatGPT was launched. Generative AI has been widely adopted and is now transforming entire industries, with recent research suggesting that the AI market will grow at an annual rate of as much as 36.6% to 2030.

As the critical infrastructure supporting the digital economy, data centres play an essential role in delivering AI tools to consumers and businesses. The growing demand for AI applications, which often require advanced multitasking or super-fast processing, is putting unprecedented stress on existing data centres, increasing their need for higher memory bandwidth and reduced latency.

Powerful background processing

Calculating just how much processing must be carried out behind the scenes, even before a user can query an AI model, is a challenge. Training sophisticated machine learning models requires huge datasets, accessible from fast storage media (SSDs), and high-end GPUs with large amounts of High Bandwidth Memory (HBM) or VRAM. This high-speed processing to build large neural networks is typically performed in data centres.

But as generative AI tools proliferate and contribute to a surge in data traffic, computational loads and storage demands on data centres are forcing operators and enterprises to assess how they can enhance server capacity and memory performance, which is why solutions such as DDR5 and HBM are gaining significant traction.

The role of DRAM in meeting demands on data centres

DRAM, which directly impacts a server’s ability to process data rapidly and manage large-scale, complex workloads, comes in many different types and resides on multiple form factors.

For environments where large amounts of memory are required and data reliability is essential, such as data centres, ECC Registered DIMMs (RDIMMs) are ideal. This type of memory features a Register Clock Driver (RCD) component on the module, facilitating efficient communication between the memory controller and the individual memory chips, as well as an additional DRAM chip for the Error Correction Code (ECC) feature, providing higher memory reliability.
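Server-grade ECC typically applies a wider single-error-correct, double-error-detect code across each 64-bit (or larger) data word; as a simplified illustration of the underlying principle only, the toy Hamming(7,4) sketch below encodes four data bits with three parity bits and corrects any single flipped bit:

```python
def hamming74_encode(nibble):
    """Encode a 4-bit value into a 7-bit Hamming(7,4) codeword."""
    d = [(nibble >> i) & 1 for i in range(4)]   # data bits d0..d3
    p1 = d[0] ^ d[1] ^ d[3]                      # parity over positions 1,3,5,7
    p2 = d[0] ^ d[2] ^ d[3]                      # parity over positions 2,3,6,7
    p3 = d[1] ^ d[2] ^ d[3]                      # parity over positions 4,5,6,7
    # codeword layout (positions 1..7): p1 p2 d0 p3 d1 d2 d3
    return [p1, p2, d[0], p3, d[1], d[2], d[3]]

def hamming74_correct(codeword):
    """Correct up to one flipped bit, then return the decoded 4-bit value."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + (s2 << 1) + (s3 << 2)        # 0 = clean, else 1-based error position
    if syndrome:
        c[syndrome - 1] ^= 1                     # flip the erroneous bit back
    data = [c[2], c[4], c[5], c[6]]
    return sum(bit << i for i, bit in enumerate(data))
```

Flipping any one of the seven bits of an encoded nibble and passing it through `hamming74_correct` recovers the original value, which is the property ECC memory relies on to mask single-bit DRAM errors.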

To help servers overcome current memory bandwidth constraints, a new technology, Multiplexed Rank DIMMs (MRDIMMs), is being introduced. MRDIMMs allow two ranks of DDR5 DRAM to operate simultaneously, multiplexing their outputs to feed twice as much data per clock to the CPU, thus increasing effective DRAM speed and expanding memory capacity. This delivers performance gains for AI applications, including training models and running Large Language Models (LLMs).
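The bandwidth gain is straightforward to estimate: peak module bandwidth is the transfer rate multiplied by the width of the 64-bit (8-byte) data bus. The speed grades below (DDR5 at 6400 MT/s versus an MRDIMM at 8800 MT/s) are illustrative assumptions for the sketch, not the rating of any specific product:

```python
def peak_bandwidth_gb_s(transfer_rate_mt_s, bus_width_bytes=8):
    """Peak module bandwidth: transfers per second x bytes per transfer.

    MT/s * bytes gives MB/s; dividing by 1000 yields decimal GB/s.
    """
    return transfer_rate_mt_s * bus_width_bytes / 1000

print(peak_bandwidth_gb_s(6400))   # DDR5-6400 RDIMM  -> 51.2 GB/s
print(peak_bandwidth_gb_s(8800))   # 8800 MT/s MRDIMM -> 70.4 GB/s
```

At these example speed grades, a single MRDIMM moves roughly 40% more data per second than a standard DDR5-6400 module.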

As mentioned previously, HBM has also evolved to support higher memory capacities and increased performance, making it highly suitable for AI-intensive environments. This memory technology differs from traditional DDR DIMMs in that DRAM dies are stacked vertically and placed in the same package as the CPU or GPU. This delivers exceptional performance for AI applications, but it has the drawback of being less scalable and serviceable than DDR DIMMs.

While each type of DRAM brings different strengths, there are three fundamental ways in which it bolsters data centre capabilities:

  1. Improved data throughput – GenAI models and other large-scale data workloads benefit from high data throughput, which DRAM facilitates by allowing fast access to large datasets. By expanding DRAM capacity, servers can maintain quick access to data in memory, reducing the frequency of slower data retrieval from other storage tiers. This is critical for AI workloads where rapid data access directly translates to faster model training and inference times, improving the speed and responsiveness of applications.
  2. Reduced latency for real-time applications – AI applications often require near-instantaneous responses. For example, a large language model generates responses by processing multiple layers of a neural network, each requiring vast amounts of data to be accessed at high speed. By boosting DRAM capacity, data centres can reduce latency, delivering quicker response times and reducing user wait times.
  3. Enhanced multitasking and parallel processing – DRAM allows for efficient handling of multiple processes simultaneously, which is particularly beneficial for applications that require intensive multitasking. Expanded DRAM enables servers to manage concurrent tasks more effectively, allowing data centres to support a larger number of virtual machines, applications, and users on the same infrastructure.
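The throughput and latency points above can be made concrete with a simple two-tier model: average access time is a weighted blend of requests served from DRAM and requests that fall through to slower storage. The latency figures used here (roughly 100 ns for DRAM, 80 µs for an NVMe SSD) are order-of-magnitude assumptions for illustration:

```python
def avg_access_time_us(hit_rate, dram_ns=100, ssd_us=80):
    """Average access time in microseconds for a two-tier memory model.

    A fraction `hit_rate` of requests is served from DRAM; the remainder
    falls through to SSD-backed storage.
    """
    return hit_rate * dram_ns / 1000 + (1 - hit_rate) * ssd_us

print(avg_access_time_us(0.90))  # ~8.09 us average
print(avg_access_time_us(0.99))  # ~0.90 us average
```

Even a modest improvement in the in-memory hit rate, from 90% to 99%, cuts average access time by roughly an order of magnitude, which is why expanding DRAM capacity has such an outsized effect on responsiveness.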

But it’s not just memory that data centres can use to build greater resilience; storage is another important factor. SSDs, for example, help data centres improve data throughput and lower latency, and, combined with increased DRAM, deliver a more responsive infrastructure.

Future of data centres with DRAM innovations

The ongoing advancements in DRAM and storage technology will continue to be essential for supporting the growth of genAI, complex operating systems, and other emerging technologies. As data centres adopt these memory and storage solutions, they will be better equipped to manage the intense computational loads of AI model training, real-time processing, and the multitasking demands of modern operating systems.

As more applications and operating systems harness AI and advanced functionalities, investment in DRAM and storage will become a strategic imperative for data centres aiming to meet the demands of a digital-first world. This path forward allows not only for the support of today’s requirements but also provides a robust foundation for the future of data-driven innovation.

 
