Breakthrough 1.6 Tbps NR2 AI-SuperNIC and UEC-compliant networking will set a new standard for scalable AI training and inference.
SANTA CLARA, Calif.–(BUSINESS WIRE)–Today at AI Infrastructure Summit, NeuReality, a pioneer in AI infrastructure, unveiled its roadmap for the next-generation 1.6 Tbps NR2® AI-SuperNIC, purpose-built for scale-out networking with full support for Ultra Ethernet Consortium (UEC) specifications and in-network computing. The company also announced the availability of a software upgrade to its current-generation NR1® solution to support the UEC 1.0 specification.
While demand for AI inference is accelerating, organizations face skyrocketing costs, overwhelming complexity and constrained scalability because today’s infrastructure was not designed for the scale and speed of modern AI. These challenges only intensify as AI models grow larger and inputs to generative AI evolve from text to multimodal data (images, audio, video). New AI techniques such as reasoning and chain-of-thought are expanding compute-hungry pipelines, and AI agents are multiplying LLM query volumes by 10–100x.
“Our mission is to drive the architectural shift the AI infrastructure industry needs,” said Moshe Tanach, CEO at NeuReality. “From silicon to systems, our NR1 AI-CPU and the NR2 AI-SuperNIC embody the performance, openness and scalability tomorrow’s AI workloads demand. With UEC compliance and in-network compute at terabit speeds, we’re enabling AI factories to exploit every cycle of their scarce and costly GPUs and XPUs. With NR1, we went after the AI inference head-node problem of driving and managing the GPU. Now with NR2, we’re going after the growing barrier in training and inference – the scale-out network that is reducing the active time of deployed GPUs, wasting dollars and energy.”
Introducing the NR2 AI-SuperNIC
Building on the networking innovations designed into NR1’s embedded AI-NIC, the NR2 AI-SuperNIC raises wire speed to 1.6 Tbps and fuses in-network computing capabilities into the NIC’s datapath pipeline. The in-network computing leverages an upgraded implementation of NR1’s AI-Hypervisor and DSP processors, purpose-built for scalable training and inference infrastructure supporting workloads of any size – from single-rack clusters to giga-factories. These built-in in-network-compute capabilities for both math and non-math collectives significantly improve system compute performance for large arrays of GPUs. Designed to eliminate bottlenecks that limit current high-performance networking, the NR2 AI-SuperNIC provides unmatched Ethernet throughput, efficiency, and scalability for the next generation of AI infrastructure with any GPU or XPU.
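To give a rough sense of why in-network collectives matter at this scale, the sketch below compares an endpoint-driven ring allreduce with a reduction performed inside the network path. It is an illustrative back-of-the-envelope model, not NeuReality code; the payload size and the tree-depth step count assumed for the in-network case are simplifying assumptions.

```python
# Illustrative model only: per-node traffic and step counts for a gradient
# allreduce, comparing an endpoint-driven ring allreduce with an in-network
# reduction. The payload size and tree-depth model are assumptions, not
# vendor data.
import math


def ring_allreduce(num_nodes: int, payload_bytes: float):
    """Classic ring allreduce: each node sends ~2*(N-1)/N of the payload,
    and the operation takes 2*(N-1) communication steps."""
    sent_per_node = 2 * (num_nodes - 1) / num_nodes * payload_bytes
    steps = 2 * (num_nodes - 1)
    return sent_per_node, steps


def in_network_allreduce(num_nodes: int, payload_bytes: float):
    """In-network reduction: each node pushes its payload once and receives
    the reduced result once; the reduction math runs in the network path,
    so steps scale with fabric depth (modeled here as log2(N)) rather than
    with node count."""
    sent_per_node = payload_bytes
    steps = max(1, math.ceil(math.log2(num_nodes)))
    return sent_per_node, steps


if __name__ == "__main__":
    GIB = 1024 ** 3
    payload = 1 * GIB  # hypothetical 1 GiB gradient bucket per node
    for nodes in (8, 64, 1024):
        ring_bytes, ring_steps = ring_allreduce(nodes, payload)
        net_bytes, net_steps = in_network_allreduce(nodes, payload)
        print(f"{nodes:5d} nodes | ring: {ring_bytes / GIB:.2f} GiB/node, "
              f"{ring_steps} steps | in-network: {net_bytes / GIB:.2f} GiB/node, "
              f"{net_steps} steps")
```

The takeaway is that endpoint-driven collectives tie both traffic and step count to cluster size, while moving the reduction into the network holds per-node traffic roughly constant and frees GPU cycles for compute.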
The NR2 AI-SuperNIC is deployable co-packaged with GPUs, on micro-server boards, or as a standalone NIC card, setting a new bar for Ethernet throughput and latency in distributed AI systems. In addition to the original NR1 TCP and RoCEv2 support, it adds UEC Ethernet for ultra-low latency and seamless, end-to-end interoperability across AI inference clusters and factories. The NR2 AI-SuperNIC will be available to select customers in the second half of 2026, with mass production ramping in 2027.
The NeuReality Approach
Just as the GPU evolved from a graphics processor into a heterogeneous compute engine with embedded networking and connectivity, NeuReality has reimagined the CPU for the AI era. It purpose-built the AI-CPU to evolve at the same rapid pace as GPUs and maximize balance and efficiency in AI infrastructure systems.
The first generation NR1 solution is a network-attached heterogeneous compute device. It integrates CPU, DSP, and custom audio/computer vision codec processors inside a novel AI-Hypervisor architecture that offloads inter-processor communication and hardware–software interfacing entirely into hardware, eliminating resource-draining kernel drivers and expensive memory copies. It also integrates an embedded AI-NIC on the same silicon die and enables both north–south and east–west communications, handling client–server traffic and scale-out connections between GPUs across servers, racks, and multi-rack AI factories.
“As AI models grow beyond scale-up limits, traditional networking becomes the bottleneck that slows training, starves accelerators of data, and drives up inefficiency across the cluster,” Tanach explained. “What we need are high-performance, low-latency NICs with in-network compute that can offload communication overhead, orchestrate workloads at scale, and reduce data sets by driving compute into those networks. Our second-generation NR2 AI-SuperNIC provides a path to scaling inference and training infrastructure without being constrained by latency, bandwidth ceilings, or exploding data sets. It’s a breakthrough that clears the roadblocks holding back next-generation AI.”
Looking Ahead
The second-generation NR2 solution moves from a monolithic design to a modular one, with a networking and input/output (I/O) die and a compute die.
The NR2 networking die, or NR2 AI-SuperNIC, will be produced first as a standalone product called NR2n.
The company recently disclosed that the compute die, or NR2 AI-CPU, will support up to 128 cores and will be packaged with the NR2 AI-SuperNIC. The NR2 AI-CPU will be optimized for real-time model coordination, micro-service-based disaggregation, token streaming, KV-cache optimizations, and inline orchestration. Built on Arm Neoverse Compute Subsystems V3, the NR2 AI-CPU will deliver higher single-threaded performance, along with an enhanced memory subsystem, built-in support for advanced interconnect technologies, and optimized software frameworks and libraries.
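For context on why KV-cache handling is called out, the sketch below estimates the KV-cache footprint of a large generative model as context lengths grow. It uses the standard transformer KV-cache formula; the model dimensions are hypothetical assumptions and do not describe any NeuReality-supported configuration.

```python
# Illustrative estimate only: KV-cache footprint of a transformer decoder.
# Per token, the cache holds one key and one value vector per layer:
#   bytes_per_token = 2 * layers * kv_heads * head_dim * bytes_per_element
# The model shape below (80 layers, 8 KV heads, head_dim 128) is a generic
# 70B-class assumption, not a NeuReality specification.


def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, batch: int, bytes_per_elem: int = 2) -> int:
    """Total KV-cache size for a batch of sequences, in bytes."""
    per_token = 2 * layers * kv_heads * head_dim * bytes_per_elem
    return per_token * seq_len * batch


if __name__ == "__main__":
    GIB = 1024 ** 3
    for seq_len in (4_096, 32_768, 128_000):
        size = kv_cache_bytes(layers=80, kv_heads=8, head_dim=128,
                              seq_len=seq_len, batch=32)
        print(f"seq_len={seq_len:>7,}, batch=32 -> KV cache ≈ {size / GIB:.1f} GiB")
```

At long context lengths the cache quickly outgrows a single accelerator’s memory, which is why inference orchestration layers focus on streaming, paging, and disaggregating it across devices.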
Extrapolating the compute, storage and interconnects that AI will require leads to a clear conclusion: cloud infrastructure architecture must be redefined. The long-standing CPU–GPU–NIC partitioning, whether integrated in one package, on a module or in separate sockets, creates structural inefficiencies. Homogeneous CPUs, general-purpose NICs burdened with legacy features, and GPUs bearing the full parallel-compute load leave valuable CAPEX and OPEX underutilized due to persistent bottlenecks.
NeuReality is committed to reshaping AI infrastructure and the chips it requires, with end-to-end efficiency as the first priority, followed by a standards-based, open approach to drive industry collaboration. As a member of the Ultra Ethernet Consortium (UEC) working group, NeuReality not only adopts and supports the specification but is actively shaping its future direction.
Join NeuReality at AI Infra Summit 2025
NeuReality will be at AI Infra Summit in Santa Clara, California, on Sept. 9-11, 2025. The company will host live demos in Booth 612 and in a dedicated workshop, showcasing the efficiency and ROI of GenAI with multimodal inputs using the NR1 inference appliance; UEC communication using the existing NR1 AI-NIC; test drives of cutting-edge models in the NeuReality AI Playground served by the NR1 inference service; and porting a deployed pipeline to NeuReality in a few simple steps.
About NeuReality
Founded in 2019, NeuReality is a pioneer in purpose-built AI inferencing architecture powered by the NR1® Chip – the first AI-CPU for inference orchestration. Based on an open, standards-based approach, the NR1 is fully compatible with any AI accelerator. NeuReality’s mission is to make AI accessible and ubiquitous by lowering barriers associated with prohibitive cost, power consumption, and complexity, and to scale AI inference adoption through its disruptive technology. It employs 80 people across facilities in Israel, Poland, and the U.S. To learn more, visit http://www.neureality.ai.
Contacts
Media Contact:
Joe Livarchik
Voxus PR
[email protected]