Two papers on MoE-specific quantization algorithms accepted at a workshop held in conjunction with ICML 2026
Recognition follows Nota AI’s overall win at the NVIDIA Nemotron Hackathon
Strengthening core optimization technologies to make large-scale AI models smaller and more efficient to run

SEOUL, South Korea, June 11, 2026 /PRNewswire/ — Nota AI, a company specializing in AI model compression and optimization, announced that two of its papers on MoE-specific quantization algorithms have been accepted to the Resource-Adaptive Foundation Model Inference (AdaptFM) Workshop at ICML 2026, one of the world’s leading machine learning conferences.

ICML is widely recognized as one of the premier conferences in machine learning and artificial intelligence, bringing together the latest research from technology companies, leading universities, and major research institutions worldwide. The AdaptFM Workshop focuses on technologies that enable large-scale AI models to run efficiently under limited computing resources. Researchers from companies and research institutions including Amazon and Meta serve on the organizing committee, and researchers from leading AI companies such as NVIDIA, Qualcomm AI Research, OpenAI, Apple, and Microsoft serve on the program committee.

The acceptance recognizes Nota AI’s accumulated technical expertise in optimizing Mixture-of-Experts (MoE) models, an architecture increasingly regarded as a core structure for large language models (LLMs). MoE models improve both performance and efficiency by activating only a subset of their expert sub-networks for each input. However, their more complex structure, in which experts are selected dynamically, requires a different approach to quantization, the process of storing and computing a model at lower numerical precision to make it smaller and more efficient, than conventional model architectures.
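As a rough illustration of the sparse activation described above, the following minimal Python sketch routes one input to the top-scoring experts and mixes their outputs. It is a generic textbook-style example, not Nota AI's implementation; the function names, the toy experts, and the router scores are all hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, router_logits, experts, k=2):
    """Run only the k experts with the highest router scores and mix their outputs.

    Hypothetical helper for illustration: real MoE layers operate on tensors
    and compute router_logits from the input with a learned router.
    """
    # Rank experts by router score and keep the best k (ties keep index order).
    chosen = sorted(range(len(experts)),
                    key=lambda i: router_logits[i], reverse=True)[:k]
    # Gate weights are renormalized over the chosen experts only.
    gates = softmax([router_logits[i] for i in chosen])
    output = sum(g * experts[i](x) for g, i in zip(gates, chosen))
    return output, chosen

# Four toy "experts"; only two of them are evaluated for this input.
experts = [lambda x: x + 1.0, lambda x: 2.0 * x, lambda x: -x, lambda x: x * x]
output, chosen = moe_forward(3.0, [0.1, 2.0, -1.0, 2.0], experts, k=2)
print(chosen, output)
```

Because the router scores decide which experts run at all, an error in those scores can change the computation path rather than merely perturb a value, which is one reason MoE quantization is treated separately from dense-model quantization.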

Nota AI previously won both its track and the overall competition at the NVIDIA Nemotron Hackathon with a data-driven MoE quantization method. With the acceptance of these two papers, Nota AI will once again present research outcomes specifically designed for MoE architectures on a global research stage.

The first accepted paper, “DREAM-MoE,” proposes a method to reduce changes in a model’s decision flow that can occur when large-scale AI models are quantized across multiple segments. The method focuses on the fact that even a small error in an earlier segment can affect expert selection in later segments. DREAM-MoE helps the quantized model select experts in a way that remains closer to the original model.
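The sensitivity described above can be shown with a toy example. The sketch below is not the DREAM-MoE method, whose details are not given here; it only assumes a linear router and a small additive error on the hidden state (standing in for quantization error from an earlier segment), and measures how often expert selection stays the same. All names and numbers are hypothetical.

```python
def router_logits(weights, hidden):
    """Score each expert as the dot product of its router row with the hidden state."""
    return [sum(w * h for w, h in zip(row, hidden)) for row in weights]

def top_k(logits, k):
    """Indices of the k highest-scoring experts, best first."""
    return sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]

def routing_agreement(weights, reference, perturbed, k=1):
    """Fraction of tokens whose selected expert set is unchanged by the perturbation."""
    same = 0
    for ref, per in zip(reference, perturbed):
        ref_set = set(top_k(router_logits(weights, ref), k))
        per_set = set(top_k(router_logits(weights, per), k))
        same += ref_set == per_set
    return same / len(reference)

# Hypothetical router for three experts over a two-dimensional hidden state.
weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]

# Hidden states from the original model: the first token is a near-tie
# between experts 0 and 1, the second clearly prefers expert 0.
reference = [[0.52, 0.48], [0.90, 0.10]]

# Assumed small error introduced upstream, applied to every token.
error = [-0.03, 0.03]
perturbed = [[h + e for h, e in zip(token, error)] for token in reference]

before = [top_k(router_logits(weights, t), 1) for t in reference]
after = [top_k(router_logits(weights, t), 1) for t in perturbed]
agreement = routing_agreement(weights, reference, perturbed, k=1)
print(before, after, agreement)
```

In this toy setup the same small error leaves the confident token's routing untouched but flips the near-tie token to a different expert, so only half of the tokens keep their original expert.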

The second paper, “SRA-MoE,” proposes a method that identifies and prioritizes important inputs that have a greater impact on the model’s final output. Rather than treating all inputs equally, SRA-MoE is designed to prevent expert selection from being significantly disrupted for these key inputs, helping maintain model quality more effectively under limited resources.

Both studies reported higher performance than the latest MoE-specific quantization methods. The results indicate that large-scale AI models can run with less memory and fewer computing resources while limiting quality degradation. As the cost, power consumption, and hardware burden of running large AI models continue to increase, MoE-specific quantization technologies are becoming increasingly important.

Nota AI has focused its R&D efforts on optimizing large AI models that require substantial memory and computing resources. The company is advancing large-scale model optimization, including Solar MoE, as part of the sovereign foundation model project led by the Upstage consortium. It is also extending its experience in quantizing NVIDIA Nemotron 3 Nano to newer large models such as Nemotron Ultra, further broadening the scope of its optimization technologies.

“This paper acceptance reflects Nota AI’s continued advancement of MoE-specific quantization technologies,” said Myungsu Chae, CEO of Nota AI. “Following our overall win at the NVIDIA Nemotron Hackathon, we are pleased to present our research at the ICML 2026 AdaptFM Workshop. We will continue developing optimization technologies that enable large-scale AI models to be used more efficiently and practically.”

In addition, Nota AI will host “Nota AI – Korea Efficient Days” during ICML 2026 at COEX in Seoul. The event will bring together global researchers, engineers, and business leaders visiting Korea to share research trends and industrial applications of Efficient AI. Through the event, Nota AI plans to introduce its research achievements in large-scale AI model optimization and expand opportunities for technical collaboration and business engagement.
