Conversational AI

Sber Presents GigaChat, a Next-Generation AI Assistant Built on Its Flagship Model

Sber has launched a major update to its AI assistant, GigaChat, now powered by the new flagship GigaChat Ultra model. The updated assistant can remember user facts for more personalized communication, search the web on its own, and generate responses twice as fast.

The release also creates new opportunities for developers to build applied AI products and services using GigaChat Ultra. Users can run code directly in the interface and ask the assistant about its own capabilities, which it answers by relying on current documentation.

Anton Frolov, senior vice president, head of Generative AI Development, Sberbank:

“We are moving beyond being just an answer-giving tool to becoming a multi-agent AI assistant. But our vision goes even further: we are creating a future where traditional mobile apps give way to neural-network-based interfaces. Needed features will appear upon request, making navigation through the digital world seamless. GigaChat Ultra is one of the world’s largest models fully developed and trained in Russia. It remembers your preferences, works faster, understands tasks more deeply, and delivers higher-quality recommendations. We’re removing the last barriers between humans and machines.”

Long-term memory

One key innovation is long-term memory. While contextual (short-term) memory is limited to a single conversation session and resets when it ends, GigaChat’s long-term memory operates differently—it retains user-specific facts across sessions and uses them in subsequent conversations.

GigaChat can remember:

  • hobbies, tastes, and interests; 
  • profession, education, life goals, and habits; 
  • personal data shared by the user; 
  • information about family members and pets.

The system automatically identifies significant facts without overloading memory with trivialities such as short-term plans or widely known general knowledge. All data is stored in a unified profile that is synchronized across the web version, the mobile apps, and the Telegram bot via Sber ID sign-in. Users keep full control of this feature: memory can be enabled or disabled at any time in settings.
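The behavior described above can be illustrated with a minimal sketch. It is not GigaChat's implementation: the category names, the `UserProfile` class, and the `get_profile` helper are hypothetical, and a real system would have the model classify facts rather than rely on a fixed category list.

```python
from dataclasses import dataclass, field

# Hypothetical durable categories, loosely following the list above.
DURABLE_CATEGORIES = {"interests", "profession", "personal", "family"}


@dataclass
class UserProfile:
    memory_enabled: bool = True                # user-controlled switch
    facts: dict = field(default_factory=dict)  # category -> list of facts

    def remember(self, category, fact):
        """Accept a fact only if memory is on and the category is durable."""
        if not self.memory_enabled or category not in DURABLE_CATEGORIES:
            return False
        bucket = self.facts.setdefault(category, [])
        if fact not in bucket:                 # skip exact duplicates
            bucket.append(fact)
        return True

    def recall(self):
        """Facts to carry into a new session; empty when memory is off."""
        if not self.memory_enabled:
            return []
        return [f for bucket in self.facts.values() for f in bucket]


# One profile per account ID stands in for a profile shared by all clients.
profiles = {}


def get_profile(account_id):
    return profiles.setdefault(account_id, UserProfile())


profile = get_profile("user-1")
profile.remember("family", "has a dog named Rex")         # stored
profile.remember("short_term_plan", "buy milk tonight")   # ignored
print(get_profile("user-1").recall())
```

Keying the profile on a single account ID is what lets any client that signs in with the same account see the same facts, and the `memory_enabled` flag gates both writing and reading.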

Response generation speed doubled

GigaChat generates text responses twice as fast as Sber’s previous flagship model. Users see replies sooner, even for complex queries that require detailed reasoning: the result appears almost instantly.

This speed increase was made possible by a Mixture of Experts (MoE) architecture. The model works like a team of specialists, with only the most relevant “experts” activated for each query, instead of the entire system working at once.
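As a rough illustration of the idea, here is a toy sketch of top-k expert routing, a common way to build MoE layers. It makes no claim about GigaChat Ultra's actual router: the gate weights, the scaling "experts", and the choice of `top_k=2` are all assumptions made for the example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    """Hypothetical expert: multiplies every input feature by `scale`."""
    def expert(x):
        return [scale * v for v in x]
    return expert


def moe_forward(x, gate_weights, experts, top_k=2):
    """Route x to the top_k highest-scoring experts and mix their outputs."""
    # Router: one dot product per expert gives a relevance score.
    scores = [sum(w * v for w, v in zip(row, x)) for row in gate_weights]
    # Keep only the indices of the top_k experts; the rest are never called.
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    # Mixing weights are normalized over the chosen experts only.
    weights = softmax([scores[i] for i in chosen])
    output = [0.0] * len(x)
    for weight, index in zip(weights, chosen):
        expert_out = experts[index](x)
        output = [o + weight * v for o, v in zip(output, expert_out)]
    return output, chosen


experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
output, chosen = moe_forward([2.0, 1.0], gate_weights, experts, top_k=2)
print(chosen, output)
```

With four experts and `top_k=2`, only half of the expert computation runs for this input, which is where the speed-up in this kind of design comes from.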

Real-time conversation mode

GigaChat now turns to web search on its own when it needs up-to-date information, so users no longer have to enable this option manually. This helps keep responses accurate when discussing recent news, stock quotes, and other fast-changing data. Search includes a dedicated rephraser, a system that reformulates user queries to improve relevance and the quality of the final response.

Online search is now available in voice mode as well. Conversations are truly interactive: users can interrupt the model, clarify details, or switch topics instantly, with context shifts handled smoothly in real time. After the session ends, a full transcript of the conversation is saved.

GigaChat knows everything about itself

A self-awareness mechanism has been added to GigaChat, enabling the model to answer questions about its own characteristics correctly. When responding to such questions, the model refers to up-to-date documentation describing its current version, supported features, limitations, and behavior. This helps avoid a common language-model error: giving incorrect or outdated information about its own abilities, for example by claiming features that do not exist or failing to recognize ones that do.

Code interpreter: GigaChat as an analytical environment

An integrated code interpreter gives GigaChat an isolated execution environment for running code right inside the assistant’s interface. Before this feature was introduced, the model could only write code and display it to users; running it and checking the results required external tools. Now, GigaChat generates code and executes it immediately within a secure sandbox, without affecting the user’s system.

The interpreter supports uploaded files, advanced numerical calculations, data structure validation, and direct chart creation in chats. This makes GigaChat a comprehensive analytical tool well suited for reports, tables, and large datasets.
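The generate-then-execute loop can be sketched as follows. This is an illustration under stated assumptions, not the product's sandbox: `run_snippet` is a hypothetical helper, and a child process with a timeout and a throwaway working directory is only a weak stand-in for real isolation, which would typically rely on containers or virtual machines.

```python
import subprocess
import sys
import tempfile


def run_snippet(code, timeout_s=5.0):
    """Run Python code in a child process.

    Returns (exit_code, stdout, stderr); exit_code is None on timeout.
    """
    # A temporary directory keeps any files the snippet writes away from
    # the caller's working directory and is deleted afterwards.
    with tempfile.TemporaryDirectory() as workdir:
        try:
            result = subprocess.run(
                [sys.executable, "-I", "-c", code],  # -I: Python's isolated mode
                cwd=workdir,
                capture_output=True,
                text=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child before raising this exception.
            return None, "", "timed out"
        return result.returncode, result.stdout, result.stderr


exit_code, stdout, stderr = run_snippet("print(sum(range(10)))")
print(exit_code, stdout.strip())
```

Capturing the output instead of letting it reach the terminal is what allows the assistant to show results, or errors, back in the chat.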

The training process

Training followed three stages. First, the knowledge base was broadened with academic books and materials on mathematics and programming, while multilingual data was expanded to cover ten languages. In the intermediate stage, specialized skills were enhanced: the code corpus was enlarged; data on physics, medicine, and finance and records of actual dialogs were added; and security measures were strengthened. Final tuning on examples (editor texts, dialogs triggering functions, system prompts) ensured stable performance under real-world conditions.

Significant improvements were recorded in open-ended and closed-ended question answering, along with tasks demanding sophisticated logical reasoning. Benchmark tests for Russian-language use demonstrated high levels of grammatical correctness, natural speech flow, readability, and structured responses. Enhancements also extended to practical industry scenarios: the model became more adept at legal, cybersecurity, medical, financial, and trade-related tasks—especially those involving Russian-specific nuances and sectoral terminology. Notable progress was made in mathematical computation and code generation, expanding its usefulness in fintech, education, and development.

Flagship model to be released publicly

Sber will make the source code and weights of its flagship GigaChat Ultra model freely accessible. According to the company’s experts, it already outperforms DeepSeek V3.1, Qwen3-235B, and its predecessor GigaChat 2 Max in Russian-language tasks, math, and general reasoning. Once the repository is released, organizations ranging from large banks to small startups will be able to deploy the neural network in their private environments and adapt it to corporate data, a step toward genuine technological sovereignty.

Users can try the updated model free of charge in the web version, Android apps available in RuStore and AppGallery, as well as in the Telegram bot and MAX messenger. To activate voice mode and memory, sign in with Sber ID and switch on the desired options in profile settings.
