SAN FRANCISCO, Jan. 22, 2026 /PRNewswire/ -- FlashLabs, an applied AI research and engineering lab building real-time agentic systems, today announced the release of Chroma 1.0, the world’s first open-source, end-to-end, real-time speech-to-speech AI model with personalized voice cloning.
Chroma is built to remove one of the largest bottlenecks in human-AI interaction: latency. By operating natively in voice, without the traditional ASR → LLM → TTS pipeline, Chroma enables natural, fluid conversations that feel immediate, responsive, and human.
“Voice is the most universal interface in the world, yet it has remained closed, fragmented, and delayed,” said Yi Shi, Founder and Chief Research & Engineering at FlashLabs. “With Chroma, we’re open-sourcing real-time voice intelligence so builders, researchers, and companies can create AI systems that truly work at human speed.”
Built for Real-Time, Not Post-Processing
Unlike conventional voice systems that stitch together multiple components, Chroma is natively speech-to-speech, enabling:
- End-to-end time to first token (TTFT) under 150 ms
- Natural conversational turn-taking
- Low-latency emotional and prosodic control
- Stable real-time inference without cascading delays
With Day-0 SGLang support, Chroma further reduces latency and improves throughput, achieving approximately 135ms end-to-end TTFT and real-time factors optimized for live deployment.
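The release does not say how these latency figures were measured. As a rough illustration only, the sketch below shows one common way to compute TTFT and a real-time factor (RTF) from timestamps, where RTF is generation time divided by the duration of the generated audio and values below 1.0 mean faster than real time. The `TurnTiming` class, its fields, and the example numbers are hypothetical and are not part of the Chroma codebase.

```python
from dataclasses import dataclass


@dataclass
class TurnTiming:
    """Timestamps in seconds for one conversational turn (hypothetical structure)."""
    request_sent: float     # when the user's audio was submitted
    first_output: float     # when the first output token or audio chunk arrived
    generation_done: float  # when the last output chunk arrived
    audio_seconds: float    # duration of the generated speech


def ttft_ms(t: TurnTiming) -> float:
    """Time to first token in milliseconds."""
    return (t.first_output - t.request_sent) * 1000.0


def real_time_factor(t: TurnTiming) -> float:
    """Generation time divided by generated audio duration (below 1.0 is faster than real time)."""
    if t.audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return (t.generation_done - t.request_sent) / t.audio_seconds


# Illustrative numbers only: first output after 135 ms, 4 s of speech generated in 2 s.
example = TurnTiming(request_sent=0.0, first_output=0.135,
                     generation_done=2.0, audio_seconds=4.0)
print(f"TTFT: {ttft_ms(example):.0f} ms, RTF: {real_time_factor(example):.2f}")
```

An RTF below 1.0 is what allows audio to be streamed to the listener without gaps, which is why it is usually reported alongside TTFT for live systems.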
High-Fidelity Voice Cloning in Seconds
Chroma introduces few-second reference voice cloning, allowing users to generate highly realistic, personalized voices from minimal audio input.
In internal evaluations:
- Speaker similarity score (SIM): 0.817
- +10.96% above human baseline (0.73)
- Best-in-class performance among both open and closed baselines
These results demonstrate that high-quality voice identity no longer requires large datasets or long fine-tuning cycles.
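For readers who want to check the arithmetic, here is a minimal sketch. It assumes that SIM is a cosine similarity between speaker embeddings, which is a common convention but is not stated in the release, and it uses toy vectors in place of real embeddings. The function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def relative_gain_percent(score, baseline):
    """Relative improvement of score over baseline, in percent."""
    return (score - baseline) / baseline * 100.0


# Toy embeddings (assumed values, not real speaker vectors).
reference = [0.2, 0.9, 0.4]
cloned = [0.25, 0.85, 0.45]
print(f"toy SIM: {cosine_similarity(reference, cloned):.3f}")

# Relative gain over the 0.73 human baseline for two roundings of the score.
print(f"{relative_gain_percent(0.81, 0.73):.2f}%")
print(f"{relative_gain_percent(0.817, 0.73):.2f}%")
```

Computed from the score rounded to 0.81, the gain over the 0.73 baseline is 10.96%, which matches the quoted figure. Computed from 0.817, it comes to about 11.92%.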
Strong Reasoning at Efficient Scale
Despite using compact ~4B-parameter architectures, Chroma delivers strong reasoning and dialogue capabilities by leveraging modern multimodal backbones and optimized real-time inference. This makes it suitable for edge deployment, agents, call centers, and interactive systems where latency and cost matter.
Applications
Chroma enables a new class of real-time voice applications, including:
- Autonomous voice agents
- AI call centers
- Real-time translators
- Conversational assistants
- Interactive characters and NPCs
- Multimodal AI systems
Availability
Chroma 1.0 is available today:
- Open-source release:
  - Paper + benchmarks: https://arxiv.org/abs/2601.11141
  - Models: https://huggingface.co/FlashLabs/Chroma-4B
  - Inference code: https://github.com/FlashLabs-AI-Corp/FlashLabs-Chroma
- Live deployment: FlashAI Voice Agents
About FlashLabs
FlashLabs is an applied AI research lab focused on real-time, agentic, and multimodal intelligence. The team builds open and production-grade systems that power autonomous agents across voice, text, and action.
Media Contact:
Koki Kobayashi
6506097501
[email protected]
View original content to download multimedia: https://www.prnewswire.com/news-releases/flashlabs-releases-chroma-1-0-the-worlds-first-open-source-end-to-end-real-time-voice-ai-model-302667072.html
SOURCE FlashLabs


