
Built on a new architecture, KumoRFM-2 achieves state-of-the-art results across 41 predictive tasks and four major benchmarks, with zero feature engineering, zero task-specific training, and natural-language querying for all of them.
MOUNTAIN VIEW, Calif., April 14, 2026 /PRNewswire/ -- Kumo, a leader in predictive AI, today announced the launch of KumoRFM-2, the first foundation model to outperform fully supervised machine learning on enterprise relational data. Built by the team that created PyTorch Geometric (the most widely used library for graph machine learning, with 23,700+ GitHub stars and 1.2M+ monthly PyPI downloads), KumoRFM-2 replaces months of feature engineering and dedicated model builds with a single model that any team can query in plain English. It requires zero training and scales to 500 billion+ rows of data.
The implications are significant: predictions that previously required PhD-level data scientists, 3 to 6 months of feature engineering, and a custom-trained model for every predictive task can now be generated instantly by anyone in the organization. On Stanford RelBenchV1, KumoRFM-2 outperforms its predecessor by 10% and surpasses the strongest supervised machine learning model by 5% across both classification and regression tasks. On the SAP SALT enterprise benchmark, KumoRFM-2 achieves state-of-the-art results, surpassing tabular model ensembles such as AutoGluon as well as recent tabular foundation models by a wide margin. Performance further improves by 13% upon fine-tuning.
“Kumo.ai has transformed how we approach lead scoring at Databricks. Since deploying their platform, we’ve seen conversion rates from leads to opportunities improve from 1.2x to 6x, and we’ve doubled the volume of high-intent, quality leads entering our pipeline. The impact on our marketing performance has been substantial,” said Anoop Muraleedharan, Sr Director Data & Analytics, Databricks.
Every current approach to predictive AI on enterprise data faces the same fundamental problem: the most valuable predictive signal lives in the relationships across multiple tables in a data warehouse, but every existing tool, including LLMs, XGBoost, and tabular foundation models, destroys those relationships by flattening multi-table data into a single table before modeling even begins. KumoRFM-2 is the only foundation model that preserves these relationships natively, working directly on the graph of connected tables without flattening. Built on a new Relational Graph Transformer architecture published at ICLR 2026, the model processes data at 5 GB/sec with 20 million lookups per second, and delivers predictions across industries.
“Enterprise data – customer records, transactions, product catalogs – holds enormous untapped revenue potential. Until now, using that data to generate business predictions required months of feature engineering and deep data science expertise, putting it out of reach for most teams,” said Dr. Vanja Josifovski, Co-Founder and CEO at Kumo. “KumoRFM-2 changes that: it’s the only model that actually understands the relationships across your tables instead of destroying them, it scales to hundreds of billions of rows, and it lets any team ask predictive questions in natural language. No feature engineering. No data science expertise required.”
“For years, AI has been constrained by a fundamental limitation: it cannot reason over structured enterprise data. A database is not a document; it is a graph of relationships,” said Dr. Jure Leskovec, Co-Founder and Chief Scientist at Kumo. “KumoRFM-2 is the first model that sees the full graph. We developed Relational Graph Transformers, where the AI model can attend to any datapoint, preserving the complete structure of relational data at arbitrary scale. And by adding a natural language interface, we make it possible for teams across the organization to ask not just what happened, but what will happen next, and why.”
KumoRFM-2 was developed by a founding team with more than two decades of experience shaping modern machine learning and deploying AI at scale. The leadership team includes Co-Founder and CEO Dr. Vanja Josifovski, former CTO of Airbnb and Pinterest, who has extensive experience scaling AI systems for hundreds of millions of users; Co-Founder and Chief Scientist Dr. Jure Leskovec, a Stanford professor and pioneer of relational deep learning whose work underpins KumoRFM-2’s architecture; and Co-Founder and Head of Engineering Dr. Hema Raghavan, former Senior Director of Engineering at LinkedIn, who leads the company’s engineering and product execution, bringing cutting-edge research into enterprise-ready systems.
KumoRFM-2’s key breakthroughs include:
- First foundation model to outperform task-specific supervised ML models. Across 41 predictive tasks on four major benchmark suites, KumoRFM-2 is the first few-shot foundation model to surpass task-specific supervised approaches on common benchmark tasks. It outperforms the best single-table foundation model, which operates on a single flat table without relational context, by 18%, LLM-based approaches by more than 10%, and the best supervised relational models by 1.5%, effectively automating the “Data Scientist” role in the feature engineering pipeline.
- Zero training required with extraordinary data efficiency. KumoRFM-2 achieves state-of-the-art results through in-context learning alone. No task-specific training, no feature engineering, no model building. The model is remarkably data-efficient, using as little as 0.2% of the labeled data that supervised approaches require (context examples vs. full training sets), making it dramatically faster and more practical than any existing approach.
- Scales to 500 billion+ rows. While the original KumoRFM was limited to small-scale in-memory datasets, KumoRFM-2 scales to billion-scale relational databases. A custom graph engine with database connectors pushes computation directly to the data layer, building a memory-mapped data structure that enables 5 GB/sec throughput and 20 million lookups/sec for low-latency inference and fine-tuning at production scale. KumoRFM-2 connects directly to SQL databases and cloud data warehouses, including Snowflake, Databricks, and Spark.
- 0.89 MRR on SAP SALT, improving the state of the art by 13%. On the SAP SALT benchmark, which reflects real-world ERP data with approximately 5 million records, KumoRFM-2 achieves 0.89 MRR when fine-tuned, surpassing large tabular model ensembles such as AutoGluon (0.77) and the best-performing baseline, CARTE (0.79), with a single model.
- The only model that works on both single-table and multi-table data. KumoRFM-2 is the only foundation model that operates natively on both single-table and multi-table structured data. Every competing approach requires flattening multi-table data into a single table, destroying the cross-table relationships that are the most valuable predictive signal in enterprise data.
- New architecture. This new architecture replaces Graph Neural Networks (which are limited to local neighborhoods and lose information across hops) with a transformer-based approach that preserves the ability to attend across row, column, foreign key, and cross-sample dimensions. This eliminates the information bottleneck of message-passing architectures while scaling to large context sizes.
- Natural-language interface and agent-ready design. Users can ask predictive questions across hundreds of use cases and receive predictions in plain language with explanations of the factors that influenced each result. The system translates natural language into Kumo’s Predictive Query Language (PQL), a structured intermediate representation that also serves as a composable primitive for AI agents, enabling predictive modeling to be stacked and enriched with retrieved information.
- Robustness to noise, missing data, and structural degradation. Ablation studies show KumoRFM-2 maintains high accuracy even under extreme conditions: only a 6% performance drop at high feature-dropout levels (compared to 17% for single-table models), stable accuracy even when 75% of relational edges are removed, and essentially constant performance under heavy injection of noisy columns. By aggregating information across the relational graph, the model effectively “fills in” missing information from neighboring entities and tables.
- Pre-trained on synthetic and real-world data with zero leakage. KumoRFM-2 is pre-trained on an expanded combination of synthetic data and real-world relational databases. The model has not seen any of the evaluation datasets during pre-training, guaranteeing no leakage of information. Pre-training progresses in multiple stages, transitioning from simpler tabular settings to more complex relational structures.
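For a sense of what the Predictive Query Language intermediate representation looks like, a churn-style question such as "which customers will make no purchases in the next 30 days?" might compile to a query of roughly the following shape. This is a hedged illustration only: the table and column names are hypothetical, and the authoritative PQL syntax is defined in Kumo's documentation:

```
PREDICT COUNT(transactions.*, 0, 30, days) = 0
FOR EACH customers.customer_id
```

The structured form is what makes the language composable for agents: the prediction target and the entity set are explicit fields an agent can fill in, rather than free text.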
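To make the "attend to any datapoint" idea above concrete, here is a minimal sketch of scaled dot-product attention over a toy set of row embeddings. This is the generic transformer building block that such architectures are built from, not Kumo's actual Relational Graph Transformer; the table/row setup and dimensions are invented for illustration:

```python
import numpy as np

def attention(Q, K, V):
    # Scaled dot-product attention: every query attends to every key in one
    # step, so information flows directly between any two rows rather than
    # being squeezed through multi-hop message passing.
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    # Numerically stable softmax over each query's scores.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
d = 8
# Toy embeddings: imagine 3 rows from a "customers" table and 4 rows reached
# via foreign keys from an "orders" table, all placed in one attention context.
tokens = rng.normal(size=(7, d))
out = attention(tokens, tokens, tokens)
print(out.shape)  # (7, 8): one updated embedding per input row
```

The key contrast with message-passing GNNs is visible in the `scores` matrix: it is dense, so a customer row can exchange information with any order row in a single layer instead of across several hops.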
The company is backed by Sequoia Capital. Kumo’s investor and advisor network includes Frank Slootman (Snowflake Board), Sridhar Ramaswamy (CEO, Snowflake), Ben Silbermann (Founder, Pinterest), Matei Zaharia (CTO & Co-Founder, Databricks), Tristan Handy (CEO, dbt Labs), and more than 20 additional leaders from Discord, Amazon, Apple, and leading venture firms.
About Kumo
Kumo is the creator of KumoRFM, the first foundation model built for structured business data. Pre-trained on billions of relational patterns, KumoRFM delivers zero-shot predictions on enterprise data with no training or feature engineering required. Kumo was founded by Dr. Vanja Josifovski (former CTO of Airbnb and Pinterest), Dr. Jure Leskovec (Stanford professor, pioneer of Relational Deep Learning, and former Chief Scientist at Pinterest), and Dr. Hema Raghavan (former AI lead at LinkedIn; Inc.’s 2026 Female Founders 500). Kumo’s team created PyTorch Geometric (23,700+ GitHub stars, 21M+ downloads) and has published foundational research at NeurIPS, ICML, and ICLR. Backed by Sequoia Capital, Kumo is deployed in production at DoorDash, Snowflake, Databricks, Reddit, Coinbase, and Sainsbury’s. To learn more, visit kumo.ai.
View original content to download multimedia:https://www.prnewswire.com/news-releases/kumo-launches-kumorfm-2-the-first-foundation-model-to-outperform-machine-learning-on-enterprise-data-scaling-to-500-billion-rows-302741975.html
SOURCE Kumo.AI



