The AI industry loves talking about models. Every few months, a new benchmark, a larger parameter count, or a bigger training run grabs the spotlight. But that narrative overlooks a growing problem beneath the surface. Before AI runs out of compute, it may run into something far more fundamental: a shortage of the training data needed to keep AI models improving.

The internet is running out of high-quality human-generated text, and AI models are beginning to feel the effects. Researchers estimate that the supply of high-quality public text data could be effectively exhausted for training purposes before the end of this decade. As this ‘AI data scarcity’ deepens, some frontier labs are already scraping the same sources repeatedly. Duplication is increasing, and the gap between what models need and what the public web can provide widens with every new training run.

The industry’s response has been synthetic data—AI-generated content used to train AI systems. On paper, it solves the scarcity problem by making data abundant on demand. In practice, when models are trained on AI-generated output rather than human-generated reality, errors, biases, and distortions can propagate into the next generation of models. Moreover, as synthetic data volumes grow, verifying where training data originated, whether it is trustworthy, and whether it has already contaminated upstream models becomes increasingly difficult.

Jeetu Patel, Cisco’s President and Chief Product Officer, believes the current framing, in which synthetic data is the solution to human data exhaustion, is incomplete. “When it comes to human-generated data to train AI models, we have done a pretty good job,” he tells me. “But we haven’t done a good job when it comes to machine-generated data, and it is a massively untapped opportunity. It is clear that we need a machine data platform for the AI era.”

He argues that the next generation of AI, particularly robotics, physical AI, and large world models, cannot be trained on internet-scraped text alone. Language models benefited from vast amounts of freely available human-generated text accumulated over decades of web publishing. Embodied systems, by contrast, require spatial, physical, and temporal understanding, and that data does not exist at internet scale. It has to be created as a purpose-built training substrate, because these systems have no equivalent real-world dataset to draw from in the first place.

Patel, who oversees a 30,000-person product organization, previously pledged that Cisco would be “unrecognizable” within two years. This year, Cisco Live in Las Vegas offered a verdict: he has largely delivered on that promise. The company unveiled Cisco Cloud Control, a unified platform designed to give humans and AI agents shared operational context across Cisco’s technology stack through a single login.

In conversation with The Control Layer, Patel revealed why AI adoption is not moving faster inside the enterprise. “The question is not whether AI will take your job,” Patel says. “It’s whether someone using AI better than you will.”

The Data Problem Nobody Is Pricing In

Patel argues that the industry’s next data challenge isn’t about collecting more information. It’s about creating the information that doesn’t yet exist. World models capable of spatial reasoning and embodied action require training environments that often cannot be gathered from the real world. They have to be simulated. In that framework, synthetic data becomes a foundational requirement for the next phase of AI.

Critics argue that Cisco benefits from both sides of the equation. The company sells the infrastructure generating the data and the platform designed to manage it. Patel doesn’t dispute the premise. Instead, he says the industry is focused on the wrong bottleneck.

“As agents get more prolific and operate around the clock, there will be continued exponential growth in machine-generated data,” he says. “At Cisco, it turns out we are at the center of all of this. The first wave of AI was powered by vast amounts of human-generated internet text. But the next wave—robotics, physical AI, large world models—doesn’t have an equivalent internet-scale dataset to draw from.”

You can’t scrape the physical world the way you scraped the web. For systems that need spatial reasoning, embodied action, and world simulation, the training data often has to be created rather than collected.

Cisco’s Splunk acquisition forms the backbone of what executives call its machine-data thesis. Cisco Data Fabric—an extension of the Splunk platform for federated, cross-domain data management—entered alpha in February 2026.

“The context gap is the real problem,” Patel says. “AI agents need rich, relevant, continuously refreshed context to make high-quality decisions. Historically, that context has come from very large training runs or from expanding context windows at inference time, but those approaches are expensive and increasingly insufficient for enterprise use. The more durable answer is combining public information with enterprise data: the proprietary, operational data that reflects real workflows and real environments.”

He added that risks around provenance, auditability, and data quality do not disappear when the data is synthetic. They become more acute.

Why the Network Is the Bottleneck, Not the Chip

The question that seems to frustrate Patel most is also the one he hears most often: Is Cisco’s argument that networking is becoming AI’s most critical infrastructure layer a technical conviction—or simply a sales pitch? His answer is architectural.

“GPUs only create value collectively if they can communicate effectively,” he says. “Memory sharing, coordinated computation, multi-GPU servers, racks of servers, rows of racks, geographically distributed data centers—all of it depends on the network as the connective tissue. Without that connective tissue, large-scale AI systems do not function.”

Cisco’s Silicon One G300, announced in February 2026, delivers 102.4 terabits per second of full-duplex switching capacity and contains roughly a quarter-trillion transistors. As of fiscal Q3 2026, Cisco had secured two additional hyperscaler design wins on its Silicon One P200 platform for scale-across deployments, with AI orders for fiscal Q4 projected to exceed $3.6 billion.

The broader bet is on what Cisco calls “scale-across” architecture, linking AI compute spread across hundreds of kilometers into a single logical system. It is, in many ways, a direct challenge to NVIDIA’s dominance in AI networking through InfiniBand. Cisco is betting that Ethernet, backed by the Ultra Ethernet Consortium, can become the open alternative. Patel also pointed to coherent optics as a critical technology as copper approaches its physical limits under AI-scale bandwidth demands.

“Every agentic action is a routing challenge, a trust decision, and a telemetry event,” Patel says. “Humans click, but agents swarm. The sheer volume of infrastructure required to support a future with trillions of autonomous agents working around the clock is meaningfully higher than anything that has existed before.”

If Patel is right about data, his next claim is even harder to ignore. In his view, global data center build-outs could require investments approaching $5 trillion in the coming years. “In the long term, if you think about this as a seven-to-ten-year window, we are grossly underestimating the capacity required to fulfill the needs of AI. Underestimating, not overestimating. We are still trailing demand with supply.”

Inside Cisco’s AI-First Culture

While many executives are still debating how aggressively to embrace AI, Patel has a single data point he returns to when skeptics push back. A unified Cisco platform project originally scoped for seven to twelve years of engineering work shipped in nine months as Cisco Cloud Control. That compression happened because the company shifted to what Patel calls spec-driven development — three humans and five AI agents replacing a team of eight, with output tripling under that configuration.
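As a rough check on those figures, the sketch below works through the arithmetic in Python. It assumes the seven-to-twelve-year estimate refers to calendar time rather than person-years, and that "output tripling" compares the new configuration against the original team of eight. Both readings are assumptions, and the function names are illustrative only.

```python
def compression_ratio(planned_years: float, actual_months: float) -> float:
    """How many times faster the project shipped than originally scoped."""
    return (planned_years * 12) / actual_months


def per_human_multiplier(old_humans: int, new_humans: int,
                         output_multiplier: float) -> float:
    """Change in output per human when total team output scales by output_multiplier."""
    before = 1.0 / old_humans               # baseline output per human (team output = 1.0)
    after = output_multiplier / new_humans  # output per human in the new configuration
    return after / before


# Assumption: the 7-to-12-year scope is calendar time.
low = compression_ratio(7, 9)
high = compression_ratio(12, 9)

# Assumption: three humans (plus agents) produce 3x the output of the old team of eight.
per_human = per_human_multiplier(old_humans=8, new_humans=3, output_multiplier=3.0)

print(f"Schedule compression: {low:.1f}x to {high:.1f}x")
print(f"Output per human: {per_human:.1f}x")
```

Under those assumptions, nine months against a seven-to-twelve-year plan is roughly a 9x to 16x compression, and three humans producing triple the output of eight implies about 8x more output per person.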

“We didn’t hedge on AI. We said we’re going to go all in,” he says. “Large companies don’t fail to experiment — they succeed at experiments and then refuse to go all in, always wanting to hedge, always wanting more data before committing. The firms getting the most benefit today are the ones who leaned forward, experimented early, and built institutional instincts around AI. A wait-and-see strategy isn’t prudent — it’s the wrong approach. The cost of hesitation compounds every quarter you delay.”

Patel has committed to at least six fully AI-written product releases by the end of 2026, with a target of 70% of all Cisco products written entirely by AI by the end of 2027. The first fully AI-written product — AI Defense — has already shipped.

Humans write the specifications. They review the code. The bottleneck, he argues, has fundamentally shifted: “It is no longer around the writing of the code. The bottleneck is going to be around the reading and reviewing of the code.”

He is unsparing about what non-adoption looks like from his vantage point: “Engineers who do not use AI for coding may become unemployable within Cisco within roughly 18 to 24 months.” The productivity gap between AI-fluent and non-fluent workers, he says, reaches 50x to 100x in some contexts.

He explicitly disagrees with Anthropic CEO Dario Amodei’s thesis that AI could eliminate close to half of all entry-level white-collar jobs within five years. “I disagree with that thesis. New job categories will emerge that do not yet have names.” The real risk is the lag between role displacement and workforce retraining — and he treats closing that lag as a social obligation that the technology industry cannot outsource to governments.

Patel also noted that tokenomics is the newest issue, and the one most enterprises are currently misinterpreting. “The costs of AI tokens are far higher than the actual value those tokens are generating at scale,” he says. It is a notable concession from the president of a company whose customers pay for the infrastructure those tokens run on.

An AI-empowered employee consuming approximately $200 in tokens per week incurs roughly $10,000 per year in AI spend before any productivity gain appears on the ledger. But Patel’s frame is that companies are misreading temporary inefficiency as structural failure.
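The per-person figure is simple arithmetic: $200 a week over 52 weeks comes to $10,400, in line with the roughly $10,000 cited above. A minimal sketch follows. The break-even helper and the $200,000 fully loaded employee cost are hypothetical inputs chosen for illustration, not numbers from Patel or Cisco.

```python
WEEKS_PER_YEAR = 52


def annual_token_spend(weekly_spend_usd: float, weeks: int = WEEKS_PER_YEAR) -> float:
    """Annual AI token spend for one employee, in USD."""
    return weekly_spend_usd * weeks


def breakeven_gain(annual_spend_usd: float, loaded_cost_usd: float) -> float:
    """Fraction of an employee's cost that must be recovered to cover token spend."""
    return annual_spend_usd / loaded_cost_usd


per_person = annual_token_spend(200)

# Hypothetical fully loaded cost of one employee; not a figure from the article.
HYPOTHETICAL_LOADED_COST_USD = 200_000
required_gain = breakeven_gain(per_person, HYPOTHETICAL_LOADED_COST_USD)

print(f"Annual token spend per person: ${per_person:,.0f}")
print(f"Productivity gain needed to break even: {required_gain:.1%}")
```

With that hypothetical cost, the token bill is covered once the employee is about 5% more productive, which is one way to read Patel's argument that early heavy usage is a temporary inefficiency rather than a structural one.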

“Learning to use AI well is like learning to ride a bicycle,” he says. “You cannot become efficient immediately because you first have to learn. Some degree of heavy use early on is not only normal — it is necessary, because the experimentation and practice are what build the skills required for future efficiency.”

Learning to Unlearn Is the Most Important AI Skill

As AI becomes embedded in everything from classroom assignments to hiring decisions, I asked Patel a question increasingly on the minds of students and early-career professionals: What should people entering the workforce be studying today?

“Strong grounding in first-principles thinking, physics, computer science, mathematics, and language—communication, English—these remain important,” he says. Yet, he argues the more important shift is not what people study, but how they study it. AI, he says, should not be viewed as a standalone tool, but rather a core part of the learning process itself, regardless of discipline.

“People need to learn how to learn with AI, and equally, how to unlearn with AI,” he says. The second half is the harder discipline. It requires not just acquiring new knowledge but actively revising assumptions, discarding outdated methods, and remaining willing to rebuild mental models as the underlying technology shifts. Static mastery is not the goal; continuous adaptability is.

Patel frames AI fluency as a functional requirement even at the highest levels of organizational leadership. “I could not do my current role without AI,” he says. “The scope and complexity of my responsibilities require AI agents to help me stay current and operate effectively at the level required.” He is explicit about the social obligation: the technology industry has a responsibility to partner with governments worldwide on retraining and AI dexterity programs at scale. Cisco, he says, is already running those programs across multiple geographies.

The Control Layer Verdict

For most of its history, Cisco built the roads rather than the destinations. Patel is betting that AI changes that equation. If intelligence becomes the primary workload, then the company that provides the context, connectivity, and control layer sits closer to the center of value creation. The question is whether the rest of the market arrives at that conclusion as quickly as he has.

Author

Victor Dey

Victor Dey is a tech analyst and writer who covers AI, data science, startups, and cybersecurity. A former AI editor at VentureBeat, his work also appears in New York Observer, Fast Company, Entrepreneur Magazine, HackerNoon, and more. Victor has mentored student founders at accelerator programs at leading universities including the University of Oxford and the University of Southern California, and holds a Master's degree in data science and analytics.

The AI Data Crisis: Cisco’s Jeetu Patel Says Human-Generated Training Data Is Running Out Fast

Cisco President Jeetu Patel explains why human-generated training data is running out, what is stalling enterprise AI adoption, and how he is driving an AI-first culture at a 90,000-person company.
