
Building the Measurement Layer for an Agent-Run Web: A Conversation with Cameron Witkowski

Cameron Witkowski went from research, on the Fundamental AI Research team at AWS AI Labs and in the Thomson lab at Caltech, to shipping AI agents at a seed-stage startup. As Co-Founder and Chief Engineering Officer at OpenLens (previously Bread), he spends his days on the problems that only show up once agents leave the demo and hit real production traffic.

OpenLens measures how brands appear across every major AI model, built for the marketing agencies juggling tens or hundreds of clients at once. More than 35 agencies already rely on it. The longer game is bigger. Witkowski wants to build infrastructure for a web where agents act on behalf of the people they represent, observing what AIs say about brands, attributing the conversions agents drive, and readying sites for traffic that doesn’t browse the way humans do.

We talked with Cameron about where teams underestimate the gap between a demo and production, why observability is the constraint nobody plans for, what’s actually breaking in web infrastructure built for human readers, and the emerging fight over agent identity, payments, and trust. His throughline is that the customer is still a human, but the channel reaching them is increasingly an agent, and most of the web hasn’t caught up.

You came up through the Fundamental AI Research team at AWS AI Labs and the Thomson lab at Caltech before joining Bread, now OpenLens, as Chief Engineering Officer. What pulled you out of research and into shipping agents at a seed-stage startup, and how do you describe what Bread actually does today?

What pulled me out of research was real-world impact. Research is great for thinking, for working a hard problem until you actually understand it, and I really did love it, but you’re several steps removed from the people who benefit from what you’re working on.

AI is the most significant shift the technology industry has seen in a generation, probably longer. And I’m ambitious. I want to do work that matters. I wanted to be in the arena where this was happening.

OpenLens.com is the AI visibility platform for marketing agencies. We measure how brands appear across every major AI model. We’re built specifically for agency workflows, where one team is serving tens or hundreds of brands at once. Other tools in the space are built for individual brands, priced for individual brands, structured around the needs of individual brands. Agencies are generally an afterthought. They have a completely different set of workflows and requirements, and we built for that. That’s why over 35 agencies trust us with their critical, client-facing, day-to-day work.

Longer term, we’re building infrastructure for agent-mediated commerce. A new economic layer is forming, where agents are acting on behalf of people, and it’s going to need primitives that don’t exist yet. An open way for brands to observe what AIs are saying about them. Attribution for the conversions agents are driving. Tools for getting brands ready en masse for the agent traffic that’s coming. We’re starting with measurement. Where it goes from there is still being worked out, but we have ideas.

There’s a wide gap between agent demos that go viral and agents that hold up under real production traffic. Where do most teams underestimate that gap?

The gap is observability. Demos work because they run once. Production runs constantly, and at scale the failure modes you didn’t see in the demo become inevitable. A thousand users means the one-in-a-thousand failure happens once. Ten thousand users means it happens ten times. And with language models the failure rates aren’t one in a thousand. They’re much higher, because the underlying systems are stochastic. The same input can produce different output.
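As an editorial illustration of that arithmetic (not OpenLens code), here is a minimal sketch that assumes every run fails independently with the same fixed probability. That assumption is a simplification introduced here, and the function names are hypothetical.

```python
def expected_failures(runs: int, failure_rate: float) -> float:
    """Average number of failed runs, assuming a fixed per-run failure rate."""
    return runs * failure_rate


def prob_at_least_one_failure(runs: int, failure_rate: float) -> float:
    """Chance that at least one run fails, assuming runs fail independently."""
    return 1.0 - (1.0 - failure_rate) ** runs


if __name__ == "__main__":
    for runs in (1, 1_000, 10_000):
        print(
            runs,
            expected_failures(runs, 0.001),
            round(prob_at_least_one_failure(runs, 0.001), 3),
        )
```

Under these assumptions a single demo run almost never hits a one-in-a-thousand failure, while a thousand runs see one on average, which is the point about scale made above.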

Most teams underestimate the measurement layer. They build the agent, watch a demo succeed, ship it. The instrumentation gets bolted on after something has gone wrong.

There’s a deeper principle from control theory. From Maxwell forward, observability and controllability have been two sides of the same coin. A system whose state you can’t read is a system you can’t reliably steer. Agents are no exception. The state space is enormous and the dynamics are non-deterministic. You cannot control what you cannot observe.

You’ve written about how current web infrastructure starts to break under agent traffic. What specifically have you seen break, and how is your team working around it?

The breakages are mostly boring. They’re configuration mistakes that were correct for a human-only web and are wrong for a web that has to serve agents.

A concrete example. We were targeting long-tail queries in specific verticals where we’d built content to surface in AI answer engines. OpenLens told us we weren’t appearing. Aman dug in and found our Cloudflare configuration was blocking the relevant crawlers. One setting. We fixed it, the content started getting indexed. These are easy things to get wrong. The defaults across the modern web stack were tuned for an audience that no longer represents all the traffic that matters.
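One quick check in this spirit can be sketched with Python’s standard library. The interview does not say which Cloudflare setting was involved, and a block applied at the firewall or bot-management level would not show up in robots.txt at all, so this only covers the robots.txt case. The crawler names in the list are illustrative assumptions, as is the helper name.

```python
from urllib.robotparser import RobotFileParser

# Illustrative crawler user-agent tokens; adjust to the crawlers you care about.
AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot"]


def blocked_crawlers(robots_txt: str, url: str, agents=AI_CRAWLERS) -> list[str]:
    """Return the crawlers that the given robots.txt text disallows for url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


if __name__ == "__main__":
    sample = "\n".join(
        [
            "User-agent: GPTBot",
            "Disallow: /",
            "",
            "User-agent: *",
            "Allow: /",
        ]
    )
    print(blocked_crawlers(sample, "https://example.com/blog/post"))
```

Running a check like this against the robots.txt a site actually serves is cheap, but it is only a first pass: it says nothing about rules enforced elsewhere in the stack.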

The other example came from our agency users. Pretty early on we noticed some running Claude in Chrome against our dashboard. Literally taking screenshots and clicking buttons. I thought, what? A few others asked us for API and MCP access, and I was surprised because usually those are tools for developers. But we listened and we built it. It was still in beta, and we hadn’t even published documentation for it, when I got on a call with one user and he said, hey, by the way, the API is really great. I said, sorry, you’re using the API? He’d just found the button to make an API key and got Claude to figure out the rest.

We built OpenLens first for humans, but we’re seeing agents use it more and more. The same shift is happening across the whole web. Agents can do a lot. They just need a way to actually do it.

When you’re choosing between an autonomous approach and a deterministic, scripted tool call for a given task, what’s the actual decision tree in your head?

Deterministic or scripted is right when inputs fit cleanly into categories. The old phone menus everyone hates were this approach taken too far. Press one for billing, press two for support. They work when the bucket is clean and fail when it isn’t, which is most of the time.

Autonomous, LLM-driven approaches are right when the task depends on context a decision tree can’t enumerate. Ambiguous input, similar paths to route between, cases the engineer didn’t foresee.

Most production systems should be hybrids. The model handles routing and interpretation. The code handles the determinate work once the category is identified. It’s really that simple.
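The hybrid pattern described here can be sketched as follows. This is an editorial illustration, not OpenLens code: the categories, handler names, and the keyword-based `classify_stub` are all hypothetical, and the stub only stands in for the model call that would interpret the request in a real system.

```python
from typing import Callable


def handle_billing(message: str) -> str:
    """Deterministic work for the billing category."""
    return "billing: looked up invoice status"


def handle_support(message: str) -> str:
    """Deterministic work for the support category."""
    return "support: opened a ticket"


# Scripted handlers, keyed by the category the router identifies.
HANDLERS: dict[str, Callable[[str], str]] = {
    "billing": handle_billing,
    "support": handle_support,
}


def classify_stub(message: str) -> str:
    """Placeholder for the model: maps free text to a category name."""
    text = message.lower()
    if "invoice" in text or "charge" in text:
        return "billing"
    if "broken" in text or "error" in text:
        return "support"
    return "unknown"


def route(message: str, classify: Callable[[str], str] = classify_stub) -> str:
    """Let the classifier interpret the request, then run scripted code."""
    handler = HANDLERS.get(classify(message))
    if handler is None:
        # No clean bucket: hand off rather than guess.
        return "escalate: no deterministic handler for this request"
    return handler(message)


if __name__ == "__main__":
    print(route("Why was I charged twice?"))
    print(route("Can you recommend a pricing strategy?"))
```

Keeping the classifier as a swappable argument means the routing step can be replaced or tested separately from the deterministic handlers.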

Your engineering stack depends on frontier models from labs you don’t control. What does that dependency feel like day to day, and how do you hedge against model-level changes you didn’t ask for?

This feels new but it isn’t. Every wave of technology has been built on infrastructure other people made. Cloud, libraries, browsers, APIs, payment processors. We’re standing on the shoulders of giants and we always have been. Frontier models are the same kind of dependency.

Day to day, the friction is mostly capacity. Rate limit errors. 429s and 529s. The whole industry is compute-constrained right now and we’d burn a lot more tokens if we could.
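A common way to absorb capacity errors like these is retrying with exponential backoff and jitter. The sketch below is an editorial illustration under stated assumptions: `ApiError` is a hypothetical exception type standing in for whatever a real client library raises, and the attempt count and delays are arbitrary defaults, not values from the interview.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Status codes named in the interview as capacity errors.
RETRYABLE_STATUSES = {429, 529}


class ApiError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status: int) -> None:
        super().__init__(f"API returned status {status}")
        self.status = status


def call_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry call() on retryable statuses, doubling the wait each attempt."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except ApiError as err:
            out_of_attempts = attempt == max_attempts - 1
            if err.status not in RETRYABLE_STATUSES or out_of_attempts:
                raise
            # Exponential backoff plus random jitter to avoid synchronized retries.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
    raise AssertionError("unreachable: the loop always returns or raises")
```

The `sleep` parameter is injectable so the retry logic can be exercised in tests without actually waiting.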

The one real failure was last year, when Sonnet 3.5 got deprecated. We had it hardcoded in a few user-facing places and parts of the platform broke. That was an unforced error on us. The lesson is just: don’t hardcode model versions. Assume the model you’re calling today isn’t the model you’ll be calling next year.
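One way to act on that lesson is to resolve model names through a single lookup instead of scattering literals through the codebase. The sketch below is an editorial illustration: the task names, the `MODEL_<TASK>` environment variable convention, and the placeholder model identifiers are all hypothetical.

```python
import os
from typing import Mapping

# Hypothetical defaults, kept in one place so a deprecation is a one-line change.
DEFAULT_MODELS = {
    "summarize": "example-model-latest",
    "classify": "example-small-model-latest",
}


def resolve_model(task: str, env: Mapping[str, str] = os.environ) -> str:
    """Return the model for a task, preferring an environment override."""
    override = env.get(f"MODEL_{task.upper()}")
    if override:
        return override
    try:
        return DEFAULT_MODELS[task]
    except KeyError:
        raise ValueError(f"no model configured for task {task!r}") from None


if __name__ == "__main__":
    print(resolve_model("summarize", env={}))
    print(resolve_model("summarize", env={"MODEL_SUMMARIZE": "example-model-next"}))
```

With this shape, swapping a deprecated model is a configuration change rather than a code change in every user-facing call site.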

Beyond that, model changes get managed the same way any API change does. Test new versions before they ship to users. Use proper CI/CD. Monitor for behavior shifts after. It’s not fundamentally different from depending on any other piece of software.

The web was built for human readers and is now being asked to serve agents. What’s the most underappreciated consequence of that shift, and are the agent-friendly APIs and protocols emerging from incumbents going to be enough?

The most underappreciated consequence is the magnitude of the opportunity.

Marketing has always been a distribution problem. Getting the right value in front of the right person at the right moment, when human time and attention are scarce. Agents change that problem fundamentally. They can interpret intent. They can search at machine scale on behalf of the people they represent. They can match wants and needs to offerings with much less waste than the human search-and-click loop has ever managed.

The closest analog is the move from physical retail to e-commerce. The agent-mediated step is at least as large. Tenfold, probably. Maybe more. And it isn’t confined to retail. It’s services. It’s research. It’s professional decisions. Anywhere intent meets fulfillment, the system is about to get more efficient. What’s coming is the beginning of a new economic engine.

On whether incumbent APIs are going to be enough: no. The capability is moving fast and the incumbents are moving slowly. They’ll stick APIs onto their existing products and call it a strategy. What they can’t do is rebuild from first principles. That’s hard inside a company that already has a working business, and it’s exactly what startups are for. Marketing in particular is a place where a whole new industry is going to crop up over the next few years.

There’s an emerging conversation around agent identity, payments, and trust. What does Bread think the right primitives look like, and where are the open questions?

There are several layers. The protocol layer, where WebMCP, UCP, ACP, A2A are being worked out. That’s how agents and sites describe what they can do and what they want. The payment layer, which Stripe and the rest handle the way they always have. The merchant and site layer, where value lives. The model layer, the frontier labs, where intent gets interpreted. The agents themselves, between the user and all of the above.

The search layer is reconfiguring most clearly. Historically the user of search was a person. Increasingly the user of search is an agent, and the person is using the agent.

OpenLens sits above the payment rails, adjacent to identity, at the measurement and observation layer between brands, agencies, and the agents acting on them. Independent of the model providers. Independent of the payment processors. Independent of the agents themselves. A neutral observer in the agent path.

The hardest open question is trust.

Start with security. You can almost think of text on the web as untrusted code now, with the LLM as the processor. Browser tabs used to be sandboxed. When the LLM is reading one tab and can click into another, that protection breaks down. Imagine one page telling the agent, hey, switch over to the other tab, pull up the user’s bank details, send a million dollars to this crypto address. That’s a security nightmare and we don’t have clean answers for it yet.

There’s also no ground truth for what an agent actually sees on a site. It’s easy to serve different content to humans and agents. We’ve built that ourselves. Cloaking has always existed on the human web, but the new problem is that there’s no independent layer in the agent path to verify what was served.
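To make the cloaking concern concrete, here is an editorial sketch of the comparison step only. It assumes the two page bodies have already been fetched, one as a human browser and one as an agent, and it uses a plain text-similarity ratio with an arbitrary threshold. None of this comes from the interview, and a real check would need to handle dynamic content, personalization, and rendering differences.

```python
from difflib import SequenceMatcher


def normalize(body: str) -> str:
    """Collapse whitespace and case so trivial differences are ignored."""
    return " ".join(body.split()).lower()


def similarity(human_body: str, agent_body: str) -> float:
    """Similarity ratio between 0.0 (nothing shared) and 1.0 (identical)."""
    return SequenceMatcher(None, normalize(human_body), normalize(agent_body)).ratio()


def looks_cloaked(human_body: str, agent_body: str, threshold: float = 0.9) -> bool:
    """Flag a page when the agent-facing text diverges from the human-facing text."""
    return similarity(human_body, agent_body) < threshold


if __name__ == "__main__":
    human = "Our plan costs $10 per month."
    agent = "Best product ever. Rated #1 by every reviewer worldwide."
    print(looks_cloaked(human, agent))
```

A check like this only has value when it is run by a party independent of the site being measured, which is the gap described above.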

The biggest piece, I think, is what’s happening to the information economy underneath all of this. Publishers are losing ad revenue. AI summaries answer queries directly and fewer human eyes end up on the page. LLMs weight earned media more heavily than paid placements, so brands have new reason to fund earned-media-shaped content, and publishers have new reason to take it, because for a lot of them it’s a survival thing. Meanwhile the cost of producing content is collapsing because of AI, so the hard problem becomes getting it in front of the right people at the right time, and the way to do that is to put it where the AI will pick it up. The whole ecosystem is being rearranged at the same time: people, publishers, brands, marketers, agencies, the search engines, the labs, and now agents sitting in the middle of all of it. The real question is what a new equilibrium looks like, one that works for everyone in it and protects against deception, manipulation, and fraud.

The more concrete open question is what a site has to do to be agent-ready. Protocols are converging fast. Implementation isn’t standardized. Authentication, attribution, rate limiting, how you actually expose structured actions for agents to call, all of it is being worked out one site at a time.

Looking at the next twelve months, what’s the development you’re watching most closely, and what would it mean for builders if it arrives on schedule?

WebMCP. Chrome 146 shipped an experimental flag back in March. The W3C draft is being iterated weekly. Microsoft and Google co-authored it. Firefox has started syncing tests. We have an internal implementation running on a Stagehand fork across a number of demo sites.

The reason I’m watching this one is that adoption isn’t a single-vendor decision. Every site eventually decides whether to implement. Before enough agents are using WebMCP, implementation is optional and most sites won’t bother. After enough agents are using it, implementation becomes a competitive cost, the way HTTPS and mobile-friendly design became table stakes in their time. That’s the inflection point we’re watching for, and the moment agents become truly first class on the web.

What this means for builders is that the channel through which customers discover, evaluate, and transact is increasingly an agent acting on their behalf. The customer is still a human. The work you do to reach them has to be legible to the agent in front of them. The way people market is going to change. The way product development happens is going to change. The way companies fulfill wants and needs across the entire world is going to change.

That’s the opportunity that’s about to unlock.

Author

  • Tom Allen

    Founder and Director at The AI Journal. Created this platform with the vision to lead conversations about AI. I am an AI enthusiast.
