Sixty-nine percent of researchers have already integrated synthetic data into their work. Most of them are using it backwards.
Let’s be clear on terms. Synthetic data isn’t fake data. It’s statistically modeled data, and when it’s built on proprietary respondent-level survey responses rather than scraped internet text, it reflects something close to real consumer thinking. We’re building and testing synthetic capability at Ideally. I’m not arguing against the tool. I’m arguing against using it for the wrong job.
The limitation is structural, not technical. Synthetic models are optimized for accuracy. That means they trend toward the mean. They’re very good at telling you what most consumers think most of the time. But the innovation signal doesn’t live at the mean. It lives at the edges. Synthetic models suppress exactly those voices by design. The World Economic Forum’s 2025 briefing paper names this directly. The risk isn’t just bias. It’s the progressive erosion of rare-case diversity, which is the entire raw material of innovation research.
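To see the mechanism in miniature, here is a toy simulation. The numbers are invented and this is not output from any real synthetic-data product: respondents are drawn from a mainstream majority plus a small niche segment, and a “synthetic” sample is drawn from a single accuracy-optimized fit to that data. The average survives. The niche cluster does not.

```python
# Toy illustration only: a generator rewarded for average accuracy reproduces
# the center of the distribution and erodes the rare "edge" responses.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical survey scores: 90% mainstream respondents, 10% niche enthusiasts.
mainstream = rng.normal(5.0, 1.0, 9_000)
niche = rng.normal(9.0, 0.5, 1_000)
real = np.concatenate([mainstream, niche])

# "Synthetic" respondents drawn from a single maximum-likelihood normal fit.
synthetic = rng.normal(real.mean(), real.std(), real.size)

def edge_share(scores, cutoff=8.0):
    """Share of respondents above the cutoff -- the 'edge' of the distribution."""
    return (scores > cutoff).mean()

print(f"mean:  real {real.mean():.2f}  vs  synthetic {synthetic.mean():.2f}")
print(f"edge:  real {edge_share(real):.1%}  vs  synthetic {edge_share(synthetic):.1%}")
# Typical run: the means match to the second decimal, but the share of
# high-scoring "edge" respondents drops from roughly 10% to roughly 4-5%.
```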
So flip the use case. Don’t use synthetic to find innovation signals. Use it to confidently map and set aside the mainstream. Once you know what the center looks like, you can go looking for what deviates from it. Synthetic as a filter, not a finder. The finding itself requires real humans. That conversation cannot be modeled. It has to be had.
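A minimal sketch of what that workflow could look like, with made-up data and an arbitrary threshold: the synthetic layer supplies a confident picture of the center, and anything it fails to explain gets routed to real human conversations.

```python
# Sketch of "filter, not finder" (hypothetical data and thresholds throughout).
import numpy as np

rng = np.random.default_rng(1)

# Assume the synthetic layer has already mapped the mainstream for this category.
mainstream_mean, mainstream_std = 5.4, 1.2

# Real respondent reactions to a new concept, including a small unusual segment.
real_scores = rng.normal(5.4, 1.6, 500)
real_scores[:25] = rng.normal(9.2, 0.4, 25)

# Set aside whatever the mainstream model explains; flag the rest for interviews.
z = np.abs(real_scores - mainstream_mean) / mainstream_std
follow_up = np.flatnonzero(z > 2.5)

print(f"{follow_up.size} of {real_scores.size} respondents flagged for human follow-up")
```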
LLMs compound the problem and expose something deeper about how researchers validate anything. Language models are trained on what people say about their behavior, not what they actually do. The gap between stated and actual behavior is the entire reason norms and benchmarks exist in research. A top-two-box purchase intent score only means something relative to a norm, because we know people systematically overstate intent. Adding a synthetic layer on top of stated intent introduces another abstraction away from real behavior. You’re not modeling what consumers will do. You’re modeling the average of what consumers say they’ll do. Every added layer compounds the distortion.
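A back-of-envelope illustration of that compounding, with calibration factors invented purely for the example (they are not real norms):

```python
# Invented figures to show how abstractions stack, not real calibration data.
actual_purchase_rate = 0.10      # what people actually do
stated_overstatement = 2.0       # assumed: stated intent runs ~2x actual behavior
synthetic_drift = 1.3            # assumed: the synthetic layer adds its own distortion

stated_t2b = actual_purchase_rate * stated_overstatement    # top-two-box, as stated
synthetic_t2b = stated_t2b * synthetic_drift                # a model of what is stated

print(f"actual behavior:       {actual_purchase_rate:.0%}")
print(f"stated intent (T2B):   {stated_t2b:.0%}  -- only meaningful against a norm")
print(f"synthetic of stated:   {synthetic_t2b:.0%}  -- two abstractions from behavior")
```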
Then there’s the convergence problem, which I think is the most underdiscussed risk in this conversation. If every brand in a category runs innovation through synthetic models trained on overlapping data, the output converges. Everyone ideates the same concepts, worded slightly differently. We’ve tested this internally. The same prompts across different AI tools yield the same ideas in marginally different language. Differentiation is the entire commercial point of innovation investment. Convergence kills it. When a capability is universal, it stops being an advantage. The differentiator reverts to what AI cannot replicate. Category expertise. Contextual judgment. The ability to identify which signals in the noise are worth pursuing.
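One crude way to put a number on that convergence is to compare concept outputs for lexical overlap. The snippet below uses invented stand-in strings and simple token overlap; a real check would use embeddings and actual model output.

```python
# Crude convergence check: pairwise token overlap between concept ideas.
# The three concept strings are invented stand-ins, not real model output.
def jaccard(a: str, b: str) -> float:
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb)

concepts = {
    "tool_a": "a protein packed snack bar for busy commuters with no added sugar",
    "tool_b": "a no added sugar protein snack bar aimed at busy commuters",
    "tool_c": "a protein snack bar for commuters with zero added sugar",
}

for x, y in [("tool_a", "tool_b"), ("tool_a", "tool_c"), ("tool_b", "tool_c")]:
    print(f"{x} vs {y}: overlap {jaccard(concepts[x], concepts[y]):.2f}")
# The same idea worded slightly differently scores high on overlap --
# convergence in miniature.
```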
This is where skilled researchers earn their place, and not merely as a check on the machine. AI can surface patterns. It cannot decide which patterns matter. Identifying that a signal exists is different from knowing whether that signal is relevant to this category, this brand, this moment. The same applies to concept discrimination. Small differences in framing, language, or benefit articulation produce significant differences in consumer response. These effects are real, repeatedly validated, and frequently counterintuitive. A synthetic model will smooth them out. A skilled researcher will hunt for them.
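To make the discrimination point concrete, here is a quick two-proportion test on two hypothetical survey cells; the sample sizes and scores are invented for illustration.

```python
# Two framings of the same benefit, tested in separate cells (hypothetical numbers).
from math import sqrt
from statistics import NormalDist

def two_proportion_test(p1, n1, p2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, 2 * (1 - NormalDist().cdf(abs(z)))

# Framing A scores 42% top-two-box, framing B 34%, with 400 respondents per cell.
z, p_value = two_proportion_test(0.42, 400, 0.34, 400)
print(f"z = {z:.2f}, p = {p_value:.3f}")  # ~2.3 and ~0.02: a small wording change, a real difference
```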
The interesting horizon is world models. Yann LeCun left Meta to build AI systems that learn the dynamics of reality rather than patterns in text. If that matures, synthetic’s utility for consumer research changes. But the industry isn’t there yet. And the underlying cost-benefit question remains open. Will a richer, behaviorally grounded synthetic model ever be cheaper than simply asking real people?
For now, hybrid isn’t a compromise. It’s the most methodologically sound approach available. And the organizations that get the most out of synthetic data will be the ones who know when not to use it.



