
The uncomfortable truth is that GenAI doesn’t erase your data problems; it amplifies them. Across customer service, marketing, HR, and product design, enterprises are racing to deploy GenAI tools that promise faster content, better candidate experiences, and smarter product workflows.
Yet many of these projects stall or quietly fall short of ROI expectations. The root cause is neither the choice of model nor the cleverness of a prompt. It’s the data. Specifically, the unstructured data that now sits center stage.
We built decades of enterprise analytics on structured data: tables, schemas, and dashboards. We instrumented pipelines, perfected ETL, and governed data warehouses. Meanwhile, unstructured data (documents, files, chats, wikis, policies, tickets, creative assets) was treated as an afterthought: stored, sometimes indexed, rarely curated. GenAI inverts that hierarchy.
The most valuable use cases depend on language and context. That means your unstructured corpus becomes the model’s world. When that world is noisy, stale, contradictory, or poorly permissioned, garbage in is not just garbage out: it’s plausible, confident, and costly garbage out.
LLMs don’t need pristine schemas; they need trustworthy context
GenAI tools are astonishing at synthesizing text and reasoning over concepts embedded in language. But they rely heavily on the corpus you feed them, whether through fine-tuning, retrieval-augmented generation (RAG), or prompt grounding. In this new regime:
- Coverage matters more than columns. Can the system “see” the right policies, procedures, product specs, and past decisions?
- Freshness beats historical completeness. Are the latest templates, FAQs, and compliance rules available, or is the model summarizing last year’s reality?
- Consistency and provenance trump volume. Can the model reconcile conflicts and cite sources that a human can verify?
Structured data governance can’t solve these problems because unstructured data is different in kind. It is qualitative, context-heavy, and changes in subtle ways. Concepts are distributed across long-form text and attachments; meaning lives in relationships between documents; and access constraints are nuanced. Approaches that worked for relational data (schema enforcement, batch ETL, KPI dashboards) don’t directly translate.
Why unstructured data breaks today’s AI projects
- Scale blows past manual control. Over 90% of enterprise data is unstructured, and it accumulates across email, chat, shared drives, ticketing systems, intranets, and design repositories. Manual curation cannot keep pace.
- Structure is emergent, not predefined. Concepts are latent in text and media and require extraction, normalization, and enrichment. Traditional MDM tools don’t natively operate here.
- Quality is multidimensional. It’s not just accuracy; it’s clarity, deduplication, version lineage, labeling, permissions, and recency. A pristine but outdated policy is harmful in an AI workflow.
- Governance is entangled with security. A RAG pipeline that retrieves confidential or mis-permissioned content can expose sensitive information to the wrong audience, or even to the model provider, depending on architecture.
- Evaluation is nontrivial. “Accuracy” for generative answers involves grounding, source attribution, and consistency across long-tail queries, not a single metric.
The result is predictable: systems hallucinate, produce outdated answers, overgeneralize across teams, or expose content they shouldn’t. Users lose trust, ROI slides, and teams retreat to small pilots that never scale.
Your prompt is not the product; your corpus is
Enterprises often pilot with impressive demos on small, curated sets of documents. When they scale to the messy reality of the full repository, performance degrades. The issue is not the model; it’s the knowledge substrate.
A useful mental shift: treat unstructured content as a product. That means product management for your corpus: owners, roadmaps, SLAs, quality metrics, and user feedback loops. It also means investing in an “unstructured data stack” rather than bolt-ons to your existing analytics stack.
What a modern unstructured data foundation looks like
Vendor choices aside, the architectural ingredients are becoming clear. Not every organization needs them all on day one, but ignoring them guarantees pain later.
Content inventory and lineage
- Automated discovery and cataloging across drives, wikis, CMS, mailboxes, ticketing, and repositories.
- Version history and document lineage to track the evolution of policies and specs.
- Deduplication and near-duplicate detection to reduce noise (see the sketch below).
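Near-duplicate detection does not require heavy tooling to prototype. Below is a minimal sketch using word shingles and Jaccard similarity; the shingle size and threshold are illustrative, and at enterprise scale you would switch to MinHash/LSH rather than comparing every pair of documents.

```python
import itertools


def shingles(text: str, k: int = 8) -> set[str]:
    """Overlapping k-word shingles of normalized text."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(max(len(words) - k + 1, 1))}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity between two shingle sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def near_duplicates(docs: dict[str, str], threshold: float = 0.85) -> list[tuple[str, str, float]]:
    """Return document-id pairs whose shingle overlap exceeds the threshold."""
    shingle_sets = {doc_id: shingles(text) for doc_id, text in docs.items()}
    pairs = []
    for (id_a, s_a), (id_b, s_b) in itertools.combinations(shingle_sets.items(), 2):
        score = jaccard(s_a, s_b)
        if score >= threshold:
            pairs.append((id_a, id_b, score))
    return pairs
```

Flagged pairs then feed the curation workflow: keep the canonical version, retire or redirect the rest.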
Metadata and enrichment
- Consistent taxonomies, tags, and entity extraction (people, products, policies, SKUs).
- Document embeddings created with transparent versioning for reproducibility.
- Sensitivity labels and access metadata baked into indexing (see the record sketched below).
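One way to make this enrichment concrete is a per-document record that travels with the content into the index. A minimal sketch; the field names are illustrative, not a standard.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class EnrichedDocument:
    """Metadata attached to each document before indexing (illustrative schema)."""
    doc_id: str
    source_uri: str
    title: str
    taxonomy_tags: list[str] = field(default_factory=list)        # e.g. ["hr/benefits/leave"]
    entities: dict[str, list[str]] = field(default_factory=dict)  # {"product": [...], "policy": [...]}
    sensitivity: str = "internal"                                  # e.g. public / internal / confidential
    allowed_groups: list[str] = field(default_factory=list)       # access metadata baked into indexing
    embedding_model: str = ""                                      # model that produced the vectors
    embedding_version: str = ""                                    # bumped when content or model changes
    last_reviewed: Optional[datetime] = None                       # drives recency and validity checks
```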
Retrieval and grounding
- Hybrid retrieval pipelines that combine keyword and semantic search to mitigate embedding blind spots, as sketched below.
- Chunking strategies that respect document structure and context.
- Source-aware answer synthesis with citations and confidence scoring.
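To illustrate the hybrid part, here is a minimal sketch that fuses keyword and semantic result lists with reciprocal rank fusion (RRF); `keyword_search` and `vector_search` are placeholders for whatever backends you actually run. RRF is attractive here because it needs no score calibration between the two systems.

```python
from collections import defaultdict


def hybrid_search(query: str, keyword_search, vector_search, k: int = 10, rrf_k: int = 60) -> list[str]:
    """Fuse keyword and semantic rankings with reciprocal rank fusion.

    keyword_search / vector_search are assumed callables that take (query, k)
    and return ranked lists of chunk or document ids.
    """
    scores: dict[str, float] = defaultdict(float)
    for results in (keyword_search(query, k), vector_search(query, k)):
        for rank, doc_id in enumerate(results):
            scores[doc_id] += 1.0 / (rrf_k + rank + 1)  # higher ranks contribute more
    return sorted(scores, key=scores.get, reverse=True)[:k]
```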
Quality, freshness, and lifecycle
- Recency and validity windows so the system prefers current content (see the sketch below).
- Scheduled re-indexing, re-embedding, and content retirement policies.
- Feedback capture: “was this answer useful?” tied to specific sources.
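A minimal sketch of a recency and validity filter, assuming each retrieved chunk carries `last_reviewed`, an optional `valid_until`, and a retrieval `score`; the schema and the down-ranking weight are illustrative.

```python
from datetime import datetime, timedelta


def filter_current(chunks: list[dict], max_age_days: int = 365, now: datetime | None = None) -> list[dict]:
    """Drop expired chunks and down-rank anything past its review window."""
    now = now or datetime.utcnow()
    kept = []
    for chunk in chunks:
        valid_until = chunk.get("valid_until")
        if valid_until and valid_until < now:
            continue  # retired content never reaches the model
        if now - chunk["last_reviewed"] > timedelta(days=max_age_days):
            chunk = {**chunk, "score": chunk["score"] * 0.5, "stale": True}  # prefer current content
        kept.append(chunk)
    return kept
```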
Policy and security by design
- Fine-grained access control enforced at retrieval time, so answers inherit the user’s permissions (see the sketch below).
- Redaction and PII handling in the pipeline, not as an afterthought.
- Auditability: who saw what, when, and why.
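A minimal sketch of permission enforcement at retrieval time, assuming each chunk carries the `allowed_groups` captured during indexing; a real system mirrors the source systems’ ACLs and writes to a durable audit store rather than a logger.

```python
import logging
from datetime import datetime

audit_log = logging.getLogger("retrieval.audit")


def authorized_chunks(chunks: list[dict], user_groups: set[str]) -> list[dict]:
    """Keep only chunks the requesting user may see, and audit every decision."""
    allowed = []
    for chunk in chunks:
        permitted = bool(user_groups & set(chunk.get("allowed_groups", [])))
        audit_log.info(
            "doc=%s groups=%s permitted=%s at=%s",
            chunk["doc_id"], sorted(user_groups), permitted, datetime.utcnow().isoformat(),
        )
        if permitted:
            allowed.append(chunk)
    return allowed  # answers inherit the user's permissions by construction
```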
Evaluation and observability
- Benchmarks for representative, business-relevant questions (not toy prompts).
- Metrics: grounded accuracy, citation rate, coverage, latency, hallucination rate, and traceability (a minimal harness is sketched below).
- Canary tests to catch drift when content or models change.
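A minimal harness over such a benchmark, assuming each item lists its expected sources and that `ask()` returns an answer plus the documents it cited. Overlap with expected sources is a cheap proxy for groundedness; it complements, rather than replaces, human or judge-model review of the answers themselves.

```python
def evaluate(benchmark: list[dict], ask) -> dict[str, float]:
    """Run a benchmark of real questions and report corpus-level metrics.

    Each benchmark item is assumed to look like:
      {"question": str, "expected_sources": set_of_doc_ids}
    and ask(question) is assumed to return (answer_text, cited_doc_ids).
    """
    answered = cited = grounded = 0
    for item in benchmark:
        answer, citations = ask(item["question"])
        if answer.strip():
            answered += 1
        if citations:
            cited += 1
        if set(citations) & set(item["expected_sources"]):
            grounded += 1  # at least one citation matches a known-good source
    n = len(benchmark)
    return {
        "coverage": answered / n,           # did the system attempt an answer?
        "citation_rate": cited / n,         # did it show its sources?
        "grounded_accuracy": grounded / n,  # did it cite the right sources?
    }
```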
Human-in-the-loop operations
- Curators who resolve conflicts, promote canonical sources, and retire obsolete content.
- Editorial standards for policy and knowledge articles aimed at AI consumption: clarity, structure, and explicitness over tribal shorthand.
Where to start: a pragmatic 90-day plan
Marketing, HR, and product design share a pattern: the most in-demand AI assistants are those that answer questions, draft content, and summarize institutional knowledge. Build for that.
Weeks 1–2: Define a narrow, high-value scope
- Choose one domain (e.g., HR leave and benefits; brand guidelines; release notes).
- List 50–100 real user questions and the documents that should answer them.
- Identify ownership and a single canonical repository for that domain.
Weeks 3–6: Stand up the minimum viable knowledge substrate
- Consolidate and deduplicate the target corpus.
- Apply baseline enrichment: entities, tags, security labels, and recency flags.
- Implement hybrid retrieval with chunking that respects headings and sections (see the chunking sketch below).
- Build an evaluation set from the user questions with expected sources.
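For the chunking step, a sketch of heading-aware splitting is below; it assumes markdown-style content and sidesteps details a production pipeline must handle, such as tables, lists, and oversized sections.

```python
import re


def chunk_by_heading(markdown_text: str, max_chars: int = 2000) -> list[dict]:
    """Split a markdown document on headings so chunks keep their section context."""
    parts = re.split(r"(?m)^(#{1,3} .+)$", markdown_text)
    chunks, current_heading = [], ""
    for part in parts:
        if re.match(r"^#{1,3} ", part):
            current_heading = part.strip()
        elif part.strip():
            body = part.strip()
            for start in range(0, len(body), max_chars):
                chunks.append({
                    "heading": current_heading,  # lets retrieval and citations show where the text came from
                    "text": body[start:start + max_chars],
                })
    return chunks
```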
Weeks 7–10: Ship with guardrails and measurement
- Launch to a limited audience with citation-required answers.
- Track grounded accuracy, coverage, freshness, and user trust (thumbs up/down tied to sources).
- Add a “suggest fix” button that routes content issues to owners.
Weeks 11–12: Institutionalize practices
- Create content SLAs (e.g., benefits policy must be reviewed quarterly).
- Establish a change workflow: draft → review → publish → index → verify.
- Publish a short playbook on “writing for AI” to standardize structure and clarity.
Key design principles
- Scope ruthlessly. Start with one domain where authoritative sources exist and ownership is clear. Breadth is the enemy of trust.
- Prefer citations over persuasion. Require the system to show its sources; make “I don’t know” an acceptable answer when content is missing.
- Make freshness measurable. Track time-to-update for critical policies; expire stale content automatically from retrieval.
- Secure by default. Answers must respect user permissions across all sources, full stop.
- Treat embeddings as artifacts. Version them, test them, and re-generate on content or model changes (see the sketch below).
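One way to treat embeddings as artifacts is to key each stored vector by a content hash plus the model identity, so re-embedding is triggered automatically whenever either changes. A minimal sketch; the key format is illustrative.

```python
import hashlib


def embedding_key(doc_text: str, model_name: str, model_version: str) -> str:
    """Cache key derived from content hash plus embedding-model identity."""
    content_hash = hashlib.sha256(doc_text.encode("utf-8")).hexdigest()[:16]
    return f"{model_name}:{model_version}:{content_hash}"


def needs_reembedding(doc_text: str, model_name: str, model_version: str, stored_keys: set[str]) -> bool:
    """True when no stored embedding matches the current content + model pair."""
    return embedding_key(doc_text, model_name, model_version) not in stored_keys
```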
What good looks like in the wild
- Customer service: A self-service and agent-assist knowledge copilot that grounds every response in current policies, KB articles, and customer/context data; understands intent across channels; maintains brand tone; and says “I don’t know” when sources are missing.
  - Metrics: 90% grounded accuracy; <5% stale citations; 20–35% reduction in average handle time (AHT) for assisted channels; 15–25% increase in first-contact resolution (FCR); 20–40% self-service deflection; median answer latency <2 seconds; zero permission violations.
- Contact centers: Real-time agent assist and auto-QA that retrieves context from CRM/tickets, suggests next-best responses with citations, summarizes calls, and flags compliance risks in-stream. Post-call, it generates structured dispositions and updates records automatically.
  - Metrics: 30–50% reduction in after-call work (ACW); 10–20% reduction in repeat contacts; 15–30% faster agent ramp; 3–5 point CSAT lift; 95%+ compliance adherence; suggestion latency <500 ms for in-call prompts; 95% accurate summaries traceable to transcript spans.
- Innovation teams: A discovery and experimentation copilot that synthesizes internal research, customer insights, patents, and market scans; maps opportunity spaces; detects duplicate efforts; and drafts experiment designs with risk/compliance pre-checks.
  - Metrics: 30–40% reduction in time from idea intake to first experiment; 2x reuse of prior art and internal learnings; 20–30% increase in POC-to-pilot conversion; 50% reduction in redundant experiments; zero critical IP or confidentiality breaches; living documentation coverage >90% for active initiatives.
Myths to retire
- “A stronger model will fix our data problems.” Larger models amplify noise and hallucinations unless the corpus is curated and grounding is robust.
- “We can retrofit governance later.” By the time you scale, permissions, provenance, and duplication debt will be entrenched and expensive to unwind.
- “We just need better prompts.” Prompting is seasoning, not the meal. The meal is your content: its clarity, coverage, and currency.
The organizational shift: roles, ownership, and accountability
Generative AI forces a rethink of responsibilities:
- Content owners become data product managers. They own the truth for their domain, define SLAs, and accept feedback loops from AI usage.
- Knowledge engineers and librarians return to the center. Taxonomies, ontologies, and editorial standards are not academic—they are operational.
- Security teams partner early, not late. Access models, classification, and redaction must be integrated in the pipeline from day one.
- Platform teams provide the unstructured data substrate. Retrieval services, embedding/version control, evaluation harnesses, monitoring, and cost observability become shared infrastructure.
Measuring ROI without fooling yourself
Link ROI to friction removed from real workflows, not vanity demos.
- Employee self-service: reduction in support tickets and time-to-answer in HR/IT.
- Content velocity: cycles saved in drafting and review while maintaining brand/legal compliance.
- Decision quality: fewer rework cycles caused by outdated or conflicting documentation.
- Risk reduction: measurable declines in permission violations and stale policy citations.
Set thresholds for “production readiness” and hold the system to them. If grounded accuracy is below target, treat it as a data and process issue first, not a model swap opportunity.
The strategic takeaway: AI changed the unit of work from records to narratives
Structured data told us what happened in numbers. Generative AI explains how and why in words. It reasons across narratives: policies, designs, decisions, and conversations. That’s where your institutional memory lives and where your competitive advantage either compounds or decays.
If your organization is failing to realize AI ROI, look past your model experiments and into your corpus. Clean isn’t enough; it must be complete, current, consistent, and controllable. Build the unstructured data foundation, and the results will follow. Skip it, and you’ll keep watching impressive pilots fade into production disappointments.
In other words: the prompt is not the product. The corpus is. Treat it accordingly, and you’ll turn generative AI from a flashy demo into a durable capability.



