
Why Voice AI Is Quietly Eating HR Tech

Something strange happened in HR tech between late 2024 and now, and most people outside the category have not noticed it yet.

A decade ago, the dominant idea in hiring software was the asynchronous video interview. A candidate would sit alone in front of a webcam, read a question off a screen, hit record, and try to perform for an empty room. HireVue built a roughly billion-dollar business on that primitive, hosted more than 70 million of those interviews, and acquired Modern Hire in 2023 to consolidate the assessment science layer underneath it. For a while, that was what AI in hiring meant. A human writes the questions, a candidate records the answers, an algorithm scores the tape.

That model is now the legacy system.

The shift has been quiet because no single product launch announced it. It happened in pieces. OpenAI shipped the Realtime API. Google shipped Gemini Live. LiveKit and Pipecat made it possible for a small team to put a sub-second voice agent into production in a few weeks. Mercor raised at a 10 billion dollar valuation in October 2025; four months earlier, Paradox’s Olivia was already delivering conversion-rate gains of more than 50% across enterprise pipelines. And somewhere in the middle of all that, the asynchronous one-way video interview started looking like the BlackBerry of recruitment software.

What is replacing it is a real-time voice AI interviewer that talks back, asks follow-up questions, and finishes first-round screening before a human recruiter has finished their coffee.

The numbers behind the quiet takeover

The conversational AI market hit roughly 41 billion dollars in 2026, with the HR and recruiting slice growing at a 25% CAGR, the fastest of any vertical inside it. The recruitment chatbot market alone is projected to nearly triple, from 2.03 billion dollars in 2025 to 5.41 billion by 2030. AI use across HR tasks climbed from 26% in 2024 to 43% in 2026, and analysts now expect roughly 80% of high-volume hiring to start with an AI-led voice screen by mid-2026.
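
As a rough check on the chatbot projection, the implied compound annual growth rate can be computed directly. The sketch below assumes the 2025 and 2030 figures are in billions of US dollars and span five compounding years; the function name is illustrative.

```python
def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Return the compound annual growth rate implied by two endpoint values."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted above, assumed to be billions of US dollars.
chatbot_2025 = 2.03
chatbot_2030 = 5.41

growth_multiple = chatbot_2030 / chatbot_2025
cagr = implied_cagr(chatbot_2025, chatbot_2030, years=5)

print(f"growth multiple: {growth_multiple:.2f}x")
print(f"implied CAGR: {cagr:.1%}")
```

Run as written, this gives a multiple of roughly 2.67x and an implied rate of about 21.7% a year, slightly below the 25% CAGR cited for the HR and recruiting slice of the broader market.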

Speed is the unsubtle reason. Candidates today apply to 40+ jobs at once, and conversion drops 50 to 70% for every day a hiring team waits to respond. Friday applications that get a Monday callback are usually already gone. A voice agent that calls within 90 seconds of submission solves a problem that no amount of recruiter headcount can solve, because the bottleneck is no longer effort. It is wall-clock time.
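
To see why a weekend delay is so costly, the per-day drop can be compounded. This is a back-of-the-envelope sketch that assumes the quoted 50 to 70% drop applies multiplicatively to each full day of delay, which the figure itself does not specify; the function name is illustrative.

```python
def remaining_conversion(daily_drop: float, days_waited: int) -> float:
    """Fraction of baseline conversion left after compounding a daily drop."""
    if not 0 <= daily_drop <= 1:
        raise ValueError("daily_drop must be between 0 and 1")
    return (1 - daily_drop) ** days_waited


# Friday application, Monday callback: three days of delay.
days = 3
best_case = remaining_conversion(0.50, days)   # low end of the quoted drop
worst_case = remaining_conversion(0.70, days)  # high end of the quoted drop

print(f"after {days} days: {worst_case:.1%} to {best_case:.1%} of baseline")
```

Under that assumption, a three-day wait leaves somewhere between about 2.7% and 12.5% of the original conversion rate.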

Underneath that, the technology actually got good. End-to-end voice latency on a modern stack now sits in the 320 to 800ms range, which is roughly the response time of an attentive human in a phone screen. Semantic turn detection handles the awkward “are you done talking” problem. Barge-in lets candidates interrupt the agent the way they would interrupt a recruiter. The uncanny valley closed without much fanfare while everyone was watching image generation.
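
The turn-taking behaviour described above can be sketched as a small state machine. This is a minimal illustration, not the API of LiveKit, Pipecat, or any other framework: the state names, event names, and the `TurnManager` class are all hypothetical, and a real system would drive these events from voice-activity and semantic end-of-turn detectors.

```python
from enum import Enum, auto


class State(Enum):
    LISTENING = auto()       # candidate is speaking or may speak
    THINKING = auto()        # agent is generating a reply
    AGENT_SPEAKING = auto()  # agent audio is playing


class TurnManager:
    """Hypothetical turn-taking controller with barge-in support."""

    def __init__(self) -> None:
        self.state = State.LISTENING
        self.cancelled_replies = 0

    def on_end_of_turn(self) -> None:
        # A turn detector decided the candidate has finished talking.
        if self.state is State.LISTENING:
            self.state = State.THINKING

    def on_reply_ready(self) -> None:
        # The language model produced a reply; start playback.
        if self.state is State.THINKING:
            self.state = State.AGENT_SPEAKING

    def on_playback_finished(self) -> None:
        if self.state is State.AGENT_SPEAKING:
            self.state = State.LISTENING

    def on_candidate_speech(self) -> None:
        # Barge-in: the candidate talks over the agent, so the agent
        # stops (or abandons its pending reply) and listens again.
        if self.state in (State.THINKING, State.AGENT_SPEAKING):
            self.cancelled_replies += 1
            self.state = State.LISTENING


manager = TurnManager()
manager.on_end_of_turn()
manager.on_reply_ready()
manager.on_candidate_speech()  # candidate interrupts mid-reply
print(manager.state.name, manager.cancelled_replies)
```

The key design choice is that candidate speech always wins: any interruption while the agent is thinking or speaking sends the controller straight back to listening.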

Three companies, three theories of where the money is

The interesting thing about this moment is that the leading players disagree about what voice AI in hiring is actually for. They are pointing at three different markets, and each one is a real business.

HireVue, the incumbent, still owns enterprise video interviewing. Its customer list reads like the Fortune 100 because it was the first to industrialise structured interviewing at scale. Hilton went from a 42-day to a 5-day time-to-hire on its platform. Unilever cut a million dollars a year out of recruiting cost. But HireVue’s core product is still rooted in the asynchronous, scored-on-the-back-end paradigm, and its candidate experience reputation has taken steady damage in the press over the past three years, including ACLU complaints and lawsuits over algorithmic scoring. The company is moving toward conversational AI, but it is moving from a position where its installed base expects the old workflow.

Paradox, with its assistant Olivia, made a different bet. It treated the recruiter calendar, not the interview itself, as the bottleneck. Olivia is conversational AI for screening, scheduling, and candidate communication, mostly over text and SMS. McDonald’s halved its hiring cycle on it. Compass Group hires 120,000 frontline workers a year with a recruiting team of 20 because Paradox handles the conversation flow. Chipotle cut time-to-hire by 75%. The Paradox thesis is that frontline and high-volume hiring is a logistics problem disguised as a hiring problem, and conversation is just the right interface for logistics.

Mercor, the newest of the three, is the one that genuinely changed the conversation. The company was at a 250 million dollar valuation in September 2024 and a 10 billion dollar valuation thirteen months later. It conducts a 20-minute AI interview with every candidate on its platform, combines that with crawled data from GitHub and resumes, and trains a proprietary model to predict job performance. Its main customers are AI labs hiring contractors at scale, including OpenAI and Anthropic. Mercor’s bet is that the AI interview is not just a screening step. It is the data-collection mechanism that lets you train better models that make better hiring decisions over time. The interview is the moat.

Each of those three is a defensible position. HireVue owns enterprise inertia. Paradox owns frontline volume. Mercor owns the AI-native talent layer. The category is not consolidating around a single winner, which is itself a tell that something foundational is shifting.

Why this is bigger than any one product

The honest read on what is happening is that the unit of work in hiring is changing.

For thirty years, the unit was the recruiter screen. A human took 20 minutes on the phone with a candidate, made a thumbs-up or thumbs-down call, and passed the survivors to a hiring manager. Everything in HR tech was built to schedule, route, and audit that human screen. ATS systems, video interviewing tools, assessment platforms, even the early chatbots, all of them assumed the human screen was the load-bearing step.

Voice AI removes the load-bearing step.

When a real-time AI agent can hold a structured 15-minute conversation with a candidate, ask context-aware follow-ups, and produce a transcript that a hiring manager can scan in 90 seconds, the human screen stops being the bottleneck and starts being a luxury reserved for the final shortlist. That is not a feature. It is a re-pricing of the entire pipeline. Recruiter time, which used to be the scarce resource, becomes abundant for the first stage. Candidate time, which used to be plentiful, becomes the new scarce resource that everyone is competing for.

This is why incumbents are vulnerable in a way they were not eighteen months ago. HireVue’s asynchronous video format optimises for recruiter convenience at the cost of candidate experience. That trade-off made sense when recruiter time was the binding constraint. It does not make sense now. A candidate who can have a real conversation at 9pm on a Tuesday will not, in the long run, accept a system that asks them to perform for a webcam at 9am on a Wednesday.

This is also why a wave of newer entrants has emerged in this space. Mercor on the AI-labs and contractor side. Paradox dominating frontline volume. Companies like Skillora running real-time voice interviews on LiveKit infrastructure with a focus on roleplay-style assessment and workforce-development use cases. Whether you are looking at an AI interviewer for technical screens, a chatbot for retail hiring, or an agent for sales nesting programs, the stack underneath is converging. WebRTC for transport. A real-time LLM for the conversation. A scoring layer trained on outcomes. The names on the front are different. The architecture is becoming the same.
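
The converging architecture can be sketched as three swappable layers. The sketch below is an illustration under stated assumptions, not any vendor's design: the `Transport`, `Conversation`, and `Scorer` interfaces and the stub classes are hypothetical names, and the stubs stand in for a WebRTC session, a real-time LLM, and an outcome-trained scoring model respectively.

```python
from dataclasses import dataclass, field
from typing import Protocol


class Transport(Protocol):
    def receive_utterances(self) -> list[str]: ...


class Conversation(Protocol):
    def reply(self, utterance: str) -> str: ...


class Scorer(Protocol):
    def score(self, transcript: list[tuple[str, str]]) -> float: ...


@dataclass
class StubTransport:
    """Stands in for a WebRTC audio session plus speech-to-text."""
    utterances: list[str] = field(default_factory=list)

    def receive_utterances(self) -> list[str]:
        return list(self.utterances)


class StubConversation:
    """Stands in for a real-time LLM; returns a canned follow-up."""

    def reply(self, utterance: str) -> str:
        return f"Can you say more about: {utterance}"


class StubScorer:
    """Stands in for an outcome-trained model; counts answered turns."""

    def score(self, transcript: list[tuple[str, str]]) -> float:
        answered = sum(1 for candidate, _ in transcript if candidate.strip())
        return answered / len(transcript) if transcript else 0.0


def run_interview(transport: Transport, conversation: Conversation,
                  scorer: Scorer) -> tuple[list[tuple[str, str]], float]:
    """Pair each candidate utterance with an agent reply, then score."""
    transcript = [(u, conversation.reply(u))
                  for u in transport.receive_utterances()]
    return transcript, scorer.score(transcript)


transcript, score = run_interview(
    StubTransport(["I led a support team", ""]),
    StubConversation(),
    StubScorer(),
)
print(len(transcript), score)
```

Because each layer is only an interface, a technical-screen product and a retail-hiring chatbot could in principle share `run_interview` and differ only in which implementations they plug in.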

The honest counter-argument

It would be incomplete to write this without naming what is genuinely hard about this category right now.

The candidate-experience risk is real. Tools trained on historical hiring data can mirror historical bias at scale, and Gartner’s 2026 talent acquisition research is clear that candidates expect transparency about AI use and a path to a human if they want one. The EU AI Act compliance window starts in August 2026 for in-scope deployments, which adds disclosure, audit trail, and human-handoff requirements that many vendors have not finished building. Anti-cheating in AI-conducted interviews is its own arms race, with everyone working on response-latency analysis, browser telemetry, and trap questions to detect candidates who are running another LLM in a side window. None of these are reasons not to deploy. They are reasons to deploy carefully and with the human-in-the-loop checkpoints that the legal teams will eventually require.

There is also a quality-of-evaluation question that does not have a clean answer yet. A voice AI can run a structured interview at scale, but whether it can evaluate the genuinely senior, judgment-heavy roles where the value of a great hire is highest is still open. Nobody is yet putting a voice agent in front of a CFO candidate. The category is winning at the volume layer first, and the executive layer will likely stay human-led for years.

What changes in 2026

Three predictions are worth making.

First, the asynchronous video interview will become a fallback option, not the default. By the end of 2026, the modal first-round screen for high-volume roles will be a real-time voice conversation, with the recorded transcript replacing the video as the primary artifact for review.

Second, the line between screening and interview prep will blur. The same voice infrastructure that conducts a screening call can also conduct a coaching session, a roleplay, or a sales nesting drill. Companies that started in one of those use cases are quietly entering the others, because the underlying tech is the same.

Third, the competitive moat is moving from interface to data. Whoever runs the most interviews accumulates the most outcome-linked training data, and that data, fed back into the models, produces better predictions of who will actually perform on the job. This is the Mercor thesis, and it is correct. In the long run, the company that has run a million conversations with candidates and a hundred thousand outcomes from hiring managers will out-rank a company with prettier UX and fewer reps.

The quiet part out loud

Voice AI did not eat HR tech with a single hero product. It ate the assumption underneath the entire category, which was that a human had to be the first conversation. Once that assumption falls, every product in the stack has to be rebuilt around the new constraint, and that is exactly what is happening across the incumbents and the new entrants.

The companies that will matter in 2027 are the ones building for a world where the first conversation is always with an AI, the second is always with a human, and the data flowing between the two is the actual product.

Everyone else is selling video tape.

Author

I am Erika Balla, a technology journalist and content specialist with over 5 years of experience covering advancements in AI, software development, and digital innovation. With a foundation in graphic design and a strong focus on research-driven writing, I create accurate, accessible, and engaging articles that break down complex technical concepts and highlight their real-world impact.
