AI Leadership & Perspective

The Problem with Leadership Metrics Is Not the Metrics

By James Nash, Founder of inBeta

The issue is not which metric. It is when the metric is used, and over what period of evidence. 

Picture a senior executive who has just been appointed to their next role. The contract is signed. Months of transition lie ahead, paid time between roles when preparation should be possible but rarely happens. Yet the executive has had no serious development work in years, has few coaching relationships, and holds only a gut-feel plan for the first hundred days.

The hiring organisation treated this individual as highly valuable on the day the contract was signed and forgot about them the moment after. I have sat across enough of these transactions to say with confidence that this is a pattern, not a one-off. It is also where the current debate about impressive-looking yet often superficial “vanity metrics” in leadership analytics actually begins.

No leadership claim survives without a record 

For the past six years, I have been building inBeta to challenge the conditions that lead to that kind of executive being overlooked, both before a role is filled and after. The first few years were incubation and live partnerships with FTSE 100, Fortune 500, and international private and listed businesses, testing how executives are identified, defined, and developed.  

Dozens of structured case studies, combined with more than 15 years at the executive search frontline, have taught me a simple lesson: no leadership claim is defensible without a record behind it, and no record is defensible without a system holding it.

The ledger discipline came first; the platform came later. Today, that system holds 2.5 million verifiable leadership data points. Those years taught me something I did not expect.

The vanity metrics debate starts in the wrong place (and at the wrong time) 

These are the decisions that shape what an organisation becomes. Succession. Appointment. Renewal – all made with names on the table and careers in the balance. 

The failure in the picture above and the failure the vanity-metrics critique is trying to name are the same failure, even though the critique points to the wrong cause. The problem is not what is being measured. Rather, the problem is when, and over what period of evidence. 

When visibility masquerades as evidence 

In too many senior appointments, the organisation only starts looking the day a vacancy opens and stops the day a contract is signed. That is the window. Everything before it is marketing. Everything after is HR. Inside the window, the tools the market reaches for are not really measuring capability at all. They are finding people who have already been found – visibility is doing the work, dressed as evidence. 

The vanity-metrics critique calls these tools superficial, which is fair as far as it goes. It just gets the cause the wrong way round. A credential halo is the footprint of visibility. A CV filter is a bet on last year’s patterns. Constructs such as “strategic gravitas” fill the space where proper evidence was never gathered in the first place. But what is failing the board is not the metric – it is the thin record behind it. 

What assessment tools can and cannot define and defend 

I am a certified practitioner in FIRO-B, Cultural Intelligence, Hogan, and Gallup’s CliftonStrengths, and I have used all four in live senior hiring work. They do not sit in the same place in a decision.  

Hogan, for instance, can earn its keep at shortlist, inside a role-specific assessment design. Gallup is explicit that CliftonStrengths is not a selection tool; it’s developmental by design, not predictive of role success and not intended for comparing candidates.  

FIRO-B and Cultural Intelligence sit in greyer territory: useful in the role, but strained pre-hire. The pattern held across the incubation project. The variable that mattered most was not in any of the tools themselves; which tool was used proved immaterial. What really mattered was the quality, detail, and depth of the record around them.

Why timing supersedes tool quality 

I watched one CFO arrive at a new role four months after signing her contract, still rebuilding her narrative the week she started. Not because she had not prepared, but because no one in the hiring organisation had treated the window before her start as part of the job. That pattern, repeated across sectors, is what made us invest in the window itself. 

Used at the wrong stage of the lifecycle, even a strong tool becomes window dressing for rigour. A predictive assessment run as a first-pass filter is not a decision record. It is a narrowing choice, and it can remove candidates whose role-relevant evidence has not yet entered the record that the tool is able to see. The tool is not wrong. Rather, the lifecycle position is.

There is a serious defence of early filtering, and it deserves to be put at its best. A governed tool can be more consistent than human screening. It can expand access where the alternatives are private networks and pattern recognition. But only when it extends the record. If it narrows the field before role-relevant evidence has entered the record, it has not solved the visibility problem; it has automated it. 

Garden leave is the market’s confession. The market declares an executive valuable enough to wait months for, and then it stops gathering evidence about them. This is a window it has paid to create and then ignored. 

The governance test is now landing 

Regulation is not going to solve the leadership science problem, but it is doing something narrower and more useful. It is making it harder to leave compressed, automated judgment undocumented. 

Indeed, on 31 March 2026, the Information Commissioner’s Office published recruitment findings and opened consultation on updated guidance for automated decision-making, including profiling. The UK framework now places greater emphasis on automated decisions made without meaningful human involvement, where legal or similarly significant effects may follow. Sixteen employers have already been written to.  

The EU AI Act adds a second pressure point. Employment-related AI systems, including tools used to analyse and filter job applications or evaluate candidates, are listed within the high-risk Annex III regime, subject to scope and classification rules. The timing is live: the enacted Act points to the 2026 application framework, while the Commission’s Digital Omnibus proposal links the application of high-risk obligations to the availability of support tools and sets 2 December 2027 as the latest application date for Annex III systems if the proposal proceeds.

For relevant high-risk operator obligations, fines can reach €15 million or 3% of global turnover, and the reach can extend to non-EU providers or deployers where the output is used in the Union. 

The regulatory examples start in recruitment, but the governance lesson travels upward: the more consequential the leadership decision, the less defensible it becomes to rely on evidence the board cannot inspect.

What a defensible leadership record looks like 

A board should ask for a plain record. The role claim. The evidence. The gap. The refusal to answer. That record should sit inside a system that maintains source provenance against every claim, publishes its abstention conditions as part of the record rather than tucking them into a footnote, separates audited outcomes from self-declared signals using explicit tier rules, treats readiness as role-conditional rather than a portable label, and carries an owner for the window between contract and arrival. Otherwise, the system stops at the exact moment a new record should begin. 
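For readers who think in data structures, the record described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of any real platform: every name here (`Tier`, `Evidence`, `RoleClaim`, `transition_owner`) is hypothetical, and the abstention rule (abstain when no audited evidence exists or when a piece of evidence lacks a source) is one possible policy chosen for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Tier(Enum):
    """Explicit tier rule: audited outcomes are kept apart from self-declared signals."""
    AUDITED = "audited"
    SELF_DECLARED = "self_declared"


@dataclass(frozen=True)
class Evidence:
    statement: str
    source: str  # provenance: where this piece of evidence came from
    tier: Tier


@dataclass
class RoleClaim:
    executive: str
    role: str   # readiness is recorded against a specific role, not as a portable label
    claim: str
    evidence: List[Evidence] = field(default_factory=list)
    gaps: List[str] = field(default_factory=list)  # what is still unknown
    transition_owner: Optional[str] = None  # who owns the window between contract and arrival

    def abstention_reasons(self) -> List[str]:
        """Return the published conditions under which the record refuses to answer.

        Illustrative policy only (an assumption made for this sketch).
        """
        reasons = []
        if not any(e.tier is Tier.AUDITED for e in self.evidence):
            reasons.append("no audited evidence for this role")
        if any(not e.source.strip() for e in self.evidence):
            reasons.append("evidence without source provenance")
        return reasons

    def verdict(self) -> str:
        """Either a bounded, role-specific claim or an explicit abstention."""
        reasons = self.abstention_reasons()
        if reasons:
            return "ABSTAIN: " + "; ".join(reasons)
        return f"SUPPORTED for {self.role}: {self.claim}"


# Hypothetical example: a claim backed only by a self-declared signal.
record = RoleClaim(
    executive="Executive A",
    role="Group CFO",
    claim="Ready to lead a post-merger finance integration",
    gaps=["No evidence yet from a listed-company environment"],
)
record.evidence.append(
    Evidence("Led integration of two finance teams", "interview self-report", Tier.SELF_DECLARED)
)
print(record.verdict())
```

The design choice worth noticing is that abstention is a first-class output of the record rather than a silent failure: the refusal to answer is returned in the same place a supported claim would be, so a board can see that the system declined and why.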

Three questions carry that standard into any boardroom. 

  1. Where does the claim stop? Will you publish it? 
  2. When does the system abstain? How will we see that it did? 
  3. Can you show the evidence, from source to claim, and will it survive scrutiny by legal counsel twelve months later?

If the answers are vague, the tool is not ready. If the discipline holds, the executive on extended garden leave gets seen differently. Not at the moment the vacancy opens, but earlier, across a longer record, through evidence of what they have actually done in role, corroborated, bounded, and honest about what is still unknown. And still seen after the contract is signed, because the leader on garden leave is not outside the leadership system. They are proof of where the system stops looking. 

Leaders are not properly seen at the moment the job opens. They are seen across a record. Used as early-stage filters, current predictive tools can leave too many leaders outside that record before the board has properly looked. 

The vanity-metrics critique is not wrong; it is just not the complete diagnosis. The fix is a longer record and a governed claim behind it. Not less judgment. A better record. And the work, finally, of seeing leaders before the vacancy opens, before the filter narrows, and before the market decides it is ready to look. 
