Volha Dashkevich's Viewpoint

Jan 12, 2026

"AI agent disillusionment comes from mistaking POC's for products. Value comes from enterprise solutions backed by real QA, among other things. Evals is test automation for agents, getting them right is critical"

Source: Demystifying evals for AI agents

Source Summary

This article explains how to design, build, and maintain automated evaluations (evals) for AI agents. It defines eval components (tasks, trials, graders, transcripts, outcomes, harnesses), compares grader types (code-based, model-based, human), and recommends practices for capability vs. regression suites. The authors give concrete guidance for evaluating coding, conversational, research, and computer-use agents, discuss handling non-determinism (pass@k and pass^k), and provide a pragmatic roadmap for starting and scaling evals. The piece emphasizes reading transcripts, preventing brittle graders, monitoring for eval saturation, and integrating automated evals with production monitoring, A/B testing, and human review.
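To make the pass@k and pass^k metrics mentioned above more concrete, here is a minimal sketch in Python. The summary only names the two metrics, so the reading used here is an assumption: pass@k is taken as the chance that at least one of k trials succeeds, pass^k as the chance that all k trials succeed, and trials are treated as independent with a fixed per-trial success rate. The function names and the sample trial results are hypothetical.

```python
from typing import Sequence


def success_rate(trial_results: Sequence[bool]) -> float:
    """Fraction of trials that passed for a single task."""
    if not trial_results:
        raise ValueError("need at least one trial")
    return sum(trial_results) / len(trial_results)


def pass_at_k(p: float, k: int) -> float:
    """Assumed reading of pass@k: at least one of k independent trials succeeds."""
    return 1.0 - (1.0 - p) ** k


def pass_pow_k(p: float, k: int) -> float:
    """Assumed reading of pass^k: all k independent trials succeed."""
    return p ** k


# Hypothetical graded outcomes for one task: three passes, one failure.
trials = [True, True, False, True]
p = success_rate(trials)  # 0.75

print(f"per-trial success rate: {p:.3f}")
print(f"pass@3: {pass_at_k(p, 3):.3f}")   # 0.984
print(f"pass^3: {pass_pow_k(p, 3):.3f}")  # 0.422
```

Under these assumptions the two numbers diverge as k grows: pass@k climbs toward 1, while pass^k falls toward 0. The same agent can therefore look strong when one success in several attempts is enough and weak when every attempt has to succeed.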

Topics

People-First

Thought Leadership

Leadership

_Related thinking

In the Press

Agentic Systems

Trust

Image description - A light painting image representing the Meta logo

Tarek discusses the impact and lessons from Meta's AI agent leaking sensitive data to employees

Mar 20, 2026

2 Mins

Ontologies

Image description - An AI generated image of extruded frosted glass colour blocks representing the semantic layers of data and the connections between them

The State of Enterprise Semantic Layers: A 2026 Market Overview

Dom Selvon

Dec 15, 2025

15 Mins

AI in Practice Survey ...

Agentic Systems

Ronan Forker

Nov 24, 2025

"This small survey demonstrates that leveraging synthetic data and running evals is moving from niche to mainstream."

Trust

Image description - An AI generated image of a digital illustration depicting a hand shaking hands with digital data particles.

Why trust is the real barrier to AI value

Rad Parvin

Nov 17, 2025

2 Mins

Disrupting the first reported AI-orchestrated cyber espionage campaign...

Trust

Ronan Forker

Nov 14, 2025

"Autonomous AI changes the threat landscape. To stay safe, organisations must tighten access, log every AI action, and use AI defensively to match the speed of emerging attacks."

People First

Image description - An AI generated image of the world showing regional data variations across the globe conceptually representing cultural distribution of which humans have raised our LLMs.

Which humans raised our LLMs? The built-in bias the C‑Suite needs to know about.

Nov 12, 2025

10 Mins
