
Not another chatbot. The journey from calculator to colleague

Nov 11, 2025 · 10 Mins

AI Transparency

Imagine hiring the brightest consultant in the world. This chap is capable of synthesising market reports in seconds, answering any technical question with precision, and summarising entire domains on demand. But there’s a catch: this consultant has amnesia. Every time you speak, they forget everything they knew before. No memory of your company, your goals, your last conversation. Each interaction starts from scratch.

This is how most enterprises are currently engaging with large language models (LLMs).

Despite the hype, today’s LLM usage remains largely stateless and transactional. An LLM is prompted, it responds, and the interaction ends. While this is sufficient for isolated tasks - generating content, translating language, writing code snippets - it falls short in the context of enterprise transformation, where complexity demands continuity, reasoning, and coordination.

The limits of stateless intelligence

This "prompt-in, response-out" pattern introduces several critical bottlenecks for enterprise use:

  • Lack of context continuity: Without persistent memory, models must be re-fed the same context repeatedly, which becomes both expensive and error-prone as complexity scales.

  • Manual orchestration: Human operators are forced to string together outputs, transforming themselves into workflow glue between disconnected model calls.

  • Shallow reasoning: Multi-step, conditional, or exploratory reasoning is difficult to achieve reliably without some form of planning and memory.

In essence, enterprises are wielding a Ferrari engine without a steering wheel or gearbox. There is raw power, but no coordination. No navigation.

The promise of cognitive workflows

The next evolution is already underway: agentic systems - LLMs wrapped in structure, memory, and purpose - capable of navigating multi-step tasks, adapting over time, and working in coordination with other tools and agents.

This shift is not just technical. It redefines how organisations work with AI. We’re moving from AI-as-a-tool to AI-as-a-colleague: an active participant in workflows, with the ability to remember, plan, and collaborate.

In this piece, we’ll unpack the technical progression from reactive tools to proactive systems, introduce a five-stage maturity model for enterprises on this journey, and offer practical steps for leaders aiming to operationalise agentic AI. Whether you’re an innovation lead exploring AI’s potential, a CTO building cognitive infrastructure, or a venture partner seeking differentiated advantage, this transition marks a turning point in enterprise capability.

The technical evolution: From tools to agents

As is my wont, it’s time to get a bit technical.

The initial wave of LLM adoption was driven by experimentation. We’ve been treating these models as smart tools capable of generating language, answering questions, or writing code on demand. But as their potential deepens, a growing awareness has emerged: true enterprise value lies not in using LLMs, but in integrating them into intelligent, evolving systems. This section explores the shift from stateless tools to stateful, reasoning-driven agents, and the infrastructure underpinning this transformation.

From stateless tools to stateful systems

At their core, most LLMs are stateless: they do not retain any information between calls. Every interaction must be re-initialised from scratch, often by stuffing the input prompt with a condensed version of the problem, background, and user intent. This is both inefficient and fragile. A single error in prompt construction can derail the entire output. One-shotting prompts is more luck than science. There’s pretty much always refinement required.

By contrast, stateful systems maintain context over time. They remember prior interactions, adapt based on user behaviour, and learn from new inputs. In practice, this enables:

  • Task continuity: The agent doesn't need to be reminded what it's working on.

  • Personalisation: Responses reflect long-term user preferences.

  • Workflow orchestration: Outputs from one task inform the inputs for the next.
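
To make the contrast concrete, here’s a minimal sketch of a stateful wrapper. The model call is a stub; any chat-completion client would slot in, and the names here are illustrative rather than from any framework.

```python
# Minimal sketch of the stateless-to-stateful shift (illustrative only).
class StatefulAssistant:
    """Carries conversation history across calls instead of starting from scratch."""

    def __init__(self, llm, system_prompt: str):
        self.llm = llm  # any callable: list of messages -> reply string
        self.history = [{"role": "system", "content": system_prompt}]

    def ask(self, user_message: str) -> str:
        # Prior turns travel with every request, so context accumulates.
        self.history.append({"role": "user", "content": user_message})
        reply = self.llm(self.history)
        self.history.append({"role": "assistant", "content": reply})
        return reply

# Usage with a stubbed model:
bot = StatefulAssistant(lambda msgs: f"(reply informed by {len(msgs)} messages)",
                        "You are a helpful analyst.")
bot.ask("Summarise our Q3 goals.")
bot.ask("How do they compare to Q2?")  # the earlier exchange comes along for free
```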

Architectures of memory: How agents "remember"

To move from tool to agent, memory is non-negotiable. Drawing from cognitive science, we can categorise memory into three core types:

  • Episodic memory: Logs past interactions and conversations. For agents, this could include chat history or previous tool outputs.

  • Semantic memory: Encodes factual knowledge, concepts, and world understanding. This is where vector databases and knowledge graphs come in, offering structured and retrievable stores of meaning.

  • Procedural memory: Captures how to do things. In agentic systems, this is expressed as plans, workflows, and action sequences, often hooked into the enterprise’s ontology.

Building agents requires combining all three so they can reason across past events, domain-specific knowledge, and repeatable skills.
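
As an illustration only, the three types can be represented as a simple structure; the names and shapes below are ours, not a standard:

```python
from dataclasses import dataclass, field

@dataclass
class AgentMemory:
    # Episodic: a log of past interactions and tool outputs.
    episodic: list[str] = field(default_factory=list)
    # Semantic: retrievable facts; in production, a vector DB or knowledge graph.
    semantic: dict[str, str] = field(default_factory=dict)
    # Procedural: named, repeatable action sequences.
    procedural: dict[str, list[str]] = field(default_factory=dict)

memory = AgentMemory()
memory.episodic.append("User asked for the Q3 supplier risk summary.")
memory.semantic["SupplierX"] = "Tier-1 supplier, last audited 2024."
memory.procedural["esg_audit"] = ["gather reports", "score risks", "draft summary"]
```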

Reasoning chains and planning mechanisms

One of the biggest limitations of vanilla LLM use is that models are reactive, not reflective. They generate a response based on the prompt, but don’t internally plan, reflect, or revise. Agents change that. The research capabilities delivered by larger players like Anthropic and OpenAI aren’t bare LLMs, but agents built on top of LLMs.

Through mechanisms like Chain-of-Thought (CoT) reasoning and Tree-of-Thoughts (ToT), agents simulate cognitive planning: decomposing problems into sub-tasks, evaluating intermediate steps, and adjusting strategy dynamically.

Planning agents often rely on recurrent planning loops: observe the current state, generate the next action, update memory, repeat. In more advanced systems, planners can even invoke sub-agents: specialist modules that handle search, summarisation, or validation, enabling modular and scalable workflows.
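
A bare-bones version of that observe-act-update loop, with the planner and tool execution stubbed out (the function names are illustrative, not drawn from any framework):

```python
def plan_next_action(state: dict) -> str:
    # Placeholder: in a real agent, an LLM reads the goal plus memory
    # and proposes the next step (or signals completion).
    return "DONE" if state["memory"] else "search: supplier ESG reports"

def execute(action: str) -> str:
    # Placeholder tool execution: search, API call, sub-agent, etc.
    return f"result of {action!r}"

def run_agent(goal: str, max_steps: int = 10) -> list[dict]:
    memory: list[dict] = []
    for _ in range(max_steps):                                       # observe
        action = plan_next_action({"goal": goal, "memory": memory})  # plan
        if action == "DONE":
            break
        memory.append({"action": action,
                       "observation": execute(action)})              # act, update
    return memory

print(run_agent("Audit our supplier ESG risk exposure"))
```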

Technical deep-dive: The stack behind agentic workflows

Let’s walk through the typical evolution of an AI system architecture in this context:

1. RAG (Retrieval-Augmented Generation)

RAG pairs a stateless LLM with a retrieval system, such as a vector database (e.g. Chroma, Qdrant, Weaviate, Pinecone, LanceDB). Instead of overloading the prompt, relevant context is retrieved dynamically using similarity search.
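
As a sketch of the pattern, here’s retrieval with Chroma’s default in-memory client; the documents are toy data and the final model call is left as a placeholder:

```python
import chromadb

client = chromadb.Client()  # in-memory instance with the default embedding function
docs = client.create_collection("enterprise_docs")
docs.add(
    ids=["policy-1", "policy-2"],
    documents=[
        "Suppliers must complete an ESG questionnaire annually.",
        "Tier-1 suppliers are audited on site every two years.",
    ],
)

question = "How often are tier-1 suppliers audited?"
hits = docs.query(query_texts=[question], n_results=2)  # similarity search
context = "\n".join(hits["documents"][0])

prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
# answer = llm(prompt)  # hand the grounded prompt to any chat model
```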

🟢 Benefit: Expanded context without hitting token limits

🔴 Limitation: Still lacks memory or multi-step reasoning

2. Multi-Agent Orchestration

In more advanced setups, multi-agent systems divide labour among specialist agents:

  • A planner agent decomposes the task

  • A retriever agent fetches data

  • A validator agent checks quality

  • A tool-use agent interacts with APIs or systems

These agents coordinate via an orchestration layer - often a framework like LangChain, CrewAI, or DeepAgents - that handles task routing, state management, and agent collaboration.
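
A framework-agnostic sketch of that division of labour, with each agent reduced to a stub function (real systems would back each with model calls and an orchestration framework):

```python
def planner(task: str) -> list[str]:
    # Decomposes the task into steps.
    return [f"retrieve data for: {task}", f"summarise findings for: {task}"]

def retriever(step: str) -> str:
    # Fetches whatever each step needs.
    return f"data for {step!r}"

def validator(result: str) -> bool:
    # Gates what gets kept; a real validator might be another model call.
    return bool(result.strip())

def orchestrate(task: str) -> list[str]:
    results = []
    for step in planner(task):
        data = retriever(step)
        if validator(data):
            results.append(data)
    return results

print(orchestrate("supplier ESG risk exposure"))
```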

🟢 Benefit: Scalable, modular systems

🔴 Limitation: Requires robust architecture and tuning

3. Autonomous Workflows

The endgame is fully autonomous cognitive workflows, where agents operate continuously in service of a goal:

  • Input: High-level objective (e.g. “Audit our supplier ESG risk exposure”)

  • Output: Complete, validated report with citations and next steps

  • Behaviour: Persistent memory, reasoning across time, cross-agent communication, dynamic replanning

These systems blur the line between “process automation” and “cognitive collaboration”.

The rise of semantic infrastructure

To support this evolution, a new kind of backend is emerging: semantic infrastructure. This includes:

  • Vector databases (e.g. Chroma, Qdrant) for similarity-based recall

  • Knowledge graphs (e.g. Neo4j, Stardog, RDF stores) for structured reasoning and relation mapping

  • Ontologies for domain-specific logic and constraints (e.g. OWL, SHACL, Palantir’s Ontology, or proprietary structures)

Together, they enable reasoning over structured and unstructured data, connect the dots between isolated facts, and make agents not just reactive but truly knowledgeable.
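
As a toy illustration of that pairing, the snippet below uses networkx as a stand-in for a knowledge graph: a vector store would surface an entity from an unstructured query, and the graph then supplies the typed relations around it:

```python
import networkx as nx

graph = nx.DiGraph()
graph.add_edge("AcmeCorp", "SupplierX", relation="sources_from")
graph.add_edge("SupplierX", "Region-A", relation="operates_in")

def related_entities(entity: str) -> list[tuple[str, str]]:
    # Structured hop: follow typed relations outward from a retrieved entity.
    return [(n, graph.edges[entity, n]["relation"]) for n in graph.successors(entity)]

# Similarity search finds "SupplierX"; the graph adds connected, typed facts.
print(related_entities("SupplierX"))  # [('Region-A', 'operates_in')]
```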


The transition from stateless tools to agentic systems is not just a matter of improving outputs. It’s a re-architecture of how intelligence is embedded in enterprise systems. With memory, planning, tool use, and semantic understanding, agents become more than model wrappers. They become co-workers, orchestrators, and decision participants.

In the next section, we’ll introduce a maturity model to help enterprises understand where they sit on this journey. We can then really start to see how to move from AI-as-a-tool to AI-as-a-colleague.

The enterprise AI maturity model

| Stage | Capability | Key Tech | ROI Potential |
| --- | --- | --- | --- |
| 1. Calculator | Prompt-only | LLM APIs | Individual productivity |
| 2. Assistant | Context-aware | RAG, vector DBs | Faster research, lower support costs |
| 3. Analyst | Reasoning & tools | Function calling, planning | Automated insight & analysis |
| 4. Collaborator | Memory & adaptation | Agent memory, identity | Workflow augmentation |
| 5. Colleague | Autonomy & goal pursuit | Orchestration + semantics | Enterprise-level transformation |

Most enterprises are still in the early stages of working with AI. They’ve experimented with LLMs, perhaps implemented a few automations, and are wondering: Where do we go from here? This maturity model offers a structured path forward.

It outlines five progressive stages of evolution, each representing a shift in how AI is deployed, integrated, and how it ultimately contributes to business value.

1. AI as a Calculator: Basic Prompt Engineering

At this stage, LLMs are used in isolation, like a more sophisticated command-line interface. Use cases tend to be tactical and disconnected: writing content, summarising text, generating code snippets, or translating languages.

There is no memory, no orchestration, no understanding of process or continuity. These models are reactive, not reflective.
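
In code terms, Stage 1 is a single stateless call. A sketch using the OpenAI Python SDK (the model name is illustrative, and nothing persists between calls):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative choice
    messages=[{"role": "user", "content": "Summarise this quarter's sales notes."}],
)
print(response.choices[0].message.content)
# The next call starts from zero: no memory of this one.
```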

Technical Requirements

  • API access to a foundation model (e.g. OpenAI, Anthropic)

  • Basic UI wrapper or IDE integration

  • Prompt libraries or playgrounds

Organisational Readiness Indicators

  • Central innovation team or tech enthusiasts piloting use cases

  • No formal governance or training

  • Little cross-team coordination

ROI Expectations

  • Quick wins

  • Productivity boosts at the individual level

  • Low infrastructure costs, but limited strategic impact

🟠 Risk: Easy to stall here, treating AI as a novelty rather than a core capability

2. AI as an Assistant: Retrieval-Augmented Generation (RAG) and Context Injection

Capabilities and Limitations

LLMs are now paired with contextual data, typically through vector databases or document repositories, allowing systems to bring in domain-specific knowledge on demand and expanding the LLM’s “working memory”.

Use cases include customer support assistants, internal knowledge bots, or legal document Q&A.

However, the systems are still stateless and shallowly integrated. Memory resets with each interaction, and reasoning is limited to what can be retrieved and synthesised in a single step.

Technical Requirements

  • Vector database (e.g. Pinecone, Weaviate)

  • RAG framework (LangChain, LlamaIndex)

  • Content ingestion pipelines

  • Prompt templating and chunking logic
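
As an illustration of that last item, a minimal chunking function using fixed-size windows with overlap, so retrieval doesn’t lose context at chunk boundaries (sizes are illustrative):

```python
def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    # Slide a window of `size` characters, stepping back `overlap` each time.
    step = size - overlap
    return [text[start:start + size] for start in range(0, len(text), step)]

chunks = chunk_text("... a long policy document ...", size=40, overlap=10)
```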

Organisational Readiness Indicators

  • Early AI platform team forming

  • Internal security/legal reviews starting

  • Fragmented experimentation across business units

ROI Expectations

  • Reduced support burden

  • Accelerated research and discovery

  • Visible user-facing enhancements

🟡 Note: Retrieval improves accuracy and relevance, but not true reasoning

3. AI as an Analyst: Multi-step Reasoning with Tool Use

Capabilities and Limitations

AI is now capable of chaining thoughts and invoking tools. Here we’re starting to see our systems moving from answers to analysis. Systems can plan multi-step actions, perform calculations, query APIs, and explore solution spaces.

Agents emerge here as planners and operators, working in loops: plan → act → observe → replan.
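
As a sketch of the tool-use half of that loop: the tool definition below follows the OpenAI function-calling schema, while the tool itself and the dispatcher are illustrative stand-ins:

```python
import json

def get_fx_rate(currency: str) -> float:
    return 1.27 if currency == "GBP" else 1.0  # stubbed data source

TOOLS = [{  # schema the model sees, in OpenAI function-calling format
    "type": "function",
    "function": {
        "name": "get_fx_rate",
        "description": "Return the USD exchange rate for a currency code.",
        "parameters": {
            "type": "object",
            "properties": {"currency": {"type": "string"}},
            "required": ["currency"],
        },
    },
}]

def dispatch(tool_call: dict) -> str:
    # The model emits a tool name plus JSON arguments; we execute the matching
    # function and feed the observation back into the plan-act-observe loop.
    args = json.loads(tool_call["arguments"])
    return str({"get_fx_rate": get_fx_rate}[tool_call["name"]](**args))

print(dispatch({"name": "get_fx_rate", "arguments": '{"currency": "GBP"}'}))
```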

Still, orchestration is often manually configured, and memory is ad hoc or brittle. Scaling remains a challenge due to a lack of modularity and governance. FinOps takes a front seat as token usage becomes an increasing concern.

Technical Requirements

  • Tool-use frameworks (e.g. OpenAI function calling, CrewAI, LangGraph, n8n)

  • Planning logic (e.g. ReAct, CoT, ToT)

  • Tool wrappers and API bridges

  • Structured logging and observability

Organisational Readiness Indicators

  • Dedicated budget for AI workflows

  • Engineers or analysts building agent pipelines

  • First experiments in replacing human-in-the-loop research tasks

ROI Expectations

  • Cost reduction in knowledge work

  • Increased decision accuracy

  • Insight generation at scale

🟢 Enabler: Cross-functional teams can now build and reason with domain knowledge

4. AI as a Collaborator: Persistent Agents with Memory

Capabilities and Limitations

Agents now persist over time, maintaining episodic memory, updating semantic understanding, and learning from interactions. They’re capable of reasoning across sessions, adapting to users, and tracking task state.

This unlocks personalisation, long-horizon workflows, and multi-agent collaboration.

Technical complexity increases significantly: you need storage and recall strategies, memory pruning, identity handling, and lifecycle management.
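
As one possible recall-and-prune strategy, here’s a capped episodic log per agent, sketched with redis-py (key names and the cap are illustrative):

```python
import redis

r = redis.Redis(decode_responses=True)

def remember(agent_id: str, event: str, cap: int = 100) -> None:
    key = f"agent:{agent_id}:episodic"
    r.lpush(key, event)        # newest events first
    r.ltrim(key, 0, cap - 1)   # prune: keep only the most recent `cap` events

def recall(agent_id: str, n: int = 5) -> list[str]:
    return r.lrange(f"agent:{agent_id}:episodic", 0, n - 1)
```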

Technical Requirements

  • Memory layers (e.g. Redis, Milvus, or hybrid graph+vector stores)

  • Agent identity resolution

  • Ontological structures or semantic schemas

  • Agent lifecycle/state management

Organisational Readiness Indicators

  • Formal agent governance and design roles

  • AI ethics, compliance, and observability policies in place

  • Integration into core systems and workflows

ROI Expectations

  • Reduced human overhead for repetitive tasks

  • AI becomes a team augmentation layer

  • Step-change in workflow efficiency

🔵 Turning point: From “AI helping humans” to “humans and AI working together”

5. AI as a Colleague: Autonomous Cognitive Workflows

Capabilities and Limitations

This is the frontier: autonomous agents coordinating with each other, with systems, and with employees to pursue objectives, not just answer queries.

Agents are now embedded within business processes. They can plan complex projects, react to dynamic inputs, invoke and supervise other agents, and escalate only when needed. Systems reflect intentionality, goal pursuit, and adaptive behaviour.
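
A toy sketch of “escalate only when needed”: an autonomy guard that executes within policy and routes edge cases to a human. The threshold and the policy rule are illustrative:

```python
def act_or_escalate(action: dict, confidence: float, threshold: float = 0.8) -> dict:
    within_policy = action.get("spend", 0) <= 10_000  # example governance rule
    if confidence >= threshold and within_policy:
        return {"status": "executed", "action": action}
    # Outside policy or below confidence: hand off with full context.
    return {"status": "escalated_to_human", "action": action, "confidence": confidence}

print(act_or_escalate({"type": "renew_contract", "spend": 25_000}, confidence=0.95))
# -> escalated: the spend exceeds the agent's autonomous authority
```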

Limitations now shift from technical to cultural: trust, control, governance, and collaboration norms.

Technical Requirements

  • Orchestration layer with autonomous planning (e.g. LangGraph, DeepAgents, DSPy)

  • Semantic reasoners and ontologies for grounded logic

  • Real-time memory sync, logging, audit trails

  • Secure toolchains and enterprise data integration

Organisational Readiness Indicators

  • AI embedded in enterprise architecture strategy

  • Change management frameworks for AI-led operations

  • Incentives aligned around AI co-ownership and human-AI hybrid teams

ROI Expectations

  • Autonomous cost centres for process execution

  • Competitive advantage through faster decision loops

  • Platform-level efficiencies across departments

🟣 Strategic imperative: Enterprises at this stage define the frontier rather than follow it

Future horizons: What’s next

We are at the inflection point between useful AI and transformative AI. As agentic systems evolve, new architectural patterns are beginning to emerge: systems that blend reasoning, memory, and planning with ever-deeper integration into the enterprise fabric.

Cognitive architectures are going multimodal and multimind

Next-generation agentic systems are increasingly:

  • Multimodal: Incorporating vision, speech, and structured data to reason across input types, not just text.

  • Multimind: Coordinating multiple specialist agents (retrievers, validators, planners, actuators) into decentralised teams, akin to human departments working toward a common goal.

Frameworks like LangGraph, DSPy, and DeepAgents are pioneering the idea of composable cognition, where plans, goals, tools, and context are stitched together dynamically rather than hardcoded. This speaks strongly to my roots in the composable enterprise arena. These architectures allow for adaptable behaviour and continuous learning rather than rigid flows. What we saw in the advent and maturation of the MACH ecosystem, we’re seeing again in the agentic world.

Symbolic + neural = structured reasoning at scale

The once-parallel paths of symbolic AI (rules, logic, ontologies) and neural AI (LLMs, embeddings) are now converging. Symbolic approaches offer rigour, traceability, and compliance. Neural models bring language fluency and generalisation.

Together, they enable:

  • Grounded reasoning that adheres to business logic and constraints

  • Compliance-aware automation using structured domain knowledge

  • Explainable outputs with contextual justification

Enterprises that can bridge these modalities - via semantic layers, knowledge graphs, and ontologies - will unlock AI systems that reason like analysts, act like experts, and learn like colleagues. Newer, more capable frontier models are making this easier and more viable than ever before.
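
A toy version of the pattern: the neural side drafts a claim, and a symbolic rule layer (a stand-in for ontology- or SHACL-style validation) accepts it or sends it back for revision:

```python
def neural_draft(prompt: str) -> dict:
    # Stubbed LLM output: fluent, but not guaranteed to respect domain rules.
    return {"supplier": "SupplierX", "risk_rating": "low", "audited": False}

def symbolic_check(claim: dict) -> list[str]:
    violations = []
    if claim["risk_rating"] == "low" and not claim["audited"]:
        violations.append("Rule: a 'low' risk rating requires a completed audit.")
    return violations

draft = neural_draft("Rate SupplierX's ESG risk.")
print(symbolic_check(draft))  # violations are fed back for a revised draft
```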

Strategic advantage is shifting

The new frontier is no longer access to models. It’s the ability to build agentic infrastructure that is:

  • Secure by design

  • Contextually grounded

  • Observable

  • Able to learn and reason in enterprise-specific domains

Enterprises that succeed here will not just improve efficiency. They will reshape their operating model. AI will cease to be a support function and become a core capability, embedded in strategy, delivery, and governance.

In short, the winners of the next decade will not be those who “adopt AI,” but those who re-architect their organisations to think with it.

From pilot to platform

The shift to agentic AI is not a technical upgrade. Similar to the pivot from monolith to MACH, it’s a strategic transformation. Whether you’re at the beginning or already piloting multi-agent systems, now is the moment to ask:

Key questions for leadership teams

  • Are we building tools or enabling intelligence?

  • Do we treat AI as a project, or as a platform for transformation?

  • Where in our organisation would persistent, reasoning agents deliver the highest leverage?

First steps by maturity level

  • Calculator/Assistant: Build a central knowledge infrastructure (vector DBs, document pipelines). Begin capturing usage data to identify repeatable patterns.

  • Analyst: Introduce tool use and reasoning chains. Establish observability for agent workflows and model quality.

  • Collaborator/Colleague: Formalise memory, governance, and agent lifecycles. Align incentives and metrics to support human–AI collaboration.

Why act now?

The window for strategic differentiation is shrinking. Foundation models are commoditising. What will remain defensible is how you orchestrate them: the workflows, the knowledge, the decisions.

You don’t need to build everything at once. But you do need to start with intent. You need to design for the organisation you want to become.


Are you ready to shape the future enterprise?

Get in touch, and let's talk about what's next.

