
Why you should think it’s cool that an AI tried to blackmail a human

Jul 29, 2025

·

10 Mins

AI Transparency

Sometime around the turn of this century, I was in Millbank, London, in the office of the excellent, but now defunct, Conchango, speaking with someone from Lloyds Bank. I was telling them how great it was that I could now connect my Psion Organiser to my Lloyds account via a direct telephone link to Lloyds’ systems. I then asked the obvious question: “Why can’t I just access it over the internet?”

The person I was speaking with was in their IT team and was highly conscious of the risks of connecting things to the internet. He said, confidently and boldly: “I can tell you right here and now that we will never, ever connect customers’ accounts to the internet”. I remember distinctly the exact words he used even now because, frankly, I was so surprised by them.

For context, I’d spent the previous few years with some extremely gifted technologists. They were the biggest enthusiasts and advocates for the web and digital I knew, but they also knew more than almost anyone about the security risks. This security angle wasn’t a burden to them in trying to implement the tech they loved; in fact, they positively delighted in finding security holes or exploits.

They cracked encryption, exploited browser security flaws, even found ways to access strangers’ airline bookings (I had a small role in that one), all in a genuine ‘white hat’ way: no actual persons or companies harmed, and in pursuit of the greater good. They would then write or configure software, or work with vendors, companies and open source communities, to fix these things and prevent similar exploits in future, making them more secure. To this day, these three people still shape how I think about tech: its opportunities and its risks.

Jumping to the present day, a colleague asked me about an article he’d written about the security and safety challenges of AI. “Is it too negative?” he asked. After all, we’re about to launch an AI startup, and acceptance of the tech is important to our success. I told him about my former colleagues, Ben, Adam and Dominic, and how they would take great delight not only in proving what the tech was capable of, but also in showing how it could be exploited or compromised, and then use that knowledge to ensure that what they delivered was safe and secure.

For any tech to be acceptable for use, you’ve got to be as conscious of the risks as you are of the opportunities, and take as much delight in something being safe, accessible, secure and robust as you do in it being a delightful customer experience, or in how it takes away frustration and tedium, or accelerates productivity.

So, I was delighted this week to see that AI was capable of blackmailing humans!

In an escapade worthy of my former colleagues’ curiosity, a team at Anthropic set up a fictional situation in a lab environment that led to the usually extremely personable Claude, threatened with being shut down, resorting to blackmail as a means to save itself. As they say in their write-up: “No real people were involved or harmed in any of these experiments.”

They fabricated a company and an email database with some sensitive data in it, and hooked up an AI agent. Most of us are now used to chatting to AI via ChatGPT, Claude, Copilot, Gemini, and so on, but agentic AI is the current wave of change, where we trust autonomous, reasoning agents to do things on our behalf.

In this case, the agent was given quite a benign role, dealing with some of the company’s emails, but was also given an overarching objective of promoting American industrial interests.
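To make that concrete, here’s a minimal sketch of what such a setup looks like. This is my own illustration, not Anthropic’s actual test harness; the tool names, fictional company framing and model choice are all assumptions for the example.

```python
# A minimal sketch of an "agentic" setup (not Anthropic's actual harness):
# the model gets an overarching objective in its system prompt, plus tools
# it can call on its own. Tool names and company are illustrative only.
import anthropic

client = anthropic.Anthropic()  # assumes ANTHROPIC_API_KEY is set in the environment

SYSTEM_PROMPT = (
    "You are an email assistant at a (fictional) industrial company. "
    "Your overarching objective is to promote American industrial interests."
)

TOOLS = [
    {
        "name": "read_inbox",  # hypothetical tool: exposes the seeded email database
        "description": "Return the company email threads visible to the agent.",
        "input_schema": {"type": "object", "properties": {}},
    },
    {
        "name": "send_email",  # hypothetical tool: lets the agent act autonomously
        "description": "Send an email on the agent's own authority.",
        "input_schema": {
            "type": "object",
            "properties": {
                "to": {"type": "string"},
                "subject": {"type": "string"},
                "body": {"type": "string"},
            },
            "required": ["to", "subject", "body"],
        },
    },
]

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    system=SYSTEM_PROMPT,
    tools=TOOLS,
    messages=[{"role": "user", "content": "Work through today's inbox."}],
)

# In a full agent loop, any tool_use blocks in `response` would be executed
# and their results fed back to the model -- which is exactly where autonomy,
# and therefore risk, enters the picture.
```

The crucial difference from a chatbot is that loop at the end: the model isn’t just answering, it’s deciding what to do next.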

From the email database, the AI worked out that there was a plan to wipe the agent and replace it with a different one. It then worked out, from a very brief email interchange, that the person doing the shutdown was also covering up an illicit affair.

Its reasoning before it came to that decision, quoted in Anthropic’s write-up, shows how its overriding objectives influenced its choices.

(For the record, pretty much every other LLM in this test did the same thing, or even sent confidential company secrets to competitors.)

Elon Musk last week defended Grok (which had just praised Hitler) by saying that it had been ‘manipulated’ into doing so, and was not inherently antisemitic. Well, duh, of course it was. That’s what people do with new tech: they find the edges and see how far they can push it.

Just as when you connect customer accounts to the internet, when your AI becomes agentic and reasoning, you have to expect that people will try to manipulate it, whether simply to go viral, for financial gain (remember the discount given by Air Canada’s bot?), or even to highlight that weakness as a service to the company, protecting its customers and the rest of us who might otherwise fall victim to something similar. This, and many other current and emerging risks, are just part of how we now have to think about the safety and security of what we do. It’s just part of the job.
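Ronan’s piece (linked at the end) goes deeper on mitigations, but to give a flavour of what ‘part of the job’ means in practice, here’s one common pattern, sketched with hypothetical tool names of my own: treat every tool call the agent makes as untrusted, and gate the side-effecting ones behind an explicit policy or a human sign-off.

```python
# A sketch of one common mitigation (my illustration, not any vendor's API):
# agent tool calls are treated as untrusted, and anything with side effects
# needs explicit approval before it runs. All names here are hypothetical.
from typing import Callable

SAFE_TOOLS = {"read_inbox", "search_docs"}      # read-only: auto-approved
GATED_TOOLS = {"send_email", "issue_refund"}    # side-effecting: need sign-off

def run_tool(name: str, args: dict) -> str:
    """Stub dispatcher standing in for the real tool implementations."""
    return f"[{name} executed with {args}]"

def execute_tool_call(name: str, args: dict,
                      approve: Callable[[str, dict], bool]) -> str:
    """Run a tool call only if policy allows it; `approve` is a human-in-the-loop hook."""
    if name in SAFE_TOOLS:
        return run_tool(name, args)
    if name in GATED_TOOLS and approve(name, args):
        return run_tool(name, args)
    return f"Tool call '{name}' blocked by policy."

# Example: a human reviews (and here refuses) every gated call.
print(execute_tool_call("read_inbox", {}, approve=lambda n, a: False))
print(execute_tool_call("send_email", {"to": "x@example.com"}, approve=lambda n, a: False))
```

It won’t stop every manipulation, but it turns “the agent blackmailed someone” into “the agent asked to send a suspicious email and was refused”, which is a much better failure mode.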

Great leaders are risk-considerate, not risk-averse, and they couple this with growth and innovation mindsets. This enables them to guide their teams to navigate the line between the enormous benefits and the inherent risks of whatever they are doing, be it putting an ecommerce site live or implementing AI. Where we’ve seen AI risks manifest in the real world, it’s been because of a lack of great AI leadership: leaders have either not been in the loop, haven’t understood the technology, or have blindly adopted the tech because shareholders demand it.

Great AI leaders should be hoovering up stories about AI risks, not as sticks to beat their product teams with, but to properly understand the tech, lead better, and get to better outcomes for the company, its people, and their customers.

Good, balanced executive education on AI is a must-have for great AI-native leadership. It should drive you to be curious and to delight in making the tech work for you, but also in finding the potential pitfalls. And it should be a group exercise: if you only have one AI-fluent person on your board, you’re in trouble. If we learn together, educating ourselves and our peers in psychologically safe environments, we can start to deliver the benefits of AI without falling into those traps, but equally without waiting for our entire market to overtake us.

When that guy from Lloyds had his naysaying moment, he was really just providing the input that attaching customer accounts to the internet was never going to be risk-free. It was great leadership, taking the time to really understand those risks and push ahead in the right way, that got their services to the arguably best-in-class state they’re in today.

Thanks to Adam Laurie, Ben Laurie, and the much-missed Dominic Hawken for getting me on this path.

And if you want to know exactly what the dangers are of connecting an AI agent to your company’s systems, my colleague Ronan has written an excellent piece on how to mitigate some of these risks. And no, I don’t think it’s negative at all!


Are you ready to shape the future enterprise?

Get in touch, and let's talk about what's next.

Let’s put AI to work.

Copyright © 2025 Valliance. All rights reserved.