We Need to Talk About the Insider Risk from AI

Emerging threat: organisations face growing risks from artificial insiders as well as human ones. Image: TA Design / Adobe Stock


As AI systems gain human-like capabilities, organisations must tackle emerging risks from AI insiders, drawing on insights from managing human insider risks to enhance security.

As AI systems increasingly undertake tasks previously performed by humans, organisations face a growing security risk from artificial insiders as well as human ones. This complex security risk is poorly understood. Fortunately, some of the approaches that protective security practitioners use for managing risks from human insiders are also applicable to AIs.

An insider is a person who betrays trust by behaving in potentially harmful ways. We trust someone by giving them access to things we value; the insider betrays that trust by exploiting, or intending to exploit, their legitimate access for unauthorised purposes. Replace ‘person’ with ‘entity’ and the definition works for AIs. 

Human insiders can – and frequently do – cause harm in many ways. Think of a transgressive action and there will be an insider who does it: fraud; blackmail; theft of data, money or intellectual property; covert influencing; physical or cyber sabotage; violence; leaking; terrorism; espionage; and so on. In principle, AI insiders could do any of these. 

We are using ‘AI’ here in the broadest sense of any current or future artificial system that has complex capabilities comparable to those of humans. The history of AI has been characterised by long periods of gradual evolution or stagnation punctuated by sudden bursts of dramatic change. Currently, ‘AI’ has become almost synonymous with Large Language Models (LLMs) or generative AI, with agentic AIs growing rapidly in prominence. Unlike LLMs, which passively answer questions, agentic AIs actively do things in the world, albeit mostly the virtual world. We can be reasonably certain that new and currently unforeseen forms of AI with even more paradigm-shifting capabilities will burst onto the scene sooner or later. 

With AI evolving in non-linear leaps and bounds, it seems almost inevitable that protective security practitioners will increasingly be required to defend organisations against AI insiders. The problem, however, is that protective security practitioners are mostly not thinking about AI, and AI experts are mostly not thinking about insider risk.

Personnel security – the conventional means of managing human insider risk – is a relatively underdeveloped domain of customs and practices, many of which have little solid basis in evidence. In the parallel universe of AI, attention is largely focused on the myriad ways in which AI can improve effectiveness and efficiency. Admittedly, consideration is being given to the security risks to those systems and the safety threats they might pose to human users. It appears, however, that hardly anyone is thinking about the particular problem of protecting organisations from active AI insiders. It is therefore unsurprising that few real-life examples have so far been identified and publicised. One recent exception was that of a robot ‘kidnapping’ 12 other robots.

Human insiders and AI insiders share many features. AIs are subject to bias. So too are humans, as shown by decades of research on psychological predispositions and cognitive biases. LLMs sometimes say things that are inaccurate or plain wrong. So do humans. The behaviours of AIs cannot be directly explained in terms of their software code because they are an emergent property of a complex system. The same is true for humans.

Like humans, AIs vary in their trustworthiness. Also like humans, AIs sometimes act inappropriately and unwittingly cause harm. They can also cause harm intentionally. For example, AIs could be subverted by external threat actors like hostile foreign states or criminals. The more sophisticated threat actors will use their own AIs to help them do this.

Human insiders are hard to detect. Capable insiders act covertly, and the best ones may never be found. The archetypal example is the spy working within a sensitive government organisation who secretly acts on behalf of a hostile foreign intelligence agency for years. AI insiders will be even better at learning how to avoid detection: they can ingest more information and are better at spotting patterns within that information. Researchers have found that some existing AI models are already capable of ‘scheming’ – that is, acting deceptively to pursue illegitimate goals while hiding their true capabilities and objectives. AI insiders will be harder to find than their human counterparts. 

Humans have psychological vulnerabilities that are widely exploited by criminals and other threat actors. We are all susceptible to being socially engineered, defrauded or misled by false information. AIs are also vulnerable to manipulation and subversion, although the mechanisms by which this happens are different and less well understood. Personnel security practitioners struggle to identify valid diagnostic signs of emerging insider risk from human actors. With the current state of knowledge, it is even harder to know what to look for when trying to detect the early warning signs of AI insiders. 

Physicality is a significant differentiator, at least for now. Compared with humans, most AIs have limited ability to act directly on physical objects, and therefore less scope to perform physically harmful actions like sabotaging infrastructure or murdering people. Their physical effects would have to be mediated through other mechanisms, such as infrastructure control systems. However, this gap is shrinking as AI-enabled autonomous robots become more capable of acting directly on their physical environment.

The best personnel security regimes recognise that insider risk is a systems problem requiring systems solutions. A simple model divides protective measures into three categories. Pre-trust measures are things to do before deciding to trust a person, such as pre-employment screening or ‘vetting’. In-trust measures are things to do after granting access, such as continuous monitoring or ‘aftercare’. Foundations are capabilities needed to underpin the system, such as governance and risk management. The same basic principles would apply to AI insiders.

In common with all technologies (apart, arguably, from nuclear weapons), AI is dual-use: it could also be used to help defend against both human and AI insiders. For instance, it could strengthen pre-employment screening of people by analysing large sets of open-source data, and its pattern-detection capabilities could help to identify the elusive diagnostic features of both human and AI insiders.

In conclusion, the security risk from AI insiders is here, now, and overwhelmingly likely to grow. The separate worlds of personnel security and AI research must join forces if we are to understand and confront this risk.

© Paul Martin and Sarah Mercer, 2025, published by RUSI with permission of the authors

The views expressed in this Commentary are the authors’, and do not represent those of RUSI or any other institution.


WRITTEN BY

Dr Paul Martin CBE

Distinguished Fellow

Dr Sarah Mercer
