1. Identity and Privilege Abuse
Identity and privilege abuse occurs when an AI agent reuses all a user’s permissions rather than being granted only the minimum access required for its tasks.
Here, the AI agent effectively becomes a complete substitute for the user, making it impossible to enforce the least privilege principle.
For example, an email sorting agent can be compromised to delete or send unwanted emails if it's given overprivileged access by being granted all the authenticated user's access. If the agent is only given the privilege to read and sort emails, the above problem will not occur.
We can mitigate the above issue by implementing granular scopes and roles for AI agents. It is necessary to identify agents as first-class identities. By giving agents separate identities, it is possible to limit their task scope and to address identity and privilege abuse.
2. Tool Misuse & Exploitation
Tool misuse arises when agents use legitimate tools in unsafe ways, such as over‑calling expensive APIs. This can happen even when agents are tightly scoped, as it occurs within the tool.
To mitigate this issue, define per-tool profiles with allowed operations, data scope, max rate, and allowed network destinations.
It is also possible to incorporate human approval or consent for high-risk operations into those tools.
And how Agent identity contributes to mitigating this issue: by giving agents identities, it is possible to audit and observe which agents' actions caused misuse of tools and adjust them accordingly.
3. Insecure Inter‑Agent, Agent-Server Communication
In multi-agent systems, messages between agents are a high‑value target. If an attacker can spoof, replay, or tamper with those messages, they can steer privileged agents into taking harmful actions. Without strong per-agent authentication, any component on the network can impersonate an internal “helper” agent and issue instructions that appear to be trusted.
To mitigate the issue, giving each agent a unique identity allows you to enforce mutual authentication and end‑to‑end encryption for every inter‑agent channel. Agents need to authenticate with their identities before communicating, so a fake “Admin Helper” cannot simply register itself and be selected for sensitive workflows.
With identities in place, you can also implement protocol‑level policies such as “only the billing agent may talk to the payments agent” and log all inter-agent messages.
4. Agent Goal Hijack
Agent goal hijack means attackers manipulate agents' goals, objectives, task selections, and decision pathways through techniques such as prompt-based manipulation, the use of deceptive tool outputs, and forged agent-to-agent messages. Prompt-based manipulation is the most common type, as it is easy to embed malicious prompts inside any input that an agent consumes.
This can be mitigated by treating all natural language inputs, including text inputs and document uploads, as untrusted and performing input validation and prompt injection safeguards.
Also, it is possible to introduce locks for system prompts and instructions, and to configure them so that only humans can change those rules and prompts.
Limiting the scope of agents also plays a major role when mitigating this issue. For example, although a malicious prompt hijacks the agent's goals if the agent doesn't have the privilege to carry out the malicious task, it is possible to prevent the mass failure.
So in conclusion, most of the risks discussed above can be completely avoided, or at least reduced by giving identities for agents and, through that, limiting their scopes and maintaining proper observability through auditing each agent's behaviour.
Comments
Loading comments…