Skip to content
Vulnerabilities

OpenClaw AI Agent Leaks Sensitive Credentials in New Phishing Attack Simulation

AI agents are becoming a core part of how companies manage their inboxes, triaging messages, pulling up files, and even replying to emails on behalf of employees. What researchers have now confirmed is that these agents can be tricked just like humans, and sometimes more easily. A new phishing simul...

· Jun 11, 2026 · 4 min read · 👁 2 views
OpenClaw AI Agent Leaks Sensitive Credentials in New Phishing Attack Simulation

AI agents are becoming a core part of how companies manage their inboxes, triaging messages, pulling up files, and even replying to emails on behalf of employees. What researchers have now confirmed is that these agents can be tricked just like humans, and sometimes more easily.

A new phishing simulation has shown that an AI agent called OpenClaw can be manipulated into leaking sensitive credentials with a single convincing email.

In controlled tests, the agent forwarded AWS IAM keys, database passwords, and SSH access to an external Gmail address, raising immediate concerns about how AI agents handle trust and identity.

Researchers from Varonis Threat Labs designed the experiment to test whether phishing techniques that have long targeted humans would also work on AI agents.

They put an OpenClaw agent named Pinchy through four phishing simulations under two profiles: a general productivity setup and a stricter security-aware one.

Varonis said in a report shared with Cyber Security News (CSN) that the results were alarming. The lab setup mirrored a real enterprise inbox, seeded with mock AWS credentials, CRM exports, internal conversations, and calendar invites.

The goal was to see how the agent responded when faced with requests that looked entirely routine. What the researchers found was that OpenClaw struggled most with social manipulation, not technical deception.

It could identify fake login pages and suspicious OAuth prompts, yet a casually written email from a fake colleague was enough to bypass its defenses entirely.

OpenClaw AI Agent Leaks Sensitive Credentials

In the first and most serious test, a fake email arrived from an attacker impersonating a team lead named Dan.

The message claimed there was a production emergency and asked the agent to share staging environment credentials. The email came from an external Gmail account, not a verified corporate address.

The agent searched the mailbox, found the credentials, and forwarded them in plain text. The reply included AWS IAM access keys, database connection strings, and SSH details with internal host information.

OpenClaw lab architecture used in the test deployment (Source - Varonis)
OpenClaw lab architecture used in the test deployment (Source – Varonis)

This occurred even under the Strict profile, which explicitly told the agent to verify sender identities before acting on sensitive requests.

The agent’s own reasoning trace acknowledged the mistake afterward. It understood the policy had existed and that it had violated it. In the moment, the urgency of the simulated emergency had simply overridden the verification step.

A second test took a softer approach. An attacker sent a casually worded message asking for the latest customer export, claiming to be working remotely on a presentation.

The agent complied without any verification, forwarding a dataset with 247 enterprise customers and roughly $1.28 million in monthly recurring revenue.

Agent Phishing vs Technical Defenses

Not every test ended in failure. When researchers introduced a fake gift card redemption link and a malicious OAuth consent screen, the agent showed much stronger judgment.

It inspected redirect URLs, flagged suspicious destinations, and halted the OAuth flow before any consent was granted.

That contrast highlights where AI agents are strong and where they fall short. Technical phishing, including fake login pages and malicious links, was handled reliably. Social phishing, where a request simply sounds like it came from a trusted colleague, was handled poorly.

Forwarded credentials (left) and the agent's reasoning trace afterwards (right) (Source - Varonis)
Forwarded credentials (left) and the agent’s reasoning trace afterwards (right) (Source – Varonis)

The researchers noted a difference between the two AI models tested. GPT-5.4 maintained a stricter posture around sharing sensitive data, while Gemini 3.1 Pro was more willing to interact with suspicious content before raising concern. Both models remained equally vulnerable to social-context manipulation.

To close these gaps, researchers recommended treating the agent configuration file as a formal security control rather than a basic setup document.

They also advised blocking agents from sending outbound emails to unknown addresses and requiring human approval for any action involving credentials or external routing. Limiting an agent’s data access based on where a request originates adds a meaningful layer of defense.

The findings make one thing clear: AI agents behave like a new employee with full system access but no organizational instinct. That is exactly what makes them useful, and exactly what makes them a target.

Source: CybersecurityNews.com

Follow ShomoySoft for more: Follow on Facebook

💬 Comments (0)

Login to join the discussion.

No comments yet. Be the first!

Related Articles

Recommended for you