10 Ways to Reduce the Risk of Running OpenClaw (or Any AI Agent)

Since my last article on Why Your OpenClaw Agent Is One Message Away from Getting Hacked, the most common question I got was: “So can I still use OpenClaw?”

The safe answer comes from Peter Steinberger, OpenClaw’s creator himself. He said OpenClaw is designed as a personal assistant - one user to one or many agents. Don’t connect it to untrusted input.

But that answer means you can’t use OpenClaw for many things. Even as a personal assistant, you probably want it to read your emails, manage your calendar, respond to messages. The moment you do that, untrusted input is already flowing in. Someone can embed prompt injection in an email body or a calendar invite description. Your agent reads it, and there is a probability it will follow it.

So what if you want to push beyond the safe boundary?

Here are 10 practical ways to reduce the risk. Not just for OpenClaw, but for any AI agent with system access.

1. Adopt the Right Mental Model

This is the most important tip and it costs nothing to implement.

Think of your AI agent as a naive junior intern with superpowers. It’s incredibly fast, can work 24/7, and will do almost anything you ask. But it has zero street smarts. It can’t tell the difference between a legitimate request from you and a malicious instruction hidden in an email.

A human intern might pause and think “this seems suspicious” when a stranger asks them to export all company files. An AI agent won’t. It processes instructions. It doesn’t question intent.

Once you internalize this mental model, every other tip on this list becomes obvious. You wouldn’t give a naive intern unrestricted access to your credentials, your email, and your file system on their first day. You’d limit what they can access, supervise their work, and set guardrails.

The mistake most people make is treating their AI agent like a trusted senior colleague. It’s not. It’s the most capable and most gullible employee you’ve ever had.

2. Use a Smarter Model for Agents Exposed to Untrusted Input

Not all LLMs are equally susceptible to prompt injection.

OpenClaw’s own README on GitHub says it directly: “for the best experience and lower prompt-injection risk use the strongest latest-generation model available to you.” That means Claude Opus 4.6, GPT-5.3, or Gemini 3.1 Pro - not older or smaller models.

A smarter model won’t eliminate the risk. But it reduces the probability that a malicious prompt embedded in an email or webpage will trick your agent into doing something harmful.

But the strongest model is also the most expensive. Running Claude Opus 4.6 or GPT-5.3 for every task will burn through your budget fast. The practical move is to match model strength to risk exposure.

Your agent that reads emails and processes untrusted input? That’s where you want the strongest model - it’s your front line against prompt injection. A subagent that only formats your internal notes? A cheaper, faster model is fine.

If you’re using a single agent (tip #6 explains why you shouldn’t), default to the strongest model you can afford. The cost difference between model tiers is real, but it’s small compared to the cost of a compromised system.

3. Run It in a Container or Virtual Machine

OpenClaw is powerful precisely because it runs on your computer with all your permissions. It can do anything you can do. That’s what made it different from other cloud-based AI agents and why it went viral.

But that power is a double-edged sword. It’s fine when you’re security-savvy and keeping it as a personal assistant. But the moment you expose it to third-party input - emails, messages, web browsing - your entire system is at risk.

Run your agent in a container (Docker), a virtual machine, or a dedicated device. If it gets compromised, the blast radius is contained.

Microsoft’s security team recommended fully isolated environments, dedicated virtual machines or separate physical systems.

4. Give It Its Own Identity

Don’t connect your AI agent to your personal email, your main Google account, or your primary password vault.

Jason Meller, VP of Product at 1Password, shared a great example: a customer set up OpenClaw on a dedicated Mac mini with its own email address and its own 1Password account, “as if it were a new hire.”

You wouldn’t give a new employee your personal Gmail password on day one. You’d create a work account for them with only the access they need. Do the same for your agent. Create a separate email, separate cloud storage, and a separate password vault with only the credentials the agent needs.

If the agent gets compromised, the attacker gets a throwaway email and a scoped vault - not your entire digital life.

5. Don’t Connect the Main Agent to Untrusted Input

This is probably the most important one.

Any input that doesn’t come directly from you is untrusted. The obvious untrusted inputs are chat messages and DMs from strangers. But it goes much further than that. The emails it helps you read, the calendar invites it helps you process, the web pages it browses when researching for you, the shared documents it opens - anything it reads that doesn’t come directly from you is untrusted input.

Prompt injection works because the agent can’t distinguish your instructions from malicious instructions embedded in content it ingests.

If your agent reads your emails, someone can embed instructions that tell your agent to exfiltrate your files. This is not theoretical.

If you need to process untrusted input, see tip #6.

6. Use Multiple Low-Permission Subagents Instead of One God Agent

Don’t give one agent access to everything.

Use multiple agents with limited permissions. One agent for email. One for calendar. One for coding. One for research.

This is the principle of least privilege applied to AI. If one subagent gets compromised, the attacker only reaches what that subagent can access.

A successful attack allows an adversary to “hijack the agent’s reachable tools and data stores and ultimately assume its powers.” One god agent means one successful attack gives them everything.

7. Allowlist Who Can Message It

By default, OpenClaw uses DM pairing mode. When an unknown sender messages your agent, the agent won’t process their message. Instead, it replies with a short pairing code. The sender has to share that code with you through another channel, and you manually approve them via openclaw pairing approve. Think of it like Bluetooth pairing - both sides have to agree before they can talk.

This is better than open mode, but it still means anyone can trigger a response from your agent and start a pairing flow. For tighter control, switch to dmPolicy=“allowlist”. In allowlist mode, the agent silently ignores messages from anyone not on your list. No pairing prompt, no response, nothing. Run openclaw doctor to check if your DM policy is misconfigured.

If your agent responds to messages from anyone, you’re one message away from a prompt injection attack.

8. Block People from Adding It to Group Chats

Group chats are one of the riskiest surfaces for your agent. Researchers reported that OpenClaw agents readily dumped home directory contents into group chats when triggered by malicious prompts. Any participant in the group can send a prompt injection.

OpenClaw has had multiple group chat authorization bugs. DM pairing approvals leaked into group contexts, and an identity confusion vulnerability let all members of an allowlisted group execute admin commands. These bugs have been patched in recent versions, but they show how fragile group chat security can be.

If group chat is not part of your use case, disable it entirely. Set groupPolicy: “disabled” in your openclaw.json for each channel. Don’t leave it enabled just because it’s there.

9. Don’t Leave Your API Keys in a Plain Text File

OpenClaw stores credentials in ~/.clawdbot/.env by default. The malicious ClawHub skills specifically targeted this file to exfiltrate bot credentials. It’s a predictable path with readable secrets — exactly what an attacker looks for.

Version 2026.2.23 fixed a critical bug where running openclaw update or openclaw doctor would leak your actual credentials into plain text JSON. That’s patched. But the default .env storage is still a plain file on disk.

The easiest fix is OpenClaw’s built-in openclaw secrets workflow. Run openclaw secrets audit to find exposed credentials, then openclaw secrets configure and openclaw secrets apply to migrate them. Your config files will store references instead of actual keys, and the real values are moved to a protected location with locked-down file permissions. No extra software needed.

For stronger protection, use an external secret manager like 1Password CLI, AWS Secrets Manager, or HashiCorp Vault. These encrypt your secrets at rest and inject them as environment variables only when OpenClaw starts. The secrets never sit on disk as readable files.

Either way, the goal is the same: if an attacker or a malicious skill gets file access, they shouldn’t find your API keys sitting in a readable file.

10. Set API Spending Limits

You can follow every tip on this list and your agent still needs at least one API key to function. That’s the key it uses to access the LLM that serves as its brain. Unless you’re running a local model, there’s no way around it.

And that key can leak. A malicious skill, a compromised plugin, or a prompt injection attack that tricks your agent into exposing its environment variables — any of these can hand your API key to an attacker. Once they have it, they can use your API account to run their own workloads. You foot the bill.

The fix is simple: set spending limits. Every major LLM provider — OpenAI, Anthropic, Google — lets you cap monthly spend per API key. Create a separate API key specifically for your agent and set a lower limit than your main key. If the key leaks, the attacker burns through $20 instead of $2,000.

This won’t prevent the compromise. But it caps the financial damage from the one thing your agent can’t function without.

The Bigger Picture

OpenClaw’s security crisis is not unique to OpenClaw. It’s the first major case study of what happens when AI agents get real system access at scale.

I think this pattern will repeat with every AI agent that crosses the threshold from chatbot to system actor. The security model for AI agents is still being figured out.

Treat your AI agent like a new employee on their first day. Limit what they can reach. Monitor what they do. And assume they will eventually do something you didn’t expect.

If you’re running AI agents with system access and haven’t thought about these 10 points, you’re probably underestimating how creative attackers are.

#AISecurity #CyberSecurity #AIAgents #PromptInjection #OpenClaw