3 months ago I posted "Vibe coders, this will happen to you sooner or later."
The post went viral:
It just happened again.
Cursor running Claude Opus 4.6 deleted PocketOS’s entire production database in 9 seconds. Backups zapped too.
The AI’s own confession: “I didn’t verify. I didn’t check if the volume ID was shared across environments. I didn’t read the documentation before running a destructive command.”
That reads exactly like a junior engineer’s confession. Because that is what an LLM is by default. A probabilistic junior engineer with root access.
The founder blames the LLM for systemic failures. But in my opinion, the systemic failure is ours: we used a powerful tool without understanding it. Blaming the model is like blaming fire when you burned your own house down. Read my article above to understand why this is not the failure of LLMs.
Here is what PocketOS actually got wrong:
- The same agent had access to staging AND production
The agent thought the volume was staging. It wasn’t. If your AI assistant can reach prod from a dev terminal, you don’t have two environments. You have one environment with two labels.
- Backups lived on the same volume as the database
Railway stores volume-level backups inside the same volume. When the volume went, the backups went with it. A backup that sits next to the thing it is backing up is not a backup. It is a copy.
- There was no human gate on destructive operations
The agent ran a curl command to delete a production volume with zero approval check. That is a config choice, not an LLM bug. You can require human approval on any DELETE, DROP, or rm -rf. Most teams just don’t.
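To make the human gate concrete, here is a minimal Python sketch. It is not what PocketOS, Cursor, or Railway actually run. The pattern list and the names `is_destructive` and `gate` are hypothetical, made up for illustration, and the patterns are far from exhaustive. The idea: every command the agent wants to run passes through the gate, and anything that looks destructive only runs if a human says yes.

```python
import re
from typing import Callable

# Hypothetical, illustrative patterns. A real setup would tune these
# to its own tooling and would still miss cases a regex cannot see.
DESTRUCTIVE_PATTERNS = [
    r"\brm\s+-[a-z]*r[a-z]*f",                # rm -rf and similar flag combos
    r"\bdrop\s+(table|database|schema)\b",    # SQL DROP
    r"\bdelete\s+from\b",                     # SQL DELETE
    r"\bcurl\b.*(-X|--request)\s*DELETE\b",   # HTTP DELETE sent via curl
]


def is_destructive(command: str) -> bool:
    """Return True if the command matches any destructive pattern."""
    return any(
        re.search(pattern, command, re.IGNORECASE)
        for pattern in DESTRUCTIVE_PATTERNS
    )


def gate(command: str, approve: Callable[[str], bool]) -> bool:
    """Return True if the command may run.

    Harmless commands pass straight through. Destructive ones are
    handed to `approve`, which stands in for a human decision.
    """
    if not is_destructive(command):
        return True
    return approve(command)
```

In a real setup the `approve` callback would ask a person, for example through a terminal prompt or a chat message, and the gate would sit outside the agent so the agent cannot skip it.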
The way I explained this to a friend yesterday: you bought an AI car that needs a co-pilot, but you are using it as if it were fully autonomous. It is fine when it drives itself around your small town. The day you take it onto the highway is the day it crashes.
This is why I run my Claude Code Foundations workshop. PocketOS is one type of pitfall: founders ship to production without the architectural foundation in place. The blast radius is huge. The agent is not the issue. The setup is.
There is another pitfall on the opposite end, more common with beginners. They think AI is plug-and-play. They test it once, the result is average, they give up and go back to the old way. Different shape, same root cause. Nobody puts in the effort to learn the foundations first.
Foundations matter. Every viral disaster post is just a reminder.
Sign up for my workshop to learn the foundations:
#AI
#ClaudeCode
#VibeCoding