Yesterday, we were discussing when we could actually let AI agents auto-merge and publish code.
My take - probably not so soon.
These screenshots are exactly why. They show the output from Claude Code when I asked it to add a feature.
Before you think it is a prompt and workflow issue, let me tell you that I have a comprehensive workflow that spawns 9 subagents to review code based on 9 best practices. Two of the subagents review React Native and UX/UI.
Yet, it still keeps making these mistakes:
- UI does not respect screen safe area - the navigation bar overlaps with the status bar.
- Layout misalignment issue.
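To make the first issue concrete, here is a minimal sketch of what "respecting the safe area" means in terms of geometry. Everything in it is hypothetical and for illustration only: the `ElementFrame`, `SafeAreaInsets` and `ScreenSize` types, the `safeAreaViolations` function, and the sample numbers are my own assumptions, not part of React Native or any library. It only checks rectangles. It does not look at the rendered screen.

```typescript
// Hypothetical types for illustration; not taken from any library.
interface ElementFrame {
  x: number;
  y: number;
  width: number;
  height: number;
}

interface SafeAreaInsets {
  top: number;
  bottom: number;
  left: number;
  right: number;
}

interface ScreenSize {
  width: number;
  height: number;
}

// Returns the edges on which the frame spills outside the safe area.
function safeAreaViolations(
  frame: ElementFrame,
  insets: SafeAreaInsets,
  screenSize: ScreenSize
): string[] {
  const violations: string[] = [];
  if (frame.y < insets.top) violations.push("top");
  if (frame.y + frame.height > screenSize.height - insets.bottom) violations.push("bottom");
  if (frame.x < insets.left) violations.push("left");
  if (frame.x + frame.width > screenSize.width - insets.right) violations.push("right");
  return violations;
}

// Assumed sample values: a navigation bar drawn from y = 0 on a screen
// whose status bar takes up a 47pt top inset.
const navBar: ElementFrame = { x: 0, y: 0, width: 390, height: 56 };
const insets: SafeAreaInsets = { top: 47, bottom: 34, left: 0, right: 0 };
const screenSize: ScreenSize = { width: 390, height: 844 };

// Reports a violation on the top edge only.
console.log(safeAreaViolations(navBar, insets, screenSize));
```

A check like this can catch an overlap when the frames are known, but it says nothing about whether the layout actually looks aligned to a human eye.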
LLM output right now is like a slot machine. Most spins land fine. But about 10% to 20% of the time, it fails at the simplest layout.
The code compiles. The app runs. But visually, it is off.
For web apps, there is a test harness. You can now let Claude Code see your browser UI through extensions. It spots visual issues and self-corrects. This helps a lot.
But for mobile apps? I still haven’t come across a way to let Claude Code see the screen.
And that I think is the real bottleneck in AI coding now. I spent more than 50% of my time doing manual UI testing.
AI can write code that compiles. But right now, it can’t tell if the screen looks right.
Until an AI agent can see its own work the way a developer does, we probably still need a human UI tester.