Claude's Mistakes
An AI Engineer's Perspective
Marc Baumholz
AI Engineer @ Flip | Founder, START Stuttgart
CS Bachelor | Cognitive Science Master
01
Who am I?
Marc Baumholz
AI Engineer @ Flip
Building the AI agent for frontline workers. Handling orchestration: which tool or agent gets called, when, and why.
Founder, START Stuttgart
Founded the startup initiative, connecting tech founders and pushing innovation in the Stuttgart ecosystem.
CS Bachelor & Cognitive Science Master
Computer Science fundamentals combined with an understanding of how humans think and interact with systems.
AI Power User Since Day One
Tested GPT, Cursor, Bolt, Lovable, Replit, Manus, and many more. Heavy AI user from the very beginning.
Claude's Mistakes | Marc Baumholz
1 min | 2 / 12
02
My Journey to Claude
From tool-hopping to deep integration
2023
GPT Era
Started with ChatGPT.
First AI-assisted coding.
Mind officially blown.
2024
Tool Hopping
Cursor, Bolt, Lovable,
Replit, Manus and more.
Searching for the right fit.
2025
Claude Click
Claude Code clicked.
Plan mode, MD files,
hooks. Real workflow.
NOW
AI Agent Builder
Building the AI agent
for frontline workers.
Orchestration is key.
1 min | 3 / 12
03
My Daily Setup
Tools, modes & workflow
DAILY FLOW
Wispr Flow + Shortcuts
Dictating prompts and keyboard shortcuts. No typing walls of text into the terminal.
CLAUDE.md Per Directory
Each project has its own context file under 200 lines. Architecture, patterns, conventions.
Plan Mode / Insert Mode
Almost never use auto mode. No vibe-coding. Plan first, execute deliberately.
SKILLS & MODES
Skills for Tickets & SQL
Start, plan, review tickets. SQL queries and evaluation. All automated via custom skills.
Superpowers & GSD
Brainstorm, spec-driven dev, GSD mode. Challenge my architecture with Claude as sparring partner.
MCPs Everywhere
Google, Linear, Context7, Notion, Miro. Company strategy, values, and ticket scope all connected.
Python | Pydantic | AI Engineering | SQL | Claude Code CLI | Antigravity
2 min | 4 / 12
03
My Philosophy
Plan, execute, evaluate — that simple
Plan
Use plan mode. Stay on the product side.
Once you go deep into technical implementation
you lose the creative perspective.
Execute
Small changes, step by step. Code quality high,
speed is decent, testing is good, review interactive.
No vibe-coding, no auto-accept.
Evaluate
Challenge the output. Stress-test from wrong
angles. Use Claude to prep for meetings.
Always validate the full user flow.
2 min | 5 / 12
04
Key Learnings
What changed my daily workflow
CLAUDE.md < 200 Lines
One per directory. Focused context beats a massive wall of instructions every time.
Understand Every Line
Never accept code blindly. If you can't explain it, you can't debug it later.
Claude = Your Buddy
Ask questions. Learn. Try to solve a problem with Claude first before asking a colleague. Use it to explain other people's code, too.
Claude Hooks
Pre-tool-use: explain your reasoning first. Seems horrible at first, but forces real thinking.
Visualize & Stress-Test
Flow diagrams, mock examples. Then try to break it from the wrong angle. Wrong assumptions surface.
Teach Key Principles
DRY, SRP, YAGNI, Kaizen, Separation of Concerns. Also: have Claude list unresolved questions after plan mode.
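The "CLAUDE.md < 200 Lines" rule above is easy to check automatically. A minimal sketch, assuming the files are literally named CLAUDE.md and live anywhere under one project root; the function name `oversized_context_files` is hypothetical and not part of any tool:

```python
from pathlib import Path

MAX_LINES = 200  # limit taken from the slide; adjust to taste


def oversized_context_files(root: Path, max_lines: int = MAX_LINES) -> list[tuple[Path, int]]:
    """Return (path, line_count) for every CLAUDE.md with max_lines or more lines."""
    offenders = []
    for path in sorted(root.rglob("CLAUDE.md")):
        line_count = len(path.read_text(encoding="utf-8").splitlines())
        if line_count >= max_lines:  # "under 200" means exactly 200 is already too long
            offenders.append((path, line_count))
    return offenders


# Usage (hypothetical): run from the repository root, e.g. in a pre-commit step.
# for path, count in oversized_context_files(Path(".")):
#     print(f"{path}: {count} lines (limit is under {MAX_LINES})")
```

Run as a pre-commit or CI step, a check like this keeps context files from quietly growing back into a wall of instructions.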
2 min | 6 / 12
05
Mistakes Claude Makes
Patterns, safety & silent failures
Root cause: Claude is trained to make code run, not to follow your architecture or let things fail loudly.
Ignores Your Patterns
Finds a solution, but not YOUR pattern. Doesn't know your service layers, DTO conventions, or error flow unless explicitly taught.
Over-Engineers Safety
Too many try-catches, redundant null checks, extra validation. Use guard clauses over nested ifs. Claude bloats clean code.
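To make "guard clauses over nested ifs" concrete, here is a small sketch. The `Order` type and both functions are hypothetical examples, not code from the talk. Both versions behave the same; only the shape differs:

```python
from dataclasses import dataclass


@dataclass
class Order:
    customer_id: str
    items: list[str]
    paid: bool


def ship_nested(order: Order) -> str:
    # Nested style: the happy path is buried three levels deep.
    if order.customer_id:
        if order.items:
            if order.paid:
                return f"shipping {len(order.items)} item(s) to {order.customer_id}"
            else:
                raise ValueError("order is not paid")
        else:
            raise ValueError("order has no items")
    else:
        raise ValueError("order has no customer")


def ship_guarded(order: Order) -> str:
    # Guard clauses: reject invalid input early, keep the happy path flat.
    if not order.customer_id:
        raise ValueError("order has no customer")
    if not order.items:
        raise ValueError("order has no items")
    if not order.paid:
        raise ValueError("order is not paid")
    return f"shipping {len(order.items)} item(s) to {order.customer_id}"
```

The guarded version reads top to bottom as a list of preconditions, which also makes a redundant check easier to spot and delete in review.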
Silent Error Swallowing
Catches errors instead of failing loudly. Information silently drops. The system 'works' but data is lost.
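A sketch of what silent error swallowing looks like next to a version that fails loudly. The event-loading scenario and all names here (`load_events_silent`, `load_events_loud`, `EventParseError`) are made up for illustration:

```python
import json


def load_events_silent(lines: list[str]) -> list[dict]:
    # Anti-pattern: a bad line is caught and dropped. The function "works",
    # but the caller never learns that data went missing.
    events = []
    for line in lines:
        try:
            events.append(json.loads(line))
        except Exception:
            pass
    return events


class EventParseError(ValueError):
    """Raised when an input line cannot be parsed as JSON."""


def load_events_loud(lines: list[str]) -> list[dict]:
    # Fail loudly: catch only the expected error and re-raise it with context.
    events = []
    for number, line in enumerate(lines, start=1):
        try:
            events.append(json.loads(line))
        except json.JSONDecodeError as exc:
            raise EventParseError(f"line {number} is not valid JSON: {line!r}") from exc
    return events
```

With the input `['{"id": 1}', 'not json', '{"id": 2}']`, the silent version returns two events and hides the loss, while the loud version raises and names line 2.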
3 min | 8 / 12
05
Mistakes Claude Makes
Types, assumptions & codebase replication
WHAT CLAUDE DOES | WHAT YOU SHOULD DO
Uses the `any` type or makes values nullable without checking whether they truly are. | Define concrete types. Verify that a value is actually nullable before allowing it.
Replicates bad patterns from your codebase. Garbage in, garbage out. | Clean your codebase first. Claude amplifies both good and bad patterns.
Violates SRP: one function does three things. Introduces new test frameworks. | Write conventions in CLAUDE.md. One function, one responsibility. Use existing tools.
Makes wrong assumptions and tries the same thing repeatedly when stuck. | Tell Claude to validate from scratch and explain its assumptions explicitly.
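The first row of the comparison can be illustrated in Python, where `typing.Any` plays the role of the `any` type. This is a sketch using only the standard library; the `Ticket` model and `parse_ticket` helper are hypothetical, and a Pydantic model could express the same idea:

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class LooseTicket:
    # What tends to get generated: everything is Any or nullable "just in case".
    id: Any = None
    title: Optional[str] = None
    assignee: Optional[Any] = None


@dataclass(frozen=True)
class Ticket:
    # Concrete types; only the field that can truly be absent is Optional.
    id: int
    title: str
    assignee: Optional[str] = None


def parse_ticket(raw: dict[str, object]) -> Ticket:
    """Build a Ticket, rejecting missing or mistyped required fields."""
    ticket_id = raw.get("id")
    title = raw.get("title")
    assignee = raw.get("assignee")
    if not isinstance(ticket_id, int):
        raise TypeError(f"id must be an int, got {ticket_id!r}")
    if not isinstance(title, str):
        raise TypeError(f"title must be a str, got {title!r}")
    if assignee is not None and not isinstance(assignee, str):
        raise TypeError(f"assignee must be a str or None, got {assignee!r}")
    return Ticket(id=ticket_id, title=title, assignee=assignee)
```

The loose model accepts a ticket without a title and pushes the problem downstream; the strict one rejects it at the boundary, where the cause is still obvious.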
3 min | 9 / 12
06
The Mistakes I Made
Lessons learned the hard way
01
Auto-Accept Overuse
Accepted everything without reviewing. Use plan mode and stay on the product side. Once you go deep into technical implementation, you lose creativity.
Use hooks. Block auto-accept. Understanding > speed.
02
Accept Before Understand
Used AI to shortcut, not to learn. Autocomplete felt fast but created debugging sessions where I didn't understand the code I was running.
Use Claude to learn, not to skip. Ask 'why' not just 'how'.
03
Vague Instructions
Just said 'fix it' without context. Works for simple bugs, but complex logic needs your assumptions and constraints spelled out.
Be explicit: what's broken, what I expect, what constraints exist.
04
Not Resetting Context
When Claude went down a wrong path, I kept nudging. A poisoned context can't be fixed incrementally.
Start fresh. New chat. The agent that writes the code should not be the one that validates it.
2 min | 10 / 12
07
Let's Discuss
Open questions for the room
Claude Dream
Does it make sense to run Claude at night or on weekends to explore ideas autonomously?
Agent Harness Complexity
Do we really need this much complexity? Or can agent orchestration be simpler?
GSD / BMAD / Superpowers
Are these meta-frameworks actually necessary, or are we over-engineering our prompts?
Token-Saving Stone Age Mode
When Claude sacrifices grammar and filler words to save tokens... it writes like a caveman. Worth it?
What's Your System?
Everyone is individual. Genuinely curious: what's your AI development workflow?
1 min | 11 / 12
Plan. Execute. Evaluate.
AI won't replace developers. Developers who use AI will.
Understand the code
Challenge the output
Teach your AI partner
Marc Baumholz
marc.baumholz@getflip.com
What's your system?