1 of 11

Claude's Mistakes

An AI Engineer's Perspective

Marc Baumholz

AI Engineer @ Flip | Founder, START Stuttgart

CS Bachelor | Cognitive Science Master

2 of 11

01

Who am I?

Marc Baumholz

AI Engineer @ Flip

Building the AI agent for frontline workers. Handling orchestration: which tool or agent gets called, when, and why.

Founder, START Stuttgart

Founded the startup initiative, connecting tech founders and pushing innovation in the Stuttgart ecosystem.

CS Bachelor & Cognitive Science Master

Computer Science fundamentals combined with an understanding of how humans think and interact with systems.

AI Power User Since Day One

Tested GPT, Cursor, Bolt, Lovable, Replit, Manus, and many more. Heavy AI user from the very beginning.

Claude's Mistakes | Marc Baumholz


3 of 11

02

My Journey to Claude

From tool-hopping to deep integration

2023

GPT Era

Started with ChatGPT.

First AI-assisted coding.

Mind officially blown.

2024

Tool Hopping

Cursor, Bolt, Lovable, Replit, Manus and more.

Searching for the right fit.

2025

Claude Click

Claude Code clicked.

Plan mode, MD files, hooks. Real workflow.

NOW

AI Agent Builder

Building the AI agent for frontline workers.

Orchestration is key.


4 of 11

03

My Daily Setup

Tools, modes & workflow

DAILY FLOW

Whisper Flow + Shortcuts

Dictate prompts and drive everything else with keyboard shortcuts. No typing walls of text into the terminal.

CLAUDE.md Per Directory

Each project has its own context file under 200 lines. Architecture, patterns, conventions.

Plan Mode / Insert Mode

Almost never use auto mode. No vibe-coding. Plan first, execute deliberately.

SKILLS & MODES

Skills for Tickets & SQL

Start, plan, review tickets. SQL queries and evaluation. All automated via custom skills.

Superpowers & GSD

Brainstorm, spec-driven dev, GSD mode. Challenge my architecture with Claude as a sparring partner.

MCPs Everywhere

Google, Linear, Context7, Notion, Miro. Company strategy, values, and ticket scope all connected.

Python | Pydantic | AI Engineering | SQL | Claude Code CLI | Antigravity


5 of 11

03

My Philosophy

Plan, execute, evaluate — that simple

Plan

Use plan mode. Stay on the product side. Once you go deep into technical implementation, you lose the creative perspective.

Execute

Small changes, step by step. Code quality is high, speed is decent, testing is good, review is interactive. No vibe-coding, no auto-accept.

Evaluate

Challenge the output. Stress-test from wrong angles. Use Claude to prep for meetings. Always validate the full user flow.


6 of 11

04

Key Learnings

What changed my daily workflow

CLAUDE.md < 200 Lines

One per directory. Focused context beats a massive wall of instructions every time.
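
A small check can keep that line budget honest, locally or in CI. The sketch below is illustrative and not from the talk: the function name `oversized_context_files` is hypothetical, and the 200-line default simply mirrors the guideline above.

```python
from pathlib import Path


def oversized_context_files(root: str, max_lines: int = 200) -> list[tuple[str, int]]:
    """Return (path, line_count) for every CLAUDE.md under root that is not below max_lines."""
    offenders = []
    for path in sorted(Path(root).rglob("CLAUDE.md")):
        line_count = len(path.read_text(encoding="utf-8").splitlines())
        # "Under 200 lines" means a file with exactly max_lines is already too long.
        if line_count >= max_lines:
            offenders.append((str(path), line_count))
    return offenders
```

Calling `oversized_context_files(".")` from the repository root returns an empty list when every context file is within budget, which makes it easy to turn into a failing CI step.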

Understand Every Line

Never accept code blindly. If you can't explain it, you can't debug it later.

Claude = Your Buddy

Ask questions. Learn. Try to solve a problem with Claude first before asking a colleague. Use it to explain other people's code too.

Claude Hooks

Pre-tool-use: explain your reasoning first. Seems horrible at first, but forces real thinking.
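
The talk does not show the hook itself. One crude way to get a similar effect is to reject the first attempt at any tool call and let the retry through. The sketch below rests on several assumptions that should be checked against the Claude Code hooks documentation: that a pre-tool-use hook receives a JSON payload on stdin with `tool_name` and `tool_input` fields, and that exit code 2 blocks the call and returns stderr to the model. The names `SEEN_FILE`, `call_fingerprint`, and `decide`, and the block-then-retry policy, are hypothetical.

```python
import hashlib
import json
import sys
import tempfile
from pathlib import Path

# Hypothetical location for remembering which calls were already challenged.
SEEN_FILE = Path(tempfile.gettempdir()) / "pretooluse_seen_calls.txt"


def call_fingerprint(payload: dict) -> str:
    """Stable hash of the tool name plus its input, used to recognise a retry."""
    raw = json.dumps(
        {"tool": payload.get("tool_name"), "input": payload.get("tool_input")},
        sort_keys=True,
    )
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def decide(payload: dict, seen: set[str]) -> tuple[int, str]:
    """Return (exit_code, message): 2 on the first sighting of a call, 0 on a retry."""
    fingerprint = call_fingerprint(payload)
    if fingerprint in seen:
        return 0, ""
    seen.add(fingerprint)
    return 2, "Explain your reasoning for this tool call first, then retry it."


def main() -> int:
    """Read the hook payload from stdin and report the decision via the exit code."""
    payload = json.load(sys.stdin)  # assumed payload shape, see the note above
    seen = set(SEEN_FILE.read_text().split()) if SEEN_FILE.exists() else set()
    exit_code, message = decide(payload, seen)
    SEEN_FILE.write_text("\n".join(sorted(seen)))
    if message:
        print(message, file=sys.stderr)
    return exit_code

# When saved as a hook script, finish the file with: sys.exit(main())
```

Keeping the policy in a pure function (`decide`) means it can be unit-tested without a running agent. The stdin and exit-code plumbing is the only part that depends on the assumed hook contract.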

Visualize & Stress-Test

Flow diagrams, mock examples. Then try to break it from the wrong angle. Wrong assumptions surface.

Teach Key Principles

DRY, SRP, YAGNI, Kaizen, Separation of Concerns. Also: ask for a list of unresolved questions after plan mode.


7 of 11

05

Mistakes Claude Makes

Patterns, safety & silent failures

Root cause: Claude is trained to make code run, not to follow your architecture or let things fail loudly.

Ignores Your Patterns

Finds a solution, but not YOUR pattern. Doesn't know your service layers, DTO conventions, or error flow unless explicitly taught.

Over-Engineers Safety

Too many try-catches, redundant null checks, extra validation: Claude bloats clean code. Prefer guard clauses over nested ifs.
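
To make the guard-clause point concrete, here is a minimal sketch that contrasts the two styles. The `Order` type and both function names are made up for illustration. The two functions behave identically and differ only in shape.

```python
from dataclasses import dataclass


@dataclass
class Order:
    customer_id: str
    items: list[str]
    paid: bool


def ship_nested(order: Order) -> str:
    """Nested style: the happy path is buried three levels deep."""
    if order.customer_id:
        if order.items:
            if order.paid:
                return f"shipping {len(order.items)} item(s) to {order.customer_id}"
            else:
                raise ValueError("order is not paid")
        else:
            raise ValueError("order has no items")
    else:
        raise ValueError("order has no customer")


def ship_guarded(order: Order) -> str:
    """Guard clauses: reject invalid input up front, keep the happy path flat."""
    if not order.customer_id:
        raise ValueError("order has no customer")
    if not order.items:
        raise ValueError("order has no items")
    if not order.paid:
        raise ValueError("order is not paid")
    return f"shipping {len(order.items)} item(s) to {order.customer_id}"
```

In the guarded version each precondition sits next to its error message, so a reviewer can check them one at a time.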

Silent Error Swallowing

Catches errors instead of failing loudly. Information silently drops. The system 'works' but data is lost.
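
A minimal sketch of the difference, using a hypothetical event parser (the function names and the JSON-lines input are assumptions for illustration only):

```python
import json


def parse_events_silently(lines: list[str]) -> list[dict]:
    """Anti-pattern: malformed lines vanish and the caller never finds out."""
    events = []
    for line in lines:
        try:
            events.append(json.loads(line))
        except Exception:
            pass  # data is lost here without a trace
    return events


def parse_events_loudly(lines: list[str]) -> list[dict]:
    """Fail loudly: the first malformed line stops processing with context attached."""
    events = []
    for number, line in enumerate(lines, start=1):
        try:
            events.append(json.loads(line))
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {number} is not valid JSON: {line!r}") from exc
    return events
```

Given three input lines where the second is malformed, the silent version returns two events and looks healthy. The loud version raises and names the offending line.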


8 of 11

05

Mistakes Claude Makes

Types, assumptions & codebase replication

WHAT CLAUDE DOES | WHAT YOU SHOULD DO

Uses the `any` type or makes values nullable without checking whether they really are. | Define concrete types. Verify that a value is actually nullable before allowing it.

Replicates bad patterns from your codebase. Garbage in, garbage out. | Clean your codebase first. Claude amplifies both good and bad patterns.

Violates SRP: one function does three things. Introduces new test frameworks. | Write conventions in CLAUDE.md. One function, one responsibility. Use existing tools.

Makes wrong assumptions and tries the same thing repeatedly when stuck. | Tell Claude to validate from scratch and explain its assumptions explicitly.
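
The typing advice can be illustrated in Python, where `Any` and `Optional` play the role of `any` and nullable fields. This is a sketch with made-up `LooseUser` and `User` types, using only the standard library. In a Pydantic codebase a model would do the runtime validation instead of the hand-written `__post_init__`.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class LooseUser:
    """What tends to get generated: anything goes, everything may be None."""
    id: Any
    email: Optional[str] = None


@dataclass(frozen=True)
class User:
    """Concrete types; email is required because every user in this example has one."""
    id: int
    email: str

    def __post_init__(self) -> None:
        # Dataclasses do not enforce annotations at runtime, so check explicitly.
        if not isinstance(self.id, int):
            raise TypeError(f"id must be int, got {type(self.id).__name__}")
        if not isinstance(self.email, str) or not self.email:
            raise ValueError("email must be a non-empty string")
```

`LooseUser(id="anything")` is accepted without complaint, while `User(id=1, email=None)` fails at construction. That is where a wrong nullability assumption should surface.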


9 of 11

06

The Mistakes I Made

Lessons learned the hard way

01

Auto-Accept Overuse

Accepted everything without reviewing. Use plan mode and stay on the product side. Once you go deep into technical implementation, you lose creativity.

Use hooks. Block auto-accept. Understanding > speed.

02

Accepting Before Understanding

Used AI to shortcut, not to learn. Autocomplete felt fast but created debugging sessions where I didn't understand the code I was running.

Use Claude to learn, not to skip. Ask 'why' not just 'how'.

03

Vague Instructions

Just said 'fix it' without context. Works for simple bugs, but complex logic needs your assumptions and constraints spelled out.

Be explicit: what's broken, what I expect, what constraints exist.

04

Not Resetting Context

When Claude went down a wrong path, I kept nudging. A poisoned context can't be fixed incrementally.

Start fresh. New chat. The agent that writes the code should not be the one that validates it.


10 of 11

07

Let's Discuss

Open questions for the room

Claude Dream

Does it make sense to run Claude at night or on weekends to explore ideas autonomously?

Agent Harness Complexity

Do we really need this much complexity? Or can agent orchestration be simpler?

GSD / BMAD / Superpowers

Are these meta-frameworks actually necessary, or are we over-engineering our prompts?

Token-Saving Stone Age Mode

When Claude sacrifices grammar and filler words to save tokens... it writes like a caveman. Worth it?

What's Your System?

Everyone works differently. Genuinely curious: what's your AI development workflow?


11 of 11

Plan. Execute. Evaluate.

AI won't replace developers. Developers who use AI will.

Understand the code

Challenge the output

Teach your AI partner

Marc Baumholz

marc.baumholz@getflip.com

What's your system?