CLIs Are for Robots, IDEs Are for Humans

A mental model for agentic coding workflows: where machines execute, where humans judge, and why keeping that distinction sharp makes everything work better.


Published On

Mon Apr 06 2026


CLIs are for robots, IDEs are for humans. That's the mental model I keep coming back to when thinking about agentic coding. Agents do their best work in text and terminal space: executing, iterating, grinding through boilerplate. Humans do their best work from within the editor: reading, judging, reshaping. The workflow clicks when you stop treating those two things as the same surface. The small friction of switching between the two modes of creation provides a much-needed boundary for clarity and reflection.

Agentic coding as delegated execution

The way I think about it: agents handle the groundwork for a new feature, plan a script for repetitive changes, or take that first stab at a problem. I prompt them the way I'd hand off a task to an intern or junior engineer. It's high leverage, but I review the output before it goes anywhere. The shift is moving your attention up a level. You're thinking about approach, structure, and intent rather than which keys to press.
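A handoff in that spirit looks less like a one-liner and more like a brief. A sketch of the kind of prompt I mean (the feature, file names, and constraints are all illustrative):

```
Add a CSV export option to the reports service.
- Reuse the existing report serializer; don't add new dependencies.
- Match the error-handling pattern used by the JSON export path.
- Write tests first. I'll review the diff before anything merges.
```

The constraints do most of the work: they tell the agent what "done" means and what it's not allowed to touch, the same things you'd tell a junior engineer.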

The IDE as the decision-making surface

The editor is where humans are strongest, so that's where review lives. Diff views, refactors, and jump-to-definition exist because reading and navigating code is a human problem. I want to be reading, reshaping, and rejecting agent changes before they become committed history. Lending a careful eye to data flow and data structure keeps the parallel contributions on track.

Natural language becomes code in the terminal. Whether it stays as code gets decided in the IDE.

Tests as behavior contracts

Tests keep agent output from becoming a black box. We want high-quality software, not user-facing regressions. I'm not using tests just to verify correctness; I'm using them to document what the system is supposed to do, independent of how it does it. That unlocks a clean TDD-style loop: the agent generates, the tests define truth, and I refine. As long as the tests pass, I can optimize or rewrite freely. The implementation becomes a detail.
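A minimal sketch of a behavior contract in Python, with pytest-style tests. `slugify` is a hypothetical agent-generated function; the tests pin down what it must do, not how:

```python
import re


def slugify(title: str) -> str:
    # Agent-generated implementation. It's a detail: the agent can
    # rewrite this freely as long as the contract below still passes.
    cleaned = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return cleaned.strip("-")


# Behavior contract: these tests document the intended behavior,
# independent of the implementation above.
def test_lowercases_and_hyphenates():
    assert slugify("Hello World") == "hello-world"


def test_strips_punctuation():
    assert slugify("CLIs, IDEs & You!") == "clis-ides-you"


def test_collapses_repeated_separators():
    assert slugify("a -- b") == "a-b"
```

When the agent proposes a rewrite, the question isn't "is this the same code?" but "does the contract still hold?", which is a much faster review.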

Git worktrees and stacked PRs for multi-agent workflows

Git worktrees are great. They let me run multiple agents on isolated features at the same time without them stepping on each other, and they keep PRs focused and reviewable. Each agent gets its own branch and its own context.
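The mechanics are plain git; the branch and directory names here are illustrative:

```shell
# One isolated checkout per agent, each on its own branch.
git worktree add ../repo-auth   -b feature/auth
git worktree add ../repo-search -b feature/search

# Each directory is a full working tree sharing one .git store,
# so two agents can edit and run tests without colliding.
git worktree list

# When a feature lands, remove its tree and delete the branch.
git worktree remove ../repo-auth
git branch -d feature/auth
```

Because the worktrees share one object store, branches and commits made in one checkout are immediately visible from the others.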

Stacked PRs keep the changes organized for peer review, and large work stays incremental instead of landing as one undiffable blob. You can scale up experimentation without losing the discipline of actually reviewing what you're shipping.

The /docs folder as shared memory

I keep a /docs folder in the repo, not in Notion or Jira. I find filesystem-accessible documentation helpful in ways that per-feature agent.md files aren't. These Markdown files cover architecture decisions, trade-offs, and system explanations, and all of it lives next to the code. Agents can read it like a README when they search for keywords related to the task at hand. So can I, six weeks later. It cuts down on repeated explanations and the prompt drift that builds up over long sessions when the agent loses the thread on why things are structured the way they are. Tribal knowledge is preserved and actively used.
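The shape is nothing fancy. An illustrative layout (all file names are hypothetical):

```
docs/
  architecture.md            # high-level system map and why it's shaped this way
  decisions/
    001-queue-over-cron.md   # one short note per significant trade-off
  payments.md                # subsystem explainer an agent can find by grepping "payments"
```

The point is that plain Markdown next to the code is greppable by both the agent and me, with no API or login in the way.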

Closing the loop: having agents update their own docs

After a significant change, especially if I've corrected a repeated mistake from the coding LLM harness or changed direction, I have the agent update the relevant documentation. It ends up writing explanations for itself as much as for future humans. That's the point. Keeping the architectural context fresh and actually aligned with the codebase is the part that usually falls apart first.

Pitfall: LLMs are mirrors

The agent reflects your language, your tone, and your level of precision back at you. Vague prompts produce vague code. Casual language leads to casual structure. If you're working through a data-heavy problem, you need to be explicit about data models and algorithms upfront. If the task is UI/UX-heavy, design and interaction terms matter. The quality of what comes out is proportional to the clarity of what goes in. That one takes a while to really internalize, even after some experimentation. Asking for the same output while role-playing as a LeetCode question author versus an art student will yield results that are worlds apart.

Wrapping up

Agentic coding scales execution, not responsibility. You still own correctness, intent, and taste. The workflow holds up because it's honest about where machines are strong and where people need to stay in the loop, and it doesn't try to blur that line.
