Pi, Coding Agents, and Context Engineering: A Synthesis of Practices

How should we build and use a coding agent? Should we stack MCP servers and give it free rein, or restrict its tools and keep its state explicit?

Since 2025, Mario Zechner (creator of libGDX and co-founder of Earendil) has been documenting an opinionated vision of engineering with AI agents. This vision led to the creation of pi, a deliberately minimalist agent. I took the time to re-read and synthesize his essays from 2025 and 2026.

Here is what I took away about integrating LLMs into our daily engineering workflow.

The Philosophy: Minimalism as a Strength

The pi agent relies on a radically simple set of four tools: read, write, edit, and bash. Its system prompt and tool definitions together weigh in at less than 1,000 tokens. Yet benchmark results (such as Terminal-Bench 2.0) indicate that this approach competes with much heavier agents (Claude Code, Cursor, Codex).

Zechner’s conclusion is clear: current models have been sufficiently trained via reinforcement learning (RL) to inherently understand what a coding agent is. There is no longer a need for 10,000-token system prompts.

“If I don’t need it, it won’t be built. And I don’t need a lot of things.” — Mario Zechner

Context engineering above all

Context is an LLM’s most precious resource. Beyond 100k tokens, response quality drops noticeably (details get lost in the middle of the conversation). This implies a strict rule: you must minimize what enters the context.

No heavy tools (like certain MCP servers) that inject 15,000 tokens with every request.
Pre-generate useful data rather than letting the agent “explore” a large codebase ad hoc.

Slowing down is a feature, not a bug

Agents generate code fast. Too fast. In his March 2026 article, “Thoughts on slowing the fuck down”, Mario highlights a major trap: small errors, code smells, duplication, and bad abstractions accumulate at a frantic pace once the human is no longer the bottleneck. A human learns from their mistakes; an LLM repeats them endlessly.

The human must therefore remain the entry point and the final quality gate:

“Anything that defines the gestalt of your system — architecture, API — write it by hand.”

6 Best Practices for Taming Agents

1. Prompts are code, files = state

Treat LLMs like slow, non-deterministic computers. An agent’s workflow is a full-fledged program. The system prompt is your source code, your bash/jq scripts are your libraries, and your Markdown or JSON files are your persistent state.

The Java to C++ port of the “Spine Runtimes” demonstrates this: a port that took 2 to 3 weeks for a human dropped to 2-3 days. How? By defining a porting-plan.json file and a port.md workflow file that the agent follows scrupulously.
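As a minimal sketch of the “files = state” idea, the snippet below treats a plan as a list of entries with a status field that a script (or the agent, through a script) reads and updates. The schema, the file names, and the function names are all hypothetical; the actual layout of porting-plan.json is not described here.

```python
import json
from pathlib import Path
from typing import Optional


def example_plan() -> dict:
    """Hypothetical plan layout; the real porting-plan.json may look different."""
    return {
        "files": [
            {"path": "Animation.java", "status": "done"},
            {"path": "Bone.java", "status": "pending"},
            {"path": "Skeleton.java", "status": "pending"},
        ]
    }


def next_pending(plan: dict) -> Optional[dict]:
    """Return the first entry that still needs work, or None when finished."""
    for entry in plan["files"]:
        if entry["status"] == "pending":
            return entry
    return None


def mark_done(plan: dict, path: str) -> None:
    """Record progress in the plan so a fresh session can resume from the file."""
    for entry in plan["files"]:
        if entry["path"] == path:
            entry["status"] = "done"
            return
    raise KeyError(f"{path} is not in the plan")


def load_plan(plan_file: Path) -> dict:
    """Read the persisted state from disk."""
    return json.loads(plan_file.read_text(encoding="utf-8"))


def save_plan(plan: dict, plan_file: Path) -> None:
    """Write the state back; the file, not the conversation, holds the progress."""
    plan_file.write_text(json.dumps(plan, indent=2) + "\n", encoding="utf-8")
```

Because progress lives in the file, a session can be stopped at any point and resumed with a fresh context by loading the plan again.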

2. Don’t let the agent guess, pre-generate

Ad-hoc exploration is costly in tokens, non-deterministic, and slow. Instead, write scripts (e.g., generate-porting-plan.js) that prepare the exact data the agent will need.
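A pre-generation step can be very small. The sketch below is a Python stand-in for the kind of script mentioned above, not a reproduction of generate-porting-plan.js: it lists source files deterministically and writes them into a plan file. The `.java` suffix, the plan schema, and the function names are assumptions made for illustration.

```python
import json
from pathlib import Path


def collect_sources(root: Path, suffix: str = ".java") -> list:
    """List matching files relative to root, sorted so every run gives the same order."""
    return sorted(p.relative_to(root).as_posix() for p in root.rglob(f"*{suffix}"))


def build_plan(sources: list) -> dict:
    """Turn a file list into a plan the agent can work through entry by entry."""
    return {"files": [{"path": s, "status": "pending"} for s in sources]}


def write_plan(root: Path, out_file: Path) -> None:
    """Run the whole step: scan the tree once and persist the result as JSON."""
    plan = build_plan(collect_sources(root))
    out_file.write_text(json.dumps(plan, indent=2) + "\n", encoding="utf-8")
```

The agent then starts from the generated file instead of spending tokens on discovering the same list by searching.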

3. Document your conventions (for the agent)

Before a complex task, write a file (e.g., conventions.md) listing:

Naming styles
Desired memory management
Expected file structure

The agent will read it once, absorb it, and apply your standards.

4. Enforce checkpoints

The workflow should not be a black box. Ask the agent to request confirmation before committing a large change. This is defensive programming applied to AI.
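A checkpoint can be expressed in the prompt, but it can also be enforced by a script the workflow calls before committing. The following is a sketch of the second option under stated assumptions: the thresholds are arbitrary, and `ChangeSummary`, `needs_confirmation`, and `checkpoint` are hypothetical names, not part of pi.

```python
from dataclasses import dataclass


@dataclass
class ChangeSummary:
    files_changed: int
    lines_changed: int


def needs_confirmation(change: ChangeSummary, max_files: int = 5, max_lines: int = 200) -> bool:
    """A change counts as large when it exceeds either (arbitrary) threshold."""
    return change.files_changed > max_files or change.lines_changed > max_lines


def checkpoint(change: ChangeSummary, ask=input) -> bool:
    """Return True if the change may proceed; large changes need an explicit 'y'."""
    if not needs_confirmation(change):
        return True
    answer = ask(
        f"Commit {change.files_changed} files / {change.lines_changed} lines? [y/N] "
    )
    return answer.strip().lower() == "y"
```

Passing `ask` as a parameter keeps the gate testable: a test can supply a function that returns a canned answer instead of blocking on the terminal.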

5. Prefer small composable scripts

Build small script pipelines. Each script takes files in, outputs files, and is independently testable. It’s reproducible and auditable.
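One possible shape for such a step, sketched in Python: a pure function holds the logic, and a thin wrapper handles the file in and the file out. The plan schema (a "files" list whose entries carry a "status") and the function names are assumptions made for illustration.

```python
import json
from collections import Counter
from pathlib import Path


def summarize(plan: dict) -> dict:
    """Pure transformation: count plan entries per status."""
    return dict(Counter(entry["status"] for entry in plan["files"]))


def run_step(in_file: Path, out_file: Path) -> None:
    """File in, file out: the only side effect is writing out_file."""
    plan = json.loads(in_file.read_text(encoding="utf-8"))
    summary = summarize(plan)
    out_file.write_text(
        json.dumps(summary, indent=2, sort_keys=True) + "\n", encoding="utf-8"
    )
```

The pure function can be unit tested without touching the disk, and the output file can be inspected or diffed after each run.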

6. Avoid built-in features, use CLIs

The pi agent deliberately leaves out built-in to-do lists, MCP support, and background bash environments. Use a plain TODO.md file, command-line tools (CLIs), and tmux instead. Keep the agent “dumb” and transparent.

Daily Traps with AI

LLMs lack taste: They produce the “statistical average” of GitHub. The result: code that is often verbose or over-engineered. The human should write the architecture and the APIs; the agent only implements them.
Agentic search has low recall: The larger the codebase gets, the more the agent “misses” areas of code during its searches. The human must guide the agent or use static plans to identify areas to modify.
The overhead of the MCP protocol: An MCP server like Chrome DevTools exposes 26 tools whose definitions add 18,000 tokens to the context. This cost is paid on every request. Mario recommends using CLI tools with READMEs instead (progressive disclosure): the model only pays the tokens when it chooses to read the README.
Providers alter the AI behind your back: Tools like Claude Code have their system prompts or tools updated silently, which breaks the reproducibility of your workflows. Keep control of your system prompts (with a tool like pi).
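To make the MCP overhead concrete, here is a small arithmetic sketch. The 18,000-token figure and the 100k-token threshold come from the text above; the session length of 50 requests is an assumption chosen only for illustration, and the functions count tokens sent, not what a provider actually bills.

```python
def fixed_overhead_tokens(tool_definition_tokens: int, requests: int) -> int:
    """Tokens spent on tool definitions alone when they are resent with every request."""
    return tool_definition_tokens * requests


def share_of_window(tool_definition_tokens: int, window_tokens: int) -> float:
    """Fraction of a given token budget taken up before any real work starts."""
    return tool_definition_tokens / window_tokens


# Assumed session length: 50 requests.
total = fixed_overhead_tokens(18_000, 50)   # 900,000 tokens
share = share_of_window(18_000, 100_000)    # 0.18 of a 100k-token budget
```

Under these assumptions, tool definitions alone account for 900,000 tokens over the session and take up 18% of the 100k-token range in which response quality is said to hold.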

The Recommended Workflow

Combining these ideas, here is the roadmap for efficient assisted development:

Phase 1: Planning (without agent). Understand the problem yourself. Write the architecture, define conventions in markdown, and pre-generate lists of files to modify.
Phase 2: Execution. Launch the agent with a strict context and the plan file. Let it iterate in small cycles with human verification.
Phase 3: Validation. Read the code via diffs (do not rely on the model’s textual summary). Test, compile, and fix errors immediately.
Phase 4: Cleanup. Remove dead code and generic abstractions the LLM may have introduced to “look good.”

“Let the agent do the boring stuff, the stuff that won’t teach you anything new. Then you evaluate, take the ideas that are actually reasonable, and finalize the implementation.”

Zechner’s Toolkit

The agent: pi (minimalist, open-source, multi-provider)
The terminal: tmux (essential for dev servers and sub-agents)
Exploration: ripgrep (faster than grep) and jq (for JSON state)
CLI tools: agent-tools (instead of MCP)
Analysis: lsp-cli (to extract deterministic types)
Search & scraping: Sitegeist (traceable scraping)

Software development has changed. AI produces volume, but it is human discipline (context structuring, conventions, clear tools) that turns that volume into real value. And in this game, simplicity and determinism remain king.

If you want to dig deeper, I highly recommend Mario Zechner’s original articles on his blog: mariozechner.at, particularly “Prompts are code, .json/.md files are state” (June 2025) and “Thoughts on slowing the fuck down” (March 2026).
