Code Quality

Pi, Coding Agents, and Context Engineering: A Synthesis of Practices

2026-07-03 by [Mathias WOLFF]

A synthesis of Mario Zechner's ideas on coding agents, and of why the minimalism of the Pi agent, coupled with strict context engineering and CLI tools, often beats heavy frameworks.

Measuring If Tests Are Worth Anything: Four Axes

2026-07-02 by [Mathias WOLFF]

tests mutation‑testing coverage test‑smells flakiness audit

A test suite can show good coverage and still miss an obvious bug. To measure real quality, I look at four complementary axes, plus a simple audit you can apply to an existing project.

Who Writes What? Splitting the Roles Between Humans, AI, and Deterministic Tools

2026-06-18 by [Mathias WOLFF] (updated: 2026-07-03)

ai tests multi‑agent deterministic‑tools workflow

Once the spec is in place, I split the work between humans, AI, and deterministic tools. The critical test does not always come from the same author as the code, and the critic-agent often matters more than the writer-agent.

Executable Specification: What I Show to Humans and AI

2026-06-04 by [Mathias WOLFF] (updated: 2026-06-09)

ai tests spec gherkin bdd workflow

The first step in the AI workflow is specification. This article shows how I turn a vague user story into something a human and an agent can use without guessing, with Gherkin, typed examples, and properties.

AI Didn’t Kill Testing — It Made It Essential

2026-05-21 by [Mathias WOLFF] (updated: 2026-06-03)

ai testing tdd code‑quality workflow methodology

The implicit promise of AI coding assistants was that tests would become a thing of the past. The reality documented by Kent Beck, ThoughtWorks, and several 2025 studies is the opposite: with AI, tests become essential — but the work has shifted.