Securing pi from the Inside: Guards, Scanners, and Audit with pi-secured-setup
A few days ago, I covered Greywall — a kernel-level sandbox that contains pi with a deny-by-default approach. That’s your outer wall. But what about threats inside the boundary? The agent that accidentally writes to the wrong project, the .env file that ends up in the LLM context, the skill whose SKILL.md was silently modified. That’s a different problem, and it needs a different tool.
Today I’m releasing pi-secured-setup — a pi extension that adds Guards, Scanners, and an audit trail directly inside the agent. No kernel modules, no containers, no external dependencies. Just a pi install and you’re protected.
Threat Model — What Are We Actually Protecting Against?
pi-secured-setup is not designed to stop a determined attacker. That’s Greywall’s job. It targets six realistic scenarios that every pi user will encounter sooner or later:
| Threat | Example | Severity |
|---|---|---|
| Accidental cross-project damage | write to /home/user/other-project/file.ts | High |
| Sensitive file exposure | Agent reads your .env and the contents reach the LLM context | High |
| Destructive commands | rm -rf /, sudo, git push --force | High |
| Supply chain (skills) | A skill’s SKILL.md is modified to inject malicious instructions | Medium |
| Data exfiltration | curl or aws commands sending data externally | Medium |
| No accountability | Something went wrong, but you have no idea what | Medium |
Every one of these has happened to me. The last one is the most insidious — without an audit trail, you can’t even diagnose the problem.
Architecture — Three Layers, One Extension
pi-secured-setup is a single pi extension with three distinct layers:
Guards evaluate tool calls before execution and can block or confirm them. They run in a fixed pipeline with a single combined handler — no double dialogs, no race conditions.
Scanners observe data without blocking. They detect secrets in the provider payload and verify skill integrity. They never prevent a tool from running.
Audit records everything as append-only JSONL with automatic rotation.
The key design decision: Guards and Scanners are separate by intent. Guards can block. Scanners can only observe and report. This means the secret scanner will never accidentally block your workflow — it redacts silently and notifies you after the fact.
The Guard Pipeline
All three Guard modules (boundary, protected paths, bash gate) are evaluated by a single tool_call handler in fixed order:
boundary → protected-paths → bash-gate
First block wins. If boundary blocks a write, protected paths and bash gate never run. If boundary allows, protected paths gets a turn. If both allow and the tool is bash, the command gets classified. One verdict per tool call, one audit entry.
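The first-block-wins rule can be sketched in a few lines. This is an illustrative model, not pi’s actual extension API — the `Guard` and `Verdict` shapes here are assumptions for the sake of the example:

```typescript
// Illustrative sketch of a first-block-wins guard pipeline.
// The Guard/Verdict types are assumptions, not pi's real API.

type Verdict = { action: "allow" | "block" | "confirm"; reason?: string };
type Guard = (tool: string, args: Record<string, unknown>) => Verdict;

function runPipeline(
  guards: Guard[],
  tool: string,
  args: Record<string, unknown>
): Verdict {
  for (const guard of guards) {
    const verdict = guard(tool, args);
    // First non-allow verdict wins; later guards never run.
    if (verdict.action !== "allow") return verdict;
  }
  return { action: "allow" };
}
```

Because the guards share one handler, there is exactly one verdict (and one audit entry) per tool call, no matter how many guards are registered.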
Guards in Action — What Gets Blocked
Boundary Enforcement
The boundary is your project directory (cwd). File operations via read, write, and edit are checked against it. Bash is explicitly excluded — you can’t reliably extract paths from arbitrary shell commands.
| Situation | Tool | Verdict |
|---|---|---|
| Write inside project | write, edit | ✅ Allow |
| Read inside project | read | ✅ Allow |
| Write outside project | write, edit | 🚫 Block |
| Read outside project | read | ⚠️ Confirm |
| Read/write to allowed external path | any | ✅ Allow |
| Any bash command | bash | ✅ Allow (handled by bash gate) |
The allowed-external list lets you whitelist paths like ~/.agents/skills or /tmp that are outside the project but needed for normal operation.
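The core of the boundary check is plain path resolution. Here is a minimal sketch, assuming the boundary is an absolute, normalized path — the function names are mine, not the extension’s:

```typescript
import * as path from "node:path";

// Sketch of a boundary check (function names are illustrative).
// A target is inside a base directory if its resolved form equals
// the base or starts with base + separator; resolving also
// normalizes ".." escapes.

function isInside(target: string, base: string): boolean {
  const resolved = path.resolve(base, target);
  return resolved === base || resolved.startsWith(base + path.sep);
}

function checkBoundary(
  tool: "read" | "write" | "edit",
  target: string,
  boundary: string,
  allowedExternal: string[]
): "allow" | "block" | "confirm" {
  if (isInside(target, boundary)) return "allow";
  if (allowedExternal.some((p) => isInside(target, path.resolve(p)))) return "allow";
  // Outside the boundary: writes are blocked, reads need confirmation.
  return tool === "read" ? "confirm" : "block";
}
```

Note that resolving relative paths against the boundary means `../other-project/file.ts` is caught just like an absolute path outside the project.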
Protected Paths
Even inside the boundary, some files deserve extra protection. Protected paths use glob patterns matched against the target file:
.env, .env.*, *.key, *.pem, id_rsa*, *secret*, *credential* ...
Write or edit to a protected path → block. Read from a protected path → confirm (configurable to allow or block). The patterns are fully configurable per project.
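Matching a filename against these globs is straightforward. A minimal sketch (the extension may well use a proper glob library; this hand-rolled version only supports `*`):

```typescript
// Sketch: matching filenames against protected-path glob patterns.
// Hand-rolled and *-only for illustration; a real implementation
// would likely use a glob library.

function globToRegExp(glob: string): RegExp {
  // Escape regex metacharacters, then turn glob "*" into ".*".
  const escaped = glob
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&")
    .replace(/\*/g, ".*");
  return new RegExp(`^${escaped}$`);
}

function isProtected(filename: string, patterns: string[]): boolean {
  return patterns.some((p) => globToRegExp(p).test(filename));
}
```

So `*.key` protects `server.key`, and `*secret*` catches `my-secret-config.ts` anywhere in the name.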
Bash Gate
Every bash command is classified into one of four categories based on regex rules; anything that matches none of them falls back to an unknown bucket that requires confirmation:
| Category | Examples | Verdict |
|---|---|---|
| SAFE | ls, cat, grep, git status | ✅ Auto-approved |
| MODERATE | npm install, mkdir, git commit | ✅ Auto-approved (logged) |
| DANGEROUS | rm -rf, sudo, eval, dd if= | ⚠️ Confirm |
| EXTERNAL | curl, ssh, aws, gcloud | ⚠️ Confirm |
| Unknown | anything not matching | ⚠️ Confirm |
Pipes are handled correctly: cat file | rm -rf / is classified as DANGEROUS because the pipeline takes the most dangerous component. Subshells like $(whoami) are extracted and classified independently.
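The “most dangerous component wins” rule for pipelines can be sketched like this. The regexes below mirror the examples in the table but are far simpler than what the extension actually ships, and the naive `|` split (which also catches `||`) plus the omitted subshell extraction are simplifications:

```typescript
// Sketch of the bash gate's pipeline rule: classify each pipe
// segment, then take the most dangerous category. Illustrative
// regexes only; subshell extraction is omitted.

type Category = "SAFE" | "MODERATE" | "DANGEROUS" | "EXTERNAL" | "UNKNOWN";

const RANK: Record<Category, number> = {
  SAFE: 0, MODERATE: 1, UNKNOWN: 2, EXTERNAL: 3, DANGEROUS: 4,
};

const RULES: Array<[Category, RegExp]> = [
  ["DANGEROUS", /\b(rm\s+-rf|sudo|eval|dd\s+if=)/],
  ["EXTERNAL", /\b(curl|ssh|aws|gcloud)\b/],
  ["MODERATE", /\b(npm\s+install|mkdir|git\s+commit)\b/],
  ["SAFE", /^\s*(ls|cat|grep|git\s+status)\b/],
];

function classifyOne(cmd: string): Category {
  for (const [cat, re] of RULES) if (re.test(cmd)) return cat;
  return "UNKNOWN";
}

function classify(command: string): Category {
  const parts = command.split("|").map((s) => s.trim()).filter(Boolean);
  return parts
    .map(classifyOne)
    .reduce((worst, c) => (RANK[c] > RANK[worst] ? c : worst), "SAFE" as Category);
}
```

With this rule, `cat file | rm -rf /` resolves to DANGEROUS even though its first segment is harmless.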
Here’s what the confirmation dialog looks like in practice:
🔒 Bash Command
⚠️ Dangerous command — allow execution?
rm -rf node_modules
Classification: DANGEROUS
Scanners — Secrets and Skills
Secret Scanner
The secret scanner runs on before_provider_request — it sees the exact payload about to be sent to the LLM, before it leaves your machine. It recursively walks every string value in the payload (provider-agnostic — works with Anthropic, OpenAI, Google, any provider) and matches against 15+ patterns:
- AWS access keys (`AKIA...`), AWS secret keys
- Anthropic, OpenAI, Gemini API keys
- Private key headers (`-----BEGIN RSA PRIVATE KEY-----`)
- Generic API keys, bearer tokens, passwords
- Database connection strings (`postgres://`, `mongodb://`, `redis://`)
- GitHub tokens (`ghp_`, `ghs_`), Slack tokens, Discord tokens
- High-entropy fallback detection
When a secret is found, it’s replaced with ***REDACTED:{pattern-name}***. The agent retains the type of secret without the value, so it can still reason about your configuration.
DATABASE_URL=postgres://user:supersecret@db.example.com:5432/prod
becomes:
DATABASE_URL=***REDACTED:db-connection***
Why only scan the request, not the response? If secrets are redacted before they reach the model, the model can never repeat them in a response. Input-side redaction is sufficient.
False positive mitigation is built in: placeholders like YOUR_API_KEY, xxx, REPLACE_... are skipped. Comment lines (#, //, --) are skipped entirely. And PEM private key headers (which start with -----) are correctly distinguished from SQL comments (which start with -- ).
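The recursive walk itself is simple. Here is a sketch with three illustrative patterns (the real extension ships 15+ plus the false-positive filters, which this sketch omits):

```typescript
// Sketch of recursive payload redaction. Three illustrative
// patterns; the extension's real list is much longer and also
// filters placeholders and comments.

const SECRET_PATTERNS: Array<[string, RegExp]> = [
  ["aws-access-key", /AKIA[0-9A-Z]{16}/g],
  ["db-connection", /(postgres|mongodb|redis):\/\/\S+/g],
  ["github-token", /gh[ps]_[A-Za-z0-9]{36}/g],
];

function redactString(s: string): string {
  let out = s;
  for (const [name, re] of SECRET_PATTERNS) {
    out = out.replace(re, `***REDACTED:${name}***`);
  }
  return out;
}

// Walk every string value in an arbitrary JSON-like payload —
// this is what makes the scanner provider-agnostic.
function redactPayload(value: unknown): unknown {
  if (typeof value === "string") return redactString(value);
  if (Array.isArray(value)) return value.map(redactPayload);
  if (value && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value as Record<string, unknown>).map(([k, v]) => [k, redactPayload(v)])
    );
  }
  return value;
}
```

Because the walk only cares about string values, it never needs to know whether the payload is shaped for Anthropic, OpenAI, or anyone else.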
Skill Scanner
Skills are powerful — a SKILL.md file enters the LLM context directly. If someone modifies a skill’s SKILL.md, they can inject arbitrary instructions into your agent. The skill scanner:
- Discovers all skills across standard directories (`~/.agents/skills/`, `~/.pi/agent/skills/`, `.pi/skills/`, `.agents/skills/`)
- Hashes each `SKILL.md` with SHA-256
- Compares against `skill-approvals.json`
- New or changed skills → approval prompt (once)
- Previously unapproved → notification only (no blocking dialog)
Only SKILL.md is hashed — supporting scripts and assets are not. The bash gate covers script execution. This keeps verification fast (one file per skill) and focused on the actual LLM attack surface.
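The integrity check boils down to one hash comparison per skill. A sketch, assuming the approval record shape (in the real extension the content comes from reading each SKILL.md and approvals persist in skill-approvals.json; the function names here are mine):

```typescript
import { createHash } from "node:crypto";

// Sketch of the per-skill integrity check. The Approval shape and
// function names are illustrative assumptions.

function hashSkillMd(content: string): string {
  return createHash("sha256").update(content, "utf8").digest("hex");
}

type Approval = { hash: string; status: "approved" | "denied" };

function skillStatus(
  content: string,
  known: Approval | undefined
): "new" | "changed" | "approved" | "denied" {
  const current = hashSkillMd(content);
  if (!known) return "new";                     // → approval prompt
  if (known.hash !== current) return "changed"; // → re-approval prompt
  return known.status;                          // no prompt either way
}
```

A modified SKILL.md produces a different digest, so a silent edit always surfaces as `changed` on the next session.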
Installation and Configuration
One-Command Install
pi install git:github.com/mwolff44/pi-secured-setup
That’s it. On first run, the extension creates ~/.pi/agent/security/ with default configs and scans your skills for approval.
First-Run Experience
🔒 Skill Review: grill-me
Skill: grill-me
Source: ~/.agents/skills/
Path: /home/user/.agents/skills/grill-me/SKILL.md
🆕 New skill detected.
--- SKILL.md preview ---
# Grill Me
Interview the user relentlessly about a plan or design...
---
Approve
> Deny
Skip
Each skill gets one prompt. Subsequent sessions show only a notification for unapproved skills.
Three-Layer Configuration
Configuration is loaded from three layers, merged in priority order:
1. defaults/ — shipped with the package (don't edit)
2. ~/.pi/agent/security/ — machine-specific overrides
3. .pi/security/ — project-specific overrides
Pattern lists are additive. A ! prefix excludes an inherited pattern:
// .pi/security/protected-paths.json
{
"patterns": [
"!*secret*", // Remove the inherited *secret* pattern for this project
"config/prod.yaml" // Add a project-specific pattern
],
"readAction": "allow" // Don't confirm reads for protected files in this project
}
This means you can tighten security per-project without touching your global config, or relax specific rules where they don’t make sense.
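The additive merge with `!` exclusions is easy to model. A minimal sketch, assuming layers are applied defaults-first (the real loader also merges scalar options like `readAction`, which this sketch ignores):

```typescript
// Sketch of the three-layer pattern merge: later layers add
// patterns, and a "!" prefix removes an inherited one.
// Scalar option merging is omitted.

function mergePatterns(layers: string[][]): string[] {
  const result = new Set<string>();
  for (const layer of layers) {
    for (const pattern of layer) {
      if (pattern.startsWith("!")) result.delete(pattern.slice(1));
      else result.add(pattern);
    }
  }
  return [...result];
}
```

Feeding it the defaults plus the project override from the example above drops `*secret*` and adds `config/prod.yaml`, leaving the rest of the inherited list intact.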
Per-Project Example
mkdir -p .pi/security
// .pi/security/command-rules.json
{
"dangerous": [
"terraform destroy",
"kubectl delete namespace"
]
}
// .pi/security/allowed-external.json
{
"paths": [
"../shared-libs"
]
}
The /security Dashboard and Commands
/security — Dashboard
🔒 Security Dashboard — Session m5x8k2-abc123
This session:
🔴 Blocked: 3 actions
🟡 Confirmed: 5 actions
🔵 Auto-approved: 42 actions
⚠️ Secrets redacted: 2
Skill status:
✅ 18 approved, ⚠️ 0 pending, 🚫 0 denied
Recent events:
10:28 [BLOCKED] write → /home/user/other-project/file.ts (outside boundary)
10:25 [CONFIRMED] bash → curl https://api.example.com (external)
10:22 [REDACTED] secret (anthropic-key) in read → .env
Log file: ~/.pi/agent/security/audit.jsonl
Full Command Reference
| Command | Description |
|---|---|
| /security | Dashboard with counts, recent events, skill status |
| /security:skills | Re-trigger skill approval flow for all skills |
| /security:trust <name> | Approve a skill by name, persist to config |
| /security:allow <path> | Add an external path to the allowed list |
| /security:clean [days] | Trim audit log entries older than N days (default: 30) |
Audit Log
Every action is recorded as JSONL in ~/.pi/agent/security/audit.jsonl:
{
"timestamp": "2026-05-07T10:28:15.123Z",
"sessionId": "m5x8k2-abc123",
"type": "boundary.block",
"severity": "warning",
"details": {
"tool": "write",
"path": "/home/user/other-project/file.ts",
"boundary": "/home/user/my-project",
"reason": "write outside project boundary"
}
}
The log rotates automatically — default is 10MB per file, 3 files retained. Configurable via ~/.pi/agent/security/audit-config.json.
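One nice property of JSONL is that ad-hoc analysis takes only a few lines. For example, this sketch (not part of the extension) counts events by type, matching the entry shape shown above:

```typescript
// Sketch: summarize an audit.jsonl string by event type.
// In practice you'd read ~/.pi/agent/security/audit.jsonl first.

function summarize(jsonl: string): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const line of jsonl.split("\n")) {
    if (!line.trim()) continue;
    const entry = JSON.parse(line) as { type: string };
    counts[entry.type] = (counts[entry.type] ?? 0) + 1;
  }
  return counts;
}
```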
Greywall + pi-secured-setup: Defense in Depth
These two tools solve different problems. They complement each other.
| Concern | Greywall | pi-secured-setup |
|---|---|---|
| Kernel-level containment | ✅ | ❌ |
| File boundary enforcement | ❌ | ✅ |
| Secret redaction in LLM context | ❌ (filters destinations, not content) | ✅ |
| Command classification | ✅ (kernel syscall level) | ✅ (regex patterns, user-facing dialogs) |
| Skill integrity verification | ❌ | ✅ |
| Audit trail | ❌ (violation logs) | ✅ (JSONL, dashboard, rotation) |
| Protected path glob matching | ❌ | ✅ |
| Network filtering | ✅ (SOCKS5 proxy) | ❌ |
| Credential substitution | ✅ (HTTP proxy) | ❌ (redacts in payload) |
| Requires kernel modules / root | ✅ | ❌ |
| Works on any OS | ❌ (Linux/macOS specific) | ✅ (runs inside pi, cross-platform) |
Recommended setup: Greywall for the outer wall — kernel-level containment, network filtering, credential protection. pi-secured-setup for the inner wall — application-level guards, secret redaction, skill verification, audit trail. Together they provide meaningful defense-in-depth without any single point of failure.
Conclusion
Security is layers, not silver bullets. Greywall contains pi at the kernel level. pi-secured-setup adds awareness inside the agent itself — blocking accidental damage, redacting secrets before they reach the LLM, verifying skill integrity, and keeping a complete audit trail.
The extension is open source, tested (98 unit tests), and installable with a single command. Every guard is a pure function with no pi dependency, making the core logic independently testable and auditable.
If you’re using pi daily — especially with access to sensitive projects — I’d argue this isn’t optional. It’s the minimum viable security posture.
pi install git:github.com/mwolff44/pi-secured-setup
The source is on GitHub. Contributions, feedback, and bug reports are welcome.
What security concerns do you have with AI coding agents? Are there threats I haven’t covered here? Let me know in the comments!