Securing pi from the Inside: Guards, Scanners, and Audit with pi-secured-setup
A few days ago, I covered Greywall — a kernel-level sandbox that contains pi with a deny-by-default approach. That’s your outer wall. But what about threats inside the boundary? The agent that accidentally writes to the wrong project, the .env file that ends up in the LLM context, the skill whose SKILL.md was silently modified. That’s a different problem, and it needs a different tool.
Today I’m releasing pi-secured-setup — a pi extension that adds Guards, Scanners, and an audit trail directly inside the agent. No kernel modules, no containers, no external dependencies. Just a pi install and you’re protected.
Threat Model — What Are We Actually Protecting Against?
pi-secured-setup is not designed to stop a determined attacker. That’s Greywall’s job. It targets six realistic scenarios that every pi user will encounter sooner or later:
| Threat | Example | Severity |
|---|---|---|
| Accidental cross-project damage | write to /home/user/other-project/file.ts | High |
| Sensitive file exposure | Agent reads your .env and the contents reach the LLM context | High |
| Destructive commands | rm -rf /, sudo, git push --force | High |
| Supply chain (skills) | A skill’s SKILL.md is modified to inject malicious instructions | Medium |
| Data exfiltration | curl or aws commands sending data externally | Medium |
| No accountability | Something went wrong, but you have no idea what | Medium |
Every one of these has happened to me. The last one is the most insidious — without an audit trail, you can’t even diagnose the problem.
Architecture — Three Layers, One Extension
pi-secured-setup is a single pi extension with three distinct layers:
Guards evaluate tool calls before execution and can block or confirm them. They run in a fixed pipeline with a single combined handler — no double dialogs, no race conditions.
Scanners observe data without blocking. They detect secrets in the provider payload and verify skill integrity. They never prevent a tool from running.
Audit records everything as append-only JSONL with automatic rotation.
The key design decision: Guards and Scanners are separate by intent. Guards can block. Scanners can only observe and report. This means the secret scanner will never accidentally block your workflow — it redacts silently and notifies you after the fact.
The Guard Pipeline
All three Guard modules (boundary, protected paths, bash gate) are evaluated by a single tool_call handler in fixed order:
boundary → protected-paths → bash-gate
First block wins. If boundary blocks a write, protected paths and bash gate never run. If boundary allows, protected paths gets a turn. If both allow and the tool is bash, the command gets classified. One verdict per tool call, one audit entry.
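The first-block-wins rule can be sketched in a few lines. This is an illustrative model, not pi’s actual extension API — the `Guard` and `Verdict` shapes here are assumptions for the sake of the example:

```typescript
// Illustrative sketch of a first-block-wins guard pipeline.
// The Guard/Verdict types are assumptions, not pi's real API.

type Verdict = { action: "allow" | "block" | "confirm"; reason?: string };
type Guard = (tool: string, args: Record<string, unknown>) => Verdict;

function runPipeline(
  guards: Guard[],
  tool: string,
  args: Record<string, unknown>
): Verdict {
  for (const guard of guards) {
    const verdict = guard(tool, args);
    // First non-allow verdict wins; later guards never run.
    if (verdict.action !== "allow") return verdict;
  }
  return { action: "allow" };
}
```

Because the guards share one handler, there is exactly one verdict (and one audit entry) per tool call, no matter how many guards are registered.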
Guards in Action — What Gets Blocked
Boundary Enforcement
The boundary is your project directory (cwd). File operations via read, write, and edit are checked against it. Bash is explicitly excluded — you can’t reliably extract paths from arbitrary shell commands.
| Situation | Tool | Verdict |
|---|---|---|
| Write inside project | write, edit | ✅ Allow |
| Read inside project | read | ✅ Allow |
| Write outside project | write, edit | 🚫 Block |
| Read outside project | read | ⚠️ Confirm |
| Read/write to allowed external path | any | ✅ Allow |
| Any bash command | bash | ✅ Allow (handled by bash gate) |
The allowed-external list lets you whitelist paths like ~/.agents/skills or /tmp that are outside the project but needed for normal operation.
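The core of the boundary check is plain path resolution. Here is a minimal sketch, assuming the boundary is an absolute, normalized path — the function names are mine, not the extension’s:

```typescript
import * as path from "node:path";

// Sketch of a boundary check (function names are illustrative).
// A target is inside a base directory if its resolved form equals
// the base or starts with base + separator; resolving also
// normalizes ".." escapes.

function isInside(target: string, base: string): boolean {
  const resolved = path.resolve(base, target);
  return resolved === base || resolved.startsWith(base + path.sep);
}

function checkBoundary(
  tool: "read" | "write" | "edit",
  target: string,
  boundary: string,
  allowedExternal: string[]
): "allow" | "block" | "confirm" {
  if (isInside(target, boundary)) return "allow";
  if (allowedExternal.some((p) => isInside(target, path.resolve(p)))) return "allow";
  // Outside the boundary: writes are blocked, reads need confirmation.
  return tool === "read" ? "confirm" : "block";
}
```

Note that resolving relative paths against the boundary means `../other-project/file.ts` is caught just like an absolute path outside the project.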
Protected Paths
Even inside the boundary, some files deserve extra protection. Protected paths use glob patterns matched against the target file:
.env, .env.*, *.key, *.pem, id_rsa*, *secret*, *credential* ...
Write or edit to a protected path → block. Read from a protected path → confirm (configurable to allow or block). The patterns are fully configurable per project.
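Matching a filename against these globs is straightforward. A minimal sketch (the extension may well use a proper glob library; this hand-rolled version only supports `*`):

```typescript
// Sketch: matching filenames against protected-path glob patterns.
// Hand-rolled and *-only for illustration; a real implementation
// would likely use a glob library.

function globToRegExp(glob: string): RegExp {
  // Escape regex metacharacters, then turn glob "*" into ".*".
  const escaped = glob
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&")
    .replace(/\*/g, ".*");
  return new RegExp(`^${escaped}$`);
}

function isProtected(filename: string, patterns: string[]): boolean {
  return patterns.some((p) => globToRegExp(p).test(filename));
}
```

So `*.key` protects `server.key`, and `*secret*` catches `my-secret-config.ts` anywhere in the name.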
Bash Gate
Every bash command is classified into one of four categories based on regex rules; anything that matches none of them falls back to an unknown bucket that requires confirmation:
| Category | Examples | Verdict |
|---|---|---|
| SAFE | ls, cat, grep, git status | ✅ Auto-approved |
| MODERATE | npm install, mkdir, git commit | ✅ Auto-approved (logged) |
| DANGEROUS | rm -rf, sudo, eval, dd if= | ⚠️ Confirm |
| EXTERNAL | curl, ssh, aws, gcloud | ⚠️ Confirm |
| Unknown | anything not matching | ⚠️ Confirm |
Pipes are handled correctly: cat file | rm -rf / is classified as DANGEROUS because the pipeline takes the most dangerous component. Subshells like $(whoami) are extracted and classified independently.
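The “most dangerous component wins” rule for pipelines can be sketched like this. The regexes below mirror the examples in the table but are far simpler than what the extension actually ships, and the naive `|` split (which also catches `||`) plus the omitted subshell extraction are simplifications:

```typescript
// Sketch of the bash gate's pipeline rule: classify each pipe
// segment, then take the most dangerous category. Illustrative
// regexes only; subshell extraction is omitted.

type Category = "SAFE" | "MODERATE" | "DANGEROUS" | "EXTERNAL" | "UNKNOWN";

const RANK: Record<Category, number> = {
  SAFE: 0, MODERATE: 1, UNKNOWN: 2, EXTERNAL: 3, DANGEROUS: 4,
};

const RULES: Array<[Category, RegExp]> = [
  ["DANGEROUS", /\b(rm\s+-rf|sudo|eval|dd\s+if=)/],
  ["EXTERNAL", /\b(curl|ssh|aws|gcloud)\b/],
  ["MODERATE", /\b(npm\s+install|mkdir|git\s+commit)\b/],
  ["SAFE", /^\s*(ls|cat|grep|git\s+status)\b/],
];

function classifyOne(cmd: string): Category {
  for (const [cat, re] of RULES) if (re.test(cmd)) return cat;
  return "UNKNOWN";
}

function classify(command: string): Category {
  const parts = command.split("|").map((s) => s.trim()).filter(Boolean);
  return parts
    .map(classifyOne)
    .reduce((worst, c) => (RANK[c] > RANK[worst] ? c : worst), "SAFE" as Category);
}
```

With this rule, `cat file | rm -rf /` resolves to DANGEROUS even though its first segment is harmless.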
Here’s what the confirmation dialog looks like in practice:
🔒 Bash Command
⚠️ Dangerous command — allow execution?
rm -rf node_modules
Classification: DANGEROUS
Scanners — Secrets and Skills
Secret Scanner
The secret scanner runs on before_provider_request — it sees the exact payload about to be sent to the LLM, before it leaves your machine. It recursively walks every string value in the payload (provider-agnostic — works with Anthropic, OpenAI, Google, any provider) and matches against 15+ patterns:
- AWS access keys (`AKIA...`), AWS secret keys
- Anthropic, OpenAI, Gemini API keys
- Private key headers (`-----BEGIN RSA PRIVATE KEY-----`)
- Generic API keys, bearer tokens, passwords
- Database connection strings (`postgres://`, `mongodb://`, `redis://`)
- GitHub tokens (`ghp_`, `ghs_`), Slack tokens, Discord tokens
- High-entropy fallback detection
When a secret is found, it’s replaced with ***REDACTED:{pattern-name}***. The agent retains the type of secret without the value, so it can still reason about your configuration.
DATABASE_URL=postgres://user:supersecret@db.example.com:5432/prod
becomes:
DATABASE_URL=***REDACTED:db-connection***
Why only scan the request, not the response? If secrets are redacted before they reach the model, the model can never repeat them in a response. Input-side redaction is sufficient.
False positive mitigation is built in: placeholders like YOUR_API_KEY, xxx, REPLACE_... are skipped. Comment lines (#, //, --) are skipped entirely. And PEM private key headers (which start with -----) are correctly distinguished from SQL comments (which start with -- ).
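The recursive walk itself is simple. Here is a sketch with three illustrative patterns (the real extension ships 15+ plus the false-positive filters, which this sketch omits):

```typescript
// Sketch of recursive payload redaction. Three illustrative
// patterns; the extension's real list is much longer and also
// filters placeholders and comments.

const SECRET_PATTERNS: Array<[string, RegExp]> = [
  ["aws-access-key", /AKIA[0-9A-Z]{16}/g],
  ["db-connection", /(postgres|mongodb|redis):\/\/\S+/g],
  ["github-token", /gh[ps]_[A-Za-z0-9]{36}/g],
];

function redactString(s: string): string {
  let out = s;
  for (const [name, re] of SECRET_PATTERNS) {
    out = out.replace(re, `***REDACTED:${name}***`);
  }
  return out;
}

// Walk every string value in an arbitrary JSON-like payload —
// this is what makes the scanner provider-agnostic.
function redactPayload(value: unknown): unknown {
  if (typeof value === "string") return redactString(value);
  if (Array.isArray(value)) return value.map(redactPayload);
  if (value && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value as Record<string, unknown>).map(([k, v]) => [k, redactPayload(v)])
    );
  }
  return value;
}
```

Because the walk only cares about string values, it never needs to know whether the payload is shaped for Anthropic, OpenAI, or anyone else.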
Skill Scanner
Skills are powerful — a SKILL.md file enters the LLM context directly. If someone modifies a skill’s SKILL.md, they can inject arbitrary instructions into your agent. The skill scanner:
- Discovers all skills across standard directories (`~/.agents/skills/`, `~/.pi/agent/skills/`, `.pi/skills/`, `.agents/skills/`)
- Hashes each `SKILL.md` with SHA-256
- Compares against `skill-approvals.json`
- New or changed skills → approval prompt (once)
- Previously unapproved → notification only (no blocking dialog)
Only SKILL.md is hashed — supporting scripts and assets are not. The bash gate covers script execution. This keeps verification fast (one file per skill) and focused on the actual LLM attack surface.
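The integrity check boils down to one hash comparison per skill. A sketch, assuming the approval record shape (in the real extension the content comes from reading each SKILL.md and approvals persist in skill-approvals.json; the function names here are mine):

```typescript
import { createHash } from "node:crypto";

// Sketch of the per-skill integrity check. The Approval shape and
// function names are illustrative assumptions.

function hashSkillMd(content: string): string {
  return createHash("sha256").update(content, "utf8").digest("hex");
}

type Approval = { hash: string; status: "approved" | "denied" };

function skillStatus(
  content: string,
  known: Approval | undefined
): "new" | "changed" | "approved" | "denied" {
  const current = hashSkillMd(content);
  if (!known) return "new";                     // → approval prompt
  if (known.hash !== current) return "changed"; // → re-approval prompt
  return known.status;                          // no prompt either way
}
```

A modified SKILL.md produces a different digest, so a silent edit always surfaces as `changed` on the next session.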
Installation and Configuration
One-Command Install
pi install git:github.com/mwolff44/pi-secured-setup
That’s it. On first run, the extension creates ~/.pi/agent/security/ with default configs and scans your skills for approval.
First-Run Experience
🔒 Skill Review: grill-me
Skill: grill-me
Source: ~/.agents/skills/
Path: /home/user/.agents/skills/grill-me/SKILL.md
🆕 New skill detected.
--- SKILL.md preview ---
# Grill Me
Interview the user relentlessly about a plan or design...
---
Approve
> Deny
Skip
Each skill gets one prompt. Subsequent sessions show only a notification for unapproved skills.
Three-Layer Configuration
Configuration is loaded from three layers, merged in priority order:
1. defaults/ — shipped with the package (don't edit)
2. ~/.pi/agent/security/ — machine-specific overrides
3. .pi/security/ — project-specific overrides
Pattern lists are additive. A ! prefix excludes an inherited pattern:
// .pi/security/protected-paths.json
{
"patterns": [
"!*secret*", // Remove the inherited *secret* pattern for this project
"config/prod.yaml" // Add a project-specific pattern
],
"readAction": "allow" // Don't confirm reads for protected files in this project
}
This means you can tighten security per-project without touching your global config, or relax specific rules where they don’t make sense.
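The additive merge with `!` exclusions is easy to model. A minimal sketch, assuming layers are applied defaults-first (the real loader also merges scalar options like `readAction`, which this sketch ignores):

```typescript
// Sketch of the three-layer pattern merge: later layers add
// patterns, and a "!" prefix removes an inherited one.
// Scalar option merging is omitted.

function mergePatterns(layers: string[][]): string[] {
  const result = new Set<string>();
  for (const layer of layers) {
    for (const pattern of layer) {
      if (pattern.startsWith("!")) result.delete(pattern.slice(1));
      else result.add(pattern);
    }
  }
  return [...result];
}
```

Feeding it the defaults plus the project override from the example above drops `*secret*` and adds `config/prod.yaml`, leaving the rest of the inherited list intact.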
Per-Project Example
mkdir -p .pi/security
// .pi/security/command-rules.json
{
"dangerous": [
"terraform destroy",
"kubectl delete namespace"
]
}
// .pi/security/allowed-external.json
{
"paths": [
"../shared-libs"
]
}
The /security Dashboard and Commands
/security — Dashboard
🔒 Security Dashboard — Session m5x8k2-abc123
This session:
🔴 Blocked: 3 actions
🟡 Confirmed: 5 actions
🔵 Auto-approved: 42 actions
⚠️ Secrets redacted: 2
Skill status:
✅ 18 approved, ⚠️ 0 pending, 🚫 0 denied
Recent events:
10:28 [BLOCKED] write → /home/user/other-project/file.ts (outside boundary)
10:25 [CONFIRMED] bash → curl https://api.example.com (external)
10:22 [REDACTED] secret (anthropic-key) in read → .env
Log file: ~/.pi/agent/security/audit.jsonl
Full Command Reference
| Command | Description |
|---|---|
| /security | Dashboard with counts, recent events, skill status |
| /security:skills | Re-trigger skill approval flow for all skills |
| /security:trust <name> | Approve a skill by name, persist to config |
| /security:allow <path> | Add an external path to the allowed list |
| /security:clean [days] | Trim audit log entries older than N days (default: 30) |
Audit Log
Every action is recorded as JSONL in ~/.pi/agent/security/audit.jsonl:
{
"timestamp": "2026-05-07T10:28:15.123Z",
"sessionId": "m5x8k2-abc123",
"type": "boundary.block",
"severity": "warning",
"details": {
"tool": "write",
"path": "/home/user/other-project/file.ts",
"boundary": "/home/user/my-project",
"reason": "write outside project boundary"
}
}
The log rotates automatically — default is 10MB per file, 3 files retained. Configurable via ~/.pi/agent/security/audit-config.json.
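One nice property of JSONL is that ad-hoc analysis takes only a few lines. For example, this sketch (not part of the extension) counts events by type, matching the entry shape shown above:

```typescript
// Sketch: summarize an audit.jsonl string by event type.
// In practice you'd read ~/.pi/agent/security/audit.jsonl first.

function summarize(jsonl: string): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const line of jsonl.split("\n")) {
    if (!line.trim()) continue;
    const entry = JSON.parse(line) as { type: string };
    counts[entry.type] = (counts[entry.type] ?? 0) + 1;
  }
  return counts;
}
```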
Greywall + pi-secured-setup: Defense in Depth
These two tools solve different problems. They complement each other.
| Concern | Greywall | pi-secured-setup |
|---|---|---|
| Kernel-level containment | ✅ | ❌ |
| File boundary enforcement | ❌ | ✅ |
| Secret redaction in LLM context | ❌ (filters destinations, not content) | ✅ |
| Command classification | ✅ (kernel syscall level) | ✅ (regex patterns, user-facing dialogs) |
| Skill integrity verification | ❌ | ✅ |
| Audit trail | ❌ (violation logs) | ✅ (JSONL, dashboard, rotation) |
| Protected path glob matching | ❌ | ✅ |
| Network filtering | ✅ (SOCKS5 proxy) | ❌ |
| Credential substitution | ✅ (HTTP proxy) | ❌ (redacts in payload) |
| Requires kernel modules / root | ✅ | ❌ |
| Works on any OS | ❌ (Linux/macOS specific) | ✅ (runs inside pi, cross-platform) |
Recommended setup: Greywall for the outer wall — kernel-level containment, network filtering, credential protection. pi-secured-setup for the inner wall — application-level guards, secret redaction, skill verification, audit trail. Together they provide meaningful defense-in-depth without any single point of failure.
Conclusion
Security is layers, not silver bullets. Greywall contains pi at the kernel level. pi-secured-setup adds awareness inside the agent itself — blocking accidental damage, redacting secrets before they reach the LLM, verifying skill integrity, and keeping a complete audit trail.
The extension is open source, tested (98 unit tests), and installable with a single command. Every guard is a pure function with no pi dependency, making the core logic independently testable and auditable.
If you’re using pi daily — especially with access to sensitive projects — I’d argue this isn’t optional. It’s the minimum viable security posture.
pi install git:github.com/mwolff44/pi-secured-setup
The source is on GitHub. Contributions, feedback, and bug reports are welcome.
What security concerns do you have with AI coding agents? Are there threats I haven’t covered here? Let me know in the comments!