Compare commits
7 Commits
| Author | SHA1 | Date |
|---|---|---|
| | 6ce2f0058d | |
| | d35fefddd9 | |
| | e56ddd6a1b | |
| | 53407a77cd | |
| | 6577c04a1f | |
| | 7a469dc913 | |
| | be4e1a6123 | |
@@ -21,10 +21,10 @@ jobs:
|
||||
extensions: pdo, pdo_pgsql, redis, mbstring, intl, bcmath
|
||||
coverage: none
|
||||
|
||||
- name: Setup Node 20
|
||||
- name: Setup Node 22
|
||||
uses: actions/setup-node@v4
|
||||
with:
|
||||
node-version: '20'
|
||||
node-version: '22'
|
||||
cache: 'npm'
|
||||
|
||||
- name: Install root JS deps
|
||||
|
||||
@@ -0,0 +1,144 @@
|
||||
# Discipline-guard backlog — router-gate `tools/enforce-*.mjs`
|
||||
|
||||
**Worktree:** `.claude/worktrees/discipline-guard` (branch `worktree-discipline-guard`).
|
||||
**Date:** 2026-05-31. Owner-authorized backlog after quirk-2 + 1A closure (commit `b0cd18d7`).
|
||||
|
||||
## Context (already done — do NOT redo)
|
||||
|
||||
- **Quirk 2** — redirect detector is quote-aware (`stripQuotedSpans` in `tools/enforce-router-gate.mjs`): `>`/`2>` inside quotes no longer false-blocks. Commit `b0cd18d7`.
|
||||
- **1A** — removed advertising of dead override phrases (`findOverride` is a v4 stub) from `enforce-prompt-injection` + verify-before-push / coverage-verify / memory-coverage / tdd-gate. Locked by negative tests. Same commit.
|
||||
- Marketing MCP servers cut from `.mcp.json` (commit `63100dec`).
|
||||
|
||||
## Deliberately NOT doing (these are defense lines, not bugs)
|
||||
|
||||
- Calibration 6 of the judge (reading chat context) — weakens in-session defense.
|
||||
- Quirk 3 (loosen exact-match of git approval) — that exact-match is an anti-injection property.
|
||||
|
||||
## Backlog (by priority)
|
||||
|
||||
### A. `npm ci` in router-gate whitelist (`SAFE_EXACT` in `tools/enforce-router-gate.mjs`) ← current
|
||||
|
||||
Restoring locked dependencies is safe and removes worktree-setup friction. `npm ci` installs
exactly the committed lockfile (deterministic, no version drift), unlike `npm install`/`npm i`,
which stay hard-blacklisted because they can pull new or updated versions.
|
||||
|
||||
**TDD:**
|
||||
1. RED — new describe block in `tools/enforce-router-gate.test.mjs`: allow `npm ci`,
|
||||
`npm ci --no-audit`, `npm ci --prefer-offline`; still block `npm install`/`npm i`/
|
||||
`npm install foo`/`npm i foo` (hard-blacklist), `npm cider` (word boundary → default-deny),
|
||||
`npm ci && rm x` (chain mutating).
|
||||
2. GREEN — add `/^npm\s+ci\b/` to `SAFE_EXACT` with rationale comment. `\b` prevents
|
||||
`npm cider`-style prefix matches. Blacklist runs before whitelist, so `npm install`/`npm i`
|
||||
stay blocked (the `i`-alternative needs `i` right after the space; `npm ci` has `c` there).
|
||||
3. tools-vitest full run (also the push sentinel).
|
||||
4. Commit via AskUserQuestion (label = exact command).
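
The rule in step 2 can be sketched as follows. This is a minimal illustration, not the code in `tools/enforce-router-gate.mjs`: `HARD_BLACKLIST`, `CHAIN`, and `classify` are hypothetical stand-ins, and the chain check is deliberately cruder than a real per-segment classifier. Only the `/^npm\s+ci\b/` pattern and the blacklist-before-whitelist order come from the notes above.

```javascript
// Hypothetical stand-ins for the real lists in tools/enforce-router-gate.mjs.
const HARD_BLACKLIST = [/^npm\s+(install|i)(\s|$)/];
const SAFE_EXACT = [/^npm\s+ci\b/]; // pattern proposed in step 2
const CHAIN = /&&|\|\||;|\|/; // assumed, simplified: any chain is denied

function classify(command) {
  const cmd = String(command).trim();
  if (CHAIN.test(cmd)) return 'deny';
  // Blacklist runs before whitelist, so `npm install` / `npm i` stay blocked.
  if (HARD_BLACKLIST.some((re) => re.test(cmd))) return 'deny';
  // `\b` after `ci` rejects prefix matches such as `npm cider`.
  if (SAFE_EXACT.some((re) => re.test(cmd))) return 'allow';
  return 'deny'; // default-deny
}
```

Under these assumptions, `npm ci`, `npm ci --no-audit`, and `npm ci --prefer-offline` are allowed, while `npm install foo`, `npm i`, `npm cider`, and `npm ci && rm x` are denied, which matches the RED cases listed in step 1.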
|
||||
|
||||
### B. Cosmetic path strings in gate messages
|
||||
|
||||
`c:/` vs `/c/`, unexpanded `$env:` in gate messages. Polish only.
|
||||
|
||||
### F. Parallel-session-lock false cross-worktree collision (2026-05-31, owner-raised)
|
||||
|
||||
Symptom: a session in worktree `discipline-guard` was blocked by
`enforce-parallel-session-lock`. The lock was held by another session, `7f6efd48`, whose
recorded pid changed from 12552 to 19044 across attempts, so the holder was still active.
The pid is the transient hook-node pid; `session_id` is the stable identity.
|
||||
|
||||
**Investigation (read-only):**
|
||||
- Lock keyed by `computeWorkspaceHash(process.cwd())` = md5(cwd).slice(0,12); file
|
||||
`~/.claude/runtime/session-lock-<hash>.json`; release only on Stop; TTL 5 min.
|
||||
- 9 lock files accumulated → stale files leak when a session closes without a clean Stop.
|
||||
- `enforce-branch-switch` read branch "worktree-discipline-guard" via
|
||||
`git branch --show-current` from `process.cwd()` → the hook's cwd IS the worktree →
|
||||
**keying is already per-worktree** (NOT coarse main-dir). So the holder shared this
|
||||
worktree's hash → genuine same-worktree concurrency, the lock working as designed —
|
||||
NOT a false positive. Do NOT re-key (would weaken same-tree serialization).
|
||||
|
||||
**Genuinely-fixable part (no weakening):** a lock leaked by close-without-Stop blocks the next
same-worktree session for up to the TTL. Fix: release on SessionEnd (not only Stop) and prune
stale lock files on acquire. Ground-truth the lock JSON before coding.
|
||||
|
||||
**Closure (2026-05-31).** All keying/hygiene/UX parts done, no discipline weakened:
|
||||
- **A — keying by worktree root** (`resolveWorkspacePath`, commit `7a469dc9`): keys the
|
||||
lock on the session's stable `event.cwd` → git toplevel, not the volatile hook
|
||||
`process.cwd()` (which collapses to main on resume → cross-worktree false-blocks).
|
||||
Same-worktree serialization unchanged; fallback to `process.cwd()` if `event.cwd` absent.
|
||||
- **D — clearer block message**: identifies the holder by its STABLE `session_id`; marks
|
||||
the recorded pid as transient ("may change between attempts"). Chasing the pid was what
|
||||
led to closing the wrong session. Logic untouched (text only).
|
||||
- **B — `pruneStaleLocks`**: best-effort delete of leaked lock files that are ALREADY
|
||||
stale by the shared `isStale()` (now exported — single source of truth). Active
|
||||
within-TTL locks are never touched → serialization not weakened. Wired into the
|
||||
PreToolUse branch of `main()`, wrapped so hygiene can never break the gate.
|
||||
- **C — release on SessionEnd**: NO new code. The existing `!event.tool_name` branch
|
||||
already releases. To make release fire on session end (not only on Stop turns),
|
||||
**OWNER ACTION in `.claude/settings.json`**: add `enforce-parallel-session-lock.mjs`
|
||||
to the `SessionEnd` hook array (it already runs on `Stop`). Pure config; Claude cannot
|
||||
edit settings.json. Until added, leaked locks are still self-healing via B (prune) +
|
||||
the 5-min TTL takeover — so this is a reliability nicety, not a correctness gap.
|
||||
- **E/F — live**: fix is on branch `worktree-discipline-guard`; the live hook executes
|
||||
from `tools/` on **main**, so it is active only after merge to main. Runtime
|
||||
effectiveness of A depends on the PreToolUse payload carrying `cwd`; if absent, the
|
||||
safe fallback = prior behavior (no regression). Verify on main.
|
||||
|
||||
### C. TDD-gate cross-actor — chosen: **Z** (full, 2026-05-31; on hold behind F)
|
||||
|
||||
`enforce-tdd-gate` does not see test edits made by a subagent (scans only the controller's
|
||||
own turn; subagent test edit + RED live in `agent-<id>.jsonl`). **Z = Part 1 (close the
|
||||
projects/ Write hole — verified prerequisite) then Part 2 (read subagent transcript bound to
|
||||
a Task in this turn).** Condition 1 verified VIOLATED (no Write-tool gate covers
|
||||
`~/.claude/projects/`), so Variant 1 alone would weaken — safe only bundled with Part 1.
|
||||
|
||||
**Closure (2026-05-31, TDD, no discipline weakened — net strengthening):**
|
||||
- **Part 1** — `enforce-runtime-write-deny.mjs` extended with `TRANSCRIPT_RE`
|
||||
(`(^|/)\.claude/projects/.*\.jsonl$`): the Write tool can no longer create/overwrite any
|
||||
session/subagent transcript `.jsonl`. Memory files there are `.md` and stay writable
|
||||
(never match `.jsonl$`). Resolving normalizer blocks `.`/`..` evasion. This makes the
|
||||
agent-`<id>`.jsonl that Part 2 trusts unforgeable.
|
||||
- **Part 2** — `enforce-tdd-gate.mjs`: `decide()` now also credits a subagent's matching
|
||||
test edit + RED run via new `subagentEntriesList`. `turnTaskAgentIds(turn)` extracts the
|
||||
**hex** agentId from the harness-written `Task` tool_result ("agentId: <hex>") — the
|
||||
controller cannot forge its own tool_result, and the hex-only match blocks
|
||||
`agentId: ../../x` path-traversal. `subagentTranscriptPaths()` derives
|
||||
`<dir>/<controller-session>/subagents/agent-<id>.jsonl` (bound to the controller session).
|
||||
`main()` reads those transcripts best-effort (missing → no extra credit = stricter, never
|
||||
an error). No NEW weakening: a delegated subagent doing real TDD is legitimate; the only
|
||||
forgery vector (overwrite the agent jsonl) is closed by Part 1.
|
||||
- Full tools-vitest: **2027 passed / 2 skipped**.
|
||||
- **OWNER ACTION (settings.json, Claude can't edit it):** `enforce-tdd-gate.mjs` is already
|
||||
a registered PreToolUse hook → Part 2 goes live on merge. **Part 1 requires that
|
||||
`enforce-runtime-write-deny.mjs` be registered** on PreToolUse(Edit|Write|MultiEdit|
|
||||
NotebookEdit); if it is not yet registered, the transcript Write-deny is inert until added.
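
The two string-level checks behind Part 1 and Part 2 can be sketched as below. This is an illustration under stated assumptions, not the shipped hooks: `normalizePath`, `isTranscriptWrite`, `extractAgentIds`, and `subagentTranscriptPath` are hypothetical names. Only the `TRANSCRIPT_RE` pattern, the `agentId: <hex>` shape, and the `<dir>/<controller-session>/subagents/agent-<id>.jsonl` layout are taken from the notes above.

```javascript
// Pattern quoted from the notes above: any .jsonl under .claude/projects/.
const TRANSCRIPT_RE = /(^|\/)\.claude\/projects\/.*\.jsonl$/;

// Hypothetical normalizer: unify separators and resolve "." / ".." segments
// so that "memory/../projects" cannot slip past the pattern.
function normalizePath(p) {
  const out = [];
  String(p).replace(/\\/g, '/').split('/').forEach((seg, i) => {
    if (seg === '.' || (seg === '' && i > 0)) return;
    if (seg === '..') {
      if (out.length > 0 && out[out.length - 1] !== '') out.pop();
      return;
    }
    out.push(seg);
  });
  return out.join('/');
}

// Part 1: deny Write-tool targets that resolve to a transcript .jsonl.
function isTranscriptWrite(filePath) {
  return TRANSCRIPT_RE.test(normalizePath(filePath));
}

// Part 2: hex-only capture, so "agentId: ../../x" yields nothing.
function extractAgentIds(toolResultText) {
  const re = /agentId:\s*([0-9a-f]+)(?=\s|$)/gi;
  return [...String(toolResultText).matchAll(re)].map((m) => m[1].toLowerCase());
}

// Transcript path bound to the controller session, as described above.
function subagentTranscriptPath(dir, controllerSession, agentId) {
  return `${dir}/${controllerSession}/subagents/agent-${agentId}.jsonl`;
}
```

In this sketch a `.md` memory file under `.claude/projects/` stays writable because it never matches `.jsonl$`, and an id is only ever interpolated into a path after it has passed the hex-only match.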
|
||||
|
||||
### G. Coverage line under-reports cross-turn active skill (2026-05-31, owner-raised)
|
||||
|
||||
Symptom: the `coverage: <channel>:<id>` line says `direct`/`chain` when a skill chosen in a
PRIOR turn is still active in the current turn. Root cause: `enforce-coverage-verify.mjs`
credits `channel=skill` only if the `Skill` tool was invoked in the CURRENT turn
(`turnToolUses`). On a continuation turn (skill still active, not re-invoked), an honest
`skill:X` line would be BLOCKED, so the controller learns to under-report as `direct`/`chain`.
|
||||
|
||||
**Fix (no weakening):** also credit `skill:X` if X was invoked anywhere earlier in THIS
|
||||
session (a real `Skill` tool_use in the transcript — still unforgeable). decide() gains a
|
||||
`priorSkillNames` param; main() collects session-wide Skill names via `sessionToolUses`.
|
||||
Residual: attribution may be stale (skill invoked long ago) — acceptable; the alternative
|
||||
(forced dishonest `direct`) is worse, and the owner wants cross-turn skills honored.
|
||||
|
||||
### D. Smoke 8 — live Workflow-gate F2 test
|
||||
|
||||
Needs a clean session (not code).
|
||||
|
||||
### E. H10 — auto-bootstrap worktree (junction node_modules) in `tools/subagent-prompt-prefix.mjs`
|
||||
|
||||
### (later) Layer 5 — VM + YubiKey — needs hardware.
|
||||
|
||||
## Environment working rules
|
||||
|
||||
- Tests / push sentinel: `npx vitest run --root app --config vitest.config.tools.mjs`
|
||||
(NOT `npm run test:tools` — breaks on keytar). From inside the worktree it's run as
|
||||
`--root app`; from the main checkout, point `--root` at the worktree app dir.
|
||||
- Commit: only via AskUserQuestion where the option label = the EXACT command (router-gate
|
||||
compares verbatim) + plain-language explanation; commit text via `-F` file in `.scratch/`;
|
||||
commit only explicit paths (parallel sessions).
|
||||
- Push: needs a fresh verify-sentinel (full run ≤30 min); override phrases are dead
|
||||
(`findOverride` is a stub) → the only path to push non-`.md` changes is to run the tests.
|
||||
@@ -26,6 +26,7 @@ import {
|
||||
lastAssistantText,
|
||||
parseCoverageLine,
|
||||
turnToolUses,
|
||||
sessionToolUses,
|
||||
findOverride,
|
||||
logOverride,
|
||||
exitDecision,
|
||||
@@ -38,7 +39,7 @@ const MUTATING_TOOLS = new Set([
|
||||
]);
|
||||
|
||||
export function decide({
|
||||
toolUses, assistantText, override,
|
||||
toolUses, assistantText, override, priorSkillNames = [],
|
||||
}) {
|
||||
// Pure conversational turn — skip.
|
||||
const hasMutating = toolUses.some((u) => MUTATING_TOOLS.has(u.name));
|
||||
@@ -59,12 +60,19 @@ export function decide({
|
||||
}
|
||||
|
||||
if (cov.channel === 'skill') {
|
||||
const found = toolUses.some((u) => u.name === 'Skill' && u.input && (u.input.skill === cov.id || u.input.skill === cov.id.replace(/^superpowers:/, '')));
|
||||
if (!found) {
|
||||
// Accept if the skill was invoked in THIS turn OR anywhere earlier in this
|
||||
// session (item G): a skill chosen in a prior turn stays active, so an honest
|
||||
// skill:X line on a continuation turn must not be punished into under-reporting.
|
||||
// Still unforgeable — a real Skill tool_use must exist in the transcript.
|
||||
const norm = (s) => String(s || '').replace(/^superpowers:/, '');
|
||||
const idNorm = norm(cov.id);
|
||||
const foundThisTurn = toolUses.some((u) => u.name === 'Skill' && u.input && norm(u.input.skill) === idNorm);
|
||||
const foundPrior = (priorSkillNames || []).some((n) => norm(n) === idNorm);
|
||||
if (!foundThisTurn && !foundPrior) {
|
||||
return {
|
||||
block: true,
|
||||
message: [
|
||||
`[enforce-coverage-verify] coverage says skill:${cov.id} but the Skill tool was never invoked with that name in this turn.`,
|
||||
`[enforce-coverage-verify] coverage says skill:${cov.id} but the Skill tool was never invoked with that name in this turn or any prior turn of this session.`,
|
||||
`Either invoke the skill via Skill tool, or switch coverage to direct:<role> with justification.`,
|
||||
].join('\n'),
|
||||
};
|
||||
@@ -87,8 +95,13 @@ async function main() {
|
||||
|
||||
const toolUses = turnToolUses(transcript);
|
||||
const assistantText = lastAssistantText(transcript);
|
||||
// Session-wide Skill invocations (item G): a skill chosen in a prior turn is
|
||||
// still active and may legitimately be named in this turn's coverage line.
|
||||
const priorSkillNames = sessionToolUses(transcript)
|
||||
.filter((u) => u.name === 'Skill' && u.input && u.input.skill)
|
||||
.map((u) => u.input.skill);
|
||||
|
||||
const result = decide({ toolUses, assistantText, override });
|
||||
const result = decide({ toolUses, assistantText, override, priorSkillNames });
|
||||
exitDecision(result);
|
||||
} catch {
|
||||
exitDecision({ block: false });
|
||||
|
||||
@@ -1,6 +1,40 @@
|
||||
import { describe, it, expect } from 'vitest';
|
||||
import { decide } from './enforce-coverage-verify.mjs';
|
||||
|
||||
// Cross-turn skill credit (backlog item G, 2026-05-31): a skill chosen in a PRIOR
|
||||
// turn stays active; an honest `skill:X` line on a continuation turn must NOT be
|
||||
// blocked just because the Skill tool was not re-invoked this turn. decide() takes
|
||||
// priorSkillNames (real Skill tool_uses from earlier in the session transcript).
|
||||
describe('enforce-coverage-verify / decide — cross-turn active skill (enforce-coverage-verify.mjs)', () => {
|
||||
it('credits skill:X when X was invoked in a PRIOR turn (priorSkillNames)', () => {
|
||||
const r = decide({
|
||||
toolUses: [{ name: 'Edit', input: { file_path: 'foo.mjs' } }],
|
||||
assistantText: 'coverage: skill:superpowers:test-driven-development\nworking',
|
||||
priorSkillNames: ['superpowers:test-driven-development'],
|
||||
});
|
||||
expect(r.block).toBe(false);
|
||||
});
|
||||
|
||||
it('normalizes the superpowers: prefix for prior-turn skills too', () => {
|
||||
const r = decide({
|
||||
toolUses: [{ name: 'Edit', input: { file_path: 'foo.mjs' } }],
|
||||
assistantText: 'coverage: skill:superpowers:test-driven-development',
|
||||
priorSkillNames: ['test-driven-development'],
|
||||
});
|
||||
expect(r.block).toBe(false);
|
||||
});
|
||||
|
||||
it('still blocks skill:X when X is neither in this turn nor any prior turn', () => {
|
||||
const r = decide({
|
||||
toolUses: [{ name: 'Edit', input: { file_path: 'foo.mjs' } }],
|
||||
assistantText: 'coverage: skill:superpowers:test-driven-development',
|
||||
priorSkillNames: ['some-other-skill'],
|
||||
});
|
||||
expect(r.block).toBe(true);
|
||||
expect(r.message).toMatch(/never invoked/);
|
||||
});
|
||||
});
|
||||
|
||||
describe('enforce-coverage-verify / decide', () => {
|
||||
it('allows turn with no mutating tools (pure conversational)', () => {
|
||||
const r = decide({ toolUses: [{ name: 'Read', input: {} }], assistantText: 'just talking' });
|
||||
|
||||
@@ -11,10 +11,12 @@
|
||||
* Activation: settings.json registration is deferred to Phase H-α/H-β
|
||||
* batch step. main() is a no-op (exit 0) until then.
|
||||
*/
|
||||
import { acquire, release, computeWorkspaceHash } from './parallel-session-lock.mjs';
|
||||
import { readFileSync, writeFileSync, unlinkSync, mkdirSync } from 'node:fs';
|
||||
import { acquire, release, computeWorkspaceHash, isStale } from './parallel-session-lock.mjs';
|
||||
import { readFileSync, writeFileSync, unlinkSync, mkdirSync, readdirSync } from 'node:fs';
|
||||
import { execFileSync } from 'node:child_process';
|
||||
import { join, dirname } from 'node:path';
|
||||
import { readStdin, parseEventJson, exitDecision, runtimeDir } from './enforce-hook-helpers.mjs';
|
||||
import { classifyBashCommand } from './enforce-router-gate.mjs';
|
||||
|
||||
/**
|
||||
* Pure decision: given an acquire() result, decide block/allow.
|
||||
@@ -29,12 +31,41 @@ export function decide({ acquireResult, sessionId }) {
|
||||
if (!acquireResult || typeof acquireResult !== 'object') return { block: false };
|
||||
if (acquireResult.acquired) return { block: false };
|
||||
const holder = acquireResult.holder || {};
|
||||
// Identify the holder by its STABLE session id, not the pid: the recorded pid
|
||||
// is the transient hook-node pid and changes between attempts, so chasing it
|
||||
// leads to closing the wrong session. Surface the pid only as a triage hint.
|
||||
return {
|
||||
block: true,
|
||||
reason: `parallel session lock held by ${holder.session_id || 'unknown'} (pid ${holder.pid || '?'}) — wait or close that session first`,
|
||||
reason: `parallel session lock held by session ${holder.session_id || 'unknown'} (current pid ${holder.pid || '?'}, may change between attempts — identify the session by its id, not pid) — wait for the 5-min TTL or close THAT session`,
|
||||
};
|
||||
}
|
||||
|
||||
/**
|
||||
* Calibration (2026-05-31, SCOPE fix, NOT a discipline drop). The lock's purpose
|
||||
* is to serialize concurrent FILE MUTATION between sessions on the same worktree.
|
||||
* A readonly Bash command (git status/log/diff, cat, grep, ls; "look-only" commands)
|
||||
* mutates nothing, so a peer session's lock must NOT block it. Reuse the
|
||||
* router-gate Bash classifier: an allow-verdict whose reason mentions
|
||||
* readonly/reading is a no-state-change command. Mirrors the LLM-judge readonly
|
||||
* calibration. Everything that can mutate — file edits, git commit/push,
|
||||
* dangerous Bash, and every NON-Bash tool — still acquires/checks the lock, so
|
||||
* same-worktree mutation serialization is unchanged.
|
||||
*
|
||||
* @param {object} event
|
||||
* @returns {boolean}
|
||||
*/
|
||||
export function isReadonlyBashEvent(event) {
|
||||
if (!event || event.tool_name !== 'Bash') return false;
|
||||
const command = (event.tool_input && event.tool_input.command) || '';
|
||||
if (!command) return false;
|
||||
try {
|
||||
const c = classifyBashCommand(command, {});
|
||||
return !!c && c.result === 'allow' && /readonly|reading/i.test(c.reason || '');
|
||||
} catch {
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* PreToolUse wiring: acquire (or same-session refresh / stale takeover) the lock,
|
||||
* then decide block/allow. I/O injected for testability.
|
||||
@@ -60,6 +91,64 @@ export function runReleaseAction({ event, cwd, readLock, deleteLock }) {
|
||||
return { released: true };
|
||||
}
|
||||
|
||||
/**
|
||||
* Resolve the stable work-tree root used as the lock key. Keys on the SESSION's
|
||||
* cwd (`event.cwd`, stable across resume) resolved to the git work-tree root —
|
||||
* NOT the hook's `process.cwd()`, which collapses to the main repo dir after a
|
||||
* session resume and thereby false-blocks sessions in DIFFERENT worktrees.
|
||||
* Pure (I/O injected): `runGitToplevel(dir)` returns the toplevel or '' on failure.
|
||||
*
|
||||
* @param {object} p
|
||||
* @param {object} p.event
|
||||
* @param {string} p.processCwd
|
||||
* @param {(dir:string)=>string} p.runGitToplevel
|
||||
* @returns {string}
|
||||
*/
|
||||
export function resolveWorkspacePath({ event, processCwd, runGitToplevel }) {
|
||||
const dir = (event && typeof event.cwd === 'string' && event.cwd) ? event.cwd : processCwd;
|
||||
try {
|
||||
const top = runGitToplevel(dir);
|
||||
if (top && typeof top === 'string') return top;
|
||||
} catch { /* fall through to raw dir (fail-open) */ }
|
||||
return dir;
|
||||
}
|
||||
|
||||
/**
|
||||
* Disk hygiene: delete leaked lock files whose record is ALREADY stale by the
|
||||
* shared isStale() definition (so an active within-TTL lock is never touched).
|
||||
* Pure (I/O injected). Best-effort: a failed read counts the file as stale
|
||||
* (garbage), a failed delete is swallowed — hygiene must never break the gate.
|
||||
*
|
||||
* @param {object} p
|
||||
* @param {string[]} p.files - absolute lock-file paths
|
||||
* @param {(f:string)=>object|null} p.readRecord
|
||||
* @param {(f:string)=>void} p.deleteRecord
|
||||
* @param {(rec:object|null, now:number)=>boolean} p.isStaleFn
|
||||
* @param {number} p.now
|
||||
* @returns {{pruned: number}}
|
||||
*/
|
||||
export function pruneStaleLocks({ files, readRecord, deleteRecord, isStaleFn, now }) {
|
||||
let pruned = 0;
|
||||
for (const f of files || []) {
|
||||
let rec = null;
|
||||
try { rec = readRecord(f); } catch { rec = null; }
|
||||
if (isStaleFn(rec, now)) {
|
||||
try { deleteRecord(f); pruned++; } catch { /* best-effort */ }
|
||||
}
|
||||
}
|
||||
return { pruned };
|
||||
}
|
||||
|
||||
function realGitToplevel(dir) {
|
||||
try {
|
||||
return execFileSync('git', ['-C', dir, 'rev-parse', '--show-toplevel'], {
|
||||
encoding: 'utf-8',
|
||||
timeout: 1000,
|
||||
stdio: ['ignore', 'pipe', 'ignore'],
|
||||
}).trim();
|
||||
} catch { return ''; }
|
||||
}
|
||||
|
||||
function lockPathFor(cwd) {
|
||||
return join(runtimeDir(), `session-lock-${computeWorkspaceHash(cwd)}.json`);
|
||||
}
|
||||
@@ -82,7 +171,10 @@ async function main() {
|
||||
// a lock bug can NEVER wedge the user out of their own session.
|
||||
try {
|
||||
const event = parseEventJson(await readStdin());
|
||||
const cwd = process.cwd();
|
||||
// Key by the session's stable work-tree root (event.cwd → git toplevel),
|
||||
// not the volatile hook process.cwd() (collapses to main on resume → false
|
||||
// cross-worktree blocks). Fallback to process.cwd() keeps prior behavior.
|
||||
const cwd = resolveWorkspacePath({ event, processCwd: process.cwd(), runGitToplevel: realGitToplevel });
|
||||
const p = lockPathFor(cwd);
|
||||
|
||||
// Stop event carries no tool_name → release path.
|
||||
@@ -91,6 +183,31 @@ async function main() {
|
||||
return exitDecision({ block: false });
|
||||
}
|
||||
|
||||
// Calibration (2026-05-31): a readonly Bash command never mutates the
|
||||
// worktree, so it is outside the lock's mutation-serialization scope — allow
|
||||
// without acquiring/blocking. Mutating tools (and every non-Bash tool) fall
|
||||
// through to acquire/check below, so serialization is unchanged.
|
||||
if (isReadonlyBashEvent(event)) {
|
||||
return exitDecision({ block: false });
|
||||
}
|
||||
|
||||
// Best-effort disk hygiene (B): drop leaked stale lock files before acquiring.
|
||||
// isStale-gated → an active within-TTL lock is never pruned, so same-worktree
|
||||
// serialization is untouched. Wrapped so hygiene can never break the gate.
|
||||
try {
|
||||
const dir = runtimeDir();
|
||||
const files = readdirSync(dir)
|
||||
.filter((f) => /^session-lock-.*\.json$/.test(f))
|
||||
.map((f) => join(dir, f));
|
||||
pruneStaleLocks({
|
||||
files,
|
||||
readRecord: (fp) => realReadLock(fp),
|
||||
deleteRecord: (fp) => realDeleteLock(fp),
|
||||
isStaleFn: isStale,
|
||||
now: Date.now(),
|
||||
});
|
||||
} catch { /* hygiene is best-effort */ }
|
||||
|
||||
// PreToolUse on a mutating tool → acquire/refresh, then block/allow.
|
||||
const r = runAcquireDecision({
|
||||
event,
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
// tools/enforce-parallel-session-lock.test.mjs
|
||||
// Stream H Task 7 — wrapper tests around the pure parallel-session-lock module.
|
||||
import { describe, it, expect } from 'vitest';
|
||||
import { decide } from './enforce-parallel-session-lock.mjs';
|
||||
import { decide, isReadonlyBashEvent } from './enforce-parallel-session-lock.mjs';
|
||||
|
||||
describe('enforce-parallel-session-lock wrapper (Stream H Task 7)', () => {
|
||||
it('allow when acquire succeeded (fresh own-lock)', () => {
|
||||
@@ -43,6 +43,25 @@ describe('enforce-parallel-session-lock wrapper (Stream H Task 7)', () => {
|
||||
});
|
||||
});
|
||||
|
||||
// D (2026-05-31): the block message must steer the human to the STABLE identity
|
||||
// (session id), not the transient hook pid — chasing the pid was what caused the
|
||||
// owner to close the wrong session and deadlock the workspace.
|
||||
describe('decide() message clarity (D) — pid is transient, identify by session id', () => {
|
||||
const blocked = { acquired: false, holder: { session_id: 'sess-A', pid: 12552, acquired_at: 0 } };
|
||||
|
||||
it('names the holder session id as the stable identity', () => {
|
||||
expect(decide({ acquireResult: blocked, sessionId: 's1' }).reason).toMatch(/sess-A/);
|
||||
});
|
||||
|
||||
it('marks the pid as changeable so the human does not chase it', () => {
|
||||
expect(decide({ acquireResult: blocked, sessionId: 's1' }).reason).toMatch(/may change|transient/i);
|
||||
});
|
||||
|
||||
it('still surfaces the pid for triage', () => {
|
||||
expect(decide({ acquireResult: blocked, sessionId: 's1' }).reason).toMatch(/12552/);
|
||||
});
|
||||
});
|
||||
|
||||
// Live wiring (point 2, 2026-05-31): PreToolUse acquires/refreshes the lock,
|
||||
// Stop releases it. I/O is injected (readLock/writeLock/deleteLock) so the
|
||||
// wiring stays pure and unit-testable; main() binds real fs.
|
||||
@@ -131,3 +150,147 @@ describe('runReleaseAction — Stop release wiring', () => {
|
||||
expect(deleted).toBe(false);
|
||||
});
|
||||
});
|
||||
|
||||
// Cross-worktree false-block fix (2026-05-31). The lock must key on the session's
|
||||
// stable work-tree root (from event.cwd → git toplevel), NOT the hook process.cwd()
|
||||
// — which collapses to the main repo dir after a session resume, making sessions in
|
||||
// DIFFERENT worktrees share one lock and block each other.
|
||||
import { resolveWorkspacePath, pruneStaleLocks } from './enforce-parallel-session-lock.mjs';
|
||||
|
||||
describe('resolveWorkspacePath — stable worktree key', () => {
|
||||
it('keys on event.cwd (the session worktree), not the hook process.cwd()', () => {
|
||||
const r = resolveWorkspacePath({
|
||||
event: { cwd: '/repo/.claude/worktrees/wt-A' },
|
||||
processCwd: '/repo',
|
||||
runGitToplevel: (dir) => dir,
|
||||
});
|
||||
expect(r).toBe('/repo/.claude/worktrees/wt-A');
|
||||
});
|
||||
|
||||
it('gives different keys for two different worktrees (no cross-block)', () => {
|
||||
const opts = { processCwd: '/repo', runGitToplevel: (dir) => dir };
|
||||
const a = resolveWorkspacePath({ event: { cwd: '/repo/.claude/worktrees/wt-A' }, ...opts });
|
||||
const b = resolveWorkspacePath({ event: { cwd: '/repo/.claude/worktrees/wt-B' }, ...opts });
|
||||
expect(a).not.toBe(b);
|
||||
});
|
||||
|
||||
it('resolves to the git work-tree root (collapses subdir variance)', () => {
|
||||
const r = resolveWorkspacePath({
|
||||
event: { cwd: '/repo/.claude/worktrees/wt-A/tools' },
|
||||
processCwd: '/repo',
|
||||
runGitToplevel: () => '/repo/.claude/worktrees/wt-A',
|
||||
});
|
||||
expect(r).toBe('/repo/.claude/worktrees/wt-A');
|
||||
});
|
||||
|
||||
it('falls back to processCwd when event.cwd is absent', () => {
|
||||
const r = resolveWorkspacePath({
|
||||
event: { tool_name: 'Edit' },
|
||||
processCwd: '/repo',
|
||||
runGitToplevel: (dir) => dir,
|
||||
});
|
||||
expect(r).toBe('/repo');
|
||||
});
|
||||
|
||||
it('falls back to the raw dir when git toplevel resolution fails (fail-open)', () => {
|
||||
const r = resolveWorkspacePath({
|
||||
event: { cwd: '/some/dir' },
|
||||
processCwd: '/repo',
|
||||
runGitToplevel: () => '',
|
||||
});
|
||||
expect(r).toBe('/some/dir');
|
||||
});
|
||||
});
|
||||
|
||||
// B (2026-05-31): disk hygiene. Leaked lock files (session closed without a clean
|
||||
// Stop) pile up in ~/.claude/runtime. Pruning ONLY removes records that are
|
||||
// already stale by the SAME isStale() definition acquire() uses — so it can never
|
||||
// drop an active (within-TTL) lock and never weakens same-worktree serialization.
|
||||
describe('pruneStaleLocks — drops only already-stale leaked locks (B)', () => {
|
||||
const fresh = { schema_version: 1, session_id: 'A', pid: 1, acquired_at: 1000, ttl_ms: 300000 };
|
||||
const stale = { schema_version: 1, session_id: 'B', pid: 2, acquired_at: 0, ttl_ms: 100 };
|
||||
const isStaleFn = (rec, now) => !rec || (now - (rec && rec.acquired_at || 0)) > ((rec && rec.ttl_ms) || 300000);
|
||||
|
||||
it('deletes stale lock files and never the fresh (active) ones', () => {
|
||||
const records = { '/r/lock-fresh.json': fresh, '/r/lock-stale.json': stale };
|
||||
const deleted = [];
|
||||
const r = pruneStaleLocks({
|
||||
files: Object.keys(records),
|
||||
readRecord: (f) => records[f],
|
||||
deleteRecord: (f) => deleted.push(f),
|
||||
isStaleFn, now: 1000,
|
||||
});
|
||||
expect(deleted).toEqual(['/r/lock-stale.json']);
|
||||
expect(r.pruned).toBe(1);
|
||||
});
|
||||
|
||||
it('treats an unreadable/garbage lock file as stale and prunes it', () => {
|
||||
const deleted = [];
|
||||
pruneStaleLocks({
|
||||
files: ['/r/garbage.json'],
|
||||
readRecord: () => { throw new Error('bad json'); },
|
||||
deleteRecord: (f) => deleted.push(f),
|
||||
isStaleFn, now: 1000,
|
||||
});
|
||||
expect(deleted).toEqual(['/r/garbage.json']);
|
||||
});
|
||||
|
||||
it('never throws when a delete fails (best-effort hygiene)', () => {
|
||||
expect(() => pruneStaleLocks({
|
||||
files: ['/r/x.json'],
|
||||
readRecord: () => stale,
|
||||
deleteRecord: () => { throw new Error('locked'); },
|
||||
isStaleFn, now: 1000,
|
||||
})).not.toThrow();
|
||||
});
|
||||
|
||||
it('does nothing for an empty file list', () => {
|
||||
const r = pruneStaleLocks({ files: [], readRecord: () => null, deleteRecord: () => {}, isStaleFn, now: 1 });
|
||||
expect(r.pruned).toBe(0);
|
||||
});
|
||||
});
|
||||
|
||||
// ── Calibration (2026-05-31): readonly Bash is outside the lock scope ──
|
||||
// The lock serializes concurrent FILE MUTATION between sessions on the same
|
||||
// worktree. A readonly Bash command (git status/log/diff, cat, grep, ls)
|
||||
// mutates nothing, so a peer session's lock must NOT block it. This mirrors the
|
||||
// LLM-judge readonly calibration (isReadonlyBashEvent in enforce-llm-judge-per-tool).
|
||||
// Everything that can mutate — file edits, git commit/push, dangerous Bash, and
|
||||
// every NON-Bash tool — still acquires/checks the lock, so mutation
|
||||
// serialization is unchanged (scope fix, NOT a discipline drop).
|
||||
describe('isReadonlyBashEvent — readonly Bash bypasses the lock (calibration 2026-05-31)', () => {
|
||||
const ev = (command) => ({ tool_name: 'Bash', tool_input: { command } });
|
||||
|
||||
it('treats readonly git (status/log/diff) as readonly', () => {
|
||||
expect(isReadonlyBashEvent(ev('git status'))).toBe(true);
|
||||
expect(isReadonlyBashEvent(ev('git log --oneline -5'))).toBe(true);
|
||||
expect(isReadonlyBashEvent(ev('git diff'))).toBe(true);
|
||||
});
|
||||
|
||||
it('treats whitelisted reading commands (cat/grep/ls) as readonly', () => {
|
||||
expect(isReadonlyBashEvent(ev('ls -la'))).toBe(true);
|
||||
expect(isReadonlyBashEvent(ev('cat README.md'))).toBe(true);
|
||||
expect(isReadonlyBashEvent(ev('grep -n foo bar.txt'))).toBe(true);
|
||||
});
|
||||
|
||||
it('does NOT treat mutating Bash as readonly (still acquires/blocks)', () => {
|
||||
expect(isReadonlyBashEvent(ev('rm -rf x'))).toBe(false);
|
||||
expect(isReadonlyBashEvent(ev('git commit -m "x"'))).toBe(false);
|
||||
expect(isReadonlyBashEvent(ev('npm install foo'))).toBe(false);
|
||||
});
|
||||
|
||||
it('does NOT treat a chain with a mutating part as readonly (C13)', () => {
|
||||
expect(isReadonlyBashEvent(ev('git status && rm x'))).toBe(false);
|
||||
});
|
||||
|
||||
it('only applies to the Bash tool — other tools still acquire the lock', () => {
|
||||
expect(isReadonlyBashEvent({ tool_name: 'Edit', tool_input: { file_path: 'a.js' } })).toBe(false);
|
||||
expect(isReadonlyBashEvent({ tool_name: 'Write', tool_input: { file_path: 'a.js' } })).toBe(false);
|
||||
});
|
||||
|
||||
it('is safe on malformed input', () => {
|
||||
expect(isReadonlyBashEvent(null)).toBe(false);
|
||||
expect(isReadonlyBashEvent({ tool_name: 'Bash', tool_input: {} })).toBe(false);
|
||||
expect(isReadonlyBashEvent({ tool_name: 'Bash' })).toBe(false);
|
||||
});
|
||||
});
|
||||
|
||||
@@ -21,13 +21,15 @@ import {
|
||||
parseEventJson,
|
||||
readRouterState,
|
||||
readRationalizationFlags,
|
||||
readTranscript,
|
||||
sessionToolUses,
|
||||
findOverride,
|
||||
loadOverrideVocab,
|
||||
} from './enforce-hook-helpers.mjs';
|
||||
|
||||
const SUPPRESS_RULE = 'classifier-mismatch';
|
||||
|
||||
-export function buildReminder({ classification, recentFlags, override }) {
+export function buildReminder({ classification, recentFlags, override, activeSkills = [] }) {
|
||||
const lines = ['## §17 Coverage / Discipline Reminder', ''];
|
||||
if (override) {
|
||||
lines.push(`Override phrase detected: "${override.phrase}". The following rules are suppressed for THIS prompt only:`);
|
||||
@@ -38,6 +40,16 @@ export function buildReminder({ classification, recentFlags, override }) {
|
||||
lines.push(' `coverage: <channel>:<id>`');
|
||||
lines.push('Channels: skill, node, chain, hook, agent, direct.');
|
||||
lines.push('');
|
||||
// Item G (2026-05-31): a skill invoked in an EARLIER turn stays active. Remind
|
||||
// explicitly so the coverage line is not under-reported as direct/chain when the
|
||||
// work actually continues under that skill. (The verifier now accepts a prior-turn
|
||||
// skill, so this report is honest, not a violation.)
|
||||
if (Array.isArray(activeSkills) && activeSkills.length > 0) {
|
||||
lines.push('**Active skill(s) still in effect from earlier this session:**');
|
||||
for (const s of activeSkills) lines.push(` - ${s}`);
|
||||
lines.push('If your work continues under one of these, report `coverage: skill:<name>` (not direct/chain).');
|
||||
lines.push('');
|
||||
}
|
||||
if (classification) {
|
||||
lines.push(`**Classifier output:** task_type=${classification.task_type || 'unknown'}, confidence=${classification.confidence ?? 'n/a'}`);
|
||||
if (classification.recommended_node) {
|
||||
@@ -94,7 +106,21 @@ async function main() {
|
||||
|
||||
const flags = readRationalizationFlags(sessionId);
|
||||
|
||||
-const reminder = buildReminder({ classification, recentFlags: flags, override });
|
||||
// Item G: detect skills invoked earlier this session (still active). The
|
||||
// transcript at UserPromptSubmit holds all prior turns. Best-effort.
|
||||
let activeSkills = [];
|
||||
try {
|
||||
const transcript = readTranscript(event.transcript_path);
|
||||
const seen = new Set();
|
||||
for (const u of sessionToolUses(transcript)) {
|
||||
if (u.name === 'Skill' && u.input && u.input.skill && !seen.has(u.input.skill)) {
|
||||
seen.add(u.input.skill);
|
||||
activeSkills.push(u.input.skill);
|
||||
}
|
||||
}
|
||||
} catch { activeSkills = []; }
|
||||
|
||||
const reminder = buildReminder({ classification, recentFlags: flags, override, activeSkills });
|
||||
|
||||
process.stdout.write(JSON.stringify({
|
||||
hookSpecificOutput: {
|
||||
|
||||
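The active-skill detection added to `main()` above is a first-seen-order dedupe over the session's tool uses. A minimal sketch of just that loop, assuming each tool-use record has the `{ name, input: { skill } }` shape the diff reads from; the helper name `collectActiveSkills` is hypothetical and not part of the hook:

```javascript
// Hypothetical helper mirroring the loop in main(): collects distinct skill
// names from tool-use records, preserving first-seen order.
function collectActiveSkills(toolUses) {
  const seen = new Set();
  const skills = [];
  for (const u of toolUses || []) {
    // Only `Skill` tool uses that carry a skill name are considered.
    const name = u && u.name === 'Skill' && u.input && u.input.skill;
    if (name && !seen.has(name)) {
      seen.add(name);
      skills.push(name);
    }
  }
  return skills;
}

const uses = [
  { name: 'Skill', input: { skill: 'superpowers:test-driven-development' } },
  { name: 'Bash', input: { command: 'git status' } },
  { name: 'Skill', input: { skill: 'superpowers:test-driven-development' } },
];
console.log(collectActiveSkills(uses)); // [ 'superpowers:test-driven-development' ]
```

In the hook itself the same loop is wrapped in `try`/`catch`, so an unreadable transcript yields an empty list rather than an error.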
@@ -66,6 +66,22 @@ describe('enforce-prompt-injection / buildReminder', () => {
|
||||
expect(txt).toMatch(/verify-before-push/);
|
||||
});
|
||||
|
||||
it('reminds about active skills carried over from prior turns (item G)', () => {
|
||||
const txt = buildReminder({
|
||||
classification: null,
|
||||
recentFlags: [],
|
||||
activeSkills: ['superpowers:test-driven-development'],
|
||||
});
|
||||
expect(txt).toMatch(/Active skill/i);
|
||||
expect(txt).toMatch(/test-driven-development/);
|
||||
expect(txt).toMatch(/coverage: skill:/);
|
||||
});
|
||||
|
||||
it('omits the active-skill note when none are active', () => {
|
||||
const txt = buildReminder({ classification: null, recentFlags: [], activeSkills: [] });
|
||||
expect(txt).not.toMatch(/Active skill/i);
|
||||
});
|
||||
|
||||
it('does NOT advertise dead override-vocabulary phrases (v4 stub — 1A 2026-05-31)', () => {
|
||||
const txt = buildReminder({ classification: null, recentFlags: [] });
|
||||
// findOverride/loadOverrideVocab are stubs (vocab removed in v4); advertising the phrases
|
||||
|
||||
@@ -120,6 +120,12 @@ const READING_CMDS = new Set(['ls', 'pwd', 'wc', 'head', 'tail', 'file', 'stat',
|
||||
const SAFE_EXACT = [
|
||||
/^npx\s+vitest\s+(?:run|--version)\b/,
|
||||
/^npm\s+(?:test|run\s+test|run\s+lint(?::[\w-]+)?)\b/,
|
||||
// `npm ci` (2026-05-31, owner-authorized) — clean install from the committed
|
||||
// lockfile (deterministic, no version drift) to restore junction node_modules
|
||||
// in a fresh worktree. Distinct from `npm install`/`npm i`, which stay
|
||||
// hard-blacklisted (line ~60) because they can pull new/updated versions.
|
||||
// `\b` after `ci` prevents `npm cider`-style prefix matches.
|
||||
/^npm\s+ci\b/,
|
||||
/^php\s+artisan\s+(?:list|route:list|migrate:status)\b/,
|
||||
/^composer\s+(?:show|outdated)\b/,
|
||||
/^node\s+(?!.*(?:-e|--eval|-p|--print|-r|--require|--import|--experimental-loader)\b)/,
|
||||
|
||||
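The `\b` after `ci` in the new `SAFE_EXACT` entry is what separates `npm ci` from ci-prefixed tokens. A minimal sketch of that one pattern in isolation; the helper name `isNpmCi` is hypothetical, and the real gate also applies the hard-blacklist and chain splitting, which this sketch does not reproduce:

```javascript
// The same pattern as the SAFE_EXACT entry above.
const NPM_CI = /^npm\s+ci\b/;

// Hypothetical helper: tests only this single pattern, nothing else in the gate.
function isNpmCi(command) {
  return NPM_CI.test(String(command).trim());
}

console.log(isNpmCi('npm ci'));            // true
console.log(isNpmCi('npm ci --no-audit')); // true
console.log(isNpmCi('npm cider'));         // false: no word boundary after "ci"
console.log(isNpmCi('npm install'));       // false: handled by the hard-blacklist
```

Because `\b` is a boundary between a word character and a non-word character, this pattern on its own would also match a hyphenated token such as `npm ci-x`; whether that matters depends on the other checks in the gate.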
@@ -271,6 +271,39 @@ describe('SAFE_EXACT — narrow `cd app` whitelist (2026-05-31, owner-authorized
|
||||
});
|
||||
});
|
||||
|
||||
describe('SAFE_EXACT — npm ci (worktree dep restore, 2026-05-31)', () => {
|
||||
// Allowed: npm ci installs exactly the committed lockfile (deterministic, no
|
||||
// version drift) — needed to restore junction node_modules in a fresh worktree.
|
||||
it.each([
|
||||
'npm ci',
|
||||
'npm ci --no-audit',
|
||||
'npm ci --prefer-offline',
|
||||
])('allows %s', (cmd) => {
|
||||
expect(classifyBashCommand(cmd, {}).result).toBe('allow');
|
||||
});
|
||||
|
||||
// Critical: npm install / npm i remain hard-blacklisted (line 60) — they can
|
||||
// pull new/updated versions, unlike ci which pins to the lockfile.
|
||||
it.each([
|
||||
'npm install',
|
||||
'npm i',
|
||||
'npm install foo',
|
||||
'npm i foo',
|
||||
])('still blocks %s (hard-blacklist)', (cmd) => {
|
||||
expect(classifyBashCommand(cmd, {}).result).toBe('block');
|
||||
});
|
||||
|
||||
// Critical: word boundary — `npm cider` (or any ci-prefixed token) is NOT npm ci
|
||||
it('does not allow ci-prefixed token (word boundary)', () => {
|
||||
expect(classifyBashCommand('npm cider', {}).result).toBe('block');
|
||||
});
|
||||
|
||||
// Critical: chain semantics still enforced — npm ci && rm x → block (rm mutating)
|
||||
it('still blocks chain with mutating part after npm ci', () => {
|
||||
expect(classifyBashCommand('npm ci && rm x', {}).result).toBe('block');
|
||||
});
|
||||
});
|
||||
|
||||
import { stripQuotedSpans } from './enforce-router-gate.mjs';
|
||||
|
||||
describe('quote-aware redirect (quirk 2)', () => {
|
||||
|
||||
@@ -24,6 +24,11 @@ import { readStdin, parseEventJson, exitDecision } from './enforce-hook-helpers.
|
||||
|
||||
const WRITE_TOOLS = new Set(['Edit', 'Write', 'MultiEdit', 'NotebookEdit']);
|
||||
const RUNTIME_RE = /(^|\/)\.claude\/runtime(\/|$)/i;
|
||||
// Transcript protection (Z Part 1): any *.jsonl under ~/.claude/projects/** is a
|
||||
// session/subagent transcript. The tdd-gate credits a subagent's RED from its
|
||||
// agent-<id>.jsonl, so these must be unforgeable by the Write tool. Memory files
|
||||
// there are *.md and never match `.jsonl$`, so memory writes stay allowed.
|
||||
const TRANSCRIPT_RE = /(^|\/)\.claude\/projects\/.*\.jsonl$/i;
|
||||
|
||||
/**
|
||||
* Pure decision.
|
||||
@@ -39,12 +44,19 @@ export function decide({ toolName, filePath, normalizeImpl = pathNormalize }) {
|
||||
if (!fp) return { block: false };
|
||||
let norm;
|
||||
try { norm = normalizeImpl(fp); } catch { return { block: false }; } // cannot determine → fail-open
|
||||
-if (RUNTIME_RE.test(String(norm || ''))) {
+const normStr = String(norm || '');
+if (RUNTIME_RE.test(normStr)) {
|
||||
return {
|
||||
block: true,
|
||||
reason: `Write to «${norm}» denied — ~/.claude/runtime is a protected side-channel (git-approval anchor). Hooks write it via Node fs, not the Write tool.`,
|
||||
};
|
||||
}
|
||||
if (TRANSCRIPT_RE.test(normStr)) {
|
||||
return {
|
||||
block: true,
|
||||
reason: `Write to «${norm}» denied — ~/.claude/projects/**/*.jsonl are session/subagent transcripts (tamper-protected; the tdd-gate trusts them). The harness writes transcripts, never the Write tool. Memory *.md there stays writable.`,
|
||||
};
|
||||
}
|
||||
return { block: false };
|
||||
}
|
||||
|
||||
|
||||
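The two path patterns in this hunk can be exercised on their own. The sketch below assumes the path has already been normalized to forward slashes (the role of `normalizeImpl` in `decide()`); the function name `protectedKind` and the sample paths are hypothetical:

```javascript
// Same patterns as in the hunk above.
const RUNTIME_RE = /(^|\/)\.claude\/runtime(\/|$)/i;
const TRANSCRIPT_RE = /(^|\/)\.claude\/projects\/.*\.jsonl$/i;

// Hypothetical classifier over an already-normalized, forward-slash path.
// Returns which protected area the path falls into, or null if it is writable.
function protectedKind(normalizedPath) {
  const p = String(normalizedPath || '');
  if (RUNTIME_RE.test(p)) return 'runtime';
  if (TRANSCRIPT_RE.test(p)) return 'transcript';
  return null;
}

console.log(protectedKind('/home/u/.claude/projects/slug/sess/subagents/agent-x.jsonl')); // 'transcript'
console.log(protectedKind('/home/u/.claude/projects/slug/memory/feedback_x.md'));         // null
console.log(protectedKind('/home/u/repo/docs/observer/episodes-2026-05.jsonl'));          // null
```

The `.jsonl$` anchor is what keeps memory `*.md` files under the same directory writable, and the `.claude/projects/` prefix keeps `.jsonl` files elsewhere in a repo unaffected.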
@@ -52,3 +52,47 @@ describe('enforce-runtime-write-deny decide()', () => {
|
||||
expect(r.block).toBe(true);
|
||||
});
|
||||
});
|
||||
|
||||
// Part 1 of Z (2026-05-31): close the transcript Write hole. The tdd-gate will
|
||||
// (Part 2) credit a subagent's RED from its agent-<id>.jsonl; that transcript
|
||||
// must therefore be unforgeable. The Write tool was the last ungated channel
|
||||
// into ~/.claude/projects/**/*.jsonl (Bash/PowerShell/Read gates already cover
|
||||
// it). Memory files there are .md and stay writable (they never match .jsonl$).
|
||||
describe('enforce-runtime-write-deny — transcript .jsonl protection (Z Part 1)', () => {
|
||||
it('blocks a Write to a subagent transcript under ~/.claude/projects', () => {
|
||||
const p = join(HOME, '.claude', 'projects', 'slug', 'sess-uuid', 'subagents', 'agent-abc.jsonl');
|
||||
expect(decide({ toolName: 'Write', filePath: p }).block).toBe(true);
|
||||
});
|
||||
|
||||
it('blocks a Write to the controller session transcript itself', () => {
|
||||
const p = join(HOME, '.claude', 'projects', 'slug', 'sess-uuid.jsonl');
|
||||
expect(decide({ toolName: 'Write', filePath: p }).block).toBe(true);
|
||||
});
|
||||
|
||||
it('blocks Edit/MultiEdit/NotebookEdit on a transcript .jsonl too', () => {
|
||||
const p = join(HOME, '.claude', 'projects', 'slug', 'sess', 'subagents', 'agent-x.jsonl');
|
||||
expect(decide({ toolName: 'Edit', filePath: p }).block).toBe(true);
|
||||
expect(decide({ toolName: 'MultiEdit', filePath: p }).block).toBe(true);
|
||||
expect(decide({ toolName: 'NotebookEdit', filePath: p }).block).toBe(true);
|
||||
});
|
||||
|
||||
it('blocks the .-segment evasion into projects transcripts', () => {
|
||||
const evasion = `${HOME_FWD}/.claude/projects/slug/./sess/subagents/agent-x.jsonl`;
|
||||
expect(decide({ toolName: 'Write', filePath: evasion }).block).toBe(true);
|
||||
});
|
||||
|
||||
it('ALLOWS a memory .md under ~/.claude/projects (never a .jsonl)', () => {
|
||||
const p = join(HOME, '.claude', 'projects', 'slug', 'memory', 'feedback_x.md');
|
||||
expect(decide({ toolName: 'Write', filePath: p }).block).toBe(false);
|
||||
});
|
||||
|
||||
it('ALLOWS a .jsonl OUTSIDE ~/.claude/projects (e.g. repo observer episodes)', () => {
|
||||
const p = join(HOME, 'repo', 'docs', 'observer', 'episodes-2026-05.jsonl');
|
||||
expect(decide({ toolName: 'Write', filePath: p }).block).toBe(false);
|
||||
});
|
||||
|
||||
it('ignores non-write tools on a transcript path', () => {
|
||||
const p = join(HOME, '.claude', 'projects', 'slug', 'sess', 'subagents', 'agent-x.jsonl');
|
||||
expect(decide({ toolName: 'Read', filePath: p }).block).toBe(false);
|
||||
});
|
||||
});
|
||||
|
||||
@@ -27,6 +27,7 @@ import {
|
||||
isProductionCodePath,
|
||||
readRouterState,
|
||||
} from './enforce-hook-helpers.mjs';
|
||||
import { join, dirname, basename } from 'node:path';
|
||||
|
||||
const RULE_KEY_TDD = 'tdd-gate';
|
||||
const RULE_KEY_PLAN = 'writing-plans-required';
|
||||
@@ -132,8 +133,56 @@ function hasPlanIndicator(turn) {
|
||||
return false;
|
||||
}
|
||||
|
||||
const AGENT_ID_RE = /agentId:\s*([0-9a-f]+)/i;
|
||||
|
||||
/**
|
||||
* Cross-actor (Z Part 2): extract agentIds of subagents spawned by a `Task`
|
||||
* tool in the controller's current turn. The agentId comes from the harness-
|
||||
* written Task tool_result text ("agentId: <hex>") — the controller cannot forge
|
||||
* a tool_result in its own transcript. Only hex ids are accepted, so a crafted
|
||||
* "agentId: ../../x" cannot become a path-traversal into an arbitrary file.
|
||||
*/
|
||||
export function turnTaskAgentIds(turn) {
|
||||
const taskUseIds = new Set();
|
||||
for (const e of turn || []) {
|
||||
const c = e && e.message && e.message.content;
|
||||
if (!Array.isArray(c)) continue;
|
||||
for (const b of c) {
|
||||
if (b && b.type === 'tool_use' && b.name === 'Task') taskUseIds.add(b.id);
|
||||
}
|
||||
}
|
||||
const ids = [];
|
||||
for (const e of turn || []) {
|
||||
const c = e && e.message && e.message.content;
|
||||
if (!Array.isArray(c)) continue;
|
||||
for (const b of c) {
|
||||
if (!b || b.type !== 'tool_result' || !taskUseIds.has(b.tool_use_id)) continue;
|
||||
const txt = typeof b.content === 'string' ? b.content
|
||||
: Array.isArray(b.content) ? b.content.map((p) => p && p.text).filter(Boolean).join('\n') : '';
|
||||
const m = txt.match(AGENT_ID_RE);
|
||||
if (m) ids.push(m[1]);
|
||||
}
|
||||
}
|
||||
return ids;
|
||||
}
|
||||
|
||||
/**
|
||||
* Derive subagent transcript paths from the controller transcript path and a
|
||||
* list of agentIds. Subagent transcripts live at
|
||||
* <projects>/<slug>/<controller-session>/subagents/agent-<agentId>.jsonl
|
||||
* i.e. nested under the controller session's own directory (bound to it), while
|
||||
* the controller transcript is <...>/<controller-session>.jsonl.
|
||||
*/
|
||||
export function subagentTranscriptPaths(controllerTranscriptPath, agentIds) {
|
||||
const p = String(controllerTranscriptPath || '');
|
||||
if (!p) return [];
|
||||
const dir = dirname(p);
|
||||
const base = basename(p).replace(/\.jsonl$/i, '');
|
||||
return (agentIds || []).map((id) => join(dir, base, 'subagents', `agent-${id}.jsonl`));
|
||||
}
|
||||
|
||||
export function decide({
|
||||
-toolName, filePath, transcriptEntries, classification, override, overridePlan,
+toolName, filePath, transcriptEntries, classification, override, overridePlan, subagentEntriesList = [],
|
||||
}) {
|
||||
if (!['Edit', 'Write', 'MultiEdit'].includes(toolName)) return { block: false };
|
||||
if (!isProductionCodePath(filePath)) return { block: false };
|
||||
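The two helpers added above combine into one step: pull a hex agentId out of a Task result, then derive the subagent transcript path from the controller transcript path. The sketch below is a dependency-free approximation that assumes forward-slash paths instead of using `node:path`; the name `deriveSubagentPath` is hypothetical:

```javascript
// Same pattern as AGENT_ID_RE in the hunk above: hex characters only.
const AGENT_ID_RE = /agentId:\s*([0-9a-f]+)/i;

// Hypothetical variant of subagentTranscriptPaths for a single id.
// ASSUMPTION: forward-slash paths; the real code uses dirname/basename/join.
function deriveSubagentPath(controllerTranscriptPath, agentId) {
  const p = String(controllerTranscriptPath || '');
  const slash = p.lastIndexOf('/');
  const dir = p.slice(0, slash);
  const base = p.slice(slash + 1).replace(/\.jsonl$/i, '');
  return `${dir}/${base}/subagents/agent-${agentId}.jsonl`;
}

const ok = 'done. agentId: a1b2c3d4e5'.match(AGENT_ID_RE);
const bad = 'agentId: ../../evil'.match(AGENT_ID_RE);

console.log(ok && ok[1]); // 'a1b2c3d4e5'
console.log(bad);         // null: "." is not a hex character
console.log(deriveSubagentPath('/p/projects/slug/sessUUID.jsonl', 'a1b2'));
// '/p/projects/slug/sessUUID/subagents/agent-a1b2.jsonl'
```

Since the capture group admits only `[0-9a-f]`, the captured id can never contain `/` or `.`, so the derived path stays inside the controller session's `subagents` directory.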
@@ -155,24 +204,31 @@ export function decide({
|
||||
}
|
||||
}
|
||||
|
||||
// Rule #3 — TDD gate.
|
||||
// Rule #3 — TDD gate. Credit the controller's own turn OR a subagent that was
|
||||
// spawned by a Task in this turn (cross-actor, Z Part 2). Subagent evidence is
|
||||
// read from its agent-<id>.jsonl, which is tamper-protected by the transcript
|
||||
// Write-deny (Z Part 1) — so crediting it does not open a forgery channel.
|
||||
if (override) return { block: false };
|
||||
-const hasTest = hasMatchingTestEdit(turn, filePath);
+const subList = Array.isArray(subagentEntriesList) ? subagentEntriesList : [];
+const hasTest = hasMatchingTestEdit(turn, filePath) || subList.some((es) => hasMatchingTestEdit(es, filePath));
|
||||
if (!hasTest) {
|
||||
return {
|
||||
block: true,
|
||||
message: [
|
||||
`[enforce-tdd-gate] Production code edit on "${filePath}" without preceding test edit.`,
|
||||
-`Write the failing test FIRST in the corresponding *.test.mjs / *.spec.ts / *Test.php.`,
+`Write the failing test FIRST in the corresponding *.test.mjs / *.spec.ts / *Test.php`,
+`(a subagent's test edit, if it was spawned by a Task in this turn, is also credited).`,
|
||||
`Then run vitest/pest to confirm RED, then return to this prod-code Edit.`,
|
||||
].join('\n'),
|
||||
};
|
||||
}
|
||||
-if (!hasFailingTestRun(turn)) {
+const hasRed = hasFailingTestRun(turn) || subList.some((es) => hasFailingTestRun(es));
+if (!hasRed) {
|
||||
return {
|
||||
block: true,
|
||||
message: [
|
||||
-`[enforce-tdd-gate] Test was edited but no vitest/pest run with RED output observed in this turn.`,
+`[enforce-tdd-gate] Test was edited but no vitest/pest run with RED output observed in this turn`,
+`(nor in any in-turn subagent transcript).`,
|
||||
`Run the test suite (vitest run <test-file> / composer test) to confirm RED before prod-code edit.`,
|
||||
].join('\n'),
|
||||
};
|
||||
@@ -199,7 +255,19 @@ async function main() {
|
||||
task_type: state.classification.task_type,
|
||||
} : null;
|
||||
|
||||
-const result = decide({ toolName, filePath, transcriptEntries: transcript, classification, override, overridePlan });
|
||||
// Cross-actor (Z Part 2): read transcripts of subagents spawned by a Task in
|
||||
// this turn, bound to the controller session via the derived path. Best-effort
|
||||
// — a missing/unreadable subagent transcript just yields no extra credit
|
||||
// (stricter), never an error.
|
||||
let subagentEntriesList = [];
|
||||
try {
|
||||
const turn = lastTurnEntries(transcript);
|
||||
const agentIds = turnTaskAgentIds(turn);
|
||||
const paths = subagentTranscriptPaths(event.transcript_path, agentIds);
|
||||
subagentEntriesList = paths.map((p) => readTranscript(p)).filter((e) => Array.isArray(e) && e.length);
|
||||
} catch { subagentEntriesList = []; }
|
||||
|
||||
const result = decide({ toolName, filePath, transcriptEntries: transcript, classification, override, overridePlan, subagentEntriesList });
|
||||
exitDecision(result);
|
||||
} catch {
|
||||
exitDecision({ block: false });
|
||||
|
||||
@@ -1,5 +1,79 @@
|
||||
import { describe, it, expect } from 'vitest';
|
||||
-import { decide } from './enforce-tdd-gate.mjs';
+import { decide, turnTaskAgentIds, subagentTranscriptPaths } from './enforce-tdd-gate.mjs';
|
||||
|
||||
// Z Part 2 (2026-05-31): the tdd-gate must credit a subagent's test edit + RED
|
||||
// when that subagent was spawned by a Task in the controller's current turn.
|
||||
// Pairs with the transcript Write-hole closed in enforce-runtime-write-deny.mjs
|
||||
// (Z Part 1) so the credited agent-<id>.jsonl cannot be forged.
|
||||
describe('enforce-tdd-gate Z cross-actor (pairs with enforce-runtime-write-deny Part 1)', () => {
|
||||
const subagentRedRun = [
|
||||
{ message: { role: 'user', content: 'write the failing test for foo and confirm RED' } },
|
||||
{ message: { role: 'assistant', content: [
|
||||
{ type: 'tool_use', id: 's1', name: 'Write', input: { file_path: 'tools/foo.test.mjs' } },
|
||||
{ type: 'tool_use', id: 's2', name: 'Bash', input: { command: 'npx vitest run tools/foo.test.mjs' } },
|
||||
] } },
|
||||
{ message: { role: 'user', content: [ { type: 'tool_result', tool_use_id: 's2', content: 'Tests 1 failed | 0 passed' } ] } },
|
||||
];
|
||||
|
||||
it('credits a subagent test edit + RED for the controller prod edit', () => {
|
||||
const r = decide({
|
||||
toolName: 'Edit',
|
||||
filePath: 'tools/foo.mjs',
|
||||
transcriptEntries: [
|
||||
{ message: { role: 'user', content: 'delegate the test, then I implement' } },
|
||||
{ message: { role: 'assistant', content: [ { type: 'tool_use', id: 't1', name: 'Task', input: { subagent_type: 'tester' } } ] } },
|
||||
{ message: { role: 'user', content: [ { type: 'tool_result', tool_use_id: 't1', content: 'done. agentId: a1234abcd' } ] } },
|
||||
],
|
||||
subagentEntriesList: [subagentRedRun],
|
||||
});
|
||||
expect(r.block).toBe(false);
|
||||
});
|
||||
|
||||
it('still blocks when subagent edited a test but NO RED exists anywhere', () => {
|
||||
const subNoRed = [
|
||||
{ message: { role: 'user', content: 'write test' } },
|
||||
{ message: { role: 'assistant', content: [ { type: 'tool_use', id: 's1', name: 'Write', input: { file_path: 'tools/foo.test.mjs' } } ] } },
|
||||
];
|
||||
const r = decide({
|
||||
toolName: 'Edit', filePath: 'tools/foo.mjs',
|
||||
transcriptEntries: [ { message: { role: 'user', content: 'go' } } ],
|
||||
subagentEntriesList: [subNoRed],
|
||||
});
|
||||
expect(r.block).toBe(true);
|
||||
expect(r.message).toMatch(/RED/);
|
||||
});
|
||||
|
||||
it('preserves old behavior when no subagent entries (blocks without test)', () => {
|
||||
const r = decide({
|
||||
toolName: 'Edit', filePath: 'tools/foo.mjs',
|
||||
transcriptEntries: [ { message: { role: 'user', content: 'go' } } ],
|
||||
subagentEntriesList: [],
|
||||
});
|
||||
expect(r.block).toBe(true);
|
||||
expect(r.message).toMatch(/without preceding test edit/);
|
||||
});
|
||||
|
||||
it('turnTaskAgentIds extracts a hex agentId from an in-turn Task tool_result', () => {
|
||||
const turn = [
|
||||
{ message: { role: 'assistant', content: [ { type: 'tool_use', id: 't1', name: 'Task', input: {} } ] } },
|
||||
{ message: { role: 'user', content: [ { type: 'tool_result', tool_use_id: 't1', content: 'ok agentId: a1b2c3d4e5' } ] } },
|
||||
];
|
||||
expect(turnTaskAgentIds(turn)).toContain('a1b2c3d4e5');
|
||||
});
|
||||
|
||||
it('turnTaskAgentIds ignores non-Task results and rejects non-hex ids (no path traversal)', () => {
|
||||
const turn = [
|
||||
{ message: { role: 'assistant', content: [ { type: 'tool_use', id: 'b1', name: 'Bash', input: {} } ] } },
|
||||
{ message: { role: 'user', content: [ { type: 'tool_result', tool_use_id: 'b1', content: 'agentId: ../../evil' } ] } },
|
||||
];
|
||||
expect(turnTaskAgentIds(turn)).toHaveLength(0);
|
||||
});
|
||||
|
||||
it('subagentTranscriptPaths derives <dir>/<sessbase>/subagents/agent-<id>.jsonl', () => {
|
||||
const paths = subagentTranscriptPaths('/p/projects/slug/sessUUID.jsonl', ['a1b2']);
|
||||
expect(paths[0].split('\\').join('/')).toBe('/p/projects/slug/sessUUID/subagents/agent-a1b2.jsonl');
|
||||
});
|
||||
});
|
||||
|
||||
function userMsg(text) {
|
||||
return { message: { role: 'user', content: text } };
|
||||
|
||||
@@ -24,7 +24,7 @@ export function computeWorkspaceHash(workspacePath) {
|
||||
return createHash('md5').update(String(workspacePath || ''), 'utf-8').digest('hex').slice(0, 12);
|
||||
}
|
||||
|
||||
-function isStale(record, now) {
+export function isStale(record, now) {
|
||||
if (!record || typeof record !== 'object') return true;
|
||||
const ttl = typeof record.ttl_ms === 'number' ? record.ttl_ms : LOCK_DEFAULT_TTL_MS;
|
||||
return now - (record.acquired_at || 0) > ttl;
|
||||
|
||||
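Exporting `isStale` lets a prune step reuse the same staleness rule as lock acquisition. The sketch below restates the function and pairs it with a hypothetical `pruneSketch` over an in-memory map. The TTL value is an illustrative assumption (the real `LOCK_DEFAULT_TTL_MS` is not shown in this diff), and `pruneSketch` is not the wrapper's actual `pruneStaleLocks`:

```javascript
// ASSUMPTION: illustrative value only; the real constant lives in
// parallel-session-lock.mjs and is not shown in this diff.
const LOCK_DEFAULT_TTL_MS = 60_000;

// Same logic as the exported isStale above.
function isStale(record, now) {
  if (!record || typeof record !== 'object') return true;
  const ttl = typeof record.ttl_ms === 'number' ? record.ttl_ms : LOCK_DEFAULT_TTL_MS;
  return now - (record.acquired_at || 0) > ttl;
}

// Hypothetical prune over a { name: record } map: drops stale records and
// reports how many were dropped. Fresh (active) locks are always kept.
function pruneSketch(records, now) {
  const kept = {};
  let pruned = 0;
  for (const [name, rec] of Object.entries(records)) {
    if (isStale(rec, now)) pruned += 1;
    else kept[name] = rec;
  }
  return { kept, pruned };
}

const result = pruneSketch(
  { old: { acquired_at: 0, ttl_ms: 100 }, fresh: { acquired_at: 900, ttl_ms: 1000 } },
  1000,
);
console.log(result.pruned);            // 1
console.log(Object.keys(result.kept)); // [ 'fresh' ]
```

Because the prune decision goes through the same function as acquisition, the two cannot disagree about whether a lock is still active.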
@@ -6,6 +6,7 @@ import {
|
||||
release,
|
||||
refresh,
|
||||
computeWorkspaceHash,
|
||||
isStale,
|
||||
LOCK_DEFAULT_TTL_MS,
|
||||
} from './parallel-session-lock.mjs';
|
||||
|
||||
@@ -91,6 +92,26 @@ describe('parallel-session-lock pure module (Stream H Task 7)', () => {
|
||||
});
|
||||
});
|
||||
|
||||
// isStale is exported (B, 2026-05-31) so the wrapper's prune step reuses the
|
||||
// EXACT same staleness definition — single source of truth, no divergence that
|
||||
// could ever prune a still-fresh (active) lock.
|
||||
describe('isStale (exported for prune support)', () => {
|
||||
it('true when now - acquired_at exceeds ttl_ms', () => {
|
||||
expect(isStale({ acquired_at: 0, ttl_ms: 100 }, 1000)).toBe(true);
|
||||
});
|
||||
it('false when still within ttl (active lock — never pruned)', () => {
|
||||
expect(isStale({ acquired_at: 900, ttl_ms: 1000 }, 1000)).toBe(false);
|
||||
});
|
||||
it('true for a malformed/missing record', () => {
|
||||
expect(isStale(null, 1000)).toBe(true);
|
||||
expect(isStale(undefined, 1000)).toBe(true);
|
||||
});
|
||||
it('uses the default TTL when ttl_ms is absent', () => {
|
||||
expect(isStale({ acquired_at: 0 }, LOCK_DEFAULT_TTL_MS + 1)).toBe(true);
|
||||
expect(isStale({ acquired_at: 0 }, LOCK_DEFAULT_TTL_MS - 1)).toBe(false);
|
||||
});
|
||||
});
|
||||
|
||||
describe('computeWorkspaceHash (Stream H Task 7)', () => {
|
||||
it('returns 12 hex chars', () => {
|
||||
const h = computeWorkspaceHash('/some/path');
|
||||
|
||||