Compare commits

7 Commits

Author SHA1 Message Date
Дмитрий 6ce2f0058d fix(router-gate): session-lock skips readonly Bash (scope calibration)
The parallel-session-lock fired on every PreToolUse tool, blocking even
readonly Bash (git status/log/diff, cat, grep, ls) from a peer session.
The lock's purpose is to serialize concurrent FILE MUTATION on the same
worktree; readonly commands mutate nothing, so they are outside that scope.

isReadonlyBashEvent() reuses the router-gate Bash classifier: a command counts as
readonly only when the classifier returns an allow-verdict whose reason mentions
readonly/reading. This mirrors the LLM-judge readonly calibration. main()
short-circuits readonly Bash to allow without
acquiring/blocking. Mutating tools, git commit/push, dangerous Bash, and
every non-Bash tool still acquire/check the lock — same-worktree mutation
serialization is unchanged (scope fix, NOT a discipline drop).

TDD: +6 unit tests. Full tools-vitest 2038 passed / 2 skipped.
2026-06-01 07:46:26 +03:00
Дмитрий d35fefddd9 ci(a11y): bump Pa11y workflow Node 20 -> 22 (cspell@10 engine requirement)
The a11y (Pa11y live) PR check failed at "Install root JS deps": root `npm ci`
hits EBADENGINE because @cspell/cspell-*@10.0.0 require Node >=22.18.0 while the
workflow pinned Node 20. Pre-existing mismatch (cspell ^10 predates this branch
and fails identically on main), unrelated to the discipline-guard hook changes.
Node 22 satisfies both the repo engines (>=20) and cspell (>=22.18).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 19:00:05 +03:00
Дмитрий e56ddd6a1b fix(router-gate): coverage line honors cross-turn active skill (verify + remind)
Backlog item G. The `coverage:` line under-reported a skill chosen in a PRIOR turn:
enforce-coverage-verify credited channel=skill only if the Skill tool ran in the
CURRENT turn, so an honest `skill:X` continuation line was BLOCKED -> the controller
learned to under-report as direct/chain. Two-sided systemic fix, no weakening:

- enforce-coverage-verify: decide() also accepts skill:X when X was invoked anywhere
  earlier in THIS session (new priorSkillNames param; main() collects them via
  sessionToolUses). Still unforgeable -- a real Skill tool_use must exist in the
  transcript. The only residual is possibly-stale attribution, far better than the
  forced dishonest direct-reporting it replaces.
- enforce-prompt-injection: the §17 reminder now lists active skills carried over
  from earlier turns (read from the transcript) and tells the controller to report
  `coverage: skill:<name>` when work continues under one -- the proactive half, so
  the correct line is not merely allowed but prompted.

TDD: RED -> GREEN per behavior. tools-vitest 2032 passed / 2 skipped.
Plan docs/superpowers/plans/2026-05-31-discipline-guard-backlog.md (item G).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 18:37:44 +03:00
Дмитрий 53407a77cd feat(router-gate): tdd-gate credits delegated (subagent) TDD + transcript write-deny
Closes the TDD-gate cross-actor gap: when a subagent (spawned by a Task in the
controller's current turn) writes the failing test and confirms RED, the
controller's subsequent production edit was falsely blocked because the gate only
scanned the controller's own turn. Net strengthening, no discipline weakened.

- Part 1 (enforce-runtime-write-deny): block the Write tool from writing to any
  ~/.claude/projects/**/*.jsonl (session/subagent transcripts). Memory *.md files there
  stay writable (they never match .jsonl$). A resolving path normalizer defeats ./.. evasion.
  This makes the agent-<id>.jsonl that Part 2 trusts unforgeable (it was the last
  ungated write channel; Bash/PowerShell/Read gates already covered it).
- Part 2 (enforce-tdd-gate): decide() also credits a subagent's matching test edit
  + RED via a new subagentEntriesList. turnTaskAgentIds() reads the hex agentId from
  the harness-written Task tool_result (the controller cannot forge its own
  tool_result; hex-only match blocks "agentId: ../../x" path traversal).
  subagentTranscriptPaths() derives <dir>/<controller-session>/subagents/agent-<id>.jsonl.
  main() reads them best-effort (missing/unreadable -> no extra credit = stricter).

No new weakening: a delegated subagent doing real TDD is legitimate; the only
forgery vector (overwriting the agent jsonl) is closed by Part 1. Existing
controller-turn behaviour is preserved (empty subagent list == old logic).

OWNER (settings.json, Claude can't edit it): enforce-tdd-gate is already a
registered PreToolUse hook -> Part 2 goes live on merge. enforce-runtime-write-deny
must be registered on PreToolUse(Edit|Write|MultiEdit|NotebookEdit) for Part 1 to be live.

TDD: RED -> GREEN per behavior. tools-vitest 2027 passed / 2 skipped.
Backlog item C (=Z); plan docs/superpowers/plans/2026-05-31-discipline-guard-backlog.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 18:18:44 +03:00
Дмитрий 6577c04a1f fix(router-gate): session-lock hygiene — clearer block message + stale-lock prune
Closes the remaining parallel-session-lock remarks on top of the keying fix
(7a469dc9), with NO weakening of same-worktree serialization:

- D: the block message now identifies the holder by its STABLE session_id and
  marks the recorded pid as transient ("may change between attempts"). Chasing
  the pid is what led to closing the wrong session. Decision logic is unchanged
  (text only) — existing /pid N/ triage assertion still holds.
- B: pruneStaleLocks() best-effort deletes leaked lock files that are ALREADY
  stale by the shared isStale() definition (now exported from the pure module —
  single source of truth). Active within-TTL locks are never touched, so the
  serialization guarantee is not weakened. Wired into the PreToolUse branch of
  main(), wrapped so hygiene can never break the gate (fail-open).
- C (no code): release-on-SessionEnd needs only a settings.json registration
  (owner action) — the existing !tool_name branch already releases. Documented
  in the plan. Until then, leaked locks self-heal via B + the 5-min TTL takeover.

TDD: RED -> GREEN per behavior. tools-vitest 2014 passed / 2 skipped.
Backlog items B/C/D; plan docs/superpowers/plans/2026-05-31-discipline-guard-backlog.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 17:43:03 +03:00
Дмитрий 7a469dc913 fix(router-gate): key session-lock by session work-tree root, not hook cwd
enforce-parallel-session-lock keyed the lock on the hook's process.cwd(),
which collapses to the main repo dir after a session resume — so sessions in
DIFFERENT git worktrees shared one lock and false-blocked each other (observed:
a brainrepo-worktree session was blocked from launching agents by a
discipline-guard session). The new resolveWorkspacePath() keys on the session's stable cwd
(event.cwd) resolved to the git work-tree root (git -C <cwd> rev-parse
--show-toplevel), with fallback to process.cwd() so behaviour never regresses
when event.cwd is absent. Same-worktree concurrency stays serialized
(unchanged) — discipline not weakened; only cross-worktree false-blocks fixed.

TDD: RED (5 resolveWorkspacePath cases) -> GREEN -> tools-vitest 2003 passed /
2 skipped. Backlog item F; plan
docs/superpowers/plans/2026-05-31-discipline-guard-backlog.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 17:02:32 +03:00
Дмитрий be4e1a6123 feat(router-gate): whitelist npm ci in SAFE_EXACT (worktree dep restore)
`npm ci` does a clean install strictly from the committed lockfile
(deterministic, no version drift) — needed to restore junction node_modules
in a fresh worktree. Distinct from `npm install`/`npm i`, which stay
hard-blacklisted because they can pull new/updated versions; the blacklist
runs before the whitelist, so they remain blocked. Word boundary after `ci`
prevents `npm cider`-style prefix matches; chain semantics still block
`npm ci && <mutating>`.

TDD: RED (3 allow-cases failed default-deny) -> GREEN (/^npm\s+ci\b/) ->
tools-vitest 1998 passed / 2 skipped (2000). Backlog item A; plan
docs/superpowers/plans/2026-05-31-discipline-guard-backlog.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 14:46:58 +03:00
16 changed files with 795 additions and 24 deletions
+2 -2
@@ -21,10 +21,10 @@ jobs:
  extensions: pdo, pdo_pgsql, redis, mbstring, intl, bcmath
  coverage: none
- - name: Setup Node 20
+ - name: Setup Node 22
  uses: actions/setup-node@v4
  with:
- node-version: '20'
+ node-version: '22'
  cache: 'npm'
  - name: Install root JS deps
@@ -0,0 +1,144 @@
# Discipline-guard backlog — router-gate `tools/enforce-*.mjs`
**Worktree:** `.claude/worktrees/discipline-guard` (branch `worktree-discipline-guard`).
**Date:** 2026-05-31. Owner-authorized backlog after quirk-2 + 1A closure (commit `b0cd18d7`).
## Context (already done — do NOT redo)
- **Quirk 2** — redirect detector is quote-aware (`stripQuotedSpans` in `tools/enforce-router-gate.mjs`): `>`/`2>` inside quotes no longer false-blocks. Commit `b0cd18d7`.
- **1A** — removed advertising of dead override phrases (`findOverride` is a v4 stub) from `enforce-prompt-injection` + verify-before-push / coverage-verify / memory-coverage / tdd-gate. Locked by negative tests. Same commit.
- Marketing MCP servers cut from `.mcp.json` (commit `63100dec`).
## Deliberately NOT doing (these are defense lines, not bugs)
- Calibration 6 of the judge (reading chat context) — weakens in-session defense.
- Quirk 3 (loosen exact-match of git approval) — that exact-match is an anti-injection property.
## Backlog (by priority)
### A. `npm ci` in router-gate whitelist (`SAFE_EXACT` in `tools/enforce-router-gate.mjs`) ← current
Restoring locked dependencies is safe and closes worktree-setup friction. `npm ci` installs
exactly the committed lockfile (deterministic, no version drift) — unlike `npm install`/`npm i`,
which stay hard-blacklisted because they can pull new/updated versions.
**TDD:**
1. RED — new describe block in `tools/enforce-router-gate.test.mjs`: allow `npm ci`,
`npm ci --no-audit`, `npm ci --prefer-offline`; still block `npm install`/`npm i`/
`npm install foo`/`npm i foo` (hard-blacklist), `npm cider` (word boundary → default-deny),
`npm ci && rm x` (chain mutating).
2. GREEN — add `/^npm\s+ci\b/` to `SAFE_EXACT` with rationale comment. `\b` prevents
`npm cider`-style prefix matches. Blacklist runs before whitelist, so `npm install`/`npm i`
stay blocked (the `i`-alternative needs `i` right after the space; `npm ci` has `c` there).
3. tools-vitest full run (also the push sentinel).
4. Commit via AskUserQuestion (label = exact command).
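
A minimal sketch of the ordering described in steps 1 and 2, assuming a classifier shaped roughly like this. Only `SAFE_EXACT` and the `/^npm\s+ci\b/` pattern come from the plan. `HARD_BLACKLIST`, its regex, `classifySegment`, `classifyCommand`, and the naive chain splitting are hypothetical stand-ins, not the real `tools/enforce-router-gate.mjs` code.

```javascript
// Hypothetical blacklist entry: matches `npm install` and `npm i`, never `npm ci`,
// because the alternation must start right after the whitespace.
const HARD_BLACKLIST = [/^npm\s+(?:install|i)\b/];

// Pattern quoted in the plan: `\b` rejects prefix look-alikes such as `npm cider`.
const SAFE_EXACT = [/^npm\s+ci\b/];

// Classify one command segment: blacklist first, then whitelist, then default-deny.
function classifySegment(segment) {
  const cmd = segment.trim();
  if (HARD_BLACKLIST.some((re) => re.test(cmd))) return 'deny';
  if (SAFE_EXACT.some((re) => re.test(cmd))) return 'allow';
  return 'deny';
}

// Assumed chain handling (a simplification): split on &&, ||, ; and | and
// require every segment to be allowed on its own.
function classifyCommand(command) {
  const segments = String(command).split(/&&|\|\||;|\|/);
  return segments.every((s) => classifySegment(s) === 'allow') ? 'allow' : 'deny';
}
```

Under these assumptions `classifyCommand('npm ci --prefer-offline')` yields `'allow'`, while `npm i foo`, `npm cider`, and `npm ci && rm x` all yield `'deny'`.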
### B. Cosmetic path strings in gate messages
`c:/` vs `/c/`, unexpanded `$env:` in gate messages. Polish only.
### F. Parallel-session-lock false cross-worktree collision (2026-05-31, owner-raised)
Symptom: a session in worktree `discipline-guard` was blocked by
`enforce-parallel-session-lock` (held by another session `7f6efd48`, pid changed
12552→19044 across attempts → holder still active; pid is the transient hook-node pid,
session_id is the stable identity).
**Investigation (read-only):**
- Lock keyed by `computeWorkspaceHash(process.cwd())` = md5(cwd).slice(0,12); file
`~/.claude/runtime/session-lock-<hash>.json`; release only on Stop; TTL 5 min.
- 9 lock files accumulated → stale files leak when a session closes without a clean Stop.
- `enforce-branch-switch` read branch "worktree-discipline-guard" via
`git branch --show-current` from `process.cwd()` → the hook's cwd IS the worktree →
**keying is already per-worktree** (NOT coarse main-dir). So the holder shared this
worktree's hash → genuine same-worktree concurrency, the lock working as designed —
NOT a false positive. Do NOT re-key (would weaken same-tree serialization).
**Genuinely-fixable part (no weakening):** leaked lock on close-without-Stop blocks the next
same-worktree session for up to TTL. Fix: release on SessionEnd (not only Stop) + prune
stale lock files on acquire. Ground-truth the lock JSON before coding.
**Closure (2026-05-31).** All keying/hygiene/UX parts done, no discipline weakened:
- **A — keying by worktree root** (`resolveWorkspacePath`, commit `7a469dc9`): keys the
lock on the session's stable `event.cwd` → git toplevel, not the volatile hook
`process.cwd()` (which collapses to main on resume → cross-worktree false-blocks).
Same-worktree serialization unchanged; fallback to `process.cwd()` if `event.cwd` absent.
- **D — clearer block message**: identifies the holder by its STABLE `session_id`; marks
the recorded pid as transient ("may change between attempts"). Chasing the pid was what
led to closing the wrong session. Logic untouched (text only).
- **B — `pruneStaleLocks`**: best-effort delete of leaked lock files that are ALREADY
stale by the shared `isStale()` (now exported — single source of truth). Active
within-TTL locks are never touched → serialization not weakened. Wired into the
PreToolUse branch of `main()`, wrapped so hygiene can never break the gate.
- **C — release on SessionEnd**: NO new code. The existing `!event.tool_name` branch
already releases. To make release fire on session end (not only on Stop turns),
**OWNER ACTION in `.claude/settings.json`**: add `enforce-parallel-session-lock.mjs`
to the `SessionEnd` hook array (it already runs on `Stop`). Pure config; Claude cannot
edit settings.json. Until added, leaked locks are still self-healing via B (prune) +
the 5-min TTL takeover — so this is a reliability nicety, not a correctness gap.
- **E/F — live**: fix is on branch `worktree-discipline-guard`; the live hook executes
from `tools/` on **main**, so it is active only after merge to main. Runtime
effectiveness of A depends on the PreToolUse payload carrying `cwd`; if absent, the
safe fallback = prior behavior (no regression). Verify on main.
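
The interplay of the 5-min TTL, same-session refresh, and stale takeover can be sketched as a pure function. This is an illustration under stated assumptions: the real `acquire()` and `isStale()` live in `parallel-session-lock.mjs` and are not shown here. The record fields follow the fixtures used in the tests, the staleness rule is modeled on the test's `isStaleFn`, and `tryAcquire` and `DEFAULT_TTL_MS` are hypothetical names.

```javascript
const DEFAULT_TTL_MS = 5 * 60 * 1000; // 5-minute TTL, as stated in the plan

// A missing record, or one older than its TTL, counts as stale.
function isStale(rec, now) {
  if (!rec) return true;
  const ttl = rec.ttl_ms || DEFAULT_TTL_MS;
  return now - (rec.acquired_at || 0) > ttl;
}

// Hypothetical pure acquire: returns the record to write, or the current holder.
function tryAcquire({ existing, sessionId, pid, now }) {
  const fresh = {
    schema_version: 1,
    session_id: sessionId,
    pid,
    acquired_at: now,
    ttl_ms: DEFAULT_TTL_MS,
  };
  // No lock, or a leaked lock past its TTL: take it over.
  if (isStale(existing, now)) return { acquired: true, record: fresh };
  // Same session: refresh the timestamp instead of blocking ourselves.
  if (existing.session_id === sessionId) return { acquired: true, record: fresh };
  // Active lock held by a different session: report the holder.
  return { acquired: false, holder: existing };
}
```

Because pruning uses the same staleness predicate as acquisition, a lock that would still block a peer is never one that pruning would delete.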
### C. TDD-gate cross-actor — chosen: **Z** (full, 2026-05-31; on hold behind F)
`enforce-tdd-gate` does not see test edits made by a subagent (scans only the controller's
own turn; subagent test edit + RED live in `agent-<id>.jsonl`). **Z = Part 1 (close the
projects/ Write hole — verified prerequisite) then Part 2 (read subagent transcript bound to
a Task in this turn).** Condition 1 verified VIOLATED (no Write-tool gate covers
`~/.claude/projects/`), so Variant 1 alone would weaken — safe only bundled with Part 1.
**Closure (2026-05-31, TDD, no discipline weakened — net strengthening):**
- **Part 1** — `enforce-runtime-write-deny.mjs` extended with `TRANSCRIPT_RE`
(`(^|/)\.claude/projects/.*\.jsonl$`): the Write tool can no longer create/overwrite any
session/subagent transcript `.jsonl`. Memory files there are `.md` and stay writable
(never match `.jsonl$`). Resolving normalizer blocks `.`/`..` evasion. This makes the
agent-`<id>`.jsonl that Part 2 trusts unforgeable.
- **Part 2** — `enforce-tdd-gate.mjs`: `decide()` now also credits a subagent's matching
test edit + RED run via new `subagentEntriesList`. `turnTaskAgentIds(turn)` extracts the
**hex** agentId from the harness-written `Task` tool_result ("agentId: <hex>") — the
controller cannot forge its own tool_result, and the hex-only match blocks
`agentId: ../../x` path-traversal. `subagentTranscriptPaths()` derives
`<dir>/<controller-session>/subagents/agent-<id>.jsonl` (bound to the controller session).
`main()` reads those transcripts best-effort (missing → no extra credit = stricter, never
an error). No NEW weakening: a delegated subagent doing real TDD is legitimate; the only
forgery vector (overwrite the agent jsonl) is closed by Part 1.
- Full tools-vitest: **2027 passed / 2 skipped**.
- **OWNER ACTION (settings.json, Claude can't edit it):** `enforce-tdd-gate.mjs` is already
a registered PreToolUse hook → Part 2 goes live on merge. **Part 1 requires that
`enforce-runtime-write-deny.mjs` be registered** on PreToolUse(Edit|Write|MultiEdit|
NotebookEdit); if it is not yet registered, the transcript Write-deny is inert until added.
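
A rough sketch of the two checks that Parts 1 and 2 rely on. The `TRANSCRIPT_RE` pattern and the `<dir>/<controller-session>/subagents/agent-<id>.jsonl` layout are quoted from the closure notes. `normalizePath`, `isTranscriptWrite`, `extractAgentId`, `subagentTranscriptPath`, and the exact hex-matching regex are hypothetical illustrations and may differ from the real hooks.

```javascript
// Pattern quoted in Part 1.
const TRANSCRIPT_RE = /(^|\/)\.claude\/projects\/.*\.jsonl$/;

// Hypothetical stand-in for the "resolving normalizer": collapses "." and ".."
// segments so a path cannot hide the protected directory behind them.
function normalizePath(p) {
  const out = [];
  for (const seg of String(p).replace(/\\/g, '/').split('/')) {
    if (seg === '' || seg === '.') continue;
    if (seg === '..') { out.pop(); continue; }
    out.push(seg);
  }
  return '/' + out.join('/');
}

// Part 1: would this Write target a session/subagent transcript?
function isTranscriptWrite(filePath) {
  return TRANSCRIPT_RE.test(normalizePath(filePath));
}

// Part 2 (assumed tool_result text containing "agentId: <hex>"): accept only a
// hex token followed by whitespace or end of text, so "../../x" never matches.
function extractAgentId(toolResultText) {
  const m = /agentId:\s*([0-9a-f]+)(?=\s|$)/i.exec(String(toolResultText));
  return m ? m[1] : null;
}

// Transcript path bound to the controller session, per the layout in Part 2.
function subagentTranscriptPath(dir, controllerSession, agentId) {
  return `${dir}/${controllerSession}/subagents/agent-${agentId}.jsonl`;
}
```

The hex-only match means the id can be interpolated into a path without further escaping, and a `null` result simply yields no extra credit, which is the stricter outcome.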
### G. Coverage line under-reports cross-turn active skill (2026-05-31, owner-raised)
Symptom: the `coverage: <channel>:<id>` line says `direct`/`chain` when a skill chosen in a
PRIOR turn is still active in the current turn. Root cause: `enforce-coverage-verify.mjs`
credits `channel=skill` only if the `Skill` tool was invoked in the CURRENT turn
(`turnToolUses`). On a continuation turn (skill still active, not re-invoked) an honest
`skill:X` line would be BLOCKED → so the controller learns to under-report as `direct`/`chain`.
**Fix (no weakening):** also credit `skill:X` if X was invoked anywhere earlier in THIS
session (a real `Skill` tool_use in the transcript — still unforgeable). decide() gains a
`priorSkillNames` param; main() collects session-wide Skill names via `sessionToolUses`.
Residual: attribution may be stale (skill invoked long ago) — acceptable; the alternative
(forced dishonest `direct`) is worse, and the owner wants cross-turn skills honored.
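
The difference between the old turn-scoped rule and the new session-scoped rule can be sketched as follows. The transcript is modeled here as an array of turns, each an array of tool uses. That shape and the helper names `skillNames`, `creditedThisTurnOnly`, and `creditedSessionWide` are assumptions for illustration; only the `superpowers:` prefix normalization mirrors the hook.

```javascript
// Strip the optional "superpowers:" prefix so both spellings compare equal.
const norm = (s) => String(s || '').replace(/^superpowers:/, '');

// Names of all Skill tool uses in a flat list of tool uses.
function skillNames(toolUses) {
  return toolUses
    .filter((u) => u.name === 'Skill' && u.input && u.input.skill)
    .map((u) => u.input.skill);
}

// Old rule: only Skill calls in the current (last) turn count.
function creditedThisTurnOnly(turns, claimedId) {
  const current = turns[turns.length - 1] || [];
  return skillNames(current).some((n) => norm(n) === norm(claimedId));
}

// New rule: a Skill call anywhere in the session counts.
function creditedSessionWide(turns, claimedId) {
  return skillNames(turns.flat()).some((n) => norm(n) === norm(claimedId));
}

// Turn 1 invokes the skill; turn 2 continues under it without re-invoking.
const session = [
  [{ name: 'Skill', input: { skill: 'superpowers:test-driven-development' } }],
  [{ name: 'Edit', input: { file_path: 'foo.mjs' } }],
];
```

For `session`, the turn-scoped check rejects `test-driven-development` on the second turn while the session-scoped check accepts it. A skill that was never invoked is rejected by both.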
### D. Smoke 8 — live Workflow-gate F2 test
Needs a clean session (not code).
### E. H10 — auto-bootstrap worktree (junction node_modules) in `tools/subagent-prompt-prefix.mjs`
### (later) Layer 5 — VM + YubiKey — needs hardware.
## Environment working rules
- Tests / push sentinel: `npx vitest run --root app --config vitest.config.tools.mjs`
(NOT `npm run test:tools` — breaks on keytar). From inside the worktree it's run as
`--root app`; from the main checkout, point `--root` at the worktree app dir.
- Commit: only via AskUserQuestion where the option label = the EXACT command (router-gate
compares verbatim) + plain-language explanation; commit text via `-F` file in `.scratch/`;
commit only explicit paths (parallel sessions).
- Push: needs a fresh verify-sentinel (full run ≤30 min); override phrases are dead
(`findOverride` is a stub) → the only path to push non-`.md` changes is to run the tests.
+18 -5
@@ -26,6 +26,7 @@ import {
lastAssistantText,
parseCoverageLine,
turnToolUses,
+ sessionToolUses,
findOverride,
logOverride,
exitDecision,
@@ -38,7 +39,7 @@ const MUTATING_TOOLS = new Set([
]);
export function decide({
- toolUses, assistantText, override,
+ toolUses, assistantText, override, priorSkillNames = [],
}) {
// Pure conversational turn — skip.
const hasMutating = toolUses.some((u) => MUTATING_TOOLS.has(u.name));
@@ -59,12 +60,19 @@ export function decide({
}
if (cov.channel === 'skill') {
- const found = toolUses.some((u) => u.name === 'Skill' && u.input && (u.input.skill === cov.id || u.input.skill === cov.id.replace(/^superpowers:/, '')));
- if (!found) {
+ // Accept if the skill was invoked in THIS turn OR anywhere earlier in this
+ // session (item G): a skill chosen in a prior turn stays active, so an honest
+ // skill:X line on a continuation turn must not be punished into under-reporting.
+ // Still unforgeable — a real Skill tool_use must exist in the transcript.
+ const norm = (s) => String(s || '').replace(/^superpowers:/, '');
+ const idNorm = norm(cov.id);
+ const foundThisTurn = toolUses.some((u) => u.name === 'Skill' && u.input && norm(u.input.skill) === idNorm);
+ const foundPrior = (priorSkillNames || []).some((n) => norm(n) === idNorm);
+ if (!foundThisTurn && !foundPrior) {
return {
block: true,
message: [
- `[enforce-coverage-verify] coverage says skill:${cov.id} but the Skill tool was never invoked with that name in this turn.`,
+ `[enforce-coverage-verify] coverage says skill:${cov.id} but the Skill tool was never invoked with that name in this turn or any prior turn of this session.`,
`Either invoke the skill via Skill tool, or switch coverage to direct:<role> with justification.`,
].join('\n'),
};
@@ -87,8 +95,13 @@ async function main() {
const toolUses = turnToolUses(transcript);
const assistantText = lastAssistantText(transcript);
+ // Session-wide Skill invocations (item G): a skill chosen in a prior turn is
+ // still active and may legitimately be named in this turn's coverage line.
+ const priorSkillNames = sessionToolUses(transcript)
+ .filter((u) => u.name === 'Skill' && u.input && u.input.skill)
+ .map((u) => u.input.skill);
- const result = decide({ toolUses, assistantText, override });
+ const result = decide({ toolUses, assistantText, override, priorSkillNames });
exitDecision(result);
} catch {
exitDecision({ block: false });
+34
@@ -1,6 +1,40 @@
import { describe, it, expect } from 'vitest';
import { decide } from './enforce-coverage-verify.mjs';
// Cross-turn skill credit (backlog item G, 2026-05-31): a skill chosen in a PRIOR
// turn stays active; an honest `skill:X` line on a continuation turn must NOT be
// blocked just because the Skill tool was not re-invoked this turn. decide() takes
// priorSkillNames (real Skill tool_uses from earlier in the session transcript).
describe('enforce-coverage-verify / decide — cross-turn active skill (enforce-coverage-verify.mjs)', () => {
it('credits skill:X when X was invoked in a PRIOR turn (priorSkillNames)', () => {
const r = decide({
toolUses: [{ name: 'Edit', input: { file_path: 'foo.mjs' } }],
assistantText: 'coverage: skill:superpowers:test-driven-development\nworking',
priorSkillNames: ['superpowers:test-driven-development'],
});
expect(r.block).toBe(false);
});
it('normalizes the superpowers: prefix for prior-turn skills too', () => {
const r = decide({
toolUses: [{ name: 'Edit', input: { file_path: 'foo.mjs' } }],
assistantText: 'coverage: skill:superpowers:test-driven-development',
priorSkillNames: ['test-driven-development'],
});
expect(r.block).toBe(false);
});
it('still blocks skill:X when X is neither in this turn nor any prior turn', () => {
const r = decide({
toolUses: [{ name: 'Edit', input: { file_path: 'foo.mjs' } }],
assistantText: 'coverage: skill:superpowers:test-driven-development',
priorSkillNames: ['some-other-skill'],
});
expect(r.block).toBe(true);
expect(r.message).toMatch(/never invoked/);
});
});
describe('enforce-coverage-verify / decide', () => {
it('allows turn with no mutating tools (pure conversational)', () => {
const r = decide({ toolUses: [{ name: 'Read', input: {} }], assistantText: 'just talking' });
+121 -4
@@ -11,10 +11,12 @@
* Activation: settings.json registration is deferred to Phase H-α/H-β
* batch step. main() is a no-op (exit 0) until then.
*/
- import { acquire, release, computeWorkspaceHash } from './parallel-session-lock.mjs';
- import { readFileSync, writeFileSync, unlinkSync, mkdirSync } from 'node:fs';
+ import { acquire, release, computeWorkspaceHash, isStale } from './parallel-session-lock.mjs';
+ import { readFileSync, writeFileSync, unlinkSync, mkdirSync, readdirSync } from 'node:fs';
+ import { execFileSync } from 'node:child_process';
  import { join, dirname } from 'node:path';
  import { readStdin, parseEventJson, exitDecision, runtimeDir } from './enforce-hook-helpers.mjs';
+ import { classifyBashCommand } from './enforce-router-gate.mjs';
/**
* Pure decision: given an acquire() result, decide block/allow.
@@ -29,12 +31,41 @@ export function decide({ acquireResult, sessionId }) {
if (!acquireResult || typeof acquireResult !== 'object') return { block: false };
if (acquireResult.acquired) return { block: false };
const holder = acquireResult.holder || {};
// Identify the holder by its STABLE session id, not the pid: the recorded pid
// is the transient hook-node pid and changes between attempts, so chasing it
// leads to closing the wrong session. Surface the pid only as a triage hint.
return {
block: true,
- reason: `parallel session lock held by ${holder.session_id || 'unknown'} (pid ${holder.pid || '?'}) — wait or close that session first`,
+ reason: `parallel session lock held by session ${holder.session_id || 'unknown'} (current pid ${holder.pid || '?'}, may change between attempts — identify the session by its id, not pid) — wait for the 5-min TTL or close THAT session`,
};
}
/**
* Calibration (2026-05-31, SCOPE fix, NOT a discipline drop). The lock's purpose
* is to serialize concurrent FILE MUTATION between sessions on the same worktree.
* A readonly Bash command (git status/log/diff, cat, grep, ls; look-only "viewer" commands)
* mutates nothing, so a peer session's lock must NOT block it. Reuse the
* router-gate Bash classifier: an allow-verdict whose reason mentions
* readonly/reading is a no-state-change command. Mirrors the LLM-judge readonly
* calibration. Everything that can mutate — file edits, git commit/push,
* dangerous Bash, and every NON-Bash tool — still acquires/checks the lock, so
* same-worktree mutation serialization is unchanged.
*
* @param {object} event
* @returns {boolean}
*/
export function isReadonlyBashEvent(event) {
if (!event || event.tool_name !== 'Bash') return false;
const command = (event.tool_input && event.tool_input.command) || '';
if (!command) return false;
try {
const c = classifyBashCommand(command, {});
return !!c && c.result === 'allow' && /readonly|reading/i.test(c.reason || '');
} catch {
return false;
}
}
/**
* PreToolUse wiring: acquire (or same-session refresh / stale takeover) the lock,
* then decide block/allow. I/O injected for testability.
@@ -60,6 +91,64 @@ export function runReleaseAction({ event, cwd, readLock, deleteLock }) {
return { released: true };
}
/**
* Resolve the stable work-tree root used as the lock key. Keys on the SESSION's
* cwd (`event.cwd`, stable across resume) resolved to the git work-tree root —
* NOT the hook's `process.cwd()`, which collapses to the main repo dir after a
* session resume and thereby false-blocks sessions in DIFFERENT worktrees.
* Pure (I/O injected): `runGitToplevel(dir)` returns the toplevel or '' on failure.
*
* @param {object} p
* @param {object} p.event
* @param {string} p.processCwd
* @param {(dir:string)=>string} p.runGitToplevel
* @returns {string}
*/
export function resolveWorkspacePath({ event, processCwd, runGitToplevel }) {
const dir = (event && typeof event.cwd === 'string' && event.cwd) ? event.cwd : processCwd;
try {
const top = runGitToplevel(dir);
if (top && typeof top === 'string') return top;
} catch { /* fall through to raw dir (fail-open) */ }
return dir;
}
/**
* Disk hygiene: delete leaked lock files whose record is ALREADY stale by the
* shared isStale() definition (so an active within-TTL lock is never touched).
* Pure (I/O injected). Best-effort: a failed read counts the file as stale
* (garbage), a failed delete is swallowed — hygiene must never break the gate.
*
* @param {object} p
* @param {string[]} p.files - absolute lock-file paths
* @param {(f:string)=>object|null} p.readRecord
* @param {(f:string)=>void} p.deleteRecord
* @param {(rec:object|null, now:number)=>boolean} p.isStaleFn
* @param {number} p.now
* @returns {{pruned: number}}
*/
export function pruneStaleLocks({ files, readRecord, deleteRecord, isStaleFn, now }) {
let pruned = 0;
for (const f of files || []) {
let rec = null;
try { rec = readRecord(f); } catch { rec = null; }
if (isStaleFn(rec, now)) {
try { deleteRecord(f); pruned++; } catch { /* best-effort */ }
}
}
return { pruned };
}
function realGitToplevel(dir) {
try {
return execFileSync('git', ['-C', dir, 'rev-parse', '--show-toplevel'], {
encoding: 'utf-8',
timeout: 1000,
stdio: ['ignore', 'pipe', 'ignore'],
}).trim();
} catch { return ''; }
}
function lockPathFor(cwd) {
return join(runtimeDir(), `session-lock-${computeWorkspaceHash(cwd)}.json`);
}
@@ -82,7 +171,10 @@ async function main() {
// a lock bug can NEVER wedge the user out of their own session.
try {
const event = parseEventJson(await readStdin());
- const cwd = process.cwd();
+ // Key by the session's stable work-tree root (event.cwd → git toplevel),
+ // not the volatile hook process.cwd() (collapses to main on resume → false
+ // cross-worktree blocks). Fallback to process.cwd() keeps prior behavior.
+ const cwd = resolveWorkspacePath({ event, processCwd: process.cwd(), runGitToplevel: realGitToplevel });
const p = lockPathFor(cwd);
// Stop event carries no tool_name → release path.
@@ -91,6 +183,31 @@ async function main() {
return exitDecision({ block: false });
}
// Calibration (2026-05-31): a readonly Bash command never mutates the
// worktree, so it is outside the lock's mutation-serialization scope — allow
// without acquiring/blocking. Mutating tools (and every non-Bash tool) fall
// through to acquire/check below, so serialization is unchanged.
if (isReadonlyBashEvent(event)) {
return exitDecision({ block: false });
}
// Best-effort disk hygiene (B): drop leaked stale lock files before acquiring.
// isStale-gated → an active within-TTL lock is never pruned, so same-worktree
// serialization is untouched. Wrapped so hygiene can never break the gate.
try {
const dir = runtimeDir();
const files = readdirSync(dir)
.filter((f) => /^session-lock-.*\.json$/.test(f))
.map((f) => join(dir, f));
pruneStaleLocks({
files,
readRecord: (fp) => realReadLock(fp),
deleteRecord: (fp) => realDeleteLock(fp),
isStaleFn: isStale,
now: Date.now(),
});
} catch { /* hygiene is best-effort */ }
// PreToolUse on a mutating tool → acquire/refresh, then block/allow.
const r = runAcquireDecision({
event,
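
The readonly short-circuit added in this file can be exercised in isolation by injecting the classifier. In the sketch below, `isReadonlyBash` follows the logic of `isReadonlyBashEvent()` but takes the classifier as a parameter. `stubClassify` and its verdict strings are hypothetical and stand in for `classifyBashCommand`, whose real verdicts are not shown in this diff.

```javascript
// Same decision as isReadonlyBashEvent(), with the classifier injected.
function isReadonlyBash(event, classify) {
  if (!event || event.tool_name !== 'Bash') return false;
  const command = (event.tool_input && event.tool_input.command) || '';
  if (!command) return false;
  try {
    const c = classify(command);
    // Only an allow-verdict whose reason mentions readonly/reading qualifies.
    return !!c && c.result === 'allow' && /readonly|reading/i.test(c.reason || '');
  } catch {
    // A classifier failure means "not readonly", so the lock is still checked.
    return false;
  }
}

// Hypothetical stub classifier with made-up reason strings.
function stubClassify(command) {
  if (/^git\s+(status|log|diff)\b/.test(command)) {
    return { result: 'allow', reason: 'readonly git inspection' };
  }
  if (/^npm\s+ci\b/.test(command)) {
    return { result: 'allow', reason: 'whitelisted dependency restore' };
  }
  return { result: 'deny', reason: 'default-deny' };
}
```

Note the asymmetry in this sketch: a command that is allowed but not readonly (the stubbed `npm ci`) does not skip the lock, and neither does any non-Bash tool or a command whose classification throws.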
+164 -1
@@ -1,7 +1,7 @@
// tools/enforce-parallel-session-lock.test.mjs
// Stream H Task 7 — wrapper tests around the pure parallel-session-lock module.
import { describe, it, expect } from 'vitest';
- import { decide } from './enforce-parallel-session-lock.mjs';
+ import { decide, isReadonlyBashEvent } from './enforce-parallel-session-lock.mjs';
describe('enforce-parallel-session-lock wrapper (Stream H Task 7)', () => {
it('allow when acquire succeeded (fresh own-lock)', () => {
@@ -43,6 +43,25 @@ describe('enforce-parallel-session-lock wrapper (Stream H Task 7)', () => {
});
});
// D (2026-05-31): the block message must steer the human to the STABLE identity
// (session id), not the transient hook pid — chasing the pid was what caused the
// owner to close the wrong session and deadlock the workspace.
describe('decide() message clarity (D) — pid is transient, identify by session id', () => {
const blocked = { acquired: false, holder: { session_id: 'sess-A', pid: 12552, acquired_at: 0 } };
it('names the holder session id as the stable identity', () => {
expect(decide({ acquireResult: blocked, sessionId: 's1' }).reason).toMatch(/sess-A/);
});
it('marks the pid as changeable so the human does not chase it', () => {
expect(decide({ acquireResult: blocked, sessionId: 's1' }).reason).toMatch(/may change|transient/i);
});
it('still surfaces the pid for triage', () => {
expect(decide({ acquireResult: blocked, sessionId: 's1' }).reason).toMatch(/12552/);
});
});
// Live wiring (point 2, 2026-05-31): PreToolUse acquires/refreshes the lock,
// Stop releases it. I/O is injected (readLock/writeLock/deleteLock) so the
// wiring stays pure and unit-testable; main() binds real fs.
@@ -131,3 +150,147 @@ describe('runReleaseAction — Stop release wiring', () => {
expect(deleted).toBe(false);
});
});
// Cross-worktree false-block fix (2026-05-31). The lock must key on the session's
// stable work-tree root (from event.cwd → git toplevel), NOT the hook process.cwd()
// — which collapses to the main repo dir after a session resume, making sessions in
// DIFFERENT worktrees share one lock and block each other.
import { resolveWorkspacePath, pruneStaleLocks } from './enforce-parallel-session-lock.mjs';
describe('resolveWorkspacePath — stable worktree key', () => {
it('keys on event.cwd (the session worktree), not the hook process.cwd()', () => {
const r = resolveWorkspacePath({
event: { cwd: '/repo/.claude/worktrees/wt-A' },
processCwd: '/repo',
runGitToplevel: (dir) => dir,
});
expect(r).toBe('/repo/.claude/worktrees/wt-A');
});
it('gives different keys for two different worktrees (no cross-block)', () => {
const opts = { processCwd: '/repo', runGitToplevel: (dir) => dir };
const a = resolveWorkspacePath({ event: { cwd: '/repo/.claude/worktrees/wt-A' }, ...opts });
const b = resolveWorkspacePath({ event: { cwd: '/repo/.claude/worktrees/wt-B' }, ...opts });
expect(a).not.toBe(b);
});
it('resolves to the git work-tree root (collapses subdir variance)', () => {
const r = resolveWorkspacePath({
event: { cwd: '/repo/.claude/worktrees/wt-A/tools' },
processCwd: '/repo',
runGitToplevel: () => '/repo/.claude/worktrees/wt-A',
});
expect(r).toBe('/repo/.claude/worktrees/wt-A');
});
it('falls back to processCwd when event.cwd is absent', () => {
const r = resolveWorkspacePath({
event: { tool_name: 'Edit' },
processCwd: '/repo',
runGitToplevel: (dir) => dir,
});
expect(r).toBe('/repo');
});
it('falls back to the raw dir when git toplevel resolution fails (fail-open)', () => {
const r = resolveWorkspacePath({
event: { cwd: '/some/dir' },
processCwd: '/repo',
runGitToplevel: () => '',
});
expect(r).toBe('/some/dir');
});
});
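The five cases above pin down a simple resolution order. A minimal sketch of a function that would satisfy them follows. `resolveWorkspacePathSketch` is a hypothetical name; the real `resolveWorkspacePath` in `enforce-parallel-session-lock.mjs` is not shown in this diff and may differ in detail.

```javascript
// Hypothetical sketch, not the shipped implementation.
function resolveWorkspacePathSketch({ event, processCwd, runGitToplevel }) {
  // Prefer the session's own cwd from the hook event; fall back to the
  // hook process cwd only when the event carries none.
  const dir = (event && typeof event.cwd === 'string' && event.cwd) || processCwd;
  let top = '';
  try {
    top = runGitToplevel(dir) || '';
  } catch {
    top = ''; // fail-open: git missing or dir is not a repository
  }
  // An empty toplevel means resolution failed, so keep the raw dir.
  return top || dir;
}

const key = resolveWorkspacePathSketch({
  event: { cwd: '/repo/.claude/worktrees/wt-A/tools' },
  processCwd: '/repo',
  runGitToplevel: () => '/repo/.claude/worktrees/wt-A',
});
console.log(key); // /repo/.claude/worktrees/wt-A
```

Injecting `runGitToplevel` keeps the function pure, which is what lets the tests above run without a real git checkout.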
// B (2026-05-31): disk hygiene. Leaked lock files (session closed without a clean
// Stop) pile up in ~/.claude/runtime. Pruning ONLY removes records that are
// already stale by the SAME isStale() definition acquire() uses — so it can never
// drop an active (within-TTL) lock and never weakens same-worktree serialization.
describe('pruneStaleLocks — drops only already-stale leaked locks (B)', () => {
const fresh = { schema_version: 1, session_id: 'A', pid: 1, acquired_at: 1000, ttl_ms: 300000 };
const stale = { schema_version: 1, session_id: 'B', pid: 2, acquired_at: 0, ttl_ms: 100 };
const isStaleFn = (rec, now) => !rec || (now - (rec && rec.acquired_at || 0)) > ((rec && rec.ttl_ms) || 300000);
it('deletes stale lock files and never the fresh (active) ones', () => {
const records = { '/r/lock-fresh.json': fresh, '/r/lock-stale.json': stale };
const deleted = [];
const r = pruneStaleLocks({
files: Object.keys(records),
readRecord: (f) => records[f],
deleteRecord: (f) => deleted.push(f),
isStaleFn, now: 1000,
});
expect(deleted).toEqual(['/r/lock-stale.json']);
expect(r.pruned).toBe(1);
});
it('treats an unreadable/garbage lock file as stale and prunes it', () => {
const deleted = [];
pruneStaleLocks({
files: ['/r/garbage.json'],
readRecord: () => { throw new Error('bad json'); },
deleteRecord: (f) => deleted.push(f),
isStaleFn, now: 1000,
});
expect(deleted).toEqual(['/r/garbage.json']);
});
it('never throws when a delete fails (best-effort hygiene)', () => {
expect(() => pruneStaleLocks({
files: ['/r/x.json'],
readRecord: () => stale,
deleteRecord: () => { throw new Error('locked'); },
isStaleFn, now: 1000,
})).not.toThrow();
});
it('does nothing for an empty file list', () => {
const r = pruneStaleLocks({ files: [], readRecord: () => null, deleteRecord: () => {}, isStaleFn, now: 1 });
expect(r.pruned).toBe(0);
});
});
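The pruning contract tested above (delete only what `isStaleFn` calls stale, treat unreadable files as stale, never throw) can be sketched as below. `pruneStaleLocksSketch` is a hypothetical name and this is only one way to meet the tests; the real `pruneStaleLocks` is not shown in this diff.

```javascript
// Hypothetical sketch, not the shipped implementation.
function pruneStaleLocksSketch({ files, readRecord, deleteRecord, isStaleFn, now }) {
  let pruned = 0;
  for (const file of files || []) {
    let record = null;
    try {
      record = readRecord(file);
    } catch {
      record = null; // unreadable or garbage JSON: treated like a missing record
    }
    if (!isStaleFn(record, now)) continue; // active lock: never touched
    try {
      deleteRecord(file);
      pruned += 1;
    } catch {
      // best-effort hygiene: a failed delete is ignored
    }
  }
  return { pruned };
}

// Same staleness rule as the test helper above (strictly older than the TTL).
const staleRule = (rec, now) => !rec || now - (rec.acquired_at || 0) > (rec.ttl_ms || 300000);
const store = {
  '/r/lock-fresh.json': { acquired_at: 1000, ttl_ms: 300000 },
  '/r/lock-stale.json': { acquired_at: 0, ttl_ms: 100 },
};
const removed = [];
const summary = pruneStaleLocksSketch({
  files: Object.keys(store),
  readRecord: (f) => store[f],
  deleteRecord: (f) => removed.push(f),
  isStaleFn: staleRule,
  now: 1000,
});
console.log(removed, summary.pruned);
```

Because the staleness predicate is passed in, the prune step cannot drift from the rule `acquire()` uses, provided the caller passes the same function.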
// ── Calibration (2026-05-31): readonly Bash is outside the lock scope ──
// The lock serializes concurrent FILE MUTATION between sessions on the same
// worktree. A readonly Bash command (git status/log/diff, cat, grep, ls)
// mutates nothing, so a peer session's lock must NOT block it. This mirrors the
// LLM-judge readonly calibration (isReadonlyBashEvent in enforce-llm-judge-per-tool).
// Everything that can mutate — file edits, git commit/push, dangerous Bash, and
// every NON-Bash tool — still acquires/checks the lock, so mutation
// serialization is unchanged (scope fix, NOT a discipline drop).
describe('isReadonlyBashEvent — readonly Bash bypasses the lock (calibration 2026-05-31)', () => {
const ev = (command) => ({ tool_name: 'Bash', tool_input: { command } });
it('treats readonly git (status/log/diff) as readonly', () => {
expect(isReadonlyBashEvent(ev('git status'))).toBe(true);
expect(isReadonlyBashEvent(ev('git log --oneline -5'))).toBe(true);
expect(isReadonlyBashEvent(ev('git diff'))).toBe(true);
});
it('treats whitelisted reading commands (cat/grep/ls) as readonly', () => {
expect(isReadonlyBashEvent(ev('ls -la'))).toBe(true);
expect(isReadonlyBashEvent(ev('cat README.md'))).toBe(true);
expect(isReadonlyBashEvent(ev('grep -n foo bar.txt'))).toBe(true);
});
it('does NOT treat mutating Bash as readonly (still acquires/blocks)', () => {
expect(isReadonlyBashEvent(ev('rm -rf x'))).toBe(false);
expect(isReadonlyBashEvent(ev('git commit -m "x"'))).toBe(false);
expect(isReadonlyBashEvent(ev('npm install foo'))).toBe(false);
});
it('does NOT treat a chain with a mutating part as readonly (C13)', () => {
expect(isReadonlyBashEvent(ev('git status && rm x'))).toBe(false);
});
it('only applies to the Bash tool — other tools still acquire the lock', () => {
expect(isReadonlyBashEvent({ tool_name: 'Edit', tool_input: { file_path: 'a.js' } })).toBe(false);
expect(isReadonlyBashEvent({ tool_name: 'Write', tool_input: { file_path: 'a.js' } })).toBe(false);
});
it('is safe on malformed input', () => {
expect(isReadonlyBashEvent(null)).toBe(false);
expect(isReadonlyBashEvent({ tool_name: 'Bash', tool_input: {} })).toBe(false);
expect(isReadonlyBashEvent({ tool_name: 'Bash' })).toBe(false);
});
});
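A rough sketch of the check described above: a Bash event counts as readonly only when the classifier both allows it and gives a readonly/reading reason. The classifier is injected here as `classify`. `toyClassify` and its reason strings are stand-ins invented for illustration; the real router-gate classifier and its reason wording are not shown in this diff.

```javascript
// Hypothetical sketch; `classify` stands in for the router-gate Bash classifier.
function isReadonlyBashEventSketch(event, classify) {
  if (!event || event.tool_name !== 'Bash') return false;
  const command = event.tool_input && event.tool_input.command;
  if (typeof command !== 'string' || command.trim() === '') return false;
  let verdict;
  try {
    verdict = classify(command, {});
  } catch {
    return false; // on any classifier error, fall back to taking the lock
  }
  if (!verdict || verdict.result !== 'allow') return false;
  // "allow" alone is not enough: the reason must say the command only reads.
  return /readonly|reading/i.test(String(verdict.reason || ''));
}

// Toy classifier for the demo only (assumed verdict shape: { result, reason }).
const toyClassify = (cmd) => {
  if (/&&|;|\|/.test(cmd)) return { result: 'block', reason: 'chain with unverified part' };
  if (/^npm\s+ci\b/.test(cmd)) return { result: 'allow', reason: 'safe-exact' };
  if (/^(?:git\s+(?:status|log|diff)|ls|cat|grep)\b/.test(cmd)) return { result: 'allow', reason: 'readonly' };
  return { result: 'block', reason: 'mutating' };
};

const bash = (command) => ({ tool_name: 'Bash', tool_input: { command } });
console.log(isReadonlyBashEventSketch(bash('git status'), toyClassify)); // true
console.log(isReadonlyBashEventSketch(bash('npm ci'), toyClassify));     // false
```

The `npm ci` case shows why the reason matters: a command can be allowed by the gate and still mutate the worktree, so it should keep acquiring the lock.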
+28 -2
@@ -21,13 +21,15 @@ import {
parseEventJson,
readRouterState,
readRationalizationFlags,
readTranscript,
sessionToolUses,
findOverride,
loadOverrideVocab,
} from './enforce-hook-helpers.mjs';
const SUPPRESS_RULE = 'classifier-mismatch';
export function buildReminder({ classification, recentFlags, override }) {
export function buildReminder({ classification, recentFlags, override, activeSkills = [] }) {
const lines = ['## §17 Coverage / Discipline Reminder', ''];
if (override) {
lines.push(`Override phrase detected: "${override.phrase}". The following rules are suppressed for THIS prompt only:`);
@@ -38,6 +40,16 @@ export function buildReminder({ classification, recentFlags, override }) {
lines.push(' `coverage: <channel>:<id>`');
lines.push('Channels: skill, node, chain, hook, agent, direct.');
lines.push('');
// Item G (2026-05-31): a skill invoked in an EARLIER turn stays active. Remind
// explicitly so the coverage line is not under-reported as direct/chain when the
// work actually continues under that skill. (The verifier now accepts a prior-turn
// skill, so this report is honest, not a violation.)
if (Array.isArray(activeSkills) && activeSkills.length > 0) {
lines.push('**Active skill(s) still in effect from earlier this session:**');
for (const s of activeSkills) lines.push(` - ${s}`);
lines.push('If your work continues under one of these, report `coverage: skill:<name>` (not direct/chain).');
lines.push('');
}
if (classification) {
lines.push(`**Classifier output:** task_type=${classification.task_type || 'unknown'}, confidence=${classification.confidence ?? 'n/a'}`);
if (classification.recommended_node) {
@@ -94,7 +106,21 @@ async function main() {
const flags = readRationalizationFlags(sessionId);
const reminder = buildReminder({ classification, recentFlags: flags, override });
// Item G: detect skills invoked earlier this session (still active). The
// transcript at UserPromptSubmit holds all prior turns. Best-effort.
let activeSkills = [];
try {
const transcript = readTranscript(event.transcript_path);
const seen = new Set();
for (const u of sessionToolUses(transcript)) {
if (u.name === 'Skill' && u.input && u.input.skill && !seen.has(u.input.skill)) {
seen.add(u.input.skill);
activeSkills.push(u.input.skill);
}
}
} catch { activeSkills = []; }
const reminder = buildReminder({ classification, recentFlags: flags, override, activeSkills });
process.stdout.write(JSON.stringify({
hookSpecificOutput: {
+16
@@ -66,6 +66,22 @@ describe('enforce-prompt-injection / buildReminder', () => {
expect(txt).toMatch(/verify-before-push/);
});
it('reminds about active skills carried over from prior turns (item G)', () => {
const txt = buildReminder({
classification: null,
recentFlags: [],
activeSkills: ['superpowers:test-driven-development'],
});
expect(txt).toMatch(/Active skill/i);
expect(txt).toMatch(/test-driven-development/);
expect(txt).toMatch(/coverage: skill:/);
});
it('omits the active-skill note when none are active', () => {
const txt = buildReminder({ classification: null, recentFlags: [], activeSkills: [] });
expect(txt).not.toMatch(/Active skill/i);
});
it('does NOT advertise dead override-vocabulary phrases (v4 stub — 1A 2026-05-31)', () => {
const txt = buildReminder({ classification: null, recentFlags: [] });
// findOverride/loadOverrideVocab are stubs (vocab removed in v4); advertising the phrases
+6
@@ -120,6 +120,12 @@ const READING_CMDS = new Set(['ls', 'pwd', 'wc', 'head', 'tail', 'file', 'stat',
const SAFE_EXACT = [
/^npx\s+vitest\s+(?:run|--version)\b/,
/^npm\s+(?:test|run\s+test|run\s+lint(?::[\w-]+)?)\b/,
// `npm ci` (2026-05-31, owner-authorized) — clean install from the committed
// lockfile (deterministic, no version drift) to restore junction node_modules
// in a fresh worktree. Distinct from `npm install`/`npm i`, which stay
// hard-blacklisted (line ~60) because they can pull new/updated versions.
// `\b` after `ci` prevents `npm cider`-style prefix matches.
/^npm\s+ci\b/,
/^php\s+artisan\s+(?:list|route:list|migrate:status)\b/,
/^composer\s+(?:show|outdated)\b/,
/^node\s+(?!.*(?:-e|--eval|-p|--print|-r|--require|--import|--experimental-loader)\b)/,
+33
@@ -271,6 +271,39 @@ describe('SAFE_EXACT — narrow `cd app` whitelist (2026-05-31, owner-authorized
});
});
describe('SAFE_EXACT — npm ci (worktree dep restore, 2026-05-31)', () => {
// Allowed: npm ci installs exactly the committed lockfile (deterministic, no
// version drift) — needed to restore junction node_modules in a fresh worktree.
it.each([
'npm ci',
'npm ci --no-audit',
'npm ci --prefer-offline',
])('allows %s', (cmd) => {
expect(classifyBashCommand(cmd, {}).result).toBe('allow');
});
// Critical: npm install / npm i remain hard-blacklisted (line 60) — they can
// pull new/updated versions, unlike ci which pins to the lockfile.
it.each([
'npm install',
'npm i',
'npm install foo',
'npm i foo',
])('still blocks %s (hard-blacklist)', (cmd) => {
expect(classifyBashCommand(cmd, {}).result).toBe('block');
});
// Critical: word boundary — `npm cider` (or any ci-prefixed token) is NOT npm ci
it('does not allow ci-prefixed token (word boundary)', () => {
expect(classifyBashCommand('npm cider', {}).result).toBe('block');
});
// Critical: chain semantics still enforced — npm ci && rm x → block (rm mutating)
it('still blocks chain with mutating part after npm ci', () => {
expect(classifyBashCommand('npm ci && rm x', {}).result).toBe('block');
});
});
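To see what the `\b` in the new `SAFE_EXACT` entry does and does not guard against, the pattern can be exercised on its own. This only tests the regex copied from the diff, not the full `classifyBashCommand` pipeline (hard-blacklist, chain splitting), which may reject inputs the regex alone would accept.

```javascript
// The pattern as added to SAFE_EXACT in this diff.
const NPM_CI_RE = /^npm\s+ci\b/;

const samples = ['npm ci', 'npm ci --no-audit', 'npm cider', 'npm install', 'npm ci-foo'];
const results = Object.fromEntries(samples.map((cmd) => [cmd, NPM_CI_RE.test(cmd)]));
console.log(results);
// 'npm ci'            -> true
// 'npm ci --no-audit' -> true
// 'npm cider'         -> false (no word boundary between "ci" and "der")
// 'npm install'       -> false
// 'npm ci-foo'        -> true  ("-" is a non-word character, so \b matches)
```

The last row is worth noting: a word boundary stops `cider`, but a hyphenated token still passes this one regex. `npm ci-foo` is not a real npm command, so this is probably harmless in practice.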
import { stripQuotedSpans } from './enforce-router-gate.mjs';
describe('quote-aware redirect (quirk 2)', () => {
+13 -1
@@ -24,6 +24,11 @@ import { readStdin, parseEventJson, exitDecision } from './enforce-hook-helpers.
const WRITE_TOOLS = new Set(['Edit', 'Write', 'MultiEdit', 'NotebookEdit']);
const RUNTIME_RE = /(^|\/)\.claude\/runtime(\/|$)/i;
// Transcript protection (Z Part 1): any *.jsonl under ~/.claude/projects/** is a
// session/subagent transcript. The tdd-gate credits a subagent's RED from its
// agent-<id>.jsonl, so these must be unforgeable by the Write tool. Memory files
// there are *.md and never match `.jsonl$`, so memory writes stay allowed.
const TRANSCRIPT_RE = /(^|\/)\.claude\/projects\/.*\.jsonl$/i;
/**
* Pure decision.
@@ -39,12 +44,19 @@ export function decide({ toolName, filePath, normalizeImpl = pathNormalize }) {
if (!fp) return { block: false };
let norm;
try { norm = normalizeImpl(fp); } catch { return { block: false }; } // cannot determine → fail-open
if (RUNTIME_RE.test(String(norm || ''))) {
const normStr = String(norm || '');
if (RUNTIME_RE.test(normStr)) {
return {
block: true,
reason: `Write to «${norm}» denied — ~/.claude/runtime is a protected side-channel (git-approval anchor). Hooks write it via Node fs, not the Write tool.`,
};
}
if (TRANSCRIPT_RE.test(normStr)) {
return {
block: true,
reason: `Write to «${norm}» denied — ~/.claude/projects/**/*.jsonl are session/subagent transcripts (tamper-protected; the tdd-gate trusts them). The harness writes transcripts, never the Write tool. Memory *.md there stays writable.`,
};
}
return { block: false };
}
+44
@@ -52,3 +52,47 @@ describe('enforce-runtime-write-deny decide()', () => {
expect(r.block).toBe(true);
});
});
// Part 1 of Z (2026-05-31): close the transcript Write hole. The tdd-gate will
// (Part 2) credit a subagent's RED from its agent-<id>.jsonl; that transcript
// must therefore be unforgeable. The Write tool was the last ungated channel
// into ~/.claude/projects/**/*.jsonl (Bash/PowerShell/Read gates already cover
// it). Memory files there are .md and stay writable (they never match .jsonl$).
describe('enforce-runtime-write-deny — transcript .jsonl protection (Z Part 1)', () => {
it('blocks a Write to a subagent transcript under ~/.claude/projects', () => {
const p = join(HOME, '.claude', 'projects', 'slug', 'sess-uuid', 'subagents', 'agent-abc.jsonl');
expect(decide({ toolName: 'Write', filePath: p }).block).toBe(true);
});
it('blocks a Write to the controller session transcript itself', () => {
const p = join(HOME, '.claude', 'projects', 'slug', 'sess-uuid.jsonl');
expect(decide({ toolName: 'Write', filePath: p }).block).toBe(true);
});
it('blocks Edit/MultiEdit/NotebookEdit on a transcript .jsonl too', () => {
const p = join(HOME, '.claude', 'projects', 'slug', 'sess', 'subagents', 'agent-x.jsonl');
expect(decide({ toolName: 'Edit', filePath: p }).block).toBe(true);
expect(decide({ toolName: 'MultiEdit', filePath: p }).block).toBe(true);
expect(decide({ toolName: 'NotebookEdit', filePath: p }).block).toBe(true);
});
it('blocks the .-segment evasion into projects transcripts', () => {
const evasion = `${HOME_FWD}/.claude/projects/slug/./sess/subagents/agent-x.jsonl`;
expect(decide({ toolName: 'Write', filePath: evasion }).block).toBe(true);
});
it('ALLOWS a memory .md under ~/.claude/projects (never a .jsonl)', () => {
const p = join(HOME, '.claude', 'projects', 'slug', 'memory', 'feedback_x.md');
expect(decide({ toolName: 'Write', filePath: p }).block).toBe(false);
});
it('ALLOWS a .jsonl OUTSIDE ~/.claude/projects (e.g. repo observer episodes)', () => {
const p = join(HOME, 'repo', 'docs', 'observer', 'episodes-2026-05.jsonl');
expect(decide({ toolName: 'Write', filePath: p }).block).toBe(false);
});
it('ignores non-write tools on a transcript path', () => {
const p = join(HOME, '.claude', 'projects', 'slug', 'sess', 'subagents', 'agent-x.jsonl');
expect(decide({ toolName: 'Read', filePath: p }).block).toBe(false);
});
});
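The allow/deny split in these tests comes down to one regular expression applied to the normalized path. The snippet below runs `TRANSCRIPT_RE` (copied from the diff) against a few hand-written paths. The sample paths are illustrative. The last case assumes the injected `normalizeImpl` converts backslashes to forward slashes before the test, which is not shown here.

```javascript
// Copied from enforce-runtime-write-deny in this diff.
const TRANSCRIPT_RE = /(^|\/)\.claude\/projects\/.*\.jsonl$/i;

const checks = {
  subagent: TRANSCRIPT_RE.test('/home/u/.claude/projects/slug/sess/subagents/agent-x.jsonl'),
  controller: TRANSCRIPT_RE.test('/home/u/.claude/projects/slug/sess-uuid.jsonl'),
  upperCaseExt: TRANSCRIPT_RE.test('/home/u/.claude/projects/slug/sess.JSONL'),
  memoryMd: TRANSCRIPT_RE.test('/home/u/.claude/projects/slug/memory/feedback_x.md'),
  outsideProjects: TRANSCRIPT_RE.test('/home/u/repo/docs/observer/episodes-2026-05.jsonl'),
  // Raw Windows separators do not match; the pattern only knows "/".
  rawBackslashes: TRANSCRIPT_RE.test('C:\\Users\\u\\.claude\\projects\\slug\\sess.jsonl'),
};
console.log(checks);
```

The first three are true and the last three false. The `rawBackslashes` result suggests why `decide()` normalizes the path first: without that step a backslash-separated path would slip past the pattern.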
+75 -7
@@ -27,6 +27,7 @@ import {
isProductionCodePath,
readRouterState,
} from './enforce-hook-helpers.mjs';
import { join, dirname, basename } from 'node:path';
const RULE_KEY_TDD = 'tdd-gate';
const RULE_KEY_PLAN = 'writing-plans-required';
@@ -132,8 +133,56 @@ function hasPlanIndicator(turn) {
return false;
}
const AGENT_ID_RE = /agentId:\s*([0-9a-f]+)/i;
/**
* Cross-actor (Z Part 2): extract agentIds of subagents spawned by a `Task`
tool in the controller's current turn. The agentId comes from the harness-written
Task tool_result text ("agentId: <hex>"); the controller cannot forge
* a tool_result in its own transcript. Only hex ids are accepted, so a crafted
* "agentId: ../../x" cannot become a path-traversal into an arbitrary file.
*/
export function turnTaskAgentIds(turn) {
const taskUseIds = new Set();
for (const e of turn || []) {
const c = e && e.message && e.message.content;
if (!Array.isArray(c)) continue;
for (const b of c) {
if (b && b.type === 'tool_use' && b.name === 'Task') taskUseIds.add(b.id);
}
}
const ids = [];
for (const e of turn || []) {
const c = e && e.message && e.message.content;
if (!Array.isArray(c)) continue;
for (const b of c) {
if (!b || b.type !== 'tool_result' || !taskUseIds.has(b.tool_use_id)) continue;
const txt = typeof b.content === 'string' ? b.content
: Array.isArray(b.content) ? b.content.map((p) => p && p.text).filter(Boolean).join('\n') : '';
const m = txt.match(AGENT_ID_RE);
if (m) ids.push(m[1]);
}
}
return ids;
}
/**
* Derive subagent transcript paths from the controller transcript path and a
* list of agentIds. Subagent transcripts live at
* <projects>/<slug>/<controller-session>/subagents/agent-<agentId>.jsonl
* i.e. nested under the controller session's own directory (bound to it), while
* the controller transcript is <...>/<controller-session>.jsonl.
*/
export function subagentTranscriptPaths(controllerTranscriptPath, agentIds) {
const p = String(controllerTranscriptPath || '');
if (!p) return [];
const dir = dirname(p);
const base = basename(p).replace(/\.jsonl$/i, '');
return (agentIds || []).map((id) => join(dir, base, 'subagents', `agent-${id}.jsonl`));
}
export function decide({
toolName, filePath, transcriptEntries, classification, override, overridePlan,
toolName, filePath, transcriptEntries, classification, override, overridePlan, subagentEntriesList = [],
}) {
if (!['Edit', 'Write', 'MultiEdit'].includes(toolName)) return { block: false };
if (!isProductionCodePath(filePath)) return { block: false };
@@ -155,24 +204,31 @@ export function decide({
}
}
// Rule #3 — TDD gate.
// Rule #3 — TDD gate. Credit the controller's own turn OR a subagent that was
// spawned by a Task in this turn (cross-actor, Z Part 2). Subagent evidence is
// read from its agent-<id>.jsonl, which is tamper-protected by the transcript
// Write-deny (Z Part 1) — so crediting it does not open a forgery channel.
if (override) return { block: false };
const hasTest = hasMatchingTestEdit(turn, filePath);
const subList = Array.isArray(subagentEntriesList) ? subagentEntriesList : [];
const hasTest = hasMatchingTestEdit(turn, filePath) || subList.some((es) => hasMatchingTestEdit(es, filePath));
if (!hasTest) {
return {
block: true,
message: [
`[enforce-tdd-gate] Production code edit on "${filePath}" without preceding test edit.`,
`Write the failing test FIRST in the corresponding *.test.mjs / *.spec.ts / *Test.php.`,
`Write the failing test FIRST in the corresponding *.test.mjs / *.spec.ts / *Test.php`,
`(a subagent's test edit, if it was spawned by a Task in this turn, is also credited).`,
`Then run vitest/pest to confirm RED, then return to this prod-code Edit.`,
].join('\n'),
};
}
if (!hasFailingTestRun(turn)) {
const hasRed = hasFailingTestRun(turn) || subList.some((es) => hasFailingTestRun(es));
if (!hasRed) {
return {
block: true,
message: [
`[enforce-tdd-gate] Test was edited but no vitest/pest run with RED output observed in this turn.`,
`[enforce-tdd-gate] Test was edited but no vitest/pest run with RED output observed in this turn`,
`(nor in any in-turn subagent transcript).`,
`Run the test suite (vitest run <test-file> / composer test) to confirm RED before prod-code edit.`,
].join('\n'),
};
@@ -199,7 +255,19 @@ async function main() {
task_type: state.classification.task_type,
} : null;
const result = decide({ toolName, filePath, transcriptEntries: transcript, classification, override, overridePlan });
// Cross-actor (Z Part 2): read transcripts of subagents spawned by a Task in
// this turn, bound to the controller session via the derived path. Best-effort
// — a missing/unreadable subagent transcript just yields no extra credit
// (stricter), never an error.
let subagentEntriesList = [];
try {
const turn = lastTurnEntries(transcript);
const agentIds = turnTaskAgentIds(turn);
const paths = subagentTranscriptPaths(event.transcript_path, agentIds);
subagentEntriesList = paths.map((p) => readTranscript(p)).filter((e) => Array.isArray(e) && e.length);
} catch { subagentEntriesList = []; }
const result = decide({ toolName, filePath, transcriptEntries: transcript, classification, override, overridePlan, subagentEntriesList });
exitDecision(result);
} catch {
exitDecision({ block: false });
+75 -1
@@ -1,5 +1,79 @@
import { describe, it, expect } from 'vitest';
import { decide } from './enforce-tdd-gate.mjs';
import { decide, turnTaskAgentIds, subagentTranscriptPaths } from './enforce-tdd-gate.mjs';
// Z Part 2 (2026-05-31): the tdd-gate must credit a subagent's test edit + RED
// when that subagent was spawned by a Task in the controller's current turn.
// Pairs with the transcript Write-hole closed in enforce-runtime-write-deny.mjs
// (Z Part 1) so the credited agent-<id>.jsonl cannot be forged.
describe('enforce-tdd-gate Z cross-actor (pairs with enforce-runtime-write-deny Part 1)', () => {
const subagentRedRun = [
{ message: { role: 'user', content: 'write the failing test for foo and confirm RED' } },
{ message: { role: 'assistant', content: [
{ type: 'tool_use', id: 's1', name: 'Write', input: { file_path: 'tools/foo.test.mjs' } },
{ type: 'tool_use', id: 's2', name: 'Bash', input: { command: 'npx vitest run tools/foo.test.mjs' } },
] } },
{ message: { role: 'user', content: [ { type: 'tool_result', tool_use_id: 's2', content: 'Tests 1 failed | 0 passed' } ] } },
];
it('credits a subagent test edit + RED for the controller prod edit', () => {
const r = decide({
toolName: 'Edit',
filePath: 'tools/foo.mjs',
transcriptEntries: [
{ message: { role: 'user', content: 'delegate the test, then I implement' } },
{ message: { role: 'assistant', content: [ { type: 'tool_use', id: 't1', name: 'Task', input: { subagent_type: 'tester' } } ] } },
{ message: { role: 'user', content: [ { type: 'tool_result', tool_use_id: 't1', content: 'done. agentId: a1234abcd' } ] } },
],
subagentEntriesList: [subagentRedRun],
});
expect(r.block).toBe(false);
});
it('still blocks when subagent edited a test but NO RED exists anywhere', () => {
const subNoRed = [
{ message: { role: 'user', content: 'write test' } },
{ message: { role: 'assistant', content: [ { type: 'tool_use', id: 's1', name: 'Write', input: { file_path: 'tools/foo.test.mjs' } } ] } },
];
const r = decide({
toolName: 'Edit', filePath: 'tools/foo.mjs',
transcriptEntries: [ { message: { role: 'user', content: 'go' } } ],
subagentEntriesList: [subNoRed],
});
expect(r.block).toBe(true);
expect(r.message).toMatch(/RED/);
});
it('preserves old behavior when no subagent entries (blocks without test)', () => {
const r = decide({
toolName: 'Edit', filePath: 'tools/foo.mjs',
transcriptEntries: [ { message: { role: 'user', content: 'go' } } ],
subagentEntriesList: [],
});
expect(r.block).toBe(true);
expect(r.message).toMatch(/without preceding test edit/);
});
it('turnTaskAgentIds extracts a hex agentId from an in-turn Task tool_result', () => {
const turn = [
{ message: { role: 'assistant', content: [ { type: 'tool_use', id: 't1', name: 'Task', input: {} } ] } },
{ message: { role: 'user', content: [ { type: 'tool_result', tool_use_id: 't1', content: 'ok agentId: a1b2c3d4e5' } ] } },
];
expect(turnTaskAgentIds(turn)).toContain('a1b2c3d4e5');
});
it('turnTaskAgentIds ignores non-Task results and rejects non-hex ids (no path traversal)', () => {
const turn = [
{ message: { role: 'assistant', content: [ { type: 'tool_use', id: 'b1', name: 'Bash', input: {} } ] } },
{ message: { role: 'user', content: [ { type: 'tool_result', tool_use_id: 'b1', content: 'agentId: ../../evil' } ] } },
];
expect(turnTaskAgentIds(turn)).toHaveLength(0);
});
it('subagentTranscriptPaths derives <dir>/<sessbase>/subagents/agent-<id>.jsonl', () => {
const paths = subagentTranscriptPaths('/p/projects/slug/sessUUID.jsonl', ['a1b2']);
expect(paths[0].split('\\').join('/')).toBe('/p/projects/slug/sessUUID/subagents/agent-a1b2.jsonl');
});
});
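The traversal guard rests on the capture group of `AGENT_ID_RE` only ever containing hex characters. A small experiment with the regex (copied from the diff) makes the edge cases visible. `extractAgentId` and `subagentPathSketch` are hypothetical helpers; the latter is a forward-slash-only simplification of `subagentTranscriptPaths`, which uses `node:path` in the real code.

```javascript
// Copied from enforce-tdd-gate in this diff.
const AGENT_ID_RE = /agentId:\s*([0-9a-f]+)/i;

// Hypothetical helper: first captured id, or null when nothing matches.
function extractAgentId(text) {
  const m = String(text || '').match(AGENT_ID_RE);
  return m ? m[1] : null;
}

// Hypothetical, forward-slash-only stand-in for subagentTranscriptPaths.
function subagentPathSketch(controllerTranscriptPath, agentId) {
  const sessionDir = controllerTranscriptPath.replace(/\.jsonl$/i, '');
  return `${sessionDir}/subagents/agent-${agentId}.jsonl`;
}

console.log(extractAgentId('done. agentId: a1234abcd'));  // a1234abcd
console.log(extractAgentId('agentId: ../../evil'));        // null
console.log(extractAgentId('agentId: abc123/../../x'));    // abc123
console.log(extractAgentId('agentId: evil'));              // e
console.log(subagentPathSketch('/p/projects/slug/sessUUID.jsonl', 'a1b2'));
// /p/projects/slug/sessUUID/subagents/agent-a1b2.jsonl
```

The third and fourth lines show that the pattern is not anchored at the end: it takes the leading hex run and ignores the rest. The derived path therefore stays inside the `subagents` directory, though it may point at a file that does not exist, which the caller treats as "no extra credit".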
function userMsg(text) {
return { message: { role: 'user', content: text } };
+1 -1
@@ -24,7 +24,7 @@ export function computeWorkspaceHash(workspacePath) {
return createHash('md5').update(String(workspacePath || ''), 'utf-8').digest('hex').slice(0, 12);
}
function isStale(record, now) {
export function isStale(record, now) {
if (!record || typeof record !== 'object') return true;
const ttl = typeof record.ttl_ms === 'number' ? record.ttl_ms : LOCK_DEFAULT_TTL_MS;
return now - (record.acquired_at || 0) > ttl;
+21
@@ -6,6 +6,7 @@ import {
release,
refresh,
computeWorkspaceHash,
isStale,
LOCK_DEFAULT_TTL_MS,
} from './parallel-session-lock.mjs';
@@ -91,6 +92,26 @@ describe('parallel-session-lock pure module (Stream H Task 7)', () => {
});
});
// isStale is exported (B, 2026-05-31) so the wrapper's prune step reuses the
// EXACT same staleness definition — single source of truth, no divergence that
// could ever prune a still-fresh (active) lock.
describe('isStale (exported for prune support)', () => {
it('true when now - acquired_at exceeds ttl_ms', () => {
expect(isStale({ acquired_at: 0, ttl_ms: 100 }, 1000)).toBe(true);
});
it('false when still within ttl (active lock — never pruned)', () => {
expect(isStale({ acquired_at: 900, ttl_ms: 1000 }, 1000)).toBe(false);
});
it('true for a malformed/missing record', () => {
expect(isStale(null, 1000)).toBe(true);
expect(isStale(undefined, 1000)).toBe(true);
});
it('uses the default TTL when ttl_ms is absent', () => {
expect(isStale({ acquired_at: 0 }, LOCK_DEFAULT_TTL_MS + 1)).toBe(true);
expect(isStale({ acquired_at: 0 }, LOCK_DEFAULT_TTL_MS - 1)).toBe(false);
});
});
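The staleness rule uses a strict `>` comparison, so a lock whose age equals its TTL exactly is still treated as active. The snippet below reproduces the function body from the diff under the name `isStaleCopy` to show that boundary. The value of `LOCK_DEFAULT_TTL_MS` is an assumption (300000 ms, matching the fallback used in the prune test helper); the real constant is not shown in this diff.

```javascript
const LOCK_DEFAULT_TTL_MS = 300000; // assumed value, for illustration only

// Body copied from the exported isStale in this diff.
function isStaleCopy(record, now) {
  if (!record || typeof record !== 'object') return true;
  const ttl = typeof record.ttl_ms === 'number' ? record.ttl_ms : LOCK_DEFAULT_TTL_MS;
  return now - (record.acquired_at || 0) > ttl;
}

const lock = { acquired_at: 0, ttl_ms: 100 };
console.log(isStaleCopy(lock, 99));   // false: age 99 < ttl 100
console.log(isStaleCopy(lock, 100));  // false: age equals ttl, strict ">" keeps it active
console.log(isStaleCopy(lock, 101));  // true:  age 101 > ttl 100
console.log(isStaleCopy(null, 0));    // true:  a missing record counts as stale
```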
describe('computeWorkspaceHash (Stream H Task 7)', () => {
it('returns 12 hex chars', () => {
const h = computeWorkspaceHash('/some/path');