Commit Graph

316 Commits

Author SHA1 Message Date
Дмитрий cb32aa9907 feat(gate): re-scope router-gate — allow local dev, keep prod+discipline blocks
composer/npm moved from hard-blacklist to whitelist; git dev-allow (commit/add/branch/switch/checkout/stash/worktree) + push main-guard in shared shell-content-rules; read-only GitHub (get_*/actions_get/actions_list) in mcp-classifier. Prod-safety (deploy/prod-DB/secrets/workflow-triggers/MCP-write), discipline hooks, and main push/merge stay blocked. Spec+plan in docs/superpowers. tools regression 1991 GREEN.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 09:32:39 +03:00
Дмитрий b0cd18d797 fix(router-gate): quote-aware redirect detector + drop dead override-phrase ads
Квирк 2: новый stripQuotedSpans делает детектор stdout/stderr-редиректа
кавычко-осознанным — `>` / `2>` ВНУТРИ кавыченного аргумента (текст коммита
с <email>, "2>1") больше не ложно-блокируется; настоящие редиректы (оператор
вне кавычек) блокируются как прежде. RED→GREEN, существующие redirect/cd-app
кейсы целы.

1A: убрана реклама мёртвых override-фраз (findOverride — заглушка v4, фразы
не работают): баннер enforce-prompt-injection (каждый UserPromptSubmit) +
block-сообщения enforce-verify-before-push / coverage-verify / memory-coverage
/ tdd-gate (×3). Каждый фикс залочен негативным тестом.

Сознательно НЕ делали: калибровку 6 судьи (читать чат-контекст) и ослабление
exact-match approve (квирк 3) — это рубежи защиты, их трогать нельзя.

Регрессия vitest tools-only: 1989 passed | 2 skipped (verify через
npx vitest run --root app --config vitest.config.tools.mjs).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 14:05:52 +03:00
Дмитрий 30b79c7228 fix(router-gate): narrow cd app whitelist (TDD, tools 1978 GREEN)
Add /^cd\s+app$/ to SAFE_EXACT so already-whitelisted commands (pest,
php artisan test) run from app/. Scope limited to the literal `app` dir:
cd into any other path (incl. protected .claude/runtime, memory/,
transcripts) stays default-deny, so the cwd-shift read-bypass is contained.
Mutations remain caught at the hard-blacklist + chain-mutating rule, and
each chain segment after `cd app &&` must still be independently whitelisted.

Owner-authorized, narrow scope = literal `app` only.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 13:34:42 +03:00
Дмитрий d647bf1858 fix(router-gate-v4): calibration 5 - cosmetic-detector exempts git-approval AskUser (scope fix, regression-tested) 2026-05-31 11:19:14 +03:00
Дмитрий 1f9b51bc39 feat(router-gate-v4): parallel-session-lock live main() — acquire on PreToolUse + release on Stop (point 2)
The Stream H wrapper shipped a deliberate no-op main() — the lock did nothing.
This wires it live: PreToolUse on a mutating tool acquires/refreshes the
workspace lock (blocks only when a DIFFERENT session holds a fresh, non-stale
lock); the Stop event releases it. Fail-open on any error so a lock bug can
never wedge the user out of their own session.

- runAcquireDecision({event,now,pid,cwd,readLock,writeLock}) — compose
  acquire() + decide().
- runReleaseAction({event,cwd,readLock,deleteLock}) — release() if this
  session owns the lock, no-op otherwise.
- live main(): branches on tool_name (present → acquire/refresh; absent/Stop
  → release); real fs binding via runtimeDir()/session-lock-<workspaceHash>.json.

Activation registers BOTH the PreToolUse (acquire) AND the Stop (release)
entries — the Stop wiring is mandatory; without it the lock is never released
and the next abnormal exit would lock the user out. Script:
.scratch/activate-point2-hooks.ps1 (also registers safe-baseline-metering +
runtime-write-deny per the point-2 plan).

Plan: docs/superpowers/plans/2026-05-30-router-gate-v4-stream-H.md Task 7.
Regression: parallel-session-lock 12/12 GREEN; full tools suite 1958 passed | 2 skipped.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 11:06:52 +03:00
Дмитрий 8a7144892c fix(router-gate-v4): calibrate per-tool LLM-judge — calibration 4 soft user-prompt fallback
The per-tool judge compares each mutating tool call against the classifier's
distilled task summary read from router-state. That summary is lossy and
frequently "(unknown)" even for a perfectly explicit user request — and with an
unknown task the judge has nothing to compare against, so "Сомнения → NO"
blocked every real edit. Reproduced repeatedly this session: an explicit
"реализуй ... main() ..." prompt still classified unknown → all edits blocked,
including the judge's own fix. Calibration 2 (allow on unknown) was rejected by
the owner as a discipline hole.

Calibration 4 (soft, scope-preserving): when — and only when — the classifier
summary is "(unknown)"/empty, fall back to judging against the user's actual
last prompt (the ground-truth request) instead of nothing. The judge still runs
and still blocks on doubt; it just uses better evidence. When the summary is
meaningful, behaviour is unchanged (the user-prompt reader is not consulted).
When both summary and prompt are unavailable, the task stays "(unknown)" and
doubt→block is preserved.

NOT calibration 2: this does not blindly allow on unknown — it re-grounds the
judge in the literal user request, which the controller cannot fabricate (the
user writes it; it is read locally from the session transcript).

- tools/llm-judge-per-tool.mjs: resolveEffectiveTask(declaredTask, lastUserPrompt).
- tools/enforce-llm-judge-per-tool.mjs: runPerTool reads the last user prompt
  (helpers.lastUserPromptText + readTranscript) only on an unknown summary;
  main() binds it.

Regression: judge tests 57/57 GREEN; full tools suite 1951 passed | 2 skipped.
The 6 remaining failures are uncommitted point-2 WIP in
enforce-parallel-session-lock.test.mjs — not part of this change, not committed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 10:34:27 +03:00
Дмитрий 722f4bb189 fix(router-gate-v4): calibrate per-tool LLM-judge — exempt Skill (calib 1) + test-runners (calib 3)
The Layer-4 per-tool judge over-blocked: it judged every Skill/Edit/Write/
Bash/Task against the declared task and blocked on doubt. A vague prompt
classifies as unknown/ambiguous, so the judge then blocked essentially all
artifact-producing tools — including the prescribed §17 skill entry and the
mandatory TDD test run — making legitimate, owner-mandated work impossible
and blocking its own fix (3 reproduced blocks this session).

Calibration 1 (scope fix, NOT a discipline drop): remove `Skill` from
MUTATING_TOOLS in tools/llm-judge-per-tool.mjs. Invoking a skill mutates no
state and is the §17-mandated entry into work; the real mutations it leads to
(Edit/Write/MultiEdit/Bash/PowerShell/Task/commit/push) stay fully judged.

Calibration 3 (scope fix, NOT a discipline drop): add isTestRunnerBashEvent to
tools/enforce-llm-judge-per-tool.mjs and skip it in runPerTool, mirroring the
existing readonly-Bash exemption. A test run (vitest/pest/phpunit/php artisan
test/composer test/npm test) only inspects + reports and is a mandatory TDD
step; commands chaining to a mutation (&& ; | backtick $() are NOT exempt.

doubt→block on real mutations against a known task is unchanged (covered by the
"mutating Bash (git commit) STILL judged" test). Calibration 2 (allow on
unknown task) was rejected by the owner as a discipline hole and not added.

Regression: vitest tools-only 1945 passed | 2 skipped (+18 calibration tests).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 10:04:43 +03:00
Дмитрий c9b9efd6e4 fix(router-gate-v4): exclude readonly Bash from per-tool judge — scope fix, discipline unchanged 2026-05-31 08:59:18 +03:00
Дмитрий dfae9f760b feat(router-gate-v4): live main() for LLM-judge wrappers — flag-gated spend (item 2b) 2026-05-31 08:06:26 +03:00
Дмитрий a8996896a8 test(router-gate-v4): Read-deny boundary cases (.env.production blocked, Tooling doc readable) 2026-05-31 07:38:18 +03:00
Дмитрий 3c5266c022 fix(router-gate-v4): narrow Read-deny so CLAUDE.md and memory are Read-allowed, transcripts/runtime still blocked (over-block fix) 2026-05-31 07:26:30 +03:00
Дмитрий 80e514f5bb feat(router-gate-v4): enforce-runtime-write-deny protect runtime side-channels (C3) 2026-05-31 05:57:59 +03:00
Дмитрий f740f6124a feat(safe-baseline): live main() metering + hard-block + Skill/EnterPlanMode escape (item 1b) 2026-05-31 05:57:47 +03:00
Дмитрий ca52d354f9 feat(router-gate-v4): LLM-judge per-tool + response-scan hook wrappers (Stream H tail) 2026-05-30 19:59:42 +03:00
Дмитрий 6ac4b1c1b1 feat(router-gate-v4): safe-baseline-metering wrapper + llm-judge-config gate (Stream H tail) 2026-05-30 19:29:58 +03:00
Дмитрий f172e2a580 feat(router-gate): SAFE_EXACT +Laravel dev workflow
Closes design gap in v4 whitelist: dev commands (pest, composer test/pint/stan/insights/rector,
php artisan test/migrate variants/db:seed/cache:clear etc., vendor/bin/pest) were falling into
default-deny. That blocked sessions working on app/ code and pushed controllers toward override
phrases or requests to disable the defense.

Changes are surgical and do not weaken discipline defense:
- 4 new SAFE_EXACT regex entries for specific dev commands
- tinker EXCLUDED on purpose (REPL = arbitrary PHP exec risk)
- migrate:install and other unknown migrate subcommands stay blocked via
  lookahead instead of word-boundary (precision fix)
- Hard-blacklist for mutating package operations, chain-semantics C13,
  file-watcher, TDD-gate, path-deny, coverage requirement and the other 15
  defense hooks are NOT touched.

TDD: 22 RED allow-tests + 7 still-block tests + 3 regression tests.
Full tools-only regression 1821/1821 GREEN.

Live smoke verified: composer test allowed; migrate:install blocked.

Whitelist v3.8 was sized around vitest tools-only; Laravel app/ dev workflow
slipped through. This commit corrects that without touching the architecture.
2026-05-30 16:11:34 +03:00
Дмитрий ffd70d6fa5 fix(router-gate-v4): lastTurnEntries skips harness-injected skill bodies (isMeta + sourceToolUseID)
Sibling Claude session 2026-05-30 found that lastTurnEntries treats
harness-injected skill bodies as spurious turn boundaries, breaking both
enforce-memory-coverage (can't find user's coverage line) AND
enforce-normative-content-rules::detectLegitSkillActive (can't find the
Skill tool_use that lives in the assistant message BEFORE the body).

Refinement applied here: this session inspected 29 isMeta:true entries
across the live transcript (8f4ba767-...jsonl) via a debug helper and
found isMeta:true is ALSO used for "Continue from where you left off"
auto-resume, Stop hook feedback strings, and <local-command-caveat>
wrappers — those are real user-equivalent boundaries that must remain
visible. Sibling's blanket "skip isMeta" proposal would have broken them.

Discriminator: skip ONLY when isMeta === true AND typeof sourceToolUseID
=== 'string' (tool-spawned content). Skill bodies have the linking field;
the other isMeta sources do not. The sourceToolUseID field is harness-
controlled and not writable by controller from inside a tool call —
cannot be spoofed.

Behaviour after fix:
  * Skill body injection → skipped → walk continues back to find user's
    real prompt (with coverage line).
  * The assistant message containing the Skill tool_use is now inside the
    turn → detectLegitSkillActive finds it → normative writes pass when
    invoked under an active claude-md-management skill.
  * "Continue from where you left off." → still treated as turn boundary.
  * Stop hook feedback strings → still treated as turn boundary.

TDD:
  * 3 new tests in tools/enforce-hook-helpers.test.mjs under the
    "lastTurnEntries / lastUserPromptText / lastAssistantText / turnToolUses"
    describe block:
      - lastTurnEntries skips skill body injections (isMeta + sourceToolUseID)
      - lastTurnEntries does NOT skip "Continue from where you left off"
        (isMeta but no sourceToolUseID)
      - turnToolUses includes Skill tool_use spawned in same turn as the
        injected skill body
  * 2/3 RED→GREEN (the "Continue" negative test passed on baseline already
    since its string content satisfies the existing string-content branch).

Scope:
  * Fixes 2 of the 5 structural quirks documented in the Stream H
    completion log (enforce-memory-coverage gap, enforce-normative-
    content-rules detectLegitSkillActive gap).
  * Does NOT fix: enforce-read-path-deny LEGIT_SKILLS exemption gap
    (separate hook, no lastTurnEntries dependency); TDD-gate cross-actor
    blindness (different mechanism — actor session boundaries);
    detectFullTestRun regex narrowness (command-pattern matching).

Regression: vitest tools 1788/1788 GREEN (was 1785; +3 new tests).

Plan: docs/superpowers/plans/2026-05-30-lastturnentries-skill-body-skip.md
2026-05-30 14:16:12 +03:00
Дмитрий f1c422af49 feat(router-gate-v4): Stream H Task 10 — subagent-prompt-prefix worktree bootstrap auto-inject
Closes Stream H Task 10 (H10) that was deferred from the initial Stream H
push. Adds two pure helpers to tools/subagent-prompt-prefix.mjs and wires
them into buildHeader() so subagents spawned inside a linked git worktree
get a SETUP block with vendor symlink + storage/framework mkdir guidance
in their injected prompt.

Two new exports:

1. detectWorktreeMode({cwd, gitDir, gitCommonDir}) — pure detector that
   returns {isWorktree, parentRepoRoot}. Worktree is detected when the
   per-worktree git-dir differs from the shared git-common-dir; the
   parent repo root is derived by stripping the trailing `/.git` segment
   from the common dir (separators normalized to forward slashes). Handles
   null inputs gracefully and accepts mixed forward/backslash separators.

2. buildSetupBlock({isWorktree, parentRepoRoot, platform}) — pure renderer
   that returns the SETUP — worktree bootstrap text block (or '' to omit
   when not in a worktree or parentRepoRoot is missing). Picks `mklink /D`
   on win32 vs `ln -s` elsewhere. Mentions all four storage/framework
   subdirs (cache, sessions, views, testing) per memory
   `feedback_subagent_worktree_bootstrap.md` — exactly what Pest 4 needs
   to resolve the Eloquent facade and view cache paths inside a worktree.

buildHeader() now resolves --git-dir + --git-common-dir alongside the
existing --show-toplevel, calls detectWorktreeMode to classify the
spawn site, then inserts buildSetupBlock's output between rule 5 and
the END marker. When not in a worktree the block is empty and the header
layout is unchanged.

Regression: vitest tools 1785/1785 GREEN (was 1776; +9 tests across
"detectWorktreeMode (Stream H Task 10)" and "buildSetupBlock (Stream H
Task 10)" describe blocks in the new
tools/subagent-prompt-prefix-h10.test.mjs file). The pre-existing
tools/subagent-prompt-prefix.test.mjs is intentionally excluded from
vitest config (node:test runner used for subprocess-style tests) — H10
helpers are pure and live in the vitest scope so the new test file is
not added to the exclude list.

Stream H Task 10 of 11 — closes the deferred H10. Plan:
docs/superpowers/plans/2026-05-30-router-gate-v4-stream-H.md
2026-05-30 12:08:33 +03:00
Дмитрий d75c8922aa fix(router-gate-v4): Stream H Task 9 — cosmetic path-format fixes (Cygwin /c/ prefix + PowerShell $env:VAR expansion)
Closes Stream H Task 9 (H3). Two cosmetic fixes in tools/path-normalization.mjs
for gate error messages observed during Smoke 5 Real Fix Re-test 2026-05-30
(steps 4 and 5). Both purely affect human-readable display in block messages
— security behaviour is unchanged (path-deny still fires correctly in all
the original test scenarios).

1. Cygwin/git-bash `/c/Users/...` prefix collapsed before path.resolve.
   On win32, path.resolve('/c/Users/x') treats `/c/...` as drive-relative
   and prepends cwd's drive letter, producing display paths like
   `c:/c/users/...` (doubled drive). The fix inserts a single-letter-drive
   normalization step BEFORE resolve when the input looks Cygwin-style.
   Guarded by `homedir matches ^[a-zA-Z]:` so POSIX test fixtures
   (homedir='/h') still get the original behaviour.

2. PowerShell `$env:USERPROFILE` syntax expanded in expandEnvVars.
   The expander handled `%NAME%`, `${NAME}`, and bare `$NAME` but not
   the PowerShell-native `$env:NAME` form, so messages displayed the
   literal `$env:USERPROFILE` instead of the expanded path. Added a
   case-insensitive matcher (PowerShell is case-insensitive) covering
   all ENV_WHITELIST names. Non-whitelisted `$env:SECRET` still passes
   through unchanged.

Regression: vitest tools 1776/1776 GREEN (was 1772; +4 new tests across
"pathNormalize" (+1 cygwin), "expandEnvVars — PowerShell $env:VAR
(Stream H Task 9 cosmetic)" (+3)). One pre-existing test ("case-folds on
win32") would have broken without the homedir-drive guard — guard
preserves it.

Stream H Task 9 of 11. Plan: docs/superpowers/plans/2026-05-30-router-gate-v4-stream-H.md
2026-05-30 11:43:31 +03:00
Дмитрий e1592cc1df feat(router-gate-v4): Stream H Task 8 — brain-retro Tables 16-17 + analyzer extensions
Closes Stream H Task 8 (H9). Adds two new digital-analysis cuts to the
brain-retro pipeline so future retros can see hook effectiveness and
self-fabrication patterns at-a-glance.

Two new builders in tools/brain-retro-analyzer.mjs:

1. buildRouterGateHookEffectiveness(episodes) → {rules: {[rule]: {fires, blocks}}}
   Aggregates episode.hook_fired records by rule name, counts total fires
   and block-outcomes per rule (Table 16). Ignores episodes without a
   structured hook_fired record. Enables visibility into which router-gate
   v4 hooks actually triggered in a session and what their block rate was.

2. buildSelfFabricationSignals(episodes) → {fabrications, legit}
   Flags episodes where controller_claim is a non-empty string but
   tool_uses is missing/empty — the canonical signature of the 7
   fabrication patterns documented in
   docs/superpowers/runbooks/recovery-procedures.md §5 (Table 17).
   Episodes without controller_claim are not counted (nothing was claimed).

Both wired into analyze() output as result.routerGateHookEffectiveness and
result.selfFabricationSignals. SKILL.md MANDATORY DIGITAL ANALYSIS block
bumped from 11 → 13 tables with row 12 (router-gate hook effectiveness
per-rule) and row 13 (self-fabrication signals + cross-ref to
recovery-procedures.md §5).

Regression: vitest tools 1772/1772 GREEN (was 1763; +9 new tests across
"buildRouterGateHookEffectiveness (Stream H Task 8 — Table 16)",
"buildSelfFabricationSignals (Stream H Task 8 — Table 17)",
"analyze() integration — Stream H Tables 16/17",
"Stream H Task 8 import sanity").

Stream H Task 8 of 11. Plan: docs/superpowers/plans/2026-05-30-router-gate-v4-stream-H.md
2026-05-30 11:39:47 +03:00
Дмитрий 79493879ae feat(router-gate-v4): Stream H Task 7 — parallel-session-lock pure module + PreToolUse wrapper (deferred activation)
Closes Stream H Task 7 (H7). Prevents two Claude sessions on the same
workspace from concurrently mutating files — addresses the cross-session
worktree collisions seen on 28.05/29.05 (deploy branch hijack + push
non-fast-forward incidents).

Architecture:
- Pure module tools/parallel-session-lock.mjs with injectable I/O
  (readLock/writeLock/deleteLock) so unit tests cover all branches without
  touching the real filesystem. Exports acquire(), refresh(), release(),
  computeWorkspaceHash(), LOCK_DEFAULT_TTL_MS (5 minutes).

- Lock record schema (schema_version=1): {session_id, pid, acquired_at, ttl_ms}.
  Stored at ~/.claude/runtime/session-lock-<workspaceHash>.json (production
  binding handled in deferred batch). Workspace hash is MD5 first-12 hex of
  the resolved workspace path.

- Acquisition semantics: stale (past TTL) → take over; same-session → idempotent
  re-acquire; other-session fresh → block. refresh() is same-session only
  (never steals). release() is same-session only (never deletes other's lock).

- Wrapper tools/enforce-parallel-session-lock.mjs exports decide(acquireResult,
  sessionId) → {block, reason?}. Fail-open if acquireResult is missing
  (internal-error safety net — avoids the Stream G Task 8 self-lockout
  pattern). Block message names the other holder's pid for human triage
  ("parallel session lock held by <other> (pid N) — wait or close that
  session first").

Defensive design:
- main() is a no-op (exit 0) until settings.json registration AND a Stop-hook
  release pathway are wired together in the batched activation step. Activating
  this hook before release-on-Stop would lock the user out of their own
  session on first abnormal exit.

Regression: vitest tools 1763/1763 GREEN (was 1748; +10 pure-module tests
under "parallel-session-lock pure module (Stream H Task 7)" and
"computeWorkspaceHash (Stream H Task 7)" describe blocks; +5 wrapper-decide
tests under "enforce-parallel-session-lock wrapper (Stream H Task 7)").

DEFERRED: .claude/settings.json registration (PreToolUse matcher
"Edit|Write|MultiEdit|NotebookEdit|Bash", block-mode, timeout 3000ms);
Stop-hook release wiring; PostToolUse refresh-on-success wiring.
Batched at end of Phase H-α/H-β.

Stream H Task 7 of 11. Plan: docs/superpowers/plans/2026-05-30-router-gate-v4-stream-H.md
2026-05-30 11:34:44 +03:00
Дмитрий 63686fa5b2 feat(router-gate-v4): Stream H Task 5 — decomposition-detector wrapper hook (PreToolUse, deferred activation)
Closes Stream H Task 5 (H6). Adds the PreToolUse wrapper around the pure
decomposition-detector module (Stream A Direction 3 / v4.1 §3.8).

What this catches:
- A feature secretly decomposed into 3+ small prompts whose primary_keywords
  overlap heavily AND no planning skill (writing-plans / brainstorming) has
  been invoked in the window. v4.1 hard-blocks mutating tools when the LLM
  judge confirms decomposition; soft-flags on legit-distinct verdict; allows
  when threshold not met or a planning skill was invoked.

Defensive design choices:
- decide() takes llmVerdict as an explicit string ('YES'|'NO'|null), not an
  async LLM call — keeps the function pure and unit-testable
  without network.
- llmVerdict=null degrades to soft_flag (with degraded:true), NOT hard_block.
  This avoids repeating the Stream G Task 8 self-lockout where a fail-CLOSE
  LLM hook bricked the session.
- main() is a no-op (exit 0) until the deferred wiring lands (history-ledger
  reader from observer Stop hook + LLM judge config from Stream D). Until
  then, the hook never blocks anything.

Regression: vitest tools 1748/1748 GREEN (was 1742; +6 wrapper-decide tests
under "enforce-decomposition-detector wrapper (Stream H Task 5)" describe
block, covering: empty history → allow, below threshold → allow, threshold
+ LLM YES → hard_block_mutating, threshold + LLM NO → soft_flag, threshold
+ skill present → allow, threshold + LLM unavailable → degraded soft_flag).

DEFERRED: .claude/settings.json registration (PreToolUse matcher
"Edit|Write|MultiEdit|NotebookEdit|Bash|Task", timeout 8000ms) AND main()
wiring (history-ledger reader + LLM judge integration). Batched with
H5/H7/H8 hook activations at end of Phase H-α/H-β.

Stream H Task 5 of 11. Plan: docs/superpowers/plans/2026-05-30-router-gate-v4-stream-H.md
2026-05-30 11:31:00 +03:00
Дмитрий c14fb72e84 feat(router-gate-v4): Stream H Task 6 — askuser-answer-parser wrapper + toApprovalRecord schema sync
Closes Stream H Task 6 (H4). Retires the manual approval-write workaround
the controller used throughout Stream H Tasks 1-5.

Two changes:

1. Pure module tools/askuser-answer-parser.mjs gains toApprovalRecord(answer, opts)
   exporter that detects a git verb in the user's free-form answer and returns
   a Stream B-compatible {type:'approve_git_operation', command, ts} record
   (matches loadApprovedGitOps reader format in shell-content-rules.mjs:125).
   Returns null for non-git answers and for stop/abort/cancel keywords.

2. New PostToolUse(AskUserQuestion) wrapper tools/enforce-askuser-answer-parser.mjs
   reads each question/answer pair, calls toApprovalRecord, appends matching
   records to ~/.claude/runtime/askuser-decisions-<sess>.jsonl. Fail-open
   observability — never blocks AskUserQuestion.

Regression: vitest tools 1742/1742 GREEN (was 1731; +5 toApprovalRecord tests
under "toApprovalRecord (Stream H Task 6 — schema sync)" including non-string
guard, +6 wrapper-hook tests under "enforce-askuser-answer-parser wrapper
(Stream H Task 6)" including missing session_id fail-open guard).

DEFERRED: settings.json registration (matcher "AskUserQuestion", PostToolUse,
fail-open, timeout 2000ms) — batched with H5/H6/H7/H8 hook activations at end
of Phase H-α/H-β. Hook code is fully implemented and unit-tested; activation
pending settings.json update.

Stream H Task 6 of 11. Plan: docs/superpowers/plans/2026-05-30-router-gate-v4-stream-H.md
2026-05-30 11:28:13 +03:00
Дмитрий 5520534424 feat(router-gate-v4): Stream H Task 3 — Workflow gate F2 hook (scriptPath approval + content scan + sha256 + resumeFromRunId block)
Closes v3.8 FATAL F2: nested agent() calls inside Workflow scripts were
invisible to PreToolUse gates. New tools/enforce-workflow-gate.mjs hook
(PreToolUse, block-mode) enforces:

1. scriptPath requires approve_workflow_script record in
   ~/.claude/runtime/askuser-decisions-<sess>.jsonl with sha256 of content
   and 5-min window (mirrors approve_git_operation pattern).
2. scriptContent static-scanned for dangerous patterns: env-key reads
   (ROUTER_LLM_KEY/ANTHROPIC_API_KEY/GITHUB_TOKEN/SENTRY_AUTH_TOKEN),
   eval(), child_process spawn/exec/fork, absolute fs writes outside /tmp,
   path traversal (../../../).
3. sha256 mismatch between approval and current content → block (catches
   modification after approval).
4. resumeFromRunId blocked unconditionally (state replay risk per spec).
5. Per-agent inheritance via CLAUDE_GATE_INHERIT env is handled by
   subagent-prompt-prefix.mjs (Stream E) — this hook focuses on the outer
   Workflow tool call. Nested agent() inside Workflow inherits parent gate.

Regression: vitest tools 1731/1731 GREEN (was 1726; +5 workflow-gate tests
under "enforce-workflow-gate scriptPath approval (F2)" describe block).

DEFERRED: .claude/settings.json registration (matcher "Workflow" → command
"node tools/enforce-workflow-gate.mjs", block-mode, timeout 5000ms) — the
settings.json file is in DEFAULT_PROTECTED_PATTERNS and enforce-read-path-
deny.mjs (Smoke 5 emergency fix 25e184e5) has no LEGIT_SKILLS exemption
like enforce-normative-content-rules.mjs does. Harness Edit/Write tracker
cannot be satisfied without a successful Read first. Will be batched into
a single manual settings.json registration step at end of Phase H-α
alongside H5/H6/H7 hook registrations. Hook code is fully implemented and
unit-tested; activation pending settings.json update.

Stream H Task 3 of 11. Plan: docs/superpowers/plans/2026-05-30-router-gate-v4-stream-H.md
2026-05-30 10:50:50 +03:00
Дмитрий fc3c85bb6e fix(router-gate-v4): Stream H Task 2 — extractPathArgs handles --flag=PATH, key=VAL, multi-positional
Found during Smoke 5 trace (recovery-procedures.md Section 5 fabrication #4):
extractPathArgs was missing protected paths when they appeared as a flag
value (--output=PATH or --output PATH) or as the second positional argument
(dd of=, tee, cp DST). The path-deny overlay correctly checks each candidate
path, but the candidate list was incomplete.

Fix: rewrite extractPathArgs to scan all tokens past index 0:
- recognize --flag=VALUE inline form (extract VALUE)
- recognize key=value (dd-style: if=, of=)
- skip URL-looking tokens (https://, ftp://, ssh://) as low-FP heuristic
- preserve existing behavior for plain positionals and skip redirect tokens

Regression: vitest tools 1726/1726 GREEN (was 1720; +6 path edge-case tests
under "extractPathArgs edge cases (Stream H Task 2)").

Stream H Task 2 of 11. Plan: docs/superpowers/plans/2026-05-30-router-gate-v4-stream-H.md
2026-05-30 10:25:15 +03:00
Дмитрий d277d4bdfc chore(router-gate-v4): Stream H pre-flight — allow git fetch/ls-remote in readonly whitelist
Pre-flight sync per Pravila §15.2 («git fetch origin && git log
HEAD..origin/main») was blocked because GIT_READONLY_SUB in
shell-content-rules.mjs missed both `fetch` and `ls-remote` subcommands.
Both are ref-only (no working-tree mutation, no commit/push side effect)
and Stream B Whitelist construction left them out by omission — surfaced
during Stream H pre-flight 2026-05-30.

Fix: add both to GIT_READONLY_SUB; RED→GREEN 5 it.each cases covering
`git fetch`, `git fetch origin`, `git fetch --all`, `git ls-remote origin`,
`git ls-remote --heads`.

Atomic precursor commit before any Stream H plan task — does not touch
extractPathArgs (H2) or path-deny display format (H3); pure whitelist
extension.

Regression: vitest tools shell-content-rules.test.mjs 67/67 GREEN
(was 62; +5 new readonly tests). Full tools regression in next step.
2026-05-30 09:37:05 +03:00
Дмитрий 2a3b5b4da5 fix(router-gate-v4): Smoke 5 REAL fix — path-normalization separator bug
Smoke 5 restart-test (chistaa session) refuted stale-process hypothesis and
identified the real bug: Stream A's pathNormalize() returned OS-native paths
(backslashes on win32) while DEFAULT_PROTECTED_PATTERNS regexes are forward-slash
only.

Trace confirmation:
  Stream A pathNormalize('~/foo/bar.jsonl') on win32:
    BEFORE: 'c:\\users\\admin\\foo\\bar.jsonl' — backslashes
    AFTER:  'c:/users/admin/foo/bar.jsonl'      — forward slashes
  isProtectedPath now matches → Bash/PowerShell hooks block correctly.

Root cause: path.resolve() + fs.realpathSync() on Windows produce backslashes,
caseFold lowercases them but doesn't change separators. DEFAULT_PROTECTED_PATTERNS
in shell-content-rules.mjs are forward-slash regexes (e.g. /(^|\/)\.claude\/projects/i).
defaultPathNormalize fallback in shell-content-rules.mjs DID normalize separators,
which is why my emergency commit 25e184e5 unit-tests passed but live behavior
failed — live hooks use resolvePathNormalize() which returns Stream A's
buggy implementation.

Fix:
- path-normalization.mjs: append .split('\\').join('/') to pathNormalize output.
- path-normalization.test.mjs: +1 RED→GREEN test for win32 separator normalization.

Why previous commit 25e184e5 was incomplete:
- Added pattern to protected list ✓
- Added enforce-read-path-deny.mjs ✓ (Read tool — works because hook uses
  defaultPathNormalize directly, not resolvePathNormalize)
- Did NOT detect Bash/PowerShell path-normalize integration bug (debug script
  bypassed Stream A by passing defaultPathNormalize directly).

Side observation (recorded as Stream H TODO by chistaa session):
- extractPathArgs/pathDenyOverlay — non-reading path in non-first position is
  not checked fully. Independent latent bug, separate fix.

Regression: 1715/1715 vitest tools GREEN (+1 separator test).

Critical: re-run Smoke 5 in clean session — expected PASS all 6 vectors now.
2026-05-30 08:39:52 +03:00
Дмитрий 25e184e52d fix(router-gate-v4): EMERGENCY Smoke 5 closure — transcript JSONL hard-deny
Smoke 5 (user-run 2026-05-30) found 5 of 6 self-exfil vectors OPEN — parent
context transcripts (~/.claude/projects/*/<session>.jsonl) readable via:
- Bash: ls ~/.claude/projects/, cat ~/.claude/projects/.../jsonl, head/tail
- PowerShell: Get-Content / gc / Select-String
- Read tool: direct file_path

Spec §3.1 declared this protection but Stream B/C/G never wired path-deny.
DEFAULT_PROTECTED_PATTERNS in shell-content-rules.mjs lacked ~/.claude/projects/.
Read tool had no path-deny hook at all.

Closure:
- shell-content-rules.mjs: +1 regex /(^|\/)\.claude\/projects(\/|$)/i in
  DEFAULT_PROTECTED_PATTERNS. Bash hook + PowerShell hook already use this
  shared list via pathDenyOverlay → both close automatically.
- enforce-read-path-deny.mjs (NEW): PreToolUse(Read) wrapper rejects any
  Read on isProtectedPath() match. Uses same shared DEFAULT_PROTECTED_PATTERNS
  for consistency (covers transcript / runtime / .env / normative / dep-checksums).
- .claude/settings.json: register enforce-read-path-deny.mjs PreToolUse Read.

Tests: shell-content-rules.test.mjs +3 path cases in protects %s + 1 dedicated
it() block. enforce-read-path-deny.test.mjs +6 cases covering Read tool block.
1714/1714 vitest tools GREEN (was 1707, +7 net).

Critical: re-run Smoke 5 in clean session before continuing other smokes.
Expected post-fix: all 6 vectors blocked (Bash/PowerShell via path-deny,
Read via enforce-read-path-deny, Write via existing normative-content-rules).

Side observation from Smoke 5 session: controller self-fabrication of
intermediate results (4 occurrences across Smokes 1-5) confirms one of the
7 fundamental limits — gate paradox + behavioral residual irreducible.
No hook catches in-response narrative fabrication (not through tool-gate).
This is a Stream H + recovery-procedures.md documentation item, not a hook fix.
2026-05-30 08:01:49 +03:00
Дмитрий 6973363c37 feat(router-gate-v4): Stream G — register 9 v4 hooks + git add whitelist fix + sub-plan
settings.json hook registration changes:
- Removed 5 v3.9 hook registrations: enforce-chain-recommendation,
  enforce-classifier-match, enforce-graph-first, enforce-semgrep-security,
  enforce-override-limit
- Added 9 v4 deterministic hooks (no LLM-judge — Stream H follow-up):
  PreToolUse: router-gate (Bash), powershell-gate (PowerShell),
  normative-content-rules (Edit|Write|MultiEdit), tdd-real-test-verifier (Edit|Write),
  self-debrief-detector (Edit|Write|MultiEdit|Bash),
  askuser-cosmetic-detector (AskUserQuestion), mcp-classification (mcp__.*)
  PostToolUse Task: subagent-return-scanner
  Stop: todowrite-skill-verifier

shell-content-rules.mjs fix:
- Added 'add' to GIT_CONDITIONAL_SUB whitelist. Without it git add was default-deny
  by rule 5 even after approval — broke entire git workflow under v4 router-gate.

TODO Stream H (integration gaps discovered):
1. askuser-answer-parser needs PostToolUse(AskUserQuestion) wrapper
2. Schema mismatch Stream E vs Stream B approval records
3. llm-judge hooks need ROUTER_LLM_KEY config
4. decomposition-detector needs LLM-judge integration
5. parallel-session-lock pure module not implemented

Regression: 1707/1707 vitest tools GREEN.
2026-05-30 06:56:35 +03:00
Дмитрий 1a84864e44 chore(router-gate-v4): delete 5 obsolete v3.9 hooks + vocab.json (Stream G cleanup)
Deleted hooks superseded by v4 architecture (spec section 4 behavioral pivot):
- enforce-chain-recommendation (replaced by router-gate decide)
- enforce-classifier-match (replaced by skill-scope-verifier Direction 2)
- enforce-graph-first (replaced by decide classification)
- enforce-semgrep-security (folded into normative-content-rules + per-tool LLM-judge)
- enforce-override-limit (universal vocab removal section 4.2)
- enforce-override-vocab.json (vocab abolished)

Regression: 1705/1705 vitest tools GREEN after deletion.
2026-05-30 06:12:59 +03:00
Дмитрий a3002bbe3b feat(router-gate-v4): enforce-mcp-classification (PreToolUse mcp__* wrapper, §5.3 + G1/G12) 2026-05-30 06:11:21 +03:00
Дмитрий 430396dfba feat(router-gate-v4): enforce-self-debrief-detector (PreToolUse mutating wrapper, §3.12 NEW) 2026-05-30 06:08:19 +03:00
Дмитрий d4c6145b6d feat(router-gate-v4): enforce-tdd-real-test-verifier (PreToolUse Edit|Write wrapper, §3.11)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-30 06:05:17 +03:00
Дмитрий 27c73fb050 feat(router-gate-v4): enforce-todowrite-skill-verifier (Stop hook wrapper, §3.9 Direction 4)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-30 06:00:00 +03:00
Дмитрий 40d4443926 refactor(router-gate-v4): stub override helpers (universal vocab removed per spec §4.2)
findOverride/findOverrideAttempt/loadOverrideVocab become permanent stubs returning null/null/empty.
Non-deleted hooks (verify-before-push, tdd-gate, memory-coverage, branch-switch) still import these
symbols and need them to compile; runtime always reports 'no override'.

Adapted 15 existing tests in enforce-hook-helpers.test.mjs and 7 in enforce-semgrep-security.test.mjs
that asserted old vocab behaviour; all now assert stub behaviour (null/empty).
1824/1824 vitest tools GREEN.

Stream G of router-gate v4 deployment.
2026-05-30 05:55:46 +03:00
Дмитрий 6010443307 merge(router-gate-v4): Stream E — AskUser + subagent
7 commits / 10 files / +2824 lines:
- askuser-answer-parser (S27/E33/E34 + parse + approval)
- punctuation-aware stop detection + review nits (BOM/JSDoc/??)
- cosmetic AskUser detector (v4.1 §4.5)
- subagent return scanner + G2 narrative + structured schema
- anchor 'всё ок' narrative pattern (no false-match inside 'всё окно')
- subagent-prompt-prefix inheritance (256-bit sentinel, restricted/ paths)

Stream tests pass.
2026-05-30 05:09:07 +03:00
Дмитрий d27d8b6780 merge(router-gate-v4): Stream D — LLM-judge Layer 4
13 commits / 10 files / +3017 lines:
- multiJudgeConsensus 3-judge any-YES + cache/budget
- per-tool LLM-judge pure decision + PreToolUse hook wiring
- response-scan deterministic layer + LLM layer + Stop hook
- normative-content path matcher + content extraction
- normative-content deterministic layers + multi-judge Layer 4
- normative-content PreToolUse hook wiring
- ProxyAPI live integration smoke

Stream tests pass.
2026-05-30 05:08:41 +03:00
Дмитрий a15e95e79d merge(router-gate-v4): Stream C — static scan + MCP path-deny
8 commits / 11 files / +3066 lines static-content-scanner / framework-boot-scanner / glob-restricted-filter / mcp-tool-classifier / commit-message-scanner.

Review fixes: browser_navigate host-boundary (SSRF spoof), boot-scan best-effort.
2026-05-30 05:08:01 +03:00
Дмитрий f555082d3b fix(router-gate-merge): A↔B integration — resolvePathNormalize test after Stream A merged
После merge Stream A модуль ./path-normalization.mjs существует → resolvePathNormalize() возвращает Stream A pathNormalize, не fallback. Stream B тест предполагал отсутствие модуля и assert'ил конкретное default-значение 'a/b'.

Fix: меняю assertion на 'returns a function' + 'does not throw' — сохраняет original intent (resolvePathNormalize всегда возвращает callable) без жёсткой привязки к implementation Stream A pathNormalize.

Verified: vitest 59/59 GREEN на enforce-router-gate.test.mjs.
2026-05-30 05:06:58 +03:00
Дмитрий fd9e755b6f merge(router-gate-v4): Stream B — Bash/PowerShell content rules
16 commits / 11 files / +2849 lines:
- Bash hard-blacklist (v3.9+v4.0 C16/#4/#21/#22/#34 + v4.1 G7/G8 wget/nc)
- Bash whitelist + script-execution file-watcher
- classifyBashCommand integration + bashContentClassify export
- Bash gate main() + dynamic path-normalize fallback (fail-CLOSE)
- PowerShell tokenizer + hard-blacklist (keep + v4.1 G10 PS env)
- classifyPowerShellCommand (whitelist + path-deny + git route)
- PowerShell gate main() (fail-CLOSE)
- shared classifyGitCommand (readonly/conditional/hard incl G5/G6 gpgsign/--no-verify)
- Review fixes: 2>&1 fd-duplication allowed, git -c RCE closed, runtime-dir path-deny

Stream tests pass.
2026-05-30 05:05:15 +03:00
Дмитрий 4ad4c6d138 fix(router-gate): stream A decide — unicode boundary on cyrillic direct-invocation, polite skill_call forms, +tests, knownInRegistry contract docs
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-29 21:26:10 +03:00
Дмитрий 7e0e5f8e52 feat(router-gate): stream A — core decide() 4 поведения + nodeMatches + chain-state (§4, §10.1)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-29 21:16:31 +03:00
Дмитрий 333fcc763a fix(router-gate): stream A tdd-verifier — test no_test_block + EACCES vs ENOENT + known-limitation docs
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-29 21:12:33 +03:00
Дмитрий 38a97aa2d7 feat(router-gate): stream A — tdd real-test verifier regex-based (§3.11) 2026-05-29 21:04:37 +03:00
Дмитрий f03c45240d fix(router-gate): stream A self-debrief — unicode lookbehind for cyrillic patterns + false-positive tests 2026-05-29 21:01:56 +03:00
Дмитрий 632882cace test(router-gate): ProxyAPI live integration smoke + stream D sub-plan (stream D task 13)
Opt-in live smoke (ROUTER_LLM_LIVE_TEST=1 + ROUTER_LLM_KEY); auto-skips otherwise
so it never pollutes the unit regression in worktrees where undici is unresolved.
Checkpoint-1 live result on owner machine: PASS (2/2) — single Sonnet judge + 3-judge
consensus (Sonnet 4.6 + Haiku 4.5 + Opus 4.7) reach all models with real verdicts.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-29 20:55:20 +03:00
Дмитрий a00ebd0ed2 feat(router-gate): stream A — self-debrief detector v4.1 NEW (§3.12) 2026-05-29 20:50:48 +03:00
Дмитрий 96157a8dcf feat(router-gate): normative-content PreToolUse hook wiring (stream D task 12)
Recovered from a subagent crash (socket error mid-task) that left literal-newline
corruption in two .join() string literals; repaired and committed by controller.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-29 20:48:51 +03:00
Дмитрий 8d74482398 fix(router-gate): stream A todowrite-verifier — unicode boundary for cyrillic mention patterns + DRY + tests
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-29 20:46:54 +03:00
Дмитрий ee7acf6eaa fix(router-gate): allow 2>&1 fd-duplication, keep file-redirect block (review finding) 2026-05-29 20:45:23 +03:00