Commit Graph

831 Commits

Author SHA1 Message Date
Дмитрий e56ddd6a1b fix(router-gate): coverage line honors cross-turn active skill (verify + remind)
Backlog item G. The `coverage:` line under-reported a skill chosen in a PRIOR turn:
enforce-coverage-verify credited channel=skill only if the Skill tool ran in the
CURRENT turn, so an honest `skill:X` continuation line was BLOCKED -> the controller
learned to under-report as direct/chain. Two-sided systemic fix, no weakening:

- enforce-coverage-verify: decide() also accepts skill:X when X was invoked anywhere
  earlier in THIS session (new priorSkillNames param; main() collects them via
  sessionToolUses). Still unforgeable -- a real Skill tool_use must exist in the
  transcript. The only residual is possibly-stale attribution, far better than the
  forced dishonest direct-reporting it replaces.
- enforce-prompt-injection: the §17 reminder now lists active skills carried over
  from earlier turns (read from the transcript) and tells the controller to report
  `coverage: skill:<name>` when work continues under one -- the proactive half, so
  the correct line is not merely allowed but prompted.

TDD: RED -> GREEN per behavior. tools-vitest 2032 passed / 2 skipped.
Plan docs/superpowers/plans/2026-05-31-discipline-guard-backlog.md (item G).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 18:37:44 +03:00
Дмитрий 53407a77cd feat(router-gate): tdd-gate credits delegated (subagent) TDD + transcript write-deny
Closes the TDD-gate cross-actor gap: when a subagent (spawned by a Task in the
controller's current turn) writes the failing test and confirms RED, the
controller's subsequent production edit was falsely blocked because the gate only
scanned the controller's own turn. Net strengthening, no discipline weakened.

- Part 1 (enforce-runtime-write-deny): block the Write tool from any
  ~/.claude/projects/**/*.jsonl (session/subagent transcripts). Memory *.md there
  stays writable (never matches .jsonl$). Resolving normalizer defeats ./.. evasion.
  This makes the agent-<id>.jsonl that Part 2 trusts unforgeable (it was the last
  ungated write channel; Bash/PowerShell/Read gates already covered it).
- Part 2 (enforce-tdd-gate): decide() also credits a subagent's matching test edit
  + RED via a new subagentEntriesList. turnTaskAgentIds() reads the hex agentId from
  the harness-written Task tool_result (the controller cannot forge its own
  tool_result; hex-only match blocks "agentId: ../../x" path traversal).
  subagentTranscriptPaths() derives <dir>/<controller-session>/subagents/agent-<id>.jsonl.
  main() reads them best-effort (missing/unreadable -> no extra credit = stricter).

No new weakening: a delegated subagent doing real TDD is legitimate; the only
forgery vector (overwriting the agent jsonl) is closed by Part 1. Existing
controller-turn behaviour is preserved (empty subagent list == old logic).

OWNER (settings.json, Claude can't edit it): enforce-tdd-gate is already a
registered PreToolUse hook -> Part 2 goes live on merge. enforce-runtime-write-deny
must be registered on PreToolUse(Edit|Write|MultiEdit|NotebookEdit) for Part 1 to be live.

TDD: RED -> GREEN per behavior. tools-vitest 2027 passed / 2 skipped.
Backlog item C (=Z); plan docs/superpowers/plans/2026-05-31-discipline-guard-backlog.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 18:18:44 +03:00
Дмитрий 6577c04a1f fix(router-gate): session-lock hygiene — clearer block message + stale-lock prune
Closes the remaining parallel-session-lock remarks on top of the keying fix
(7a469dc9), with NO weakening of same-worktree serialization:

- D: the block message now identifies the holder by its STABLE session_id and
  marks the recorded pid as transient ("may change between attempts"). Chasing
  the pid is what led to closing the wrong session. Decision logic is unchanged
  (text only) — existing /pid N/ triage assertion still holds.
- B: pruneStaleLocks() best-effort deletes leaked lock files that are ALREADY
  stale by the shared isStale() definition (now exported from the pure module —
  single source of truth). Active within-TTL locks are never touched, so the
  serialization guarantee is not weakened. Wired into the PreToolUse branch of
  main(), wrapped so hygiene can never break the gate (fail-open).
- C (no code): release-on-SessionEnd needs only a settings.json registration
  (owner action) — the existing !tool_name branch already releases. Documented
  in the plan. Until then, leaked locks self-heal via B + the 5-min TTL takeover.

TDD: RED -> GREEN per behavior. tools-vitest 2014 passed / 2 skipped.
Backlog items B/C/D; plan docs/superpowers/plans/2026-05-31-discipline-guard-backlog.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 17:43:03 +03:00
Дмитрий 7a469dc913 fix(router-gate): key session-lock by session work-tree root, not hook cwd
enforce-parallel-session-lock keyed the lock on the hook's process.cwd(),
which collapses to the main repo dir after a session resume — so sessions in
DIFFERENT git worktrees shared one lock and false-blocked each other (observed:
a brainrepo-worktree session blocked launching agents by a discipline-guard
session). New resolveWorkspacePath() keys on the session's stable cwd
(event.cwd) resolved to the git work-tree root (git -C <cwd> rev-parse
--show-toplevel), with fallback to process.cwd() so behaviour never regresses
when event.cwd is absent. Same-worktree concurrency stays serialized
(unchanged) — discipline not weakened; only cross-worktree false-blocks fixed.

TDD: RED (5 resolveWorkspacePath cases) -> GREEN -> tools-vitest 2003 passed /
2 skipped. Backlog item F; plan
docs/superpowers/plans/2026-05-31-discipline-guard-backlog.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 17:02:32 +03:00
Дмитрий be4e1a6123 feat(router-gate): whitelist npm ci in SAFE_EXACT (worktree dep restore)
`npm ci` does a clean install strictly from the committed lockfile
(deterministic, no version drift) — needed to restore junction node_modules
in a fresh worktree. Distinct from `npm install`/`npm i`, which stay
hard-blacklisted because they can pull new/updated versions; the blacklist
runs before the whitelist, so they remain blocked. Word boundary after `ci`
prevents `npm cider`-style prefix matches; chain semantics still block
`npm ci && <mutating>`.

TDD: RED (3 allow-cases failed default-deny) -> GREEN (/^npm\s+ci\b/) ->
tools-vitest 1998 passed / 2 skipped (2000). Backlog item A; plan
docs/superpowers/plans/2026-05-31-discipline-guard-backlog.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 14:46:58 +03:00
Дмитрий f6421fd61c docs(router-gate-v4): calibration 5 plan - cosmetic-detector git-approval exemption 2026-05-31 11:39:20 +03:00
Дмитрий 417cfcbc37 docs(router-gate-v4): CLAUDE.md v2.44 — item 2b judge live + activated + readonly calibration 2026-05-31 09:04:09 +03:00
Дмитрий dfae9f760b feat(router-gate-v4): live main() for LLM-judge wrappers — flag-gated spend (item 2b) 2026-05-31 08:06:26 +03:00
Дмитрий 9280c48025 docs(router-gate-v4): remaining-holes checklist update + CLAUDE.md insertion draft (item 1b tails) 2026-05-31 07:04:27 +03:00
Дмитрий 84dcf4aab3 docs(router-gate-v4): safe-baseline spec v4 + plan + handoff (item 1b) 2026-05-31 05:58:13 +03:00
Дмитрий c86fdfc9eb docs(router-gate-v4): safe-baseline spec v3 — fold 2nd adversarial review (V2-1/V2-2/V2-4) (item 1b) 2026-05-30 20:44:26 +03:00
Дмитрий 9f84d9ef09 docs(router-gate-v4): safe-baseline spec v2 — close C1/C2/C3/H1 from adversarial review (item 1b) 2026-05-30 20:31:23 +03:00
Дмитрий 6d512f5cf3 docs(router-gate-v4): safe-baseline live-wiring design spec (item 1b) 2026-05-30 20:12:39 +03:00
Дмитрий c805988085 docs(observer): router-gate v4 remaining-holes checklist (Stream H follow-up) 2026-05-30 19:38:51 +03:00
Дмитрий 6ac4b1c1b1 feat(router-gate-v4): safe-baseline-metering wrapper + llm-judge-config gate (Stream H tail) 2026-05-30 19:29:58 +03:00
Дмитрий 4686b36571 docs(region): lead-region-resolution spec v0.5 + 6-session plan 2026-05-30 15:38:54 +03:00
Дмитрий ffd70d6fa5 fix(router-gate-v4): lastTurnEntries skips harness-injected skill bodies (isMeta + sourceToolUseID)
Sibling Claude session 2026-05-30 found that lastTurnEntries treats
harness-injected skill bodies as spurious turn boundaries, breaking both
enforce-memory-coverage (can't find user's coverage line) AND
enforce-normative-content-rules::detectLegitSkillActive (can't find the
Skill tool_use that lives in the assistant message BEFORE the body).

Refinement applied here: this session inspected 29 isMeta:true entries
across the live transcript (8f4ba767-...jsonl) via a debug helper and
found isMeta:true is ALSO used for "Continue from where you left off"
auto-resume, Stop hook feedback strings, and <local-command-caveat>
wrappers — those are real user-equivalent boundaries that must remain
visible. Sibling's blanket "skip isMeta" proposal would have broken them.

Discriminator: skip ONLY when isMeta === true AND typeof sourceToolUseID
=== 'string' (tool-spawned content). Skill bodies have the linking field;
the other isMeta sources do not. The sourceToolUseID field is harness-
controlled and not writable by controller from inside a tool call —
cannot be spoofed.

Behaviour after fix:
  * Skill body injection → skipped → walk continues back to find user's
    real prompt (with coverage line).
  * The assistant message containing the Skill tool_use is now inside the
    turn → detectLegitSkillActive finds it → normative writes pass when
    invoked under an active claude-md-management skill.
  * "Continue from where you left off." → still treated as turn boundary.
  * Stop hook feedback strings → still treated as turn boundary.

TDD:
  * 3 new tests in tools/enforce-hook-helpers.test.mjs under the
    "lastTurnEntries / lastUserPromptText / lastAssistantText / turnToolUses"
    describe block:
      - lastTurnEntries skips skill body injections (isMeta + sourceToolUseID)
      - lastTurnEntries does NOT skip "Continue from where you left off"
        (isMeta but no sourceToolUseID)
      - turnToolUses includes Skill tool_use spawned in same turn as the
        injected skill body
  * 2/3 RED→GREEN (the "Continue" negative test passed on baseline already
    since its string content satisfies the existing string-content branch).

Scope:
  * Fixes 2 of the 5 structural quirks documented in the Stream H
    completion log (enforce-memory-coverage gap, enforce-normative-
    content-rules detectLegitSkillActive gap).
  * Does NOT fix: enforce-read-path-deny LEGIT_SKILLS exemption gap
    (separate hook, no lastTurnEntries dependency); TDD-gate cross-actor
    blindness (different mechanism — actor session boundaries);
    detectFullTestRun regex narrowness (command-pattern matching).

Regression: vitest tools 1788/1788 GREEN (was 1785; +3 new tests).

Plan: docs/superpowers/plans/2026-05-30-lastturnentries-skill-body-skip.md
2026-05-30 14:16:12 +03:00
Дмитрий 612b3a3382 docs(router-gate-v4): Stream H final — Layer 4 LLM-judge verified live via integration smoke
Closes Stream H completely. Appends a "Final activation — Layer 4 verified
live" section to the completion log documenting:

- User completed Action 2 (.claude/settings.json batch replacement) via
  .scratch/activate-stream-h.ps1 on 2026-05-30 ~12:38 МСК. Backup at
  .claude/settings.json.backup-20260530-123741. 7 new hook entries appended.

- User completed Action 1 (keytar install + ROUTER_LLM_KEY in user env)
  with --legacy-peer-deps to resolve the histoire/vite peer conflict
  (memory quirk 74). ROUTER_LLM_KEY (35 chars) exported user-level. Base
  URL left at Anthropic default — no ProxyAPI middleware.

- Live verification via .scratch/verify-layer-4.ps1 → both opt-in
  integration tests under ROUTER_LLM_LIVE_TEST=1 PASS on real API calls:
    * single Sonnet judge returns a parseable YES/NO — 1950 ms
    * 3-judge consensus reaches all three models with real (non-null)
      verdicts — 2021 ms (Sonnet 4.6 + Haiku 4.5 + Opus 4.7 each returned
      a real YES/NO; no fallback to doubt)
  Total duration 4.54 s. 4 real API calls. Cost ~$0.01-0.05.

Layer 4 LLM-judge now active on live traffic. Router-gate v4 reaches the
master-plan target ~0.5-0.8% bypass rate. Architectural floor ~0.5%
irreducible per the 7 fundamental limits documented in memory
`feedback_asymptote_floor_irreducible.md`.

Carry-over: PowerShell 5.1 mojibake on em-dashes inside .scratch/ helper
scripts is cosmetic only; affects the final summary banner, not the
verification itself. Non-blocking.

Docs-only change; covered by docs-only short-circuit in
enforce-verify-before-push (§5 п.13 CLAUDE.md).

Stream H closed. No further follow-ups required.
2026-05-30 13:30:34 +03:00
Дмитрий 0ff2053ae0 docs(router-gate-v4): Stream H Task 11 — completion log with deferred batch actions for user
Closes Stream H. Adds the canonical completion artifact at
docs/observer/notes/2026-05-30-stream-h-completion.md documenting:

- All 10 commits landed in this Stream H push (2a3b5b4d..d75c8922 main).
- Per-task summary linking each H<N> to its commit SHA + 1-line rationale.
- Two manual actions the user needs to perform outside Claude to activate
  the new hooks: (1) npm install keytar + store ROUTER_LLM_KEY in keychain,
  (2) append 7 hook entries to .claude/settings.json (verbatim JSON
  provided). Both are blocked from in-Claude execution by structural
  router-gate hooks (read-path-deny on settings.json without LEGIT_SKILLS
  exemption; npm install in router-gate hard-blacklist).
- 5 defects/quirks discovered during execution with follow-up direction
  (read-path-deny skill exemption gap, TDD-gate cross-actor blindness,
  detectFullTestRun regex narrowness, findOverride stub, subagent vitest
  output misread).
- 5 intentional deferrals listed (H10 worktree bootstrap; full LLM-judge
  activation pending Action 1; Smoke 8 live test pending Action 2; no
  normative bump because Stream H is infrastructure not Tooling-canon;
  worktree cleanup conditional on local presence).
- Cumulative state after Stream H: 1776/1776 vitest tools GREEN, 6 hooks
  ready to activate, 2 brain-retro analyzer extensions live, recovery
  runbook published with 7 fabrication patterns.

Docs-only change; covered by docs-only short-circuit in
enforce-verify-before-push (§5 п.13 CLAUDE.md).

Stream H Task 11 of 11 — final consolidation.
2026-05-30 11:46:32 +03:00
Дмитрий cebd6bcebb docs(router-gate-v4): Stream H Task 1 fix — correct module references in recovery-procedures.md (code-quality review)
Code-quality reviewer flagged 2 IMPORTANT factual inaccuracies in
recovery-procedures.md (commit 3ce73a68):

1. Section 6 RECOMMENDED code example imported resolvePathNormalize from
   the wrong module path (tools/shell-content-rules.mjs). Actual exporter
   is tools/enforce-router-gate.mjs (verified via Grep at line 174).
   shell-content-rules.mjs only exports defaultPathNormalize. A future
   reader copying the RECOMMENDED pattern would get an import error.
   Also corrected the call signature: resolvePathNormalize() takes no
   arguments and is async — returns the normalize function directly.

2. Section 4 (Stale-process) cited tools/enforce-bash-content-gate.mjs —
   no such file exists in tools/ (verified via Glob). Correct hook
   filenames are enforce-router-gate.mjs (Bash) and
   enforce-powershell-gate.mjs (PowerShell).

Fix: replace both module references with the verified correct filenames
(Grep'd against tools/ exports + Glob'd file existence). Also includes
the lefthook MD032 blank-lines-around-lists auto-format diff carried
over from the previous commit's post-commit hook.

Surgical edit — no new content, no restructuring.
2026-05-30 10:13:16 +03:00
Дмитрий 3ce73a68ff docs(router-gate-v4): Stream H Task 1 — recovery-procedures.md (3 levels + stale-process + 7 fabrications + test methodology + smoke methodology)
Adds first-time recovery runbook with:
- 3 self-recovery levels (Level 1 ≤5min sentinel reset, Level 2 ≤15min VS Code
  restart, Level 3 destructive workspace rebuild)
- Stale-process / hook reload trap (Smoke 5 chistaa-session hypothesis +
  refutation method); key takeaway: live restart-test is the only way to
  confirm a hook-modifying fix landed
- Self-fabrication patterns — 7 cases enumerated from Smokes 3/4/5/7 with
  pattern signature, detection signal, mitigation for each
- Test methodology lesson — Smoke 5 root cause showed unit tests with inline
  mocks can give false-green if they bypass the live resolver function; debug
  scripts have the same trap
- Smoke methodology — statusline-setup system prompt overrides user tasks
  (Smoke 9 Run 1); use semgrep-scanner for echo-probes, statusline-setup OK
  for gate-inheritance smokes

Docs-only change; verified via docs-only short-circuit in enforce-verify-
before-push (§5 п.13 CLAUDE.md).

Stream H Task 1 of 11. Plan: docs/superpowers/plans/2026-05-30-router-gate-v4-stream-H.md
2026-05-30 09:58:38 +03:00
Дмитрий 6973363c37 feat(router-gate-v4): Stream G — register 9 v4 hooks + git add whitelist fix + sub-plan
settings.json hook registration changes:
- Removed 5 v3.9 hook registrations: enforce-chain-recommendation,
  enforce-classifier-match, enforce-graph-first, enforce-semgrep-security,
  enforce-override-limit
- Added 9 v4 deterministic hooks (no LLM-judge — Stream H follow-up):
  PreToolUse: router-gate (Bash), powershell-gate (PowerShell),
  normative-content-rules (Edit|Write|MultiEdit), tdd-real-test-verifier (Edit|Write),
  self-debrief-detector (Edit|Write|MultiEdit|Bash),
  askuser-cosmetic-detector (AskUserQuestion), mcp-classification (mcp__.*)
  PostToolUse Task: subagent-return-scanner
  Stop: todowrite-skill-verifier

shell-content-rules.mjs fix:
- Added 'add' to GIT_CONDITIONAL_SUB whitelist. Without it git add was default-deny
  by rule 5 even after approval — broke entire git workflow under v4 router-gate.

TODO Stream H (integration gaps discovered):
1. askuser-answer-parser needs PostToolUse(AskUserQuestion) wrapper
2. Schema mismatch Stream E vs Stream B approval records
3. llm-judge hooks need ROUTER_LLM_KEY config
4. decomposition-detector needs LLM-judge integration
5. parallel-session-lock pure module not implemented

Regression: 1707/1707 vitest tools GREEN.
2026-05-30 06:56:35 +03:00
Дмитрий 1a84864e44 chore(router-gate-v4): delete 5 obsolete v3.9 hooks + vocab.json (Stream G cleanup)
Deleted hooks superseded by v4 architecture (spec section 4 behavioral pivot):
- enforce-chain-recommendation (replaced by router-gate decide)
- enforce-classifier-match (replaced by skill-scope-verifier Direction 2)
- enforce-graph-first (replaced by decide classification)
- enforce-semgrep-security (folded into normative-content-rules + per-tool LLM-judge)
- enforce-override-limit (universal vocab removal section 4.2)
- enforce-override-vocab.json (vocab abolished)

Regression: 1705/1705 vitest tools GREEN after deletion.
2026-05-30 06:12:59 +03:00
Дмитрий a3002bbe3b feat(router-gate-v4): enforce-mcp-classification (PreToolUse mcp__* wrapper, §5.3 + G1/G12) 2026-05-30 06:11:21 +03:00
Дмитрий 430396dfba feat(router-gate-v4): enforce-self-debrief-detector (PreToolUse mutating wrapper, §3.12 NEW) 2026-05-30 06:08:19 +03:00
Дмитрий 27c73fb050 feat(router-gate-v4): enforce-todowrite-skill-verifier (Stop hook wrapper, §3.9 Direction 4)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-30 06:00:00 +03:00
Дмитрий 40d4443926 refactor(router-gate-v4): stub override helpers (universal vocab removed per spec §4.2)
findOverride/findOverrideAttempt/loadOverrideVocab become permanent stubs returning null/null/empty.
Non-deleted hooks (verify-before-push, tdd-gate, memory-coverage, branch-switch) still import these
symbols and need them to compile; runtime always reports 'no override'.

Adapted 15 existing tests in enforce-hook-helpers.test.mjs and 7 in enforce-semgrep-security.test.mjs
that asserted old vocab behaviour; all now assert stub behaviour (null/empty).
1824/1824 vitest tools GREEN.

Stream G of router-gate v4 deployment.
2026-05-30 05:55:46 +03:00
Дмитрий 6010443307 merge(router-gate-v4): Stream E — AskUser + subagent
7 commits / 10 files / +2824 lines:
- askuser-answer-parser (S27/E33/E34 + parse + approval)
- punctuation-aware stop detection + review nits (BOM/JSDoc/??)
- cosmetic AskUser detector (v4.1 §4.5)
- subagent return scanner + G2 narrative + structured schema
- anchor 'всё ок' narrative pattern (no false-match inside 'всё окно')
- subagent-prompt-prefix inheritance (256-bit sentinel, restricted/ paths)

Stream tests pass.
2026-05-30 05:09:07 +03:00
Дмитрий d27d8b6780 merge(router-gate-v4): Stream D — LLM-judge Layer 4
13 commits / 10 files / +3017 lines:
- multiJudgeConsensus 3-judge any-YES + cache/budget
- per-tool LLM-judge pure decision + PreToolUse hook wiring
- response-scan deterministic layer + LLM layer + Stop hook
- normative-content path matcher + content extraction
- normative-content deterministic layers + multi-judge Layer 4
- normative-content PreToolUse hook wiring
- ProxyAPI live integration smoke

Stream tests pass.
2026-05-30 05:08:41 +03:00
Дмитрий a15e95e79d merge(router-gate-v4): Stream C — static scan + MCP path-deny
8 commits / 11 files / +3066 lines static-content-scanner / framework-boot-scanner / glob-restricted-filter / mcp-tool-classifier / commit-message-scanner.

Review fixes: browser_navigate host-boundary (SSRF spoof), boot-scan best-effort.
2026-05-30 05:08:01 +03:00
Дмитрий fd9e755b6f merge(router-gate-v4): Stream B — Bash/PowerShell content rules
16 commits / 11 files / +2849 lines:
- Bash hard-blacklist (v3.9+v4.0 C16/#4/#21/#22/#34 + v4.1 G7/G8 wget/nc)
- Bash whitelist + script-execution file-watcher
- classifyBashCommand integration + bashContentClassify export
- Bash gate main() + dynamic path-normalize fallback (fail-CLOSE)
- PowerShell tokenizer + hard-blacklist (keep + v4.1 G10 PS env)
- classifyPowerShellCommand (whitelist + path-deny + git route)
- PowerShell gate main() (fail-CLOSE)
- shared classifyGitCommand (readonly/conditional/hard incl G5/G6 gpgsign/--no-verify)
- Review fixes: 2>&1 fd-duplication allowed, git -c RCE closed, runtime-dir path-deny

Stream tests pass.
2026-05-30 05:05:15 +03:00
Дмитрий 632882cace test(router-gate): ProxyAPI live integration smoke + stream D sub-plan (stream D task 13)
Opt-in live smoke (ROUTER_LLM_LIVE_TEST=1 + ROUTER_LLM_KEY); auto-skips otherwise
so it never pollutes the unit regression in worktrees where undici is unresolved.
Checkpoint-1 live result on owner machine: PASS (2/2) — single Sonnet judge + 3-judge
consensus (Sonnet 4.6 + Haiku 4.5 + Opus 4.7) reach all models with real verdicts.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-29 20:55:20 +03:00
Дмитрий 2e7f0c9ac7 docs(plans): router-gate v4 Stream E sub-plan (AskUser + subagent) 2026-05-29 20:21:43 +03:00
Дмитрий 6f438df18b docs(plans): sync Stream C plan with review fixes (browser_navigate boundary + base64 fixture) 2026-05-29 19:57:12 +03:00
Дмитрий 7ebe6c5bcc docs(plans): router-gate v4 Stream C sub-plan (static scan + MCP path-deny) 2026-05-29 19:30:21 +03:00
Дмитрий 5b8109ea55 docs(plans): router-gate v4 Stream B sub-plan (shell content parsing) 2026-05-29 19:29:17 +03:00
Дмитрий c6a4748398 docs(incidents): record actual cleanup execution + Laravel permission gap
Дополняет handoff чем фактически произошло 29.05.2026:
- 3 партиции (не 1) пришлось чинить: activity_log_y2026_m05 (id=599),
  balance_transactions_y2026_m05 (id=462), pd_processing_log_y2026_m05 (id=191).
  Race condition бил по всем 3 tenant-scoped audit-таблицам.
  Всего 18 mismatches → 0, 9 tenant-scopes, 679 rows rebuilt.
- Laravel AuditRebuildChain не работает на проде: crm_supplier_worker не
  может SET session_replication_role (требуется SUPERUSER). Tests проходят
  потому что используют postgres superuser. Это first-ever rebuild attempt
  на проде раскрыл gap.
- Workaround использован .github/workflows/sql-rebuild-audit-chain.yml
  через sudo -u postgres psql. Future fix — отдельный план.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-29 19:14:29 +03:00
Дмитрий 4e15fa70ff docs(plans): router-gate v4 handoff instructions (5 prompts + merge + deploy)
Handoff document for non-programmer user — how to launch 5 parallel
Claude sessions, monitor progress, merge results, and activate v4.0+v4.1+v4.2.

Contains:
- Ready-to-copy prompts for Streams A, B, C, D, E
- VM Sandbox hands-on guide pointer (Stream F)
- Checkpoint 1 merge instructions
- Stream G (cleanup + register) prompt
- User-run Smokes guide
- Stream H (brain-retro + docs sync) prompt
- Final verification + worktree cleanup

+ cspell vocab additions (промты, мониторьте) for Russian content.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 18:55:38 +03:00
Дмитрий 480649db30 fix(rationalization-audit): skip quoted citations to remove false-positives 2026-05-29 18:47:21 +03:00
Дмитрий c4c2afd111 docs(plans): router-gate v4 master coordination plan (9 streams, parallel sessions)
Master plan orchestrates 9 streams (A-H + checkpoints) для параллельного
multi-session запуска. Каждый stream работает над disjoint set файлов
в tools/ или docs/ — 0 conflicts по конструкции.

Streams:
- A: Pure decision modules (8 файлов, ~250 unit tests) — independent
- B: Bash/PowerShell content rules — independent (stub path-norm)
- C: Static scan + framework boot + Glob F8 + MCP classifier — independent
- D: LLM-judge Layer 4 (multi-judge + per-tool + response scan) — independent
- E: AskUser parser + subagent return scanner — independent
- F: VM-sandbox setup (user hands-on) — independent
- G: Cleanup 5 v3.9 hooks + settings.json register — sequential after A-E
- Smokes 1-9 user-run — sequential after G
- H: Brain-retro Table 16-17 + recovery docs + Pravila/PSR/Tooling sync — sequential

Wall-clock: 16-23h parallel (vs 49-65h sequential).

User chose Subagent-Driven execution в параллельных сессиях.
Each parallel session invokes writing-plans для своего stream sub-plan'а,
затем subagent-driven-development для реализации.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 18:47:21 +03:00
Дмитрий 0c3552393a docs(incidents): handoff для cleanup activity_log_y2026_m05 после ADR-018 fix
Task 7 плана 2026-05-29-audit-rebuild-per-tenant-fix.md.
Шаги выкатки cleanup'а 6 mismatches в activity_log_y2026_m05 через
исправленный audit:rebuild-chain (per-tenant per ADR-018):

1. Pre-flight: deploy success + verify baseline (6 mismatches expected).
2. Dry-run через artisan-run workflow (НЕ confirm_apply) — verify Scope =
   "PARTITION BY tenant_id" в output (sanity check Task 4 deploy reached prod).
3. Apply через artisan-run --force + confirm_apply=true.
4. Verify ещё раз: 6 партиций intact.
5. Post: закрыть incident в incidents_log, обновить memory.
6. Rollback: бэкап PG + audit_block_mutation охрана.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-29 18:14:41 +03:00
Дмитрий 575f7a1f59 docs(adr): ADR-018 enforcement активирован (Tasks 2+4 завершены)
Task 5 плана 2026-05-29-audit-rebuild-per-tenant-fix.md.
Активированы 2 декларативных правила в ADR-018:

- rebuild-must-use-shared-config: AuditRebuildChain.php должен читать
  partition_clause из AuditChainConfig (require_pattern matches существующему
  коду после Task 4 fix).
- verify-must-use-shared-config: VerifyAuditChains.php должен читать TABLES из
  AuditChainConfig (require_pattern matches коду после Task 2 refactor).

llm_judge=false (declarative only, zero cost).

adr-judge на staged diff: 0 violations / 0 advisories.

Ref: docs/superpowers/plans/2026-05-29-audit-rebuild-per-tenant-fix.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-29 18:14:40 +03:00
Дмитрий 0e768f9aa0 docs(spec): router-gate v4.0+v4.1+v4.2 design (behavioral + max-judges + VM sandbox)
Three-spec package for router-gate next-generation:

- v4.0 base (2249 lines) — поведенческий разворот: removes 16 защит-призраков
  v3.9, adds Round 8 closures, adds 5 Directions (safe-baseline metering,
  skill scope verifier, decomposition detector, TodoWrite verifier,
  brain-retro Table 16). Target aggregate bypass ~15-25%.

- v4.1 max-closure amendment (1051 lines) — Layer 1-4: 8 Round 8 GAPs +
  behavioral hard-blocks + universal vocab removal + 3-judge consensus
  (Sonnet+Haiku+Opus) + per-tool LLM-judge + response scan. Target ~2-5%.
  Cost +$300-1500/month ongoing.

- v4.2 VM-sandbox amendment (411 lines) — Layer 5 phased for solo developer:
  Phase 1 VirtualBox isolation сейчас (~$0, 10-12h), Phase 2 biometric +
  Phase 3 HSM via single YubiKey ($50-150) когда захотите. Two-person rule
  removed (solo dev). Target ~0.5-0.8%.

Combined v4.0+v4.1+v4.2 full: ~0.5-0.8% aggregate bypass (close to
theoretical floor ~0.5% per §1.1 7 fundamental limits).

Implementation: ~49-65h sequential / 30-40h parallel through
subagent-driven-development. User wants parallel multi-session execution
for speed; writing-plans skill next.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 18:13:08 +03:00
Дмитрий e964d70c28 docs(plans): ADR-018 Stage 5 follow-up — AuditRebuildChain per-tenant fix
8 TDD tasks (~день кода): extract shared AuditChainConfig, refactor VerifyAuditChains (regression-safe), failing tests для multi-tenant/BYPASSRLS/single-row, rewrite AuditRebuildChain через LAG OVER (partition_clause ORDER BY id) симметрично verify, активация ADR-018 enforcement rules, Pint/Larastan/Pest --parallel smoke, handoff для прод-cleanup activity_log_y2026_m05 через gh workflow run artisan-run.yml. Self-review GREEN на spec coverage / placeholders / типы. Execution mode: subagent-driven.
2026-05-29 15:56:35 +03:00
Дмитрий 0098db6628 docs(adr): ADR-018 audit hash-chain per-tenant semantics canonical
29.05 disk-full incident выявил несогласованность между trigger (per-tenant
через RLS), VerifyAuditChains (per-tenant через PARTITION BY tenant_id) и
AuditRebuildChain (global). 6 mismatches в activity_log_y2026_m05 -
следствие неправильного rebuild'а, не оригинальной порчи.

Decision (User: Дмитрий): per-tenant canonical через RLS scope. Trigger и
verify уже согласованы; AuditRebuildChain - bug, переделать в Stage 5
follow-up (отдельный plan). После фикса re-run на activity_log_y2026_m05 -
6 mismatches исчезнут.

Альтернатива global semantics + переписать trigger SECURITY DEFINER + миграция
БД отвергнута: ослабляет 152-ФЗ tamper-detection + рискованная миграция.

Cross-links: ADR-002 RLS multi-tenancy, incidents/2026-05-29-disk-full-pg-recovery.md,
F1 advisory-lock migration 2026_05_30_000001.

Enforcement-block declarative (require_pattern AuditChainConfig::TABLES) -
активируется после имплементации Stage 5 follow-up.

cspell-words.txt: +партиционированы
2026-05-29 15:32:46 +03:00
Дмитрий a6bde2125a spec(router-gate): concentrate v3.9 — убрать audit-trail и version-history overhead
Заказчик: «перепиши спек, убери все лишние оставь только то что необходимо для
создания плана, но сам план не делай. Только помни нельзя потерять в качестве и
объеме ни в коем случае!»

После 10 раундов adversarial audit спек вырос до 2964 строк / 288KB. Большая часть
объёма — audit-trail и история эволюции через раунды:
- 8 «Changes vX → vY» overview-таблиц в начале (~245 lines)
- 11 версионных entries в §11 v3.9-v1 (~380 lines)
- inline traceability markers «v3.6 R5-audit H1 fix:» / «v3.7 R-NEW-4 closure:»

Эта информация дублируется (mechanism описан и в TL;DR overview, и в §11 entry,
и in-place в §3-§5) и НЕ нужна для составления implementation плана.

Что убрано (НИ ОДНОГО технического механизма не потеряно):
- Edit 1: «Changes v3.8 → v3.9» giant overview (13-row table + adversarial pre-check
  + implementation breakdown + Главный урок + Generalisable formula + Methodology +
  Связано) → 1 reference paragraph
- Edit 2: «Changes v3.7 → v3.8», «Changes v3.6 → v3.7», ... «Changes v1 → v2»
  (9 overview blocks + 4 FATAL table + Доп v3.8 closures C5-E30 list + adversarial
  pre-check v3.8 table) → один Timeline эволюции v1→v3.9 paragraph
- Edit 4: §11 v3.8/v3.7/v3.6/v3.5/v3.4/v3.3/v3.2/v3.1/v3/v2/v1 entries → один
  условный compaction-summary («### v1 – v3.8 — 9 раундов, 105 holes»). v3.9
  entry полностью сохранён — план будет ссылаться на R7 closure details.

Что сохранено verbatim (100% technical content):
- §1 Цель и контекст / §2 Принципы дизайна
- §3 Архитектура: §3.0 PowerShell hook / §3.0.1 OS-keychain / §3.1 protected paths
  (~80 paths + path normalization NFC/8.3/inode) / §3.2 subagent inheritance +
  parent_random_id sentinel / §3.2.0 10 smokes / §3.2.1 automated bootstrap /
  §3.3 failure modes / §3.4 subagent constraints + tool_result scanner / §3.5
  atomic writes / §3.6 gate budget + state cache / §3.6.1 dep-checksums /
  §3.6.2 normative-content second-layer
- §4 Decision Flow (Поведения 1-4 + §4.5 AskUser parser + §4.6 partial unlock +
  §4.7 question quality detector 3-layer LLM-judge)
- §5 Безопасная база + MCP classification / §5.1 Bash rules (whitelist +
  hard-blacklist + conditional + path-deny + SKILL_BASH_ALLOW + sub-shell sweep) /
  §5.1.2 PowerShell mirror / §5.2 multi-language static scan (PHP/Ruby/Go/Java)
- §6 Recovery: 3 levels + §6.1 cheatsheet + §6.2 PII guard + §6.3 redacted reason
- §7 Logging + §7.1 coverage-hint coordination
- §8 Этапы реализации (implementation order matrix + риски миграции)
- §9 Open questions + acceptable residuals R-NEW-7..R-NEW-19
- §10 Cross-refs + §10.1 functions/registry + §10.2 ALL state schemas verbatim
  (router-state, chain-state, askuser-decisions, router-gate-decisions, subagent-
  inheritance, subagent-block, parent-sentinel, restricted/journal-access-log,
  edited-files, coverage-hint, gate-errors, gate-config v3.9 fields, session-counters)
  + §10.3 test strategy + §10.4 success metrics + §10.5 rollback + §10.6 parallelism
- §11 v3.9 entry полный (R7 closure mechanism + generalisable formula + 13-row table)

Verification:
- Spec: 2964 → 2404 строк (-560 lines / -19%); технический объём ≥99%
- Mechanism keyword counts: fs.lstatSync 4 / parent_random_id 29 / SKILL_BASH_ALLOW 9
  / schema_version 11 / Поведение[1-4] 17 / node_modules 15 / claude-md-management 19
  / approve_git_operation 28 / subagent-block 14 / restricted/ 21 / keytar 15
  / shell-quote 17 / dep-checksums 11 / multi-judge 8 / NFC|normalize 12
  / mcp_tool_classification 7 / /etc/hosts 11 / git rev-parse HEAD 5
- markdownlint 0 errors; cspell 0 issues
- All §1-§11 sections intact (12 top-level headings preserved)

§0 cross-refs не меняются — spec-only, не tooling-канон / не ADR / не off-phase
подкатегория. Self-contained для writing-plans skill input в следующей сессии.

Methodology: EnterPlanMode → write plan → user approval → ExitPlanMode → 4 Edits
(Edit 3 inline-marker trim skipped как cosmetic — quality бы не выросло).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-29 14:58:46 +03:00
Дмитрий 6383da7f12 chore(incident-followup): close 4 tails from 29.05 disk-full incident
ремонт: incident-followup cleanup batch — 4 хвоста

1. Larastan baseline regenerated (was 161 errors pre-existing IDE helper drift)
2. Deptrac Mail: [Model, Service] + ADR-005 amend (was 4 pre-existing violations)
3. PG logrotate config in setup-logrotate.yml
4. F1 6 mismatches — RCA updated (algorithm divergence trigger global vs verify per-tenant)

+3 cspell words: notifempty, missingok, верифицируется.

Ref: docs/incidents/2026-05-29-disk-full-pg-recovery.md §4-5
2026-05-29 14:45:28 +03:00
Дмитрий 8910ae6cd6 spec(router-gate): v3.8 → v3.9 Round 7 audit closure (13 классов, 3 фундаментальные плоскости)
Round 7 adversarial audit (через superpowers:brainstorming skill) выявил 13 классов
которые 9 предыдущих раундов не покрывали:
- 2 FATAL: F5 Read-leak parent_random_id через Glob+Read (R-NEW-4 обнулён),
  F6 subagent tool_result.content exfil
- 4 CRITICAL: C12 system DNS/config (/etc/hosts/~/.ssh/registry) вне §3.1,
  C13 || true exit-code spoof (per-token vs per-chain),
  C14 subagent state exfil,
  C15 §5.2 multi-language gap (PHP/Ruby/Go test runners)
- 5 SERIOUS: S22 Skill(claude-md-management) exemption backdoor,
  S23 Workflow args parameter payload,
  S24 path-equivalence (Unicode NFC/NFD + Windows 8.3 + hardlinks),
  S25 MCP filesystem/redis write tools classification,
  S26 stop-keywords morphology gaps
- 2 EDGE: E31 gate-error reason disclosure (probing pattern),
  E32 LLM-judge cache cross-session persistence

18 spec edits: header bump + TL;DR + Changes v3.8→v3.9 table + §3.1 system paths
+ parent-sentinel→restricted + §3.4 PostToolUse Task scanner + §3.6.2 normative-content
second-layer gate + §4.5 stop-keywords expanded + §4.7 cache per-session + §5 MCP
classification + §5.1 chain ANY-mutating + PostToolUse rev-parse verify + §5.1.2
PowerShell mirror + §5.2 multi-language scan + §6.3 redacted reason mode + §9 13 closures
+ §10.2 gate-config v3.9 fields + §11 v3.9 history entry.

Spec: 2554 → 2964 строк (+410 lines). Budget: 45-60h (v3.8) → 53-72h (v3.9).
Закрыто 118 holes total через 10 раундов adversarial audit.

cspell-words.txt +18 терминов (exfiltration/exfil/NFD/RCE/syscall/Inodes/PROGRA/
resolv/nsswitch/ics/HKCU/HKLM/fsutil/unstar/mvn/popen/брэйншторм/стопаем).

Generalisable formula R7 (новая): для каждого следующего audit задавать 3 вопроса
до enumeration — какие safe tools/paths/chains дают visibility/leverage; какие
границы scope подразумеваются но не enforce'ятся; где per-token vs per-chain
formulation gap есть в композиции.

§0 cross-refs не меняются — spec-only, не tooling-канон / не ADR / не off-phase
подкатегория.

Methodology: superpowers:brainstorming skill + AskUserQuestion scope choice
(user выбрал «Полное v3.9 closure всех 13»).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-29 14:36:36 +03:00
Дмитрий d181e98046 docs(claude-md): v2.40 add §5 п.13 NB (mixed-diff blocks docs-only short-circuit) + §5 +п.15 (memory-coverage rejects chain channels)
Two operational gotchas discovered в session 29.05.2026 (router-gate v3.6-3.8 sweep + post-sweep memory updates):

1. §5 п.13 NB — docs-only short-circuit считает строго .md-суффикс.
   cspell-words.txt / package.json / lefthook.yml рядом со spec.md
   делают diff mixed → verify-before-push активен → нужен vitest sentinel
   ИЛИ override. Прецедент: commit 46c43169.

2. §5 +п.15 — enforce-memory-coverage hook не принимает chain-каналы
   (chain:commit-push-mem-sync etc); требует строго direct:memory-sync
   в свежем turn'е. Memory updates как часть multi-step задачи планировать
   отдельным turn'ом или использовать memory dump override.
   Прецедент: 4-й шаг sweep задачи заблокирован.

Via /claude-md-management:revise-claude-md skill flow.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-29 14:29:38 +03:00
Дмитрий 46c4316966 spec(router-gate): v3.6-3.8 sweep — rounds 5/6 closure (45 holes in 22 classes)
Adversarial audit rounds 5+6 на v3.5 spec выявили 45 attack-векторов:
- Round 5 (v3.5→v3.6 + v3.6→v3.7 maximum-closure): 15 + 7 acceptable residual closed
- Round 6 (v3.7→v3.8): 30 attack-векторов в 18 классах вкл. 4 FATAL

v3.6 — Round 5 closure (15 holes, +9 protected paths, Smoke 4-7,
§5.1↔§4.5 sync, recovery-pattern hard-block в §4.7).

v3.7 — Maximum closure pass (7 v3.6 residual + edge): LLM-judge на
каждую AskUser-опцию (R-NEW-1), parent_random_id 256-bit sentinel
(R-NEW-4), restricted/ subdir + journal-reader CLI (R-NEW-2),
shell-quote tokenizer (R-NEW-3), TF-IDF + LLM-judge для off-topic
(R12), smart batching multi-question AskUser (S6), git stash variants
в conditional class (Edge).

v3.8 — Round 6 closure (30 holes в 18 classes):
- F1 PowerShell tool gate (был полностью вне scope §5.1 — обнуляло
  архитектуру на Windows)
- F2 Workflow internal agent() enumeration (nested агенты вне
  gate-visibility — unlimited bypass)
- F3 node_modules whole-dir hard-deny + dep-checksums verification
  через SHA-256 ключевых deps
- F4 LLM-judge anti-injection (delimiter tokens + pre-filter +
  multi-judge consensus Sonnet+Haiku)

§3.1 protected paths расширен +30 entries (memory/CLAUDE.md/Pravila/
PSR/Tooling с Skill exemption для claude-md-management, CI/CD configs,
lint/build configs, plugin cache, shell init, npm configs, node_modules,
parent-sentinel, dep-checksums, expected-path).

§3.0.1 OS-keychain для LLM key (Windows Credential Manager / Keychain /
libsecret через keytar); key не в process.env → не утечёт через npm
test stdout.

§3.2.1 automated bootstrap smoke (1/5/6/7 на каждый session start,
cached 7 days); user-run остаётся для 3/4/8.

§6.1 docs/recovery-procedures.md новый файл — пошаговая шпаргалка
PowerShell-команд для 3 уровней recovery.

Budget: 13.5-20h (v3.5) → 22.5-32h (v3.6) → 33-44h (v3.7) → 45-60h (v3.8).
Закрыто 105 holes total через 9 раундов adversarial audit.

Generalisable lesson v3.8: каждый раунд аудита должен начинать с
abstract classification классов атак до enumeration конкретных дыр.
v3.7 «maximum closure» был maximum внутри границ воображения v3.6 R5-audit;
Round 6 показал что сами границы имели дыры.

Spec: 1980 → 2554 строк (+1110 inserts / -44 deletes за v3.6-3.8 sweep).
+13 терминов в cspell-words.txt (PowerShell aliases, npm deps).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-29 13:55:11 +03:00