Commit Graph

347 Commits

Author SHA1 Message Date
Дмитрий 2907e3f25f fix(m1): extractPath: path fields expanded (catches MCP filename/uri/destination) (B4) 2026-06-05 03:42:40 +03:00
Дмитрий 7b578cd391 fix(m1): seq+ts are included in the journal's chain_hash, so tampering with metadata breaks the chain (B2) 2026-06-05 03:41:38 +03:00
Дмитрий b94f7d244c feat(m3-d): contracts + look-ahead in the router prompt + runRouter pass-through (fix-2) 2026-06-05 03:24:49 +03:00
Дмитрий 92ba55bc0f feat(m3-d): "nose" 5.3 + interviewer 4.4 in the router prompt (fix-3) 2026-06-05 03:23:11 +03:00
Дмитрий 58f3a65800 feat(m3-a): checkContractNeutrality: G1 label-neutrality guard (fix-5, optional) 2026-06-05 03:22:07 +03:00
Дмитрий eb3f4c4ed1 feat(m3-a): dispatchContract: G3 deterministic dispatch, exact|soft (fix-4) 2026-06-05 03:20:55 +03:00
Дмитрий 003bd3d86b fix(m3-b): resolveNode grounds a skill-ref by prefix (superpowers:X -> #19), fix-1 2026-06-05 03:18:26 +03:00
Дмитрий a27a848d7c test(m3-e): learning hard-rule invariants + plan: Machine 3 assembled
Machine 3-E "Approval queue + exploration knob" assembled (TDD): router-learning-queue.mjs
(propose-only + owner batch approval + render/signal/persist; hard rule "nothing without a yes")
+ router-exploration.mjs (exploration-% knob = 0 by default, probe = question to the owner, risk guard).
19 new tests. Final tools-only regression: 2212 GREEN.

MACHINE 3 (Router-mentor) is fully assembled: 3-A contracts / 3-B node graph /
3-C coverage machine / 3-D router engine / 3-E learning queue. Delivery into live
infra (STATUS/brain-retro), the K4 amendment, live-wiring, and wave migration are follow-ups
after Machine 4 (question journal).
2026-06-04 19:51:45 +03:00
Дмитрий dcf772bac5 feat(m3-e): exploration knob (#3) — probe = owner question, off by default + risk-guard 2026-06-04 19:50:31 +03:00
Дмитрий 4cb17fc4d5 feat(m3-e): learning queue — propose-only + owner batch approval + render/signal/persist (hard-rule no auto-fill) 2026-06-04 19:49:32 +03:00
Дмитрий ed89028b1d test(m3-d): router-engine invariants on real graph + plan + questions log
Machine 3-D "Router engine" assembled (TDD): router-engine.mjs (detectHighRisk 6.1,
deterministic / validateLevelSkip 6.2 / cheaperOf / validateTrace 5.1 /
groundTrace ОВ-Д2 / buildRouterPrompt+parse+runRouter, llmCall is mocked as
router-classifier) + step-pointer.mjs (wave pointer tree D6/OQ1, standalone).
35 new tests, tools-only regression 2193 GREEN. The K4 amendment to the wall + live-wiring
+ wave migration into live main are DEFERRED until Machine 4 (question journal).
2026-06-04 19:46:09 +03:00
Дмитрий 28b129ed9c feat(m3-d): step-pointer tree (waves D6/OQ1) — standalone, not yet wired into M2 2026-06-04 19:45:04 +03:00
Дмитрий 3a80bdde5c feat(m3-d): router-engine — risk(6.1)/skip(6.2)/price + trace 5.1 + grounding(ОВ-Д2) + buildRouterPrompt/parse/runRouter (llmCall injected, mocked) 2026-06-04 19:43:57 +03:00
Дмитрий 80ebec9e82 test(m3-c): coverage-machine invariants on 3-A contracts + plan
Machine 3-C "Coverage machine A/B/C/D" assembled (TDD): coverage-machine.mjs:
A dependency graph (buildDependencyGraph/topoOrder/findHoles/decompositionGroups),
B need↔solution registry (coverageRegistry: holes + orphans), C requestsChecklist,
D constraints as needs (effectiveNeeds), backbone readinessChecklist (4 checkmarks + §).
Independent coverage verifier (lever E §6.3). 19 new tests, regression 2158 GREEN.
2026-06-04 19:26:21 +03:00
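The graph operations named in 80ebec9e82 (topoOrder, findHoles) might look roughly like the sketch below. The input shape (an object mapping each node to the nodes it depends on) and the meaning of a "hole" (a dependency that is referenced but never defined) are assumptions made for illustration; the real coverage-machine.mjs may differ.

```javascript
// Assumed input shape: { node: [nodes it depends on] }.

// A "hole" here is a dependency that some node references but nobody defines.
function findHoles(deps) {
  const known = new Set(Object.keys(deps));
  const holes = new Set();
  for (const list of Object.values(deps)) {
    for (const d of list) {
      if (!known.has(d)) holes.add(d);
    }
  }
  return [...holes];
}

// Depth-first topological order: every dependency appears before its dependents.
// Throws on a cycle instead of looping forever.
function topoOrder(deps) {
  const order = [];
  const state = new Map(); // node -> 'visiting' | 'done'
  const visit = (n) => {
    if (state.get(n) === 'done') return;
    if (state.get(n) === 'visiting') throw new Error(`cycle at ${n}`);
    state.set(n, 'visiting');
    for (const d of deps[n] ?? []) visit(d);
    state.set(n, 'done');
    order.push(n);
  };
  Object.keys(deps).forEach(visit);
  return order;
}
```

With deps = { deploy: ['build', 'test'], test: ['build'], build: [] }, topoOrder places build first and deploy last, and findHoles returns an empty list because every referenced node is defined.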
Дмитрий 8df8d05612 feat(m3-c): coverage-machine A/B/C/D + readinessChecklist (C-14, set/graph ops) 2026-06-04 19:23:02 +03:00
Дмитрий 699da97dc2 test(m3-b): node-graph invariants on real registry + plan
Machine 3-B "Node graph from the registry" assembled (TDD): node-graph.mjs on top of
loadRegistry: buildNodeGraph/resolveNode (ОВ-Д2 grounding) + twinsOf
(subcategory) / hintLinksOf (chains) / conflictsOf (explicit) + checkGraphFreshness
(3.6). 20 new tests, tools-only regression 2139 GREEN.
2026-06-04 19:19:00 +03:00
Дмитрий 750f406cbd feat(m3-b): node-graph from registry — buildNodeGraph/resolveNode (ОВ-Д2) + twins/hints/conflicts + freshness 3.6 2026-06-04 19:17:50 +03:00
Дмитрий 53db0ee2b3 test(m3-a): contract fixtures (own+external) + 3-A invariants + plan + questions log
Machine 3-A "Skill contracts" assembled (TDD): skill-contract.mjs (schema C-13/L
+ form validator + normalize/accessors + G4 drift-guard) + skill-contract-registry.mjs
(buildRegistry/loadRegistry). 28 new tests, tools-only regression 2119 GREEN.
Samples: own (writing-plans) + external (operations:process-doc). Question journal started.
2026-06-04 19:12:47 +03:00
Дмитрий a905abd1b4 feat(m3-a): skill-contract-registry — buildRegistry (validate/dedupe/drift) + loadRegistry (disk) 2026-06-04 19:11:28 +03:00
Дмитрий f82cefaead feat(m3-a): skill-contract schema + form validator + normalize/accessors + G4 drift-guard (C-13/L) 2026-06-04 19:10:24 +03:00
Дмитрий ec4733f77a test(m2): supreme-wall invariants — default-deny/seal/seed/step-match (Task 9)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-04 16:49:29 +03:00
Дмитрий cfbfd9c6b4 feat(m2): door-coverage — auditDoors (forgotten-channel) + auditExempt (green-pass safety) (Tasks 7,14)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-04 16:49:10 +03:00
Дмитрий 8d9ca65cf3 feat(m2): supreme-gate — seeds/decide/runGate/decideMode/observe-only/closed-door (Tasks 4-6,10,13,12)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-04 16:48:49 +03:00
Дмитрий 599dca15ec feat(m2): plan-lock — freeze/verify/match/persist/reconcile/2nd-seal/closed-door (Tasks 1-3,8,11,12)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-04 16:48:30 +03:00
Дмитрий b83cea2e73 test(m1): foundation cross-invariants (journal tamper / unsigned receipt / runtime deny) 2026-06-04 04:05:10 +03:00
Дмитрий 7af68d62c8 feat(m1): runtime-write-deny — block any path-bearing tool (P10-a all channels) 2026-06-04 04:04:25 +03:00
Дмитрий 56da7faba9 feat(m1): pathNormalize NFC normalization (P10-b unicode evasion) 2026-06-04 04:01:17 +03:00
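A short illustration of why NFC normalization (56da7faba9) matters for a path guard: the same visible path can be spelled with different code points. The helper name below is hypothetical; only String.prototype.normalize is a real API, and the actual pathNormalize may do more than this.

```javascript
// Hypothetical helper name; only String.prototype.normalize is a real API here.
function normalizeForCompare(path) {
  return path.normalize('NFC');
}

const composed = 'caf\u00e9/notes.md';    // "é" as one code point (U+00E9)
const decomposed = 'cafe\u0301/notes.md'; // "e" + combining acute accent (U+0301)

// Without normalization the two spellings are different strings,
// so a deny rule written against one would miss the other.
const rawEqual = composed === decomposed;

// After NFC both collapse to the composed form and compare equal.
const nfcEqual = normalizeForCompare(composed) === normalizeForCompare(decomposed);
```

Here rawEqual is false and nfcEqual is true, which is the evasion the P10-b item appears to close.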
Дмитрий 3af57e180a feat(m1): signed askuser approval records (P10-c HMAC receipts) 2026-06-04 03:59:38 +03:00
Дмитрий e3da14a7fc feat(m1): action-journal — append-only hash chain + HMAC head anchor + JSONL persist 2026-06-03 19:18:08 +03:00
Дмитрий 9bd45ce510 feat(m1): receipt-sign — canonicalJson + HMAC signPayload/verifyReceipt (fail-closed) 2026-06-03 19:16:28 +03:00
Дмитрий d7dc03271a feat(m1): receipt-key-config — HMAC key resolution (keychain -> env -> null) 2026-06-03 19:15:25 +03:00
Дмитрий 9689a6e5b8 feat(router): max_tokens 1500->15000 + task_type desync fix + design notes (router-mentor)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-01 11:50:17 +03:00
Дмитрий c55e14b626 feat(brain): surface router-gate v4 signals into episode + factor axes
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 19:05:20 +03:00
Дмитрий 30b79c7228 fix(router-gate): narrow cd app whitelist (TDD, tools 1978 GREEN)
Add /^cd\s+app$/ to SAFE_EXACT so already-whitelisted commands (pest,
php artisan test) run from app/. Scope limited to the literal `app` dir:
cd into any other path (incl. protected .claude/runtime, memory/,
transcripts) stays default-deny, so the cwd-shift read-bypass is contained.
Mutations remain caught at the hard-blacklist + chain-mutating rule, and
each chain segment after `cd app &&` must still be independently whitelisted.

Owner-authorized, narrow scope = literal `app` only.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 13:34:42 +03:00
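A minimal sketch of the rule described in 30b79c7228: every segment of a `&&` chain must match a whitelist entry on its own. Only the /^cd\s+app$/ pattern comes from the commit message; the other two entries and the function name are stand-ins, and the real gate also applies a hard-blacklist and chain-mutating rule that are not modelled here.

```javascript
// Illustrative subset: only /^cd\s+app$/ is taken from the commit message.
const SAFE_EXACT = [/^cd\s+app$/, /^pest$/, /^php\s+artisan\s+test$/];

// Hypothetical helper: split the chain on && and require every segment
// to match some whitelist entry independently (default-deny otherwise).
function isChainAllowed(command) {
  const segments = command.split('&&').map((s) => s.trim());
  return segments.every((seg) => SAFE_EXACT.some((re) => re.test(seg)));
}
```

Under these assumptions `cd app && pest` passes, while `cd memory && pest` and `cd app/../memory` are denied because the anchored pattern accepts only the literal `app` directory.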
Дмитрий d647bf1858 fix(router-gate-v4): calibration 5 - cosmetic-detector exempts git-approval AskUser (scope fix, regression-tested) 2026-05-31 11:19:14 +03:00
Дмитрий 1f9b51bc39 feat(router-gate-v4): parallel-session-lock live main() — acquire on PreToolUse + release on Stop (point 2)
The Stream H wrapper shipped a deliberate no-op main() — the lock did nothing.
This wires it live: PreToolUse on a mutating tool acquires/refreshes the
workspace lock (blocks only when a DIFFERENT session holds a fresh, non-stale
lock); the Stop event releases it. Fail-open on any error so a lock bug can
never wedge the user out of their own session.

- runAcquireDecision({event,now,pid,cwd,readLock,writeLock}) — compose
  acquire() + decide().
- runReleaseAction({event,cwd,readLock,deleteLock}) — release() if this
  session owns the lock, no-op otherwise.
- live main(): branches on tool_name (present → acquire/refresh; absent/Stop
  → release); real fs binding via runtimeDir()/session-lock-<workspaceHash>.json.

Activation registers BOTH the PreToolUse (acquire) AND the Stop (release)
entries — the Stop wiring is mandatory; without it the lock is never released
and the next abnormal exit would lock the user out. Script:
.scratch/activate-point2-hooks.ps1 (also registers safe-baseline-metering +
runtime-write-deny per the point-2 plan).

Plan: docs/superpowers/plans/2026-05-30-router-gate-v4-stream-H.md Task 7.
Regression: parallel-session-lock 12/12 GREEN; full tools suite 1958 passed | 2 skipped.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 11:06:52 +03:00
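The acquire/release behaviour described in 1f9b51bc39 can be sketched as two pure functions. The lock shape ({ sessionId, refreshedAt }), the function names, and the 5-minute staleness window are assumptions for illustration; the commit's runAcquireDecision and runReleaseAction take different arguments and perform real file I/O.

```javascript
// Assumed staleness window; the real value is not given in the commit message.
const STALE_MS = 5 * 60 * 1000;

// Block only when a DIFFERENT session holds a fresh lock; otherwise take or
// refresh the lock. Any unexpected error fails open (the call is allowed).
function decideAcquire({ lock, sessionId, now }) {
  try {
    const heldByOther = lock && lock.sessionId !== sessionId;
    const fresh = lock && now - lock.refreshedAt < STALE_MS;
    if (heldByOther && fresh) return { allow: false, lock };
    return { allow: true, lock: { sessionId, refreshedAt: now } };
  } catch {
    return { allow: true, lock };
  }
}

// On Stop: drop the lock only if this session owns it, otherwise leave it alone.
function releaseLock({ lock, sessionId }) {
  return lock && lock.sessionId === sessionId ? null : lock;
}
```

Keeping both halves pure makes the mandatory Stop wiring easy to test: a lock that is never passed through releaseLock stays in place until it goes stale.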
Дмитрий 8a7144892c fix(router-gate-v4): calibrate per-tool LLM-judge — calibration 4 soft user-prompt fallback
The per-tool judge compares each mutating tool call against the classifier's
distilled task summary read from router-state. That summary is lossy and
frequently "(unknown)" even for a perfectly explicit user request — and with an
unknown task the judge has nothing to compare against, so "Сомнения → NO" ("doubt → NO")
blocked every real edit. Reproduced repeatedly this session: an explicit
"реализуй ... main() ..." ("implement ... main() ...") prompt was still classified as unknown → all edits blocked,
including the judge's own fix. Calibration 2 (allow on unknown) was rejected by
the owner as a discipline hole.

Calibration 4 (soft, scope-preserving): when — and only when — the classifier
summary is "(unknown)"/empty, fall back to judging against the user's actual
last prompt (the ground-truth request) instead of nothing. The judge still runs
and still blocks on doubt; it just uses better evidence. When the summary is
meaningful, behaviour is unchanged (the user-prompt reader is not consulted).
When both summary and prompt are unavailable, the task stays "(unknown)" and
doubt→block is preserved.

NOT calibration 2: this does not blindly allow on unknown — it re-grounds the
judge in the literal user request, which the controller cannot fabricate (the
user writes it; it is read locally from the session transcript).

- tools/llm-judge-per-tool.mjs: resolveEffectiveTask(declaredTask, lastUserPrompt).
- tools/enforce-llm-judge-per-tool.mjs: runPerTool reads the last user prompt
  (helpers.lastUserPromptText + readTranscript) only on an unknown summary;
  main() binds it.

Regression: judge tests 57/57 GREEN; full tools suite 1951 passed | 2 skipped.
The 6 remaining failures are uncommitted point-2 WIP in
enforce-parallel-session-lock.test.mjs — not part of this change, not committed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 10:34:27 +03:00
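The fallback order described in 8a7144892c might be sketched as follows. The function name and parameter order come from the commit message; the body, including the exact test for an "unknown" summary, is an assumption.

```javascript
// Sketch only: the body below is inferred from the commit message, not copied
// from tools/llm-judge-per-tool.mjs.
function resolveEffectiveTask(declaredTask, lastUserPrompt) {
  const isUnknown = (s) =>
    typeof s !== 'string' || s.trim() === '' || s.trim() === '(unknown)';

  // Meaningful classifier summary: behaviour unchanged, prompt not consulted.
  if (!isUnknown(declaredTask)) return declaredTask;

  // Unknown summary: re-ground the judge in the user's literal last prompt.
  if (!isUnknown(lastUserPrompt)) return lastUserPrompt;

  // Neither available: stay unknown so that doubt still blocks.
  return '(unknown)';
}
```

The judge runs in all three branches; the function only chooses what evidence it compares the tool call against.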
Дмитрий 722f4bb189 fix(router-gate-v4): calibrate per-tool LLM-judge — exempt Skill (calib 1) + test-runners (calib 3)
The Layer-4 per-tool judge over-blocked: it judged every Skill/Edit/Write/
Bash/Task against the declared task and blocked on doubt. A vague prompt
classifies as unknown/ambiguous, so the judge then blocked essentially all
artifact-producing tools — including the prescribed §17 skill entry and the
mandatory TDD test run — making legitimate, owner-mandated work impossible
and blocking its own fix (3 reproduced blocks this session).

Calibration 1 (scope fix, NOT a discipline drop): remove `Skill` from
MUTATING_TOOLS in tools/llm-judge-per-tool.mjs. Invoking a skill mutates no
state and is the §17-mandated entry into work; the real mutations it leads to
(Edit/Write/MultiEdit/Bash/PowerShell/Task/commit/push) stay fully judged.

Calibration 3 (scope fix, NOT a discipline drop): add isTestRunnerBashEvent to
tools/enforce-llm-judge-per-tool.mjs and skip it in runPerTool, mirroring the
existing readonly-Bash exemption. A test run (vitest/pest/phpunit/php artisan
test/composer test/npm test) only inspects + reports and is a mandatory TDD
step; commands that chain to a mutation (`&&`, `;`, `|`, backtick, `$(`) are NOT exempt.

doubt→block on real mutations against a known task is unchanged (covered by the
"mutating Bash (git commit) STILL judged" test). Calibration 2 (allow on
unknown task) was rejected by the owner as a discipline hole and not added.

Regression: vitest tools-only 1945 passed | 2 skipped (+18 calibration tests).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 10:04:43 +03:00
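A rough sketch of the calibration 3 exemption from 722f4bb189, working on the raw command string. The runner list and chaining tokens are taken from the commit message; the function name and both regexes are assumptions, and the real isTestRunnerBashEvent receives a hook event rather than a string.

```javascript
// Runner names from the commit message; the regex form is an assumption.
const TEST_RUNNERS =
  /^(vitest|pest|phpunit|php artisan test|composer test|npm test)(\s|$)/;

// Any chaining or substitution token disqualifies the command.
const CHAIN_TOKENS = /&&|;|\||`|\$\(/;

// Hypothetical string-level stand-in for isTestRunnerBashEvent.
function isTestRunnerCommand(command) {
  const cmd = command.trim();
  return TEST_RUNNERS.test(cmd) && !CHAIN_TOKENS.test(cmd);
}
```

Under these assumptions `npm test` and `vitest run tools` are exempt, while `npm test && git commit -m x` is still sent to the judge.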
Дмитрий c9b9efd6e4 fix(router-gate-v4): exclude readonly Bash from per-tool judge — scope fix, discipline unchanged 2026-05-31 08:59:18 +03:00
Дмитрий dfae9f760b feat(router-gate-v4): live main() for LLM-judge wrappers — flag-gated spend (item 2b) 2026-05-31 08:06:26 +03:00
Дмитрий a8996896a8 test(router-gate-v4): Read-deny boundary cases (.env.production blocked, Tooling doc readable) 2026-05-31 07:38:18 +03:00
Дмитрий 3c5266c022 fix(router-gate-v4): narrow Read-deny so CLAUDE.md and memory are Read-allowed, transcripts/runtime still blocked (over-block fix) 2026-05-31 07:26:30 +03:00
Дмитрий 80e514f5bb feat(router-gate-v4): enforce-runtime-write-deny protect runtime side-channels (C3) 2026-05-31 05:57:59 +03:00
Дмитрий f740f6124a feat(safe-baseline): live main() metering + hard-block + Skill/EnterPlanMode escape (item 1b) 2026-05-31 05:57:47 +03:00
Дмитрий ca52d354f9 feat(router-gate-v4): LLM-judge per-tool + response-scan hook wrappers (Stream H tail) 2026-05-30 19:59:42 +03:00
Дмитрий 6ac4b1c1b1 feat(router-gate-v4): safe-baseline-metering wrapper + llm-judge-config gate (Stream H tail) 2026-05-30 19:29:58 +03:00
Дмитрий f172e2a580 feat(router-gate): SAFE_EXACT +Laravel dev workflow
Closes design gap in v4 whitelist: dev commands (pest, composer test/pint/stan/insights/rector,
php artisan test/migrate variants/db:seed/cache:clear etc., vendor/bin/pest) were falling into
default-deny. That blocked sessions working on app/ code and pushed controllers toward override
phrases or requests to disable the defense.

Changes are surgical and do not weaken discipline defense:
- 4 new SAFE_EXACT regex entries for specific dev commands
- tinker EXCLUDED on purpose (REPL = arbitrary PHP exec risk)
- migrate:install and other unknown migrate subcommands stay blocked via
  lookahead instead of word-boundary (precision fix)
- Hard-blacklist for mutating package operations, chain-semantics C13,
  file-watcher, TDD-gate, path-deny, coverage requirement and the other 15
  defense hooks are NOT touched.

TDD: 22 RED allow-tests + 7 still-block tests + 3 regression tests.
Full tools-only regression 1821/1821 GREEN.

Live smoke verified: composer test allowed; migrate:install blocked.

Whitelist v3.8 was sized around vitest tools-only; Laravel app/ dev workflow
slipped through. This commit corrects that without touching the architecture.
2026-05-30 16:11:34 +03:00
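The "lookahead instead of word-boundary" precision fix in f172e2a580 is easiest to see side by side. Both patterns below are simplified illustrations, not the actual SAFE_EXACT entries: because `:` is a non-word character, `\b` is satisfied right after `migrate` in `migrate:install`, whereas a lookahead for whitespace or end of input is not.

```javascript
// Simplified illustration; not the real whitelist entries.
const withBoundary = /^php artisan migrate\b/;         // word boundary
const withLookahead = /^php artisan migrate(?=\s|$)/;  // whitespace or end only

const samples = [
  'php artisan migrate',
  'php artisan migrate --pretend',
  'php artisan migrate:install',
];

// Map each sample to [matches \b version, matches lookahead version].
const results = samples.map((s) => [withBoundary.test(s), withLookahead.test(s)]);
```

Only the third sample differs: the word-boundary pattern accepts `migrate:install`, the lookahead pattern rejects it, so unknown subcommands fall through to default-deny.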
Дмитрий ffd70d6fa5 fix(router-gate-v4): lastTurnEntries skips harness-injected skill bodies (isMeta + sourceToolUseID)
Sibling Claude session 2026-05-30 found that lastTurnEntries treats
harness-injected skill bodies as spurious turn boundaries, breaking both
enforce-memory-coverage (can't find user's coverage line) AND
enforce-normative-content-rules::detectLegitSkillActive (can't find the
Skill tool_use that lives in the assistant message BEFORE the body).

Refinement applied here: this session inspected 29 isMeta:true entries
across the live transcript (8f4ba767-...jsonl) via a debug helper and
found isMeta:true is ALSO used for "Continue from where you left off"
auto-resume, Stop hook feedback strings, and <local-command-caveat>
wrappers — those are real user-equivalent boundaries that must remain
visible. Sibling's blanket "skip isMeta" proposal would have broken them.

Discriminator: skip ONLY when isMeta === true AND typeof sourceToolUseID
=== 'string' (tool-spawned content). Skill bodies have the linking field;
the other isMeta sources do not. The sourceToolUseID field is harness-
controlled and not writable by controller from inside a tool call —
cannot be spoofed.

Behaviour after fix:
  * Skill body injection → skipped → walk continues back to find user's
    real prompt (with coverage line).
  * The assistant message containing the Skill tool_use is now inside the
    turn → detectLegitSkillActive finds it → normative writes pass when
    invoked under an active claude-md-management skill.
  * "Continue from where you left off." → still treated as turn boundary.
  * Stop hook feedback strings → still treated as turn boundary.

TDD:
  * 3 new tests in tools/enforce-hook-helpers.test.mjs under the
    "lastTurnEntries / lastUserPromptText / lastAssistantText / turnToolUses"
    describe block:
      - lastTurnEntries skips skill body injections (isMeta + sourceToolUseID)
      - lastTurnEntries does NOT skip "Continue from where you left off"
        (isMeta but no sourceToolUseID)
      - turnToolUses includes Skill tool_use spawned in same turn as the
        injected skill body
  * 2/3 RED→GREEN (the "Continue" negative test passed on baseline already
    since its string content satisfies the existing string-content branch).

Scope:
  * Fixes 2 of the 5 structural quirks documented in the Stream H
    completion log (enforce-memory-coverage gap, enforce-normative-
    content-rules detectLegitSkillActive gap).
  * Does NOT fix: enforce-read-path-deny LEGIT_SKILLS exemption gap
    (separate hook, no lastTurnEntries dependency); TDD-gate cross-actor
    blindness (different mechanism — actor session boundaries);
    detectFullTestRun regex narrowness (command-pattern matching).

Regression: vitest tools 1788/1788 GREEN (was 1785; +3 new tests).

Plan: docs/superpowers/plans/2026-05-30-lastturnentries-skill-body-skip.md
2026-05-30 14:16:12 +03:00
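The discriminator from ffd70d6fa5 can be sketched on a simplified transcript. The isMeta and sourceToolUseID fields come from the commit message; the `type: 'user'` boundary test and the function names are assumptions, and the real lastTurnEntries in the hook helpers handles more entry kinds.

```javascript
// Skip ONLY entries that are meta AND linked to a tool call (injected skill bodies).
function isInjectedSkillBody(entry) {
  return entry.isMeta === true && typeof entry.sourceToolUseID === 'string';
}

// Simplified stand-in: walk backward from the end and stop at the first
// user-side entry that is not an injected skill body.
function lastTurnSketch(entries) {
  const turn = [];
  for (let i = entries.length - 1; i >= 0; i--) {
    const e = entries[i];
    turn.unshift(e);
    if (e.type === 'user' && !isInjectedSkillBody(e)) break;
  }
  return turn;
}
```

A meta entry without sourceToolUseID, such as the "Continue from where you left off." auto-resume, still ends the walk, which matches the behaviour the commit says must be preserved.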
Дмитрий f1c422af49 feat(router-gate-v4): Stream H Task 10 — subagent-prompt-prefix worktree bootstrap auto-inject
Closes Stream H Task 10 (H10) that was deferred from the initial Stream H
push. Adds two pure helpers to tools/subagent-prompt-prefix.mjs and wires
them into buildHeader() so subagents spawned inside a linked git worktree
get a SETUP block with vendor symlink + storage/framework mkdir guidance
in their injected prompt.

Two new exports:

1. detectWorktreeMode({cwd, gitDir, gitCommonDir}) — pure detector that
   returns {isWorktree, parentRepoRoot}. Worktree is detected when the
   per-worktree git-dir differs from the shared git-common-dir; the
   parent repo root is derived by stripping the trailing `/.git` segment
   from the common dir (separators normalized to forward slashes). Handles
   null inputs gracefully and accepts mixed forward/backslash separators.

2. buildSetupBlock({isWorktree, parentRepoRoot, platform}) — pure renderer
   that returns the SETUP — worktree bootstrap text block (or '' to omit
   when not in a worktree or parentRepoRoot is missing). Picks `mklink /D`
   on win32 vs `ln -s` elsewhere. Mentions all four storage/framework
   subdirs (cache, sessions, views, testing) per memory
   `feedback_subagent_worktree_bootstrap.md` — exactly what Pest 4 needs
   to resolve the Eloquent facade and view cache paths inside a worktree.

buildHeader() now resolves --git-dir + --git-common-dir alongside the
existing --show-toplevel, calls detectWorktreeMode to classify the
spawn site, then inserts buildSetupBlock's output between rule 5 and
the END marker. When not in a worktree the block is empty and the header
layout is unchanged.

Regression: vitest tools 1785/1785 GREEN (was 1776; +9 tests across
"detectWorktreeMode (Stream H Task 10)" and "buildSetupBlock (Stream H
Task 10)" describe blocks in the new
tools/subagent-prompt-prefix-h10.test.mjs file). The pre-existing
tools/subagent-prompt-prefix.test.mjs is intentionally excluded from
vitest config (node:test runner used for subprocess-style tests) — H10
helpers are pure and live in the vitest scope so the new test file is
not added to the exclude list.

Stream H Task 10 of 11 — closes the deferred H10. Plan:
docs/superpowers/plans/2026-05-30-router-gate-v4-stream-H.md
2026-05-30 12:08:33 +03:00
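A minimal sketch of the detector described in f1c422af49. The signature and return shape follow the commit message; the body is an assumption, and `cwd` is accepted only to mirror that signature (this sketch does not resolve relative paths against it).

```javascript
// Sketch inferred from the commit message; not the code in
// tools/subagent-prompt-prefix.mjs.
function detectWorktreeMode({ cwd, gitDir, gitCommonDir }) {
  // Null or missing inputs: report "not a worktree" rather than throwing.
  if (!gitDir || !gitCommonDir) return { isWorktree: false, parentRepoRoot: null };

  // Normalize separators to forward slashes and drop trailing slashes.
  const norm = (p) => p.replace(/\\/g, '/').replace(/\/+$/, '');
  const dir = norm(gitDir);
  const common = norm(gitCommonDir);

  // In the main checkout the per-worktree git-dir equals the common dir.
  if (dir === common) return { isWorktree: false, parentRepoRoot: null };

  // In a linked worktree, strip the trailing "/.git" from the common dir.
  return { isWorktree: true, parentRepoRoot: common.replace(/\/\.git$/, '') };
}
```

In practice the two inputs would come from `git rev-parse --git-dir` and `git rev-parse --git-common-dir`, as the buildHeader() description in the commit suggests.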
Дмитрий d75c8922aa fix(router-gate-v4): Stream H Task 9 — cosmetic path-format fixes (Cygwin /c/ prefix + PowerShell $env:VAR expansion)
Closes Stream H Task 9 (H3). Two cosmetic fixes in tools/path-normalization.mjs
for gate error messages observed during Smoke 5 Real Fix Re-test 2026-05-30
(steps 4 and 5). Both purely affect human-readable display in block messages
— security behaviour is unchanged (path-deny still fires correctly in all
the original test scenarios).

1. Cygwin/git-bash `/c/Users/...` prefix collapsed before path.resolve.
   On win32, path.resolve('/c/Users/x') treats `/c/...` as drive-relative
   and prepends cwd's drive letter, producing display paths like
   `c:/c/users/...` (doubled drive). The fix inserts a single-letter-drive
   normalization step BEFORE resolve when the input looks Cygwin-style.
   Guarded by `homedir matches ^[a-zA-Z]:` so POSIX test fixtures
   (homedir='/h') still get the original behaviour.

2. PowerShell `$env:USERPROFILE` syntax expanded in expandEnvVars.
   The expander handled `%NAME%`, `${NAME}`, and bare `$NAME` but not
   the PowerShell-native `$env:NAME` form, so messages displayed the
   literal `$env:USERPROFILE` instead of the expanded path. Added a
   case-insensitive matcher (PowerShell is case-insensitive) covering
   all ENV_WHITELIST names. Non-whitelisted `$env:SECRET` still passes
   through unchanged.

Regression: vitest tools 1776/1776 GREEN (was 1772; +4 new tests across
"pathNormalize" (+1 cygwin), "expandEnvVars — PowerShell $env:VAR
(Stream H Task 9 cosmetic)" (+3)). One pre-existing test ("case-folds on
win32") would have broken without the homedir-drive guard — guard
preserves it.

Stream H Task 9 of 11. Plan: docs/superpowers/plans/2026-05-30-router-gate-v4-stream-H.md
2026-05-30 11:43:31 +03:00
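The two display fixes in d75c8922aa might look like the sketch below. The function names, the two-entry ENV_WHITELIST, and the regexes are assumptions; only the behaviour (collapse `/c/...` when the home directory has a drive letter, expand whitelisted `$env:NAME` case-insensitively, leave everything else untouched) is taken from the commit message.

```javascript
// Assumed subset; the real ENV_WHITELIST is not listed in the commit message.
const ENV_WHITELIST = ['USERPROFILE', 'HOME'];

// "/c/Users/x" -> "c:/Users/x", but only when homedir looks like a Windows
// drive path, so POSIX fixtures such as homedir = "/h" keep their behaviour.
function collapseCygwinDrive(input, homedir) {
  if (!/^[a-zA-Z]:/.test(homedir)) return input;
  return input.replace(/^\/([a-zA-Z])\//, '$1:/');
}

// Expand PowerShell-style $env:NAME for whitelisted names (case-insensitive);
// non-whitelisted or unset names pass through unchanged.
function expandPowerShellEnv(input, env) {
  return input.replace(/\$env:([A-Za-z_][A-Za-z0-9_]*)/gi, (match, name) => {
    const key = ENV_WHITELIST.find((k) => k.toLowerCase() === name.toLowerCase());
    return key && env[key] !== undefined ? env[key] : match;
  });
}
```

Running the drive collapse before path.resolve is what avoids the doubled `c:/c/...` prefix the commit describes; the sketch covers only the string rewrite, not the resolve step.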