diff --git a/docs/observer/STATUS.md b/docs/observer/STATUS.md
index d489b2d5..b1d0a409 100644
--- a/docs/observer/STATUS.md
+++ b/docs/observer/STATUS.md
@@ -1,6 +1,6 @@
 # Brain Status (auto-generated)

-Last updated: 2026-05-25T03:07:04.756Z
+Last updated: 2026-05-25T03:08:14.868Z

 | Controller | State | Details |
 |---|---|---|
diff --git a/docs/superpowers/plans/2026-05-25-llm-first-router-overhaul.md b/docs/superpowers/plans/2026-05-25-llm-first-router-overhaul.md
index f3cadedd..d30d1937 100644
--- a/docs/superpowers/plans/2026-05-25-llm-first-router-overhaul.md
+++ b/docs/superpowers/plans/2026-05-25-llm-first-router-overhaul.md
@@ -1,18 +1,37 @@
-# LLM-first router overhaul: Implementation Plan (phases 1+2+3)
+# LLM-first router overhaul: Implementation Plan v1.1 (phases 1+2+3)

 > **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

 **Goal:** Replace the regex-first router + §12 hard rule with an LLM-first classifier (Sonnet 4.6 + memo) with §17 universal skill-coverage, inheritance for short prompts, and an automatic evidence loop (self-assessment + Opus reviewer subagent) for later distillation of empirical regex rules.

-**Architecture:** Prefilter (regex) → LLM-classifier (Sonnet 4.6) → embedding → §17 gate (warn-only) → execution → self-assessment (Stop-hook) → Opus reviewer subagent (in /brain-retro). Granular flag switches. The episode schema is bumped v4.0→v4.3 in stages. All routing-docs + §12 are archived.
+**Architecture:** Prefilter (regex) → LLM-classifier (Sonnet 4.6) → embedding → §17 gate (warn-only) → execution → self-assessment (Stop-hook) → Opus reviewer subagent (in /brain-retro). Granular flag switches. The episode schema is bumped v4.0→v4.3 in stages. §12 + routing-docs are archived.
+The economy-mode system (0%/5%/100%) is **kept**; only §12 skill-discipline is removed.

-**Tech Stack:** Node.js ESM (`.mjs`), Vitest (for tools tests), Anthropic API (Sonnet 4.6 classifier + Opus 4.7 reviewer/self-assessment), `@xenova/transformers` (local embedding), Claude Code hooks (UserPromptSubmit/PreToolUse/Stop/SessionStart) + agents + skills, lefthook (pre-commit), markdownlint/cspell/gitleaks.
+**Tech Stack:** Node.js ESM (`.mjs`), Vitest, Anthropic API (Sonnet 4.6 classifier + Opus 4.7 reviewer/self-assessment), `@xenova/transformers` (local embedding), Claude Code hooks + agents + skills, lefthook, markdownlint/cspell/gitleaks.

 **Source spec:** `docs/superpowers/specs/2026-05-24-llm-first-router-overhaul-design.md` v2.2.

-**Timing:** implementation starts after Billing v2 Spec C is closed (per user decision 2026-05-24). The plan was written in advance; before starting, run a refresh check (see Task 0).
+**v1.1 changelog (after the 0% audit of plan v1.0):** Task 1 = **«Brain rollback»** (full infra + user-level snapshot + dry-run + end-to-end smoke BEFORE any destruction). Closes 9 audit gaps: G3 (economy vs skill-discipline in user hooks), G5 (parser forward-compat on rollback), G8/G9 (C1/C2 ordering), G11 (accuracy-runner import), G14 (registry-to-classification-map), G16 (reviewer file: create, not keep), user-level rollback, episodes preservation.

-**Scope:** this plan covers implementation phases 1+2+3 (~3.5-4.5 weeks). Phase 4 (distillation, in ~6 months) is a separate plan, to be written once enough data has accumulated.
+**Timing:** implementation starts after Billing v2 Spec C is closed. Before starting, run a refresh check (Task 0).
+
+**Scope:** implementation phases 1+2+3 (~3.5-4.5 weeks). Phase 4 (distillation) is a separate plan, in ~6 months.
+
+---
+
+## ⚠️ DECISION POINT (requires customer confirmation before Task 2 starts)
+
+**Economy-mode vs §12 skill-discipline in `~/.claude/settings.json`** (user-level, outside the repo).
+6 Python hooks in `~/.claude/hooks/`:
+
+| hook | category | action in this plan |
+|---|---|---|
+| `economy-mode.py` | economy (0%/5%/100%) | **KEEP**, edit the stale §12 mention |
+| `economy-self-check.py` | economy | KEEP |
+| `economy-postcompact.py` | economy | KEEP |
+| `economy-state-guard.py` | mixed (economy + §12 bypass-detect) | **EDIT**: remove the §12 part, keep economy |
+| `skill-marker.py` | §12 skill-discipline | **REMOVE** (archive) |
+| `skill-check.py` | §12 skill-discipline | **REMOVE** (archive) |
+
+**Default of this plan:** the economy system is kept (actively used); §12 enforcement (`skill-marker.py` + `skill-check.py` + the §12 part of `state-guard`) is removed. If you want otherwise, adjust Task 2 before starting.

 ---

@@ -23,11 +42,13 @@
 | File | Responsibility |
 |---|---|
 | `docs/adr/ADR-016-section17-universal-skill-coverage.md` | ADR: §17 replaces §12 |
-| `docs/archive/llm-bootstrap-2026-05/` (+ subfolders + ROLLBACK.md) | Archive of §12, routing-docs, memory |
-| `tools/test-rollback.mjs` | Dry-run + execute rollback automation |
+| `docs/archive/llm-bootstrap-2026-05/` (+ subfolders + ROLLBACK.md) | Archive of §12, routing-docs, memory, user-hooks snapshot |
+| `tools/test-rollback.mjs` | Rollback: git-tracked + user-level + flags; dry-run/execute |
 | `tools/router-classifier-regex-fallback.mjs` | Preserved old regex Layer 1 (fallback) |
+| `tools/router-config.mjs` | Model IDs + constants (INHERITANCE_MAX_AGE_MIN, etc.) |
 | `tools/router-embedding.mjs` | Local embedding (Xenova) |
 | `tools/router-embedding-warmup.mjs` | SessionStart pre-warm |
+| `tools/brain-retro-opus-reviewer.mjs` | **CREATE** (does not exist yet): direct-API fallback handler for the reviewer |
 | `tools/brain-retro-sanity-generator.mjs` | top-5 candidate sanity questions |
 | `.claude/skills/self-retrospect/SKILL.md` | Opt-in self-retrospect skill |
 | `docs/observer/.self-retrospect-counter.json` | Counter for the self-retrospect threshold |
@@ -38,27 +59,30 @@
 | File | What |
 |---|---|
 | `tools/router-classifier.mjs` | Prefilter (3 groups + manual override + anchor) + Sonnet classifier + memo |
-| `tools/router-prehook.mjs` | Remove ENFORCEMENT_TYPES; inheritance logic; cost tracking write |
+| `tools/router-accuracy-runner.mjs` | **import classifyByRegex → from regex-fallback** (G11) |
+| `tools/router-prehook.mjs` | Remove ENFORCEMENT_TYPES; inheritance logic; cost tracking |
 | `tools/router-tool-gate.mjs` | §17 enforcement logic + mode |
 | `tools/observer-stop-hook.mjs` | execution_trace + chain_gaps + self_assessment + inheritance copy + timeout |
-| `tools/observer-transcript-parser.mjs` | handle schema v4.0-v4.3 |
+| `tools/observer-transcript-parser.mjs` | handle schema v4.0-v4.3; **forward-compat for rollback (G5)** |
 | `tools/missed-activations.mjs` | source → nodes.yaml; §17 definition |
+| `tools/registry-to-classification-map.mjs` | **neutralize/archive**: map deprecated (G14) |
 | `tools/brain-retro-analyzer.mjs` | v4.* factors + inheritance analysis |
-| `tools/status-md-generator.mjs` | cost + anomaly + self-retrospect + reviewer ratio sections |
+| `tools/status-md-generator.mjs` | cost + anomaly + self-retrospect + reviewer ratio |
 | `tools/l1-watcher.mjs` | source Tooling §3.3 → nodes.yaml |
 | `tools/cross-ref-checker.mjs` | cross-refs list §12→§17 |
-| `tools/brain-retro-opus-reviewer.mjs` | keep as direct-API fallback handler |
-| `.claude/agents/reviewer-agent.md` | verify and, if needed,
update (already created) |
-| `.claude/skills/brain-retro/SKILL.md` | new steps 5/6/9/11 + description |
+| `.claude/agents/reviewer-agent.md` | verify (already created) |
+| `.claude/skills/brain-retro/SKILL.md` | new steps + description |
 | `.claude/skills/brain-retro/references/aggregation-template.md` | adapted to the extended procedure |
-| `.claude/settings.json` | Stop-hook timeout 15s; SessionStart hook; remove skill-discipline (if any) |
+| `.claude/settings.json` (project) | Stop-hook timeout 15s; SessionStart hook |
+| `~/.claude/settings.json` (user) | remove skill-marker/skill-check; edit economy-state-guard (G3) |
+| `~/.claude/hooks/economy-mode.py` | remove the stale «§12» text (G3) |
 | `docs/registry/nodes.yaml` | +`capabilities:` per node |
-| `docs/Pravila_raboty_Claude_v1_1.md` | −§12, +§17, §16.4 update |
+| `docs/Pravila_raboty_Claude_v1_1.md` | −§12, +§17, §16.4 |
 | `CLAUDE.md`, `docs/Plugin_stack_rules_v1.md`, `docs/Tooling_v8_3.md` | cross-refs §12→§17, archive §3.3/R15 |

 ---

-## Phase 0: Pre-flight (before start, ~0.5 day)
+## Phase 0: Pre-flight (~0.5 day)

 ### Task 0: Refresh check + worktree

@@ -72,9 +96,9 @@ Run:
 git fetch origin && git log HEAD..origin/main --oneline
 ```

-Expected: see whether the base has diverged. If origin/main has new commits, rebase before starting.
+Expected: see whether the base has diverged. If there are new commits, rebase before starting.

-- [ ] **Step 2: Verify the spec is up to date**
+- [ ] **Step 2: Verify the spec is up to date (node count)**

 Run:

@@ -82,7 +106,7 @@ Run:
 node -e "const fs=require('fs'); const y=fs.readFileSync('docs/registry/nodes.yaml','utf8'); console.log('nodes:', (y.match(/^ - id:/gm)||[]).length)"
 ```

-Expected: the node count. If it changed vs the spec (84), update the §11.2 task 2 count. If it changed by >5 nodes, re-read spec §18.3 and adjust.
+Expected: ~84. If it changed by >5 nodes, re-read spec §18.3 and adjust the Task 8 count.
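
The drift rule in Step 2 can be expressed as a small check. This is only a minimal sketch: the helper name `nodeCountDrift`, its parameters, and the whitespace-tolerant regex are illustrative assumptions and are not part of the plan or the repo.

```javascript
// Hypothetical helper (not in the repo): counts registry nodes and flags drift.
// Assumption: every node entry in nodes.yaml starts with "- id:" on its own line.
function nodeCountDrift(yamlText, expected = 84, tolerance = 5) {
  const count = (yamlText.match(/^[ \t]*- id:/gm) || []).length;
  const delta = Math.abs(count - expected);
  // Per Step 2: a change of more than `tolerance` nodes means the spec must be re-read.
  return { count, delta, needsSpecReread: delta > tolerance };
}

// Usage with an inline sample instead of the real file.
const sample = ['nodes:', '  - id: a', '    name: A', '  - id: b'].join('\n');
console.log(nodeCountDrift(sample, 2));
```

In practice the text would come from `docs/registry/nodes.yaml`, and a truthy `needsSpecReread` would be the signal to revisit spec §18.3 before continuing.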
 - [ ] **Step 3: Verify Sonnet 4.6 model ID**

@@ -92,69 +116,204 @@ Run:
 node -e "fetch('https://api.anthropic.com/v1/models',{headers:{'x-api-key':process.env.ANTHROPIC_API_KEY,'anthropic-version':'2023-06-01'}}).then(r=>r.json()).then(d=>console.log(d.data.filter(m=>m.id.includes('sonnet')).map(m=>m.id)))"
 ```

-Expected: a list of Sonnet model IDs. Record the exact `claude-sonnet-4-6-` in `tools/router-config.mjs` (created in Task 9).
+Expected: a list of Sonnet IDs. Record the exact `claude-sonnet-4-6-` for Task 8.

-- [ ] **Step 4: Create isolated worktree (Pravila §15.1)**
+- [ ] **Step 4: Confirm DECISION POINT (economy vs skill-discipline)**

-Use `superpowers:using-git-worktrees`. Branch name `feat/llm-first-router`.
+Confirm: economy is kept, §12 skill-discipline is removed. Do not start Task 2 without confirmation.
+
+- [ ] **Step 5: Create isolated worktree (Pravila §15.1)**
+
+Use `superpowers:using-git-worktrees`. Branch `feat/llm-first-router`.

 ---

-## Phase 1: Foundation + archive (~1 week)
+## Phase 1: Brain rollback + foundation + archive (~1 week)

-### Task 1: Git tag + archive structure
+### Task 1: ⭐ BRAIN ROLLBACK, full infrastructure + verification (FIRST, before any destruction)
+
+**Goal:** before breaking anything, set up and **prove** a working full-rollback mechanism (git-tracked + user-level + runtime flags). If this task is not green, we do not proceed.
 **Files:**

-- Create: `docs/archive/llm-bootstrap-2026-05/` (+ subfolders)
+- Create: `tools/test-rollback.mjs`, `tools/test-rollback.test.mjs`
+- Create: `docs/archive/llm-bootstrap-2026-05/` (+ subfolders), `ROLLBACK.md`
+- Create: snapshots: `nodes-yaml-archive/`, `settings-snapshot/`, `runtime-flags-snapshot/`, `user-hooks/`

-- [ ] **Step 1: Tag pre-overhaul state**
+- [ ] **Step 1: Git tag pre-overhaul state**

 ```bash
-git tag brain-pre-llm-bootstrap
-git tag --list brain-pre-llm-bootstrap
+git tag brain-pre-llm-bootstrap && git tag --list brain-pre-llm-bootstrap
 ```

-Expected: the tag is created on the current HEAD.
+Expected: the tag is on the current HEAD (covers all git-tracked files).

-- [ ] **Step 2: Create archive dirs**
+- [ ] **Step 2: Create archive structure**

 ```bash
-mkdir -p docs/archive/llm-bootstrap-2026-05/{pravila-12,routing-docs,tooling-when-to-use,memory,settings-skill-discipline,nodes-yaml-archive}
+mkdir -p docs/archive/llm-bootstrap-2026-05/{pravila-12,routing-docs,tooling-when-to-use,memory,settings-snapshot,nodes-yaml-archive,runtime-flags-snapshot,user-hooks}
 ```

-- [ ] **Step 3: Snapshot nodes.yaml (for rollback)**
+- [ ] **Step 3: Snapshot ALL non-git-tracked state (closes the user-level rollback gap + G3)**

 ```bash
-cp docs/registry/nodes.yaml docs/archive/llm-bootstrap-2026-05/nodes-yaml-archive/nodes.yaml.pre-overhaul
+node -e "const fs=require('fs'),os=require('os'),p=require('path'); const A='docs/archive/llm-bootstrap-2026-05'; const us=p.join(os.homedir(),'.claude','settings.json'); if(fs.existsSync(us)) fs.copyFileSync(us, A+'/settings-snapshot/user-settings.json.pre-overhaul'); const hd=p.join(os.homedir(),'.claude','hooks'); if(fs.existsSync(hd)){for(const f of fs.readdirSync(hd)) fs.copyFileSync(p.join(hd,f), A+'/user-hooks/'+f);} const rd=p.join(os.homedir(),'.claude','runtime'); if(fs.existsSync(rd)){for(const f of fs.readdirSync(rd).filter(x=>x.endsWith('-mode.json'))) fs.copyFileSync(p.join(rd,f), A+'/runtime-flags-snapshot/'+f);}
+fs.copyFileSync('docs/registry/nodes.yaml', A+'/nodes-yaml-archive/nodes.yaml.pre-overhaul'); fs.copyFileSync('.claude/settings.json', A+'/settings-snapshot/project-settings.json.pre-overhaul'); console.log('snapshots written')"
 ```

-- [ ] **Step 4: Commit archive scaffold**
+Expected: «snapshots written». Verify: `ls docs/archive/llm-bootstrap-2026-05/{settings-snapshot,user-hooks,runtime-flags-snapshot}` is non-empty.

-```bash
-git add docs/archive/llm-bootstrap-2026-05/ && git commit -m "chore(brain): archive scaffold + pre-overhaul tag (phase 1)"
+- [ ] **Step 4: Write failing test for test-rollback.mjs**
+
+```javascript
+// tools/test-rollback.test.mjs
+import { describe, it, expect } from 'vitest';
+import { planRollback } from './test-rollback.mjs';
+
+describe('planRollback', () => {
+  it('restores git-tracked via tag + lists user-level snapshots', () => {
+    const plan = planRollback();
+    expect(plan.gitTag).toBe('brain-pre-llm-bootstrap');
+    expect(plan.userLevelRestores.some(r => r.to.includes('settings.json'))).toBe(true);
+  });
+  it('resets runtime flags to snapshot (not hardcoded)', () => {
+    expect(planRollback().flagStrategy).toBe('restore-snapshot-delete-new');
+  });
+  it('lists episodes as PRESERVED not reverted (G5/G6)', () => {
+    expect(planRollback().preserve.some(x => x.includes('episodes'))).toBe(true);
+  });
+});
 ```

-### Task 2: Inventory user-level hooks
+- [ ] **Step 5: Run test to verify it fails**

-**Files:** none (read-only inventory → write it to the archive README)
+Run: `npx vitest run tools/test-rollback.test.mjs`
+Expected: FAIL (`planRollback is not exported`).

-- [ ] **Step 1: Read user settings**
+- [ ] **Step 6: Implement test-rollback.mjs (execFileSync: no shell, no injection)**

-```bash
-node -e "const fs=require('fs'),os=require('os'),p=require('path'); const f=p.join(os.homedir(),'.claude','settings.json'); const j=JSON.parse(fs.readFileSync(f,'utf8')); console.log(JSON.stringify(j.hooks||{},null,2))"
+```javascript
+#!/usr/bin/env node
+// tools/test-rollback.mjs (the shebang must stay on the first line of the file)
+import { existsSync, copyFileSync, readdirSync, rmSync } from 'fs';
+import { join } from 'path';
+import { homedir } from 'os';
+import { execFileSync } from 'child_process';
+import { fileURLToPath } from 'url';
+
+const ARCHIVE = 'docs/archive/llm-bootstrap-2026-05';
+
+export function planRollback() {
+  return {
+    gitTag: 'brain-pre-llm-bootstrap',
+    gitStrategy: 'git checkout brain-pre-llm-bootstrap -- ',
+    userLevelRestores: [
+      { from: `${ARCHIVE}/settings-snapshot/user-settings.json.pre-overhaul`, to: '~/.claude/settings.json' },
+      { from: `${ARCHIVE}/user-hooks/*`, to: '~/.claude/hooks/' },
+    ],
+    flagStrategy: 'restore-snapshot-delete-new',
+    preserve: ['docs/observer/episodes-*.jsonl', 'docs/observer/notes/*'],
+    parserNote: 'after rollback the parser stays forward-compatible with v4 episodes (read-only graceful skip); see Task 15 (G5)',
+  };
+}
+
+function dryRun() {
+  const plan = planRollback();
+  let ok = true;
+  const base = `${ARCHIVE}/settings-snapshot/user-settings.json.pre-overhaul`;
+  if (!existsSync(base)) { console.error('MISSING snapshot:', base); ok = false; }
+  try { execFileSync('git', ['rev-parse', plan.gitTag], { stdio: 'pipe' }); }
+  catch { console.error('MISSING git tag:', plan.gitTag); ok = false; }
+  console.log(ok ?
+    '[dry-run] OK - rollback ready' : '[dry-run] FAIL - see above');
+  return ok;
+}
+
+function execRollback() {
+  const usFrom = `${ARCHIVE}/settings-snapshot/user-settings.json.pre-overhaul`;
+  if (existsSync(usFrom)) copyFileSync(usFrom, join(homedir(), '.claude', 'settings.json'));
+  const hooksDir = `${ARCHIVE}/user-hooks`;
+  if (existsSync(hooksDir)) for (const f of readdirSync(hooksDir)) copyFileSync(join(hooksDir, f), join(homedir(), '.claude', 'hooks', f));
+  const rd = join(homedir(), '.claude', 'runtime');
+  const snapDir = `${ARCHIVE}/runtime-flags-snapshot`;
+  const snapFlags = existsSync(snapDir) ? readdirSync(snapDir) : [];
+  // Guard: ~/.claude/runtime may not exist, and readdirSync would then throw ENOENT.
+  const liveFlags = existsSync(rd) ? readdirSync(rd).filter(x => x.endsWith('-mode.json')) : [];
+  for (const f of liveFlags) {
+    if (!snapFlags.includes(f)) rmSync(join(rd, f));
+  }
+  for (const f of snapFlags) copyFileSync(join(snapDir, f), join(rd, f));
+  console.log('[execute] user-level + flags restored. Now run: git checkout brain-pre-llm-bootstrap -- . && npm install');
+}
+
+const isMain = process.argv[1] && fileURLToPath(import.meta.url) === process.argv[1];
+if (isMain) {
+  const mode = process.argv[2];
+  if (mode === '--dry-run') process.exit(dryRun() ? 0 : 1);
+  else if (mode === '--execute') execRollback();
+  else console.log('usage: test-rollback.mjs --dry-run | --execute');
+}
 ```

-Expected: a list of user-level hooks. If any look like skill-discipline hooks (matcher on economy/skill-marker/skill-check), note their names.
+- [ ] **Step 7: Run test to verify it passes**

-- [ ] **Step 2: Save inventory to archive**
+Run: `npx vitest run tools/test-rollback.test.mjs`
+Expected: PASS (3 tests).

-Write `docs/archive/llm-bootstrap-2026-05/settings-skill-discipline/INVENTORY.md` with the list found (or «no skill-discipline hooks found in user settings» if there are none).
+- [ ] **Step 8: Write ROLLBACK.md**

-- [ ] **Step 3: Commit**
+Write `docs/archive/llm-bootstrap-2026-05/ROLLBACK.md` step by step: (1) `node tools/test-rollback.mjs --execute` (user-level + flags), (2) `git checkout brain-pre-llm-bootstrap -- .` (git-tracked), (3) `npm install` (deps revert, G4), (4) smoke. State explicitly: **episodes-*.jsonl are NOT rolled back** (G6); the **parser stays forward-compatible** with v4 (G5). Include a from→to manifest.
+
+- [ ] **Step 9: ⭐ END-TO-END SMOKE: prove the rollback works BEFORE any destruction**

 ```bash
-git add docs/archive/llm-bootstrap-2026-05/settings-skill-discipline/ && git commit -m "chore(brain): user-level hooks inventory (phase 1 task 2)"
+echo "rollback-smoke-test" > docs/observer/.rollback-smoke && git add docs/observer/.rollback-smoke && git commit -m "test: rollback smoke marker (TEMP)"
+node -e "require('fs').writeFileSync(require('path').join(require('os').homedir(),'.claude','runtime','smoke-test-mode.json'),'{}')"
+node tools/test-rollback.mjs --dry-run
+node tools/test-rollback.mjs --execute
+git checkout --no-overlay brain-pre-llm-bootstrap -- . ; git reset --soft brain-pre-llm-bootstrap
+node -e "const fs=require('fs'),os=require('os'),p=require('path'); console.log('marker:', fs.existsSync('docs/observer/.rollback-smoke')?'STILL THERE (FAIL)':'gone (OK)'); console.log('smoke-flag:', fs.existsSync(p.join(os.homedir(),'.claude','runtime','smoke-test-mode.json'))?'STILL THERE (FAIL)':'deleted (OK)')"
+```
+
+Expected: marker gone, smoke-flag deleted, router-gate-mode.json = warn-only. Only the marker is committed, so the still-untracked rollback files survive the checkout. `--no-overlay` (Git 2.22+) makes the checkout also delete tracked files that are absent from the tag, such as the committed marker; a plain `git checkout <tag> -- .` would leave it in place. If any check reports FAIL: **STOP** and fix test-rollback.mjs before continuing.
+
+- [ ] **Step 10: Re-commit rollback infra (clean)**
+
+```bash
+git add tools/test-rollback.mjs tools/test-rollback.test.mjs docs/archive/llm-bootstrap-2026-05/ && git commit -m "feat(brain): rollback infra + snapshots + e2e-verified BEFORE any destruction (phase 1 task 1)"
+```
+
+**Exit Task 1:** the brain rollback is set up, tested end-to-end, and proven. From here on, any destruction is reversible in about 1 hour.
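
The `restore-snapshot-delete-new` flag strategy used by Task 1 can be isolated as a pure function, which makes it testable without touching `~/.claude/runtime`. The sketch below is an illustration under assumptions: the function name `planFlagRestore` and the sample file names are hypothetical, and the only rule taken from the plan is that flag files end in `-mode.json`.

```javascript
// Hypothetical pure helper (not in the repo) for the 'restore-snapshot-delete-new' strategy.
// Assumption: runtime flags are files whose names end in "-mode.json".
function planFlagRestore(liveFiles, snapshotFiles) {
  const isFlag = (name) => name.endsWith('-mode.json');
  const snapshot = new Set(snapshotFiles.filter(isFlag));
  return {
    // Flags that appeared after the snapshot did not exist pre-overhaul: delete them.
    toDelete: liveFiles.filter((f) => isFlag(f) && !snapshot.has(f)),
    // Every snapshotted flag is copied back, overwriting the live value.
    toRestore: [...snapshot],
  };
}

// Usage: one flag existed before the overhaul, one was created afterwards.
const flagPlan = planFlagRestore(
  ['router-gate-mode.json', 'smoke-test-mode.json', 'notes.txt'],
  ['router-gate-mode.json'],
);
console.log(flagPlan);
```

Keeping the decision separate from the file operations means the deletion list can be printed in `--dry-run` mode before anything is removed.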
+
+### Task 2: Remove §12 skill-discipline, keep economy (G3, DECISION POINT)
+
+**Files:**
+
+- Modify: `~/.claude/settings.json` (user), `~/.claude/hooks/economy-state-guard.py`, `~/.claude/hooks/economy-mode.py`
+- Archive: `skill-marker.py`, `skill-check.py` (snapshot already taken in Task 1)
+
+- [ ] **Step 1: Baseline: economy hooks work**
+
+Run: `python "$HOME/.claude/hooks/economy-mode.py" < /dev/null; echo "exit: $?"`
+Expected: exit 0 (or graceful). Record the baseline.
+
+- [ ] **Step 2: Remove skill-marker.py + skill-check.py from user settings**
+
+Delete the blocks `PreToolUse Skill → skill-marker.py` and `PreToolUse Edit|Write|MultiEdit → skill-check.py` from `~/.claude/settings.json`. The originals are in the snapshot.
+
+- [ ] **Step 3: economy-state-guard.py: remove the §12 part**
+
+Read `economy-state-guard.py`, remove only the §12 skill-discipline logic (Bash-bypass-detect skill enforcement), and keep the economy-state logic. If it is purely economy, no-op.
+
+- [ ] **Step 4: economy-mode.py: update «§12»→«§17»**
+
+In `economy-mode.py`, change the injected text «§12 hard rule НЕ override-ится» to «§17 universal skill-coverage».
+
+- [ ] **Step 5: Verify economy works + §12 enforcement is removed**
+
+Run: `python "$HOME/.claude/hooks/economy-mode.py" < /dev/null; echo "exit: $?"`
+Expected: exit 0, the economy directive is generated. skill-check no longer blocks Edit without a skill.
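
The removal described in Step 2 can be sketched as a transformation of the parsed settings object. This is a hedged illustration, not the plan's procedure: the helper name `removeHookCommands` is hypothetical, and the hook layout `{ [event]: [{ matcher, hooks: [{ type, command }] }] }` is an assumption that should be checked against the actual `~/.claude/settings.json` before use.

```javascript
// Hypothetical helper (not in the repo): drops hook commands that reference given scripts.
// Assumption: settings.hooks is shaped as { [event]: [{ matcher, hooks: [{ type, command }] }] }.
function removeHookCommands(settings, scriptNames) {
  const refersTo = (cmd) => scriptNames.some((s) => String(cmd).includes(s));
  const hooks = {};
  for (const [event, groups] of Object.entries(settings.hooks || {})) {
    const kept = groups
      .map((g) => ({ ...g, hooks: (g.hooks || []).filter((h) => !refersTo(h.command)) }))
      .filter((g) => g.hooks.length > 0); // drop matcher groups that are now empty
    if (kept.length > 0) hooks[event] = kept;
  }
  // Returns a new object; the input is left untouched so it can be diffed or re-snapshotted.
  return { ...settings, hooks };
}

// Usage with an inline sample instead of the real user settings.
const sampleSettings = {
  hooks: {
    PreToolUse: [
      { matcher: 'Skill', hooks: [{ type: 'command', command: 'python ~/.claude/hooks/skill-marker.py' }] },
      { matcher: 'Bash', hooks: [{ type: 'command', command: 'python ~/.claude/hooks/economy-state-guard.py' }] },
    ],
  },
};
console.log(JSON.stringify(removeHookCommands(sampleSettings, ['skill-marker.py', 'skill-check.py']), null, 2));
```

Economy hooks pass through unchanged because only commands that mention the listed script names are filtered out.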
+
+- [ ] **Step 6: Commit (project-side note)**
+
+```bash
+git add docs/archive/llm-bootstrap-2026-05/ && git commit -m "chore(brain): archive §12 skill-discipline hooks, keep economy (phase 1 task 2)"
 ```

 ### Task 3: Inventory discipline-metrics.mjs (closes B4)
@@ -163,114 +322,30 @@ git add docs/archive/llm-bootstrap-2026-05/settings-skill-discipline/ && git com

 - [ ] **Step 1: Read what it measures**

-```bash
-node -e "console.log(require('fs').readFileSync('tools/discipline-metrics.mjs','utf8').slice(0,2000))"
-```
+Run: `node -e "console.log(require('fs').readFileSync('tools/discipline-metrics.mjs','utf8').slice(0,2000))"`

 - [ ] **Step 2: Decide keep/remove**

-If it measures only §12 discipline (skill-invocation rate under §12) → archive together with §12. If it measures the general `path_type` (still needed by brain-retro) → keep. Record the decision in the commit message.
+only-§12 → archive; general path_type → keep. Record the decision in the commit.

-- [ ] **Step 3: Apply decision + commit**
-
-If remove: `git mv tools/discipline-metrics.mjs docs/archive/llm-bootstrap-2026-05/` + remove it from the STATUS.md generator if referenced. If keep: no-op.
+- [ ] **Step 3: Apply + commit**

 ```bash
-git commit -m "chore(brain): discipline-metrics.mjs decision - (phase 1 task 3)"
+git commit -m "chore(brain): discipline-metrics.mjs - (phase 1 task 3)"
 ```

-### Task 4: test-rollback.mjs (TDD)
+### Task 4: Archive §12 + routing-docs + memory + registry-map (G14)

 **Files:**

-- Create: `tools/test-rollback.mjs`
-- Test: `tools/test-rollback.test.mjs`
-
-- [ ] **Step 1: Write failing test**
-
-```javascript
-// tools/test-rollback.test.mjs
-import { describe, it, expect } from 'vitest';
-import { planRollback } from './test-rollback.mjs';
-
-describe('planRollback', () => {
-  it('lists archive files to restore with their targets', () => {
-    const archive = {
-      'pravila-12/Pravila_section_12.md': 'docs/Pravila_raboty_Claude_v1_1.md',
-      'routing-docs/routing-off-phase.md': 'docs/routing-off-phase.md',
-    };
-    const plan = planRollback(archive);
-    expect(plan.restores).toHaveLength(2);
-    expect(plan.restores[0]).toEqual({ from: 'pravila-12/Pravila_section_12.md', to: 'docs/Pravila_raboty_Claude_v1_1.md' });
-  });
-
-  it('returns flags to reset', () => {
-    const plan = planRollback({});
-    expect(plan.flags['router-classifier-mode.json']).toBe('regex-first');
-    expect(plan.flags['skill-discipline-mode.json']).toBe('on');
-  });
-});
-```
-
-- [ ] **Step 2: Run test to verify it fails**
-
-Run: `npx vitest run tools/test-rollback.test.mjs`
-Expected: FAIL (`planRollback is not exported`).
-
-- [ ] **Step 3: Write minimal implementation**
-
-```javascript
-// tools/test-rollback.mjs
-#!/usr/bin/env node
-const PRE_OVERHAUL_FLAGS = {
-  'router-classifier-mode.json': 'regex-first',
-  'router-gate-mode.json': 'off',
-  'skill-discipline-mode.json': 'on',
-  'reviewer-mode.json': 'off',
-  'self-assessment-mode.json': 'off',
-  'self-retrospect-mode.json': 'off',
-  'embedding-mode.json': 'off',
-  'sanity-check-mode.json': 'off',
-  'prompt-enrichment-mode.json': 'off',
-  'inheritance-mode.json': 'off',
-};
-
-export function planRollback(archiveMap) {
-  return {
-    restores: Object.entries(archiveMap).map(([from, to]) => ({ from, to })),
-    flags: PRE_OVERHAUL_FLAGS,
-  };
-}
-// CLI: --dry-run validates archive integrity; --execute restores. (added in step 5)
-```
-
-- [ ] **Step 4: Run test to verify it passes**
-
-Run: `npx vitest run tools/test-rollback.test.mjs`
-Expected: PASS (2 tests).
-
-- [ ] **Step 5: Add CLI (--dry-run / --execute)**
-
-Add a `main()` that reads the archive manifest from the `docs/archive/llm-bootstrap-2026-05/ROLLBACK.md` frontmatter; `--dry-run` checks that every `from` file exists, `--execute` copies the files back and writes the flag files. CLI guard via `fileURLToPath(import.meta.url) === process.argv[1]`.
-
-- [ ] **Step 6: Commit**
-
-```bash
-git add tools/test-rollback.mjs tools/test-rollback.test.mjs && git commit -m "feat(brain): test-rollback.mjs dry-run/execute (phase 1 task 4)"
-```
-
-### Task 5: Archive §12 + routing-docs + memory
-
-**Files:**
-
-- Modify: `docs/Pravila_raboty_Claude_v1_1.md` (extract §12)
-- Move: `docs/routing-off-phase.md`, `docs/router-procedure.md`, `tools/observer-classification-map.json`, 2 memory files
+- Modify: `docs/Pravila_raboty_Claude_v1_1.md`
+- Move: routing-off-phase.md, router-procedure.md, observer-classification-map.json, registry-to-classification-map.mjs, 2 memory files

 - [ ] **Step 1: Extract Pravila §12 to archive**

-Copy the text of §12 (subsections 12.1-12.4) to `docs/archive/llm-bootstrap-2026-05/pravila-12/Pravila_section_12.md`. Remove §12 from `docs/Pravila_raboty_Claude_v1_1.md` (NB: §15.2 pre-flight: Pravila is on the list of 8 normative files; the sync was done in Task 0).
+§12 (12.1-12.4) → `archive/.../pravila-12/Pravila_section_12.md`, then remove it from Pravila.

-- [ ] **Step 2: Move routing-docs**
+- [ ] **Step 2: Move routing-docs + classification-map**

 ```bash
 git mv docs/routing-off-phase.md docs/archive/llm-bootstrap-2026-05/routing-docs/
@@ -278,120 +353,115 @@ git mv docs/router-procedure.md docs/archive/llm-bootstrap-2026-05/routing-docs/
 git mv tools/observer-classification-map.json docs/archive/llm-bootstrap-2026-05/routing-docs/
 ```

-NB: `tools/missed-activations.mjs` reads `observer-classification-map.json`, so it will be broken until Task 12 (Phase 2 task 9 adapts it to nodes.yaml). Between phases the checker is not invoked in production (only in /brain-retro), so the window is safe.
+- [ ] **Step 3: Neutralize registry-to-classification-map.mjs (G14)**

-- [ ] **Step 3: Move 2 memory files**
+Run: `grep -rn "registry-to-classification-map" lefthook.yml package.json .claude/ 2>&1 || echo "no refs"`
+Expected: find the references.
+Then `git mv tools/registry-to-classification-map.mjs docs/archive/llm-bootstrap-2026-05/routing-docs/` + remove its invocations from lefthook.yml.
+
+- [ ] **Step 4: Move 2 memory files**

 ```bash
 git mv memory/feedback_superpowers_hard_rule.md docs/archive/llm-bootstrap-2026-05/memory/
 git mv memory/feedback_feature_via_writing_plans.md docs/archive/llm-bootstrap-2026-05/memory/
 ```

-Remove their lines from the `memory/MEMORY.md` index.
+Remove their lines from `memory/MEMORY.md`.

-- [ ] **Step 4: Verify §12 references won't break lint**
+- [ ] **Step 5: Verify lint + commit**

 Run: `npx -y markdownlint-cli2 docs/Pravila_raboty_Claude_v1_1.md`
-Expected: 0 errors (or only pre-existing ones).
-
-- [ ] **Step 5: Commit**
+Expected: 0 errors.

 ```bash
-git add -A && git commit -m "chore(brain): archive §12 + routing-docs + 2 memory files (phase 1 task 5)"
+git add -A && git commit -m "chore(brain): archive §12 + routing-docs + classification-map + 2 memory (phase 1 task 4)"
 ```

-### Task 6: Add Pravila §17 + ADR-016
+### Task 5: Add Pravila §17 + ADR-016

 **Files:**

-- Modify: `docs/Pravila_raboty_Claude_v1_1.md` (+§17, §16.4 update)
+- Modify: `docs/Pravila_raboty_Claude_v1_1.md`
 - Create: `docs/adr/ADR-016-section17-universal-skill-coverage.md`

 - [ ] **Step 1: Add §17 to Pravila**

-Insert the §17 text from spec §6 (with the sub-items: Acknowledgment/Cancellation exempt, Continuation NOT exempt, per the D1 fix). Update §16.4 (missed-activations work with §17). Update the §0/§1 cross-refs + the §9 changelog entry.
+§17 from spec §6 (Acknowledgment/Cancellation exempt; Continuation NOT exempt, per D1). Update §16.4 + §0/§1 cross-refs + §9 changelog.

 - [ ] **Step 2: Create ADR-016**

-Write `docs/adr/ADR-016-section17-universal-skill-coverage.md` with the structure Status (Accepted) / Context (§12 a-priori bias) / Decision (§17 default-deny) / Consequences (§19 trade-offs) / Boundaries (§17 vs §14 vs §15). Include a `## Enforcement` section (required by the adr-judge lefthook job).
+Status/Context/Decision/Consequences/Boundaries + a `## Enforcement` section (adr-judge).

-- [ ] **Step 3: Verify adr-judge passes**
+- [ ] **Step 3: Verify + commit**

-Run: `git add docs/adr/ADR-016-section17-universal-skill-coverage.md && npx -y markdownlint-cli2 docs/adr/ADR-016-section17-universal-skill-coverage.md`
+Run: `npx -y markdownlint-cli2 docs/adr/ADR-016-section17-universal-skill-coverage.md`
 Expected: 0 errors.

+```bash
+git add -A && git commit -m "feat(brain): Pravila §17 + ADR-016 (phase 1 task 5)"
+```
+
+### Task 6: Normative cross-refs §12→§17 + adapt C1/C2 EARLY (closes G8/G9)
+
+**Files:**
+
+- Modify: `CLAUDE.md`, `docs/Plugin_stack_rules_v1.md`, `docs/Tooling_v8_3.md`, `tools/cross-ref-checker.mjs`, `tools/l1-watcher.mjs`
+
+- [ ] **Step 1: Archive §3.3 / R15 / Tooling «when to use»**
+
+CLAUDE.md §3.3 → archive + pin to nodes.yaml; §5 item 11 → §17. PSR_v1 R15 → archive, R10.1 pin, R0.4.A §12→§17. Tooling §4.X «when to use» → archive, §0 pin, remove the §3.X router-procedure cross-refs. NB: editing CLAUDE.md directly is allowed here under the worktree exception to §5 item 10.
+
+- [ ] **Step 2: Adapt C1 + C2 IMMEDIATELY (G8/G9)**
+
+`cross-ref-checker.mjs`: update the expected cross-refs (§12→§17, drop §3.3). `l1-watcher.mjs`: source §3.3 → nodes.yaml. Otherwise pre-commit will fail from Task 7 onward.
+
+- [ ] **Step 3: Verify checkers green**
+
+Run: `node tools/cross-ref-checker.mjs && node tools/l1-watcher.mjs`
+Expected: OK / 0 drift.
+
 - [ ] **Step 4: Commit**

 ```bash
-git add -A && git commit -m "feat(brain): Pravila §17 + ADR-016 universal skill-coverage (phase 1 task 6)"
+git add -A && git commit -m "chore(brain): cross-refs §12→§17 + C1/C2 adapted early (phase 1 task 6)"
 ```

-### Task 7: Normative cross-refs §12→§17 (delegate to normative-sync agent)
+### Task 7: Phase-1 flags + rollback re-verify

 **Files:**

-- Modify: `CLAUDE.md` (§3.3 → pin, §5 item 11 rewrite), `docs/Plugin_stack_rules_v1.md` (R10.1 pin, R15 archive, R0.4.A), `docs/Tooling_v8_3.md` (§4.X «when to use» archive, §0 pin)
+- Create: `~/.claude/runtime/skill-discipline-mode.json`=off

-- [ ] **Step 1: Archive CLAUDE.md §3.3**
-
-Copy §3.3 to `docs/archive/llm-bootstrap-2026-05/routing-docs/CLAUDE-section-3.3.md` and replace it in CLAUDE.md with a pin: `See the registry [docs/registry/nodes.yaml](docs/registry/nodes.yaml)`. Rewrite §5 item 11 (§12 → §17). NB: CLAUDE.md may only be edited through the `claude-md-management` plugin (§5 item 10), but the worktree exception applies (precedent A8/A11/finance), so a direct Edit is allowed.
-
-- [ ] **Step 2: Archive PSR_v1 R15 + Tooling «when to use»**
-
-PSR_v1: R15 → archive, R10.1 counters → pin, R0.4.A §12→§17. Tooling: §4.X «when to use» fields (the inventory found none; re-check, and extract if present) → archive, §0 counters → pin, remove the §3.X cross-refs to router-procedure.md.
-
-- [ ] **Step 3: Verify cross-ref-checker (it is adapted after Task 11; manual check for now)**
-
-Run: `npx -y markdownlint-cli2 CLAUDE.md docs/Plugin_stack_rules_v1.md docs/Tooling_v8_3.md`
-Expected: 0 errors.
-
-- [ ] **Step 4: Commit**
+- [ ] **Step 1: Set phase-1 flags**

 ```bash
-git add -A && git commit -m "chore(brain): normative cross-refs §12→§17, archive §3.3/R15 (phase 1 task 7)"
+node -e "const fs=require('fs'),os=require('os'),p=require('path'); const d=p.join(os.homedir(),'.claude','runtime'); fs.mkdirSync(d,{recursive:true}); fs.writeFileSync(p.join(d,'skill-discipline-mode.json'),JSON.stringify({mode:'off'})); console.log('skill-discipline off')"
 ```

-### Task 8: Create flag files + ROLLBACK.md
+(router-gate-mode stays warn-only.)

-**Files:**
-
-- Create: `~/.claude/runtime/*.json` (9 flags), `docs/archive/llm-bootstrap-2026-05/ROLLBACK.md`
-
-- [ ] **Step 1: Create flag files in phase-1 state**
-
-```bash
-node -e "const fs=require('fs'),os=require('os'),p=require('path'); const d=p.join(os.homedir(),'.claude','runtime'); fs.mkdirSync(d,{recursive:true}); const flags={'router-gate-mode':'off','skill-discipline-mode':'off'}; for(const[k,v]of Object.entries(flags)) fs.writeFileSync(p.join(d,k+'.json'),JSON.stringify({mode:v})); console.log('phase-1 flags written')"
-```
-
-(The remaining 7 flags are created as components are enabled in phases 2-3.)
-
-- [ ] **Step 2: Write ROLLBACK.md**
-
-Step-by-step instructions (spec §7.6 + §10): git tag reset, archive restore via `test-rollback.mjs --execute`, flag reset, smoke. Frontmatter with the archive manifest (from→to map) for `test-rollback.mjs`.
-
-- [ ] **Step 3: Dry-run rollback test**
+- [ ] **Step 2: Rollback dry-run (after destruction)**

 Run: `node tools/test-rollback.mjs --dry-run`
-Expected: «all N archive files present, M flags would reset» with no errors.
+Expected: OK.
-- [ ] **Step 4: Commit**
+- [ ] **Step 3: Commit**

 ```bash
-git add docs/archive/llm-bootstrap-2026-05/ROLLBACK.md && git commit -m "feat(brain): ROLLBACK.md + flag scaffold + dry-run verified (phase 1 task 8)"
+git add -A && git commit -m "chore(brain): phase-1 flags + rollback re-verified (phase 1 task 7)"
 ```

-**Phase 1 exit:** §12 is gone, §17 is in Pravila, ADR-016 is created, nothing is enforced (gate off). Routing runs on the old regex Layer 1. Transition window.
+**Phase 1 exit:** rollback proven; §12 gone; §17 + ADR-016; economy preserved; C1/C2 adapted; nothing is enforced.

 ---

 ## Phase 2: Classifier + памятка (memo) + inheritance + §17 (~1-1.5 weeks)

-### Task 9: router-config.mjs + nodes.yaml capabilities
+### Task 8: router-config.mjs + nodes.yaml capabilities

 **Files:**

 - Create: `tools/router-config.mjs`
-- Modify: `docs/registry/nodes.yaml` (+capabilities per node)
+- Modify: `docs/registry/nodes.yaml`

 - [ ] **Step 1: Create router-config.mjs**

@@ -403,94 +473,20 @@ export const INHERITANCE_MAX_AGE_MIN = 30;
 export const REVIEWER_MAX_NEIGHBOR_EPISODES = 10;
 ```

-(replace YYYYMMDD with the real ID from Task 0).
+- [ ] **Step 2: Fill capabilities (delegate to subagent)**

-- [ ] **Step 2: Fill capabilities in nodes.yaml**
+Give each of the ~84 nodes a `capabilities:` field (1-2 sentences, no "when to choose"). schema.json is permissive (G12 verified).

-For each of the ~84 nodes, add a `capabilities:` field (1-2 sentences on "what it can do", no "when to choose"). Delegate to a subagent (mechanical task; full spec: the descriptions from the capability part of Tooling §4.X). Verify: every node has a non-empty capabilities field.
-
-Run: `node -e "const y=require('fs').readFileSync('docs/registry/nodes.yaml','utf8'); const nodes=(y.match(/^ - id:/gm)||[]).length; const caps=(y.match(/^ capabilities:/gm)||[]).length; console.log('nodes',nodes,'caps',caps); process.exit(nodes===caps?0:1)"`
-Expected: nodes === caps, exit 0.
+Run: `node -e "const y=require('fs').readFileSync('docs/registry/nodes.yaml','utf8'); const n=(y.match(/^ - id:/gm)||[]).length, c=(y.match(/^ capabilities:/gm)||[]).length; console.log('nodes',n,'caps',c); process.exit(n===c?0:1)"`
+Expected: n===c, exit 0.

 - [ ] **Step 3: Commit**

 ```bash
-git add tools/router-config.mjs docs/registry/nodes.yaml && git commit -m "feat(brain): router-config + nodes.yaml capabilities (phase 2 task 9)"
+git add tools/router-config.mjs docs/registry/nodes.yaml && git commit -m "feat(brain): router-config + nodes.yaml capabilities (phase 2 task 8)"
 ```

-### Task 10: Prefilter: 3 groups + manual override + anchor (TDD)
-
-**Files:**
-
-- Modify: `tools/router-classifier.mjs`
-- Test: `tools/router-classifier.test.mjs`
-
-- [ ] **Step 1: Write failing tests for prefilter**
-
-```javascript
-// add to tools/router-classifier.test.mjs
-import { prefilter } from './router-classifier.mjs';
-
-describe('prefilter', () => {
-  it('manual override takes priority over continuation', () => {
-    const r = prefilter('делай через TDD', { prevState: null });
-    expect(r.task_type).toBe('manual_override');
-    expect(r.requested_node).toContain('test-driven-development');
-  });
-  it('continuation inherits prev classification within 30 min', () => {
-    const prevState = { classification: { task_type: 'feature', recommendedNode: '#19' }, timestamp: new Date().toISOString() };
-    const r = prefilter('делай', { prevState });
-    expect(r.source).toBe('prefilter_inherited');
-    expect(r.task_type).toBe('feature');
-  });
-  it('continuation falls through when prev state older than 30 min', () => {
-    const old = new Date(Date.now() - 31 * 60000).toISOString();
-    const prevState = { classification: { task_type: 'feature' }, timestamp: old };
-    const r = prefilter('делай', { prevState });
-    expect(r.task_type).toBe('conversation');
-  });
-  it('acknowledgment is pure conversation', () => {
-    expect(prefilter('спасибо', {}).task_type).toBe('conversation');
-  });
-
  it('cancellation flags previous rejected', () => {
-    const prevState = { task_id: 'abc' };
-    const r = prefilter('нет', { prevState });
-    expect(r.previous_rejected).toBe(true);
-  });
-  it('anchor word saves "делай аудит" from conversation', () => {
-    const r = prefilter('делай аудит', {});
-    expect(r).toBeNull(); // falls through to LLM
-  });
-  it('micro keyword → micro direct', () => {
-    expect(prefilter('поправь typo в строке', {}).task_type).toBe('micro');
-  });
-  it('content prompt falls through to LLM', () => {
-    expect(prefilter('добавь endpoint для экспорта сделок', {})).toBeNull();
-  });
-});
-```
-
-- [ ] **Step 2: Run tests to verify they fail**
-
-Run: `npx vitest run tools/router-classifier.test.mjs -t prefilter`
-Expected: FAIL (`prefilter is not exported`).
-
-- [ ] **Step 3: Implement prefilter**
-
-Add to `router-classifier.mjs` (after the imports): `CONTINUATION_PATTERNS`, `ACKNOWLEDGMENT_PATTERNS`, `CANCELLATION_PATTERNS`, `MANUAL_OVERRIDE_RE`, `ANCHOR_NOUNS` (28), `ANCHOR_IMPERATIVES` (10), `MICRO_KEYWORDS` (from spec §4.1). The function `prefilter(prompt, { prevState })` runs checks 1-7 in order (manual override → continuation → acknowledgment → cancellation → short+anchor → micro → null). `requested_node` is resolved by fuzzy match (substring/Levenshtein ≤2) against nodes.yaml. Age: `(Date.now() - Date.parse(prevState.timestamp))/60000 <= 30`.
-
-- [ ] **Step 4: Run tests to verify they pass**
-
-Run: `npx vitest run tools/router-classifier.test.mjs -t prefilter`
-Expected: PASS (8 tests).
-
-- [ ] **Step 5: Commit**
-
-```bash
-git add tools/router-classifier.mjs tools/router-classifier.test.mjs && git commit -m "feat(router): prefilter 3 groups + manual override + anchor (phase 2 task 10)"
-```
-
-### Task 11: LLM-classifier Sonnet 4.6 + памятка (TDD)
+### Task 9: Prefilter: 3 groups + manual override + anchor (TDD)

 **Files:**

@@ -500,27 +496,74 @@ git add tools/router-classifier.mjs tools/router-classifier.test.mjs && git comm
 - [ ] **Step 1: Write failing tests**

 ```javascript
-describe('buildClassifierPrompt', () => {
-  it('includes 4 памятка patterns', () => {
-    const p = buildClassifierPrompt('добавь фичу', { nodes: [], chains: {} });
-    expect(p).toContain('ПАТТЕРН 1');
-    expect(p).toContain('ПАТТЕРН 4');
-    expect(p).toContain('no_skill_found');
+import { prefilter } from './router-classifier.mjs';
+describe('prefilter', () => {
+  it('manual override priority over continuation', () => {
+    const r = prefilter('делай через TDD', { prevState: null });
+    expect(r.task_type).toBe('manual_override');
+    expect(r.requested_node).toContain('test-driven-development');
   });
-  it('omits памятка when enrichment off', () => {
-    const p = buildClassifierPrompt('добавь фичу', { nodes: [], chains: {} }, { enrichment: false });
-    expect(p).not.toContain('ПАТТЕРН 1');
+  it('continuation inherits within 30 min', () => {
+    const prevState = { classification: { task_type: 'feature', recommendedNode: '#19' }, timestamp: new Date().toISOString() };
+    const r = prefilter('делай', { prevState });
+    expect(r.source).toBe('prefilter_inherited');
+    expect(r.task_type).toBe('feature');
   });
+  it('continuation falls through when >30 min', () => {
+    const old = new Date(Date.now() - 31 * 60000).toISOString();
+    expect(prefilter('делай', { prevState: { classification: { task_type: 'feature' }, timestamp: old } }).task_type).toBe('conversation');
+  });
+  it('acknowledgment is conversation', () => { expect(prefilter('спасибо', {}).task_type).toBe('conversation'); });
+
  it('cancellation flags previous rejected', () => { expect(prefilter('нет', { prevState: { task_id: 'abc' } }).previous_rejected).toBe(true); });
+  it('anchor saves "делай аудит" → null', () => { expect(prefilter('делай аудит', {})).toBeNull(); });
+  it('micro keyword → micro', () => { expect(prefilter('поправь typo в строке', {}).task_type).toBe('micro'); });
+  it('content prompt → null', () => { expect(prefilter('добавь endpoint для экспорта сделок', {})).toBeNull(); });
 });
+```

+- [ ] **Step 2: Run to verify fail**
+
+Run: `npx vitest run tools/router-classifier.test.mjs -t prefilter`
+Expected: FAIL.
+
+- [ ] **Step 3: Implement prefilter**
+
+`CONTINUATION_PATTERNS`, `ACKNOWLEDGMENT_PATTERNS`, `CANCELLATION_PATTERNS`, `MANUAL_OVERRIDE_RE`, `ANCHOR_NOUNS` (28), `ANCHOR_IMPERATIVES` (10), `MICRO_KEYWORDS` (spec §4.1). `prefilter(prompt, {prevState})` runs checks 1-7. `requested_node` is fuzzy-matched. Age is `<= INHERITANCE_MAX_AGE_MIN`.
+
+- [ ] **Step 4: Run to verify pass**
+
+Run: `npx vitest run tools/router-classifier.test.mjs -t prefilter`
+Expected: PASS (8 tests).
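
The inheritance age rule from the prefilter step can be sketched as follows. This is a minimal illustration under stated assumptions, not the plan's implementation: `canInherit` is a hypothetical helper name, the `prevState` shape is taken from the test fixtures in this task, and rejecting unparseable or future timestamps is an added assumption.

```javascript
// Value mirrors INHERITANCE_MAX_AGE_MIN in tools/router-config.mjs.
const INHERITANCE_MAX_AGE_MIN = 30;

// Hypothetical helper: true when a short continuation prompt may inherit
// the previous classification.
function canInherit(prevState, now = Date.now()) {
  if (!prevState || !prevState.classification || !prevState.timestamp) return false;
  const ageMin = (now - Date.parse(prevState.timestamp)) / 60000;
  // Assumption: unparseable or future timestamps do not qualify.
  return Number.isFinite(ageMin) && ageMin >= 0 && ageMin <= INHERITANCE_MAX_AGE_MIN;
}
```

Passing `now` explicitly keeps the check deterministic in tests, which avoids relying on the wall clock the way the 31-minute fixture does.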
+
+- [ ] **Step 5: Commit**
+
+```bash
+git add tools/router-classifier.mjs tools/router-classifier.test.mjs && git commit -m "feat(router): prefilter 3 groups + manual override + anchor (phase 2 task 9)"
+```
+
+### Task 10: Sonnet 4.6 classifier + памятка (memo) + fallback module + accuracy-runner (TDD, closes G11)
+
+**Files:**
+
+- Modify: `tools/router-classifier.mjs`, `tools/router-accuracy-runner.mjs`
+- Create: `tools/router-classifier-regex-fallback.mjs`
+- Test: `tools/router-classifier.test.mjs`
+
+- [ ] **Step 1: Write failing tests**
+
+```javascript
+describe('buildClassifierPrompt', () => {
+  it('includes 4 памятка when enrichment on', () => {
+    const p = buildClassifierPrompt('добавь фичу', { nodes: [], chains: {} }, { enrichment: true });
+    expect(p).toContain('ПАТТЕРН 1'); expect(p).toContain('ПАТТЕРН 4');
+  });
+  it('omits памятка when off', () => { expect(buildClassifierPrompt('x', { nodes: [], chains: {} }, { enrichment: false })).not.toContain('ПАТТЕРН 1'); });
+});
 describe('parseClassifierResponse', () => {
-  it('accepts null recommended_chain_id for custom chain', () => {
-    const r = parseClassifierResponse('{"task_type":"feature","recommended_node":"x","recommended_chain":["x"],"recommended_chain_id":null,"alternatives_considered":[],"no_skill_found":false}');
-    expect(r.recommended_chain_id).toBeNull();
-  });
-  it('returns null on malformed JSON', () => {
-    expect(parseClassifierResponse('not json')).toBeNull();
+  it('accepts null recommended_chain_id', () => {
+    expect(parseClassifierResponse('{"task_type":"feature","recommended_node":"x","recommended_chain":["x"],"recommended_chain_id":null,"alternatives_considered":[],"no_skill_found":false}').recommended_chain_id).toBeNull();
   });
+  it('returns null on malformed', () => { expect(parseClassifierResponse('nope')).toBeNull(); });
 });
 ```

@@ -529,73 +572,61 @@ describe('parseClassifierResponse', () => {
 Run: `npx vitest run tools/router-classifier.test.mjs -t
'buildClassifierPrompt|parseClassifierResponse'`
 Expected: FAIL.

-- [ ] **Step 3: Implement**
+- [ ] **Step 3: Extract regex to fallback module**

-`buildClassifierPrompt(prompt, registry, { enrichment=true })`: system prompt from spec §4.2 (with the памятка, gated by the enrichment flag). `parseClassifierResponse`: strip fences, JSON.parse, validate that `task_type` is a string, return null on failure. Rewrite `classify()`: call `prefilter` first → if null, escalate to Sonnet (`CLASSIFIER_MODEL` from router-config). Remove the `classifyByRegex` primary path, `shouldEscalate`, `HARD_KEYWORD_STEMS`. Fallback chain: Sonnet→Haiku→regex-fallback→degraded.
+Copy `classifyByRegex` + `TASK_TYPE_KEYWORDS` + helpers → `tools/router-classifier-regex-fallback.mjs` (export `classifyByRegex`).

-- [ ] **Step 4: Move old regex to fallback module**
+- [ ] **Step 4: Fix router-accuracy-runner.mjs import (G11)**

-```bash
-node -e "/* extract classifyByRegex + TASK_TYPE_KEYWORDS to tools/router-classifier-regex-fallback.mjs */"
-```
+`tools/router-accuracy-runner.mjs:11`: `from './router-classifier.mjs'` → `from './router-classifier-regex-fallback.mjs'`.

-(Copy the old regex logic into `tools/router-classifier-regex-fallback.mjs` with an export of `classifyByRegex`.)
+Run: `node tools/router-accuracy-runner.mjs 2>&1 | head -3`
+Expected: it runs.

-- [ ] **Step 5: Run to verify pass**
+
+- [ ] **Step 5: Implement classifier**
+
+`buildClassifierPrompt(prompt, registry, {enrichment=true})` (spec §4.2). `parseClassifierResponse`. `classify()`: prefilter → if null, escalate to Sonnet. Remove `shouldEscalate`, `HARD_KEYWORD_STEMS`; re-export `classifyByRegex` from the fallback module. Fallback chain: Sonnet→Haiku→regex→degraded.
+
+- [ ] **Step 6: Run to verify pass + commit**

 Run: `npx vitest run tools/router-classifier.test.mjs`
-Expected: PASS (all).
-
-- [ ] **Step 6: Commit**
+Expected: PASS.
 ```bash
-git add tools/router-classifier.mjs tools/router-classifier.test.mjs tools/router-classifier-regex-fallback.mjs && git commit -m "feat(router): Sonnet 4.6 classifier + памятка + fallback chain (phase 2 task 11)"
+git add tools/router-classifier.mjs tools/router-classifier-regex-fallback.mjs tools/router-accuracy-runner.mjs tools/router-classifier.test.mjs && git commit -m "feat(router): Sonnet classifier + памятка + fallback + accuracy-runner fix (phase 2 task 10)"
 ```

-### Task 12: Adapt missed-activations to nodes.yaml (TDD, closes A5)
+### Task 11: Adapt missed-activations to nodes.yaml + §17 (TDD, closes A5)

 **Files:**

 - Modify: `tools/missed-activations.mjs`
 - Test: `tools/missed-activations.test.mjs`

-- [ ] **Step 1: Write failing test (new source + §17 definition)**
+- [ ] **Step 1: Write failing test**

 ```javascript
 describe('detectMissedActivations §17', () => {
-  it('flags direct on non-conversation with non-null recommended_node', () => {
-    const eps = [{ schema_version: 4, classifier_output: { task_type: 'feature', recommended_node: '#19', no_skill_found: false }, execution_trace: { actual_node_invoked_first: 'direct' } }];
-    const r = detectMissedActivations(eps);
-    expect(r.totalMissed).toBe(1);
+  it('flags direct on non-conversation with recommended_node', () => {
+    expect(detectMissedActivations([{ schema_version: 4, classifier_output: { task_type: 'feature', recommended_node: '#19', no_skill_found: false }, execution_trace: { actual_node_invoked_first: 'direct' } }]).totalMissed).toBe(1);
   });
-  it('does not flag conversation/micro/manual_override', () => {
-    const eps = [{ schema_version: 4, classifier_output: { task_type: 'conversation', recommended_node: null } }];
-    expect(detectMissedActivations(eps).totalMissed).toBe(0);
+  it('does not flag conversation', () => {
+    expect(detectMissedActivations([{ schema_version: 4, classifier_output: { task_type: 'conversation', recommended_node: null } }]).totalMissed).toBe(0);
   });
 });
 ```

-- [ ]
**Step 2: Run to verify fail**
+- [ ] **Step 2-4: Fail → implement → pass**

-Run: `npx vitest run tools/missed-activations.test.mjs -t §17`
-Expected: FAIL.
-
-- [ ] **Step 3: Implement new signature**
-
-Rewrite `detectMissedActivations(episodes)` (drop the classificationMap/dormancy params): missed = `classifier_output.recommended_node !== null && actual_node === 'direct' && task_type ∉ {conversation,micro,manual_override}` with no escape hatch. Old v2/v3 episodes (without classifier_output) fall back to the legacy logic (read from nodes.yaml triggers instead of the archived map).
-
-- [ ] **Step 4: Run to verify pass**
-
-Run: `npx vitest run tools/missed-activations.test.mjs`
-Expected: PASS.
+`detectMissedActivations(episodes)` (drop classificationMap/dormancy). §17 definition. The v2/v3 legacy path reads nodes.yaml triggers. Run `npx vitest run tools/missed-activations.test.mjs`.

 - [ ] **Step 5: Commit**

 ```bash
-git add tools/missed-activations.mjs tools/missed-activations.test.mjs && git commit -m "feat(brain): missed-activations on nodes.yaml + §17 definition (phase 2 task 12)"
+git add tools/missed-activations.mjs tools/missed-activations.test.mjs && git commit -m "feat(brain): missed-activations on nodes.yaml + §17 (phase 2 task 11)"
 ```

-### Task 13: Embedding layer (TDD, closes 3.3)
+### Task 12: Embedding layer (TDD, closes 3.3)

 **Files:**

@@ -604,49 +635,37 @@ git add tools/missed-activations.mjs tools/missed-activations.test.mjs && git co

 - [ ] **Step 1: Install dep**

-```bash
-npm i @xenova/transformers
-```
+Run: `npm i @xenova/transformers`

 - [ ] **Step 2: Write failing test**

 ```javascript
 import { shouldEmbed, encodeBase64, decodeBase64 } from './router-embedding.mjs';
-describe('shouldEmbed', () => {
-  it('skips conversation/micro/manual_override', () => {
+describe('embedding', () => {
+  it('shouldEmbed skips conversation/micro/manual_override', () => {
     expect(shouldEmbed('conversation')).toBe(false);
     expect(shouldEmbed('micro')).toBe(false);
     expect(shouldEmbed('manual_override')).toBe(false);
     expect(shouldEmbed('feature')).toBe(true);
   });
-  it('base64 roundtrip preserves floats', () => {
+  it('base64 roundtrip', () => {
     const v = new Float32Array([0.1, -0.5, 0.9]);
     expect(Array.from(decodeBase64(encodeBase64(v)))).toEqual(Array.from(v));
   });
 });
 ```

-- [ ] **Step 3: Run to verify fail**
+- [ ] **Step 3-5: Fail → implement → pass**

-Run: `npx vitest run tools/router-embedding.test.mjs`
-Expected: FAIL.
-
-- [ ] **Step 4: Implement**
-
-`shouldEmbed(taskType)` → `!['conversation','micro','manual_override'].includes(taskType)`. `encodeBase64(float32)` → Buffer base64. `decodeBase64` → Float32Array. `embed(prompt)` → lazy-load Xenova pipeline, return Float32Array 384. Fallback: try/catch → null + log. `router-embedding-warmup.mjs`: load model on SessionStart.
-
-- [ ] **Step 5: Run to verify pass**
-
-Run: `npx vitest run tools/router-embedding.test.mjs`
-Expected: PASS.
+`shouldEmbed`, `encodeBase64`, `decodeBase64`, `embed(prompt)` (lazy Xenova, fallback null). `router-embedding-warmup.mjs`. Run `npx vitest run tools/router-embedding.test.mjs`.
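
The three pure helpers named in this task could look roughly like the sketch below. It follows the behaviour the tests describe and uses only the Node.js `Buffer` and typed-array APIs; the constant name `NON_EMBEDDED_TASK_TYPES` is hypothetical, and `embed(prompt)` is left out because it depends on the model pipeline.

```javascript
// Hypothetical constant name; the list itself comes from the tests in this task.
const NON_EMBEDDED_TASK_TYPES = ['conversation', 'micro', 'manual_override'];

function shouldEmbed(taskType) {
  return !NON_EMBEDDED_TASK_TYPES.includes(taskType);
}

// Serialise the raw bytes of a Float32Array as base64.
function encodeBase64(vector) {
  return Buffer.from(vector.buffer, vector.byteOffset, vector.byteLength).toString('base64');
}

// Restore a Float32Array from base64.
function decodeBase64(text) {
  const bytes = Buffer.from(text, 'base64');
  // Copy into a fresh ArrayBuffer: a Buffer may sit at an unaligned offset
  // inside Node's shared pool, which Float32Array cannot view directly.
  const copy = new Uint8Array(bytes.byteLength);
  copy.set(bytes);
  return new Float32Array(copy.buffer);
}
```

The byte layout uses the platform's native endianness, so this round-trips exactly on one machine; vectors shared across architectures would need an explicit byte order.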
 - [ ] **Step 6: Commit**

 ```bash
-git add tools/router-embedding.mjs tools/router-embedding-warmup.mjs tools/router-embedding.test.mjs package.json package-lock.json && git commit -m "feat(router): local embedding layer + warmup (phase 2 task 13)"
+git add tools/router-embedding.mjs tools/router-embedding-warmup.mjs tools/router-embedding.test.mjs package.json package-lock.json && git commit -m "feat(router): local embedding + warmup (phase 2 task 12)"
 ```

-### Task 14: §17 enforcement in router-tool-gate (TDD, closes D1)
+### Task 13: §17 enforcement (TDD, closes D1)

 **Files:**

@@ -658,51 +677,26 @@ git add tools/router-embedding.mjs tools/router-embedding-warmup.mjs tools/route
 ```javascript
 describe('§17 shouldBlock', () => {
   const base = { classification: { task_type: 'feature', no_skill_found: false }, skillInvokedThisTurn: false };
-  it('warn-only never blocks', () => {
-    expect(shouldBlock('Edit', base, '', { mode: 'warn-only' })).toBe(false);
-  });
-  it('enforce blocks direct on feature', () => {
-    expect(shouldBlock('Edit', base, '', { mode: 'enforce' })).toMatchObject({ block: true });
-  });
-  it('enforce passes conversation', () => {
-    const s = { ...base, classification: { task_type: 'conversation' } };
-    expect(shouldBlock('Edit', s, '', { mode: 'enforce' })).toBe(false);
-  });
-  it('enforce passes when skill invoked', () => {
-    expect(shouldBlock('Edit', { ...base, skillInvokedThisTurn: true }, '', { mode: 'enforce' })).toBe(false);
-  });
-  it('enforce blocks no_skill_found', () => {
-    const s = { ...base, classification: { task_type: 'feature', no_skill_found: true } };
-    expect(shouldBlock('Edit', s, '', { mode: 'enforce' })).toMatchObject({ reason: 'no_skill_found_block' });
-  });
-  it('continuation-inherited feature is NOT exempt (D1)', () => {
-    const s = { ...base, classification: { task_type: 'feature' } }; // inherited from continuation
-    expect(shouldBlock('Edit', s, '', { mode: 'enforce' })).toMatchObject({ block: true });
-  });
+
  it('warn-only never blocks', () => { expect(shouldBlock('Edit', base, '', { mode: 'warn-only' })).toBe(false); });
+  it('enforce blocks direct on feature', () => { expect(shouldBlock('Edit', base, '', { mode: 'enforce' })).toMatchObject({ block: true }); });
+  it('enforce passes conversation', () => { expect(shouldBlock('Edit', { ...base, classification: { task_type: 'conversation' } }, '', { mode: 'enforce' })).toBe(false); });
+  it('enforce passes when skill invoked', () => { expect(shouldBlock('Edit', { ...base, skillInvokedThisTurn: true }, '', { mode: 'enforce' })).toBe(false); });
+  it('enforce blocks no_skill_found', () => { expect(shouldBlock('Edit', { ...base, classification: { task_type: 'feature', no_skill_found: true } }, '', { mode: 'enforce' })).toMatchObject({ reason: 'no_skill_found_block' }); });
+  it('continuation-inherited feature NOT exempt (D1)', () => { expect(shouldBlock('Edit', { ...base, classification: { task_type: 'feature' } }, '', { mode: 'enforce' })).toMatchObject({ block: true }); });
 });
 ```

-- [ ] **Step 2: Run to verify fail**
+- [ ] **Step 2-4: Fail → implement → pass**

-Run: `npx vitest run tools/router-tool-gate.test.mjs -t §17`
-Expected: FAIL.
-
-- [ ] **Step 3: Implement spec §4.4 logic**
-
-`NON_BLOCKING_TASK_TYPES = ['conversation','micro','manual_override']` (continuation is NOT included: D1). Implement `shouldBlock` exactly per spec §4.4. The mode comes from `~/.claude/runtime/router-gate-mode.json` (hot-reload per call).
-
-- [ ] **Step 4: Run to verify pass**
-
-Run: `npx vitest run tools/router-tool-gate.test.mjs`
-Expected: PASS.
+`NON_BLOCKING_TASK_TYPES = ['conversation','micro','manual_override']` (continuation is NOT included: D1). `shouldBlock` per spec §4.4. Mode is hot-reloaded. Run `npx vitest run tools/router-tool-gate.test.mjs`.
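
One shape for `shouldBlock` that would satisfy the six tests in this task is sketched below. Spec §4.4 is not reproduced in this plan, so treat this as an assumption-laden illustration: the check order, the unused third parameter, and the `direct_without_skill` reason string are all guesses; only `no_skill_found_block` and the `NON_BLOCKING_TASK_TYPES` list come from the text.

```javascript
const NON_BLOCKING_TASK_TYPES = ['conversation', 'micro', 'manual_override'];

// Sketch only: returns false to allow the tool call, or an object describing the block.
function shouldBlock(toolName, state, _toolInput, { mode }) {
  // warn-only (and any other non-enforce mode) never blocks.
  if (mode !== 'enforce') return false;
  const { classification, skillInvokedThisTurn } = state;
  // Inherited continuations keep their real task_type, so they are not exempt (D1).
  if (NON_BLOCKING_TASK_TYPES.includes(classification.task_type)) return false;
  if (skillInvokedThisTurn) return false;
  if (classification.no_skill_found) {
    return { block: true, reason: 'no_skill_found_block' };
  }
  // Hypothetical reason string for a direct tool call without a skill.
  return { block: true, reason: 'direct_without_skill' };
}
```

Because continuation is not a task type in its own right here, an inherited `feature` goes through the same path as a fresh one, which is what the D1 test asserts.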
 - [ ] **Step 5: Commit**

 ```bash
-git add tools/router-tool-gate.mjs tools/router-tool-gate.test.mjs && git commit -m "feat(router): §17 enforcement logic, continuation not exempt (phase 2 task 14)"
+git add tools/router-tool-gate.mjs tools/router-tool-gate.test.mjs && git commit -m "feat(router): §17 enforcement, continuation not exempt (phase 2 task 13)"
 ```

-### Task 15: router-prehook inheritance + cost tracking (TDD, closes B1/B5)
+### Task 14: prehook inheritance + cost (TDD, closes B1/B5)

 **Files:**

@@ -713,65 +707,47 @@ git add tools/router-tool-gate.mjs tools/router-tool-gate.test.mjs && git commit

 ```javascript
 describe('inheritance state', () => {
-  it('writes inheritance block when continuation', () => {
-    const state = buildStateFromClassification(
-      { task_type: 'feature', source: 'prefilter_inherited' },
-      { sessionId: 's', promptHash: 'h', inheritedFrom: 'prev-task', ageMin: 5 }
-    );
-    expect(state.inheritance.inherited_from_task_id).toBe('prev-task');
-    expect(state.inheritance.inheritance_age_minutes).toBe(5);
-  });
-  it('no ENFORCEMENT_TYPES whitelist anymore', async () => {
-    const mod = await import('./router-prehook.mjs');
-    expect(mod.ENFORCEMENT_TYPES).toBeUndefined();
+  it('writes inheritance block on continuation', () => {
+    const s = buildStateFromClassification({ task_type: 'feature', source: 'prefilter_inherited' }, { sessionId: 's', promptHash: 'h', inheritedFrom: 'prev', ageMin: 5 });
+    expect(s.inheritance.inherited_from_task_id).toBe('prev');
+    expect(s.inheritance.inheritance_age_minutes).toBe(5);
   });
+  it('no ENFORCEMENT_TYPES', async () => { expect((await import('./router-prehook.mjs')).ENFORCEMENT_TYPES).toBeUndefined(); });
 });
 ```

-- [ ] **Step 2: Run to verify fail**
+- [ ] **Step 2-4: Fail → implement → pass**

-Run: `npx vitest run tools/router-prehook.test.mjs -t inheritance`
-Expected: FAIL.
-
-- [ ] **Step 3: Implement**
-
-Remove `ENFORCEMENT_TYPES` + `isEnforcementRequired`.
`buildStateFromClassification` +`inheritance` block. `main()`: call `prefilter` first (pass the prevState read from the current state file BEFORE it is overwritten); on continuation, inherit and record the inheritance block. Cost tracking: after classify, write `task_cost.classifier_*` to state.
-
-- [ ] **Step 4: Run to verify pass**
-
-Run: `npx vitest run tools/router-prehook.test.mjs`
-Expected: PASS.
+Remove `ENFORCEMENT_TYPES` + `isEnforcementRequired`. `buildStateFromClassification` +inheritance. `main()`: prefilter with prevState (read before the overwrite); continuation → inherit. Cost → state. Run `npx vitest run tools/router-prehook.test.mjs`.

 - [ ] **Step 5: Commit**

 ```bash
-git add tools/router-prehook.mjs tools/router-prehook.test.mjs && git commit -m "feat(router): prehook inheritance + cost tracking, drop ENFORCEMENT_TYPES (phase 2 task 15)"
+git add tools/router-prehook.mjs tools/router-prehook.test.mjs && git commit -m "feat(router): prehook inheritance + cost, drop ENFORCEMENT_TYPES (phase 2 task 14)"
 ```

-### Task 16: Parser v4.0 + adapters + SessionStart hook + flags flip
+### Task 15: Parser v4.0 fwd-compat + warmup hook + phase-2 flags (closes G5)

 **Files:**

-- Modify: `tools/observer-transcript-parser.mjs`, `tools/l1-watcher.mjs`, `tools/cross-ref-checker.mjs`, `tools/registry-load.mjs`, `tools/registry-render.mjs`, `.claude/settings.json`
+- Modify: `tools/observer-transcript-parser.mjs`, `tools/registry-load.mjs`, `tools/registry-render.mjs`, `.claude/settings.json`

-- [ ] **Step 1: Parser handles v4.0**
+- [ ] **Step 1: Parser v4.0 + forward-compat (G5)**

-`observer-transcript-parser.mjs`: detect `schema_version`; when writing new episodes, write `classifier_output` + `environment.classifier_model` + `degraded_mode`. Read: handle v1/v2/v3/v4.
+Write new episodes with `classifier_output` + `environment.classifier_model` + `degraded_mode`. Read: handle v1/v2/v3/v4 + **graceful skip of unknown future versions** (for rollback, G5).
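
The forward-compat read path (G5) amounts to partitioning episodes by whether the parser knows their schema version. The sketch below is illustrative only: `splitBySchemaSupport` is a hypothetical name, and treating an episode without `schema_version` as v1 is an assumption the plan does not state.

```javascript
// Versions this parser build understands (v1-v4 per the step above).
const KNOWN_SCHEMA_VERSIONS = new Set([1, 2, 3, 4]);

// Hypothetical helper: keep readable episodes, set aside unknown future versions
// instead of throwing, so a rolled-back parser survives newer episode files.
function splitBySchemaSupport(episodes) {
  const readable = [];
  const skipped = [];
  for (const episode of episodes) {
    // Assumption: a missing schema_version means a v1 episode.
    const version = episode.schema_version ?? 1;
    if (KNOWN_SCHEMA_VERSIONS.has(version)) readable.push(episode);
    else skipped.push(episode);
  }
  return { readable, skipped };
}
```

Returning the skipped episodes rather than dropping them silently lets the caller log how many were ignored after a rollback.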
-- [ ] **Step 2: Adapt C1/C2**
+- [ ] **Step 2: Adapt registry-load/render for capabilities**

-`l1-watcher.mjs` source Tooling §3.3 → nodes.yaml. `cross-ref-checker.mjs` cross-refs list (§12→§17). `registry-load/render`: handle new capabilities field.
+- [ ] **Step 3: Register SessionStart warmup**

-- [ ] **Step 3: Register SessionStart warmup hook**
+`.claude/settings.json` +SessionStart → `node tools/router-embedding-warmup.mjs`.

-`.claude/settings.json` +SessionStart hook → `node tools/router-embedding-warmup.mjs`.
+- [ ] **Step 4: Integration smoke**

-- [ ] **Step 4: Integration test + smoke**
-
-Run: `npx vitest run tools/ && npx -y markdownlint-cli2 docs/**/*.md`
+Run: `npx vitest run tools/ && npx -y markdownlint-cli2 "docs/**/*.md"`
 Expected: PASS / 0 errors.

-- [ ] **Step 5: Create + flip phase-2 flags**
+- [ ] **Step 5: Flip phase-2 flags**

 ```bash
 node -e "const fs=require('fs'),os=require('os'),p=require('path'); const d=p.join(os.homedir(),'.claude','runtime'); const f={'router-classifier-mode':'llm-first','prompt-enrichment-mode':'on','inheritance-mode':'on','embedding-mode':'on'}; for(const[k,v]of Object.entries(f)) fs.writeFileSync(p.join(d,k+'.json'),JSON.stringify({mode:v})); console.log('phase-2 flags on')"
@@ -780,16 +756,16 @@ node -e "const fs=require('fs'),os=require('os'),p=require('path'); const d=p.jo
 - [ ] **Step 6: Commit**

 ```bash
-git add tools/ .claude/settings.json && git commit -m "feat(router): parser v4.0 + C1/C2 adapters + warmup hook + phase-2 flags (phase 2 task 16)"
+git add tools/ .claude/settings.json && git commit -m "feat(router): parser v4.0 fwd-compat + warmup + phase-2 flags (phase 2 task 15)"
 ```

-**Phase 2 exit:** the new router works, v4.0 episodes + inheritance, §17 warn-only.
+**Phase 2 exit:** the new router works, v4.0 episodes, §17 warn-only, accuracy-runner fixed, registry-map neutralized.
 ---

 ## Phase 3: Evidence loop (~1.5-2 weeks)

-### Task 17: Stop-hook execution_trace + chain_gaps + timeout (TDD)
+### Task 16: Stop-hook execution_trace + chain_gaps + inheritance copy + timeout (TDD)

 **Files:**

@@ -800,37 +776,28 @@ git add tools/ .claude/settings.json && git commit -m "feat(router): parser v4.0

 ```javascript
 describe('execution_trace v4.1', () => {
-  it('builds chain_gaps when chain incomplete', () => {
-    const trace = buildExecutionTrace({ recommended_chain: ['a','b','c'], invoked: ['a'] });
-    expect(trace.chain_gaps[0].executed_steps).toBe(1);
-    expect(trace.chain_gaps[0].expected_steps).toBe(3);
+  it('builds chain_gaps when incomplete', () => {
+    const t = buildExecutionTrace({ recommended_chain: ['a','b','c'], invoked: ['a'] });
+    expect(t.chain_gaps[0].executed_steps).toBe(1);
+    expect(t.chain_gaps[0].expected_steps).toBe(3);
   });
-  it('copies inheritance from state to episode', () => {
-    const ep = buildEpisode({ state: { inheritance: { inherited_from_task_id: 'x' } } });
-    expect(ep.inheritance.inherited_from_task_id).toBe('x');
+  it('copies inheritance from state (B5)', () => {
+    expect(buildEpisode({ state: { inheritance: { inherited_from_task_id: 'x' } } }).inheritance.inherited_from_task_id).toBe('x');
   });
 });
 ```

-- [ ] **Step 2: Run to verify fail**
+- [ ] **Step 2-4: Fail → implement → pass**

-Run: `npx vitest run tools/observer-stop-hook.test.mjs -t execution_trace`
-Expected: FAIL.
+`buildExecutionTrace` + `chain_gaps`. `buildEpisode` copies `inheritance.*` (B5). schema_minor→1. `.claude/settings.json` Stop timeout 5s→15s. Run `npx vitest run tools/observer-stop-hook.test.mjs`.

-- [ ] **Step 3: Implement**
-
-`buildExecutionTrace` + `chain_gaps` computation. `buildEpisode` copies `inheritance.*` from state (closes B5). Bump schema_minor→1. `.claude/settings.json` Stop-hook timeout 5s→15s.
-
-- [ ] **Step 4: Run + commit**
-
-Run: `npx vitest run tools/observer-stop-hook.test.mjs`
-Expected: PASS.
+- [ ] **Step 5: Commit**

 ```bash
-git add tools/observer-stop-hook.mjs tools/observer-stop-hook.test.mjs .claude/settings.json && git commit -m "feat(observer): execution_trace + chain_gaps + inheritance copy, timeout 15s (phase 3 task 17)"
+git add tools/observer-stop-hook.mjs tools/observer-stop-hook.test.mjs .claude/settings.json && git commit -m "feat(observer): execution_trace + chain_gaps + inheritance copy, timeout 15s (phase 3 task 16)"
 ```

-### Task 18: Self-assessment in Stop-hook (TDD)
+### Task 17: Self-assessment in Stop-hook (TDD)

 **Files:**

@@ -841,100 +808,87 @@ git add tools/observer-stop-hook.mjs tools/observer-stop-hook.test.mjs .claude/s

 ```javascript
 describe('self_assessment', () => {
-  it('marks pending when API call skipped/failed', () => {
-    const sa = buildSelfAssessment({ apiResult: null });
-    expect(sa.self_assessment_pending).toBe(true);
-  });
-  it('parses valid assessment JSON', () => {
+  it('marks pending when API skipped', () => { expect(buildSelfAssessment({ apiResult: null }).self_assessment_pending).toBe(true); });
+  it('parses valid', () => {
     const sa = buildSelfAssessment({ apiResult: '{"summary":"x","confidence_in_choice":0.8,"what_could_be_better":null,"lesson_learned":null}' });
-    expect(sa.confidence_in_choice).toBe(0.8);
-    expect(sa.self_assessment_pending).toBe(false);
+    expect(sa.confidence_in_choice).toBe(0.8); expect(sa.self_assessment_pending).toBe(false);
   });
 });
 ```

 - [ ] **Step 2-4: Fail → implement → pass**

-`buildSelfAssessment`: parse the API result or mark it pending. The API call goes to `REVIEWER_MODEL` (Opus) with the prompt from spec §4.5, wrapped in a timeout budget (skip if Stop is near the 15s limit). Cost → `task_cost.self_assessment_*`. Bump schema_minor→2. Run `npx vitest run tools/observer-stop-hook.test.mjs`.
+`buildSelfAssessment`. API call to `REVIEWER_MODEL` (Opus) per spec §4.5, inside the timeout budget. Cost → `task_cost.self_assessment_*`. schema_minor→2. Run `npx vitest run tools/observer-stop-hook.test.mjs`.
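
The parse-or-pending behaviour the two `self_assessment` tests describe can be sketched like this. It covers only the pure parsing part; the API call, the timeout budget and cost tracking are out of scope, and treating a non-object JSON payload as pending is an assumption rather than something the plan specifies.

```javascript
// Sketch: turn the raw API result into a self_assessment block.
function buildSelfAssessment({ apiResult }) {
  // A skipped or failed call arrives as null: mark the episode for a later retry.
  if (typeof apiResult !== 'string') return { self_assessment_pending: true };
  try {
    const parsed = JSON.parse(apiResult);
    // Assumption: anything other than a plain JSON object counts as unusable.
    if (parsed === null || typeof parsed !== 'object' || Array.isArray(parsed)) {
      return { self_assessment_pending: true };
    }
    return { ...parsed, self_assessment_pending: false };
  } catch {
    // Malformed JSON is treated the same as a missing result.
    return { self_assessment_pending: true };
  }
}
```

Keeping this function free of I/O means the Stop-hook can decide separately whether to call the API at all, which fits the timeout-budget requirement.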
 - [ ] **Step 5: Flip flag + commit**

 ```bash
 node -e "require('fs').writeFileSync(require('path').join(require('os').homedir(),'.claude','runtime','self-assessment-mode.json'),JSON.stringify({mode:'on'}))"
-git add tools/observer-stop-hook.mjs tools/observer-stop-hook.test.mjs && git commit -m "feat(observer): self_assessment in Stop-hook with retroactive fallback (phase 3 task 18)"
+git add tools/observer-stop-hook.mjs tools/observer-stop-hook.test.mjs && git commit -m "feat(observer): self_assessment + retroactive fallback (phase 3 task 17)"
 ```

-### Task 19: Reviewer subagent integration + fallback (closes 1.2/2.2/B2)
+### Task 18: CREATE brain-retro-opus-reviewer.mjs + verify reviewer-agent (TDD, closes G16)

 **Files:**

-- Modify: `.claude/agents/reviewer-agent.md` (verify), `tools/brain-retro-opus-reviewer.mjs` (keep as fallback)
-- Test: `tools/brain-retro-opus-reviewer.test.mjs`
+- Create: `tools/brain-retro-opus-reviewer.mjs` (**does not exist: G16**), `tools/brain-retro-opus-reviewer.test.mjs`
+- Modify: `.claude/agents/reviewer-agent.md` (verify)

-- [ ] **Step 1: Verify reviewer-agent.md (already created)**
+- [ ] **Step 1: Verify reviewer-agent.md (created in commit 49aa4ba7)**

-Read `.claude/agents/reviewer-agent.md`: verify frontmatter (tools, model=opus), 8-dimension output, adaptive v2/v3/v4. Update system prompt if review checklist changed. NOT create from scratch (closes A3).
+Read `.claude/agents/reviewer-agent.md`: verify frontmatter (tools, model=opus), 8-dim, adaptive v2/v3/v4. Update the system prompt if the checklist has changed. Do NOT create it (A3).
-- [ ] **Step 2: Write failing test for fallback handler**
+- [ ] **Step 2: Write failing test**

 ```javascript
 import { buildReviewPrompt, parseReview } from './brain-retro-opus-reviewer.mjs';
 describe('reviewer fallback handler', () => {
-  it('adaptive prompt for v3 omits alternatives', () => {
-    expect(buildReviewPrompt({ schema_version: 3 })).not.toContain('alternatives_considered');
-  });
-  it('parses 8-dimension review', () => {
-    const r = parseReview('{"node_quality":"correct","chain_quality":"n/a","gap_assessment":"n/a","agent_self_assessment_accuracy":"accurate","error_root_cause":"n/a","alternative_better":null,"outcome_reviewed":"success","reasoning":"x"}');
-    expect(r.node_quality).toBe('correct');
+  it('v3 omits alternatives', () => { expect(buildReviewPrompt({ schema_version: 3 })).not.toContain('alternatives_considered'); });
+  it('parses 8-dim review', () => {
+    expect(parseReview('{"node_quality":"correct","chain_quality":"n/a","gap_assessment":"n/a","agent_self_assessment_accuracy":"accurate","error_root_cause":"n/a","alternative_better":null,"outcome_reviewed":"success","reasoning":"x"}').node_quality).toBe('correct');
   });
 });
 ```

 - [ ] **Step 3-4: Fail → implement → pass**

-`brain-retro-opus-reviewer.mjs`: keep/refactor as direct-API fallback handler: `buildReviewPrompt(episode)` adaptive by schema, `parseReview`, `reviewViaDirectApi(episode)` calls Opus. Run `npx vitest run tools/brain-retro-opus-reviewer.test.mjs`.
+**CREATE** `tools/brain-retro-opus-reviewer.mjs` (G16: the file does not exist): `buildReviewPrompt(episode)` adaptive, `parseReview`, `reviewViaDirectApi(episode)` calling Opus. This is the direct-API fallback for the reviewer subagent (spec §4.6). Run `npx vitest run tools/brain-retro-opus-reviewer.test.mjs`.
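
The two pure helpers from Steps 3-4 might be sketched as follows. The eight review keys are copied from the test fixture; the prompt wording, the "v4 and later mention `alternatives_considered`" rule, and the strictness of `parseReview` are assumptions made for illustration, and `reviewViaDirectApi` is omitted because it needs the real API client.

```javascript
// Sketch only: key names come from the test fixture; prompt text is hypothetical.
const REVIEW_FIELDS = [
  'node_quality', 'chain_quality', 'gap_assessment', 'agent_self_assessment_accuracy',
  'error_root_cause', 'alternative_better', 'outcome_reviewed', 'reasoning',
];

function buildReviewPrompt(episode) {
  const lines = [
    'Review this routing episode and answer with a single JSON object.',
    `Required keys: ${REVIEW_FIELDS.join(', ')}.`,
  ];
  // Assumption: only schema v4+ episodes carry alternatives, so older prompts never mention them.
  if ((episode.schema_version ?? 0) >= 4) {
    lines.push('Take alternatives_considered into account when judging node_quality.');
  }
  lines.push('Episode:', JSON.stringify(episode));
  return lines.join('\n');
}

function parseReview(raw) {
  const parsed = JSON.parse(raw); // throws on malformed JSON; the caller decides how to recover
  if (parsed === null || typeof parsed !== 'object' || Array.isArray(parsed)) {
    throw new Error('review must be a JSON object');
  }
  const missing = REVIEW_FIELDS.filter((key) => !(key in parsed));
  if (missing.length > 0) throw new Error(`review is missing: ${missing.join(', ')}`);
  // Keep only the known dimensions so stray keys never reach the episode record.
  return Object.fromEntries(REVIEW_FIELDS.map((key) => [key, parsed[key]]));
}
```

Keeping both helpers free of I/O means the fallback path can be unit-tested without network access, while the Opus call stays isolated in `reviewViaDirectApi`.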
 - [ ] **Step 5: Commit**

 ```bash
-git add .claude/agents/reviewer-agent.md tools/brain-retro-opus-reviewer.mjs tools/brain-retro-opus-reviewer.test.mjs && git commit -m "feat(brain): reviewer subagent verify + direct-API fallback handler (phase 3 task 19)"
+git add tools/brain-retro-opus-reviewer.mjs tools/brain-retro-opus-reviewer.test.mjs .claude/agents/reviewer-agent.md && git commit -m "feat(brain): CREATE reviewer fallback handler + verify subagent (phase 3 task 18)"
 ```

-### Task 20: Sanity-generator + /brain-retro skill + self-retrospect skill
+### Task 19: Sanity-generator + brain-retro v2 + self-retrospect skill

 **Files:**

 - Create: `tools/brain-retro-sanity-generator.mjs`, `.claude/skills/self-retrospect/SKILL.md`, `docs/observer/.self-retrospect-counter.json`, `docs/observer/sanity-checks/.gitkeep`
-- Modify: `.claude/skills/brain-retro/SKILL.md`, `.claude/skills/brain-retro/references/aggregation-template.md`
+- Modify: `.claude/skills/brain-retro/SKILL.md`, `references/aggregation-template.md`
 - Test: `tools/brain-retro-sanity-generator.test.mjs`

-- [ ] **Step 1: Write failing test for sanity-generator**
+- [ ] **Step 1: Write failing test**

 ```javascript
 import { generateCandidateQuestions } from './brain-retro-sanity-generator.mjs';
 describe('sanity questions', () => {
-  it('generates bugfix question when >10 bugfix episodes', () => {
-    const eps = Array(11).fill({ classifier_output: { task_type: 'bugfix' } });
-    const qs = generateCandidateQuestions(eps);
-    expect(qs.some(q => /баг/i.test(q))).toBe(true);
-  });
-  it('returns max 5 candidates', () => {
-    expect(generateCandidateQuestions([]).length).toBeLessThanOrEqual(5);
-  });
+  it('bugfix question when >10 bugfix', () => { expect(generateCandidateQuestions(Array(11).fill({ classifier_output: { task_type: 'bugfix' } })).some(q => /баг/i.test(q))).toBe(true); });
+  it('max 5', () => { expect(generateCandidateQuestions([]).length).toBeLessThanOrEqual(5); });
 });
 ```

 - [ ] **Step 2-4: Fail → implement → pass**

-`generateCandidateQuestions(episodes)`: 5 candidate questions from spec §4.7. Run `npx vitest run tools/brain-retro-sanity-generator.test.mjs`.
+`generateCandidateQuestions(episodes)` (spec §4.7). Run `npx vitest run tools/brain-retro-sanity-generator.test.mjs`.

 - [ ] **Step 5: Update brain-retro SKILL.md**

-Description → "once every 1-2 weeks" (closes 3.1). New procedure steps 5/6/9/11 (sanity-check + PII filter + reviewer subagent + opt-in self-retrospect + cost report) from spec §4.7. Update `references/aggregation-template.md`.
+Description → "once every 1-2 weeks" (3.1). New steps 5/6/9/11 (sanity + PII filter + reviewer Task() + opt-in self-retrospect + cost report) per spec §4.7. Update `references/aggregation-template.md`.

 - [ ] **Step 6: Create self-retrospect skill**

-`.claude/skills/self-retrospect/SKILL.md`: opt-in, pattern aggregation (classifier misses / confidence calibration / cognitive errors), output `docs/observer/self-retrospect/YYYY-MM-DD.md`, lesson_learned → memory. Per spec §4.8.
+`.claude/skills/self-retrospect/SKILL.md` (spec §4.8).
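
For the `generateCandidateQuestions(episodes)` helper from Steps 2-4 of this task, a rule-table sketch is shown here. Only the two behaviours pinned by the tests are grounded (a question matching `/баг/i` once more than 10 bugfix episodes exist, and at most 5 candidates); the rule table, its single entry, and the question wording are hypothetical, since the actual five questions live in spec §4.7.

```javascript
// Sketch only: the rule table and question text are hypothetical; spec §4.7 defines the real ones.
const MAX_CANDIDATES = 5;

const RULES = [
  {
    applies: (counts) => (counts.bugfix ?? 0) > 10,
    // Russian on purpose: the test matches /баг/i. Roughly: "Which bugs recurred most often this period?"
    question: 'Какие баги повторялись чаще всего за период?',
  },
];

function generateCandidateQuestions(episodes) {
  // Count episodes per classifier task type, skipping episodes without classifier output.
  const counts = {};
  for (const episode of episodes) {
    const type = episode?.classifier_output?.task_type;
    if (type) counts[type] = (counts[type] ?? 0) + 1;
  }
  return RULES
    .filter((rule) => rule.applies(counts))
    .map((rule) => rule.question)
    .slice(0, MAX_CANDIDATES);
}
```

A table of `{ applies, question }` pairs keeps each sanity question independently testable, and the final `slice` enforces the cap no matter how many rules fire.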
 - [ ] **Step 7: Init counter + sanity dir**

@@ -946,37 +900,30 @@ node -e "require('fs').mkdirSync('docs/observer/sanity-checks',{recursive:true})
 - [ ] **Step 8: Commit**

 ```bash
-git add tools/brain-retro-sanity-generator.* .claude/skills/ docs/observer/.self-retrospect-counter.json docs/observer/sanity-checks/.gitkeep && git commit -m "feat(brain): sanity-generator + brain-retro v2 procedure + self-retrospect skill (phase 3 task 20)"
+git add tools/brain-retro-sanity-generator.* .claude/skills/ docs/observer/.self-retrospect-counter.json docs/observer/sanity-checks/.gitkeep && git commit -m "feat(brain): sanity-generator + brain-retro v2 + self-retrospect skill (phase 3 task 19)"
 ```

-### Task 21: brain-retro-analyzer v4 + status-md-generator + final flags
+### Task 20: analyzer v4 + status-md + schema v4.3 + final flags

 **Files:**

-- Modify: `tools/brain-retro-analyzer.mjs`, `tools/status-md-generator.mjs`
+- Modify: `tools/brain-retro-analyzer.mjs`, `tools/status-md-generator.mjs`, `tools/observer-stop-hook.mjs`
 - Test: `tools/brain-retro-analyzer.test.mjs`

-- [ ] **Step 1: Write failing test for v4 factors**
+- [ ] **Step 1: Write failing test**

 ```javascript
-describe('analyzer v4 factors', () => {
-  it('aggregates inheritance episodes', () => {
-    const eps = [{ schema_version: 4, inheritance: { inherited_from_task_id: 'x' } }];
-    const r = analyze(eps);
-    expect(r.inheritanceCount).toBe(1);
-  });
-  it('aggregates review.node_quality distribution', () => {
-    const eps = [{ schema_version: 4, review: { node_quality: 'correct' } }];
-    expect(analyze(eps).reviewQuality.correct).toBe(1);
-  });
+describe('analyzer v4', () => {
+  it('aggregates inheritance', () => { expect(analyze([{ schema_version: 4, inheritance: { inherited_from_task_id: 'x' } }]).inheritanceCount).toBe(1); });
+  it('aggregates review.node_quality', () => { expect(analyze([{ schema_version: 4, review: { node_quality: 'correct' } }]).reviewQuality.correct).toBe(1); });
 });
 ```

 - [ ] **Step 2-4: Fail → implement → pass**

-`brain-retro-analyzer.mjs` +inheritance count, +review distribution, +cost aggregation. `status-md-generator.mjs` +Cost monitoring, +Classifier anomaly, +Self-retrospect last, +Sanity coverage, +Reviewer subagent/fallback ratio (spec §9.4). Run `npx vitest run tools/brain-retro-analyzer.test.mjs`.
+`analyzer` +inheritance +review distribution +cost. `status-md-generator` +Cost/anomaly/self-retrospect/reviewer-ratio (spec §9.4). Run `npx vitest run tools/brain-retro-analyzer.test.mjs`.

-- [ ] **Step 5: Bump schema v4.3 + create cost-daily**
+- [ ] **Step 5: Schema v4.3 + cost-daily**

 Parser writes `prompt_embedding_base64` + `outcome_reviewed` + extended task_cost (schema_minor→3). Stop-hook updates `~/.claude/runtime/cost-daily.json`.

@@ -986,62 +933,59 @@ Parser writes `prompt_embedding_base64` + `outcome_reviewed` + extended task_cos
 node -e "const fs=require('fs'),os=require('os'),p=require('path'); const d=p.join(os.homedir(),'.claude','runtime'); const f={'reviewer-mode':'subagent','self-retrospect-mode':'on','sanity-check-mode':'mandatory'}; for(const[k,v]of Object.entries(f)) fs.writeFileSync(p.join(d,k+'.json'),JSON.stringify({mode:v})); console.log('phase-3 flags on')"
 ```

-- [ ] **Step 7: Full smoke regression**
+- [ ] **Step 7: Full smoke**

 Run: `npx vitest run tools/ && npx -y markdownlint-cli2 "docs/**/*.md" && ./bin/gitleaks.exe protect --staged`
-Expected: all PASS, 0 leaks.
+Expected: PASS, 0 leaks.
 - [ ] **Step 8: Commit**

 ```bash
-git add tools/ docs/observer/ && git commit -m "feat(brain): analyzer v4 + status-md cost sections + schema v4.3 + phase-3 flags (phase 3 task 21)"
+git add tools/ docs/observer/ && git commit -m "feat(brain): analyzer v4 + status-md + schema v4.3 + phase-3 flags (phase 3 task 20)"
 ```

-### Task 22: Phase 3 verification + rollback dry-run
+### Task 21: Phase 3 verification + rollback re-proof

 **Files:** none (verification)

-- [ ] **Step 1: Verify all 9 flags present**
-
-```bash
-node -e "const fs=require('fs'),os=require('os'),p=require('path'); const d=p.join(os.homedir(),'.claude','runtime'); ['router-classifier-mode','router-gate-mode','self-assessment-mode','reviewer-mode','self-retrospect-mode','embedding-mode','sanity-check-mode','prompt-enrichment-mode','inheritance-mode'].forEach(f=>{const e=fs.existsSync(p.join(d,f+'.json')); console.log(f, e?'OK':'MISSING')})"
-```
+- [ ] **Step 1: Verify all 9 flags**

+Run: `node -e "const fs=require('fs'),os=require('os'),p=require('path'); const d=p.join(os.homedir(),'.claude','runtime'); ['router-classifier-mode','router-gate-mode','self-assessment-mode','reviewer-mode','self-retrospect-mode','embedding-mode','sanity-check-mode','prompt-enrichment-mode','inheritance-mode'].forEach(f=>console.log(f, fs.existsSync(p.join(d,f+'.json'))?'OK':'MISSING'))"`
 Expected: all 9 OK.

-- [ ] **Step 2: Rollback dry-run**
+- [ ] **Step 2: Rollback dry-run (after all phases)**

 Run: `node tools/test-rollback.mjs --dry-run`
-Expected: "all archive files present, 9 flags would reset".
+Expected: OK.

-- [ ] **Step 3: Run brain-retro end-to-end smoke (manual)**
+- [ ] **Step 3: brain-retro e2e smoke (manual)**

-Invoke `/brain-retro` for a tiny period: verify sanity-check questions appear, reviewer subagent spawns (or fallback), cost report renders. Document result.
+Invoke `/brain-retro` for a tiny period: verify sanity questions, reviewer subagent spawn (or fallback), cost report. Document.
-- [ ] **Step 4: Final regression + DoD check**
+- [ ] **Step 4: Final regression + DoD**

-Run: `npx vitest run tools/` (all GREEN) + verify spec §16 DoD checklist items. Document each ✅.
+Run: `npx vitest run tools/` (GREEN) + verify spec §16 DoD. Document each ✅.

 - [ ] **Step 5: Finishing the branch**

-Use `superpowers:finishing-a-development-branch` for the merge/PR decision.
+Use `superpowers:finishing-a-development-branch`.

-**Phase 3 exit:** full evidence pipeline. Episode v4.3 + inheritance + memo + reviewer subagent + self-retrospect. Bootstrap begins (passive period).
+**Phase 3 exit:** full evidence pipeline, episode v4.3, reviewer subagent, self-retrospect, rollback proven again. Bootstrap begins.

 ---

-## Self-review (performed while writing the plan)
+## Self-review (performed while writing v1.1)

-**1. Spec coverage:** all 8 layers of spec §4 → tasks (Layer 1 Task 10, Layer 2 Task 11, Layer 3 Task 13, Layer 4 Task 14, Layer 5 Task 17-18, Layer 6 Task 19, Layer 7-8 Task 20). §17 → Task 6+14. Archive §7 → Task 5+7. Controllers §9 → Task 16+21. Flags §10 → Task 8+16+18+21. Schema §5 v4.0→v4.3 → Task 16/17/18/21. 16 findings from v2.2: each has a task marked "closes X".
+**1. Spec coverage:** all 8 layers → tasks (Layer 1 Task 9, Layer 2 Task 10, Layer 3 Task 12, Layer 4 Task 13, Layer 5 Task 16-17, Layer 6 Task 18, Layer 7-8 Task 19). §17 → Task 5+13. Archive §7 → Task 4+6. Controllers §9 → Task 6 (C1/C2) + Task 20 (C4). 16 findings from v2.2 + 9 findings from the plan audit: each is marked "closes X".

-**2. Placeholder scan:** code steps contain real code/tests. `claude-sonnet-4-6-YYYYMMDD` is an intentional placeholder, resolved in Task 0 step 3 + Task 9. nodes.yaml capabilities: delegated to a subagent (mechanical work).
+**2. Placeholder scan:** code steps are real. `claude-sonnet-4-6-YYYYMMDD` is resolved in Task 0/8. nodes.yaml capabilities: subagent.

-**3. Type consistency:** `prefilter()` / `buildClassifierPrompt()` / `shouldBlock()` / `detectMissedActivations()` / `buildExecutionTrace()` / `buildSelfAssessment()` / `parseReview()` / `generateCandidateQuestions()` / `analyze()`: names are consistent between tasks and tests. Schema fields (`classifier_output`, `inheritance`, `review`, `outcome_reviewed_source`) match spec §5.
+**3. Type consistency:** `planRollback()` / `prefilter()` / `buildClassifierPrompt()` / `parseClassifierResponse()` / `shouldBlock()` / `detectMissedActivations()` / `shouldEmbed()` / `buildStateFromClassification()` / `buildExecutionTrace()` / `buildSelfAssessment()` / `buildReviewPrompt()` / `parseReview()` / `generateCandidateQuestions()` / `analyze()`: consistent.

-**Gap fixed inline:** Task 12 (missed-activations) added explicitly to Phase 2 (spec §11.2 task 9) so that the checker does not break after the map is archived in Task 5.
+**Audit fixes inline:** Task 1 rollback FIRST + e2e proof; Task 2 economy vs §12 (G3); Task 4 registry-map (G14); Task 6 C1/C2 early (G8/G9); Task 10 accuracy-runner (G11); Task 15 parser fwd-compat (G5); Task 18 reviewer CREATE not keep (G16); rollback covers user-level + episodes preserved.

 ---

-**Plan version:** 1.0 of 2026-05-25.
-**Source:** spec v2.2 `docs/superpowers/specs/2026-05-24-llm-first-router-overhaul-design.md`.
+**Plan version:** 1.1 of 2026-05-25 (v1.0 → v1.1 after the 0% audit: rollback-first + 9 gaps).
+**Source:** spec v2.2.
 **Phase 4 (distillation):** separate plan in ~6 months.