Closes brain-retro #9 candidate 10 + self-retrospect 28.05: 16 reviewer-
Opus marks of "should have delegated to coder-agent". Controller (Opus)
was doing repetitive mechanical work itself, burning big-context budget
on tasks better suited to a fresh subagent.
PATTERN 8 trains the classifier to recognize mechanical/repetitive
signals in the prompt ("N odnotipnyh" = "N of the same kind",
"massovaya pravka" = "bulk edit", "po shablonu" = "from a template")
and to recommend coder-agent #19 via Task tool delegation.

Closes brain-retro #9 candidate 8: 8 reviewer-Opus marks of "should
have used Sentry first". Self-retrospect 28.05: "symptom from
production → guessing from the code instead of checking Sentry".
PATTERN 7 forces the classifier to put Sentry MCP (#34) FIRST in
recommended_chain when the prompt indicates a production-runtime origin
("boevoj" = "production", "klient soobschil" = "client reported",
"v logah" = "in the logs", etc).
NB: Sentry MCP is currently pending B-1 deployment per Tooling section
4.8, but pattern is added so classifier produces correct recommendation
once instance is live.

Closes brain-retro #9 candidate 1: classifier recognized bugfix via
PATTERN 4 (→ systematic-debugging) but didn't extend to chain with
Pest #18 for test-first regression coverage.
Real-world driver: adr-judge.py catastrophic backtracking fix (commit
1e1457eb) — should have gone through TDD via Pest, not direct edit.
Reviewer Section A in retro #9 flagged this.
PATTERN 6 extends PATTERN 4 with explicit chain recommendation when
fix touches live code (regex/parser/hook/race/perf).

Brain-retro #6 follow-up #2 (consolidated). Eight independent fixes:
A1 — task_cost wiring (cost tracking)
- router-prehook.mjs: capture classifier LLM usage via onUsage callback,
persist to state.task_cost.classifier_input_tokens / output_tokens.
- observer-transcript-parser.mjs: merge router-state.task_cost on top of
extractTokenUsage(turn). State-file values win for classifier/
self_assessment/reviewer fields.
- New buildCostFromClassifierUsage() exported from router-prehook.
- Verified live: state file now shows real input_tokens=190 /
output_tokens=598 / cache_read=10075 (was 0 before).
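
The merge rule in A1 can be sketched roughly as follows. This is an illustration only, not the code in observer-transcript-parser.mjs: `mergeTaskCost`, the prefix list, and the exact field name `classifier_output_tokens` are assumptions made for the example.

```javascript
// Hypothetical sketch of "state-file values win" for classifier /
// self_assessment / reviewer cost fields. Names here are assumptions.
const STATE_OWNED_PREFIXES = ['classifier_', 'self_assessment_', 'reviewer_'];

function mergeTaskCost(transcriptCost, stateCost) {
  const merged = { ...(transcriptCost ?? {}) };
  for (const [key, value] of Object.entries(stateCost ?? {})) {
    if (value == null) continue; // nothing recorded in the state file
    const stateOwned = STATE_OWNED_PREFIXES.some((p) => key.startsWith(p));
    // State-owned fields always take the state-file value;
    // any other field is only filled in when the transcript has none.
    if (stateOwned || merged[key] === undefined) merged[key] = value;
  }
  return merged;
}

// Usage: the transcript saw 0 classifier tokens, the state file has real ones.
const cost = mergeTaskCost(
  { input_tokens: 1200, classifier_input_tokens: 0 },
  { input_tokens: 5, classifier_input_tokens: 190, classifier_output_tokens: 598 },
);
console.log(cost);
```

Keeping the transcript value for non-classifier fields means the state file cannot accidentally overwrite main-session usage.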
A2 — self-assessment coverage
- observer-self-assessment-api.mjs: DEFAULT_TIMEOUT_MS 10s -> 30s.
- .claude/settings.json: Stop-hook timeout 15s -> 60s.
- Same Windows TLS handshake delay that affected the router-prehook hook
timeout. Retro #6 showed 85% of episodes with no_self_assessment.
B3 — brain-retro SKILL.md reconciliation
- Step 5b: batch=default for N>=20, subagent for N<20.
C1 — dead-code cleanup
- Removed recommendNode import + getClassificationMap + getDormancy from
observer-transcript-parser.mjs.
G — parseClassifierResponse Pass 3 (fixLLMJsonQuirks)
- Root cause: real Sonnet output sometimes contains raw newlines inside
string values (multi-line reason_for_choice) and trailing commas, which
strict JSON.parse rejects. Result was llm_error_type=parse_null on
every other call, falling back to regex with task_type=unknown.
- Fix: after Pass 1 (clean) and Pass 2 (brace-extract) fail, try Pass 3
that escapes raw newline/tab inside string values and strips trailing
commas before final JSON.parse attempt. Pure char-walk, no JSON5 dep.
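
A minimal sketch of what such a Pass 3 char-walk might look like, assuming the input is already the brace-extracted candidate. The body below is illustrative and is not copied from router-prehook or the classifier; `parseWithQuirkFix` is a hypothetical wrapper name.

```javascript
// Illustrative char-walk: escape raw newline/CR/tab inside string values and
// drop trailing commas before } or ]. Not the repository's actual code.
function fixLLMJsonQuirks(text) {
  let out = '';
  let inString = false;
  let escaped = false;
  for (const ch of text) {
    if (inString) {
      if (escaped) { out += ch; escaped = false; continue; }
      if (ch === '\\') { out += ch; escaped = true; continue; }
      if (ch === '"') { out += ch; inString = false; continue; }
      if (ch === '\n') { out += '\\n'; continue; }
      if (ch === '\r') { out += '\\r'; continue; }
      if (ch === '\t') { out += '\\t'; continue; }
      out += ch;
      continue;
    }
    if (ch === '"') { out += ch; inString = true; continue; }
    if (ch === '}' || ch === ']') {
      // Outside a string: remove a comma (plus whitespace) right before the closer.
      out = out.replace(/,\s*$/, '');
    }
    out += ch;
  }
  return out;
}

// Hypothetical wrapper: strict parse first, quirk-fix only as a last resort.
function parseWithQuirkFix(text) {
  try { return JSON.parse(text); } catch { /* fall through to the lenient pass */ }
  try { return JSON.parse(fixLLMJsonQuirks(text)); } catch { return null; }
}

const raw = '{"reason_for_choice": "line one\nline two",\n "tags": ["a", "b",],\n}';
console.log(parseWithQuirkFix(raw));
```

Tracking the in-string state is what keeps a comma or bracket inside a string value from being touched.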
H — 'unknown' added to NON_BLOCKING_TASK_TYPES in router-tool-gate.mjs
- Until G fully proves itself, blocking Bash/Edit on unknown is too strict.
With G in place, parse_null should be rare; H gives a safety net.
Tests added: +9 across 5 test files. Regression: 913 vitest tests in tools/.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three independent fixes from brain-retro #6 root-cause analysis:
1. **.claude/settings.json** — UserPromptSubmit `router-prehook.mjs` timeout
raised 10s→60s. First fetch on Windows triggers TLS handshake which can
take 20+ seconds; LLM classifier had perAttemptTimeoutMs=30s with 4
retries but the WRAPPING hook timeout killed the process at 10s before
first attempt completed. Result: only 1 of 325 episodes since 24.05
actually classified via Sonnet 4.6 (rest fell to regex fallback or
left state-file untouched).
2. **tools/observer-transcript-parser.mjs:937-959** — removed
`classifMapNode` silent fallback in `primary_rationale.recommended_node`.
When router-state file had no recommended_node, the parser was filling
it with `recommendNode(classifyTask(prompt), ...)` — a keyword-regex
that LOOKED like a classifier signal but wasn't. brain-retro #6
analysis showed 60-70% of `recommended_node` values were just regex
false-positives, polluting the `direct_ignored_rec` metric.
Now recommended_node is null when no real classifier signal exists.
3. **.claude/skills/brain-retro/SKILL.md** — added MANDATORY DIGITAL
ANALYSIS block at the top of Procedure. Every /brain-retro run MUST
emit 7 quantitative tables (path-type, node_chosen, recommended_node,
GAP, outcome×group, classifier presence, per-classification discipline).
Also forbids jargon in sanity questions (per memory
`feedback_plain_language.md`) — owner is non-developer.
Tests:
- tools/observer-transcript-parser.test.mjs — 2 tests updated to assert
recommended_node=null on no-state-file (was '#19'). Confirmed RED
→ fix → GREEN.
- tools/router-classifier.test.mjs — 10 new parametrised tests for
project-vocabulary anchors (webhook/queue/migration/RLS/etc).
Already GREEN with current ANCHOR_NOUNS — prefilter uses len<15
threshold which doesn't catch typical business prompts.
Regression: 899 vitest tests passed (1 file failure pre-existing in
.claude/worktrees/supplier-project-failover/ — empty file, unrelated).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Surfaces 4 new fields from the Sonnet classifier path into the v4
episode and exposes 2 new factor-matrix axes. Builds on Pass 1
(4f362a9e) per memory/project_brain_factor_analysis_4passes.md.
# router-classifier.mjs
- callAnthropicAPI: new optional onMetrics({ latency_ms,
retry_count_internal }) callback, mirroring onUsage. Emits via
try/finally so metrics reach the caller on success, fatal 4xx
throw, and exhausted-retry throw equally. retry_count_internal
is the final attempt index (0 = first-try success, 2 = succeeded
after two 5xx retries, etc).
- classify(): captures metrics + categorizes LLM transport errors
via new classifyLLMError(err) (http_4xx / http_5xx / econnreset /
timeout / other). Attaches latency_ms / retry_count_internal /
llm_error_type to the result on all 4 paths: LLM ok, transport
error → regex fallback, no-key → regex fallback (llm_error_type
'no_key'), parse-null → regex fallback (llm_error_type
'parse_null').
- Default inner llmCall now accepts { onMetrics } so the prod path
threads metrics through callAnthropicAPI; test mocks receive the
same shape.
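
The try/finally metric emission and the error categorization described above could look roughly like this. It is a sketch under assumptions: `callWithMetrics` and `doFetch` are hypothetical stand-ins for callAnthropicAPI and its HTTP call, only 5xx responses are treated as retryable here, and the `err.status` / `err.code` / `err.name` checks are guesses at how the real classifyLLMError inspects errors.

```javascript
// Hypothetical stand-in for callAnthropicAPI: `doFetch(attempt)` performs one
// HTTP attempt and rejects with an error carrying a numeric `status` on failure.
async function callWithMetrics(doFetch, { maxRetries = 4, onMetrics } = {}) {
  const startedAt = Date.now();
  let attempt = 0; // final attempt index: 0 = first-try success
  try {
    for (;;) {
      try {
        return await doFetch(attempt);
      } catch (err) {
        const retryable = typeof err.status === 'number' && err.status >= 500;
        if (!retryable || attempt >= maxRetries) throw err;
        attempt += 1;
      }
    }
  } finally {
    // Runs on success, on a fatal 4xx throw, and on exhausted retries alike.
    if (onMetrics) {
      onMetrics({ latency_ms: Date.now() - startedAt, retry_count_internal: attempt });
    }
  }
}

// Assumed mapping from an error object to the categories named in the commit.
function classifyLLMError(err) {
  const status = err?.status;
  if (typeof status === 'number' && status >= 400 && status < 500) return 'http_4xx';
  if (typeof status === 'number' && status >= 500 && status < 600) return 'http_5xx';
  if (err?.code === 'ECONNRESET') return 'econnreset';
  if (err?.name === 'AbortError' || err?.name === 'TimeoutError' || err?.code === 'ETIMEDOUT') {
    return 'timeout';
  }
  return 'other';
}
```

Putting the callback in `finally` rather than after the `return` is what guarantees the caller sees latency and retry count even when the call throws.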
# observer-state-enricher.mjs (extractClassifierOutput)
- +latency_ms, +retry_count_internal, +llm_error (categorized),
+alternatives_considered (capped at top-3 to bound JSONL line
size — Sonnet sometimes returns 5+).
- All four fields null-safe on regex / prefilter / cache paths.
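
As an illustration of the null-safe extraction with the top-3 cap, a sketch might read as below. `extractClassifierMetrics` is a hypothetical name, and the mapping of the classifier's `llm_error_type` onto the episode field `llm_error` is an assumption based on the two field names that appear in this message.

```javascript
const MAX_ALTERNATIVES = 3; // cap to bound JSONL line size

// Hypothetical sketch; the real extractClassifierOutput returns more fields.
function extractClassifierMetrics(classifierResult) {
  const r = classifierResult ?? {};
  return {
    latency_ms: typeof r.latency_ms === 'number' ? r.latency_ms : null,
    retry_count_internal:
      typeof r.retry_count_internal === 'number' ? r.retry_count_internal : null,
    llm_error: r.llm_error_type ?? null,
    alternatives_considered: Array.isArray(r.alternatives_considered)
      ? r.alternatives_considered.slice(0, MAX_ALTERNATIVES)
      : null,
  };
}

// A regex-path result carries none of the LLM fields, so every field is null.
console.log(extractClassifierMetrics({ source: 'regex' }));
```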
# brain-retro-analyzer.mjs (FACTOR_FNS)
- latency_bucket: fast (<500ms) / medium / slow / very_slow / null.
- error_type: classifier_output.llm_error verbatim with null default.
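
The two factor functions could be sketched like this. Only the `fast (<500ms)` boundary comes from this message; the 2000 ms and 10000 ms cut-offs below are placeholder assumptions, and the function names are hypothetical.

```javascript
// Sketch of the two FACTOR_FNS axes. Only the 500 ms boundary is from the
// commit message; the other thresholds are assumed for illustration.
function latencyBucket(episode) {
  const ms = episode?.classifier_output?.latency_ms;
  if (typeof ms !== 'number' || Number.isNaN(ms)) return null;
  if (ms < 500) return 'fast';
  if (ms < 2000) return 'medium'; // assumed boundary
  if (ms < 10000) return 'slow'; // assumed boundary
  return 'very_slow';
}

function errorType(episode) {
  // Verbatim llm_error, with null when the field is absent.
  return episode?.classifier_output?.llm_error ?? null;
}

console.log(latencyBucket({ classifier_output: { latency_ms: 120 } }));
console.log(errorType({ classifier_output: { llm_error: 'timeout' } }));
```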
# Tests
15 new tests (all RED first, then GREEN):
- router-classifier.test.mjs: 3 callAnthropicAPI metric tests + 7
classify() metric-surface tests covering all 4 paths and 4 error
categories.
- observer-state-enricher.test.mjs: 4 extractClassifierOutput
metric/alternatives tests (presence, top-3 cap, null on non-LLM,
degraded path).
- brain-retro-analyzer.test.mjs: 2 axis-presence tests.
Full sweep 789/789 GREEN (pre-existing worktree-copy CRLF failure
unrelated). Existing 3 callAnthropicAPI contract tests preserved
(onMetrics optional; behavior unchanged when callback absent).
LEFTHOOK=0 due to quirk #111. Manual gitleaks scan: clean.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
router-classifier no longer calls the unreachable api.anthropic.com and
no longer reads ANTHROPIC_API_KEY (that variable was pulling the main
Claude Code session off its subscription). callAnthropicAPI now targets
ProxyAPI by default, takes its key from a separate ROUTER_LLM_KEY
variable, and takes the base URL from ROUTER_LLM_BASE_URL (optional).
No key → Layer 2 is silently disabled and the classifier falls back to
regex. +6 tests (30/30 GREEN).
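
A sketch of the environment handling, under stated assumptions: `resolveRouterLLMConfig` is a hypothetical helper, and `DEFAULT_BASE_URL` is a placeholder because the actual ProxyAPI endpoint is not given in this message.

```javascript
// Placeholder only: the real default ProxyAPI endpoint is not stated here.
const DEFAULT_BASE_URL = 'https://proxy.example.invalid';

// Hypothetical helper: returns null when Layer 2 (the LLM) must stay off.
function resolveRouterLLMConfig(env) {
  // ANTHROPIC_API_KEY is deliberately never consulted.
  const apiKey = (env.ROUTER_LLM_KEY ?? '').trim();
  if (!apiKey) return null; // caller silently falls back to regex
  const baseUrl = (env.ROUTER_LLM_BASE_URL ?? '').trim() || DEFAULT_BASE_URL;
  return { apiKey, baseUrl: baseUrl.replace(/\/+$/, '') };
}

console.log(resolveRouterLLMConfig({ ANTHROPIC_API_KEY: 'ignored' }));
```

Returning `null` instead of throwing keeps the no-key case a quiet downgrade rather than an error path.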
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
buildLLMPrompt serializes the active nodes + chains into the prompt.
classify() is a regex + LLM hybrid with a per-prompt-hash cache.
callAnthropicAPI uses the built-in fetch (no SDK).
shouldEscalate: confidence<0.7 AND not micro.
Falls back to the regex result on LLM error.
NB: real-API verification is deferred: there is no ANTHROPIC_API_KEY on
the dev machine; Phase A 'option 2': mock tests only. Once a key is
available, the code will work without changes.
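
The escalation rule and the hybrid flow can be sketched as follows. Only the `confidence<0.7 AND not micro` condition is from this message; `classifyHybrid`, the injected `regexClassify` / `llmClassify` functions, and the FNV-1a hash used as the cache key are assumptions for the example (the real cache may well use a different hash).

```javascript
const ESCALATION_THRESHOLD = 0.7;

// Escalate to the LLM only for low-confidence, non-micro prompts.
function shouldEscalate(regexResult) {
  return regexResult.confidence < ESCALATION_THRESHOLD && !regexResult.micro;
}

// Stand-in hash (32-bit FNV-1a) so the sketch needs no imports.
function hashPrompt(prompt) {
  let h = 0x811c9dc5;
  for (let i = 0; i < prompt.length; i += 1) {
    h ^= prompt.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16);
}

// Hypothetical hybrid flow with injected classifiers and a Map cache.
async function classifyHybrid(prompt, { regexClassify, llmClassify, cache }) {
  const regexResult = regexClassify(prompt);
  if (!shouldEscalate(regexResult)) return regexResult;
  const key = hashPrompt(prompt);
  if (cache.has(key)) return cache.get(key);
  try {
    const llmResult = await llmClassify(prompt);
    cache.set(key, llmResult);
    return llmResult;
  } catch {
    return regexResult; // LLM failure: keep the regex answer
  }
}
```

Failed LLM calls are not cached in this sketch, so a later identical prompt gets another attempt.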
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
classifyByRegex(prompt, registry) → {taskType, micro, recommendedNode, confidence, source}.
Read-only, no fs/exec/net. RU+EN keywords for the task type + micro
detection + matching against the keyword/classification triggers of the
active registry nodes.
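
To make the return shape concrete, here is a toy version of such a pure classifier. Everything except the function signature and the five result keys is assumed: the keyword lists, the registry entry shape (`id`, `active`, `keywords`), and the confidence values are invented for illustration.

```javascript
// Toy keyword tables (RU + EN); the real lists are not shown in this message.
const TASK_TYPE_PATTERNS = [
  { taskType: 'bugfix', pattern: /\b(bug|fix|error|crash)\b|баг|ошибк|почин/i },
  { taskType: 'feature', pattern: /\b(add|implement|feature)\b|добав|реализ/i },
];
const MICRO_PATTERN = /\b(typo|rename)\b|опечатк|переимен/i;

// Pure function: no fs, exec, or network access.
function classifyByRegex(prompt, registry) {
  const text = String(prompt ?? '');
  const lower = text.toLowerCase();
  const typeHit = TASK_TYPE_PATTERNS.find((t) => t.pattern.test(text));
  // Assumed registry entry shape: { id, active, keywords: string[] }.
  const node = (registry ?? []).find(
    (n) => n.active && (n.keywords ?? []).some((k) => lower.includes(k.toLowerCase())),
  );
  let confidence = 0.2; // assumed scoring, for illustration only
  if (typeHit && node) confidence = 0.8;
  else if (typeHit || node) confidence = 0.5;
  return {
    taskType: typeHit ? typeHit.taskType : 'unknown',
    micro: MICRO_PATTERN.test(text),
    recommendedNode: node ? node.id : null,
    confidence,
    source: 'regex',
  };
}

const registry = [{ id: 'node-a', active: true, keywords: ['webhook'] }];
console.log(classifyByRegex('Почини баг в webhook', registry));
```

The Cyrillic alternatives are matched as plain substrings because `\b` in JavaScript regular expressions only recognizes ASCII word characters.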
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>