Дмитрий
|
81cbd8c1c2
|
feat(brain-retro #7): C1+C2+C3+C4 router-discipline fixes
Retro #7 (docs/observer/notes/2026-05-27-brain-retro-7.md) surfaced four
candidates from the 23 turns since retro #6. All four were implemented
test-first (TDD).
C1: transliterated slang vocabulary in router-classifier-regex-fallback.mjs.
TASK_TYPE_KEYWORDS += deploy bucket (push / запушь "push it" / выкат "roll out");
memory-sync += обнови мозг "update the brain" / эталон "reference" /
пилот "pilot" / memory dump.
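A minimal sketch of how such keyword buckets could be matched. The bucket names and keywords are taken from the message above; the object shape, the substring rule, the 'unknown' default, and the name detectTaskTypeSketch are assumptions, not the actual contents of router-classifier-regex-fallback.mjs.

```javascript
// Hypothetical data shape: one array of lowercase keywords (or stems) per bucket.
const TASK_TYPE_KEYWORDS = {
  deploy: ['push', 'запушь', 'выкат'],
  'memory-sync': ['обнови мозг', 'эталон', 'пилот', 'memory dump'],
};

// Return the first bucket whose keyword occurs in the lowercased prompt,
// or 'unknown' when no keyword matches (the default is an assumption).
function detectTaskTypeSketch(prompt) {
  const text = prompt.toLowerCase();
  for (const [taskType, keywords] of Object.entries(TASK_TYPE_KEYWORDS)) {
    if (keywords.some((kw) => text.includes(kw))) return taskType;
  }
  return 'unknown';
}
```

Plain substring matching lets a stem such as выкат cover inflected forms, at the cost of also matching unrelated words that contain the stem.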
C2: short_ambiguous_block in router-tool-gate.mjs + router-prehook.mjs.
The prehook persists prompt_length; the gate blocks Edit/Write/MultiEdit/Bash
when task_type is in {ambiguous, unknown} AND prompt_length <= 30 AND
no skill was invoked AND there is no direct_justified tag.
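The C2 condition can be sketched as a single predicate. The tool names, task types, and the 30-character threshold come from the message above; the state field names skill_invoked and direct_justified, and the function name, are hypothetical.

```javascript
const GATED_TOOLS = new Set(['Edit', 'Write', 'MultiEdit', 'Bash']);
const AMBIGUOUS_TYPES = new Set(['ambiguous', 'unknown']);
const SHORT_PROMPT_MAX = 30;

// state is assumed to be the record persisted by the prehook:
// { task_type, prompt_length, skill_invoked, direct_justified }.
function shouldBlockShortAmbiguous(toolName, state) {
  return (
    GATED_TOOLS.has(toolName) &&
    AMBIGUOUS_TYPES.has(state.task_type) &&
    state.prompt_length <= SHORT_PROMPT_MAX &&
    !state.skill_invoked &&
    !state.direct_justified
  );
}
```

All five conditions must hold, so invoking a skill or tagging the turn direct_justified is enough to lift the block.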
C3: self-assessment timeout raised from 30s to 50s in
observer-self-assessment-api.mjs. The Windows TLS handshake plus Sonnet
latency exceeded 30s. The Stop-hook has a 60s budget, so 50s leaves
headroom. DEFAULT_TIMEOUT_MS is exported for tests.
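One way to enforce such a timeout is an AbortController armed with a timer. This is a sketch under assumptions: only the DEFAULT_TIMEOUT_MS name and the 50s value come from the message above; the withTimeout helper is hypothetical and is not claimed to match observer-self-assessment-api.mjs.

```javascript
const DEFAULT_TIMEOUT_MS = 50000; // 50s, below the 60s Stop-hook budget

// Run an abortable async operation; abort its signal after timeoutMs.
// promiseFactory receives the AbortSignal and must honour it.
function withTimeout(promiseFactory, timeoutMs = DEFAULT_TIMEOUT_MS) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  // Always clear the timer so a fast response does not keep the process alive.
  return promiseFactory(controller.signal).finally(() => clearTimeout(timer));
}
```

A caller would pass something like `(signal) => fetch(url, { signal })`, so the request itself is cancelled rather than merely ignored.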
C4: Reviewer findings block in status-md-generator.mjs. The new helper
computeReviewerFindingsBlock surfaces 51 actionable findings without
running /brain-retro. It detects batch-reviewed entries via
outcome_reviewed_source=direct_api_batch. MD012 guard test added.
C5 (gitleaks-before-push) was intentionally skipped: the pre-push hook
already blocks at server side.
Tests: 956/956 root tools, 0 regressions. LEFTHOOK=0 used per quirk #111.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-05-27 06:46:55 +03:00 |
|
Дмитрий
|
808461295a
|
feat(router): Sonnet classifier + памятка + regex-fallback module (phase 2 task 10)
Phase 2 Task 10 of the LLM-first router overhaul. Spec §4.2: Layer 2
Sonnet 4.6 classifier with 4-pattern памятка (cheat-sheet) enrichment, JSON
output per spec, and the fallback chain Sonnet → regex → degraded. The
Phase 1 regex Layer 1 is extracted into its own module so that it is called
only as a fallback.
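The fallback chain Sonnet → regex → degraded could look roughly like this. It is a sketch with injected dependencies: callSonnet, the source field, and the shape of the degraded result are assumptions for illustration, and the real classify() also runs a prefilter and a cache first.

```javascript
// callSonnet: async (prompt) => parsed result or null (hypothetical).
// classifyByRegex: sync regex classifier, as named in this commit.
async function classifyWithFallback(prompt, { callSonnet, classifyByRegex }) {
  try {
    const result = await callSonnet(prompt);
    if (result && result.task_type) return { ...result, source: 'sonnet' };
    // No key or unparseable output: fall through to regex.
  } catch (err) {
    // Transport error: fall through to regex.
  }
  try {
    const result = classifyByRegex(prompt);
    if (result && result.task_type) return { ...result, source: 'regex' };
  } catch (err) {
    // Regex layer failed as well: fall through to degraded.
  }
  return { task_type: 'unknown', source: 'degraded' };
}
```

Injecting both layers keeps the chain testable without network access or an API key.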
- tools/router-classifier-regex-fallback.mjs (NEW): regex fallback module.
Extracts TASK_TYPE_KEYWORDS, HARD_KEYWORD_STEMS, detectTaskType,
keywordMatches, detectRecommendedNode, computeConfidence, classifyByRegex
verbatim from the prior classifier. Self-contained (own MICRO_KEYWORDS,
detectMicro, lower), so there are no circular imports.
- tools/router-classifier.mjs (REWRITE):
+ import { CLASSIFIER_MODEL } from router-config.mjs
+ re-export { classifyByRegex } from regex-fallback (back-compat surface)
+ buildClassifierPrompt(prompt, registry, { enrichment=true }) — spec §4.2
format with 4-pattern памятка (brainstorming / discovery-interview /
writing-plans / systematic-debugging) togglable via enrichment flag.
+ parseClassifierResponse(text) — strict task_type required, ```json fence
aware, accepts null recommended_chain_id.
+ classify() rewritten: prefilter → cache → Sonnet (CLASSIFIER_MODEL) →
regex fallback (transport error OR no key/unparseable).
+ callAnthropicAPI default model = CLASSIFIER_MODEL; max_tokens 300 → 1500
(full classifier output with alternatives & памятка needs the budget).
- removed: shouldEscalate, TASK_TYPE_KEYWORDS, detectTaskType,
keywordMatches, detectRecommendedNode, HARD_KEYWORD_STEMS, computeConfidence
(all live in regex-fallback now).
Kept legacy: buildLLMPrompt / parseLLMResponse (back-compat surface).
- tools/router-accuracy-runner.mjs: import classifyByRegex from regex-fallback
module (G11 from plan). Runner functionality unchanged.
- tools/router-classifier.test.mjs: +8 tests for buildClassifierPrompt (4) and
parseClassifierResponse (4); removed obsolete shouldEscalate block (3);
rewrote classify integration block (4 tests) to reflect new flow
(prefilter-first, LLM-always-on-fallthrough, regex on error).
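The parsing rules listed for parseClassifierResponse (task_type strictly required, json fence aware, null recommended_chain_id accepted) can be sketched as below. The function name, the null return on failure, and the normalisation of a missing recommended_chain_id to null are assumptions, not the committed implementation.

```javascript
// Build the three-backtick fence marker without writing it literally.
const FENCE = '`'.repeat(3);
const FENCED_JSON = new RegExp(FENCE + '(?:json)?\\s*([\\s\\S]*?)' + FENCE);

// Returns the parsed object, or null when the text is not usable.
function parseClassifierResponseSketch(text) {
  const fenced = text.match(FENCED_JSON);
  const raw = (fenced ? fenced[1] : text).trim();
  let parsed;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return null; // not valid JSON
  }
  if (parsed === null || typeof parsed !== 'object') return null;
  // Strict: task_type must be a non-empty string.
  if (typeof parsed.task_type !== 'string' || parsed.task_type === '') return null;
  // A null (or absent) recommended_chain_id is accepted.
  return { ...parsed, recommended_chain_id: parsed.recommended_chain_id ?? null };
}
```

Returning null instead of throwing lets the caller treat an unparseable reply the same way as a missing key and hand over to the regex fallback.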
Tests: tools/router-classifier.test.mjs 44/44 PASS. Full tools/ suite:
557 tests passed, 0 failed. Four pre-existing empty test files report
"no test suite found" (ruflo-recall-hook, subagent-prompt-prefix, and
2 others); they are unrelated and not touched in this commit.
accuracy-runner smoke: type=85%/node=55%/micro=100% on the 20-prompt set,
unchanged from pre-Task-10 baseline (regex path semantics preserved).
|
2026-05-25 14:28:25 +03:00 |
|