From e239160a2e338a9bd8ba5e2cae0a5e3ce574e345 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=D0=94=D0=BC=D0=B8=D1=82=D1=80=D0=B8=D0=B9?=
Date: Sun, 24 May 2026 07:09:10 +0300
Subject: [PATCH] docs(brain): baseline pre-enforcement snapshot (stage 2 task 6)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Recorded the router discipline figures as of 2026-05-24, before the
stage 3 enforcement hook is launched. Sanity check passed:
missed_before=17 == missed_after=17 (delta=0) after switching the
source of truth to the registry.

observer-classification-map.json is marked deprecated and scheduled
for removal in stage 4.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 .../baselines/2026-05-24-pre-enforcement.md | 96 +++++++++++++++++++
 tools/observer-classification-map.json      |  2 +-
 2 files changed, 97 insertions(+), 1 deletion(-)
 create mode 100644 docs/observer/baselines/2026-05-24-pre-enforcement.md

diff --git a/docs/observer/baselines/2026-05-24-pre-enforcement.md b/docs/observer/baselines/2026-05-24-pre-enforcement.md
new file mode 100644
index 00000000..0d31e265
--- /dev/null
+++ b/docs/observer/baselines/2026-05-24-pre-enforcement.md
@@ -0,0 +1,96 @@
+# Router discipline baseline: pre-enforcement snapshot
+
+**Date:** 2026-05-24
+**Data source:** `docs/observer/episodes-2026-05.jsonl`
+**Stage:** Router discipline overhaul, Stage 2 (Measurements). Recorded for comparison with the post-enforcement figures from stage 3.
+**Spec:** `docs/superpowers/specs/2026-05-23-router-discipline-overhaul-design.md`
+**Plan:** `docs/superpowers/plans/2026-05-24-router-overhaul-stage-2-measurements.md`
+**Commit:** 30b795c
+
+## Data volume
+
+- Total episodes: 129 (124 v2+ plus 5 v1)
+- v2+ episodes (analyzed): 124
+- v1 episodes skipped: 5
+- Observer-error markers: 0
+
+## Figures
+
+### Discipline by task type
+
+| Task type | Episodes | % with trigger match | % via skill |
+|---|---|---|---|
+| bugfix | 6 | 33.3% | 33.3% |
+| analysis | 4 | 0% | 25.0% |
+| feature | 5 | 0% | 0% |
+| planning | 2 | 0% | 0% |
+| refactor | 1 | 0% | 0% |
+| cleanup | 1 | 0% | 0% |
+| monitoring | 1 | 0% | 0% |
+
+### Distribution by router step
+
+- distribution: `{"1": 124}`
+- total: 124
+- **suspicious: true**: more than 90% of episodes (here all 124) stopped at step=1; this is a sentinel bug in the parser and needs investigation in stage 3
+
+### Boundary application (ADR)
+
+- Total: 124
+- With boundaries: 13
+- Rate: 10.5%
+- By path_type:
+  - `improvised`: 112 episodes, 11 with boundaries, 9.8%
+  - `regulated`: 12 episodes, 2 with boundaries, 16.7%
+
+### Missed activations
+
+- Total: 17
+
+By classification:
+
+```json
+{
+  "bugfix": 4,
+  "feature": 5,
+  "refactor": 1,
+  "planning": 2,
+  "cleanup": 1,
+  "monitoring": 1,
+  "analysis": 3
+}
+```
+
+By node (top 5 by count):
+
+```json
+{
+  "#19": 12,
+  "#34": 5,
+  "#18": 4,
+  "#41": 2,
+  "#42": 2
+}
+```
+
+## Context
+
+This is the "before" point, taken ahead of enabling the stage 3 enforcement hook. After the hook has run for a week, we will take these figures again and compare.
+
+**Overhaul targets (from the spec, §acceptance criteria):**
+
+- Discipline (% of episodes with a matched trigger on classified tasks): **≥75%** (baseline recorded above: currently 33.3% for bugfix only, 0% for all other types).
+- Missed activations: **≤5/week** (baseline: 17 over the month).
+- % of feature/planning episodes without a skill: **≤10%** (baseline: 0% of feature and 0% of planning episodes went through a skill, i.e. 100% without one, so both categories miss the target).
+
+## Note on the suspicious flag
+
+`suspicious: true` in `routerStep` indicates that **all 124 v2+ episodes have `step=1`**. This means the parser `tools/observer-transcript-parser.mjs` does not yet enrich episodes with the actual router step: the `primary_rationale.step` field is currently always `1` (the sentinel default). This gap in the observer's own instrumentation is a separate work item for stage 3 (either extend the parser so it distinguishes steps, or compute step explicitly from context). Until then, the breakdown by router_step is **not informative**.
+
+## Reproducibility
+
+```bash
+node tools/brain-retro-analyzer.mjs docs/observer/episodes-2026-05.jsonl
+```
+
+Source of classificationMap + dormancy: `docs/registry/nodes.yaml` (via `tools/registry-to-classification-map.mjs`).
diff --git a/tools/observer-classification-map.json b/tools/observer-classification-map.json
index 2c10bca8..c943fba7 100644
--- a/tools/observer-classification-map.json
+++ b/tools/observer-classification-map.json
@@ -1,6 +1,6 @@
 {
   "$schema_version": 1,
-  "description": "Mapping from observer transcript-parser task_classification values to recommended Tooling Прил.Н node IDs. Source of truth for missed-activation detection (Pravila §16.4 conditional rule). 'other' deliberately empty — no recommendation, never counts as missed. DEFERRED-узлы filtered out by .node-dormancy.json at runtime. Classifier vocabulary is Claude's free judgment when writing the episode (no hardcoded enum) — adding a key here makes it 'blessed'. 'security' added 22.05.2026 (A8 follow-up): use when the PURPOSE of the task is verifying or improving security (scans, hardening, audits, threat modeling, go-live gates); NOT for bug-fixes that happen to be in security-relevant code (those stay 'bugfix'). 'marketing' added 22.05.2026 (C1 follow-up): use when the PURPOSE of the task is Лидерра's own marketing/lead-generation (content, SEO, campaigns, RU-channels, landing conversion, marketing-side 152-FZ); NOT for product features, billing flows, or PII-code audits. 'question' emptied 23.05.2026 (brain-retro #3 A1): conversational Russian Q&A («делай», «а», уточнения) was producing 17/40 false-positive missed-activations against #60 context7 — context7 is for library-docs lookup, not chat. 'memory-sync' emptied 23.05.2026 (brain-retro #3 A2): #33 claude-md-management is the channel for CLAUDE.md edits (Pravila §5 п.10), NOT for memory/*.md (auto-memory writes natively); was producing 8/40 false-positive missed-activations.",
+  "description": "DEPRECATED (2026-05-24): source of truth migrated to docs/registry/nodes.yaml + tools/registry-to-classification-map.mjs. This file is retained ONLY for historic v2-episode replay in tests; new code MUST consume the registry. Removal scheduled for stage 4 of router-discipline-overhaul. Original description follows. — Mapping from observer transcript-parser task_classification values to recommended Tooling Прил.Н node IDs. Source of truth for missed-activation detection (Pravila §16.4 conditional rule). 'other' deliberately empty — no recommendation, never counts as missed. DEFERRED-узлы filtered out by .node-dormancy.json at runtime. Classifier vocabulary is Claude's free judgment when writing the episode (no hardcoded enum) — adding a key here makes it 'blessed'. 'security' added 22.05.2026 (A8 follow-up): use when the PURPOSE of the task is verifying or improving security (scans, hardening, audits, threat modeling, go-live gates); NOT for bug-fixes that happen to be in security-relevant code (those stay 'bugfix'). 'marketing' added 22.05.2026 (C1 follow-up): use when the PURPOSE of the task is Лидерра's own marketing/lead-generation (content, SEO, campaigns, RU-channels, landing conversion, marketing-side 152-FZ); NOT for product features, billing flows, or PII-code audits. 'question' emptied 23.05.2026 (brain-retro #3 A1): conversational Russian Q&A («делай», «а», уточнения) was producing 17/40 false-positive missed-activations against #60 context7 — context7 is for library-docs lookup, not chat. 'memory-sync' emptied 23.05.2026 (brain-retro #3 A2): #33 claude-md-management is the channel for CLAUDE.md edits (Pravila §5 п.10), NOT for memory/*.md (auto-memory writes natively); was producing 8/40 false-positive missed-activations.",
   "map": {
     "refactor": ["#11", "#12", "#43", "#64", "#65"],
     "bugfix": ["#18", "#34"],