docs(observer): brain-retro skill + README for schema v2

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 10:55:37 +03:00
parent a6f44e5bb4
commit d484e60c46
4 changed files with 48 additions and 40 deletions
@@ -25,10 +25,11 @@ Aggregator over observer evidence. Reads JSONL + optional MD notes, surfaces can
 2. **Read evidence**: glob `docs/observer/episodes-YYYY-MM.jsonl` for the period; read all lines as JSON.
 3. **Read optional notes**: glob `docs/observer/notes/*.md` filtered by date.
 4. **Update read-counter**: bump `docs/observer/.read-counter.json` `last_read_at` to now, increment `read_count_last_period`. (Side-effect — used by C3 observer-of-observer.)
-5. **Aggregate** per `references/aggregation-template.md` — includes Factor analysis matrix (v1.1+) on 5 axes.
-6. **Propose candidates** — clearly separated section «Candidates for owner review». Each candidate has rationale + suggested edit + rejection-option.
-7. **Save retro note**: `docs/observer/notes/YYYY-MM-DD-brain-retro.md` with full aggregation.
-8. **Report to user**: high-signal summary.
+5. **Run the deterministic analyzer**: `node tools/brain-retro-analyzer.mjs docs/observer/episodes-YYYY-MM.jsonl` (pass every monthly file in the period). It returns JSON with `episodeCount`, `observerErrorCount`, `tasks` (episodes grouped into tasks), `causalChains` (error→fix candidates) and `factorMatrix` (outcome distribution per factor). The analyzer deduplicates the routing-gate double-write and infers the true `outcome` of each episode from the next episode's `prompt_signal` — never trust the stored `outcome` (it is `unknown` at write time).
+6. **Aggregate** per `references/aggregation-template.md` — fill the Factor analysis matrix from the analyzer's `factorMatrix`, the task groups from `tasks`, the causal-chain candidates from `causalChains`.
+7. **Propose candidates** — clearly separated section «Candidates for owner review». Each candidate has rationale + suggested edit + rejection-option.
+8. **Save retro note**: `docs/observer/notes/YYYY-MM-DD-brain-retro.md` with full aggregation.
+9. **Report to user**: high-signal summary.
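The outcome inference described in step 5 can be sketched roughly as follows. This is a minimal illustration, not the code in `tools/brain-retro-analyzer.mjs`: the `prompt_signal` values (`positive`, `mixed`, `correction`, `neutral`) and their mapping to outcomes are assumptions made for the example. Only the outcome labels and the "look at the next episode" rule come from the text above.

```javascript
// Hypothetical mapping from the NEXT prompt's signal to the PREVIOUS turn's outcome.
// The signal names are assumptions for illustration only.
const SIGNAL_TO_OUTCOME = {
  positive: 'success',
  mixed: 'partial',
  correction: 'rework',
};

// Returns copies of the episodes with an added `inferredOutcome` field.
// The stored `outcome` is ignored because it is `unknown` at write time.
function inferOutcomes(episodes) {
  return episodes.map((episode, i) => {
    const next = episodes[i + 1];
    const signal = next ? next.prompt_signal : undefined;
    // Unmapped signals and the last episode (no next prompt) stay `unknown`.
    const inferredOutcome = SIGNAL_TO_OUTCOME[signal] ?? 'unknown';
    return { ...episode, inferredOutcome };
  });
}

const sample = [
  { task_ref: 't1', outcome: 'unknown', prompt_signal: 'neutral' },
  { task_ref: 't1', outcome: 'unknown', prompt_signal: 'correction' },
  { task_ref: 't2', outcome: 'unknown', prompt_signal: 'positive' },
];
console.log(inferOutcomes(sample).map((e) => e.inferredOutcome));
```

The last episode of a period has no following prompt, so in this sketch it remains `unknown` until a later retro sees the next turn.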

 ## Output anatomy

@@ -27,52 +27,49 @@ YYYY-MM-DD .. YYYY-MM-DD ({N} sessions)
 | node | times used | first / last |
 |---|---|---|

-## Factor analysis matrix (v1.1+ — from `routing_decision` events + `primary_rationale`)
+## Factor analysis matrix (v2 — from `tools/brain-retro-analyzer.mjs`)

-Per-node breakdown by 5 factor axes (per spec §5.2.1).
+Outcome distribution per factor value. Source: the analyzer’s `factorMatrix`.
+Outcome is the *inferred* outcome (next-prompt sentiment), not the stored
+`unknown`. The factor `decision_provenance` directly answers the owner’s
+question — "is the rework mine or the router’s?"

-### Axis 1: triggers_matched → node
+For each factor below, render a table: factor value × outcome counts
+(`success` / `partial` / `rework` / `unknown`).
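Rendering one of these tables could look like the sketch below. The shape assumed for a single factor's data (factor value mapped to per-outcome counts) is a guess at what `factorMatrix` contains, and `renderFactorTable` is a hypothetical helper, not part of the analyzer.

```javascript
const OUTCOMES = ['success', 'partial', 'rework', 'unknown'];

// Assumed input shape (hypothetical):
// { [factorValue]: { success?: n, partial?: n, rework?: n, unknown?: n } }
function renderFactorTable(factorName, countsByValue) {
  const header = `| ${factorName} | ${OUTCOMES.join(' | ')} |`;
  const divider = `|${'---|'.repeat(OUTCOMES.length + 1)}`;
  const rows = Object.entries(countsByValue).map(([value, counts]) => {
    // Outcomes that never occurred for this value are rendered as 0.
    const cells = OUTCOMES.map((outcome) => counts[outcome] ?? 0);
    return `| ${value} | ${cells.join(' | ')} |`;
  });
  return [header, divider, ...rows].join('\n');
}

// Invented counts, for illustration only.
const table = renderFactorTable('provenance', {
  autonomous: { success: 4, rework: 1 },
  user_directed_method: { success: 1, partial: 1, rework: 2 },
});
console.log(table);
```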

-«Which triggers most often lead to this node»
+### decision_provenance (autonomous vs user_directed_method)

-| node | trigger | count |
-|---|---|---|
-
-### Axis 2: candidates_considered.dropped_because → node
-
-«Which alternatives are dropped in favor of this node, and why»
-
-| node | dropped alternative | dropped_because (top reason) | count |
-|---|---|---|---|
-
-### Axis 3: boundaries_applied → node
-
-«Which ADRs / R-rules most often resolve in favor of this node»
-
-| node | boundary (ADR-NNN / PSR R-rule) | count |
-|---|---|---|
-
-### Axis 4: hard_floor.rules → invoked frequency
-
-«How often the node is forced by hard-floor §12/§14/§15»
-
-| node | rule | invoked-count | total decisions | % forced |
+| provenance | success | partial | rework | unknown |
 |---|---|---|---|---|

-### Axis 5: task_classification → node
+### economy_level

-«Which task classes this node dominates»
+| economy_level | success | partial | rework | unknown |
+|---|---|---|---|---|

-| task_classification | top node | secondary nodes |
+### model · post_compaction · task_size bucket
+
+(one table each — same columns)
+
+### node_chosen · task_classification
+
+(one table each — same columns)
+
+## Episodes → tasks (from analyzer `tasks`)
+
+| task_ref | episodes | rework turns |
 |---|---|---|

-### Cross-tab: factor × factor
+## Causal-chain candidates (from analyzer `causalChains`)

-Pairs of factors that co-occur to resolve specific conflicts (e.g. «ADR-009 ↔ triggers_matched=['discovery']», 8 times).
-
-| factor pair | combined count | interpretation |
+| from (errored episode) | to (later episode) | shared files |
 |---|---|---|
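One plausible way to derive the rows of this table is sketched below: pair each errored episode with the nearest later episode that touches at least one of the same files. The episode fields `id`, `errored` and `files` are hypothetical names chosen for the example, and the actual matching rule in the analyzer may differ.

```javascript
// Assumed episode fields (hypothetical): `id`, `errored` (boolean), `files` (string[]).
function findCausalChainCandidates(episodes) {
  const candidates = [];
  episodes.forEach((from, i) => {
    if (!from.errored) return;
    for (const to of episodes.slice(i + 1)) {
      const sharedFiles = to.files.filter((file) => from.files.includes(file));
      if (sharedFiles.length > 0) {
        candidates.push({ from: from.id, to: to.id, sharedFiles });
        break; // keep only the nearest later episode per errored episode
      }
    }
  });
  return candidates;
}

// Invented episodes, for illustration only.
const chainSample = [
  { id: 'e1', errored: true, files: ['a.mjs', 'b.mjs'] },
  { id: 'e2', errored: false, files: ['c.md'] },
  { id: 'e3', errored: false, files: ['b.mjs', 'd.md'] },
];
console.log(findCausalChainCandidates(chainSample));
```

These are candidates only: a shared file suggests, but does not prove, that the later episode fixed the earlier error.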

+## Observer health
+
+- `observerErrorCount` from the analyzer: the number of `observer_error` markers in the period.
+  Non-zero means the observer hook failed internally at least once; investigate.
+
 ## Canonical chains L1–L12 hit rate

 | chain | times | notes |
@@ -4,7 +4,7 @@ Passive evidence-loop for the Лидерра «brain» per ADR-011.

 ## Files

-- `episodes-YYYY-MM.jsonl` — append-only JSONL, one line per Stop-event (one prompt→response turn). Written by `tools/observer-stop-hook.mjs`, which parses the session transcript (`transcript_path`) via `tools/observer-transcript-parser.mjs`.
+- `episodes-YYYY-MM.jsonl` — append-only JSONL, one line per Stop-event. Schema **v2** (`schema_version: 2`): the 5 mandatory fields + `decision_provenance` (who chose the node), `environment` (economy_level / model / post_compaction / session_turn / parallel_session), `task_size`, `task_ref`, `prompt_signal`, and an `outcome` that is `unknown` at write time (refined by `/brain-retro`). On an internal hook failure a minimal `observer_error` marker line is written instead of a silent skip. Written by `tools/observer-stop-hook.mjs` via `tools/observer-transcript-parser.mjs`.
 - `notes/YYYY-MM-DD-<slug>.md` — optional MD notes for sessions with qualitative history.
 - `STATUS.md` — auto-generated dashboard. Regenerated per-commit by `tools/status-md-generator.mjs`.
 - `.read-counter.json` — C3 observer-of-observer counter. Updated on Read of observer files.
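As a rough illustration of the v2 line shape, the sketch below checks one JSONL line for the fields named in the `episodes-YYYY-MM.jsonl` entry. `missingV2Fields` is a hypothetical helper and every sample value is a placeholder. The "5 mandatory fields" are not listed in this README, so the sketch does not check them.

```javascript
// Fields named in the README entry for schema v2.
const V2_FIELDS = ['decision_provenance', 'environment', 'task_size', 'task_ref', 'prompt_signal', 'outcome'];
const ENVIRONMENT_FIELDS = ['economy_level', 'model', 'post_compaction', 'session_turn', 'parallel_session'];

// Returns the names of v2 fields missing from one JSONL line.
function missingV2Fields(line) {
  const episode = JSON.parse(line);
  if (episode.schema_version !== 2) return ['schema_version'];
  const missing = V2_FIELDS.filter((field) => !(field in episode));
  const env = episode.environment;
  if (env !== null && typeof env === 'object') {
    for (const field of ENVIRONMENT_FIELDS) {
      if (!(field in env)) missing.push(`environment.${field}`);
    }
  }
  return missing;
}

// Placeholder values, invented for the example.
const complete = JSON.stringify({
  schema_version: 2,
  decision_provenance: 'autonomous',
  environment: {
    economy_level: 'normal',
    model: 'example-model',
    post_compaction: false,
    session_turn: 3,
    parallel_session: false,
  },
  task_size: 'small',
  task_ref: 't1',
  prompt_signal: 'neutral',
  outcome: 'unknown',
});
console.log(missingV2Fields(complete));
```

A minimal `observer_error` marker line would presumably fail this check, so a real consumer would need to recognize and count such lines separately.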
@@ -16,6 +16,16 @@ Passive evidence-loop for the Лидерра «brain» per ADR-011.
 3. **Surface**: `STATUS.md` shows controllers + monthly stats.
 4. **Self-prune**: C3 warns if 54 weeks pass without any read of observer files.

+## Routing-tag discipline
+
+When the user dictates a specific method/node (e.g. «запусти discovery-interview», i.e. "run discovery-interview"), Claude must emit one line in its response:
+
+```
+<!-- routing: provenance=user_directed_method node=<chosen> counterfactual=<node Claude would have chosen autonomously> -->
+```
+
+The Stop-hook routing-gate (`tools/observer-routing-detector.mjs` + `routingGateDecision`) detects a dictated method; if the tag is missing it returns `decision: block`, so the turn cannot end without the tag. The gate fires at most once per turn (`stop_hook_active` guard). This makes `decision_provenance` reliable — factor analysis can separate a router error from a user-dictated one.
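The tag check and the once-per-turn guard can be illustrated with the sketch below. It is not the implementation in `tools/observer-routing-detector.mjs`: `parseRoutingTag` and `gateSketch` are hypothetical names, the `reason` field and the empty-object "allow" result are assumptions, and the node name `brainstorm` is invented for the example.

```javascript
// Matches the routing tag format shown above.
const ROUTING_TAG = /<!--\s*routing:\s*provenance=(\S+)\s+node=(\S+)\s+counterfactual=(\S+)\s*-->/;

// Returns the tag fields, or null when the response carries no routing tag.
function parseRoutingTag(responseText) {
  const match = ROUTING_TAG.exec(responseText);
  return match ? { provenance: match[1], node: match[2], counterfactual: match[3] } : null;
}

// Hypothetical gate: block only when a method was dictated, the tag is missing,
// and the gate has not already fired in this turn.
function gateSketch({ methodDictated, responseText, stopHookActive }) {
  if (stopHookActive) return {}; // already blocked once this turn, let it end
  if (methodDictated && parseRoutingTag(responseText) === null) {
    return { decision: 'block', reason: 'routing tag missing' };
  }
  return {};
}

const tagged =
  'Done. <!-- routing: provenance=user_directed_method node=discovery-interview counterfactual=brainstorm -->';
console.log(parseRoutingTag(tagged));
console.log(gateSketch({ methodDictated: true, responseText: 'Done.', stopHookActive: false }));
```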
+
 ## Privacy

 PII filter (phone numbers, emails, tokens) is applied **before** every write — see `tools/observer-pii-filter.mjs`. gitleaks pre-push also scans observer files as part of full-history sweep.
@@ -1,6 +1,6 @@
 # Brain Status (auto-generated)

-Last updated: 2026-05-19T07:44:20.205Z
+Last updated: 2026-05-19T07:48:08.160Z

 | Controller | Status | Details |
 |---|---|---|
@@ -8,7 +8,7 @@ Last updated: 2026-05-19T07:44:20.205Z
 | C2 Cross-ref consistency | ✅ | [cross-ref-checker] OK — 0 drift in 4 files |
 | C3 Observer-of-observer | ✅ | [observer-of-observer] OK — last read 0 week(s) ago |
 | C4 Signal status | ✅ | This file (self-reference) |
-| C5 Observer-coverage | ✅ | 10 episode(s), 950 recent commit(s) · Stop-hook + post-commit OK |
+| C5 Observer-coverage | ✅ | 10 episode(s), 951 recent commit(s) · Stop-hook + post-commit OK |

 ## Metrics (informational, not alerts)