docs(observer): brain-retro skill + README for schema v2

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Дмитрий
2026-05-19 10:55:37 +03:00
parent a6f44e5bb4
commit d484e60c46
4 changed files with 48 additions and 40 deletions
+5 -4
@@ -25,10 +25,11 @@ Aggregator over observer evidence. Reads JSONL + optional MD notes, surfaces can
2. **Read evidence**: glob `docs/observer/episodes-YYYY-MM.jsonl` for the period; read all lines as JSON.
3. **Read optional notes**: glob `docs/observer/notes/*.md` filtered by date.
4. **Update read-counter**: bump `docs/observer/.read-counter.json` `last_read_at` to now, increment `read_count_last_period`. (Side-effect — used by C3 observer-of-observer.)
5. **Aggregate** per `references/aggregation-template.md` — includes Factor analysis matrix (v1.1+) on 5 axes.
6. **Propose candidates** — clearly separated section «Candidates for owner review». Each candidate has rationale + suggested edit + rejection-option.
7. **Save retro note**: `docs/observer/notes/YYYY-MM-DD-brain-retro.md` with full aggregation.
8. **Report to user**: high-signal summary.
5. **Run the deterministic analyzer**: `node tools/brain-retro-analyzer.mjs docs/observer/episodes-YYYY-MM.jsonl` (pass every monthly file in the period). It returns JSON with `episodeCount`, `observerErrorCount`, `tasks` (episodes grouped into tasks), `causalChains` (error→fix candidates) and `factorMatrix` (outcome distribution per factor). The analyzer deduplicates the routing-gate double-write and infers the true `outcome` of each episode from the next episode's `prompt_signal` — never trust the stored `outcome` (it is `unknown` at write time).
6. **Aggregate** per `references/aggregation-template.md` — fill the Factor analysis matrix from the analyzer's `factorMatrix`, the task groups from `tasks`, the causal-chain candidates from `causalChains`.
7. **Propose candidates** — clearly separated section «Candidates for owner review». Each candidate has rationale + suggested edit + rejection-option.
8. **Save retro note**: `docs/observer/notes/YYYY-MM-DD-brain-retro.md` with full aggregation.
9. **Report to user**: high-signal summary.
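The outcome inference in step 5 can be sketched roughly as follows. This is a minimal illustration, not the code of `tools/brain-retro-analyzer.mjs`: the `prompt_signal` values (`positive`, `mixed`, `negative`) and their mapping to outcomes are assumptions, and the episodes are assumed to be already ordered within a single session.

```javascript
// Hypothetical mapping from the next prompt's sentiment to an outcome label.
// The real analyzer's vocabulary is not shown in this document.
const SIGNAL_TO_OUTCOME = {
  positive: 'success',
  mixed: 'partial',
  negative: 'rework',
};

// Parse JSONL text into episode objects, skipping blank lines.
function parseJsonl(text) {
  return text
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line));
}

// Infer each episode's outcome from the prompt_signal of the episode after it.
// The stored outcome is ignored; the last episode has no successor and stays 'unknown'.
function inferOutcomes(episodes) {
  return episodes.map((episode, i) => {
    const next = episodes[i + 1];
    const inferred = next ? SIGNAL_TO_OUTCOME[next.prompt_signal] : undefined;
    return { ...episode, outcome: inferred ?? 'unknown' };
  });
}
```

An unrecognized signal falls back to `unknown` rather than guessing, which keeps the stored value and the inferred value clearly separable in the retro note.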
## Output anatomy
@@ -27,52 +27,49 @@ YYYY-MM-DD .. YYYY-MM-DD ({N} sessions)
| node | times used | first / last |
|---|---|---|
## Factor analysis matrix (v1.1+ — from `routing_decision` events + `primary_rationale`)
## Factor analysis matrix (v2 — from `tools/brain-retro-analyzer.mjs`)
Per-node breakdown by 5 factor axes (per spec §5.2.1).
Outcome distribution per factor value. Source: the analyzer's `factorMatrix`.
Outcome is the *inferred* outcome (next-prompt sentiment), not the stored
`unknown`. The factor `decision_provenance` directly answers the owner's
question: "is the rework mine or the router's?"
### Axis 1: triggers_matched → node
For each factor below, render a table: factor value × outcome counts
(`success` / `partial` / `rework` / `unknown`).
«Which triggers most often lead to this node»
### decision_provenance (autonomous vs user_directed_method)
| node | trigger | count |
|---|---|---|
### Axis 2: candidates_considered.dropped_because → node
«Which alternatives are dropped in favor of this node, and why»
| node | dropped alternative | dropped_because (top reason) | count |
|---|---|---|---|
### Axis 3: boundaries_applied → node
«Which ADRs / R-rules most often resolve conflicts in favor of this node»
| node | boundary (ADR-NNN / PSR R-rule) | count |
|---|---|---|
### Axis 4: hard_floor.rules → invoked frequency
«How often the node is forced by hard-floor §12/§14/§15»
| node | rule | invoked-count | total decisions | % forced |
| provenance | success | partial | rework | unknown |
|---|---|---|---|---|
### Axis 5: task_classification → node
### economy_level
«In which task classes the node dominates»
| economy_level | success | partial | rework | unknown |
|---|---|---|---|---|
| task_classification | top node | secondary nodes |
### model · post_compaction · task_size bucket
(one table each — same columns)
### node_chosen · task_classification
(one table each — same columns)
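A rough sketch of how one factor of the matrix could be rendered into the table layout used in this template. The shape of the input (`{ value: { success: n, ... } }`) is an assumption about `factorMatrix`, and `renderFactorTable` is a hypothetical helper name.

```javascript
// Column order used by the template's factor tables.
const OUTCOMES = ['success', 'partial', 'rework', 'unknown'];

// Render one factor as a Markdown table: factor value × outcome counts.
// `counts` is assumed to look like { autonomous: { success: 3, rework: 1 }, ... }.
function renderFactorTable(factorName, counts) {
  const header = `| ${factorName} | ${OUTCOMES.join(' | ')} |`;
  const divider = `|${'---|'.repeat(OUTCOMES.length + 1)}`;
  const rows = Object.entries(counts).map(([value, byOutcome]) => {
    // Missing outcome keys are rendered as 0 so every row has the same width.
    const cells = OUTCOMES.map((outcome) => byOutcome[outcome] ?? 0);
    return `| ${value} | ${cells.join(' | ')} |`;
  });
  return [header, divider, ...rows].join('\n');
}
```

Calling it once per factor (`decision_provenance`, `economy_level`, and so on) would produce the "one table each, same columns" layout described here.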
## Episodes → tasks (from analyzer `tasks`)
| task_ref | episodes | turns that are rework |
|---|---|---|
### Cross-tab: factor × factor
## Causal-chain candidates (from analyzer `causalChains`)
Pairs of factors that co-occur to resolve specific conflicts (e.g. «ADR-009 ↔ triggers_matched=['discovery']», 8 times).
| factor pair | combined count | interpretation |
| from (errored episode) | to (later episode) | shared files |
|---|---|---|
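One plausible way to derive error→fix candidates is to pair each errored episode with later episodes that touch the same files. The sketch below assumes hypothetical episode fields `id`, `had_error` and `files`; how the analyzer actually builds `causalChains` is not described here.

```javascript
// Pair each errored episode with every later episode sharing at least one file.
// Field names (id, had_error, files) are illustrative assumptions.
function findCausalChains(episodes) {
  const chains = [];
  episodes.forEach((from, i) => {
    if (!from.had_error) return;
    for (const to of episodes.slice(i + 1)) {
      const sharedFiles = to.files.filter((file) => from.files.includes(file));
      if (sharedFiles.length > 0) {
        chains.push({ from: from.id, to: to.id, sharedFiles });
      }
    }
  });
  return chains;
}
```

These are candidates only: a shared file suggests, but does not prove, that the later episode fixed the earlier error, so each row still needs owner review.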
## Observer health
- `observerErrorCount` from the analyzer — observer_error markers in the period.
Non-zero = the observer failed silently somewhere; investigate.
## Canonical chains L1–L12 hit rate
| chain | times | notes |
+11 -1
@@ -4,7 +4,7 @@ Passive evidence-loop for the Лидерра «brain» per ADR-011.
## Files
- `episodes-YYYY-MM.jsonl` — append-only JSONL, one line per Stop-event (one prompt→response turn). Written by `tools/observer-stop-hook.mjs`, which parses the session transcript (`transcript_path`) via `tools/observer-transcript-parser.mjs`.
- `episodes-YYYY-MM.jsonl` — append-only JSONL, one line per Stop-event. Schema **v2** (`schema_version: 2`): the 5 mandatory fields + `decision_provenance` (who chose the node), `environment` (economy_level / model / post_compaction / session_turn / parallel_session), `task_size`, `task_ref`, `prompt_signal`, and an `outcome` that is `unknown` at write time (refined by `/brain-retro`). On an internal hook failure a minimal `observer_error` marker line is written instead of a silent skip. Written by `tools/observer-stop-hook.mjs` via `tools/observer-transcript-parser.mjs`.
- `notes/YYYY-MM-DD-<slug>.md` — optional MD notes for sessions with qualitative history.
- `STATUS.md` — auto-generated dashboard. Regenerated per-commit by `tools/status-md-generator.mjs`.
- `.read-counter.json` — C3 observer-of-observer counter. Updated on Read of observer files.
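As an illustration of the v2 episode shape, the sketch below checks a parsed JSONL line for the v2-specific fields listed above. The five mandatory fields are not named in this README, so they are deliberately not checked; `missingV2Fields` is a hypothetical helper, not part of the repository.

```javascript
// v2 additions named in the README; the five mandatory fields are not listed there.
const V2_FIELDS = ['decision_provenance', 'environment', 'task_size', 'task_ref', 'prompt_signal', 'outcome'];
const ENVIRONMENT_FIELDS = ['economy_level', 'model', 'post_compaction', 'session_turn', 'parallel_session'];

// Return the names of v2 fields that are absent from an episode object.
function missingV2Fields(episode) {
  if (episode.schema_version !== 2) return ['schema_version'];
  const missing = V2_FIELDS.filter((key) => !(key in episode));
  const env = episode.environment;
  // Only inspect sub-fields when environment is actually an object.
  if (typeof env === 'object' && env !== null) {
    for (const key of ENVIRONMENT_FIELDS) {
      if (!(key in env)) missing.push(`environment.${key}`);
    }
  }
  return missing;
}
```

A check like this would need to skip `observer_error` marker lines, which are minimal by design and do not carry the full schema.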
@@ -16,6 +16,16 @@ Passive evidence-loop for the Лидерра «brain» per ADR-011.
3. **Surface**: `STATUS.md` shows controllers + monthly stats.
4. **Self-prune**: C3 warns if 54 weeks pass without any read of observer files.
## Routing-tag discipline
When the user dictates a specific method/node (e.g. «run discovery-interview»), Claude must emit one line in its response:
```
<!-- routing: provenance=user_directed_method node=<chosen> counterfactual=<node Claude would have chosen autonomously> -->
```
The Stop-hook routing-gate (`tools/observer-routing-detector.mjs` + `routingGateDecision`) detects a dictated method; if the tag is missing it returns `decision: block`, so the turn cannot end without the tag. The gate fires at most once per turn (`stop_hook_active` guard). This makes `decision_provenance` reliable — factor analysis can separate a router error from a user-dictated one.
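The gate logic described above can be sketched as below. This is an approximation under stated assumptions, not the code of `routingGateDecision`: the function names, the argument shape, and the empty object returned when the turn may end are all hypothetical.

```javascript
// Matches the routing tag in the exact field order shown above.
const ROUTING_TAG = /<!--\s*routing:\s*provenance=(\S+)\s+node=(\S+)\s+counterfactual=(\S+)\s*-->/;

// Extract the tag's fields from a response, or return null if no tag is present.
function parseRoutingTag(responseText) {
  const match = ROUTING_TAG.exec(responseText);
  if (!match) return null;
  const [, provenance, node, counterfactual] = match;
  return { provenance, node, counterfactual };
}

// Hypothetical gate: block only when a method was dictated, the tag is missing,
// and the gate has not already fired in this turn (stop_hook_active guard).
function gateDecision({ methodDictated, responseText, stopHookActive }) {
  if (stopHookActive) return {};
  if (methodDictated && parseRoutingTag(responseText) === null) {
    return { decision: 'block', reason: 'routing tag missing for a user-dictated method' };
  }
  return {};
}
```

Checking `stopHookActive` first is what keeps the gate from blocking the same turn twice.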
## Privacy
The PII filter (phone numbers, emails, tokens) is applied **before** every write; see `tools/observer-pii-filter.mjs`. The gitleaks pre-push check also scans observer files as part of the full-history sweep.
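A redaction pass of this kind might look like the sketch below. The patterns and placeholder strings are illustrative assumptions, not the contents of `tools/observer-pii-filter.mjs`; in particular, the token prefixes are hypothetical examples and the phone pattern is deliberately loose, so it will over-match some digit runs.

```javascript
// Illustrative patterns only; a production filter would need a vetted list.
const EMAIL = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
const TOKEN = /\b(?:ghp|sk)_[A-Za-z0-9]{20,}\b/g; // hypothetical token prefixes
const PHONE = /\+?\d[\d\s().-]{8,}\d/g;

// Replace emails, tokens and phone numbers with fixed placeholders.
// Tokens are redacted before phones so digit runs inside a token are not split.
function redactPii(text) {
  return text
    .replace(EMAIL, '[email]')
    .replace(TOKEN, '[token]')
    .replace(PHONE, '[phone]');
}
```

Running the redaction on the serialized line just before it is appended keeps raw values from ever reaching the JSONL file.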
+2 -2
@@ -1,6 +1,6 @@
# Brain Status (auto-generated)
Last updated: 2026-05-19T07:44:20.205Z
Last updated: 2026-05-19T07:48:08.160Z
| Controller | State | Details |
|---|---|---|
@@ -8,7 +8,7 @@ Last updated: 2026-05-19T07:44:20.205Z
| C2 Cross-ref consistency | ✅ | [cross-ref-checker] OK — 0 drift in 4 files |
| C3 Observer-of-observer | ✅ | [observer-of-observer] OK — last read 0 week(s) ago |
| C4 Signal status | ✅ | This file (self-reference) |
| C5 Observer-coverage | ✅ | 10 episode(s), 950 recent commit(s) · Stop-hook + post-commit OK |
| C5 Observer-coverage | ✅ | 10 episode(s), 951 recent commit(s) · Stop-hook + post-commit OK |
## Metrics (informational, not alerts)