Commit Graph

14 Commits

Author SHA1 Message Date
Дмитрий d080198220 feat(observer): coverage + registration-integrity controller (C5)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 10:38:25 +03:00
Дмитрий 35231d8b96 feat(observer): Stop-hook routing-gate enforcement 2026-05-19 10:34:57 +03:00
Дмитрий 2e11c452a9 feat(observer): Stop-hook v2 episode + observer_error marker 2026-05-19 10:31:37 +03:00
Дмитрий 02bff371c1 feat(observer): routing-gate method-direction detector 2026-05-19 10:27:23 +03:00
Дмитрий 375c3e2d1f feat(observer): parser v2 — process events, routing-tag, episode assembly 2026-05-19 10:23:08 +03:00
Дмитрий 85a95aa2d0 feat(observer): parser v2 — environment, task_size, prompt_signal extractors 2026-05-19 10:15:17 +03:00
Дмитрий 2501b00079 docs(plan): observer factor-analysis implementation plan
12-task plan implementing the spec
docs/superpowers/specs/2026-05-19-observer-factor-analysis-design.md
in 4 layers (schema v2 + capture + enforcement + analysis) plus
normative sync. Each task has TDD steps with full code.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 10:09:56 +03:00
Дмитрий e0a25ff629 docs(brain): spec — observer factor-analysis extension
Design for making the brain governance observer rich enough for real
factor analysis. Surfaced during a discussion with the owner: the
observer is "paper-complete" but episodes lack the data factor analysis
needs — the outcome is a hardcoded "success", there is no decision
provenance (who chose the node — Claude autonomously, or the owner
forcing a method), no environment factors, no task grouping.

4-layer architecture:
- Layer 1 — episode schema v2: decision_provenance (+ counterfactual),
  environment block, task_size, real outcome enum, task_ref.
- Layer 2 — capture: deterministic transcript parsing for all factors +
  a one-line routing tag (owner-forced-method only).
- Layer 3 — two-sided enforcement: 3a routing-gate (Stop-hook blocks the
  turn until the tag is present — unbypassable by Claude); 3b observer
  self-discipline (silent failures become recorded observer_error
  markers; coverage + registration verified by a controller).
- Layer 4 — analysis: /brain-retro infers real outcome from the next
  episode's opening prompt, groups episodes into tasks, correlates
  causal chains, builds the factor matrix.
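
The Layer 3a routing-gate described above could be sketched roughly as follows. This is an illustration only, not the committed hook: the tag format (`routing-tag: ...`), the `ownerForcedMethod` flag, and the function name are hypothetical, and the `{ decision: "block", reason }` output shape is assumed to be what the hook runner accepts from a Stop hook.

```javascript
// Hypothetical tag format; the real one-line routing tag is defined in the spec.
const ROUTING_TAG = /^routing-tag:\s*\S+/m;

// Returns a "block" decision when the owner forced a method but the
// assistant's reply lacks the routing tag; returns null to let the turn end.
function routingGateDecision({ ownerForcedMethod, assistantText }) {
  if (!ownerForcedMethod) return null; // the tag is required for owner-forced methods only
  if (ROUTING_TAG.test(assistantText ?? "")) return null; // tag present, nothing to enforce
  return {
    decision: "block", // assumed Stop-hook output shape
    reason: "Owner forced a method: add the one-line routing tag before finishing.",
  };
}

// Usage sketch: a Stop hook would print the decision as JSON when it is non-null.
const decision = routingGateDecision({
  ownerForcedMethod: true,
  assistantText: "Done, used the requested approach.",
});
if (decision) console.log(JSON.stringify(decision));
```

Keeping the decision in a pure function leaves the hook's I/O (stdin parsing, stdout JSON) thin and makes the gate itself easy to unit-test.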

Scope: everything except an independent agent-judge — that, plus
confusion_marker as a real judgment and real-time friction flags, is
phase 2 (separate spec).

Brainstormed via superpowers:brainstorming. Next: writing-plans.

Refs: ADR-011, spec 2026-05-19-brain-governance-design.md, Pravila §16.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 09:15:27 +03:00
Дмитрий d2b344ea24 chore(brain): refresh STATUS.md dashboard
The committed STATUS.md was stale (generated 2026-05-19T03:49, before
the C1/C2 strict-mode fixes and before the post-commit hook existed):
it showed C1/C2 🔴 and "0 episodes". Regenerated via the now-installed
post-commit hook (C4 status-md job): C1/C2/C3/C4 all green, 5 episodes.

Context: `.git/hooks/post-commit` was never installed, so the C4
status-md job (lefthook post-commit) never ran automatically. Fixed
locally via `lefthook install --force` (installs pre-commit/post-commit/
pre-push). The hook files live in `.git/` and are not version-tracked —
re-run `lefthook install` after clone if hooks go missing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 08:13:19 +03:00
Дмитрий 99c7bac99b feat(brain): observer captures real session data via transcript parse
The Stop-hook was writing empty-shell episodes (task_id "unknown-<ts>",
node_chosen "unknown", events []). Root cause: buildEpisodeFromContext
read fields from the Stop-event stdin that Claude Code never sends
(primary_rationale, node_chosen, ...) and the session field name was
wrong (ctx.sessionId camelCase vs Claude Code's session_id). The hook
never read transcript_path — the only real source of session data.

New tools/observer-transcript-parser.mjs — pure parseTranscript(text,
fallbackSessionId):
- Scopes to the last turn (from the last real user prompt to EOF) —
  one episode == one prompt→response cycle. A tool_result-carrier user
  message is not treated as a turn boundary.
- Extracts task_id (real sessionId), timestamps (real duration),
  skill_invoked events, a tool_summary event with per-tool counts,
  error events (tool_result is_error), node_chosen (first skill, else
  "direct"), hard_floor (invoked when a superpowers:* skill is used),
  path_type (regulated/improvised), task_classification (keyword
  heuristic on the prompt).
- Reasoning fields triggers_matched/candidates_considered/
  boundaries_applied stay [] — not recoverable from a transcript;
  their capture is a separate ADR-011 follow-up.
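
The last-turn scoping and per-tool counting listed above might look roughly like the sketch below. It is not the code from tools/observer-transcript-parser.mjs. The entry shape (JSONL lines with `type` and `message.content`, where content is a string or an array of `text` / `tool_use` / `tool_result` blocks) is an assumption, and all function names are hypothetical.

```javascript
// Parse JSONL text, skipping blank or malformed lines rather than failing.
function parseLines(text) {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .flatMap((line) => {
      try {
        return [JSON.parse(line)];
      } catch {
        return [];
      }
    });
}

// Content blocks of an entry, or [] when the content is a plain string.
function blocksOf(entry) {
  const content = entry?.message?.content;
  return Array.isArray(content) ? content : [];
}

// A user entry that only carries tool_result blocks is not a turn boundary.
function isRealUserPrompt(entry) {
  if (entry?.type !== "user") return false;
  const content = entry?.message?.content;
  if (typeof content === "string") return content.trim() !== "";
  const blocks = blocksOf(entry);
  return blocks.length > 0 && blocks.some((b) => b.type !== "tool_result");
}

// Slice from the last real user prompt to the end of the transcript.
function lastTurn(entries) {
  let start = 0;
  entries.forEach((entry, i) => {
    if (isRealUserPrompt(entry)) start = i;
  });
  return entries.slice(start);
}

// Count tool_use blocks per tool name and erroring tool_result blocks.
function summarizeLastTurn(text) {
  const toolCounts = {};
  let errorCount = 0;
  for (const entry of lastTurn(parseLines(text))) {
    for (const block of blocksOf(entry)) {
      if (block.type === "tool_use") {
        toolCounts[block.name] = (toolCounts[block.name] ?? 0) + 1;
      }
      if (block.type === "tool_result" && block.is_error === true) {
        errorCount += 1;
      }
    }
  }
  return { toolCounts, errorCount };
}
```

Because the function takes text and returns a plain object, it can be tested without touching the file system, which matches the "pure parseTranscript" design named in the commit.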

observer-stop-hook.mjs: reads ctx.transcript_path + ctx.session_id
(camelCase fallback kept), readFileSync best-effort, delegates to
parseTranscript. No transcript → graceful fallback to ctx defaults.
Episode schema (5 mandatory + 7-field primary_rationale) unchanged —
no normative change. Stop-event is never blocked (exit 0 on any error).
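
A minimal sketch of the input handling described above, assuming the Stop event arrives as a JSON object with `session_id` and `transcript_path`. The helper names are hypothetical and this is not the code in observer-stop-hook.mjs.

```javascript
import { readFileSync } from "node:fs";

// Prefer the snake_case fields, keeping the camelCase session fallback.
function resolveHookInput(ctx) {
  return {
    sessionId: ctx?.session_id ?? ctx?.sessionId ?? null,
    transcriptPath: ctx?.transcript_path ?? null,
  };
}

// Best-effort read: any failure (missing path, unreadable file) yields null
// so the caller can fall back to defaults instead of throwing.
function readTranscriptBestEffort(transcriptPath) {
  if (!transcriptPath) return null;
  try {
    return readFileSync(transcriptPath, "utf8");
  } catch {
    return null;
  }
}

// Usage sketch: a missing transcript degrades to null rather than an error.
const input = resolveHookInput({ sessionId: "legacy-id" });
const text = readTranscriptBestEffort(input.transcriptPath);
console.log(input.sessionId, text === null ? "no transcript" : "transcript loaded");
```

Returning null instead of throwing is what lets the hook keep its "never block the Stop event" guarantee: the caller decides on the fallback and still exits with status 0.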

TDD: 17 parseTranscript tests + 1 buildEpisodeFromContext transcript
test. Full tools Vitest 70/70 GREEN. CLI smoke against a real 575-entry
transcript: episode populated — real task_id, ~6.5 min duration,
tool_summary {Bash:5,Read:5,Grep:1,Edit:9,Write:1}, error event.

Refs: ADR-011 brain governance §6.2 (observer evidence loop).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 08:11:10 +03:00
Дмитрий 9ef5227f0f fix(observer): STATUS.md plain-text reference to memory file (lychee pre-push fix)
Memory files (e.g. feedback_brain_unused_tools_not_problem.md) live
in C:/Users/.../memory/, OUTSIDE the git repo. Markdown link from
docs/observer/STATUS.md (relative path) resolved to non-existent
in-repo path → lychee broken-link error in pre-push gate.

Fix: plain-text mention of memory key (no markdown link), with
explicit note «outside-repo memory store». Generator updated
accordingly; 31/31 Vitest tests still GREEN.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 06:49:39 +03:00
Дмитрий ce2333e309 feat(controller): C4 status-md-generator — dashboard
Aggregates C1/C2/C3 outputs via execFileSync (Security Guidance #40
compliant — uses fixed args array, no shell injection surface) +
observer episode count. Behavioral rule embedded in metric copy.
Per ADR-011 + spec §6.4.
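
To illustrate the fixed-args pattern mentioned above: `execFileSync` runs the executable directly and passes each array element as one argument, so no shell ever interprets the values. The sketch below is not the generator's code. `runNodeScript` is a hypothetical helper, and the inline script stands in for a controller.

```javascript
import { execFileSync } from "node:child_process";

// Run a Node script with a fixed argument array; no shell is spawned,
// so metacharacters inside an argument are passed through as plain text.
function runNodeScript(args) {
  return execFileSync(process.execPath, args, { encoding: "utf8" }).trim();
}

// A value that would be dangerous if it were spliced into a shell command line.
const hostile = "C1; echo injected";

// The child simply echoes its last argument back.
const echoed = runNodeScript(["-e", "console.log(process.argv.at(-1))", hostile]);
console.log(echoed === hostile ? "argument passed verbatim" : "argument was altered");
```

By contrast, building a single command string for `execSync` would hand that string to a shell, where `;` starts a second command.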

3 Vitest tests GREEN (31/31 total).

Smoke run rebuilds STATUS.md with current state:
- C1 🔴 (l1-watcher surfaces 9 plugins in settings not formalized
  in Tooling Appendix N (Прил. Н) by exact name@source; see commit 4382de3)
- C2 🔴 (cross-ref-checker surfaces noise from 'наследие' ("legacy")
  headers; see commit a780959 DWC)
- C3 green (0 weeks since last read)
- C4 green (this file)

Both 🔴 states surface known pre-existing drift (not regressions).
C5 lefthook wiring will handle WARN-vs-FAIL semantics.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 06:37:27 +03:00
Дмитрий 0cf1406314 docs(observer): HK1 pre-check noted in README (ADR-010 compliance)
Verified Stop event collision before B5 registration:
- User-level (~/.claude/settings.json): Stop hook = agent-type
  Sonnet-4.6 economy compliance verifier (already wired in
  6-component arch).
- Project-level (.claude/settings.json): Stop slot empty.

observer-stop-hook will register as command-type entry in
project-level Stop array. Independent slot from user-level agent;
no overwrite, no collision. Per Pravila ADR-010 HK1 hard-rule.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 06:17:58 +03:00
Дмитрий 910c2d0e37 feat(observer): docs/observer/ scaffolding — README + STATUS + counter + JSONL seed
Empty infrastructure per ADR-011 + Pravila §16.2. Hook + generators
wire up in subsequent tasks (B2 PII filter, B3 Stop-hook, B5 register
in settings.json, C4 STATUS generator).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 06:07:42 +03:00