Worktree has no app/node_modules — vitest not run here; final regression
deferred to main-checkout post parallel-session release. Logic is a 7-line
pure filter; tests cover empty filter, classification, errors-only.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The config's include `../tools/*.test.mjs` resolves relative to its
own dir (app/), not cwd. Baseline verified 2026-05-19 from app/:
11 files, 169 tests passing, 0 failures.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Standalone HTML dashboard that visualises the observer episode log over
the automation-graph topology — 4 views (map / task-replay / session
feed / aggregate), graph as shared canvas, 3-phase build order.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
#1 — detectAskUserQuestionChoice: when a turn contains an AskUserQuestion
whose answer exactly matches an offered option label, classify as
user_chose_from_options. The answered entry carries a structured
toolUseResult (questions[].options[].label + answers map). A custom
"Other" free-text answer is NOT a pick — falls through. Wired into
parseTranscript after the text-list detector.
#3 — parallel_session: dropped broad word matches (параллельн /
"parallel session") that false-fired on any casual mention. Now only
strong collision evidence (foreign git index / чужой staged /
index.lock / another git process). Best-effort per spec R2 — prefer
false-negative over false-positive.
169/169 tools tests GREEN (+9 new).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Header «Версия» line lagged at 2.18 while §9 already carried v2.19
(factor-analysis extension) and v2.20 (phase 1.1) entries — pre-existing
drift from f7f37fb. Header now reflects actual latest version; v2.18
summary demoted to «v2.18 наследие». Full per-version detail stays in §9.
Через /claude-md-management:claude-md-improver (§5 п.10).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds 3rd decision_provenance kind for collaborative-choice case
(user picks one of options Claude offered). Distinct from
user_directed_method: counterfactual = Claude's recommended option,
not "what Claude would have done autonomously". Routing-gate does
NOT block this kind — collaborative choice from Claude-designed
choice-space.
Trigger: 19.05.2026 live false-positives — "1 экономия 0%",
"в делаем", "делай 2" classified as user_directed_method.
§11 + 8 subsections; 7-attribute decision_provenance schema;
new tools/observer-choice-detector.mjs (pure module); parser
+routing-gate +/brain-retro extensions.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pure deterministic Layer-4 aggregation module (spec §6) for the /brain-retro
skill. Exports: dedupeEpisodes, inferOutcome, groupEpisodesToTasks,
findCausalChains, buildFactorMatrix, analyze. Read-only — never writes JSONL.
11/11 tests green. CLI smoke: 10 real episodes → valid JSON with all 5 keys.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
12-task plan implementing the spec
docs/superpowers/specs/2026-05-19-observer-factor-analysis-design.md
in 4 layers (schema v2 + capture + enforcement + analysis) plus
normative sync. Each task has TDD steps with full code.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Design for making the brain governance observer rich enough for real
factor analysis. Surfaced during a discussion with the owner: the
observer is "paper-complete" but episodes lack the data factor analysis
needs — the outcome is a hardcoded "success", there is no decision
provenance (who chose the node — Claude autonomously, or the owner
forcing a method), no environment factors, no task grouping.
4-layer architecture:
- Layer 1 — episode schema v2: decision_provenance (+ counterfactual),
environment block, task_size, real outcome enum, task_ref.
- Layer 2 — capture: deterministic transcript parsing for all factors +
a one-line routing tag (owner-forced-method only).
- Layer 3 — two-sided enforcement: 3a routing-gate (Stop-hook blocks the
turn until the tag is present — unbypassable by Claude); 3b observer
self-discipline (silent failures become recorded observer_error
markers; coverage + registration verified by a controller).
- Layer 4 — analysis: /brain-retro infers real outcome from the next
episode's opening prompt, groups episodes into tasks, correlates
causal chains, builds the factor matrix.
Scope: everything except an independent agent-judge — that, plus
confusion_marker as a real judgment and real-time friction flags, is
phase 2 (separate spec).
Brainstormed via superpowers:brainstorming. Next: writing-plans.
Refs: ADR-011, spec 2026-05-19-brain-governance-design.md, Pravila §16.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The committed STATUS.md was stale (generated 2026-05-19T03:49, before
the C1/C2 strict-mode fixes and before the post-commit hook existed):
it showed C1/C2 🔴 and "0 episodes". Regenerated via the now-installed
post-commit hook (C4 status-md job) — C1/C2/C3/C4 all ✅, 5 episodes.
Context: `.git/hooks/post-commit` was never installed, so the C4
status-md job (lefthook post-commit) never ran automatically. Fixed
locally via `lefthook install --force` (installs pre-commit/post-commit/
pre-push). The hook files live in `.git/` and are not version-tracked —
re-run `lefthook install` after clone if hooks go missing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Stop-hook was writing empty-shell episodes (task_id "unknown-<ts>",
node_chosen "unknown", events []). Root cause: buildEpisodeFromContext
read fields from the Stop-event stdin that Claude Code never sends
(primary_rationale, node_chosen, ...) and the session field name was
wrong (ctx.sessionId camelCase vs Claude Code's session_id). The hook
never read transcript_path — the only real source of session data.
New tools/observer-transcript-parser.mjs — pure parseTranscript(text,
fallbackSessionId):
- Scopes to the last turn (from the last real user prompt to EOF) —
one episode == one prompt→response cycle. A tool_result-carrier user
message is not treated as a turn boundary.
- Extracts task_id (real sessionId), timestamps (real duration),
skill_invoked events, a tool_summary event with per-tool counts,
error events (tool_result is_error), node_chosen (first skill, else
"direct"), hard_floor (invoked when a superpowers:* skill is used),
path_type (regulated/improvised), task_classification (keyword
heuristic on the prompt).
- Reasoning fields triggers_matched/candidates_considered/
boundaries_applied stay [] — not recoverable from a transcript;
their capture is a separate ADR-011 follow-up.
observer-stop-hook.mjs: reads ctx.transcript_path + ctx.session_id
(camelCase fallback kept), readFileSync best-effort, delegates to
parseTranscript. No transcript → graceful fallback to ctx defaults.
Episode schema (5 mandatory + 7-field primary_rationale) unchanged —
no normative change. Stop-event is never blocked (exit 0 on any error).
TDD: 17 parseTranscript tests + 1 buildEpisodeFromContext transcript
test. Full tools Vitest 70/70 GREEN. CLI smoke against a real 575-entry
transcript: episode populated — real task_id, ~6.5 min duration,
tool_summary {Bash:5,Read:5,Grep:1,Edit:9,Write:1}, error event.
Refs: ADR-011 brain governance §6.2 (observer evidence loop).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Memory files (e.g. feedback_brain_unused_tools_not_problem.md) live
in C:/Users/.../memory/, OUTSIDE the git repo. Markdown link from
docs/observer/STATUS.md (relative path) resolved to non-existent
in-repo path → lychee broken-link error in pre-push gate.
Fix: plain-text mention of memory key (no markdown link), with
explicit note «outside-repo memory store». Generator updated
accordingly; 31/31 Vitest tests still GREEN.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Aggregates C1/C2/C3 outputs via execFileSync (Security Guidance #40
compliant — uses fixed args array, no shell injection surface) +
observer episode count. Behavioral rule embedded in metric copy.
Per ADR-011 + spec §6.4.
3 Vitest tests GREEN (31/31 total).
Smoke run rebuilds STATUS.md with current state:
- C1 🔴 (l1-watcher surfaces 9 plugins in settings not formalized
in Tooling Прил. Н by exact name@source — see commit 4382de3)
- C2 🔴 (cross-ref-checker surfaces noise from 'наследие' headers
— see commit a780959 DWC)
- C3 ✅ (0 weeks since last read)
- C4 ✅ (this file)
Both 🔴 states surface known pre-existing drift (not regressions).
C5 lefthook wiring will handle WARN-vs-FAIL semantics.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Verified Stop event collision before B5 registration:
- User-level (~/.claude/settings.json): Stop hook = agent-type
Sonnet-4.6 economy compliance verifier (already wired in
6-component arch).
- Project-level (.claude/settings.json): Stop slot empty.
observer-stop-hook will register as command-type entry in
project-level Stop array. Independent slot from user-level agent;
no overwrite, no collision. Per Pravila ADR-010 HK1 hard-rule.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Empty infrastructure per ADR-011 + Pravila §16.2. Hook + generators
wire up in subsequent tasks (B2 PII filter, B3 Stop-hook, B5 register
in settings.json, C4 STATUS generator).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>