Observer factor-analysis extension — Implementation Plan
For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.
Goal: Extend the brain-governance observer so a real factor analysis becomes possible — capture decision provenance, environment factors, task size, process events, and a true outcome — and make observer discipline mechanically enforced.
Architecture: Four layers over the existing observer. Layer 1 — episode schema v2 (new fields). Layer 2 — deterministic transcript parsing + a one-line routing-tag Claude prints when the user dictates a method. Layer 3 — two-sided enforcement: a Stop-hook routing-gate (decision: block when a method was dictated but the tag is missing) and an observer self-discipline controller (C5). Layer 4 — /brain-retro analysis (outcome inference, episode→task grouping, causal chains, factor matrix). All deterministic — 0 LLM calls. Implementation continues on branch feat/parallel-sessions-coordination (where brain-governance work lives) or an isolated worktree if the executing skill creates one.
Tech Stack: Node ESM (tools/*.mjs), Vitest (tools/*.test.mjs, config app/vitest.config.tools.mjs), lefthook, Markdown skills/normative docs.
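The Layer 3 routing-gate described above can be pictured as a small pure decision function. This is only a sketch under stated assumptions: the name routingGateDecision, its parameters, and the reason text are hypothetical and do not come from the spec. The grounded parts are the rule itself (block when a method was dictated but the tag is missing) and the tag prefix.

```javascript
// Hypothetical sketch of the Layer 3 routing-gate decision (names are assumptions).
// methodDirected: detector result, e.g. { directed: true, node: 'brainstorming' }
// assistantText:  concatenated assistant text of the current turn
function routingGateDecision(methodDirected, assistantText) {
  const hasTag = /<!--\s*routing:\s*provenance=/.test(String(assistantText || ''));
  if (methodDirected && methodDirected.directed && !hasTag) {
    // A method was dictated but the routing tag was not printed: block the stop.
    return {
      decision: 'block',
      reason: `routing tag missing for user-directed node "${methodDirected.node}"`,
    };
  }
  return null; // nothing to enforce, let the turn finish
}

console.log(routingGateDecision({ directed: true, node: 'brainstorming' }, 'plain answer'));
console.log(routingGateDecision({ directed: false, node: null }, 'plain answer'));
```

Keeping the decision pure (no file or process access) would let the Stop-hook test it with plain fixtures, in line with the "all deterministic" constraint.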
Source spec
docs/superpowers/specs/2026-05-19-observer-factor-analysis-design.md (v1.0). Read it before starting — every task below traces to a spec section.
Two deliberate spec-gap resolutions (decided here, not in the spec)
The spec §6 describes Layer-4 logic that needs data §3 does not list. Both are resolved in this plan:
1. prompt_signal: spec §6 infers an episode's outcome from "the first user-prompt of the next episode". An episode does not store raw prompt text (PII risk). Resolution: each v2 episode stores prompt_signal, a deterministic classification of its own opening user-prompt (correction | approval | new_task | neutral). inferOutcome then reads nextEpisode.prompt_signal. PII-safe, deterministic.
2. task_size.files: spec §3 says task_size.files_touched is a count; spec §6 causal chains need the actual file paths. Resolution: task_size carries both files_touched (count, per §3) and files (string array). File paths are not PII (the PII filter covers phones/emails/tokens).
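The first resolution can be sketched as follows. The signal-to-outcome mapping and the 'failure' label are assumptions made purely for illustration (this plan does not reproduce the spec's mapping), and the single-argument signature is hypothetical. The only grounded idea is that inferOutcome reads the next episode's prompt_signal rather than any raw prompt text.

```javascript
// Hypothetical mapping: the outcome labels below are assumptions, not the spec's table.
const SIGNAL_TO_OUTCOME = new Map([
  ['correction', 'failure'], // the next prompt pushed back on the result
  ['approval', 'success'],   // the next prompt accepted the result
]);

// Infers the outcome of an episode from the episode that follows it.
function inferOutcome(nextEpisode) {
  const signal = nextEpisode && nextEpisode.prompt_signal;
  // new_task, neutral, or a missing next episode carry no verdict in this sketch.
  return SIGNAL_TO_OUTCOME.get(signal) || 'unknown';
}

console.log(inferOutcome({ prompt_signal: 'correction' }));
console.log(inferOutcome(undefined));
```

Because only the classified signal crosses the episode boundary, the inference stays PII-safe even when the analyzer runs over stored JSONL files.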
Verification commands (used throughout)
- Full tools test suite (run from repo root c:\моя\проекты\портал crm\Документация): node app/node_modules/vitest/vitest.mjs --config app/vitest.config.tools.mjs run
- Single test file (substring filter): append the basename, e.g. node app/node_modules/vitest/vitest.mjs --config app/vitest.config.tools.mjs run observer-routing-detector
- lefthook pre-commit: npx lefthook run pre-commit
File Structure
| File | Responsibility | Action |
|---|---|---|
| tools/observer-transcript-parser.mjs | Deterministic transcript → v2 episode fields | Modify (Tasks 1–2) |
| tools/observer-transcript-parser.test.mjs | Parser unit tests | Modify (Tasks 1–2) |
| tools/observer-routing-detector.mjs | "Was a method dictated?" detector + known-nodes loader | Create (Task 3) |
| tools/observer-routing-detector.test.mjs | Detector unit tests | Create (Task 3) |
| tools/observer-known-nodes.txt | Static list of directable node/skill names | Create (Task 3) |
| tools/observer-stop-hook.mjs | Builds + appends v2 episode; routing-gate; observer_error marker | Modify (Tasks 4–5) |
| tools/observer-stop-hook.test.mjs | Stop-hook unit tests | Modify (Tasks 4–5) |
| tools/observer-coverage-checker.mjs | C5: coverage + registration-integrity controller | Create (Task 6) |
| tools/observer-coverage-checker.test.mjs | C5 unit tests | Create (Task 6) |
| tools/status-md-generator.mjs | STATUS.md dashboard: adds C5 row + observer_error metric | Modify (Task 7) |
| tools/status-md-generator.test.mjs | STATUS.md unit tests | Modify (Task 7) |
| lefthook.yml | Wire C5 as pre-commit job 15 | Modify (Task 8) |
| tools/brain-retro-analyzer.mjs | Layer-4 deterministic aggregation | Create (Task 9) |
| tools/brain-retro-analyzer.test.mjs | Analyzer unit tests | Create (Task 9) |
| .claude/skills/brain-retro/SKILL.md | /brain-retro procedure: uses the analyzer | Modify (Task 10) |
| .claude/skills/brain-retro/references/aggregation-template.md | Retro template: v2 factor matrix | Modify (Task 10) |
| docs/observer/README.md | Observer docs: schema v2, observer_error, routing-tag | Modify (Task 10) |
| docs/adr/ADR-011-brain-governance.md | ADR amendment: observer v2, C5 | Modify (Task 11) |
| docs/Pravila_raboty_Claude_v1_1.md | §16: schema v2, §16.7 routing-tag, §16.8 self-discipline | Modify (Task 11) |
| docs/Plugin_stack_rules_v1.md | R16: schema v2 sync | Modify (Task 11) |
| docs/superpowers/specs/2026-05-19-brain-governance-design.md | Cross-ref to the factor-analysis spec | Modify (Task 11) |
| docs/superpowers/specs/2026-05-19-observer-factor-analysis-design.md | Status draft → accepted | Modify (Task 11) |
| CLAUDE.md | §0 cross-refs + §3.6 note + §9 changelog | Modify via plugin (Task 12) |
Task 1: Parser — environment, task_size, prompt_signal extractors
Spec §3 (environment, task_size, prompt_signal), §4.1. Adds three pure, independently-tested extractor functions and refactors parseLines to also count broken lines (needed for parse_gap in Task 2). parseTranscript is not restructured yet — only its parseLines call site is updated so existing tests stay green.
Files:
- Modify: tools/observer-transcript-parser.mjs
- Test: tools/observer-transcript-parser.test.mjs
- Step 1: Write failing tests for the three extractors + parseLines counts
Append to tools/observer-transcript-parser.test.mjs. First add to the imports at the top:
import {
parseTranscript,
extractEnvironment,
extractTaskSize,
classifyPromptSignal,
} from './observer-transcript-parser.mjs';
Then append this describe block at the end of the file:
describe('extractEnvironment', () => {
it('reads economy_level from the ECONOMY MODE marker', () => {
const entries = [
userPrompt('=== ECONOMY MODE: 0% (пользователь указал явно) ===\nfix it', '2026-05-19T10:00:00Z'),
];
expect(extractEnvironment(entries, 0).economy_level).toBe(0);
});
it('economy_level is null when no marker present', () => {
const entries = [userPrompt('just do it', '2026-05-19T10:00:00Z')];
expect(extractEnvironment(entries, 0).economy_level).toBeNull();
});
it('reads model from an assistant message', () => {
const entries = [
userPrompt('go', '2026-05-19T10:00:00Z'),
{ type: 'assistant', message: { role: 'assistant', model: 'claude-opus-4-7', content: [] }, timestamp: '2026-05-19T10:01:00Z', sessionId: 's1' },
];
expect(extractEnvironment(entries, 0).model).toBe('claude-opus-4-7');
});
it('post_compaction is true when an isCompactSummary entry precedes the turn', () => {
const entries = [
{ type: 'user', isCompactSummary: true, message: { role: 'user', content: 'summary' }, timestamp: '2026-05-19T09:00:00Z' },
userPrompt('the real turn', '2026-05-19T10:00:00Z'),
];
expect(extractEnvironment(entries, 1).post_compaction).toBe(true);
});
it('post_compaction is false with no compaction marker', () => {
const entries = [userPrompt('turn one', '2026-05-19T09:00:00Z'), userPrompt('turn two', '2026-05-19T10:00:00Z')];
expect(extractEnvironment(entries, 1).post_compaction).toBe(false);
});
it('session_turn counts real user prompts up to and including the turn start', () => {
const entries = [
userPrompt('one', '2026-05-19T09:00:00Z'),
userPrompt('two', '2026-05-19T09:30:00Z'),
userPrompt('three', '2026-05-19T10:00:00Z'),
];
expect(extractEnvironment(entries, 2).session_turn).toBe(3);
});
});
describe('extractTaskSize', () => {
it('counts tool calls and unique file paths', () => {
const turn = [
assistantTurn(
[
{ type: 'tool_use', id: 't1', name: 'Read', input: { file_path: '/a.js' } },
{ type: 'tool_use', id: 't2', name: 'Edit', input: { file_path: '/a.js' } },
{ type: 'tool_use', id: 't3', name: 'Write', input: { file_path: '/b.js' } },
{ type: 'tool_use', id: 't4', name: 'Bash', input: {} },
],
'2026-05-19T10:01:00Z'
),
];
const size = extractTaskSize(turn);
expect(size.tool_calls).toBe(4);
expect(size.files_touched).toBe(2);
expect(size.files.sort()).toEqual(['/a.js', '/b.js']);
});
it('returns zeros for an empty turn', () => {
expect(extractTaskSize([])).toEqual({ tool_calls: 0, files_touched: 0, files: [] });
});
});
describe('classifyPromptSignal', () => {
it('detects corrections', () => {
expect(classifyPromptSignal('не то, переделай')).toBe('correction');
expect(classifyPromptSignal('почему ты это сделал')).toBe('correction');
});
it('detects approvals', () => {
expect(classifyPromptSignal('ок, спасибо')).toBe('approval');
expect(classifyPromptSignal('готово, дальше')).toBe('approval');
});
it('detects a new task', () => {
expect(classifyPromptSignal('добавь новую фичу экспорта в CSV')).toBe('new_task');
});
it('falls back to neutral', () => {
expect(classifyPromptSignal('hmm')).toBe('neutral');
});
});
- Step 2: Run the tests to verify they fail
Run: node app/node_modules/vitest/vitest.mjs --config app/vitest.config.tools.mjs run observer-transcript-parser
Expected: FAIL — extractEnvironment is not a function, extractTaskSize is not a function, classifyPromptSignal is not a function.
- Step 3: Refactor parseLines to count broken lines
In tools/observer-transcript-parser.mjs, replace the parseLines function (currently lines 20–32) with:
function parseLines(text) {
const entries = [];
let broken = 0;
let total = 0;
for (const line of String(text || '').split('\n')) {
const trimmed = line.trim();
if (!trimmed) continue;
total += 1;
try {
entries.push(JSON.parse(trimmed));
} catch {
broken += 1; // broken line — counted for parse_gap, never thrown
}
}
return { entries, broken, total };
}
- Step 4: Update the parseLines call site in parseTranscript
In parseTranscript, change the first line of the body from const entries = parseLines(transcriptText); to:
const { entries } = parseLines(transcriptText);
(This keeps parseTranscript working unchanged; the full v2 rewrite happens in Task 2.)
- Step 5: Add the three extractor functions
In tools/observer-transcript-parser.mjs, add these functions after collectToolUse (after line 97). They reuse the existing isRealUserPrompt and classifyTask helpers:
const FILE_TOOLS = new Set(['Read', 'Edit', 'Write', 'MultiEdit', 'NotebookEdit']);
/**
* Deterministic environment factors for the turn that starts at turnStartIdx.
* economy_level / parallel_session are scanned from the stringified turn;
* model / post_compaction / session_turn from structural fields.
*/
export function extractEnvironment(allEntries, turnStartIdx) {
const turn = allEntries.slice(turnStartIdx);
const rawTurn = JSON.stringify(turn);
const econ = rawTurn.match(/=== ECONOMY MODE:\s*(\d+)\s*%/);
const economy_level = econ ? Number(econ[1]) : null;
let model = null;
for (const e of turn) {
if (e && e.message && e.message.model) {
model = e.message.model;
break;
}
}
let post_compaction = false;
for (let i = 0; i < turnStartIdx && i < allEntries.length; i++) {
if (allEntries[i] && allEntries[i].isCompactSummary === true) {
post_compaction = true;
break;
}
}
let session_turn = 0;
for (let i = 0; i <= turnStartIdx && i < allEntries.length; i++) {
if (isRealUserPrompt(allEntries[i])) session_turn += 1;
}
const parallel_session = /параллельн|parallel session|чужой staged|foreign git index/i.test(rawTurn);
return { economy_level, model, post_compaction, session_turn, parallel_session };
}
/** Task size: total tool calls + unique file paths touched (per spec §3, gap-resolution 2). */
export function extractTaskSize(turn) {
let tool_calls = 0;
const files = new Set();
for (const e of turn) {
const content = e && e.message && Array.isArray(e.message.content) ? e.message.content : [];
for (const b of content) {
if (b && b.type === 'tool_use') {
tool_calls += 1;
if (FILE_TOOLS.has(b.name) && b.input) {
const p = b.input.file_path || b.input.notebook_path;
if (p) files.add(String(p));
}
}
}
}
return { tool_calls, files_touched: files.size, files: [...files] };
}
/** Classify the opening user-prompt sentiment (per spec §6 / gap-resolution 1). */
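export function classifyPromptSignal(text) {
  const t = String(text || '').toLowerCase().trim();
  // JS \b is ASCII-only and never fires between a Cyrillic letter and punctuation,
  // so word boundaries are expressed with Unicode-aware lookarounds instead.
  const END = '(?![\\p{L}\\p{N}_])';
  const START = '(?<![\\p{L}\\p{N}_])';
  const correction = new RegExp(
    `не то${END}|не так${END}|переделай|отбой|${START}стоп${END}|почему ты|неверно|не верно|это не `,
    'u'
  );
  const approval = new RegExp(`^(ок|окей|ok|спасибо|супер|отлично|готово|дальше|идеально)${END}`, 'u');
  if (correction.test(t)) return 'correction';
  if (approval.test(t)) return 'approval';
  if (classifyTask(t) !== 'other' && t.length > 15) return 'new_task';
  return 'neutral';
}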
- Step 6: Run the tests to verify they pass
Run: node app/node_modules/vitest/vitest.mjs --config app/vitest.config.tools.mjs run observer-transcript-parser
Expected: PASS — all existing tests + the new extractEnvironment / extractTaskSize / classifyPromptSignal tests green.
- Step 7: Commit
git add tools/observer-transcript-parser.mjs tools/observer-transcript-parser.test.mjs
git commit -m "feat(observer): parser v2 — environment, task_size, prompt_signal extractors"
Task 2: Parser — process events, routing-tag, v2 episode assembly
Spec §3 (schema_version, decision_provenance, events[], outcome default), §4.1, §4.2. Adds process-event extraction and routing-tag parsing, then rewrites parseTranscript to assemble the full v2 episode. Exports extractLastUserPromptText for the Stop-hook routing-gate (Task 5).
Files:
- Modify: tools/observer-transcript-parser.mjs
- Test: tools/observer-transcript-parser.test.mjs
- Step 1: Write failing tests for process events, routing-tag, v2 assembly
In tools/observer-transcript-parser.test.mjs, extend the import to also bring in the new functions:
import {
parseTranscript,
extractEnvironment,
extractTaskSize,
classifyPromptSignal,
extractProcessEvents,
parseRoutingTag,
extractLastUserPromptText,
} from './observer-transcript-parser.mjs';
Change the existing empty-transcript test (currently expect(ep.outcome).toBe('success')) to expect 'unknown':
it('returns safe defaults for an empty transcript', () => {
const ep = parseTranscript('');
expect(ep.task_id).toBeTruthy();
expect(ep.primary_rationale.node_chosen).toBe('direct');
expect(ep.events).toEqual([]);
expect(ep.outcome).toBe('unknown');
expect(ep.schema_version).toBe(2);
});
Append this describe block at the end of the file:
describe('extractProcessEvents', () => {
it('emits a hook_fired summary with per-hook counts and error count', () => {
const turn = [
{ attachment: { type: 'hook_success', hookName: 'PreToolUse:Read' } },
{ attachment: { type: 'hook_success', hookName: 'PreToolUse:Read' } },
{ attachment: { type: 'hook_error', hookName: 'Stop:observer' } },
];
const ev = extractProcessEvents(turn, 0, 0, 0).find((e) => e.kind === 'hook_fired');
expect(ev.counts).toEqual({ 'PreToolUse:Read': 2, 'Stop:observer': 1 });
expect(ev.errors).toBe(1);
});
it('emits an interrupt event for [Request interrupted by user]', () => {
const turn = [
{ message: { role: 'user', content: [{ type: 'text', text: '[Request interrupted by user]' }] } },
];
expect(extractProcessEvents(turn, 0, 0, 0).filter((e) => e.kind === 'interrupt')).toHaveLength(1);
});
it('emits a retry event when an errored tool is used again later', () => {
const turn = [
{ message: { role: 'assistant', content: [{ type: 'tool_use', id: 'u1', name: 'Bash', input: {} }] } },
{ message: { role: 'user', content: [{ type: 'tool_result', tool_use_id: 'u1', is_error: true }] } },
{ message: { role: 'assistant', content: [{ type: 'tool_use', id: 'u2', name: 'Bash', input: {} }] } },
];
expect(extractProcessEvents(turn, 0, 0, 0).filter((e) => e.kind === 'retry')).toHaveLength(1);
});
it('emits a time_burn event when the turn exceeds the threshold', () => {
const ev = extractProcessEvents([], 0, 0, 1000000).find((e) => e.kind === 'time_burn');
expect(ev.duration_ms).toBe(1000000);
});
it('emits a parse_gap event when the broken-line ratio is above threshold', () => {
const ev = extractProcessEvents([], 3, 10, 0).find((e) => e.kind === 'parse_gap');
expect(ev).toEqual({ kind: 'parse_gap', broken: 3, total: 10 });
});
it('emits nothing for a clean empty turn', () => {
expect(extractProcessEvents([], 0, 0, 0)).toEqual([]);
});
});
describe('parseRoutingTag', () => {
it('parses a user_directed_method routing tag from assistant text', () => {
const turn = [
assistantTurn(
[{ type: 'text', text: 'ok\n<!-- routing: provenance=user_directed_method node=discovery-interview counterfactual=brainstorming -->' }],
'2026-05-19T10:01:00Z'
),
];
expect(parseRoutingTag(turn)).toEqual({
kind: 'user_directed_method',
node: 'discovery-interview',
claude_would_have_chosen: 'brainstorming',
});
});
it('returns null when no tag is present', () => {
const turn = [assistantTurn([{ type: 'text', text: 'plain answer' }], '2026-05-19T10:01:00Z')];
expect(parseRoutingTag(turn)).toBeNull();
});
});
describe('parseTranscript — v2 episode', () => {
it('produces schema_version 2 and all v2 fields', () => {
const t = jsonl([
userPrompt('=== ECONOMY MODE: 0% ===\nдобавь фичу', '2026-05-19T10:00:00Z', 'sess-v2'),
assistantTurn([{ type: 'tool_use', id: 't1', name: 'Read', input: { file_path: '/x.js' } }], '2026-05-19T10:01:00Z', 'sess-v2'),
]);
const ep = parseTranscript(t);
expect(ep.schema_version).toBe(2);
expect(ep.task_ref).toBe('sess-v2');
expect(ep.outcome).toBe('unknown');
expect(ep.prompt_signal).toBe('new_task');
expect(ep.decision_provenance).toEqual({ kind: 'autonomous', claude_would_have_chosen: null });
expect(ep.environment.economy_level).toBe(0);
expect(ep.task_size).toEqual({ tool_calls: 1, files_touched: 1, files: ['/x.js'] });
});
it('records decision_provenance from a routing tag', () => {
const t = jsonl([
userPrompt('запусти discovery-interview', '2026-05-19T10:00:00Z', 'sess-tag'),
assistantTurn(
[{ type: 'text', text: '<!-- routing: provenance=user_directed_method node=discovery-interview counterfactual=brainstorming -->' }],
'2026-05-19T10:01:00Z',
'sess-tag'
),
]);
const ep = parseTranscript(t);
expect(ep.decision_provenance.kind).toBe('user_directed_method');
expect(ep.decision_provenance.claude_would_have_chosen).toBe('brainstorming');
});
});
describe('extractLastUserPromptText', () => {
it('returns the text of the last real user prompt', () => {
const t = jsonl([
userPrompt('first turn', '2026-05-19T09:00:00Z'),
userPrompt('second and last', '2026-05-19T10:00:00Z'),
]);
expect(extractLastUserPromptText(t)).toBe('second and last');
});
});
- Step 2: Run the tests to verify they fail
Run: node app/node_modules/vitest/vitest.mjs --config app/vitest.config.tools.mjs run observer-transcript-parser
Expected: FAIL — extractProcessEvents is not a function, parseRoutingTag is not a function, extractLastUserPromptText is not a function, and the v2 episode assertions fail.
- Step 3: Add process-event and routing-tag functions
In tools/observer-transcript-parser.mjs, add after the classifyPromptSignal function from Task 1:
const TIME_BURN_THRESHOLD_MS = 900000; // 15 min — turn wall-clock above this = time_burn
const PARSE_GAP_RATIO = 0.1; // >10% unparseable lines = parse_gap
/** Heuristic retry count: an errored tool whose name is used again later in the turn. */
function detectRetries(turn) {
const idToName = {};
const uses = [];
turn.forEach((entry, idx) => {
const content = entry && entry.message && Array.isArray(entry.message.content) ? entry.message.content : [];
for (const b of content) {
if (b && b.type === 'tool_use') {
idToName[b.id] = b.name;
uses.push({ name: b.name, idx });
}
}
});
const errors = [];
turn.forEach((entry, idx) => {
const content = entry && entry.message && Array.isArray(entry.message.content) ? entry.message.content : [];
for (const b of content) {
if (b && b.type === 'tool_result' && b.is_error === true) {
errors.push({ name: idToName[b.tool_use_id] || null, idx });
}
}
});
let retries = 0;
for (const err of errors) {
if (err.name && uses.some((u) => u.name === err.name && u.idx > err.idx)) retries += 1;
}
return retries;
}
/**
* Process events for the turn: hook_fired (summary), interrupt, retry,
* time_burn, parse_gap. broken/total/durationMs are computed by the caller.
*/
export function extractProcessEvents(turn, broken, total, durationMs) {
const events = [];
const hookCounts = {};
let hookErrors = 0;
for (const e of turn) {
const att = e && e.attachment;
if (att && (att.type === 'hook_success' || att.type === 'hook_error')) {
const name = att.hookName || 'unknown';
hookCounts[name] = (hookCounts[name] || 0) + 1;
if (att.type === 'hook_error') hookErrors += 1;
}
}
if (Object.keys(hookCounts).length > 0) {
events.push({ kind: 'hook_fired', counts: hookCounts, errors: hookErrors });
}
for (const e of turn) {
const content = e && e.message && Array.isArray(e.message.content) ? e.message.content : [];
const isUser = e && e.message && e.message.role === 'user';
if (
isUser &&
content.some((b) => b && b.type === 'text' && String(b.text || '').includes('[Request interrupted by user]'))
) {
events.push({ kind: 'interrupt' });
}
}
const retries = detectRetries(turn);
for (let i = 0; i < retries; i++) events.push({ kind: 'retry' });
if (durationMs > TIME_BURN_THRESHOLD_MS) {
events.push({ kind: 'time_burn', duration_ms: durationMs });
}
if (total > 0 && broken / total > PARSE_GAP_RATIO) {
events.push({ kind: 'parse_gap', broken, total });
}
return events;
}
const ROUTING_TAG_RE =
/<!--\s*routing:\s*provenance=([\w_]+)\s+node=(\S+)\s+counterfactual=(\S+)\s*-->/;
/** Find the routing tag Claude prints when a method was user-directed (spec §4.2). */
export function parseRoutingTag(turn) {
for (const e of turn) {
const content = e && e.message && Array.isArray(e.message.content) ? e.message.content : [];
for (const b of content) {
if (b && b.type === 'text' && typeof b.text === 'string') {
const m = b.text.match(ROUTING_TAG_RE);
if (m) return { kind: m[1], node: m[2], claude_would_have_chosen: m[3] };
}
}
}
return null;
}
/** Text of the last real user prompt — used by the Stop-hook routing-gate (Task 5). */
export function extractLastUserPromptText(transcriptText) {
const { entries } = parseLines(transcriptText);
const start = findTurnStart(entries);
return promptText(entries[start]);
}
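To make the one-line tag format concrete, the sketch below builds a tag and parses it back with the same pattern as ROUTING_TAG_RE. The formatRoutingTag helper is hypothetical and not part of this plan; it is shown only to illustrate which string shape the parser accepts.

```javascript
// Same pattern as ROUTING_TAG_RE in the parser.
const ROUTING_TAG_RE =
  /<!--\s*routing:\s*provenance=([\w_]+)\s+node=(\S+)\s+counterfactual=(\S+)\s*-->/;

// Hypothetical helper (an assumption for illustration): builds the one-line tag.
function formatRoutingTag(node, counterfactual) {
  return `<!-- routing: provenance=user_directed_method node=${node} counterfactual=${counterfactual} -->`;
}

const tag = formatRoutingTag('discovery-interview', 'brainstorming');
const match = tag.match(ROUTING_TAG_RE);
// Capture groups: provenance kind, dictated node, counterfactual node.
console.log(match.slice(1)); // [ 'user_directed_method', 'discovery-interview', 'brainstorming' ]
```

Since node and counterfactual are captured with \S+, values containing whitespace would not parse; hyphenated skill names like those in the known-nodes list are fine.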
- Step 4: Rewrite parseTranscript to assemble the v2 episode
In tools/observer-transcript-parser.mjs, replace the entire parseTranscript function (currently lines 99–148, including its JSDoc) with:
/**
* Parse a transcript JSONL string into an observer episode (schema v2).
* @param {string} transcriptText - Raw JSONL transcript contents.
* @param {string|null} fallbackSessionId - Used when the transcript has no sessionId.
* @returns {object} v2 episode.
*/
export function parseTranscript(transcriptText, fallbackSessionId = null) {
const { entries, broken, total } = parseLines(transcriptText);
const withSession = entries.find((e) => e && e.sessionId);
const sessionId =
(withSession && withSession.sessionId) || fallbackSessionId || `unknown-${Date.now()}`;
const start = findTurnStart(entries);
const turn = entries.slice(start);
const stamps = turn.map((e) => e && e.timestamp).filter(Boolean);
const started_at = stamps[0] || new Date().toISOString();
const ended_at = stamps[stamps.length - 1] || started_at;
const durationMs = new Date(ended_at) - new Date(started_at);
const { skills, counts, errorCount } = collectToolUse(turn);
const events = [];
for (const skill of skills) events.push({ kind: 'skill_invoked', skill });
if (Object.keys(counts).length > 0) events.push({ kind: 'tool_summary', counts });
for (let i = 0; i < errorCount; i++) {
events.push({ kind: 'error', message: 'tool_result reported is_error' });
}
events.push(...extractProcessEvents(turn, broken, total, durationMs));
const usedSuperpowers = skills.some((s) => String(s).startsWith(SUPERPOWERS_PREFIX));
const prompt = promptText(entries[start]);
const tag = parseRoutingTag(turn);
const decision_provenance =
tag && tag.kind === 'user_directed_method'
? { kind: 'user_directed_method', claude_would_have_chosen: tag.claude_would_have_chosen }
: { kind: 'autonomous', claude_would_have_chosen: null };
return {
schema_version: 2,
task_id: sessionId,
task_ref: sessionId,
timestamps: { started_at, ended_at },
path_type: usedSuperpowers ? 'regulated' : 'improvised',
outcome: 'unknown',
prompt_signal: classifyPromptSignal(prompt),
decision_provenance,
environment: extractEnvironment(entries, start),
task_size: extractTaskSize(turn),
primary_rationale: {
step: 1,
node_chosen: skills.length > 0 ? skills[0] : 'direct',
triggers_matched: [],
candidates_considered: [],
boundaries_applied: [],
hard_floor: usedSuperpowers
? { invoked: true, rules: ['Pravila §12'] }
: { invoked: false, rules: [] },
task_classification: classifyTask(prompt),
},
events,
};
}
- Step 5: Run the tests to verify they pass
Run: node app/node_modules/vitest/vitest.mjs --config app/vitest.config.tools.mjs run observer-transcript-parser
Expected: PASS — all existing tests (with the updated outcome: 'unknown' assertion) + the new process-event / routing-tag / v2-assembly / extractLastUserPromptText tests green.
- Step 6: Commit
git add tools/observer-transcript-parser.mjs tools/observer-transcript-parser.test.mjs
git commit -m "feat(observer): parser v2 — process events, routing-tag, episode assembly"
Task 3: Routing-gate method-direction detector
Spec §5.1 step 2. A pure detector — given a user-prompt text and a list of known node names, decide whether the user dictated a specific method. Conservative-broad (favours false-positives, per spec R1).
Files:
- Create: tools/observer-known-nodes.txt
- Create: tools/observer-routing-detector.mjs
- Test: tools/observer-routing-detector.test.mjs
- Step 1: Create the known-nodes data file
Create tools/observer-known-nodes.txt:
# Known router nodes — directive targets for the observer routing-gate.
# One node/skill name per line. Lines starting with # and blank lines are ignored.
# Extend this list when a new directable skill/command is added to the brain.
#
# superpowers skills
brainstorming
writing-plans
executing-plans
subagent-driven-development
test-driven-development
systematic-debugging
verification-before-completion
requesting-code-review
using-git-worktrees
finishing-a-development-branch
writing-skills
root-cause-tracing
condition-based-waiting
defense-in-depth
# project skills
discovery-interview
brain-retro
audit-portal
regression
process-modeling
process-analysis
ccpm
# plugins / commands
claude-md-management
security-review
- Step 2: Write the failing test
Create tools/observer-routing-detector.test.mjs:
import { describe, it, expect } from 'vitest';
import { detectMethodDirected, loadKnownNodes } from './observer-routing-detector.mjs';
const NODES = ['brainstorming', 'discovery-interview', 'systematic-debugging'];
describe('detectMethodDirected', () => {
it('detects a directive verb followed by a node name', () => {
expect(detectMethodDirected('запусти discovery-interview по этой фиче', NODES)).toEqual({
directed: true,
node: 'discovery-interview',
});
});
it('detects "используй X"', () => {
expect(detectMethodDirected('используй systematic-debugging здесь', NODES).directed).toBe(true);
});
it('detects a /slash-command form', () => {
expect(detectMethodDirected('сделай это через /brainstorming', NODES)).toEqual({
directed: true,
node: 'brainstorming',
});
});
it('does NOT flag a bare node mention without a directive verb', () => {
expect(detectMethodDirected('почему ты выбрал brainstorming, а не план?', NODES).directed).toBe(false);
});
it('does NOT flag a prompt with no node reference', () => {
expect(detectMethodDirected('добавь колонку Город в таблицу', NODES).directed).toBe(false);
});
it('is empty-input safe', () => {
expect(detectMethodDirected('', NODES).directed).toBe(false);
expect(detectMethodDirected(null, []).directed).toBe(false);
});
});
describe('loadKnownNodes', () => {
it('loads names, skips comments and blank lines', () => {
const nodes = loadKnownNodes('tools/observer-known-nodes.txt');
expect(nodes).toContain('brainstorming');
expect(nodes).toContain('discovery-interview');
expect(nodes.every((n) => !n.startsWith('#') && n.length > 0)).toBe(true);
});
it('returns an empty array for a missing file', () => {
expect(loadKnownNodes('tools/does-not-exist.txt')).toEqual([]);
});
});
- Step 3: Run the test to verify it fails
Run: node app/node_modules/vitest/vitest.mjs --config app/vitest.config.tools.mjs run observer-routing-detector
Expected: FAIL — Failed to load ./observer-routing-detector.mjs.
- Step 4: Create the detector module
Create tools/observer-routing-detector.mjs:
#!/usr/bin/env node
/**
* Routing-gate method-direction detector (brain governance, observer
* factor-analysis spec §5.1). Pure — given a user-prompt text and a list of
* known node names, decides whether the user *dictated* a specific method.
* Conservative-broad: a directive verb within a 40-char window before a node
* name, or a /slash-command form.
*
* Security Guidance #40: pure string ops — no exec/execSync.
*/
import { readFileSync, existsSync } from 'fs';
const KNOWN_NODES_PATH = 'tools/observer-known-nodes.txt';
const DIRECTIVE_VERBS = [
'запусти', 'запускай', 'используй', 'вызови', 'вызывай', 'прогони',
'применяй', 'применить', 'через', 'run', 'use', 'invoke', 'via',
];
/** Load the directable node names from the data file (# comments / blanks skipped). */
export function loadKnownNodes(path = KNOWN_NODES_PATH) {
if (!existsSync(path)) return [];
const out = [];
for (const line of readFileSync(path, 'utf-8').split('\n')) {
const t = line.trim();
if (!t || t.startsWith('#')) continue;
out.push(t);
}
return out;
}
/**
* @returns {{directed: boolean, node: string|null}}
*/
export function detectMethodDirected(promptText, knownNodes) {
  const text = String(promptText || '').toLowerCase();
  for (const node of knownNodes || []) {
    const n = String(node).toLowerCase();
    if (!n) continue;
    // A /slash-command form counts as a directive on its own.
    if (text.includes('/' + n)) return { directed: true, node };
    // Check every occurrence, not only the first: a later mention may carry the directive verb.
    for (let idx = text.indexOf(n); idx !== -1; idx = text.indexOf(n, idx + 1)) {
      const before = text.slice(Math.max(0, idx - 40), idx);
      if (DIRECTIVE_VERBS.some((v) => before.includes(v))) return { directed: true, node };
    }
  }
  return { directed: false, node: null };
}
if (process.argv[1] && process.argv[1].replace(/\\/g, '/').endsWith('/observer-routing-detector.mjs')) {
const det = detectMethodDirected(process.argv.slice(2).join(' '), loadKnownNodes());
console.log(JSON.stringify(det));
process.exit(0);
}
- Step 5: Run the test to verify it passes
Run: node app/node_modules/vitest/vitest.mjs --config app/vitest.config.tools.mjs run observer-routing-detector
Expected: PASS — all 8 tests green.
- Step 6: Commit
git add tools/observer-known-nodes.txt tools/observer-routing-detector.mjs tools/observer-routing-detector.test.mjs
git commit -m "feat(observer): routing-gate method-direction detector"
Task 4: Stop-hook — v2 episode + observer_error marker
Spec §3 (observer_error marker), §5.2 (visibility of failure). Updates appendEpisode to validate the v2 schema and accept the minimal observer_error marker; updates buildEpisodeFromContext to produce v2 episodes on the fallback path.
Files:
- Modify: tools/observer-stop-hook.mjs
- Test: tools/observer-stop-hook.test.mjs
- Step 1: Rewrite the test file fixtures + write failing tests
Replace the entire contents of tools/observer-stop-hook.test.mjs with:
import { describe, it, expect, beforeEach, afterEach } from 'vitest';
import { writeFileSync, readFileSync, existsSync, mkdtempSync, rmSync, mkdirSync } from 'fs';
import { join } from 'path';
import { tmpdir } from 'os';
import { appendEpisode, buildEpisodeFromContext, buildObserverError } from './observer-stop-hook.mjs';
let workdir;
beforeEach(() => {
workdir = mkdtempSync(join(tmpdir(), 'observer-test-'));
mkdirSync(join(workdir, 'docs', 'observer'), { recursive: true });
});
afterEach(() => {
rmSync(workdir, { recursive: true, force: true });
});
const defaultRat = () => ({
step: 1,
node_chosen: '#1',
triggers_matched: [],
candidates_considered: [],
boundaries_applied: [],
hard_floor: { invoked: false, rules: [] },
task_classification: 'other',
});
// Full schema-v2 episode fixture.
const v2Episode = (overrides = {}) => ({
schema_version: 2,
task_id: 'abc-123',
task_ref: 'abc-123',
timestamps: { started_at: '2026-05-19T10:00:00+03:00', ended_at: '2026-05-19T10:05:00+03:00' },
path_type: 'regulated',
outcome: 'unknown',
prompt_signal: 'neutral',
decision_provenance: { kind: 'autonomous', claude_would_have_chosen: null },
environment: { economy_level: 0, model: 'claude-opus-4-7', post_compaction: false, session_turn: 1, parallel_session: false },
task_size: { tool_calls: 0, files_touched: 0, files: [] },
primary_rationale: defaultRat(),
events: [],
...overrides,
});
describe('appendEpisode', () => {
it('appends one JSONL line to the monthly file', () => {
appendEpisode(v2Episode(), workdir, '2026-05');
const content = readFileSync(join(workdir, 'docs', 'observer', 'episodes-2026-05.jsonl'), 'utf-8');
expect(content).toContain('"task_id":"abc-123"');
expect(content).toContain('"schema_version":2');
expect(content.endsWith('\n')).toBe(true);
});
it('appends to an existing file without overwrite', () => {
appendEpisode(v2Episode({ task_id: 'a' }), workdir, '2026-05');
appendEpisode(v2Episode({ task_id: 'b', outcome: 'partial' }), workdir, '2026-05');
const lines = readFileSync(join(workdir, 'docs', 'observer', 'episodes-2026-05.jsonl'), 'utf-8').trim().split('\n');
expect(lines).toHaveLength(2);
expect(JSON.parse(lines[0]).task_id).toBe('a');
expect(JSON.parse(lines[1]).task_id).toBe('b');
});
it('applies the PII filter before write (including events[])', () => {
appendEpisode(
v2Episode({ events: [{ kind: 'error', message: 'call +79991234567 / mail x@y.com' }] }),
workdir,
'2026-05'
);
const content = readFileSync(join(workdir, 'docs', 'observer', 'episodes-2026-05.jsonl'), 'utf-8');
expect(content).toContain('+7XXXXXXXXXX');
expect(content).toContain('***@***');
expect(content).not.toContain('79991234567');
});
it('throws on a missing required field', () => {
expect(() => appendEpisode({}, workdir, '2026-05')).toThrow(/required/i);
});
it('throws on a missing schema-v2 field', () => {
const ep = v2Episode();
delete ep.decision_provenance;
expect(() => appendEpisode(ep, workdir, '2026-05')).toThrow(/schema v2 field missing/i);
});
it('throws when schema_version is not 2', () => {
expect(() => appendEpisode(v2Episode({ schema_version: 1 }), workdir, '2026-05')).toThrow(/schema_version/i);
});
it('throws when a primary_rationale sub-field is missing', () => {
expect(() =>
appendEpisode(v2Episode({ primary_rationale: { step: 1, node_chosen: '#1' } }), workdir, '2026-05')
).toThrow(/primary_rationale field missing/i);
});
it('accepts a minimal observer_error marker', () => {
appendEpisode(
{
schema_version: 2,
observer_error: true,
error_message: 'parser blew up',
timestamps: { started_at: '2026-05-19T10:00:00Z', ended_at: '2026-05-19T10:00:00Z' },
task_id: 'err-1',
},
workdir,
'2026-05'
);
const line = JSON.parse(readFileSync(join(workdir, 'docs', 'observer', 'episodes-2026-05.jsonl'), 'utf-8').trim());
expect(line.observer_error).toBe(true);
expect(line.error_message).toBe('parser blew up');
});
it('throws when an observer_error marker is missing a field', () => {
expect(() =>
appendEpisode({ schema_version: 2, observer_error: true, task_id: 'x' }, workdir, '2026-05')
).toThrow(/observer_error marker field missing/i);
});
});
describe('buildEpisodeFromContext', () => {
it('builds a v2 episode on the fallback path (no transcript)', () => {
const ep = buildEpisodeFromContext({ session_id: 'sess-1', result: 'success' });
expect(ep.schema_version).toBe(2);
expect(ep.task_id).toBe('sess-1');
expect(ep.task_ref).toBe('sess-1');
expect(ep.outcome).toBe('success');
expect(ep.decision_provenance).toEqual({ kind: 'autonomous', claude_would_have_chosen: null });
expect(ep.environment).toEqual({
economy_level: null,
model: null,
post_compaction: false,
session_turn: 0,
parallel_session: false,
});
expect(ep.task_size).toEqual({ tool_calls: 0, files_touched: 0, files: [] });
});
it('defaults outcome to unknown when none supplied', () => {
expect(buildEpisodeFromContext({ session_id: 'x' }).outcome).toBe('unknown');
});
it('derives a v2 episode from transcriptText when provided', () => {
const transcript = [
JSON.stringify({ type: 'user', message: { role: 'user', content: 'fix the bug' }, timestamp: '2026-05-19T10:00:00Z', sessionId: 'sess-t' }),
JSON.stringify({ type: 'assistant', message: { role: 'assistant', content: [{ type: 'tool_use', id: 't1', name: 'Skill', input: { skill: 'superpowers:systematic-debugging' } }] }, timestamp: '2026-05-19T10:01:00Z', sessionId: 'sess-t' }),
].join('\n');
const ep = buildEpisodeFromContext({ session_id: 'sess-t' }, transcript);
expect(ep.schema_version).toBe(2);
expect(ep.task_id).toBe('sess-t');
expect(ep.primary_rationale.node_chosen).toBe('superpowers:systematic-debugging');
});
});
describe('buildObserverError', () => {
it('produces a minimal valid observer_error marker', () => {
const marker = buildObserverError({ session_id: 'sess-e' }, new Error('boom'));
expect(marker.observer_error).toBe(true);
expect(marker.schema_version).toBe(2);
expect(marker.task_id).toBe('sess-e');
expect(marker.error_message).toContain('boom');
expect(marker.timestamps.started_at).toBeTruthy();
});
});
- Step 2: Run the tests to verify they fail
Run: node app/node_modules/vitest/vitest.mjs --config app/vitest.config.tools.mjs run observer-stop-hook
Expected: FAIL — buildObserverError is not exported, v2-validation tests fail (current appendEpisode does not validate v2 fields), buildEpisodeFromContext lacks v2 fields.
- Step 3: Update appendEpisode with v2 validation + observer_error branch
In tools/observer-stop-hook.mjs, replace the REQUIRED_FIELDS constant (currently line 22) with these three constants:
const REQUIRED_FIELDS = ['task_id', 'timestamps', 'path_type', 'outcome', 'primary_rationale'];
const V2_FIELDS = ['schema_version', 'decision_provenance', 'environment', 'task_size', 'task_ref'];
const OBSERVER_ERROR_FIELDS = ['schema_version', 'error_message', 'timestamps', 'task_id'];
(Leave the existing RATIONALE_FIELDS constant and validateRationale function unchanged.)
Then replace the appendEpisode function and its JSDoc (currently lines 42–68) with:
/**
* Append a single episode to the monthly JSONL file.
* Validates either a full schema-v2 episode or a minimal observer_error marker.
* @param {object} episode - The episode object.
* @param {string} baseDir - Repository root (default: process.cwd()).
* @param {string} month - YYYY-MM string for the file name (default: current UTC month).
*/
export function appendEpisode(episode, baseDir = process.cwd(), month = currentMonth()) {
const dir = join(baseDir, 'docs', 'observer');
if (!existsSync(dir)) {
mkdirSync(dir, { recursive: true });
}
const file = join(dir, `episodes-${month}.jsonl`);
if (episode && episode.observer_error === true) {
for (const f of OBSERVER_ERROR_FIELDS) {
if (episode[f] === undefined) {
throw new Error(`observer_error marker field missing: ${f}`);
}
}
appendFileSync(file, JSON.stringify(sanitize(episode)) + '\n', 'utf-8');
return;
}
for (const f of REQUIRED_FIELDS) {
if (episode[f] === undefined) {
throw new Error(`required field missing: ${f}`);
}
}
for (const f of V2_FIELDS) {
if (episode[f] === undefined) {
throw new Error(`schema v2 field missing: ${f}`);
}
}
if (episode.schema_version !== 2) {
throw new Error(`schema_version must be 2 (got ${episode.schema_version})`);
}
validateRationale(episode.primary_rationale);
appendFileSync(file, JSON.stringify(sanitize(episode)) + '\n', 'utf-8');
}
- Step 4: Update buildEpisodeFromContext to produce v2 + add buildObserverError
In tools/observer-stop-hook.mjs, replace the buildEpisodeFromContext function and its JSDoc (currently lines 70–103) with:
/**
* Build a well-formed schema-v2 episode from a Claude Code Stop-event context.
* Preferred path: when `transcriptText` is supplied, the episode is derived
* from the real session transcript via parseTranscript. Fallback path: v2
* defaults from `ctx` (an explicit ctx.primary_rationale is preserved verbatim).
* @param {object} ctx - Raw context from stdin (may be partial).
* @param {string|null} transcriptText - Raw transcript JSONL, if readable.
* @returns {object} v2 episode.
*/
export function buildEpisodeFromContext(ctx = {}, transcriptText = null) {
if (transcriptText) {
return parseTranscript(transcriptText, ctx.session_id || ctx.sessionId || ctx.task_id);
}
const sid = ctx.session_id || ctx.sessionId || ctx.task_id || `unknown-${Date.now()}`;
const now = new Date().toISOString();
return {
schema_version: 2,
task_id: sid,
task_ref: sid,
timestamps: {
started_at: ctx.started || ctx.started_at || now,
ended_at: ctx.ended || ctx.ended_at || now,
},
path_type: ctx.path_type || 'regulated',
outcome: ctx.result || ctx.outcome || 'unknown',
prompt_signal: ctx.prompt_signal || 'neutral',
decision_provenance: ctx.decision_provenance || { kind: 'autonomous', claude_would_have_chosen: null },
environment: ctx.environment || {
economy_level: null,
model: null,
post_compaction: false,
session_turn: 0,
parallel_session: false,
},
task_size: ctx.task_size || { tool_calls: 0, files_touched: 0, files: [] },
primary_rationale: ctx.primary_rationale || {
step: 1,
node_chosen: ctx.node_chosen || ctx.skill_id || 'unknown',
triggers_matched: [],
candidates_considered: [],
boundaries_applied: [],
hard_floor: ctx.hard_floor || { invoked: false, rules: [] },
task_classification: ctx.task_classification || 'other',
},
events: ctx.events || [],
};
}
/**
* Build a minimal observer_error marker — written instead of a silent skip
* when the Stop-hook fails internally (spec §3 / §5.2).
*/
export function buildObserverError(ctx = {}, err) {
const now = new Date().toISOString();
return {
schema_version: 2,
observer_error: true,
error_message: String((err && err.message) || err),
timestamps: { started_at: now, ended_at: now },
task_id: ctx.session_id || ctx.sessionId || ctx.task_id || `unknown-${Date.now()}`,
};
}
- Step 5: Run the tests to verify they pass
Run: node app/node_modules/vitest/vitest.mjs --config app/vitest.config.tools.mjs run observer-stop-hook
Expected: PASS — all appendEpisode / buildEpisodeFromContext / buildObserverError tests green.
- Step 6: Commit
git add tools/observer-stop-hook.mjs tools/observer-stop-hook.test.mjs
git commit -m "feat(observer): Stop-hook v2 episode + observer_error marker"
Task 5: Stop-hook — routing-gate enforcement
Spec §5.1 (3a routing-gate). Adds the pure routingGateDecision function and wires it + the observer_error fallback into the CLI block. The gate blocks at most once per turn (stop_hook_active guard prevents an infinite loop).
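The once-per-turn behavior can be illustrated in isolation. The sketch below is a simplified stand-in, not the plan's routingGateDecision: gateSketch and its boolean inputs methodDirected and hasRoutingTag are hypothetical replacements for the detector call and the episode lookup. It assumes the follow-up Stop-event of a blocked turn arrives with stop_hook_active set to true, which is what the guard relies on.

```javascript
// Simplified stand-in for the gate. `methodDirected` and `hasRoutingTag` are
// hypothetical booleans replacing the detector call and the episode lookup.
function gateSketch({ methodDirected, hasRoutingTag, stopHookActive }) {
  if (stopHookActive) return { block: false }; // loop guard: never block twice
  if (!methodDirected) return { block: false }; // nothing was dictated
  if (hasRoutingTag) return { block: false }; // the tag was already printed
  return { block: true };
}

// First Stop-event: a method was dictated and the tag is missing, so the gate blocks.
const first = gateSketch({ methodDirected: true, hasRoutingTag: false, stopHookActive: false });

// Second Stop-event of the same turn: stop_hook_active is set, so the turn is
// allowed to end even if the tag is still missing.
const second = gateSketch({ methodDirected: true, hasRoutingTag: false, stopHookActive: true });

console.log(first.block, second.block); // true false
```

Checking the guard first means a missing tag costs at most one extra round trip, at the price of letting an untagged turn through on the second pass.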
Files:
- Modify: tools/observer-stop-hook.mjs
- Test: tools/observer-stop-hook.test.mjs
- Step 1: Write failing tests for routingGateDecision
In tools/observer-stop-hook.test.mjs, extend the import line to add routingGateDecision:
import { appendEpisode, buildEpisodeFromContext, buildObserverError, routingGateDecision } from './observer-stop-hook.mjs';
Append this describe block at the end of the file:
describe('routingGateDecision', () => {
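const NODES = ['discovery-interview', 'brainstorming'];
// The prompts below are Russian: 'запусти discovery-interview' means "run discovery-interview";
// 'добавь колонку Город' means "add a City column" (no method is named).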
const autonomousEp = v2Episode();
const taggedEp = v2Episode({ decision_provenance: { kind: 'user_directed_method', claude_would_have_chosen: 'brainstorming' } });
it('blocks when a method was directed but no routing tag is present', () => {
const gate = routingGateDecision(autonomousEp, 'запусти discovery-interview', NODES, false);
expect(gate.block).toBe(true);
expect(gate.reason).toContain('discovery-interview');
});
it('does not block when the routing tag is present', () => {
const gate = routingGateDecision(taggedEp, 'запусти discovery-interview', NODES, false);
expect(gate.block).toBe(false);
});
it('does not block when no method was directed', () => {
const gate = routingGateDecision(autonomousEp, 'добавь колонку Город', NODES, false);
expect(gate.block).toBe(false);
});
it('does not block when stop_hook_active is true (loop guard)', () => {
const gate = routingGateDecision(autonomousEp, 'запусти discovery-interview', NODES, true);
expect(gate.block).toBe(false);
});
});
- Step 2: Run the tests to verify they fail
Run: node app/node_modules/vitest/vitest.mjs --config app/vitest.config.tools.mjs run observer-stop-hook
Expected: FAIL — routingGateDecision is not exported.
- Step 3: Add the imports and routingGateDecision function
In tools/observer-stop-hook.mjs, replace the existing import { parseTranscript } from './observer-transcript-parser.mjs'; line at the top with the first line below (it now also imports extractLastUserPromptText), and add the second line directly after it:
import { parseTranscript, extractLastUserPromptText } from './observer-transcript-parser.mjs';
import { detectMethodDirected, loadKnownNodes } from './observer-routing-detector.mjs';
(Do not keep the old single-name import alongside the new one: a duplicate parseTranscript binding is a SyntaxError in an ES module.)
Add the routingGateDecision function after buildObserverError:
/**
* Routing-gate decision (spec §5.1, 3a). Pure — the CLI calls this.
* Blocks the Stop-event (decision: block) when the user dictated a method
* but the turn carries no routing tag. Skipped when stop_hook_active is true
* (the gate fires at most once per turn — no infinite loop).
* @returns {{block: boolean, reason: string|null}}
*/
export function routingGateDecision(episode, promptText, knownNodes, stopHookActive) {
if (stopHookActive) return { block: false, reason: null };
const det = detectMethodDirected(promptText, knownNodes);
if (!det.directed) return { block: false, reason: null };
if (episode && episode.decision_provenance && episode.decision_provenance.kind === 'user_directed_method') {
return { block: false, reason: null };
}
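return {
block: true,
// The reason text is Russian. English gloss: "It looks like the user dictated the
// method (node "<node>"), but this turn has no routing tag. Add exactly one line to
// your reply: <!-- routing: ... counterfactual=<the node you would have chosen autonomously> -->".
reason:
`[observer routing-gate] Похоже, метод навязан пользователем (узел "${det.node}"), ` +
`но routing-тег в этом ходе отсутствует. Добавь в свой ответ ровно одну строку:\n` +
`<!-- routing: provenance=user_directed_method node=${det.node} ` +
`counterfactual=<узел, который ты выбрал бы автономно> -->`,
};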
}
- Step 4: Rewrite the CLI block to wire the gate + observer_error fallback
In tools/observer-stop-hook.mjs, replace the entire CLI block (currently lines 110–142, from if (process.argv[1] && ... to the closing }) with:
// CLI entry point: read JSON context from stdin (Claude Code Stop-event hook contract)
if (process.argv[1] && process.argv[1].replace(/\\/g, '/').endsWith('/observer-stop-hook.mjs')) {
const chunks = [];
process.stdin.on('data', (c) => chunks.push(c));
process.stdin.on('end', () => {
let ctx = {};
try {
const raw = Buffer.concat(chunks).toString('utf-8');
if (raw.trim()) ctx = JSON.parse(raw);
} catch (_e) {
// best-effort: build a minimal episode even if stdin is malformed
}
// Claude Code's Stop-event supplies transcript_path — the real source of
// session data. Read it best-effort; fall back to ctx-only on any error.
let transcriptText = null;
const tp = ctx.transcript_path || ctx.transcriptPath;
if (tp) {
try {
if (existsSync(tp)) transcriptText = readFileSync(tp, 'utf-8');
} catch (_e) {
transcriptText = null;
}
}
try {
const ep = buildEpisodeFromContext(ctx, transcriptText);
// Always write the episode first — exit-0-safe (spec §5.1 step 1).
appendEpisode(ep);
// Then the routing-gate (spec §5.1 steps 2-4).
if (transcriptText) {
const promptText = extractLastUserPromptText(transcriptText);
const gate = routingGateDecision(ep, promptText, loadKnownNodes(), ctx.stop_hook_active === true);
if (gate.block) {
process.stdout.write(JSON.stringify({ decision: 'block', reason: gate.reason }));
process.exit(0);
}
}
process.exit(0);
} catch (err) {
// Visible failure (spec §5.2): write an observer_error marker, never a silent skip.
try {
appendEpisode(buildObserverError(ctx, err));
} catch (_e2) {
// last-resort: even the marker failed — do not crash the Stop-event
}
console.error(`[observer-stop-hook] error: ${err.message}`);
process.exit(0); // never block the Stop-event on an internal error
}
});
}
- Step 5: Run the tests to verify they pass
Run: node app/node_modules/vitest/vitest.mjs --config app/vitest.config.tools.mjs run observer-stop-hook
Expected: PASS — all tests including the 4 routingGateDecision tests green.
- Step 6: Manual CLI smoke against the real transcript
In PowerShell, from the repo root, run (replace the path with the current session's real transcript JSONL under C:\Users\Administrator\.claude\projects\...):
'{"session_id":"smoke","transcript_path":"C:/Users/Administrator/.claude/projects/c---------------------crm-------------/553717ec-bf55-43dc-8b9c-b9812711023a.jsonl"}' | node tools/observer-stop-hook.mjs
Expected: the command exits 0; the last line of docs/observer/episodes-2026-05.jsonl is a populated v2 episode ("schema_version":2, a real task_id, non-empty environment / task_size). Then revert that smoke line so it does not pollute the evidence log:
$f = 'docs/observer/episodes-2026-05.jsonl'
$lines = Get-Content $f
Set-Content $f -Value ($lines[0..($lines.Count - 2)]) -Encoding utf8
(If episodes-2026-05.jsonl had only the smoke line, delete the file instead.)
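On machines without PowerShell, the same revert can be done in Node. The sketch below shows only the text transformation. dropLastRecord is a hypothetical helper, not part of the plan, and it assumes one JSON record per line, as appendEpisode writes them.

```javascript
// Hypothetical helper: returns JSONL text without its final record.
// Returns '' when the text held zero or one record; in that case the
// file should be deleted rather than rewritten.
function dropLastRecord(text) {
  const lines = text.split('\n').filter((l) => l.trim() !== '');
  lines.pop(); // discard the smoke record
  return lines.length ? lines.join('\n') + '\n' : '';
}

const before = '{"task_id":"real"}\n{"task_id":"smoke"}\n';
console.log(dropLastRecord(before) === '{"task_id":"real"}\n'); // true
```

To apply it to the log, read the file with readFileSync(f, 'utf-8'), pass the text through dropLastRecord, then either write the result back with writeFileSync or remove the file when the result is empty. Node writes UTF-8 without a byte-order mark, so the first record stays parseable.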
- Step 7: Commit
git add tools/observer-stop-hook.mjs tools/observer-stop-hook.test.mjs
git commit -m "feat(observer): Stop-hook routing-gate enforcement"
Task 6: C5 — observer-coverage-checker
Spec §5.2 (3b: coverage control + registration integrity). A warn-only controller (always exits 0) that flags observer coverage gaps and broken registration. Its findings are surfaced in STATUS.md by Task 7; it never blocks a commit (spec §5.2 says "флаг" ("flag"), not "блокирует" ("blocks"); the unbypassable enforcement is the routing-gate of Task 5).
Files:
- Create: tools/observer-coverage-checker.mjs
- Test: tools/observer-coverage-checker.test.mjs
- Step 1: Write the failing test
Create tools/observer-coverage-checker.test.mjs:
import { describe, it, expect } from 'vitest';
import { checkCoverage, checkRegistration } from './observer-coverage-checker.mjs';
describe('checkCoverage', () => {
it('flags recent commits but zero episodes', () => {
const r = checkCoverage(0, 7);
expect(r.ok).toBe(false);
expect(r.detail).toContain('0 observer episodes');
});
it('is ok when episodes exist', () => {
expect(checkCoverage(5, 7).ok).toBe(true);
});
it('is ok when there is no recent git activity', () => {
expect(checkCoverage(0, 0).ok).toBe(true);
});
});
describe('checkRegistration', () => {
const goodSettings = {
hooks: { Stop: [{ hooks: [{ type: 'command', command: 'node tools/observer-stop-hook.mjs' }] }] },
};
it('is ok when the Stop-hook is registered and post-commit exists', () => {
const r = checkRegistration(goodSettings, true);
expect(r.ok).toBe(true);
});
it('flags a missing Stop-hook registration', () => {
const r = checkRegistration({ hooks: { Stop: [] } }, true);
expect(r.ok).toBe(false);
expect(r.detail).toContain('observer-stop-hook NOT registered');
});
it('flags a missing post-commit hook', () => {
const r = checkRegistration(goodSettings, false);
expect(r.ok).toBe(false);
expect(r.detail).toContain('post-commit');
});
it('handles an empty settings object', () => {
expect(checkRegistration({}, false).ok).toBe(false);
});
});
- Step 2: Run the test to verify it fails
Run: node app/node_modules/vitest/vitest.mjs --config app/vitest.config.tools.mjs run observer-coverage-checker
Expected: FAIL — Failed to load ./observer-coverage-checker.mjs.
- Step 3: Create the controller module
Create tools/observer-coverage-checker.mjs:
#!/usr/bin/env node
/**
* C5 observer-coverage-checker (brain governance, observer factor-analysis
* spec §5.2). Warn-only — always exits 0. Two checks:
* 1. Coverage — recent git commits but 0 observer episodes this month.
* 2. Registration integrity — observer Stop-hook present in
* .claude/settings.json and .git/hooks/post-commit installed.
* Findings are surfaced in docs/observer/STATUS.md (C4 generator); this
* controller never blocks a commit.
*
* Security Guidance #40: git is invoked via execFileSync (argument array,
* no shell) — no exec/execSync.
*/
import { readFileSync, existsSync } from 'fs';
import { join } from 'path';
import { execFileSync } from 'child_process';
const RECENT_WINDOW = '14 days ago';
/** @returns {{ok: boolean, detail: string}} */
export function checkCoverage(episodeCount, recentCommitCount) {
if (recentCommitCount > 0 && episodeCount === 0) {
return {
ok: false,
detail: `${recentCommitCount} commit(s) in the last 2 weeks but 0 observer episodes this month`,
};
}
return { ok: true, detail: `${episodeCount} episode(s), ${recentCommitCount} recent commit(s)` };
}
/** @returns {{ok: boolean, detail: string}} */
export function checkRegistration(settingsJson, postCommitExists) {
const problems = [];
const stopHooks = (((settingsJson || {}).hooks || {}).Stop) || [];
const hasObserverStop = stopHooks.some((entry) =>
((entry && entry.hooks) || []).some((h) => String((h && h.command) || '').includes('observer-stop-hook'))
);
if (!hasObserverStop) {
problems.push('observer-stop-hook NOT registered in .claude/settings.json Stop hook');
}
if (!postCommitExists) {
problems.push('.git/hooks/post-commit not installed (run: npx lefthook install --force)');
}
return {
ok: problems.length === 0,
detail: problems.length ? problems.join('; ') : 'Stop-hook + post-commit OK',
};
}
function countEpisodes(root) {
const month = new Date().toISOString().slice(0, 7);
const file = join(root, 'docs', 'observer', `episodes-${month}.jsonl`);
if (!existsSync(file)) return 0;
return readFileSync(file, 'utf-8').trim().split('\n').filter(Boolean).length;
}
function countRecentCommits(root) {
try {
const out = execFileSync('git', ['log', `--since=${RECENT_WINDOW}`, '--oneline'], {
cwd: root,
encoding: 'utf-8',
stdio: ['ignore', 'pipe', 'ignore'],
});
return out.trim() ? out.trim().split('\n').length : 0;
} catch {
return 0;
}
}
export function runCoverageChecker(root = process.cwd()) {
const coverage = checkCoverage(countEpisodes(root), countRecentCommits(root));
let settings = {};
try {
settings = JSON.parse(readFileSync(join(root, '.claude', 'settings.json'), 'utf-8'));
} catch {
settings = {};
}
const registration = checkRegistration(settings, existsSync(join(root, '.git', 'hooks', 'post-commit')));
return { coverage, registration };
}
if (process.argv[1] && process.argv[1].replace(/\\/g, '/').endsWith('/observer-coverage-checker.mjs')) {
const { coverage, registration } = runCoverageChecker();
if (!coverage.ok) console.warn(`[observer-coverage-checker] WARN — coverage: ${coverage.detail}`);
if (!registration.ok) console.warn(`[observer-coverage-checker] WARN — registration: ${registration.detail}`);
if (coverage.ok && registration.ok) {
console.log(`[observer-coverage-checker] OK — ${coverage.detail}; ${registration.detail}`);
}
process.exit(0); // warn-only — never blocks a commit
}
- Step 4: Run the test to verify it passes
Run: node app/node_modules/vitest/vitest.mjs --config app/vitest.config.tools.mjs run observer-coverage-checker
Expected: PASS — all 7 tests green.
- Step 5: Commit
git add tools/observer-coverage-checker.mjs tools/observer-coverage-checker.test.mjs
git commit -m "feat(observer): coverage + registration-integrity controller (C5)"
Task 7: STATUS.md generator — C5 row + observer_error metric
Spec §5.2 (visibility surfaced in STATUS.md). Adds a C5 row to the dashboard table and an observer_error count to the metrics block.
Files:
- Modify: tools/status-md-generator.mjs
- Test: tools/status-md-generator.test.mjs
- Step 1: Update the test file
Replace the entire contents of tools/status-md-generator.test.mjs with:
import { describe, it, expect } from 'vitest';
import { renderStatus } from './status-md-generator.mjs';
const baseInputs = (overrides = {}) => ({
now: '2026-05-19T10:00:00+03:00',
c1: { status: 'ok', detail: 'no drift' },
c2: { status: 'ok', detail: '0 version drift' },
c3: { status: 'ok', detail: 'last read today' },
c5: { status: 'ok', detail: 'coverage OK · registration OK' },
observer: { episodeCount: 12, observerErrors: 0, piiMatches: 0 },
...overrides,
});
describe('renderStatus', () => {
it('renders all 5 controllers + metrics', () => {
const md = renderStatus(baseInputs());
expect(md).toContain('# Brain Status');
expect(md).toContain('| C1 L1-watcher | ✅');
expect(md).toContain('| C2 Cross-ref consistency | ✅');
expect(md).toContain('| C3 Observer-of-observer | ✅');
expect(md).toContain('| C4 Сигнальный статус | ✅');
expect(md).toContain('| C5 Observer-coverage | ✅');
expect(md).toContain('12 episodes');
});
it('shows a warn status for the coverage controller', () => {
const md = renderStatus(baseInputs({ c5: { status: 'warn', detail: '3 commits, 0 episodes' } }));
expect(md).toContain('| C5 Observer-coverage | ⚠️');
});
it('shows the observer_error count in the metrics block', () => {
const md = renderStatus(baseInputs({ observer: { episodeCount: 4, observerErrors: 2, piiMatches: 0 } }));
expect(md).toContain('2 observer_error markers');
});
it('shows a red status for failing controllers', () => {
const md = renderStatus(baseInputs({ c1: { status: 'fail', detail: '2 plugins not formalized' } }));
expect(md).toContain('| C1 L1-watcher | 🔴');
});
it('mentions the capability-readiness behavioral rule', () => {
const md = renderStatus(baseInputs());
expect(md).toContain('capability-readiness');
expect(md).toContain('feedback_brain_unused_tools_not_problem');
});
});
- Step 2: Run the test to verify it fails
Run: node app/node_modules/vitest/vitest.mjs --config app/vitest.config.tools.mjs run status-md-generator
Expected: FAIL — the C5 row and observer_error markers text are not yet rendered.
- Step 3: Update renderStatus to render C5 + observer_error
In tools/status-md-generator.mjs, replace the renderStatus function (currently lines 10–32) with:
export function renderStatus(inputs) {
const { now, c1, c2, c3, c5, observer } = inputs;
return `# Brain Status (auto-generated)
Last updated: ${now}
| Контролёр | Состояние | Детали |
|---|---|---|
| C1 L1-watcher | ${iconFor(c1.status)} | ${c1.detail || '—'} |
| C2 Cross-ref consistency | ${iconFor(c2.status)} | ${c2.detail || '—'} |
| C3 Observer-of-observer | ${iconFor(c3.status)} | ${c3.detail || '—'} |
| C4 Сигнальный статус | ✅ | This file (self-reference) |
| C5 Observer-coverage | ${iconFor(c5.status)} | ${c5.detail || '—'} |
## Метрики (информационные, не алерты)
- Observer evidence: ${observer.episodeCount} episodes this month, ${observer.observerErrors} observer_error markers, ${observer.piiMatches} PII matches before filter
- Использование узлов: см. \`/brain-retro\` (раз в спринт). **Неиспользованные узлы — не проблема** (capability-readiness; см. memory \`feedback_brain_unused_tools_not_problem\` — outside-repo memory store).
## Алерт-индикаторы
✅ — норма ・ ⚠️ — внимание ・ 🔴 — действие требуется ・ ⚪ — не запускалось
`;
}
- Step 4: Update the CLI block to compute C5 + observer_error count
In tools/status-md-generator.mjs, add the import at the top (after the existing import { execFileSync } line):
import { runCoverageChecker } from './observer-coverage-checker.mjs';
Add a countObserverErrors function after the existing countEpisodes function:
function countObserverErrors() {
const dir = 'docs/observer';
if (!existsSync(dir)) return 0;
const month = new Date().toISOString().slice(0, 7);
const file = join(dir, `episodes-${month}.jsonl`);
if (!existsSync(file)) return 0;
return readFileSync(file, 'utf-8')
.trim()
.split('\n')
.filter((l) => l.includes('"observer_error":true')).length;
}
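The substring match above depends on the compact serialization that JSON.stringify produces in appendEpisode. If that ever feels too brittle, a variant that parses each line is one alternative. This is a sketch only: countObserverErrorsIn is a hypothetical name, it takes the file text rather than a path, and it counts only records whose top-level observer_error is true.

```javascript
// Hypothetical variant: parse each JSONL line instead of substring-matching.
// Lines that fail to parse are skipped rather than counted.
function countObserverErrorsIn(text) {
  let count = 0;
  for (const line of text.split('\n')) {
    if (!line.trim()) continue; // ignore blank lines
    try {
      const rec = JSON.parse(line);
      if (rec && rec.observer_error === true) count += 1;
    } catch {
      // malformed line: ignore it
    }
  }
  return count;
}

const sample = [
  '{"schema_version":2,"observer_error":true,"task_id":"e1"}',
  '{"schema_version":2,"task_id":"s1","events":[{"observer_error":true}]}',
  'not json',
].join('\n');
console.log(countObserverErrorsIn(sample)); // 1
```

The second sample line shows the difference: a nested observer_error key would satisfy the substring test but is not a marker record.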
Replace the CLI block (currently lines 52–63, from if (process.argv[1] && ...) with:
if (process.argv[1] && process.argv[1].replace(/\\/g, '/').endsWith('/status-md-generator.mjs')) {
const cov = runCoverageChecker();
const c5ok = cov.coverage.ok && cov.registration.ok;
const inputs = {
now: new Date().toISOString(),
c1: runControllerNode(['tools/l1-watcher.mjs']),
c2: runControllerNode(['tools/cross-ref-checker.mjs']),
c3: runControllerNode(['tools/observer-of-observer.mjs', 'check']),
c5: {
status: c5ok ? 'ok' : 'warn',
detail: [cov.coverage.detail, cov.registration.detail].join(' · '),
},
observer: {
episodeCount: countEpisodes(),
observerErrors: countObserverErrors(),
piiMatches: 0,
},
};
const md = renderStatus(inputs);
writeFileSync('docs/observer/STATUS.md', md);
console.log(`[status-md-generator] OK — wrote docs/observer/STATUS.md`);
}
- Step 5: Run the test to verify it passes
Run: node app/node_modules/vitest/vitest.mjs --config app/vitest.config.tools.mjs run status-md-generator
Expected: PASS — all 5 tests green.
- Step 6: Commit
git add tools/status-md-generator.mjs tools/status-md-generator.test.mjs
git commit -m "feat(observer): STATUS.md — C5 row + observer_error metric"
Task 8: Wire C5 into lefthook
Spec §5.2. Adds the C5 observer-coverage-checker as pre-commit job 15. Warn-only — the script self-guarantees exit 0, so the job never blocks a commit.
Files:
- Modify: lefthook.yml
- Step 1: Add job 15
In lefthook.yml, inside pre-commit.jobs, add this entry directly after the existing job 13 (observer-of-observer) and before the # Post-commit: comment line:
# 15. observer-coverage-checker — brain governance C5 (observer factor-
# analysis spec §5.2). Warn-only (script always exits 0). Flags observer
# coverage gaps (git activity but 0 episodes) + registration-integrity
# breaks (Stop-hook missing from settings.json, post-commit not installed).
# Findings surface in docs/observer/STATUS.md C5 row — never blocks a commit.
- name: observer-coverage-checker
run: node tools/observer-coverage-checker.mjs
fail_text: |
observer-coverage-checker reports a gap (coverage or registration).
See docs/observer/STATUS.md C5 row for details.
(Job numbering note: the C4 status-md generator is post-commit job 14; the new pre-commit job is numbered 15 to keep brain-governance jobs 11–15 contiguous.)
- Step 2: Verify lefthook accepts the config and the job runs
Run: npx lefthook run pre-commit
Expected: lefthook lists observer-coverage-checker among the jobs; it prints either [observer-coverage-checker] OK — ... or a WARN line, and the overall pre-commit run does not fail on this job (exit 0 from the script).
- Step 3: Commit
git add lefthook.yml
git commit -m "chore(observer): wire C5 coverage-checker into lefthook (job 15)"
Task 9: brain-retro analyzer
Spec §6 (Layer 4). A pure, deterministic aggregation module — outcome inference, episode-double-write dedup, episode→task grouping, causal-chain candidates, factor matrix. Read-only — never writes JSONL. The /brain-retro skill (Task 10) calls its CLI.
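The dedup step exists because the Stop-hook writes the episode before the routing-gate decides, so a blocked turn is written twice with the same task_id and started_at. One way to collapse such pairs is sketched below. The body is an assumption, not the plan's implementation: dedupeEpisodesSketch is a hypothetical name, and appending the observer_error markers after the episodes is an arbitrary ordering choice.

```javascript
// Sketch only: keeps the last episode per (task_id, started_at) pair and
// passes every observer_error marker through untouched.
function dedupeEpisodesSketch(episodes) {
  const markers = [];
  const latest = new Map(); // key: task_id + started_at
  for (const e of episodes) {
    if (e.observer_error === true) {
      markers.push(e); // markers are never merged
      continue;
    }
    const key = `${e.task_id}|${e.timestamps.started_at}`;
    latest.set(key, e); // a later write replaces an earlier one
  }
  return [...latest.values(), ...markers];
}

const firstWrite = { task_id: 's1', timestamps: { started_at: '2026-05-19T10:00:00Z' }, outcome: 'unknown' };
const secondWrite = { ...firstWrite, outcome: 'partial' };
const errorMarker = { observer_error: true, task_id: 'e' };
const out = dedupeEpisodesSketch([firstWrite, errorMarker, secondWrite]);
console.log(out.length, out[0].outcome); // 2 partial
```

Keeping the last write rather than the first matches the dedup test in Step 1, where the later record of a pair is the one expected to survive.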
Files:
- Create: tools/brain-retro-analyzer.mjs
- Test: tools/brain-retro-analyzer.test.mjs
- Step 1: Write the failing test
Create tools/brain-retro-analyzer.test.mjs:
import { describe, it, expect } from 'vitest';
import {
  dedupeEpisodes,
  inferOutcome,
  groupEpisodesToTasks,
  findCausalChains,
  buildFactorMatrix,
  analyze,
} from './brain-retro-analyzer.mjs';

// Minimal v2 episode for tests.
const ep = (overrides = {}) => ({
  schema_version: 2,
  task_id: 's1',
  task_ref: 's1',
  timestamps: { started_at: '2026-05-19T10:00:00Z', ended_at: '2026-05-19T10:05:00Z' },
  path_type: 'regulated',
  outcome: 'unknown',
  prompt_signal: 'neutral',
  decision_provenance: { kind: 'autonomous', claude_would_have_chosen: null },
  environment: { economy_level: 0, model: 'claude-opus-4-7', post_compaction: false, session_turn: 1, parallel_session: false },
  task_size: { tool_calls: 5, files_touched: 1, files: ['/a.js'] },
  primary_rationale: { step: 1, node_chosen: 'direct', triggers_matched: [], candidates_considered: [], boundaries_applied: [], hard_floor: { invoked: false, rules: [] }, task_classification: 'feature' },
  events: [],
  ...overrides,
});

describe('dedupeEpisodes', () => {
  it('keeps the last of two episodes with the same task_id + started_at', () => {
    const a = ep({ outcome: 'unknown' });
    const b = ep({ outcome: 'partial' }); // same task_id + started_at: the routing-gate double-write
    const out = dedupeEpisodes([a, b]);
    expect(out).toHaveLength(1);
    expect(out[0].outcome).toBe('partial');
  });

  it('keeps all observer_error markers', () => {
    const out = dedupeEpisodes([ep(), { observer_error: true, task_id: 'e' }, { observer_error: true, task_id: 'e2' }]);
    expect(out.filter((e) => e.observer_error)).toHaveLength(2);
  });
});

describe('inferOutcome', () => {
  it('infers rework when the next episode opens with a correction', () => {
    expect(inferOutcome(ep(), ep({ prompt_signal: 'correction' }))).toBe('rework');
  });

  it('infers success when the next episode opens with approval', () => {
    expect(inferOutcome(ep(), ep({ prompt_signal: 'approval' }))).toBe('success');
  });

  it('infers partial when the episode has an interrupt event', () => {
    expect(inferOutcome(ep({ events: [{ kind: 'interrupt' }] }), ep())).toBe('partial');
  });

  it('infers unknown when there is no next episode', () => {
    expect(inferOutcome(ep(), null)).toBe('unknown');
  });
});

describe('groupEpisodesToTasks', () => {
  it('starts a new task after a success and on a new_task prompt', () => {
    const eps = [
      ep({ timestamps: { started_at: '2026-05-19T10:00:00Z', ended_at: '2026-05-19T10:01:00Z' }, prompt_signal: 'new_task' }),
      ep({ timestamps: { started_at: '2026-05-19T10:02:00Z', ended_at: '2026-05-19T10:03:00Z' }, prompt_signal: 'approval' }),
      ep({ timestamps: { started_at: '2026-05-19T10:04:00Z', ended_at: '2026-05-19T10:05:00Z' }, prompt_signal: 'new_task' }),
    ];
    const tasks = groupEpisodesToTasks(eps);
    // First episode opens task 1; the approval marks the previous episode a
    // success (task 2); the final new_task prompt opens task 3.
    expect(tasks).toHaveLength(3);
  });
});

describe('findCausalChains', () => {
  it('links an errored episode to a later episode that shares a file', () => {
    const a = ep({ timestamps: { started_at: '2026-05-19T10:00:00Z', ended_at: '2026-05-19T10:01:00Z' }, events: [{ kind: 'error', message: 'x' }], task_size: { tool_calls: 1, files_touched: 1, files: ['/shared.js'] } });
    const b = ep({ timestamps: { started_at: '2026-05-19T10:02:00Z', ended_at: '2026-05-19T10:03:00Z' }, task_size: { tool_calls: 1, files_touched: 1, files: ['/shared.js'] } });
    const chains = findCausalChains([a, b]);
    expect(chains).toHaveLength(1);
    expect(chains[0].sharedFiles).toEqual(['/shared.js']);
  });

  it('returns no chain when no files are shared', () => {
    const a = ep({ events: [{ kind: 'error', message: 'x' }], task_size: { tool_calls: 1, files_touched: 1, files: ['/a.js'] } });
    const b = ep({ timestamps: { started_at: '2026-05-19T10:02:00Z', ended_at: '2026-05-19T10:03:00Z' }, task_size: { tool_calls: 1, files_touched: 1, files: ['/b.js'] } });
    expect(findCausalChains([a, b])).toHaveLength(0);
  });
});

describe('buildFactorMatrix', () => {
  it('tabulates outcome distribution per factor value', () => {
    const eps = [
      { ...ep(), _inferredOutcome: 'rework', decision_provenance: { kind: 'user_directed_method' } },
      { ...ep(), _inferredOutcome: 'success', decision_provenance: { kind: 'autonomous' } },
    ];
    const m = buildFactorMatrix(eps);
    expect(m.decision_provenance.user_directed_method.rework).toBe(1);
    expect(m.decision_provenance.autonomous.success).toBe(1);
  });
});

describe('analyze', () => {
  it('returns episodeCount, tasks, causalChains and factorMatrix', () => {
    const result = analyze([ep(), ep({ timestamps: { started_at: '2026-05-19T11:00:00Z', ended_at: '2026-05-19T11:01:00Z' }, prompt_signal: 'correction' })]);
    expect(result.episodeCount).toBe(2);
    expect(result.factorMatrix).toBeDefined();
    expect(Array.isArray(result.tasks)).toBe(true);
    expect(Array.isArray(result.causalChains)).toBe(true);
  });
});
- Step 2: Run the test to verify it fails
Run: node app/node_modules/vitest/vitest.mjs --config app/vitest.config.tools.mjs run brain-retro-analyzer
Expected: FAIL — Failed to load ./brain-retro-analyzer.mjs.
- Step 3: Create the analyzer module
Create tools/brain-retro-analyzer.mjs:
#!/usr/bin/env node
/**
 * Brain-retro analyzer (brain governance, observer factor-analysis spec §6).
 * Pure, deterministic Layer-4 aggregation over observer episodes for the
 * /brain-retro skill. Read-only: never writes JSONL. No LLM.
 *
 * Security Guidance #40: pure parsing, no exec/execSync.
 */
import { readFileSync, existsSync } from 'fs';

// task_size bucket thresholds, in tool calls: <20 small, 20..60 medium, >60 large.
const SIZE_SMALL = 20;
const SIZE_LARGE = 60;

/**
 * Deduplicate the routing-gate double-write: a turn that was blocked then
 * re-stopped yields two episodes with the same task_id + started_at. Keep the
 * last (most complete). observer_error markers are all kept.
 */
export function dedupeEpisodes(episodes) {
  const errors = episodes.filter((e) => e && e.observer_error);
  const normal = episodes.filter((e) => e && !e.observer_error);
  const byKey = new Map();
  for (const e of normal) {
    // A later write with the same key overwrites the earlier one.
    byKey.set(`${e.task_id}|${(e.timestamps || {}).started_at}`, e);
  }
  return [...byKey.values(), ...errors];
}

/** Infer the true outcome of an episode from the next episode's opening prompt. */
export function inferOutcome(episode, nextEpisode) {
  // An interrupt inside the episode wins over whatever the next prompt says.
  if (episode && Array.isArray(episode.events) && episode.events.some((e) => e.kind === 'interrupt')) {
    return 'partial';
  }
  if (!nextEpisode) return 'unknown';
  if (nextEpisode.prompt_signal === 'correction') return 'rework';
  if (nextEpisode.prompt_signal === 'approval' || nextEpisode.prompt_signal === 'new_task') return 'success';
  return 'unknown';
}

/** Group non-error episodes by session (task_id), each group sorted by started_at. */
function bySessionSorted(episodes) {
  const map = new Map();
  for (const e of episodes) {
    if (e.observer_error) continue;
    const sid = e.task_id || 'unknown';
    if (!map.has(sid)) map.set(sid, []);
    map.get(sid).push(e);
  }
  for (const eps of map.values()) {
    eps.sort((a, b) =>
      String((a.timestamps || {}).started_at).localeCompare(String((b.timestamps || {}).started_at))
    );
  }
  return map;
}

/** Group episodes into tasks: a new task starts after a success or on a new_task prompt. */
export function groupEpisodesToTasks(episodes) {
  const tasks = [];
  for (const [sid, eps] of bySessionSorted(episodes)) {
    let current = null;
    eps.forEach((episode, i) => {
      const prev = eps[i - 1];
      const prevOutcome = prev ? inferOutcome(prev, episode) : null;
      const isNewTask = i === 0 || prevOutcome === 'success' || episode.prompt_signal === 'new_task';
      if (isNewTask) {
        current = { task_ref: `${sid}#${tasks.length + 1}`, episodes: [] };
        tasks.push(current);
      }
      current.episodes.push(episode);
    });
  }
  return tasks;
}

/** Causal-chain candidates: an errored episode → a later episode sharing a file. */
export function findCausalChains(episodes) {
  const sorted = episodes
    .filter((e) => !e.observer_error)
    .slice()
    .sort((a, b) =>
      String((a.timestamps || {}).started_at).localeCompare(String((b.timestamps || {}).started_at))
    );
  const chains = [];
  for (let i = 0; i < sorted.length - 1; i++) {
    const a = sorted[i];
    const hasError = Array.isArray(a.events) && a.events.some((e) => e.kind === 'error');
    if (!hasError) continue;
    const filesA = new Set(((a.task_size || {}).files) || []);
    if (filesA.size === 0) continue;
    for (let j = i + 1; j < sorted.length; j++) {
      const b = sorted[j];
      const shared = (((b.task_size || {}).files) || []).filter((f) => filesA.has(f));
      if (shared.length > 0) {
        chains.push({
          from: `${a.task_id}|${(a.timestamps || {}).started_at}`,
          to: `${b.task_id}|${(b.timestamps || {}).started_at}`,
          sharedFiles: shared,
        });
        break; // only the first later episode that touches a shared file
      }
    }
  }
  return chains;
}

function sizeBucket(toolCalls) {
  const n = Number(toolCalls) || 0;
  return n < SIZE_SMALL ? 'small' : n <= SIZE_LARGE ? 'medium' : 'large';
}

// One extractor per factor; each returns the row key for an episode.
const FACTOR_FNS = {
  decision_provenance: (e) => (e.decision_provenance || {}).kind || 'unknown',
  economy_level: (e) => String((e.environment || {}).economy_level ?? 'null'),
  model: (e) => (e.environment || {}).model || 'null',
  post_compaction: (e) => String((e.environment || {}).post_compaction ?? false),
  task_size: (e) => sizeBucket((e.task_size || {}).tool_calls),
  node_chosen: (e) => (e.primary_rationale || {}).node_chosen || 'direct',
  task_classification: (e) => (e.primary_rationale || {}).task_classification || 'other',
};

/** Factor matrix: rows = factor values, columns = outcome distribution (spec §6). */
export function buildFactorMatrix(episodesWithOutcome) {
  const matrix = {};
  for (const [fname, fn] of Object.entries(FACTOR_FNS)) {
    matrix[fname] = {};
    for (const e of episodesWithOutcome) {
      const val = fn(e);
      const outcome = e._inferredOutcome || 'unknown';
      matrix[fname][val] = matrix[fname][val] || {};
      matrix[fname][val][outcome] = (matrix[fname][val][outcome] || 0) + 1;
    }
  }
  return matrix;
}

/** Full deterministic aggregation: dedup → infer outcomes → group → chains → matrix. */
export function analyze(episodes) {
  const deduped = dedupeEpisodes(episodes);
  // Shallow-copy each episode so annotating _inferredOutcome below never
  // mutates the caller's objects (keeps the module pure).
  const normal = deduped.filter((e) => !e.observer_error).map((e) => ({ ...e }));
  for (const eps of bySessionSorted(normal).values()) {
    eps.forEach((episode, i) => {
      episode._inferredOutcome = inferOutcome(episode, eps[i + 1]);
    });
  }
  return {
    episodeCount: normal.length,
    observerErrorCount: deduped.length - normal.length,
    tasks: groupEpisodesToTasks(normal),
    causalChains: findCausalChains(normal),
    factorMatrix: buildFactorMatrix(normal),
  };
}

/** Read JSONL files; missing files and unparsable lines are skipped. */
function loadEpisodes(files) {
  const eps = [];
  for (const f of files) {
    if (!existsSync(f)) continue;
    for (const line of readFileSync(f, 'utf-8').split('\n')) {
      const t = line.trim();
      if (!t) continue;
      try {
        eps.push(JSON.parse(t));
      } catch {
        // skip broken line
      }
    }
  }
  return eps;
}

// CLI entry point: runs only when this file is executed directly.
if (process.argv[1] && process.argv[1].replace(/\\/g, '/').endsWith('/brain-retro-analyzer.mjs')) {
  const result = analyze(loadEpisodes(process.argv.slice(2)));
  console.log(JSON.stringify(result, null, 2));
  // Set exitCode instead of calling process.exit(): exit() can truncate
  // piped stdout that has not been flushed yet.
  process.exitCode = 0;
}
- Step 4: Run the test to verify it passes
Run: node app/node_modules/vitest/vitest.mjs --config app/vitest.config.tools.mjs run brain-retro-analyzer
Expected: PASS — all tests across the 6 describe blocks green.
- Step 5: Commit
git add tools/brain-retro-analyzer.mjs tools/brain-retro-analyzer.test.mjs
git commit -m "feat(observer): brain-retro analyzer — outcome inference + factor matrix"
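The CLI's loadEpisodes helper is not exported, so the test file in Step 1 does not exercise it. As a minimal sketch of the same line-tolerant JSONL parsing, the version below works on an in-memory string and also counts the lines it skips. The name parseJsonlLines and the skipped counter are assumptions for illustration, not part of the planned module.

```javascript
// Hypothetical helper (not in the plan): same tolerant parsing as the CLI's
// loadEpisodes, but over a string so it can be checked without touching disk.
function parseJsonlLines(text) {
  const episodes = [];
  let skipped = 0;
  for (const line of text.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed) continue; // blank lines are not errors
    try {
      episodes.push(JSON.parse(trimmed));
    } catch {
      skipped += 1; // broken line: counted here, silently dropped by the CLI
    }
  }
  return { episodes, skipped };
}

// Illustrative input: two valid lines, one blank, one truncated.
const sample = [
  '{"task_id":"s1","prompt_signal":"neutral"}',
  '',
  '{"task_id":"s1", broken',
  '{"observer_error":true,"task_id":"e"}',
].join('\n');

const parsed = parseJsonlLines(sample);
console.log(parsed.episodes.length, parsed.skipped); // 2 1
```

A truncated line does not abort the run, which matters because the episode files are append-only and a crash mid-write can leave a partial last line.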
Task 10: brain-retro skill + aggregation template + observer README
Spec §6 (Layer 4 wiring). Updates the /brain-retro skill procedure to run the analyzer, refreshes the aggregation template for the v2 factor matrix, and documents schema v2 / observer_error / routing-tag in the observer README. No code — Markdown only.
Files:
- Modify: .claude/skills/brain-retro/SKILL.md
- Modify: .claude/skills/brain-retro/references/aggregation-template.md
- Modify: docs/observer/README.md
- Step 1: Update the /brain-retro skill procedure
In .claude/skills/brain-retro/SKILL.md, replace step 5 of the ## Procedure section (currently 5. **Aggregate** per references/aggregation-template.md — includes Factor analysis matrix (v1.1+) on 5 axes.) with:
5. **Run the deterministic analyzer**: `node tools/brain-retro-analyzer.mjs docs/observer/episodes-YYYY-MM.jsonl` (pass every monthly file in the period). It returns JSON with `episodeCount`, `observerErrorCount`, `tasks` (episodes grouped into tasks), `causalChains` (error→fix candidates) and `factorMatrix` (outcome distribution per factor). The analyzer deduplicates the routing-gate double-write and infers the true `outcome` of each episode from the next episode's `prompt_signal` — never trust the stored `outcome` (it is `unknown` at write time).
6. **Aggregate** per `references/aggregation-template.md` — fill the Factor analysis matrix from the analyzer's `factorMatrix`, the task groups from `tasks`, the causal-chain candidates from `causalChains`.
Then renumber the subsequent steps: the old step 6 (Propose candidates) becomes step 7, old step 7 (Save retro note) becomes step 8, old step 8 (Report to user) becomes step 9.
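To illustrate how the analyzer's JSON might be consumed in the new step 6, here is a small sketch that turns one factor of factorMatrix into a rework share per factor value. The helper name reworkShare, the choice to exclude unknown outcomes from the denominator, and the sample counts are all assumptions made for this example.

```javascript
// Hypothetical helper: share of rework among episodes with a known outcome,
// for each value of one factor taken from the analyzer's factorMatrix.
function reworkShare(factorRow) {
  const shares = {};
  for (const [value, counts] of Object.entries(factorRow)) {
    const known = (counts.success || 0) + (counts.partial || 0) + (counts.rework || 0);
    // Assumption: 'unknown' outcomes carry no signal, so they are left out.
    shares[value] = known === 0 ? null : (counts.rework || 0) / known;
  }
  return shares;
}

// Illustrative counts only, shaped like factorMatrix.decision_provenance.
const decisionProvenance = {
  autonomous: { success: 6, rework: 2, unknown: 3 },
  user_directed_method: { success: 1, rework: 3 },
};

const shares = reworkShare(decisionProvenance);
console.log(shares); // { autonomous: 0.25, user_directed_method: 0.75 }
```

With real data the counts are usually small, so the raw counts belong next to any ratio in the retro note.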
- Step 2: Update the aggregation template for the v2 factor matrix
In .claude/skills/brain-retro/references/aggregation-template.md, replace the ## Factor analysis matrix (v1.1+ ...) section (the heading at line 30 down to and including the ### Cross-tab: factor × factor block, ending at line 74) with:
## Factor analysis matrix (v2 — from `tools/brain-retro-analyzer.mjs`)
Outcome distribution per factor value. Source: the analyzer's `factorMatrix`.
Outcome is the *inferred* outcome (next-prompt sentiment), not the stored
`unknown`. The factor `decision_provenance` directly answers the owner's
question — "is the rework mine or the router's?"
For each factor below, render a table: factor value × outcome counts
(`success` / `partial` / `rework` / `unknown`).
### decision_provenance (autonomous vs user_directed_method)
| provenance | success | partial | rework | unknown |
|---|---|---|---|---|
### economy_level
| economy_level | success | partial | rework | unknown |
|---|---|---|---|---|
### model · post_compaction · task_size bucket
(one table each — same columns)
### node_chosen · task_classification
(one table each — same columns)
## Episodes → tasks (from analyzer `tasks`)
| task_ref | episodes | turns that are rework |
|---|---|---|
## Causal-chain candidates (from analyzer `causalChains`)
| from (errored episode) | to (later episode) | shared files |
|---|---|---|
## Observer health
- `observerErrorCount` from the analyzer — observer_error markers in the period.
Non-zero = the observer failed silently somewhere; investigate.
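The per-factor tables in the template can be filled mechanically from factorMatrix. The sketch below renders one factor as a Markdown table with the template's four outcome columns. renderFactorTable is a hypothetical helper, not something the plan or the skill defines.

```javascript
const OUTCOMES = ['success', 'partial', 'rework', 'unknown'];

// Hypothetical helper: render one factor of the analyzer's factorMatrix in the
// table shape used by the aggregation template (factor value x outcome counts).
function renderFactorTable(label, factorRow) {
  const header = `| ${label} | ${OUTCOMES.join(' | ')} |`;
  const divider = `|${'---|'.repeat(OUTCOMES.length + 1)}`;
  const rows = Object.keys(factorRow)
    .sort() // stable row order keeps successive retro notes easy to diff
    .map((value) => {
      const counts = OUTCOMES.map((outcome) => factorRow[value][outcome] || 0);
      return `| ${value} | ${counts.join(' | ')} |`;
    });
  return [header, divider, ...rows].join('\n');
}

// Illustrative input matching the buildFactorMatrix test fixture.
const table = renderFactorTable('provenance', {
  user_directed_method: { rework: 1 },
  autonomous: { success: 1 },
});
console.log(table);
```

Missing outcomes are rendered as 0 so every row has the same number of cells.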
- Step 3: Update the observer README
In docs/observer/README.md, replace the ## Files section's first bullet (the episodes-YYYY-MM.jsonl line) with:
- `episodes-YYYY-MM.jsonl` — append-only JSONL, one line per Stop-event. Schema **v2** (`schema_version: 2`): the 5 mandatory fields + `decision_provenance` (who chose the node), `environment` (economy_level / model / post_compaction / session_turn / parallel_session), `task_size`, `task_ref`, `prompt_signal`, and an `outcome` that is `unknown` at write time (refined by `/brain-retro`). On an internal hook failure a minimal `observer_error` marker line is written instead of a silent skip. Written by `tools/observer-stop-hook.mjs` via `tools/observer-transcript-parser.mjs`.
Then add a new section after the ## Lifecycle section:
## Routing-tag discipline
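When the user dictates a specific method/node (e.g. «запусти discovery-interview», "run discovery-interview"), Claude must emit one line in its response, the routing-tag HTML comment defined in Pravila §16.7:
`<!-- routing: provenance=user_directed_method node=<chosen node> counterfactual=<node Claude would have chosen autonomously> -->`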
The Stop-hook routing-gate (`tools/observer-routing-detector.mjs` + `routingGateDecision`) detects a dictated method; if the tag is missing it returns `decision: block`, so the turn cannot end without the tag. The gate fires at most once per turn (`stop_hook_active` guard). This makes `decision_provenance` reliable — factor analysis can separate a router error from a user-dictated one.
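The gate's decision order can be sketched as a pure function. This is only an illustration of the three rules stated in this section (once-per-turn guard, dictated method, missing tag). The function name sketchRoutingGate, its parameter shape, and the tag regex are assumptions; the real routingGateDecision may take different inputs.

```javascript
// Assumed tag shape, based on the routing-tag format in Pravila §16.7.
const ROUTING_TAG = /<!--\s*routing:\s*provenance=user_directed_method\b[^>]*-->/;

// Hypothetical signature, for illustration only.
function sketchRoutingGate({ methodDictated, responseText, stopHookActive }) {
  if (stopHookActive) return null; // already blocked once this turn: never loop
  if (!methodDictated) return null; // autonomous turn: no tag required
  if (ROUTING_TAG.test(responseText)) return null; // tag present: let the turn end
  return {
    decision: 'block',
    reason: 'A method was dictated but the routing tag is missing; emit the tag.',
  };
}

const tagged =
  'Done. <!-- routing: provenance=user_directed_method node=discovery-interview counterfactual=direct -->';

const blocked = sketchRoutingGate({ methodDictated: true, responseText: 'Done.', stopHookActive: false });
const allowed = sketchRoutingGate({ methodDictated: true, responseText: tagged, stopHookActive: false });
const guarded = sketchRoutingGate({ methodDictated: true, responseText: 'Done.', stopHookActive: true });
console.log(blocked.decision, allowed, guarded); // block null null
```

Checking the stop_hook_active guard first is what bounds the gate to one block per turn: the second Stop always passes, tag or no tag.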
- Step 4: Verify the Markdown lints
Run: npx markdownlint-cli2 ".claude/skills/brain-retro/SKILL.md" ".claude/skills/brain-retro/references/aggregation-template.md" "docs/observer/README.md"
Expected: 0 errors (the PostToolUse hook auto-fixes most issues on write; this confirms).
- Step 5: Commit
git add .claude/skills/brain-retro/SKILL.md .claude/skills/brain-retro/references/aggregation-template.md docs/observer/README.md
git commit -m "docs(observer): brain-retro skill + README for schema v2"
Task 11: Normative sync — ADR-011, Pravila §16, PSR_v1 R16, spec cross-refs
Spec §7. Amends ADR-011, extends Pravila §16, syncs PSR_v1 R16, cross-links the brain-governance spec, and flips this spec's status. Pre-flight sync is mandatory before any edit (Pravila §15.2).
Files:
- Modify: docs/adr/ADR-011-brain-governance.md
- Modify: docs/Pravila_raboty_Claude_v1_1.md
- Modify: docs/Plugin_stack_rules_v1.md
- Modify: docs/superpowers/specs/2026-05-19-brain-governance-design.md
- Modify: docs/superpowers/specs/2026-05-19-observer-factor-analysis-design.md
- Step 1: Pre-flight sync (Pravila §15.2)
Run: git fetch origin && git log HEAD..origin/main --oneline
Expected: empty output (the local branch is up to date). If the output is non-empty, origin/main has moved: integrate it (git rebase origin/main, or merge) before editing any normative file, since Pravila, PSR_v1 and the ADR are on the 8-file sync list.
- Step 2: Amend ADR-011
In docs/adr/ADR-011-brain-governance.md:
(a) Replace the ## Status body (Accepted (2026-05-19).) with:
Accepted (2026-05-19). **Amended 2026-05-19** — observer factor-analysis extension: episode schema v2, two-sided enforcement (routing-gate + C5 controller). See Decision §5.
(b) In the ## Decision section, change the ### 3. heading from ### 3. 4 mechanical controllers (first wave) to ### 3. 5 mechanical controllers, and add a 5th bullet after the C4 bullet:
- **C5 Observer-coverage-checker** — lefthook warn-only job. Flags observer coverage gaps (git activity but 0 episodes) and registration-integrity breaks (Stop-hook missing from `settings.json`, `post-commit` not installed). Surfaced in STATUS.md.
Change the line All 4 are mechanical (regex/diff/JSON math). to All 5 are mechanical (regex/diff/JSON math).
(c) Add a new Decision subsection after ### 4. Behavioral rule «unused ≠ problem»:
### 5. Observer factor-analysis extension (v2)
The observer episode is extended to `schema_version: 2` so a real factor analysis becomes possible: `decision_provenance` (autonomous vs user-dictated method, with a counterfactual), `environment` factors, `task_size`, `prompt_signal`, and an honest `outcome` of `unknown` at write time. Four layers — schema v2, deterministic capture + a routing-tag, two-sided enforcement (Stop-hook routing-gate + C5 self-discipline controller), `/brain-retro` analysis. The routing-gate makes provenance reliable: when the user dictates a method and the routing-tag is missing, the Stop-hook returns `decision: block`. Spec: `docs/superpowers/specs/2026-05-19-observer-factor-analysis-design.md`.
(d) In the ## Enforcement section, add a bullet after the C4 bullet:
- Observer routing-gate runs inside `observer-stop-hook.mjs` (`decision: block` when a method is dictated without a routing-tag); C5 observer-coverage-checker is a warn-only lefthook job.
(e) In the YAML front-matter related: list and in ## References, add a line for the factor-analysis spec:
- docs/superpowers/specs/2026-05-19-observer-factor-analysis-design.md
- Step 3: Extend Pravila §16
In docs/Pravila_raboty_Claude_v1_1.md:
(a) In ### 16.2. Observer (scope B), add this paragraph after the existing **Граница**: line:
**Схема эпизода v2 (2026-05-19, factor-analysis extension):** эпизод несёт `schema_version: 2` и поля для факторного анализа — `decision_provenance` (кто выбрал узел: автономно или навязанный метод + контрфактуал), `environment` (`economy_level` / `model` / `post_compaction` / `session_turn` / `parallel_session`), `task_size`, `task_ref`, `prompt_signal`; `outcome` при записи — `unknown` (уточняется `/brain-retro`). Виды событий расширены: `hook_fired` / `interrupt` / `retry` / `time_burn` / `parse_gap`. Spec — `docs/superpowers/specs/2026-05-19-observer-factor-analysis-design.md`.
(b) In ### 16.3. 4 контролёра, change the heading to ### 16.3. 5 контролёров, add a C5 row at the end of the table, and change Все 4 — механические to Все 5 — механические:
| C5 | Observer-coverage-checker | пропуски наблюдателя + целостность регистрации | lefthook warn-only + STATUS.md |
(c) Add two new subsections, §16.7 and §16.8, after ### 16.6. Cross-refs (i.e. before the closing --- of section 16):
### 16.7. Routing-тег-дисциплина
Когда заказчик навязал конкретный метод/узел (директива `запусти X` / `используй X` / `через X` / `/команда`), Claude ОБЯЗАН в том же ходе эмитить routing-тег — одну строку-HTML-комментарий:
`<!-- routing: provenance=user_directed_method node=<выбранный> counterfactual=<узел, который Claude выбрал бы автономно> -->`
Enforcement — механический, не поведенческая просьба: `tools/observer-stop-hook.mjs` содержит routing-gate. Детектор видит навязанный метод, тега нет → Stop-хук возвращает `decision: block`, и ход не завершается без тега. Это хук, а не tier-§13-правило — обойти рационализацией нельзя. Гейт срабатывает не более одного раза за ход (`stop_hook_active` guard против петли).
### 16.8. Самодисциплина наблюдателя
Наблюдатель фиксирует каждый Stop без молчаливых пропусков:
- Внутренний отказ хука → строка-маркер `observer_error` в JSONL (не тихий `exit 0` без записи).
- Доля непарсибельных строк транскрипта выше порога → событие `parse_gap`.
- Контролёр **C5 observer-coverage-checker** (lefthook, warn-only) сверяет покрытие (git-активность без эпизодов) и целостность регистрации (Stop-хук в `.claude/settings.json`, `post-commit` установлен); расхождение — флаг в `docs/observer/STATUS.md`.
(d) In ### 16.6. Cross-refs, add a line:
- factor-analysis spec: `docs/superpowers/specs/2026-05-19-observer-factor-analysis-design.md`
(e) Bump the header version. In the file header **Версия:** line, change v1.31 to v1.32. In the ## Что сделано после утверждения changelog section, add an entry at the top:
- **v1.32 (2026-05-19)** — observer factor-analysis extension (ADR-011 amend): §16.2 +абзац «Схема эпизода v2»; §16.3 4→5 контролёров (+C5 observer-coverage-checker); §16.7 (новое) routing-тег-дисциплина — механический Stop-gate; §16.8 (новое) самодисциплина наблюдателя; §16.6 +cross-ref. Spec `docs/superpowers/specs/2026-05-19-observer-factor-analysis-design.md`.
- Step 4: Sync PSR_v1 R16
In docs/Plugin_stack_rules_v1.md:
(a) In ### 16.1. Observer scope, append a sentence:
Схема v2 (2026-05-19, ADR-011 amend): эпизод несёт `schema_version`, `decision_provenance`, `environment`, `task_size`, `task_ref`, `prompt_signal`; события расширены `hook_fired` / `interrupt` / `retry` / `time_burn` / `parse_gap`. Spec — `docs/superpowers/specs/2026-05-19-observer-factor-analysis-design.md`.
(b) In ### 16.4. Cross-refs, add a line:
- factor-analysis spec: `docs/superpowers/specs/2026-05-19-observer-factor-analysis-design.md`
(c) Bump the header version v3.16 → v3.17. In the ## История версий section, add an entry at the top:
- **v3.17 (2026-05-19)** — observer schema v2 sync (ADR-011 amend): R16.1 +предложение про `schema_version` / `decision_provenance` / `environment` / `task_size` / `prompt_signal` + расширенные события; R16.4 +cross-ref на factor-analysis spec. R0–R15 без изменений. Связано: ADR-011, Pravila §16 (v1.32), spec `docs/superpowers/specs/2026-05-19-observer-factor-analysis-design.md`.
- Step 5: Cross-ref the brain-governance spec
In docs/superpowers/specs/2026-05-19-brain-governance-design.md, add a line to its **Связано:** / cross-refs header area (near the top of the file):
- Расширение: `docs/superpowers/specs/2026-05-19-observer-factor-analysis-design.md` (observer factor-analysis, schema v2).
- Step 6: Flip this spec's status
In docs/superpowers/specs/2026-05-19-observer-factor-analysis-design.md, change the header **Статус:** draft — на ревью заказчика to:
**Статус:** accepted — реализуется по плану `docs/superpowers/plans/2026-05-19-observer-factor-analysis.md`
- Step 7: Verify normative Markdown lints + cross-ref-checker passes
Run: npx markdownlint-cli2 "docs/adr/ADR-011-brain-governance.md" "docs/Pravila_raboty_Claude_v1_1.md" "docs/Plugin_stack_rules_v1.md" "docs/superpowers/specs/2026-05-19-brain-governance-design.md" "docs/superpowers/specs/2026-05-19-observer-factor-analysis-design.md"
Expected: 0 errors.
Run: node tools/cross-ref-checker.mjs
Expected: [cross-ref-checker] OK — 0 drift (the Pravila v1.32 / PSR_v1 v3.17 header bumps must match any cross-refs; if it fails, the offending cross-ref points at an old version — fix it).
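If the checker reports drift, the stale references can be located with a quick scan. The sketch below is not how tools/cross-ref-checker.mjs works (its logic is not shown in this plan); findVersionDrift, its parameters, and the "Name vX.Y" reference pattern are assumptions made for illustration.

```javascript
// Hypothetical helper: list references of the form "<docName> vX.Y" whose
// version differs from the version now stated in that document's header.
function findVersionDrift(text, docName, currentVersion) {
  const escaped = docName.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'); // regex-escape the name
  const pattern = new RegExp(`${escaped}\\s+v(\\d+\\.\\d+)`, 'g');
  const drift = [];
  for (const match of text.matchAll(pattern)) {
    if (match[1] !== currentVersion) {
      drift.push({ found: match[1], index: match.index });
    }
  }
  return drift;
}

// Illustrative text: one stale Pravila reference, one current one.
const sampleText = 'See Pravila v1.31 and PSR_v1 v3.17; also Pravila v1.32.';
const pravilaDrift = findVersionDrift(sampleText, 'Pravila', '1.32');
const psrDrift = findVersionDrift(sampleText, 'PSR_v1', '3.17');
console.log(pravilaDrift, psrDrift); // [ { found: '1.31', index: 4 } ] []
```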
- Step 8: Commit
git add docs/adr/ADR-011-brain-governance.md docs/Pravila_raboty_Claude_v1_1.md docs/Plugin_stack_rules_v1.md docs/superpowers/specs/2026-05-19-brain-governance-design.md docs/superpowers/specs/2026-05-19-observer-factor-analysis-design.md
git commit -m "docs(brain): normative sync — ADR-011 amend + Pravila §16 + PSR_v1 R16"
Task 12: CLAUDE.md sync via claude-md-management plugin
Spec §7. CLAUDE.md is edited only via the plugin (CLAUDE.md §5 п.10) — never by direct Edit.
Files:
- Modify: CLAUDE.md (via /claude-md-management:claude-md-improver)
- Step 1: Invoke the plugin with the targeted update
Invoke /claude-md-management:claude-md-improver with this instruction:
Apply targeted updates to CLAUDE.md for the observer factor-analysis extension (ADR-011 amendment): §0 cross-refs: Pravila v1.31→v1.32, PSR_v1 v3.16→v3.17 (Tooling unchanged); §3.6 «Brain governance»: add a sentence that the observer now writes schema-v2 episodes (decision provenance + environment factors + factor matrix) and that a routing-gate enforces the routing-tag, plus C5 observer-coverage-checker as a 5th controller; §9 changelog: add a v2.19 entry summarizing the observer factor-analysis extension, spec docs/superpowers/specs/2026-05-19-observer-factor-analysis-design.md, plan docs/superpowers/plans/2026-05-19-observer-factor-analysis.md.
- Step 2: Verify cross-ref-checker passes after the CLAUDE.md bump
Run: node tools/cross-ref-checker.mjs
Expected: [cross-ref-checker] OK — 0 drift — the CLAUDE.md §0 cross-refs (Pravila v1.32 / PSR_v1 v3.17) must match the headers bumped in Task 11.
- Step 3: Commit
git add CLAUDE.md docs/CHANGELOG_claude_md.md
git commit -m "docs(claude-md): observer factor-analysis extension cross-refs (v2.19)"
(If the plugin already committed CLAUDE.md, skip this step — verify with git status.)
Final verification
After all 12 tasks:
- Full tools test suite GREEN
Run: node app/node_modules/vitest/vitest.mjs --config app/vitest.config.tools.mjs run
Expected: 0 failures. New + modified files covered: observer-transcript-parser, observer-routing-detector, observer-stop-hook, observer-coverage-checker, status-md-generator, brain-retro-analyzer. Report the exact pass count.
- lefthook pre-commit GREEN
Run: npx lefthook run pre-commit
Expected: all jobs pass — including job 11 l1-watcher (strict), job 12 cross-ref-checker (strict), job 15 observer-coverage-checker (warn-only). Report each job's status.
- Requirements checklist vs spec
Re-read docs/superpowers/specs/2026-05-19-observer-factor-analysis-design.md §3–§8 and confirm every item maps to a completed task. Report any gap explicitly.
- Push: only on explicit user approval. Pattern: git push origin feat/parallel-sessions-coordination:main (fast-forward). The pre-push gate (gitleaks full history + lychee) must be green.