Observer factor-analysis extension — Implementation Plan

For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.

Goal: Extend the brain-governance observer so a real factor analysis becomes possible — capture decision provenance, environment factors, task size, process events, and a true outcome — and make observer discipline mechanically enforced.

Architecture: Four layers over the existing observer. Layer 1 — episode schema v2 (new fields). Layer 2 — deterministic transcript parsing + a one-line routing-tag Claude prints when the user dictates a method. Layer 3 — two-sided enforcement: a Stop-hook routing-gate (decision: block when a method was dictated but the tag is missing) and an observer self-discipline controller (C5). Layer 4 — /brain-retro analysis (outcome inference, episode→task grouping, causal chains, factor matrix). All deterministic — 0 LLM calls. Implementation continues on branch feat/parallel-sessions-coordination (where brain-governance work lives) or an isolated worktree if the executing skill creates one.

Tech Stack: Node ESM (tools/*.mjs), Vitest (tools/*.test.mjs, config app/vitest.config.tools.mjs), lefthook, Markdown skills/normative docs.


Source spec

docs/superpowers/specs/2026-05-19-observer-factor-analysis-design.md (v1.0). Read it before starting — every task below traces to a spec section.

Two deliberate spec-gap resolutions (decided here, not in the spec)

The spec §6 describes Layer-4 logic that needs data §3 does not list. Both are resolved in this plan:

  1. prompt_signal — spec §6 infers an episode's outcome from "the first user-prompt of the next episode". An episode does not store raw prompt text (PII risk). Resolution: each v2 episode stores prompt_signal — a deterministic classification of its own opening user-prompt (correction | approval | new_task | neutral). inferOutcome then reads nextEpisode.prompt_signal. PII-safe, deterministic.
  2. task_size.files — spec §3 says task_size.files_touched is a count; spec §6 causal chains need the actual file paths. Resolution: task_size carries both — files_touched (count, per §3) and files (string array). File paths are not PII (the PII filter covers phones/emails/tokens).
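
A minimal sketch of how resolution 1 might look in use. The function name inferOutcomeSketch and the outcome labels 'failure' and 'success' are assumptions made for illustration only; the real inferOutcome belongs to the Layer-4 analyzer and may map signals differently. The point is that an episode's outcome is derived from the next episode's prompt_signal, never from raw prompt text.

```javascript
// Hypothetical mapping from the NEXT episode's prompt_signal to an outcome.
// The right-hand labels are assumptions, not values taken from the spec.
const SIGNAL_TO_OUTCOME = {
  correction: 'failure',
  approval: 'success',
};

// Returns the inferred outcome for the episode that precedes nextEpisode.
function inferOutcomeSketch(nextEpisode) {
  if (!nextEpisode) return 'unknown'; // last episode: nothing to infer from
  const signal = nextEpisode.prompt_signal;
  return Object.prototype.hasOwnProperty.call(SIGNAL_TO_OUTCOME, signal)
    ? SIGNAL_TO_OUTCOME[signal]
    : 'unknown'; // new_task / neutral carry no verdict in this sketch
}

// Two consecutive v2 episodes, reduced to the fields from the two resolutions.
const episodes = [
  { prompt_signal: 'new_task', task_size: { tool_calls: 3, files_touched: 1, files: ['/a.js'] } },
  { prompt_signal: 'correction', task_size: { tool_calls: 1, files_touched: 1, files: ['/a.js'] } },
];

// Each episode is judged by the opening prompt of the episode after it.
const outcomes = episodes.map((_, i) => inferOutcomeSketch(episodes[i + 1]));
console.log(outcomes);
```

Both sample episodes also carry task_size.files, so a later causal-chain step could see that the correction touched the same path as the episode it corrects.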

Verification commands (used throughout)

  • Full tools test suite (run from repo root c:\моя\проекты\портал crm\Документация): node app/node_modules/vitest/vitest.mjs --config app/vitest.config.tools.mjs run
  • Single test file (substring filter): append the basename, e.g. node app/node_modules/vitest/vitest.mjs --config app/vitest.config.tools.mjs run observer-routing-detector
  • lefthook pre-commit: npx lefthook run pre-commit

File Structure

| File | Responsibility | Action |
| --- | --- | --- |
| tools/observer-transcript-parser.mjs | Deterministic transcript → v2 episode fields | Modify (Tasks 1-2) |
| tools/observer-transcript-parser.test.mjs | Parser unit tests | Modify (Tasks 1-2) |
| tools/observer-routing-detector.mjs | "Was a method dictated?" detector + known-nodes loader | Create (Task 3) |
| tools/observer-routing-detector.test.mjs | Detector unit tests | Create (Task 3) |
| tools/observer-known-nodes.txt | Static list of directable node/skill names | Create (Task 3) |
| tools/observer-stop-hook.mjs | Builds + appends v2 episode; routing-gate; observer_error marker | Modify (Tasks 4-5) |
| tools/observer-stop-hook.test.mjs | Stop-hook unit tests | Modify (Tasks 4-5) |
| tools/observer-coverage-checker.mjs | C5: coverage + registration-integrity controller | Create (Task 6) |
| tools/observer-coverage-checker.test.mjs | C5 unit tests | Create (Task 6) |
| tools/status-md-generator.mjs | STATUS.md dashboard: adds C5 row + observer_error metric | Modify (Task 7) |
| tools/status-md-generator.test.mjs | STATUS.md unit tests | Modify (Task 7) |
| lefthook.yml | Wire C5 as pre-commit job 15 | Modify (Task 8) |
| tools/brain-retro-analyzer.mjs | Layer-4 deterministic aggregation | Create (Task 9) |
| tools/brain-retro-analyzer.test.mjs | Analyzer unit tests | Create (Task 9) |
| .claude/skills/brain-retro/SKILL.md | /brain-retro procedure, uses the analyzer | Modify (Task 10) |
| .claude/skills/brain-retro/references/aggregation-template.md | Retro template: v2 factor matrix | Modify (Task 10) |
| docs/observer/README.md | Observer docs: schema v2, observer_error, routing-tag | Modify (Task 10) |
| docs/adr/ADR-011-brain-governance.md | ADR amendment: observer v2, C5 | Modify (Task 11) |
| docs/Pravila_raboty_Claude_v1_1.md | §16: schema v2, §16.7 routing-tag, §16.8 self-discipline | Modify (Task 11) |
| docs/Plugin_stack_rules_v1.md | R16: schema v2 sync | Modify (Task 11) |
| docs/superpowers/specs/2026-05-19-brain-governance-design.md | Cross-ref to the factor-analysis spec | Modify (Task 11) |
| docs/superpowers/specs/2026-05-19-observer-factor-analysis-design.md | Status draft → accepted | Modify (Task 11) |
| CLAUDE.md | §0 cross-refs + §3.6 note + §9 changelog | Modify via plugin (Task 12) |

Task 1: Parser — environment, task_size, prompt_signal extractors

Spec §3 (environment, task_size, prompt_signal), §4.1. Adds three pure, independently-tested extractor functions and refactors parseLines to also count broken lines (needed for parse_gap in Task 2). parseTranscript is not restructured yet — only its parseLines call site is updated so existing tests stay green.

Files:

  • Modify: tools/observer-transcript-parser.mjs

  • Test: tools/observer-transcript-parser.test.mjs

  • Step 1: Write failing tests for the three extractors + parseLines counts

Append to tools/observer-transcript-parser.test.mjs. First add to the imports at the top:

import {
  parseTranscript,
  extractEnvironment,
  extractTaskSize,
  classifyPromptSignal,
} from './observer-transcript-parser.mjs';

Then append this describe block at the end of the file:

describe('extractEnvironment', () => {
  it('reads economy_level from the ECONOMY MODE marker', () => {
    const entries = [
      userPrompt('=== ECONOMY MODE: 0% (пользователь указал явно) ===\nfix it', '2026-05-19T10:00:00Z'),
    ];
    expect(extractEnvironment(entries, 0).economy_level).toBe(0);
  });

  it('economy_level is null when no marker present', () => {
    const entries = [userPrompt('just do it', '2026-05-19T10:00:00Z')];
    expect(extractEnvironment(entries, 0).economy_level).toBeNull();
  });

  it('reads model from an assistant message', () => {
    const entries = [
      userPrompt('go', '2026-05-19T10:00:00Z'),
      { type: 'assistant', message: { role: 'assistant', model: 'claude-opus-4-7', content: [] }, timestamp: '2026-05-19T10:01:00Z', sessionId: 's1' },
    ];
    expect(extractEnvironment(entries, 0).model).toBe('claude-opus-4-7');
  });

  it('post_compaction is true when an isCompactSummary entry precedes the turn', () => {
    const entries = [
      { type: 'user', isCompactSummary: true, message: { role: 'user', content: 'summary' }, timestamp: '2026-05-19T09:00:00Z' },
      userPrompt('the real turn', '2026-05-19T10:00:00Z'),
    ];
    expect(extractEnvironment(entries, 1).post_compaction).toBe(true);
  });

  it('post_compaction is false with no compaction marker', () => {
    const entries = [userPrompt('turn one', '2026-05-19T09:00:00Z'), userPrompt('turn two', '2026-05-19T10:00:00Z')];
    expect(extractEnvironment(entries, 1).post_compaction).toBe(false);
  });

  it('session_turn counts real user prompts up to and including the turn start', () => {
    const entries = [
      userPrompt('one', '2026-05-19T09:00:00Z'),
      userPrompt('two', '2026-05-19T09:30:00Z'),
      userPrompt('three', '2026-05-19T10:00:00Z'),
    ];
    expect(extractEnvironment(entries, 2).session_turn).toBe(3);
  });
});

describe('extractTaskSize', () => {
  it('counts tool calls and unique file paths', () => {
    const turn = [
      assistantTurn(
        [
          { type: 'tool_use', id: 't1', name: 'Read', input: { file_path: '/a.js' } },
          { type: 'tool_use', id: 't2', name: 'Edit', input: { file_path: '/a.js' } },
          { type: 'tool_use', id: 't3', name: 'Write', input: { file_path: '/b.js' } },
          { type: 'tool_use', id: 't4', name: 'Bash', input: {} },
        ],
        '2026-05-19T10:01:00Z'
      ),
    ];
    const size = extractTaskSize(turn);
    expect(size.tool_calls).toBe(4);
    expect(size.files_touched).toBe(2);
    expect(size.files.sort()).toEqual(['/a.js', '/b.js']);
  });

  it('returns zeros for an empty turn', () => {
    expect(extractTaskSize([])).toEqual({ tool_calls: 0, files_touched: 0, files: [] });
  });
});

describe('classifyPromptSignal', () => {
  it('detects corrections', () => {
    expect(classifyPromptSignal('не то, переделай')).toBe('correction');
    expect(classifyPromptSignal('почему ты это сделал')).toBe('correction');
  });
  it('detects approvals', () => {
    expect(classifyPromptSignal('ок, спасибо')).toBe('approval');
    expect(classifyPromptSignal('готово, дальше')).toBe('approval');
  });
  it('detects a new task', () => {
    expect(classifyPromptSignal('добавь новую фичу экспорта в CSV')).toBe('new_task');
  });
  it('falls back to neutral', () => {
    expect(classifyPromptSignal('hmm')).toBe('neutral');
  });
});
  • Step 2: Run the tests to verify they fail

Run: node app/node_modules/vitest/vitest.mjs --config app/vitest.config.tools.mjs run observer-transcript-parser
Expected: FAIL (extractEnvironment is not a function, extractTaskSize is not a function, classifyPromptSignal is not a function).

  • Step 3: Refactor parseLines to count broken lines

In tools/observer-transcript-parser.mjs, replace the parseLines function (currently lines 20-32) with:

function parseLines(text) {
  const entries = [];
  let broken = 0;
  let total = 0;
  for (const line of String(text || '').split('\n')) {
    const trimmed = line.trim();
    if (!trimmed) continue;
    total += 1;
    try {
      entries.push(JSON.parse(trimmed));
    } catch {
      broken += 1; // broken line — counted for parse_gap, never thrown
    }
  }
  return { entries, broken, total };
}
  • Step 4: Update the parseLines call site in parseTranscript

In parseTranscript, change the first line of the body from const entries = parseLines(transcriptText); to:

  const { entries } = parseLines(transcriptText);

(This keeps parseTranscript working unchanged; the full v2 rewrite happens in Task 2.)
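
To make the new return shape concrete, here is a small standalone sketch. It repeats the parseLines body from Step 3 so that it runs on its own; the sample transcript string and the brokenRatio variable are illustrative assumptions, not part of the plan.

```javascript
// Same logic as the Step 3 replacement, repeated so the sketch is standalone.
function parseLines(text) {
  const entries = [];
  let broken = 0;
  let total = 0;
  for (const line of String(text || '').split('\n')) {
    const trimmed = line.trim();
    if (!trimmed) continue; // blank lines are not counted at all
    total += 1;
    try {
      entries.push(JSON.parse(trimmed));
    } catch {
      broken += 1; // unparseable line: counted, never thrown
    }
  }
  return { entries, broken, total };
}

// Illustrative input: two valid JSONL lines, one blank line, one broken line.
const sample = ['{"type":"user"}', '', 'not json at all', '{"type":"assistant"}'].join('\n');

const { entries, broken, total } = parseLines(sample);
const brokenRatio = total > 0 ? broken / total : 0;

console.log(entries.length, broken, total, brokenRatio);
```

Callers that only need the parsed entries destructure { entries } and ignore the counters, which is why the Step 4 change is a one-line edit.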

  • Step 5: Add the three extractor functions

In tools/observer-transcript-parser.mjs, add these functions after collectToolUse (after line 97). They reuse the existing isRealUserPrompt:

const FILE_TOOLS = new Set(['Read', 'Edit', 'Write', 'MultiEdit', 'NotebookEdit']);

/**
 * Deterministic environment factors for the turn that starts at turnStartIdx.
 * economy_level / parallel_session are scanned from the stringified turn;
 * model / post_compaction / session_turn from structural fields.
 */
export function extractEnvironment(allEntries, turnStartIdx) {
  const turn = allEntries.slice(turnStartIdx);
  const rawTurn = JSON.stringify(turn);

  const econ = rawTurn.match(/=== ECONOMY MODE:\s*(\d+)\s*%/);
  const economy_level = econ ? Number(econ[1]) : null;

  let model = null;
  for (const e of turn) {
    if (e && e.message && e.message.model) {
      model = e.message.model;
      break;
    }
  }

  let post_compaction = false;
  for (let i = 0; i < turnStartIdx && i < allEntries.length; i++) {
    if (allEntries[i] && allEntries[i].isCompactSummary === true) {
      post_compaction = true;
      break;
    }
  }

  let session_turn = 0;
  for (let i = 0; i <= turnStartIdx && i < allEntries.length; i++) {
    if (isRealUserPrompt(allEntries[i])) session_turn += 1;
  }

  const parallel_session = /параллельн|parallel session|чужой staged|foreign git index/i.test(rawTurn);

  return { economy_level, model, post_compaction, session_turn, parallel_session };
}

/** Task size: total tool calls + unique file paths touched (per spec §3, gap-resolution 2). */
export function extractTaskSize(turn) {
  let tool_calls = 0;
  const files = new Set();
  for (const e of turn) {
    const content = e && e.message && Array.isArray(e.message.content) ? e.message.content : [];
    for (const b of content) {
      if (b && b.type === 'tool_use') {
        tool_calls += 1;
        if (FILE_TOOLS.has(b.name) && b.input) {
          const p = b.input.file_path || b.input.notebook_path;
          if (p) files.add(String(p));
        }
      }
    }
  }
  return { tool_calls, files_touched: files.size, files: [...files] };
}

/** Classify the opening user-prompt sentiment (per spec §6 / gap-resolution 1). */
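export function classifyPromptSignal(text) {
  const t = String(text || '').toLowerCase().trim();
  // JS \b only recognises ASCII word characters, so it never fires next to
  // Cyrillic letters ('ок, спасибо' would not match /^ок\b/). Unicode-aware
  // lookarounds (u flag) stand in for the word boundaries instead.
  const correction =
    /не то(?![\p{L}\p{N}_])|не так(?![\p{L}\p{N}_])|переделай|отбой|(?<![\p{L}\p{N}_])стоп(?![\p{L}\p{N}_])|почему ты|неверно|не верно|это не /u;
  const approval =
    /^(ок|окей|ok|спасибо|супер|отлично|готово|дальше|идеально)(?![\p{L}\p{N}_])/u;
  if (correction.test(t)) return 'correction';
  if (approval.test(t)) return 'approval';
  if (classifyTask(t) !== 'other' && t.length > 15) return 'new_task';
  return 'neutral';
}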
  • Step 6: Run the tests to verify they pass

Run: node app/node_modules/vitest/vitest.mjs --config app/vitest.config.tools.mjs run observer-transcript-parser
Expected: PASS. All existing tests and the new extractEnvironment / extractTaskSize / classifyPromptSignal tests are green.

  • Step 7: Commit
git add tools/observer-transcript-parser.mjs tools/observer-transcript-parser.test.mjs
git commit -m "feat(observer): parser v2 — environment, task_size, prompt_signal extractors"

Task 2: Parser — process events, routing-tag, v2 episode assembly

Spec §3 (schema_version, decision_provenance, events[], outcome default), §4.1, §4.2. Adds process-event extraction and routing-tag parsing, then rewrites parseTranscript to assemble the full v2 episode. Exports extractLastUserPromptText for the Stop-hook routing-gate (Task 5).

Files:

  • Modify: tools/observer-transcript-parser.mjs

  • Test: tools/observer-transcript-parser.test.mjs

  • Step 1: Write failing tests for process events, routing-tag, v2 assembly

In tools/observer-transcript-parser.test.mjs, extend the import to also bring in the new functions:

import {
  parseTranscript,
  extractEnvironment,
  extractTaskSize,
  classifyPromptSignal,
  extractProcessEvents,
  parseRoutingTag,
  extractLastUserPromptText,
} from './observer-transcript-parser.mjs';

Change the existing empty-transcript test (currently expect(ep.outcome).toBe('success')) to expect 'unknown':

  it('returns safe defaults for an empty transcript', () => {
    const ep = parseTranscript('');
    expect(ep.task_id).toBeTruthy();
    expect(ep.primary_rationale.node_chosen).toBe('direct');
    expect(ep.events).toEqual([]);
    expect(ep.outcome).toBe('unknown');
    expect(ep.schema_version).toBe(2);
  });

Append this describe block at the end of the file:

describe('extractProcessEvents', () => {
  it('emits a hook_fired summary with per-hook counts and error count', () => {
    const turn = [
      { attachment: { type: 'hook_success', hookName: 'PreToolUse:Read' } },
      { attachment: { type: 'hook_success', hookName: 'PreToolUse:Read' } },
      { attachment: { type: 'hook_error', hookName: 'Stop:observer' } },
    ];
    const ev = extractProcessEvents(turn, 0, 0, 0).find((e) => e.kind === 'hook_fired');
    expect(ev.counts).toEqual({ 'PreToolUse:Read': 2, 'Stop:observer': 1 });
    expect(ev.errors).toBe(1);
  });

  it('emits an interrupt event for [Request interrupted by user]', () => {
    const turn = [
      { message: { role: 'user', content: [{ type: 'text', text: '[Request interrupted by user]' }] } },
    ];
    expect(extractProcessEvents(turn, 0, 0, 0).filter((e) => e.kind === 'interrupt')).toHaveLength(1);
  });

  it('emits a retry event when an errored tool is used again later', () => {
    const turn = [
      { message: { role: 'assistant', content: [{ type: 'tool_use', id: 'u1', name: 'Bash', input: {} }] } },
      { message: { role: 'user', content: [{ type: 'tool_result', tool_use_id: 'u1', is_error: true }] } },
      { message: { role: 'assistant', content: [{ type: 'tool_use', id: 'u2', name: 'Bash', input: {} }] } },
    ];
    expect(extractProcessEvents(turn, 0, 0, 0).filter((e) => e.kind === 'retry')).toHaveLength(1);
  });

  it('emits a time_burn event when the turn exceeds the threshold', () => {
    const ev = extractProcessEvents([], 0, 0, 1000000).find((e) => e.kind === 'time_burn');
    expect(ev.duration_ms).toBe(1000000);
  });

  it('emits a parse_gap event when the broken-line ratio is above threshold', () => {
    const ev = extractProcessEvents([], 3, 10, 0).find((e) => e.kind === 'parse_gap');
    expect(ev).toEqual({ kind: 'parse_gap', broken: 3, total: 10 });
  });

  it('emits nothing for a clean empty turn', () => {
    expect(extractProcessEvents([], 0, 0, 0)).toEqual([]);
  });
});

describe('parseRoutingTag', () => {
  it('parses a user_directed_method routing tag from assistant text', () => {
    const turn = [
      assistantTurn(
        [{ type: 'text', text: 'ok\n<!-- routing: provenance=user_directed_method node=discovery-interview counterfactual=brainstorming -->' }],
        '2026-05-19T10:01:00Z'
      ),
    ];
    expect(parseRoutingTag(turn)).toEqual({
      kind: 'user_directed_method',
      node: 'discovery-interview',
      claude_would_have_chosen: 'brainstorming',
    });
  });

  it('returns null when no tag is present', () => {
    const turn = [assistantTurn([{ type: 'text', text: 'plain answer' }], '2026-05-19T10:01:00Z')];
    expect(parseRoutingTag(turn)).toBeNull();
  });
});

describe('parseTranscript — v2 episode', () => {
  it('produces schema_version 2 and all v2 fields', () => {
    const t = jsonl([
      userPrompt('=== ECONOMY MODE: 0% ===\nдобавь фичу', '2026-05-19T10:00:00Z', 'sess-v2'),
      assistantTurn([{ type: 'tool_use', id: 't1', name: 'Read', input: { file_path: '/x.js' } }], '2026-05-19T10:01:00Z', 'sess-v2'),
    ]);
    const ep = parseTranscript(t);
    expect(ep.schema_version).toBe(2);
    expect(ep.task_ref).toBe('sess-v2');
    expect(ep.outcome).toBe('unknown');
    expect(ep.prompt_signal).toBe('new_task');
    expect(ep.decision_provenance).toEqual({ kind: 'autonomous', claude_would_have_chosen: null });
    expect(ep.environment.economy_level).toBe(0);
    expect(ep.task_size).toEqual({ tool_calls: 1, files_touched: 1, files: ['/x.js'] });
  });

  it('records decision_provenance from a routing tag', () => {
    const t = jsonl([
      userPrompt('запусти discovery-interview', '2026-05-19T10:00:00Z', 'sess-tag'),
      assistantTurn(
        [{ type: 'text', text: '<!-- routing: provenance=user_directed_method node=discovery-interview counterfactual=brainstorming -->' }],
        '2026-05-19T10:01:00Z',
        'sess-tag'
      ),
    ]);
    const ep = parseTranscript(t);
    expect(ep.decision_provenance.kind).toBe('user_directed_method');
    expect(ep.decision_provenance.claude_would_have_chosen).toBe('brainstorming');
  });
});

describe('extractLastUserPromptText', () => {
  it('returns the text of the last real user prompt', () => {
    const t = jsonl([
      userPrompt('first turn', '2026-05-19T09:00:00Z'),
      userPrompt('second and last', '2026-05-19T10:00:00Z'),
    ]);
    expect(extractLastUserPromptText(t)).toBe('second and last');
  });
});
  • Step 2: Run the tests to verify they fail

Run: node app/node_modules/vitest/vitest.mjs --config app/vitest.config.tools.mjs run observer-transcript-parser
Expected: FAIL (extractProcessEvents is not a function, parseRoutingTag is not a function, extractLastUserPromptText is not a function, and the v2 episode assertions fail).

  • Step 3: Add process-event and routing-tag functions

In tools/observer-transcript-parser.mjs, add after the classifyPromptSignal function from Task 1:

const TIME_BURN_THRESHOLD_MS = 900000; // 15 min — turn wall-clock above this = time_burn
const PARSE_GAP_RATIO = 0.1; // >10% unparseable lines = parse_gap

/** Heuristic retry count: an errored tool whose name is used again later in the turn. */
function detectRetries(turn) {
  const idToName = {};
  const uses = [];
  turn.forEach((entry, idx) => {
    const content = entry && entry.message && Array.isArray(entry.message.content) ? entry.message.content : [];
    for (const b of content) {
      if (b && b.type === 'tool_use') {
        idToName[b.id] = b.name;
        uses.push({ name: b.name, idx });
      }
    }
  });
  const errors = [];
  turn.forEach((entry, idx) => {
    const content = entry && entry.message && Array.isArray(entry.message.content) ? entry.message.content : [];
    for (const b of content) {
      if (b && b.type === 'tool_result' && b.is_error === true) {
        errors.push({ name: idToName[b.tool_use_id] || null, idx });
      }
    }
  });
  let retries = 0;
  for (const err of errors) {
    if (err.name && uses.some((u) => u.name === err.name && u.idx > err.idx)) retries += 1;
  }
  return retries;
}

/**
 * Process events for the turn: hook_fired (summary), interrupt, retry,
 * time_burn, parse_gap. broken/total/durationMs are computed by the caller.
 */
export function extractProcessEvents(turn, broken, total, durationMs) {
  const events = [];

  const hookCounts = {};
  let hookErrors = 0;
  for (const e of turn) {
    const att = e && e.attachment;
    if (att && (att.type === 'hook_success' || att.type === 'hook_error')) {
      const name = att.hookName || 'unknown';
      hookCounts[name] = (hookCounts[name] || 0) + 1;
      if (att.type === 'hook_error') hookErrors += 1;
    }
  }
  if (Object.keys(hookCounts).length > 0) {
    events.push({ kind: 'hook_fired', counts: hookCounts, errors: hookErrors });
  }

  for (const e of turn) {
    const content = e && e.message && Array.isArray(e.message.content) ? e.message.content : [];
    const isUser = e && e.message && e.message.role === 'user';
    if (
      isUser &&
      content.some((b) => b && b.type === 'text' && String(b.text || '').includes('[Request interrupted by user]'))
    ) {
      events.push({ kind: 'interrupt' });
    }
  }

  const retries = detectRetries(turn);
  for (let i = 0; i < retries; i++) events.push({ kind: 'retry' });

  if (durationMs > TIME_BURN_THRESHOLD_MS) {
    events.push({ kind: 'time_burn', duration_ms: durationMs });
  }

  if (total > 0 && broken / total > PARSE_GAP_RATIO) {
    events.push({ kind: 'parse_gap', broken, total });
  }

  return events;
}

const ROUTING_TAG_RE =
  /<!--\s*routing:\s*provenance=([\w_]+)\s+node=(\S+)\s+counterfactual=(\S+)\s*-->/;

/** Find the routing tag Claude prints when a method was user-directed (spec §4.2). */
export function parseRoutingTag(turn) {
  for (const e of turn) {
    const content = e && e.message && Array.isArray(e.message.content) ? e.message.content : [];
    for (const b of content) {
      if (b && b.type === 'text' && typeof b.text === 'string') {
        const m = b.text.match(ROUTING_TAG_RE);
        if (m) return { kind: m[1], node: m[2], claude_would_have_chosen: m[3] };
      }
    }
  }
  return null;
}

/** Text of the last real user prompt — used by the Stop-hook routing-gate (Task 5). */
export function extractLastUserPromptText(transcriptText) {
  const { entries } = parseLines(transcriptText);
  const start = findTurnStart(entries);
  return promptText(entries[start]);
}
  • Step 4: Rewrite parseTranscript to assemble the v2 episode

In tools/observer-transcript-parser.mjs, replace the entire parseTranscript function (currently lines 99-148, including its JSDoc) with:

/**
 * Parse a transcript JSONL string into an observer episode (schema v2).
 * @param {string} transcriptText - Raw JSONL transcript contents.
 * @param {string|null} fallbackSessionId - Used when the transcript has no sessionId.
 * @returns {object} v2 episode.
 */
export function parseTranscript(transcriptText, fallbackSessionId = null) {
  const { entries, broken, total } = parseLines(transcriptText);

  const withSession = entries.find((e) => e && e.sessionId);
  const sessionId =
    (withSession && withSession.sessionId) || fallbackSessionId || `unknown-${Date.now()}`;

  const start = findTurnStart(entries);
  const turn = entries.slice(start);

  const stamps = turn.map((e) => e && e.timestamp).filter(Boolean);
  const started_at = stamps[0] || new Date().toISOString();
  const ended_at = stamps[stamps.length - 1] || started_at;
  const durationMs = new Date(ended_at) - new Date(started_at);

  const { skills, counts, errorCount } = collectToolUse(turn);

  const events = [];
  for (const skill of skills) events.push({ kind: 'skill_invoked', skill });
  if (Object.keys(counts).length > 0) events.push({ kind: 'tool_summary', counts });
  for (let i = 0; i < errorCount; i++) {
    events.push({ kind: 'error', message: 'tool_result reported is_error' });
  }
  events.push(...extractProcessEvents(turn, broken, total, durationMs));

  const usedSuperpowers = skills.some((s) => String(s).startsWith(SUPERPOWERS_PREFIX));
  const prompt = promptText(entries[start]);

  const tag = parseRoutingTag(turn);
  const decision_provenance =
    tag && tag.kind === 'user_directed_method'
      ? { kind: 'user_directed_method', claude_would_have_chosen: tag.claude_would_have_chosen }
      : { kind: 'autonomous', claude_would_have_chosen: null };

  return {
    schema_version: 2,
    task_id: sessionId,
    task_ref: sessionId,
    timestamps: { started_at, ended_at },
    path_type: usedSuperpowers ? 'regulated' : 'improvised',
    outcome: 'unknown',
    prompt_signal: classifyPromptSignal(prompt),
    decision_provenance,
    environment: extractEnvironment(entries, start),
    task_size: extractTaskSize(turn),
    primary_rationale: {
      step: 1,
      node_chosen: skills.length > 0 ? skills[0] : 'direct',
      triggers_matched: [],
      candidates_considered: [],
      boundaries_applied: [],
      hard_floor: usedSuperpowers
        ? { invoked: true, rules: ['Pravila §12'] }
        : { invoked: false, rules: [] },
      task_classification: classifyTask(prompt),
    },
    events,
  };
}
  • Step 5: Run the tests to verify they pass

Run: node app/node_modules/vitest/vitest.mjs --config app/vitest.config.tools.mjs run observer-transcript-parser
Expected: PASS. All existing tests (with the updated outcome: 'unknown' assertion) and the new process-event / routing-tag / v2-assembly / extractLastUserPromptText tests are green.

  • Step 6: Commit
git add tools/observer-transcript-parser.mjs tools/observer-transcript-parser.test.mjs
git commit -m "feat(observer): parser v2 — process events, routing-tag, episode assembly"
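
The routing-tag format accepted in Step 3 can be checked with a short round trip. This is a sketch under stated assumptions: formatRoutingTag is a hypothetical helper that does not exist in the plan (Claude prints the tag as text), and ROUTING_TAG_RE is repeated here only so the snippet runs on its own.

```javascript
// Copied from Step 3 so the sketch is standalone.
const ROUTING_TAG_RE =
  /<!--\s*routing:\s*provenance=([\w_]+)\s+node=(\S+)\s+counterfactual=(\S+)\s*-->/;

// Hypothetical helper (not in the plan): builds the one-line tag.
function formatRoutingTag(node, counterfactual) {
  return `<!-- routing: provenance=user_directed_method node=${node} counterfactual=${counterfactual} -->`;
}

// Round trip: a well-formed tag parses back into the three captured fields.
const tag = formatRoutingTag('discovery-interview', 'brainstorming');
const m = tag.match(ROUTING_TAG_RE);
const parsed = m ? { kind: m[1], node: m[2], claude_would_have_chosen: m[3] } : null;
console.log(parsed);

// A value containing whitespace does not match at all: (\S+) stops at the
// first space and the regex then expects "counterfactual=".
const bad = formatRoutingTag('discovery interview', 'brainstorming').match(ROUTING_TAG_RE);
console.log(bad);
```

One consequence worth noting: node and counterfactual values must be single whitespace-free tokens, which the hyphenated names in the parser tests already satisfy.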

Task 3: Routing-gate method-direction detector

Spec §5.1 step 2. A pure detector — given a user-prompt text and a list of known node names, decide whether the user dictated a specific method. Conservative-broad (favours false-positives, per spec R1).

Files:

  • Create: tools/observer-known-nodes.txt

  • Create: tools/observer-routing-detector.mjs

  • Test: tools/observer-routing-detector.test.mjs

  • Step 1: Create the known-nodes data file

Create tools/observer-known-nodes.txt:

# Known router nodes — directive targets for the observer routing-gate.
# One node/skill name per line. Lines starting with # and blank lines are ignored.
# Extend this list when a new directable skill/command is added to the brain.
#
# superpowers skills
brainstorming
writing-plans
executing-plans
subagent-driven-development
test-driven-development
systematic-debugging
verification-before-completion
requesting-code-review
using-git-worktrees
finishing-a-development-branch
writing-skills
root-cause-tracing
condition-based-waiting
defense-in-depth
# project skills
discovery-interview
brain-retro
audit-portal
regression
process-modeling
process-analysis
ccpm
# plugins / commands
claude-md-management
security-review
  • Step 2: Write the failing test

Create tools/observer-routing-detector.test.mjs:

import { describe, it, expect } from 'vitest';
import { detectMethodDirected, loadKnownNodes } from './observer-routing-detector.mjs';

const NODES = ['brainstorming', 'discovery-interview', 'systematic-debugging'];

describe('detectMethodDirected', () => {
  it('detects a directive verb followed by a node name', () => {
    expect(detectMethodDirected('запусти discovery-interview по этой фиче', NODES)).toEqual({
      directed: true,
      node: 'discovery-interview',
    });
  });

  it('detects "используй X"', () => {
    expect(detectMethodDirected('используй systematic-debugging здесь', NODES).directed).toBe(true);
  });

  it('detects a /slash-command form', () => {
    expect(detectMethodDirected('сделай это через /brainstorming', NODES)).toEqual({
      directed: true,
      node: 'brainstorming',
    });
  });

  it('does NOT flag a bare node mention without a directive verb', () => {
    expect(detectMethodDirected('почему ты выбрал brainstorming, а не план?', NODES).directed).toBe(false);
  });

  it('does NOT flag a prompt with no node reference', () => {
    expect(detectMethodDirected('добавь колонку Город в таблицу', NODES).directed).toBe(false);
  });

  it('is empty-input safe', () => {
    expect(detectMethodDirected('', NODES).directed).toBe(false);
    expect(detectMethodDirected(null, []).directed).toBe(false);
  });
});

describe('loadKnownNodes', () => {
  it('loads names, skips comments and blank lines', () => {
    const nodes = loadKnownNodes('tools/observer-known-nodes.txt');
    expect(nodes).toContain('brainstorming');
    expect(nodes).toContain('discovery-interview');
    expect(nodes.every((n) => !n.startsWith('#') && n.length > 0)).toBe(true);
  });

  it('returns an empty array for a missing file', () => {
    expect(loadKnownNodes('tools/does-not-exist.txt')).toEqual([]);
  });
});
  • Step 3: Run the test to verify it fails

Run: node app/node_modules/vitest/vitest.mjs --config app/vitest.config.tools.mjs run observer-routing-detector
Expected: FAIL (Failed to load ./observer-routing-detector.mjs).

  • Step 4: Create the detector module

Create tools/observer-routing-detector.mjs:

#!/usr/bin/env node
/**
 * Routing-gate method-direction detector (brain governance, observer
 * factor-analysis spec §5.1). Pure — given a user-prompt text and a list of
 * known node names, decides whether the user *dictated* a specific method.
 * Conservative-broad: a directive verb within a 40-char window before a node
 * name, or a /slash-command form.
 *
 * Security Guidance #40: pure string ops — no exec/execSync.
 */
import { readFileSync, existsSync } from 'fs';

const KNOWN_NODES_PATH = 'tools/observer-known-nodes.txt';

const DIRECTIVE_VERBS = [
  'запусти', 'запускай', 'используй', 'вызови', 'вызывай', 'прогони',
  'применяй', 'применить', 'через', 'run', 'use', 'invoke', 'via',
];

/** Load the directable node names from the data file (# comments / blanks skipped). */
export function loadKnownNodes(path = KNOWN_NODES_PATH) {
  if (!existsSync(path)) return [];
  const out = [];
  for (const line of readFileSync(path, 'utf-8').split('\n')) {
    const t = line.trim();
    if (!t || t.startsWith('#')) continue;
    out.push(t);
  }
  return out;
}

/**
 * @returns {{directed: boolean, node: string|null}}
 */
export function detectMethodDirected(promptText, knownNodes) {
  const text = String(promptText || '').toLowerCase();
  for (const node of knownNodes || []) {
    const n = String(node).toLowerCase();
    if (!n) continue;
    if (text.includes('/' + n)) return { directed: true, node };
    const idx = text.indexOf(n);
    if (idx === -1) continue;
    const before = text.slice(Math.max(0, idx - 40), idx);
    if (DIRECTIVE_VERBS.some((v) => before.includes(v))) return { directed: true, node };
  }
  return { directed: false, node: null };
}

if (process.argv[1] && process.argv[1].replace(/\\/g, '/').endsWith('/observer-routing-detector.mjs')) {
  const det = detectMethodDirected(process.argv.slice(2).join(' '), loadKnownNodes());
  console.log(JSON.stringify(det));
  process.exit(0);
}
  • Step 5: Run the test to verify it passes

Run: node app/node_modules/vitest/vitest.mjs --config app/vitest.config.tools.mjs run observer-routing-detector
Expected: PASS (all 8 tests green).
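The window rule is easy to misread, so here is a minimal, self-contained sketch of it. The names `VERBS`, `WINDOW` and `isDirected` are illustrative assumptions, not the module's exports, and the verb list is a shortened English-only stand-in. The sketch only mirrors the substring-and-window logic to show why the detector is "conservative-broad": verbs are matched as substrings, so `use` is also found inside `because`.

```javascript
// Hypothetical, English-only verb list for illustration.
const VERBS = ['run', 'use', 'invoke', 'via'];
const WINDOW = 40; // characters inspected before the node name

// True when the prompt contains "/node" or a verb inside the window.
function isDirected(prompt, node) {
  const text = String(prompt || '').toLowerCase();
  const n = String(node).toLowerCase();
  if (text.includes('/' + n)) return true;
  const idx = text.indexOf(n);
  if (idx === -1) return false;
  const before = text.slice(Math.max(0, idx - WINDOW), idx);
  return VERBS.some((v) => before.includes(v));
}

console.log(isDirected('please run brainstorming on this', 'brainstorming')); // true
console.log(isDirected('what is brainstorming?', 'brainstorming'));           // false
// Substring match: "use" inside "because" counts as a directive verb.
console.log(isDirected('because brainstorming helps', 'brainstorming'));      // true
```

A false positive here costs one extra routing tag, whereas a false negative loses a provenance record, which is presumably why the broad match is acceptable.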

  • Step 6: Commit
git add tools/observer-known-nodes.txt tools/observer-routing-detector.mjs tools/observer-routing-detector.test.mjs
git commit -m "feat(observer): routing-gate method-direction detector"

Task 4: Stop-hook — v2 episode + observer_error marker

Spec §3 (observer_error marker), §5.2 (visibility of failure). Updates appendEpisode to validate the v2 schema and accept the minimal observer_error marker; updates buildEpisodeFromContext to produce v2 episodes on the fallback path.

Files:

  • Modify: tools/observer-stop-hook.mjs

  • Test: tools/observer-stop-hook.test.mjs

  • Step 1: Rewrite the test file fixtures + write failing tests

Replace the entire contents of tools/observer-stop-hook.test.mjs with:

import { describe, it, expect, beforeEach, afterEach } from 'vitest';
import { writeFileSync, readFileSync, existsSync, mkdtempSync, rmSync, mkdirSync } from 'fs';
import { join } from 'path';
import { tmpdir } from 'os';
import { appendEpisode, buildEpisodeFromContext, buildObserverError } from './observer-stop-hook.mjs';

let workdir;

beforeEach(() => {
  workdir = mkdtempSync(join(tmpdir(), 'observer-test-'));
  mkdirSync(join(workdir, 'docs', 'observer'), { recursive: true });
});

afterEach(() => {
  rmSync(workdir, { recursive: true, force: true });
});

const defaultRat = () => ({
  step: 1,
  node_chosen: '#1',
  triggers_matched: [],
  candidates_considered: [],
  boundaries_applied: [],
  hard_floor: { invoked: false, rules: [] },
  task_classification: 'other',
});

// Full schema-v2 episode fixture.
const v2Episode = (overrides = {}) => ({
  schema_version: 2,
  task_id: 'abc-123',
  task_ref: 'abc-123',
  timestamps: { started_at: '2026-05-19T10:00:00+03:00', ended_at: '2026-05-19T10:05:00+03:00' },
  path_type: 'regulated',
  outcome: 'unknown',
  prompt_signal: 'neutral',
  decision_provenance: { kind: 'autonomous', claude_would_have_chosen: null },
  environment: { economy_level: 0, model: 'claude-opus-4-7', post_compaction: false, session_turn: 1, parallel_session: false },
  task_size: { tool_calls: 0, files_touched: 0, files: [] },
  primary_rationale: defaultRat(),
  events: [],
  ...overrides,
});

describe('appendEpisode', () => {
  it('appends one JSONL line to the monthly file', () => {
    appendEpisode(v2Episode(), workdir, '2026-05');
    const content = readFileSync(join(workdir, 'docs', 'observer', 'episodes-2026-05.jsonl'), 'utf-8');
    expect(content).toContain('"task_id":"abc-123"');
    expect(content).toContain('"schema_version":2');
    expect(content.endsWith('\n')).toBe(true);
  });

  it('appends to an existing file without overwrite', () => {
    appendEpisode(v2Episode({ task_id: 'a' }), workdir, '2026-05');
    appendEpisode(v2Episode({ task_id: 'b', outcome: 'partial' }), workdir, '2026-05');
    const lines = readFileSync(join(workdir, 'docs', 'observer', 'episodes-2026-05.jsonl'), 'utf-8').trim().split('\n');
    expect(lines).toHaveLength(2);
    expect(JSON.parse(lines[0]).task_id).toBe('a');
    expect(JSON.parse(lines[1]).task_id).toBe('b');
  });

  it('applies the PII filter before write (including events[])', () => {
    appendEpisode(
      v2Episode({ events: [{ kind: 'error', message: 'call +79991234567 / mail x@y.com' }] }),
      workdir,
      '2026-05'
    );
    const content = readFileSync(join(workdir, 'docs', 'observer', 'episodes-2026-05.jsonl'), 'utf-8');
    expect(content).toContain('+7XXXXXXXXXX');
    expect(content).toContain('***@***');
    expect(content).not.toContain('79991234567');
  });

  it('throws on a missing required field', () => {
    expect(() => appendEpisode({}, workdir, '2026-05')).toThrow(/required/i);
  });

  it('throws on a missing schema-v2 field', () => {
    const ep = v2Episode();
    delete ep.decision_provenance;
    expect(() => appendEpisode(ep, workdir, '2026-05')).toThrow(/schema v2 field missing/i);
  });

  it('throws when schema_version is not 2', () => {
    expect(() => appendEpisode(v2Episode({ schema_version: 1 }), workdir, '2026-05')).toThrow(/schema_version/i);
  });

  it('throws when a primary_rationale sub-field is missing', () => {
    expect(() =>
      appendEpisode(v2Episode({ primary_rationale: { step: 1, node_chosen: '#1' } }), workdir, '2026-05')
    ).toThrow(/primary_rationale field missing/i);
  });

  it('accepts a minimal observer_error marker', () => {
    appendEpisode(
      {
        schema_version: 2,
        observer_error: true,
        error_message: 'parser blew up',
        timestamps: { started_at: '2026-05-19T10:00:00Z', ended_at: '2026-05-19T10:00:00Z' },
        task_id: 'err-1',
      },
      workdir,
      '2026-05'
    );
    const line = JSON.parse(readFileSync(join(workdir, 'docs', 'observer', 'episodes-2026-05.jsonl'), 'utf-8').trim());
    expect(line.observer_error).toBe(true);
    expect(line.error_message).toBe('parser blew up');
  });

  it('throws when an observer_error marker is missing a field', () => {
    expect(() =>
      appendEpisode({ schema_version: 2, observer_error: true, task_id: 'x' }, workdir, '2026-05')
    ).toThrow(/observer_error marker field missing/i);
  });
});

describe('buildEpisodeFromContext', () => {
  it('builds a v2 episode on the fallback path (no transcript)', () => {
    const ep = buildEpisodeFromContext({ session_id: 'sess-1', result: 'success' });
    expect(ep.schema_version).toBe(2);
    expect(ep.task_id).toBe('sess-1');
    expect(ep.task_ref).toBe('sess-1');
    expect(ep.outcome).toBe('success');
    expect(ep.decision_provenance).toEqual({ kind: 'autonomous', claude_would_have_chosen: null });
    expect(ep.environment).toEqual({
      economy_level: null,
      model: null,
      post_compaction: false,
      session_turn: 0,
      parallel_session: false,
    });
    expect(ep.task_size).toEqual({ tool_calls: 0, files_touched: 0, files: [] });
  });

  it('defaults outcome to unknown when none supplied', () => {
    expect(buildEpisodeFromContext({ session_id: 'x' }).outcome).toBe('unknown');
  });

  it('derives a v2 episode from transcriptText when provided', () => {
    const transcript = [
      JSON.stringify({ type: 'user', message: { role: 'user', content: 'fix the bug' }, timestamp: '2026-05-19T10:00:00Z', sessionId: 'sess-t' }),
      JSON.stringify({ type: 'assistant', message: { role: 'assistant', content: [{ type: 'tool_use', id: 't1', name: 'Skill', input: { skill: 'superpowers:systematic-debugging' } }] }, timestamp: '2026-05-19T10:01:00Z', sessionId: 'sess-t' }),
    ].join('\n');
    const ep = buildEpisodeFromContext({ session_id: 'sess-t' }, transcript);
    expect(ep.schema_version).toBe(2);
    expect(ep.task_id).toBe('sess-t');
    expect(ep.primary_rationale.node_chosen).toBe('superpowers:systematic-debugging');
  });
});

describe('buildObserverError', () => {
  it('produces a minimal valid observer_error marker', () => {
    const marker = buildObserverError({ session_id: 'sess-e' }, new Error('boom'));
    expect(marker.observer_error).toBe(true);
    expect(marker.schema_version).toBe(2);
    expect(marker.task_id).toBe('sess-e');
    expect(marker.error_message).toContain('boom');
    expect(marker.timestamps.started_at).toBeTruthy();
  });
});
  • Step 2: Run the tests to verify they fail

Run: node app/node_modules/vitest/vitest.mjs --config app/vitest.config.tools.mjs run observer-stop-hook
Expected: FAIL. buildObserverError is not exported, the v2-validation tests fail (the current appendEpisode does not validate v2 fields), and buildEpisodeFromContext lacks the v2 fields.

  • Step 3: Update appendEpisode with v2 validation + observer_error branch

In tools/observer-stop-hook.mjs, replace the REQUIRED_FIELDS constant and the appendEpisode function (currently lines 22 and 42-68) with:

const REQUIRED_FIELDS = ['task_id', 'timestamps', 'path_type', 'outcome', 'primary_rationale'];
const V2_FIELDS = ['schema_version', 'decision_provenance', 'environment', 'task_size', 'task_ref'];
const OBSERVER_ERROR_FIELDS = ['schema_version', 'error_message', 'timestamps', 'task_id'];

(Leave the existing RATIONALE_FIELDS constant and validateRationale function unchanged.)

Then replace the appendEpisode function and its JSDoc with:

/**
 * Append a single episode to the monthly JSONL file.
 * Validates either a full schema-v2 episode or a minimal observer_error marker.
 * @param {object} episode - The episode object.
 * @param {string} baseDir - Repository root (default: process.cwd()).
 * @param {string} month   - YYYY-MM string for the file name (default: current UTC month).
 */
export function appendEpisode(episode, baseDir = process.cwd(), month = currentMonth()) {
  const dir = join(baseDir, 'docs', 'observer');
  if (!existsSync(dir)) {
    mkdirSync(dir, { recursive: true });
  }
  const file = join(dir, `episodes-${month}.jsonl`);

  if (episode && episode.observer_error === true) {
    for (const f of OBSERVER_ERROR_FIELDS) {
      if (episode[f] === undefined) {
        throw new Error(`observer_error marker field missing: ${f}`);
      }
    }
    appendFileSync(file, JSON.stringify(sanitize(episode)) + '\n', 'utf-8');
    return;
  }

  for (const f of REQUIRED_FIELDS) {
    if (episode[f] === undefined) {
      throw new Error(`required field missing: ${f}`);
    }
  }
  for (const f of V2_FIELDS) {
    if (episode[f] === undefined) {
      throw new Error(`schema v2 field missing: ${f}`);
    }
  }
  if (episode.schema_version !== 2) {
    throw new Error(`schema_version must be 2 (got ${episode.schema_version})`);
  }
  validateRationale(episode.primary_rationale);

  appendFileSync(file, JSON.stringify(sanitize(episode)) + '\n', 'utf-8');
}
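Because the monthly file now mixes full v2 episodes with minimal observer_error markers, any consumer of the log has to branch on the marker flag before touching episode fields. The following is a minimal sketch of such a reader. `splitLog` is a hypothetical helper, not one of the plan's modules, and it assumes one JSON object per line, as appendEpisode writes them.

```javascript
// Hypothetical reader: separates full episodes from observer_error markers.
function splitLog(jsonlText) {
  const episodes = [];
  const markers = [];
  for (const line of String(jsonlText).split('\n')) {
    if (!line.trim()) continue; // skip the trailing empty line
    const rec = JSON.parse(line);
    (rec.observer_error === true ? markers : episodes).push(rec);
  }
  return { episodes, markers };
}

const sample =
  JSON.stringify({ schema_version: 2, task_id: 'a', outcome: 'unknown' }) + '\n' +
  JSON.stringify({ schema_version: 2, observer_error: true, task_id: 'err-1', error_message: 'boom' }) + '\n';

const { episodes, markers } = splitLog(sample);
console.log(episodes.length, markers.length); // 1 1
```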
  • Step 4: Update buildEpisodeFromContext to produce v2 + add buildObserverError

In tools/observer-stop-hook.mjs, replace the buildEpisodeFromContext function and its JSDoc (currently lines 70-103) with:

/**
 * Build a well-formed schema-v2 episode from a Claude Code Stop-event context.
 * Preferred path: when `transcriptText` is supplied, the episode is derived
 * from the real session transcript via parseTranscript. Fallback path: v2
 * defaults from `ctx` (an explicit ctx.primary_rationale is preserved verbatim).
 * @param {object} ctx - Raw context from stdin (may be partial).
 * @param {string|null} transcriptText - Raw transcript JSONL, if readable.
 * @returns {object} v2 episode.
 */
export function buildEpisodeFromContext(ctx = {}, transcriptText = null) {
  if (transcriptText) {
    return parseTranscript(transcriptText, ctx.session_id || ctx.sessionId || ctx.task_id);
  }
  const sid = ctx.session_id || ctx.sessionId || ctx.task_id || `unknown-${Date.now()}`;
  const now = new Date().toISOString();
  return {
    schema_version: 2,
    task_id: sid,
    task_ref: sid,
    timestamps: {
      started_at: ctx.started || ctx.started_at || now,
      ended_at: ctx.ended || ctx.ended_at || now,
    },
    path_type: ctx.path_type || 'regulated',
    outcome: ctx.result || ctx.outcome || 'unknown',
    prompt_signal: ctx.prompt_signal || 'neutral',
    decision_provenance: ctx.decision_provenance || { kind: 'autonomous', claude_would_have_chosen: null },
    environment: ctx.environment || {
      economy_level: null,
      model: null,
      post_compaction: false,
      session_turn: 0,
      parallel_session: false,
    },
    task_size: ctx.task_size || { tool_calls: 0, files_touched: 0, files: [] },
    primary_rationale: ctx.primary_rationale || {
      step: 1,
      node_chosen: ctx.node_chosen || ctx.skill_id || 'unknown',
      triggers_matched: [],
      candidates_considered: [],
      boundaries_applied: [],
      hard_floor: ctx.hard_floor || { invoked: false, rules: [] },
      task_classification: ctx.task_classification || 'other',
    },
    events: ctx.events || [],
  };
}

/**
 * Build a minimal observer_error marker — written instead of a silent skip
 * when the Stop-hook fails internally (spec §3 / §5.2).
 */
export function buildObserverError(ctx = {}, err) {
  const now = new Date().toISOString();
  return {
    schema_version: 2,
    observer_error: true,
    error_message: String((err && err.message) || err),
    timestamps: { started_at: now, ended_at: now },
    task_id: ctx.session_id || ctx.sessionId || ctx.task_id || `unknown-${Date.now()}`,
  };
}
  • Step 5: Run the tests to verify they pass

Run: node app/node_modules/vitest/vitest.mjs --config app/vitest.config.tools.mjs run observer-stop-hook
Expected: PASS (all appendEpisode / buildEpisodeFromContext / buildObserverError tests green).
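One property of the fallback path is worth keeping in mind: `ctx.environment || {...}` substitutes the defaults only when the whole object is absent, so a partial `ctx.environment` is kept exactly as supplied. The sketch below contrasts that with a per-field merge. `ENV_DEFAULTS`, `wholeObject` and `perField` are illustrative names, and the merge variant is a hypothetical alternative, not something the plan or the spec prescribes.

```javascript
const ENV_DEFAULTS = {
  economy_level: null,
  model: null,
  post_compaction: false,
  session_turn: 0,
  parallel_session: false,
};

// Behaviour of the fallback path: defaults apply only when the object is absent.
const wholeObject = (ctx) => ctx.environment || ENV_DEFAULTS;

// Hypothetical alternative (not prescribed by the plan): per-field merge.
const perField = (ctx) => ({ ...ENV_DEFAULTS, ...(ctx.environment || {}) });

const ctx = { environment: { model: 'claude-opus-4-7' } };
console.log(Object.keys(wholeObject(ctx)).length); // 1
console.log(Object.keys(perField(ctx)).length);    // 5
```

Since appendEpisode only checks that the top-level `environment` key exists, a partial object passes validation either way; the difference would only show up later, when the factor matrix reads individual environment fields.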

  • Step 6: Commit
git add tools/observer-stop-hook.mjs tools/observer-stop-hook.test.mjs
git commit -m "feat(observer): Stop-hook v2 episode + observer_error marker"

Task 5: Stop-hook — routing-gate enforcement

Spec §5.1 (3a routing-gate). Adds the pure routingGateDecision function and wires it + the observer_error fallback into the CLI block. The gate blocks at most once per turn (stop_hook_active guard prevents an infinite loop).

Files:

  • Modify: tools/observer-stop-hook.mjs

  • Test: tools/observer-stop-hook.test.mjs

  • Step 1: Write failing tests for routingGateDecision

In tools/observer-stop-hook.test.mjs, extend the import line to add routingGateDecision:

import { appendEpisode, buildEpisodeFromContext, buildObserverError, routingGateDecision } from './observer-stop-hook.mjs';

Append this describe block at the end of the file:

describe('routingGateDecision', () => {
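  const NODES = ['discovery-interview', 'brainstorming'];
  // Test prompts are Russian on purpose: 'запусти discovery-interview' means
  // 'run discovery-interview'; 'добавь колонку Город' means 'add a City column'.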
  const autonomousEp = v2Episode();
  const taggedEp = v2Episode({ decision_provenance: { kind: 'user_directed_method', claude_would_have_chosen: 'brainstorming' } });

  it('blocks when a method was directed but no routing tag is present', () => {
    const gate = routingGateDecision(autonomousEp, 'запусти discovery-interview', NODES, false);
    expect(gate.block).toBe(true);
    expect(gate.reason).toContain('discovery-interview');
  });

  it('does not block when the routing tag is present', () => {
    const gate = routingGateDecision(taggedEp, 'запусти discovery-interview', NODES, false);
    expect(gate.block).toBe(false);
  });

  it('does not block when no method was directed', () => {
    const gate = routingGateDecision(autonomousEp, 'добавь колонку Город', NODES, false);
    expect(gate.block).toBe(false);
  });

  it('does not block when stop_hook_active is true (loop guard)', () => {
    const gate = routingGateDecision(autonomousEp, 'запусти discovery-interview', NODES, true);
    expect(gate.block).toBe(false);
  });
});
  • Step 2: Run the tests to verify they fail

Run: node app/node_modules/vitest/vitest.mjs --config app/vitest.config.tools.mjs run observer-stop-hook
Expected: FAIL (routingGateDecision is not exported).

  • Step 3: Add the imports and routingGateDecision function

In tools/observer-stop-hook.mjs, add to the imports at the top (after the existing parseTranscript import line):

import { parseTranscript, extractLastUserPromptText } from './observer-transcript-parser.mjs';
import { detectMethodDirected, loadKnownNodes } from './observer-routing-detector.mjs';

(Replace the existing import { parseTranscript } from './observer-transcript-parser.mjs'; line — extractLastUserPromptText is now also imported.)

Add the routingGateDecision function after buildObserverError:

/**
 * Routing-gate decision (spec §5.1, 3a). Pure — the CLI calls this.
 * Blocks the Stop-event (decision: block) when the user dictated a method
 * but the turn carries no routing tag. Skipped when stop_hook_active is true
 * (the gate fires at most once per turn — no infinite loop).
 * @returns {{block: boolean, reason: string|null}}
 */
export function routingGateDecision(episode, promptText, knownNodes, stopHookActive) {
  if (stopHookActive) return { block: false, reason: null };
  const det = detectMethodDirected(promptText, knownNodes);
  if (!det.directed) return { block: false, reason: null };
  if (episode && episode.decision_provenance && episode.decision_provenance.kind === 'user_directed_method') {
    return { block: false, reason: null };
  }
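  // English gloss of the Russian reason text: "It looks like the method was
  // dictated by the user (node "<node>"), but this turn has no routing tag.
  // Add exactly one line to your reply:" followed by the tag template, whose
  // counterfactual placeholder reads "the node you would have chosen autonomously".
  return {
    block: true,
    reason:
      `[observer routing-gate] Похоже, метод навязан пользователем (узел "${det.node}"), ` +
      `но routing-тег в этом ходе отсутствует. Добавь в свой ответ ровно одну строку:\n` +
      `<!-- routing: provenance=user_directed_method node=${det.node} ` +
      `counterfactual=<узел, который ты выбрал бы автономно> -->`,
  };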
}
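The reason string above spells out the exact one-line tag the gate asks for. As an illustration of how compact that format is, here is a sketch of a parser for it. `TAG_RE` and `parseRoutingTag` are hypothetical: the actual parsing lives in tools/observer-transcript-parser.mjs and may differ, and the sketch assumes node names contain no whitespace.

```javascript
// Hypothetical parser for the one-line routing tag.
const TAG_RE = /<!--\s*routing:\s*provenance=(\S+)\s+node=(\S+)\s+counterfactual=(\S+)\s*-->/;

function parseRoutingTag(text) {
  const m = TAG_RE.exec(String(text || ''));
  if (!m) return null;
  return { provenance: m[1], node: m[2], counterfactual: m[3] };
}

const reply =
  'Starting the interview.\n' +
  '<!-- routing: provenance=user_directed_method node=discovery-interview counterfactual=brainstorming -->';

console.log(parseRoutingTag(reply));
console.log(parseRoutingTag('no tag here')); // null
```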
  • Step 4: Rewrite the CLI block to wire the gate + observer_error fallback

In tools/observer-stop-hook.mjs, replace the entire CLI block (currently lines 110-142, from if (process.argv[1] && ... to the closing }) with:

// CLI entry point: read JSON context from stdin (Claude Code Stop-event hook contract)
if (process.argv[1] && process.argv[1].replace(/\\/g, '/').endsWith('/observer-stop-hook.mjs')) {
  const chunks = [];
  process.stdin.on('data', (c) => chunks.push(c));
  process.stdin.on('end', () => {
    let ctx = {};
    try {
      const raw = Buffer.concat(chunks).toString('utf-8');
      if (raw.trim()) ctx = JSON.parse(raw);
    } catch (_e) {
      // best-effort: build a minimal episode even if stdin is malformed
    }
    // Claude Code's Stop-event supplies transcript_path — the real source of
    // session data. Read it best-effort; fall back to ctx-only on any error.
    let transcriptText = null;
    const tp = ctx.transcript_path || ctx.transcriptPath;
    if (tp) {
      try {
        if (existsSync(tp)) transcriptText = readFileSync(tp, 'utf-8');
      } catch (_e) {
        transcriptText = null;
      }
    }
    try {
      const ep = buildEpisodeFromContext(ctx, transcriptText);
      // Always write the episode first — exit-0-safe (spec §5.1 step 1).
      appendEpisode(ep);
      // Then the routing-gate (spec §5.1 steps 2-4).
      if (transcriptText) {
        const promptText = extractLastUserPromptText(transcriptText);
        const gate = routingGateDecision(ep, promptText, loadKnownNodes(), ctx.stop_hook_active === true);
        if (gate.block) {
          process.stdout.write(JSON.stringify({ decision: 'block', reason: gate.reason }));
          process.exit(0);
        }
      }
      process.exit(0);
    } catch (err) {
      // Visible failure (spec §5.2): write an observer_error marker, never a silent skip.
      try {
        appendEpisode(buildObserverError(ctx, err));
      } catch (_e2) {
        // last-resort: even the marker failed — do not crash the Stop-event
      }
      console.error(`[observer-stop-hook] error: ${err.message}`);
      process.exit(0); // never block the Stop-event on an internal error
    }
  });
}
  • Step 5: Run the tests to verify they pass

Run: node app/node_modules/vitest/vitest.mjs --config app/vitest.config.tools.mjs run observer-stop-hook
Expected: PASS (all tests green, including the 4 routingGateDecision tests).
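The CLI block is not covered by the unit tests, so its ordering guarantees (episode first, gate second, marker on failure) are easiest to see in a reduced form. The sketch below restates that control flow as a pure function with injected dependencies. `runStopHook` and its four callbacks are hypothetical names for illustration only; the real block reads stdin and calls `process.exit`.

```javascript
// Sketch of the CLI ordering with injected, hypothetical dependencies:
//   build()         -> episode object (may throw)
//   write(record)   -> appends a record to the log
//   gate(episode)   -> { block, reason }
//   makeMarker(err) -> observer_error marker
function runStopHook({ build, write, gate, makeMarker }) {
  try {
    const ep = build();
    write(ep); // 1. evidence is recorded before any gating
    const g = gate(ep); // 2. the gate can only add a block on top
    return g.block ? { decision: 'block', reason: g.reason } : null;
  } catch (err) {
    try {
      write(makeMarker(err)); // visible failure instead of a silent skip
    } catch (_e) {
      // last resort: even the marker failed
    }
    return null; // an internal error never blocks the Stop event
  }
}

const log = [];
const result = runStopHook({
  build: () => { throw new Error('parser blew up'); },
  write: (rec) => log.push(rec),
  gate: () => ({ block: false, reason: null }),
  makeMarker: (err) => ({ observer_error: true, error_message: err.message }),
});
console.log(result, log);
```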

  • Step 6: Manual CLI smoke against the real transcript

In PowerShell, from the repo root, run (replace the path with the current session's real transcript JSONL under C:\Users\Administrator\.claude\projects\...):

'{"session_id":"smoke","transcript_path":"C:/Users/Administrator/.claude/projects/c---------------------crm-------------/553717ec-bf55-43dc-8b9c-b9812711023a.jsonl"}' | node tools/observer-stop-hook.mjs

Expected: the command exits 0; the last line of docs/observer/episodes-2026-05.jsonl is a populated v2 episode ("schema_version":2, a real task_id, non-empty environment / task_size). Then revert that smoke line so it does not pollute the evidence log:

$f = 'docs/observer/episodes-2026-05.jsonl'
$lines = Get-Content $f
Set-Content $f -Value ($lines[0..($lines.Count - 2)]) -Encoding utf8

(If episodes-2026-05.jsonl had only the smoke line, delete the file instead.)
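A cross-platform way to do the same revert is sketched below. One reason to consider it: in Windows PowerShell 5.1, `Set-Content -Encoding utf8` writes a byte-order mark, which can trip a JSONL reader on the first line, whereas PowerShell 7+ and Node write UTF-8 without one. `dropLastLine` is a hypothetical helper, not part of the plan's tooling.

```javascript
// Hypothetical helper: remove the last non-empty line of a JSONL string.
function dropLastLine(jsonlText) {
  const lines = String(jsonlText).split('\n').filter((l) => l.trim() !== '');
  lines.pop();
  return lines.length ? lines.join('\n') + '\n' : '';
}

console.log(JSON.stringify(dropLastLine('{"a":1}\n{"smoke":true}\n'))); // "{\"a\":1}\n"
console.log(JSON.stringify(dropLastLine('{"smoke":true}\n')));          // ""
```

Applied to the real file, this would be wrapped in readFileSync / writeFileSync; an empty result means the file held only the smoke line and can be deleted instead.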

  • Step 7: Commit
git add tools/observer-stop-hook.mjs tools/observer-stop-hook.test.mjs
git commit -m "feat(observer): Stop-hook routing-gate enforcement"

Task 6: C5 — observer-coverage-checker

Spec §5.2 (3b: coverage control + registration integrity). A warn-only controller (always exits 0) that flags observer coverage gaps and broken registration. Findings are surfaced in STATUS.md by Task 7 and never block a commit: spec §5.2 says "флаг" ("flag"), not "блокирует" ("blocks"). The unbypassable enforcement is the routing-gate of Task 5.

Files:

  • Create: tools/observer-coverage-checker.mjs

  • Test: tools/observer-coverage-checker.test.mjs

  • Step 1: Write the failing test

Create tools/observer-coverage-checker.test.mjs:

import { describe, it, expect } from 'vitest';
import { checkCoverage, checkRegistration } from './observer-coverage-checker.mjs';

describe('checkCoverage', () => {
  it('flags recent commits but zero episodes', () => {
    const r = checkCoverage(0, 7);
    expect(r.ok).toBe(false);
    expect(r.detail).toContain('0 observer episodes');
  });

  it('is ok when episodes exist', () => {
    expect(checkCoverage(5, 7).ok).toBe(true);
  });

  it('is ok when there is no recent git activity', () => {
    expect(checkCoverage(0, 0).ok).toBe(true);
  });
});

describe('checkRegistration', () => {
  const goodSettings = {
    hooks: { Stop: [{ hooks: [{ type: 'command', command: 'node tools/observer-stop-hook.mjs' }] }] },
  };

  it('is ok when the Stop-hook is registered and post-commit exists', () => {
    const r = checkRegistration(goodSettings, true);
    expect(r.ok).toBe(true);
  });

  it('flags a missing Stop-hook registration', () => {
    const r = checkRegistration({ hooks: { Stop: [] } }, true);
    expect(r.ok).toBe(false);
    expect(r.detail).toContain('observer-stop-hook NOT registered');
  });

  it('flags a missing post-commit hook', () => {
    const r = checkRegistration(goodSettings, false);
    expect(r.ok).toBe(false);
    expect(r.detail).toContain('post-commit');
  });

  it('handles an empty settings object', () => {
    expect(checkRegistration({}, false).ok).toBe(false);
  });
});
  • Step 2: Run the test to verify it fails

Run: node app/node_modules/vitest/vitest.mjs --config app/vitest.config.tools.mjs run observer-coverage-checker
Expected: FAIL (Failed to load ./observer-coverage-checker.mjs).

  • Step 3: Create the controller module

Create tools/observer-coverage-checker.mjs:

#!/usr/bin/env node
/**
 * C5 observer-coverage-checker (brain governance, observer factor-analysis
 * spec §5.2). Warn-only — always exits 0. Two checks:
 *   1. Coverage — recent git commits but 0 observer episodes this month.
 *   2. Registration integrity — observer Stop-hook present in
 *      .claude/settings.json and .git/hooks/post-commit installed.
 * Findings are surfaced in docs/observer/STATUS.md (C4 generator); this
 * controller never blocks a commit.
 *
 * Security Guidance #40: git is invoked via execFileSync (argument array,
 * no shell) — no exec/execSync.
 */
import { readFileSync, existsSync } from 'fs';
import { join } from 'path';
import { execFileSync } from 'child_process';

const RECENT_WINDOW = '14 days ago';

/** @returns {{ok: boolean, detail: string}} */
export function checkCoverage(episodeCount, recentCommitCount) {
  if (recentCommitCount > 0 && episodeCount === 0) {
    return {
      ok: false,
      detail: `${recentCommitCount} commit(s) in the last 2 weeks but 0 observer episodes this month`,
    };
  }
  return { ok: true, detail: `${episodeCount} episode(s), ${recentCommitCount} recent commit(s)` };
}

/** @returns {{ok: boolean, detail: string}} */
export function checkRegistration(settingsJson, postCommitExists) {
  const problems = [];
  const stopHooks = (((settingsJson || {}).hooks || {}).Stop) || [];
  const hasObserverStop = stopHooks.some((entry) =>
    ((entry && entry.hooks) || []).some((h) => String((h && h.command) || '').includes('observer-stop-hook'))
  );
  if (!hasObserverStop) {
    problems.push('observer-stop-hook NOT registered in .claude/settings.json Stop hook');
  }
  if (!postCommitExists) {
    problems.push('.git/hooks/post-commit not installed (run: npx lefthook install --force)');
  }
  return {
    ok: problems.length === 0,
    detail: problems.length ? problems.join('; ') : 'Stop-hook + post-commit OK',
  };
}

function countEpisodes(root) {
  const month = new Date().toISOString().slice(0, 7);
  const file = join(root, 'docs', 'observer', `episodes-${month}.jsonl`);
  if (!existsSync(file)) return 0;
  return readFileSync(file, 'utf-8').trim().split('\n').filter(Boolean).length;
}

function countRecentCommits(root) {
  try {
    const out = execFileSync('git', ['log', `--since=${RECENT_WINDOW}`, '--oneline'], {
      cwd: root,
      encoding: 'utf-8',
      stdio: ['ignore', 'pipe', 'ignore'],
    });
    return out.trim() ? out.trim().split('\n').length : 0;
  } catch {
    return 0;
  }
}

export function runCoverageChecker(root = process.cwd()) {
  const coverage = checkCoverage(countEpisodes(root), countRecentCommits(root));
  let settings = {};
  try {
    settings = JSON.parse(readFileSync(join(root, '.claude', 'settings.json'), 'utf-8'));
  } catch {
    settings = {};
  }
  const registration = checkRegistration(settings, existsSync(join(root, '.git', 'hooks', 'post-commit')));
  return { coverage, registration };
}

if (process.argv[1] && process.argv[1].replace(/\\/g, '/').endsWith('/observer-coverage-checker.mjs')) {
  const { coverage, registration } = runCoverageChecker();
  if (!coverage.ok) console.warn(`[observer-coverage-checker] WARN — coverage: ${coverage.detail}`);
  if (!registration.ok) console.warn(`[observer-coverage-checker] WARN — registration: ${registration.detail}`);
  if (coverage.ok && registration.ok) {
    console.log(`[observer-coverage-checker] OK — ${coverage.detail}; ${registration.detail}`);
  }
  process.exit(0); // warn-only — never blocks a commit
}
  • Step 4: Run the test to verify it passes

Run: node app/node_modules/vitest/vitest.mjs --config app/vitest.config.tools.mjs run observer-coverage-checker
Expected: PASS (all 7 tests green).
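The `.filter(Boolean)` in countEpisodes and the `out.trim() ? ... : 0` guard in countRecentCommits both protect against the same edge case: splitting an empty string still yields one (empty) element. A short sketch, with `countLines` as an illustrative stand-in for the counting expression:

```javascript
// ''.split('\n') yields [''], so a naive length reports 1 line for empty text.
function countLines(text) {
  return String(text).trim().split('\n').filter(Boolean).length;
}

console.log(''.trim().split('\n').length);     // 1
console.log(countLines(''));                   // 0
console.log(countLines('{"a":1}\n{"b":2}\n')); // 2
```

Without the guard, an existing but empty episodes file would count as one episode and mask a coverage gap.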

  • Step 5: Commit
git add tools/observer-coverage-checker.mjs tools/observer-coverage-checker.test.mjs
git commit -m "feat(observer): coverage + registration-integrity controller (C5)"

Task 7: STATUS.md generator — C5 row + observer_error metric

Spec §5.2 (visibility surfaced in STATUS.md). Adds a C5 row to the dashboard table and an observer_error count to the metrics block.

Files:

  • Modify: tools/status-md-generator.mjs

  • Test: tools/status-md-generator.test.mjs

  • Step 1: Update the test file

Replace the entire contents of tools/status-md-generator.test.mjs with:

import { describe, it, expect } from 'vitest';
import { renderStatus } from './status-md-generator.mjs';

const baseInputs = (overrides = {}) => ({
  now: '2026-05-19T10:00:00+03:00',
  c1: { status: 'ok', detail: 'no drift' },
  c2: { status: 'ok', detail: '0 version drift' },
  c3: { status: 'ok', detail: 'last read today' },
  c5: { status: 'ok', detail: 'coverage OK · registration OK' },
  observer: { episodeCount: 12, observerErrors: 0, piiMatches: 0 },
  ...overrides,
});

describe('renderStatus', () => {
  it('renders all 5 controllers + metrics', () => {
    const md = renderStatus(baseInputs());
    expect(md).toContain('# Brain Status');
    expect(md).toContain('| C1 L1-watcher | ✅');
    expect(md).toContain('| C2 Cross-ref consistency | ✅');
    expect(md).toContain('| C3 Observer-of-observer | ✅');
    expect(md).toContain('| C4 Сигнальный статус | ✅');
    expect(md).toContain('| C5 Observer-coverage | ✅');
    expect(md).toContain('12 episodes');
  });

  it('shows a warn status for the coverage controller', () => {
    const md = renderStatus(baseInputs({ c5: { status: 'warn', detail: '3 commits, 0 episodes' } }));
    expect(md).toContain('| C5 Observer-coverage | ⚠️');
  });

  it('shows the observer_error count in the metrics block', () => {
    const md = renderStatus(baseInputs({ observer: { episodeCount: 4, observerErrors: 2, piiMatches: 0 } }));
    expect(md).toContain('2 observer_error markers');
  });

  it('shows a red status for failing controllers', () => {
    const md = renderStatus(baseInputs({ c1: { status: 'fail', detail: '2 plugins not formalized' } }));
    expect(md).toContain('| C1 L1-watcher | 🔴');
  });

  it('mentions the capability-readiness behavioral rule', () => {
    const md = renderStatus(baseInputs());
    expect(md).toContain('capability-readiness');
    expect(md).toContain('feedback_brain_unused_tools_not_problem');
  });
});
  • Step 2: Run the test to verify it fails

Run: node app/node_modules/vitest/vitest.mjs --config app/vitest.config.tools.mjs run status-md-generator
Expected: FAIL (the C5 row and the observer_error markers text are not yet rendered).

  • Step 3: Update renderStatus to render C5 + observer_error

In tools/status-md-generator.mjs, replace the renderStatus function (currently lines 10-32) with:

export function renderStatus(inputs) {
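  // The generated dashboard is in Russian. Table headers: Контролёр = Controller,
  // Состояние = State, Детали = Details. Legend: норма = normal, внимание =
  // attention, действие требуется = action required, не запускалось = not run.
  const { now, c1, c2, c3, c5, observer } = inputs;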
  return `# Brain Status (auto-generated)

Last updated: ${now}

| Контролёр | Состояние | Детали |
|---|---|---|
| C1 L1-watcher | ${iconFor(c1.status)} | ${c1.detail || '—'} |
| C2 Cross-ref consistency | ${iconFor(c2.status)} | ${c2.detail || '—'} |
| C3 Observer-of-observer | ${iconFor(c3.status)} | ${c3.detail || '—'} |
| C4 Сигнальный статус | ✅ | This file (self-reference) |
| C5 Observer-coverage | ${iconFor(c5.status)} | ${c5.detail || '—'} |

## Метрики (информационные, не алерты)

- Observer evidence: ${observer.episodeCount} episodes this month, ${observer.observerErrors} observer_error markers, ${observer.piiMatches} PII matches before filter
- Использование узлов: см. \`/brain-retro\` (раз в спринт). **Неиспользованные узлы — не проблема** (capability-readiness; см. memory \`feedback_brain_unused_tools_not_problem\` — outside-repo memory store).

## Алерт-индикаторы

✅ — норма ・ ⚠️ — внимание ・ 🔴 — действие требуется ・ ⚪ — не запускалось
`;
}
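renderStatus relies on the existing iconFor helper, which this plan leaves untouched and does not show. For orientation, here is a sketch of a mapping that would be consistent with the tests and the legend. It is an assumption inferred from those two sources; the real helper in tools/status-md-generator.mjs may differ, and `row` is a hypothetical convenience that is not in the plan.

```javascript
// Assumed mapping, inferred from the tests and the legend.
function iconFor(status) {
  switch (status) {
    case 'ok': return '✅';
    case 'warn': return '⚠️';
    case 'fail': return '🔴';
    default: return '⚪'; // controller has not run
  }
}

// Hypothetical helper: one dashboard table row.
const row = (name, c) => `| ${name} | ${iconFor(c.status)} | ${c.detail} |`;

console.log(row('C5 Observer-coverage', { status: 'warn', detail: '3 commits, 0 episodes' }));
```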
  • Step 4: Update the CLI block to compute C5 + observer_error count

In tools/status-md-generator.mjs, add the import at the top (after the existing import { execFileSync } line):

import { runCoverageChecker } from './observer-coverage-checker.mjs';

Add a countObserverErrors function after the existing countEpisodes function:

function countObserverErrors() {
  const dir = 'docs/observer';
  if (!existsSync(dir)) return 0;
  const month = new Date().toISOString().slice(0, 7);
  const file = join(dir, `episodes-${month}.jsonl`);
  if (!existsSync(file)) return 0;
  return readFileSync(file, 'utf-8')
    .trim()
    .split('\n')
    .filter((l) => l.includes('"observer_error":true')).length;
}

Replace the CLI block (currently lines 52-63, from if (process.argv[1] && ...) with:

if (process.argv[1] && process.argv[1].replace(/\\/g, '/').endsWith('/status-md-generator.mjs')) {
  const cov = runCoverageChecker();
  const c5ok = cov.coverage.ok && cov.registration.ok;
  const inputs = {
    now: new Date().toISOString(),
    c1: runControllerNode(['tools/l1-watcher.mjs']),
    c2: runControllerNode(['tools/cross-ref-checker.mjs']),
    c3: runControllerNode(['tools/observer-of-observer.mjs', 'check']),
    c5: {
      status: c5ok ? 'ok' : 'warn',
      detail: [cov.coverage.detail, cov.registration.detail].join(' · '),
    },
    observer: {
      episodeCount: countEpisodes(),
      observerErrors: countObserverErrors(),
      piiMatches: 0,
    },
  };
  const md = renderStatus(inputs);
  writeFileSync('docs/observer/STATUS.md', md);
  console.log(`[status-md-generator] OK — wrote docs/observer/STATUS.md`);
}
  • Step 5: Run the test to verify it passes

Run: node app/node_modules/vitest/vitest.mjs --config app/vitest.config.tools.mjs run status-md-generator
Expected: PASS (all 5 tests green).

  • Step 6: Commit
git add tools/status-md-generator.mjs tools/status-md-generator.test.mjs
git commit -m "feat(observer): STATUS.md — C5 row + observer_error metric"

Task 8: Wire C5 into lefthook

Spec §5.2. Adds the C5 observer-coverage-checker as pre-commit job 15. Warn-only — the script self-guarantees exit 0, so the job never blocks a commit.
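
The warn-only contract can be sketched as a wrapper that converts any finding, and any crash, into a printed line plus exit code 0. This is an illustration only: `runWarnOnly` and the `{ ok, detail }` result shape are assumptions, and the real checker's log format may differ.

```javascript
// Sketch of a warn-only wrapper. `check` is a hypothetical function that
// returns { ok: boolean, detail: string }; the real checker may differ.
function runWarnOnly(name, check) {
  let line;
  try {
    const result = check();
    line = result.ok
      ? `[${name}] OK: ${result.detail}`
      : `[${name}] WARN: ${result.detail}`;
  } catch (err) {
    // Even an internal failure must not block the commit.
    line = `[${name}] WARN: check crashed: ${err.message}`;
  }
  return { line, exitCode: 0 };
}

const demo = runWarnOnly('observer-coverage-checker', () => ({
  ok: false,
  detail: 'git activity but 0 episodes',
}));
console.log(demo.line, demo.exitCode);
```

Because the exit code never depends on the findings, the lefthook job can surface a gap without ever failing the pre-commit run.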

Files:

  • Modify: lefthook.yml

  • Step 1: Add job 15

In lefthook.yml, inside pre-commit.jobs, add this entry directly after the existing job 13 (observer-of-observer) and before the # Post-commit: comment line:

    # 15. observer-coverage-checker — brain governance C5 (observer factor-
    # analysis spec §5.2). Warn-only (script always exits 0). Flags observer
    # coverage gaps (git activity but 0 episodes) + registration-integrity
    # breaks (Stop-hook missing from settings.json, post-commit not installed).
    # Findings surface in docs/observer/STATUS.md C5 row — never blocks a commit.
    - name: observer-coverage-checker
      run: node tools/observer-coverage-checker.mjs
      fail_text: |
        observer-coverage-checker reports a gap (coverage or registration).
        See docs/observer/STATUS.md C5 row for details.

(Job numbering note: the C4 status-md generator is post-commit job 14; the new pre-commit job is numbered 15 to keep brain-governance jobs 11–15 contiguous.)

  • Step 2: Verify lefthook accepts the config and the job runs

Run: npx lefthook run pre-commit

Expected: lefthook lists observer-coverage-checker among the jobs; it prints either [observer-coverage-checker] OK — ... or a WARN line, and the overall pre-commit run does not fail on this job (exit 0 from the script).

  • Step 3: Commit
git add lefthook.yml
git commit -m "chore(observer): wire C5 coverage-checker into lefthook (job 15)"

Task 9: brain-retro analyzer

Spec §6 (Layer 4). A pure, deterministic aggregation module — outcome inference, episode-double-write dedup, episode→task grouping, causal-chain candidates, factor matrix. Read-only — never writes JSONL. The /brain-retro skill (Task 10) calls its CLI.

Files:

  • Create: tools/brain-retro-analyzer.mjs

  • Test: tools/brain-retro-analyzer.test.mjs

  • Step 1: Write the failing test

Create tools/brain-retro-analyzer.test.mjs:

import { describe, it, expect } from 'vitest';
import {
  dedupeEpisodes,
  inferOutcome,
  groupEpisodesToTasks,
  findCausalChains,
  buildFactorMatrix,
  analyze,
} from './brain-retro-analyzer.mjs';

// Minimal v2 episode for tests.
const ep = (overrides = {}) => ({
  schema_version: 2,
  task_id: 's1',
  task_ref: 's1',
  timestamps: { started_at: '2026-05-19T10:00:00Z', ended_at: '2026-05-19T10:05:00Z' },
  path_type: 'regulated',
  outcome: 'unknown',
  prompt_signal: 'neutral',
  decision_provenance: { kind: 'autonomous', claude_would_have_chosen: null },
  environment: { economy_level: 0, model: 'claude-opus-4-7', post_compaction: false, session_turn: 1, parallel_session: false },
  task_size: { tool_calls: 5, files_touched: 1, files: ['/a.js'] },
  primary_rationale: { step: 1, node_chosen: 'direct', triggers_matched: [], candidates_considered: [], boundaries_applied: [], hard_floor: { invoked: false, rules: [] }, task_classification: 'feature' },
  events: [],
  ...overrides,
});

describe('dedupeEpisodes', () => {
  it('keeps the last of two episodes with the same task_id + started_at', () => {
    const a = ep({ outcome: 'unknown' });
    const b = ep({ outcome: 'partial' }); // same task_id + started_at — routing-gate double-write
    const out = dedupeEpisodes([a, b]);
    expect(out).toHaveLength(1);
    expect(out[0].outcome).toBe('partial');
  });

  it('keeps all observer_error markers', () => {
    const out = dedupeEpisodes([ep(), { observer_error: true, task_id: 'e' }, { observer_error: true, task_id: 'e2' }]);
    expect(out.filter((e) => e.observer_error)).toHaveLength(2);
  });
});

describe('inferOutcome', () => {
  it('infers rework when the next episode opens with a correction', () => {
    expect(inferOutcome(ep(), ep({ prompt_signal: 'correction' }))).toBe('rework');
  });
  it('infers success when the next episode opens with approval', () => {
    expect(inferOutcome(ep(), ep({ prompt_signal: 'approval' }))).toBe('success');
  });
  it('infers partial when the episode has an interrupt event', () => {
    expect(inferOutcome(ep({ events: [{ kind: 'interrupt' }] }), ep())).toBe('partial');
  });
  it('infers unknown when there is no next episode', () => {
    expect(inferOutcome(ep(), null)).toBe('unknown');
  });
});

describe('groupEpisodesToTasks', () => {
  it('starts a new task after a success and on a new_task prompt', () => {
    const eps = [
      ep({ timestamps: { started_at: '2026-05-19T10:00:00Z', ended_at: '2026-05-19T10:01:00Z' }, prompt_signal: 'new_task' }),
      ep({ timestamps: { started_at: '2026-05-19T10:02:00Z', ended_at: '2026-05-19T10:03:00Z' }, prompt_signal: 'approval' }),
      ep({ timestamps: { started_at: '2026-05-19T10:04:00Z', ended_at: '2026-05-19T10:05:00Z' }, prompt_signal: 'new_task' }),
    ];
    const tasks = groupEpisodesToTasks(eps);
    expect(tasks.length).toBeGreaterThanOrEqual(2);
  });
});

describe('findCausalChains', () => {
  it('links an errored episode to a later episode that shares a file', () => {
    const a = ep({ timestamps: { started_at: '2026-05-19T10:00:00Z', ended_at: '2026-05-19T10:01:00Z' }, events: [{ kind: 'error', message: 'x' }], task_size: { tool_calls: 1, files_touched: 1, files: ['/shared.js'] } });
    const b = ep({ timestamps: { started_at: '2026-05-19T10:02:00Z', ended_at: '2026-05-19T10:03:00Z' }, task_size: { tool_calls: 1, files_touched: 1, files: ['/shared.js'] } });
    const chains = findCausalChains([a, b]);
    expect(chains).toHaveLength(1);
    expect(chains[0].sharedFiles).toEqual(['/shared.js']);
  });

  it('returns no chain when no files are shared', () => {
    const a = ep({ events: [{ kind: 'error', message: 'x' }], task_size: { tool_calls: 1, files_touched: 1, files: ['/a.js'] } });
    const b = ep({ timestamps: { started_at: '2026-05-19T10:02:00Z', ended_at: '2026-05-19T10:03:00Z' }, task_size: { tool_calls: 1, files_touched: 1, files: ['/b.js'] } });
    expect(findCausalChains([a, b])).toHaveLength(0);
  });
});

describe('buildFactorMatrix', () => {
  it('tabulates outcome distribution per factor value', () => {
    const eps = [
      { ...ep(), _inferredOutcome: 'rework', decision_provenance: { kind: 'user_directed_method' } },
      { ...ep(), _inferredOutcome: 'success', decision_provenance: { kind: 'autonomous' } },
    ];
    const m = buildFactorMatrix(eps);
    expect(m.decision_provenance.user_directed_method.rework).toBe(1);
    expect(m.decision_provenance.autonomous.success).toBe(1);
  });
});

describe('analyze', () => {
  it('returns episodeCount, tasks, causalChains and factorMatrix', () => {
    const result = analyze([ep(), ep({ timestamps: { started_at: '2026-05-19T11:00:00Z', ended_at: '2026-05-19T11:01:00Z' }, prompt_signal: 'correction' })]);
    expect(result.episodeCount).toBe(2);
    expect(result.factorMatrix).toBeDefined();
    expect(Array.isArray(result.tasks)).toBe(true);
    expect(Array.isArray(result.causalChains)).toBe(true);
  });
});
  • Step 2: Run the test to verify it fails

Run: node app/node_modules/vitest/vitest.mjs --config app/vitest.config.tools.mjs run brain-retro-analyzer

Expected: FAIL (Failed to load ./brain-retro-analyzer.mjs).

  • Step 3: Create the analyzer module

Create tools/brain-retro-analyzer.mjs:

#!/usr/bin/env node
/**
 * Brain-retro analyzer (brain governance, observer factor-analysis spec §6).
 * Pure, deterministic Layer-4 aggregation over observer episodes for the
 * /brain-retro skill. Read-only — never writes JSONL. No LLM.
 *
 * Security Guidance #40: pure parsing — no exec/execSync.
 */
import { readFileSync, existsSync } from 'fs';

const SIZE_SMALL = 20;
const SIZE_LARGE = 60;

/**
 * Deduplicate the routing-gate double-write: a turn that was blocked then
 * re-stopped yields two episodes with the same task_id + started_at. Keep the
 * last (most complete). observer_error markers are all kept.
 */
export function dedupeEpisodes(episodes) {
  const errors = episodes.filter((e) => e && e.observer_error);
  const normal = episodes.filter((e) => e && !e.observer_error);
  const byKey = new Map();
  for (const e of normal) {
    byKey.set(`${e.task_id}|${(e.timestamps || {}).started_at}`, e);
  }
  return [...byKey.values(), ...errors];
}

/** Infer the true outcome of an episode from the next episode's opening prompt. */
export function inferOutcome(episode, nextEpisode) {
  if (episode && Array.isArray(episode.events) && episode.events.some((e) => e.kind === 'interrupt')) {
    return 'partial';
  }
  if (!nextEpisode) return 'unknown';
  if (nextEpisode.prompt_signal === 'correction') return 'rework';
  if (nextEpisode.prompt_signal === 'approval' || nextEpisode.prompt_signal === 'new_task') return 'success';
  return 'unknown';
}

function bySessionSorted(episodes) {
  const map = new Map();
  for (const e of episodes) {
    if (e.observer_error) continue;
    const sid = e.task_id || 'unknown';
    if (!map.has(sid)) map.set(sid, []);
    map.get(sid).push(e);
  }
  for (const eps of map.values()) {
    eps.sort((a, b) =>
      String((a.timestamps || {}).started_at).localeCompare(String((b.timestamps || {}).started_at))
    );
  }
  return map;
}

/** Group episodes into tasks: a new task starts after a success or on a new_task prompt. */
export function groupEpisodesToTasks(episodes) {
  const tasks = [];
  for (const [sid, eps] of bySessionSorted(episodes)) {
    let current = null;
    eps.forEach((episode, i) => {
      const prev = eps[i - 1];
      const prevOutcome = prev ? inferOutcome(prev, episode) : null;
      const isNewTask = i === 0 || prevOutcome === 'success' || episode.prompt_signal === 'new_task';
      if (isNewTask) {
        current = { task_ref: `${sid}#${tasks.length + 1}`, episodes: [] };
        tasks.push(current);
      }
      current.episodes.push(episode);
    });
  }
  return tasks;
}

/** Causal-chain candidates: an errored episode → a later episode sharing a file. */
export function findCausalChains(episodes) {
  const sorted = episodes
    .filter((e) => !e.observer_error)
    .slice()
    .sort((a, b) =>
      String((a.timestamps || {}).started_at).localeCompare(String((b.timestamps || {}).started_at))
    );
  const chains = [];
  for (let i = 0; i < sorted.length - 1; i++) {
    const a = sorted[i];
    const hasError = Array.isArray(a.events) && a.events.some((e) => e.kind === 'error');
    if (!hasError) continue;
    const filesA = new Set(((a.task_size || {}).files) || []);
    if (filesA.size === 0) continue;
    for (let j = i + 1; j < sorted.length; j++) {
      const b = sorted[j];
      const shared = (((b.task_size || {}).files) || []).filter((f) => filesA.has(f));
      if (shared.length > 0) {
        chains.push({
          from: `${a.task_id}|${(a.timestamps || {}).started_at}`,
          to: `${b.task_id}|${(b.timestamps || {}).started_at}`,
          sharedFiles: shared,
        });
        break;
      }
    }
  }
  return chains;
}

function sizeBucket(toolCalls) {
  const n = Number(toolCalls) || 0;
  return n < SIZE_SMALL ? 'small' : n <= SIZE_LARGE ? 'medium' : 'large';
}

const FACTOR_FNS = {
  decision_provenance: (e) => (e.decision_provenance || {}).kind || 'unknown',
  economy_level: (e) => String((e.environment || {}).economy_level ?? 'null'),
  model: (e) => (e.environment || {}).model || 'null',
  post_compaction: (e) => String((e.environment || {}).post_compaction ?? false),
  task_size: (e) => sizeBucket((e.task_size || {}).tool_calls),
  node_chosen: (e) => (e.primary_rationale || {}).node_chosen || 'direct',
  task_classification: (e) => (e.primary_rationale || {}).task_classification || 'other',
};

/** Factor matrix: rows = factor values, columns = outcome distribution (spec §6). */
export function buildFactorMatrix(episodesWithOutcome) {
  const matrix = {};
  for (const [fname, fn] of Object.entries(FACTOR_FNS)) {
    matrix[fname] = {};
    for (const e of episodesWithOutcome) {
      const val = fn(e);
      const outcome = e._inferredOutcome || 'unknown';
      matrix[fname][val] = matrix[fname][val] || {};
      matrix[fname][val][outcome] = (matrix[fname][val][outcome] || 0) + 1;
    }
  }
  return matrix;
}

/** Full deterministic aggregation: dedup → infer outcomes → group → chains → matrix. */
export function analyze(episodes) {
  const deduped = dedupeEpisodes(episodes);
  const normal = deduped.filter((e) => !e.observer_error);
  for (const eps of bySessionSorted(normal).values()) {
    eps.forEach((episode, i) => {
      episode._inferredOutcome = inferOutcome(episode, eps[i + 1]);
    });
  }
  return {
    episodeCount: normal.length,
    observerErrorCount: deduped.length - normal.length,
    tasks: groupEpisodesToTasks(normal),
    causalChains: findCausalChains(normal),
    factorMatrix: buildFactorMatrix(normal),
  };
}

function loadEpisodes(files) {
  const eps = [];
  for (const f of files) {
    if (!existsSync(f)) continue;
    for (const line of readFileSync(f, 'utf-8').split('\n')) {
      const t = line.trim();
      if (!t) continue;
      try {
        eps.push(JSON.parse(t));
      } catch {
        // skip broken line
      }
    }
  }
  return eps;
}

if (process.argv[1] && process.argv[1].replace(/\\/g, '/').endsWith('/brain-retro-analyzer.mjs')) {
  const result = analyze(loadEpisodes(process.argv.slice(2)));
  console.log(JSON.stringify(result, null, 2));
  process.exit(0);
}
  • Step 4: Run the test to verify it passes

Run: node app/node_modules/vitest/vitest.mjs --config app/vitest.config.tools.mjs run brain-retro-analyzer

Expected: PASS, all tests across the 6 describe blocks green.
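
To make the inference rules concrete, the sketch below walks one hypothetical four-episode session through `inferOutcome`. The function body is copied from the analyzer in Step 3 so the example runs on its own; the session data is invented for illustration.

```javascript
// Copied from the plan's analyzer so the example is self-contained.
function inferOutcome(episode, nextEpisode) {
  if (episode && Array.isArray(episode.events) && episode.events.some((e) => e.kind === 'interrupt')) {
    return 'partial';
  }
  if (!nextEpisode) return 'unknown';
  if (nextEpisode.prompt_signal === 'correction') return 'rework';
  if (nextEpisode.prompt_signal === 'approval' || nextEpisode.prompt_signal === 'new_task') return 'success';
  return 'unknown';
}

// Hypothetical session, already sorted by started_at.
const session = [
  { prompt_signal: 'new_task', events: [] },
  { prompt_signal: 'correction', events: [] },
  { prompt_signal: 'neutral', events: [{ kind: 'interrupt' }] },
  { prompt_signal: 'approval', events: [] },
];

const outcomes = session.map((episode, i) => inferOutcome(episode, session[i + 1]));
console.log(outcomes); // [ 'rework', 'unknown', 'partial', 'unknown' ]
```

Two properties are worth noticing: an interrupt in the episode itself takes precedence over the next prompt (the third episode is `partial` even though approval follows), and the last episode of a session always stays `unknown`.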

  • Step 5: Commit
git add tools/brain-retro-analyzer.mjs tools/brain-retro-analyzer.test.mjs
git commit -m "feat(observer): brain-retro analyzer — outcome inference + factor matrix"

Task 10: brain-retro skill + aggregation template + observer README

Spec §6 (Layer 4 wiring). Updates the /brain-retro skill procedure to run the analyzer, refreshes the aggregation template for the v2 factor matrix, and documents schema v2 / observer_error / routing-tag in the observer README. No code — Markdown only.

Files:

  • Modify: .claude/skills/brain-retro/SKILL.md

  • Modify: .claude/skills/brain-retro/references/aggregation-template.md

  • Modify: docs/observer/README.md

  • Step 1: Update the /brain-retro skill procedure

In .claude/skills/brain-retro/SKILL.md, replace step 5 of the ## Procedure section (currently 5. **Aggregate** per references/aggregation-template.md — includes Factor analysis matrix (v1.1+) on 5 axes.) with:

5. **Run the deterministic analyzer**: `node tools/brain-retro-analyzer.mjs docs/observer/episodes-YYYY-MM.jsonl` (pass every monthly file in the period). It returns JSON with `episodeCount`, `observerErrorCount`, `tasks` (episodes grouped into tasks), `causalChains` (error→fix candidates) and `factorMatrix` (outcome distribution per factor). The analyzer deduplicates the routing-gate double-write and infers the true `outcome` of each episode from the next episode's `prompt_signal` — never trust the stored `outcome` (it is `unknown` at write time).
6. **Aggregate** per `references/aggregation-template.md` — fill the Factor analysis matrix from the analyzer's `factorMatrix`, the task groups from `tasks`, the causal-chain candidates from `causalChains`.

Then renumber the subsequent steps: the old step 6 (Propose candidates) becomes step 7, old step 7 (Save retro note) becomes step 8, old step 8 (Report to user) becomes step 9.
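
Passing "every monthly file in the period" can be mechanized. The following is a minimal sketch under the assumption that a period is given as two `YYYY-MM` strings; `monthlyEpisodeFiles` is a hypothetical helper, not something the plan defines.

```javascript
// Hypothetical helper: list the monthly episode files covering [fromMonth, toMonth].
function monthlyEpisodeFiles(fromMonth, toMonth, dir = 'docs/observer') {
  const [fromYear, fromMon] = fromMonth.split('-').map(Number);
  const [toYear, toMon] = toMonth.split('-').map(Number);
  const files = [];
  let year = fromYear;
  let month = fromMon;
  while (year < toYear || (year === toYear && month <= toMon)) {
    files.push(`${dir}/episodes-${year}-${String(month).padStart(2, '0')}.jsonl`);
    month += 1;
    if (month > 12) {
      month = 1;
      year += 1;
    }
  }
  return files;
}

console.log(monthlyEpisodeFiles('2025-11', '2026-01'));
```

The resulting paths can be passed as arguments to `node tools/brain-retro-analyzer.mjs`; the analyzer's loader skips files that do not exist, so months without episodes are harmless.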

  • Step 2: Update the aggregation template for the v2 factor matrix

In .claude/skills/brain-retro/references/aggregation-template.md, replace the ## Factor analysis matrix (v1.1+ ...) section (the heading at line 30 down to and including the ### Cross-tab: factor × factor block, ending at line 74) with:

## Factor analysis matrix (v2 — from `tools/brain-retro-analyzer.mjs`)

Outcome distribution per factor value. Source: the analyzer's `factorMatrix`.
Outcome is the *inferred* outcome (next-prompt sentiment), not the stored
`unknown`. The factor `decision_provenance` directly answers the owner's
question — "is the rework mine or the router's?"

For each factor below, render a table: factor value × outcome counts
(`success` / `partial` / `rework` / `unknown`).

### decision_provenance (autonomous vs user_directed_method)

| provenance | success | partial | rework | unknown |
|---|---|---|---|---|

### economy_level

| economy_level | success | partial | rework | unknown |
|---|---|---|---|---|

### model · post_compaction · task_size bucket

(one table each — same columns)

### node_chosen · task_classification

(one table each — same columns)

## Episodes → tasks (from analyzer `tasks`)

| task_ref | episodes | turns that are rework |
|---|---|---|

## Causal-chain candidates (from analyzer `causalChains`)

| from (errored episode) | to (later episode) | shared files |
|---|---|---|

## Observer health

- `observerErrorCount` from the analyzer — observer_error markers in the period.
  Non-zero = the observer failed silently somewhere; investigate.
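
Filling the tables above from the analyzer output is mechanical. Here is a hedged sketch of one way to render a single factor of `factorMatrix` as a Markdown table; `renderFactorTable` is a hypothetical helper, and the sample matrix mirrors the shape asserted in the analyzer tests.

```javascript
const OUTCOMES = ['success', 'partial', 'rework', 'unknown'];

// Hypothetical renderer: one factor of the analyzer's factorMatrix -> Markdown table.
function renderFactorTable(factorName, rows) {
  const header = `| ${factorName} | ${OUTCOMES.join(' | ')} |`;
  const divider = `|${'---|'.repeat(OUTCOMES.length + 1)}`;
  const body = Object.keys(rows)
    .sort()
    .map((value) => {
      // Outcomes that never occurred are absent from the matrix; show them as 0.
      const counts = OUTCOMES.map((outcome) => rows[value][outcome] || 0);
      return `| ${value} | ${counts.join(' | ')} |`;
    });
  return [header, divider, ...body].join('\n');
}

const factorMatrix = {
  decision_provenance: {
    user_directed_method: { rework: 1 },
    autonomous: { success: 1 },
  },
};
console.log(renderFactorTable('decision_provenance', factorMatrix.decision_provenance));
```

Sorting the factor values keeps the retro note stable between runs, which makes month-to-month diffs readable.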
  • Step 3: Update the observer README

In docs/observer/README.md, replace the ## Files section's first bullet (the episodes-YYYY-MM.jsonl line) with:

- `episodes-YYYY-MM.jsonl` — append-only JSONL, one line per Stop-event. Schema **v2** (`schema_version: 2`): the 5 mandatory fields + `decision_provenance` (who chose the node), `environment` (economy_level / model / post_compaction / session_turn / parallel_session), `task_size`, `task_ref`, `prompt_signal`, and an `outcome` that is `unknown` at write time (refined by `/brain-retro`). On an internal hook failure a minimal `observer_error` marker line is written instead of a silent skip. Written by `tools/observer-stop-hook.mjs` via `tools/observer-transcript-parser.mjs`.

Then add a new section after the ## Lifecycle section:

## Routing-tag discipline

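When the user dictates a specific method/node (e.g. «запусти discovery-interview», i.e. "run discovery-interview"), Claude must emit one line in its response:

`<!-- routing: provenance=user_directed_method node=<chosen node> counterfactual=<node Claude would have chosen autonomously> -->`
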
The Stop-hook routing-gate (`tools/observer-routing-detector.mjs` + `routingGateDecision`) detects a dictated method; if the tag is missing it returns `decision: block`, so the turn cannot end without the tag. The gate fires at most once per turn (`stop_hook_active` guard). This makes `decision_provenance` reliable — factor analysis can separate a router error from a user-dictated one.
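
The gate logic described above can be sketched as follows. This is not the plan's `routingGateDecision`; `parseRoutingTag` and `sketchGateDecision` are hypothetical names, the regex assumes the three `key=value` pairs appear in the documented order without spaces inside values, and the input flags are assumed to be computed elsewhere.

```javascript
// Assumed tag shape: <!-- routing: provenance=... node=... counterfactual=... -->
const ROUTING_TAG = /<!--\s*routing:\s*provenance=(\S+)\s+node=(\S+)\s+counterfactual=(\S+)\s*-->/;

function parseRoutingTag(responseText) {
  const match = ROUTING_TAG.exec(responseText);
  return match ? { provenance: match[1], node: match[2], counterfactual: match[3] } : null;
}

// Hypothetical gate: returns a block decision, or null to let the turn end.
function sketchGateDecision({ methodDictated, responseText, stopHookActive }) {
  if (stopHookActive) return null; // already blocked once this turn; avoid a loop
  if (methodDictated && !parseRoutingTag(responseText)) {
    return { decision: 'block', reason: 'routing-tag missing for a user-dictated method' };
  }
  return null;
}

const tagged = 'Done. <!-- routing: provenance=user_directed_method node=discovery-interview counterfactual=direct -->';
console.log(parseRoutingTag(tagged));
console.log(sketchGateDecision({ methodDictated: true, responseText: 'Done.', stopHookActive: false }));
```

Checking `stopHookActive` first is what bounds the gate to one block per turn: the second Stop always passes, whether or not the tag was added.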
  • Step 4: Verify the Markdown lints

Run: npx markdownlint-cli2 ".claude/skills/brain-retro/SKILL.md" ".claude/skills/brain-retro/references/aggregation-template.md" "docs/observer/README.md"

Expected: 0 errors (the PostToolUse hook auto-fixes most issues on write; this confirms).

  • Step 5: Commit
git add .claude/skills/brain-retro/SKILL.md .claude/skills/brain-retro/references/aggregation-template.md docs/observer/README.md
git commit -m "docs(observer): brain-retro skill + README for schema v2"

Task 11: Normative sync — ADR-011, Pravila §16, PSR_v1 R16, spec cross-refs

Spec §7. Amends ADR-011, extends Pravila §16, syncs PSR_v1 R16, cross-links the brain-governance spec, and flips this spec's status. Pre-flight sync is mandatory before any edit (Pravila §15.2).

Files:

  • Modify: docs/adr/ADR-011-brain-governance.md

  • Modify: docs/Pravila_raboty_Claude_v1_1.md

  • Modify: docs/Plugin_stack_rules_v1.md

  • Modify: docs/superpowers/specs/2026-05-19-brain-governance-design.md

  • Modify: docs/superpowers/specs/2026-05-19-observer-factor-analysis-design.md

  • Step 1: Pre-flight sync (Pravila §15.2)

Run: git fetch origin && git log HEAD..origin/main --oneline

Expected: empty output (local branch up to date). If the output is non-empty, origin/main has moved: integrate it (git rebase origin/main or merge) before editing any normative file, since Pravila / PSR_v1 / the ADR are on the 8-file sync list.

  • Step 2: Amend ADR-011

In docs/adr/ADR-011-brain-governance.md:

(a) Replace the ## Status body (Accepted (2026-05-19).) with:

Accepted (2026-05-19). **Amended 2026-05-19** — observer factor-analysis extension: episode schema v2, two-sided enforcement (routing-gate + C5 controller). See Decision §5.

(b) In the ## Decision section, change the ### 3. heading from ### 3. 4 mechanical controllers (first wave) to ### 3. 5 mechanical controllers, and add a 5th bullet after the C4 bullet:

- **C5 Observer-coverage-checker** — lefthook warn-only job. Flags observer coverage gaps (git activity but 0 episodes) and registration-integrity breaks (Stop-hook missing from `settings.json`, `post-commit` not installed). Surfaced in STATUS.md.

Change the line All 4 are mechanical (regex/diff/JSON math). to All 5 are mechanical (regex/diff/JSON math).

(c) Add a new Decision subsection after ### 4. Behavioral rule «unused ≠ problem»:

### 5. Observer factor-analysis extension (v2)

The observer episode is extended to `schema_version: 2` so a real factor analysis becomes possible: `decision_provenance` (autonomous vs user-dictated method, with a counterfactual), `environment` factors, `task_size`, `prompt_signal`, and an honest `outcome` of `unknown` at write time. Four layers — schema v2, deterministic capture + a routing-tag, two-sided enforcement (Stop-hook routing-gate + C5 self-discipline controller), `/brain-retro` analysis. The routing-gate makes provenance reliable: when the user dictates a method and the routing-tag is missing, the Stop-hook returns `decision: block`. Spec: `docs/superpowers/specs/2026-05-19-observer-factor-analysis-design.md`.

(d) In the ## Enforcement section, add a bullet after the C4 bullet:

- Observer routing-gate runs inside `observer-stop-hook.mjs` (`decision: block` when a method is dictated without a routing-tag); C5 observer-coverage-checker is a warn-only lefthook job.

(e) In the YAML front-matter related: list and in ## References, add a line for the factor-analysis spec:

  - docs/superpowers/specs/2026-05-19-observer-factor-analysis-design.md
  • Step 3: Extend Pravila §16

In docs/Pravila_raboty_Claude_v1_1.md:

(a) In ### 16.2. Observer (scope B), add this paragraph after the existing **Граница**: line:

**Схема эпизода v2 (2026-05-19, factor-analysis extension):** эпизод несёт `schema_version: 2` и поля для факторного анализа — `decision_provenance` (кто выбрал узел: автономно или навязанный метод + контрфактуал), `environment` (`economy_level` / `model` / `post_compaction` / `session_turn` / `parallel_session`), `task_size`, `task_ref`, `prompt_signal`; `outcome` при записи — `unknown` (уточняется `/brain-retro`). Виды событий расширены: `hook_fired` / `interrupt` / `retry` / `time_burn` / `parse_gap`. Spec — `docs/superpowers/specs/2026-05-19-observer-factor-analysis-design.md`.

(b) In ### 16.3. 4 контролёра, change the heading to ### 16.3. 5 контролёров, add a C5 row at the end of the table, and change Все 4 — механические to Все 5 — механические:

| C5 | Observer-coverage-checker | пропуски наблюдателя + целостность регистрации | lefthook warn-only + STATUS.md |

(c) Add two new subsections after ### 16.6. Cross-refs (i.e. before the closing --- of section 16). Number them §16.7 and §16.8; no existing subsection needs renumbering:

### 16.7. Routing-тег-дисциплина

Когда заказчик навязал конкретный метод/узел (директива `запусти X` / `используй X` / `через X` / `/команда`), Claude ОБЯЗАН в том же ходе эмитить routing-тег — одну строку-HTML-комментарий:

`<!-- routing: provenance=user_directed_method node=<выбранный> counterfactual=<узел, который Claude выбрал бы автономно> -->`

Enforcement — механический, не поведенческая просьба: `tools/observer-stop-hook.mjs` содержит routing-gate. Детектор видит навязанный метод, тега нет → Stop-хук возвращает `decision: block`, и ход не завершается без тега. Это хук, а не tier-§13-правило — обойти рационализацией нельзя. Гейт срабатывает не более одного раза за ход (`stop_hook_active` guard против петли).

### 16.8. Самодисциплина наблюдателя

Наблюдатель фиксирует каждый Stop без молчаливых пропусков:

- Внутренний отказ хука → строка-маркер `observer_error` в JSONL (не тихий `exit 0` без записи).
- Доля непарсибельных строк транскрипта выше порога → событие `parse_gap`.
- Контролёр **C5 observer-coverage-checker** (lefthook, warn-only) сверяет покрытие (git-активность без эпизодов) и целостность регистрации (Stop-хук в `.claude/settings.json`, `post-commit` установлен); расхождение — флаг в `docs/observer/STATUS.md`.
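
The `parse_gap` rule in §16.8 (the share of unparsable transcript lines exceeds a threshold) can be sketched as below. The threshold value and the extra event fields are assumptions made for illustration; this section of the plan does not state them, and `detectParseGap` is a hypothetical name.

```javascript
// Hypothetical threshold; the real value is defined elsewhere, not in this section.
const PARSE_GAP_THRESHOLD = 0.2;

// Returns a parse_gap event when too many transcript lines fail to parse, else null.
function detectParseGap(transcriptText, threshold = PARSE_GAP_THRESHOLD) {
  const lines = transcriptText
    .split('\n')
    .map((line) => line.trim())
    .filter(Boolean);
  let unparsable = 0;
  for (const line of lines) {
    try {
      JSON.parse(line);
    } catch {
      unparsable += 1;
    }
  }
  const ratio = lines.length === 0 ? 0 : unparsable / lines.length;
  // `unparsable` and `total` are illustrative fields, not part of the spec.
  return ratio > threshold ? { kind: 'parse_gap', unparsable, total: lines.length } : null;
}

console.log(detectParseGap('{"a":1}\nbroken\n{"b":2}\nalso broken'));
```

Emitting an event instead of silently dropping lines keeps the gap visible to `/brain-retro`, which is the point of the self-discipline rule.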

(d) In ### 16.6. Cross-refs, add a line:

- factor-analysis spec: `docs/superpowers/specs/2026-05-19-observer-factor-analysis-design.md`

(e) Bump the header version. In the file header **Версия:** line, change v1.31 to v1.32. In the ## Что сделано после утверждения changelog section, add an entry at the top:

- **v1.32 (2026-05-19)** — observer factor-analysis extension (ADR-011 amend): §16.2 +абзац «Схема эпизода v2»; §16.3 4→5 контролёров (+C5 observer-coverage-checker); §16.7 (новое) routing-тег-дисциплина — механический Stop-gate; §16.8 (новое) самодисциплина наблюдателя; §16.6 +cross-ref. Spec `docs/superpowers/specs/2026-05-19-observer-factor-analysis-design.md`.
  • Step 4: Sync PSR_v1 R16

In docs/Plugin_stack_rules_v1.md:

(a) In ### 16.1. Observer scope, append a sentence:

Схема v2 (2026-05-19, ADR-011 amend): эпизод несёт `schema_version`, `decision_provenance`, `environment`, `task_size`, `task_ref`, `prompt_signal`; события расширены `hook_fired` / `interrupt` / `retry` / `time_burn` / `parse_gap`. Spec — `docs/superpowers/specs/2026-05-19-observer-factor-analysis-design.md`.

(b) In ### 16.4. Cross-refs, add a line:

- factor-analysis spec: `docs/superpowers/specs/2026-05-19-observer-factor-analysis-design.md`

(c) Bump the header version from v3.16 to v3.17. In the ## История версий section, add an entry at the top:

- **v3.17 (2026-05-19)** — observer schema v2 sync (ADR-011 amend): R16.1 +предложение про `schema_version` / `decision_provenance` / `environment` / `task_size` / `prompt_signal` + расширенные события; R16.4 +cross-ref на factor-analysis spec. R0–R15 без изменений. Связано: ADR-011, Pravila §16 (v1.32), spec `docs/superpowers/specs/2026-05-19-observer-factor-analysis-design.md`.
  • Step 5: Cross-ref the brain-governance spec

In docs/superpowers/specs/2026-05-19-brain-governance-design.md, add a line to its **Связано:** / cross-refs header area (near the top of the file):

- Расширение: `docs/superpowers/specs/2026-05-19-observer-factor-analysis-design.md` (observer factor-analysis, schema v2).
  • Step 6: Flip this spec's status

In docs/superpowers/specs/2026-05-19-observer-factor-analysis-design.md, change the header **Статус:** draft — на ревью заказчика to:

**Статус:** accepted — реализуется по плану `docs/superpowers/plans/2026-05-19-observer-factor-analysis.md`
  • Step 7: Verify normative Markdown lints + cross-ref-checker passes

Run: npx markdownlint-cli2 "docs/adr/ADR-011-brain-governance.md" "docs/Pravila_raboty_Claude_v1_1.md" "docs/Plugin_stack_rules_v1.md" "docs/superpowers/specs/2026-05-19-brain-governance-design.md" "docs/superpowers/specs/2026-05-19-observer-factor-analysis-design.md"

Expected: 0 errors.

Run: node tools/cross-ref-checker.mjs

Expected: [cross-ref-checker] OK — 0 drift. Every cross-ref must match the bumped headers (Pravila v1.32 / PSR_v1 v3.17); if the check fails, the offending cross-ref still points at an old version and must be fixed.

  • Step 8: Commit
git add docs/adr/ADR-011-brain-governance.md docs/Pravila_raboty_Claude_v1_1.md docs/Plugin_stack_rules_v1.md docs/superpowers/specs/2026-05-19-brain-governance-design.md docs/superpowers/specs/2026-05-19-observer-factor-analysis-design.md
git commit -m "docs(brain): normative sync — ADR-011 amend + Pravila §16 + PSR_v1 R16"

Task 12: CLAUDE.md sync via claude-md-management plugin

Spec §7. CLAUDE.md is edited only via the plugin (CLAUDE.md §5 п.10) — never by direct Edit.

Files:

  • Modify: CLAUDE.md (via /claude-md-management:claude-md-improver)

  • Step 1: Invoke the plugin with the targeted update

Invoke /claude-md-management:claude-md-improver with this instruction:

Apply targeted updates to CLAUDE.md for the observer factor-analysis extension (ADR-011 amendment): §0 cross-refs Pravila v1.31→v1.32, PSR_v1 v3.16→v3.17 (Tooling unchanged); §3.6 «Brain governance» — add a sentence that the observer now writes schema-v2 episodes (decision provenance + environment factors + factor matrix) and that a routing-gate enforces the routing-tag, plus C5 observer-coverage-checker as a 5th controller; §9 changelog — add a v2.19 entry summarizing the observer factor-analysis extension, spec docs/superpowers/specs/2026-05-19-observer-factor-analysis-design.md, plan docs/superpowers/plans/2026-05-19-observer-factor-analysis.md.

  • Step 2: Verify cross-ref-checker passes after the CLAUDE.md bump

Run: node tools/cross-ref-checker.mjs

Expected: [cross-ref-checker] OK — 0 drift. The CLAUDE.md §0 cross-refs (Pravila v1.32 / PSR_v1 v3.17) must match the headers bumped in Task 11.

  • Step 3: Commit
git add CLAUDE.md docs/CHANGELOG_claude_md.md
git commit -m "docs(claude-md): observer factor-analysis extension cross-refs (v2.19)"

(If the plugin already committed CLAUDE.md, skip this step — verify with git status.)


Final verification

After all 12 tasks:

  • Full tools test suite GREEN

Run: node app/node_modules/vitest/vitest.mjs --config app/vitest.config.tools.mjs run

Expected: 0 failures. New + modified files covered: observer-transcript-parser, observer-routing-detector, observer-stop-hook, observer-coverage-checker, status-md-generator, brain-retro-analyzer. Report the exact pass count.

  • lefthook pre-commit GREEN

Run: npx lefthook run pre-commit

Expected: all jobs pass, including job 11 l1-watcher (strict), job 12 cross-ref-checker (strict), job 15 observer-coverage-checker (warn-only). Report each job's status.

  • Requirements checklist vs spec

Re-read docs/superpowers/specs/2026-05-19-observer-factor-analysis-design.md §3–§8 and confirm every item maps to a completed task. Report any gap explicitly.

  • Push — only on explicit user approval. Pattern: git push origin feat/parallel-sessions-coordination:main (FF). The pre-push gate (gitleaks full history + lychee) must be green.