feat(brain-retro): extend mandatory digital analysis 7 → 10 cuts

The MANDATORY DIGITAL ANALYSIS block in SKILL.md grows by three cuts:

8. Class × canon coverage (analyzer: buildClassCanonCoverage)
9. Router vs Opus (analyzer: buildRouterVsOpus, sections A / B / C; A and C are mutually exclusive by construction)
10. Chain-ignore breakdown (analyzer: buildChainIgnoreBreakdown, bucketed by chain length 1 / 2 / 3+)

All three are wired into the analyzer's analyze() output as result.classCanonCoverage / result.routerVsOpus / result.chainIgnoreBreakdown and are produced automatically on every retro run (no manual step).

+216 lines analyzer / +288 lines tests covering the three functions in isolation and via analyze().

Driven by the manual analysis in retro #8: the three cuts surface signal that the existing 7 cuts missed, namely router-vs-Opus disagreement, canon coverage by classification, and chain-vs-singleton ignore rate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 18:08:53 +03:00
parent e184ffe212
commit b139888376
3 changed files with 516 additions and 4 deletions
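
SKILL.md asks for percentages and rework rates, while `buildChainIgnoreBreakdown` in this commit returns raw counters only. A minimal sketch of deriving the rates is shown here; `pct` and `summarizeChainIgnore` are hypothetical helpers that are not part of the commit, and the sample numbers are made up. It assumes the counter shape shown in the diff, where `rework` is incremented only for ignored episodes.

```javascript
// Hypothetical helper (not in the commit): percentage with one decimal,
// or null when the denominator is zero.
function pct(part, total) {
  return total > 0 ? Math.round((part / total) * 1000) / 10 : null;
}

// Hypothetical helper (not in the commit): turns the raw counters from
// result.chainIgnoreBreakdown into ignore / rework percentages.
function summarizeChainIgnore(breakdown) {
  const rows = Object.entries(breakdown.breakdownByChainLength).map(
    ([length, bucket]) => ({
      length,
      count: bucket.count,
      ignoredPct: pct(bucket.ignored, bucket.count),
      // The analyzer counts rework only among ignored episodes,
      // so `ignored` is the denominator here.
      reworkAmongIgnoredPct: pct(bucket.rework, bucket.ignored),
    }),
  );
  return {
    chainIgnoredPct: pct(breakdown.ignoredChainCount, breakdown.totalChainRecommendations),
    nodeOnlyIgnoredPct: pct(breakdown.ignoredNodeOnlyCount, breakdown.totalNodeOnlyRecommendations),
    rows,
  };
}

// Made-up sample in the shape returned by buildChainIgnoreBreakdown.
const sample = {
  totalChainRecommendations: 8,
  ignoredChainCount: 6,
  ignoredChainRework: 3,
  totalNodeOnlyRecommendations: 4,
  ignoredNodeOnlyCount: 1,
  ignoredNodeOnlyRework: 0,
  breakdownByChainLength: {
    '1': { count: 2, ignored: 1, rework: 0 },
    '2': { count: 2, ignored: 1, rework: 1 },
    '3+': { count: 4, ignored: 4, rework: 2 },
  },
};

const summary = summarizeChainIgnore(sample);
console.log(summary.chainIgnoredPct, summary.nodeOnlyIgnoredPct); // 75 25
```

Returning `null` for an empty bucket keeps "no data" distinguishable from a true 0% rate in the retro table.
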
@@ -22,7 +22,8 @@ Aggregator over observer evidence. Reads JSONL + optional MD notes, surfaces can
 ## Procedure

 > **MANDATORY DIGITAL ANALYSIS (added 2026-05-26 after retro #6 feedback).**
-> Every /brain-retro run MUST include **quantitative cuts**, not only a causal narrative. At least 7 numeric tables:
+> Every /brain-retro run MUST include **quantitative cuts**, not only a causal narrative. At least 10 numeric tables:
+>
 > 1. **Path-type breakdown** (regulated vs improvised, with counts and %).
 > 2. **node_chosen distribution** (top 15 nodes with count + %).
 > 3. **recommended_node distribution** (what the classifier suggested, count + %).
@@ -30,11 +31,16 @@ Aggregator over observer evidence. Reads JSONL + optional MD notes, surfaces can
 > 5. **outcome × node_chosen group**: 3 groups (skill_used / direct_no_rec / direct_ignored_rec) with counts + rework rate per group.
 > 6. **classifier_output presence by source** (prefilter / llm / regex / cache / NULL), which serves as a health check for the classifier itself.
 > 7. **Per-classification trigger-match + via-skill** (analysis / planning / bugfix / feature / refactor / security).
+> 8. **Class × canon coverage**: a table of task class × canonical nodes from the brain (`observer-classification-map.json`) × router recommended × what I actually took × whether it fell within the canon. Source: `result.classCanonCoverage` from the analyzer.
+> 9. **Router vs Opus**: three sections. A (router recommended → Opus assessed it, any disagreement is visible at a glance), B (router was silent → Opus said «a skill should have been used»), C (router recommended → Opus agreed that the skill was unnecessary). Source: `result.routerVsOpus`.
+> 10. **Chain-ignore breakdown**: a separate cut showing how often the router recommended a chain vs a single node, what % I ignored, and the rework rate of each; bucketed by chain length (1/2/3+). Source: `result.chainIgnoreBreakdown`.
 >
-> Without these 7 tables the retro is considered incomplete. Narrative conclusions must rest on the numbers in them, not on «general impressions». **If classifier_output=NULL in > 30% of episodes**, that signals a broken classifier; report the classifier's state (timeouts/errors/source distribution) in a separate block of the retro.
+> Without these 10 tables the retro is considered incomplete. Narrative conclusions must rest on the numbers in them, not on «general impressions». **If classifier_output=NULL in > 30% of episodes**, that signals a broken classifier; report the classifier's state (timeouts/errors/source distribution) in a separate block of the retro.
 >
 > No jargon in the «Report to user» block: the numbers stay technical, while the verbal conclusions for the user are written in plain language (see memory `feedback_plain_language.md`).

+<!-- markdownlint-disable MD029 MD032 -->
+
 1. **Determine period**: ask user «for which period» or default to «since last brain-retro» (find latest `docs/observer/notes/YYYY-MM-DD-brain-retro-*.md`).
 2. **Read evidence**: glob `docs/observer/episodes-YYYY-MM.jsonl` for the period; read all lines as JSON.
 3. **Read optional notes**: glob `docs/observer/notes/*.md` filtered by date.
@@ -43,8 +49,8 @@ Aggregator over observer evidence. Reads JSONL + optional MD notes, surfaces can
 5a. **[Phase 3] Sanity questions (spec §4.7)** — `node tools/brain-retro-sanity-generator.mjs` (called as a module from analyzer-driven flow, OR direct via `import { generateCandidateQuestions } from '../../../tools/brain-retro-sanity-generator.mjs'`) returns up to 5 candidate questions. Pick 3-4, ask via AskUserQuestion (multiple-choice + free comment). **Questions to the customer must be in plain language**: not «rework / wrong_skill / TDD pattern / self_assessment» but «redoing work / picking the wrong tool / self-check» (memory `feedback_plain_language.md`). If the first round contains jargon, rephrase and ask again. **Before persist:** sanitize free comments with `tools/observer-pii-filter.mjs` (`sanitize` export, RU_PHONE / EMAIL / TOKEN strip). Write answers to `docs/observer/sanity-checks/YYYY-MM-DD.json` `{schema_version: 1, questions: [...]}`.
 5b. **Reviewer pass** — pragmatic two-mode policy (added 2026-05-26 after brain-retro #6, replacing original spec §4.6 «subagent only» which was unrealistic at retro scale):

-  - **Batch mode (default, fast)** — `node tools/brain-retro-batch-reviewer.mjs docs/observer/episodes-YYYY-MM.jsonl <cutoff-iso> [limit=30] [conc=5]`. Direct Opus API via `reviewViaDirectApi` from `tools/brain-retro-opus-reviewer.mjs` with concurrency 5. Use for **N ≥ 20 unreviewed episodes** — typical retro workload (retro #6 processed 132 episodes in 293s = ~2.2s/episode, well under per-subagent overhead).
-  - **Subagent mode (per spec §4.6, deeper context)** — `Task(subagent_type='reviewer-agent', prompt=<episode JSON + sanity-answers context>)`. Use for **N < 20 episodes** OR when the reviewer needs access to other tools (read related files, grep history). Per-episode try/catch — on subagent crash/timeout, fall back to `reviewViaDirectApi`.
+- **Batch mode (default, fast)** — `node tools/brain-retro-batch-reviewer.mjs docs/observer/episodes-YYYY-MM.jsonl <cutoff-iso> [limit=30] [conc=5]`. Direct Opus API via `reviewViaDirectApi` from `tools/brain-retro-opus-reviewer.mjs` with concurrency 5. Use for **N ≥ 20 unreviewed episodes** — typical retro workload (retro #6 processed 132 episodes in 293s = ~2.2s/episode, well under per-subagent overhead).
+- **Subagent mode (per spec §4.6, deeper context)** — `Task(subagent_type='reviewer-agent', prompt=<episode JSON + sanity-answers context>)`. Use for **N < 20 episodes** OR when the reviewer needs access to other tools (read related files, grep history). Per-episode try/catch — on subagent crash/timeout, fall back to `reviewViaDirectApi`.

  Both modes write the same payload back: `review.*` + `outcome_reviewed` + `outcome_reviewed_source` (`direct_api_batch` for batch, `subagent` for Task(), `direct_api_fallback` when subagent fails). If both fail, leave `review.reviewer_error: <msg>` for the next retro.
 6. **Aggregate** per `references/aggregation-template.md` — fill the Factor analysis matrix from the analyzer's `factorMatrix`, the task groups from `tasks`, the causal-chain candidates from `causalChains`, plus the new sections: sanity-check results, reviewer-agent outcomes distribution, self-retrospect trigger status.
@@ -55,6 +61,8 @@ Aggregator over observer evidence. Reads JSONL + optional MD notes, surfaces can
 10. **Cost report** — read `~/.claude/runtime/cost-daily.json`; include classifier + self_assessment + reviewer cost totals for the period in the retro note.
 11. **Report to user**: high-signal summary including sanity highlights, reviewer outcome distribution, and any escalations.

+<!-- markdownlint-enable MD029 MD032 -->
+
 ## Output anatomy

 See `references/aggregation-template.md`.
@@ -7,6 +7,7 @@
 * Security Guidance #40: pure parsing — no exec/execSync.
 */
 import { Buffer } from 'buffer';
+import { resolve as pathResolve } from 'path';
 import { readFileSync, existsSync } from 'fs';
 import { detectMissedActivations } from './missed-activations.mjs';
 import {
@@ -356,6 +357,204 @@ export function buildFactorMatrix(episodesWithOutcome) {
  return matrix;
 }

+
+// ────────────────────────────────────────────────────────────────
+// New cut helpers — normalize recommended id to '#N' form for canon
+// comparison regardless of whether the source stored 19 or '#19'.
+// ────────────────────────────────────────────────────────────────
+function normalizeNodeId(id) {
+  if (id == null) return null;
+  const s = String(id).trim();
+  return s.startsWith('#') ? s : `#${s}`;
+}
+
+function hasRecommendation(ep) {
+  const pr = ep.primary_rationale || {};
+  const co = ep.classifier_output || {};
+  const recNode = pr.recommended_node || co.recommended_node;
+  const recChain = pr.recommended_chain || co.recommended_chain;
+  return !!(recNode || (Array.isArray(recChain) && recChain.length > 0));
+}
+
+function getRecommendedNode(ep) {
+  const pr = ep.primary_rationale || {};
+  const co = ep.classifier_output || {};
+  return pr.recommended_node || co.recommended_node || null;
+}
+
+function getRecommendedChain(ep) {
+  const pr = ep.primary_rationale || {};
+  const co = ep.classifier_output || {};
+  const chain = pr.recommended_chain || co.recommended_chain;
+  return Array.isArray(chain) ? chain : [];
+}
+
+
+/**
+ * Cut 8 — Class × canon coverage.
+ * Returns one row per task_classification appearing in the episodes, sorted by count desc.
+ * classificationMap shape: { [classification]: string[] } — canonical node IDs (e.g. '#34').
+ */
+export function buildClassCanonCoverage(episodes, classificationMap) {
+  const map = classificationMap || {};
+  const byClass = new Map();
+  for (const ep of episodes) {
+    const classification = (ep.primary_rationale || {}).task_classification || 'other';
+    if (!byClass.has(classification)) {
+      byClass.set(classification, {
+        classification,
+        count: 0,
+        canonicalNodes: map[classification] ? [...map[classification]] : [],
+        routerRecommended: 0,
+        claudeTook: 0,
+        recWithinCanon: 0,
+        rework: 0,
+      });
+    }
+    const row = byClass.get(classification);
+    row.count += 1;
+
+    const recNode = getRecommendedNode(ep);
+    const recChain = getRecommendedChain(ep);
+    const hasRec = !!(recNode || recChain.length > 0);
+    if (hasRec) {
+      row.routerRecommended += 1;
+      // Check if any recommended id falls within canonical set
+      const canonSet = new Set(row.canonicalNodes.map(normalizeNodeId));
+      const allRecIds = [];
+      if (recNode) allRecIds.push(normalizeNodeId(recNode));
+      for (const id of recChain) allRecIds.push(normalizeNodeId(id));
+      if (allRecIds.some((id) => id && canonSet.has(id))) {
+        row.recWithinCanon += 1;
+      }
+    }
+
+    const nodeChosen = (ep.primary_rationale || {}).node_chosen;
+    if (nodeChosen && nodeChosen !== 'direct') {
+      row.claudeTook += 1;
+    }
+    if (ep.outcome_reviewed === 'rework') {
+      row.rework += 1;
+    }
+  }
+  return [...byClass.values()].sort((a, b) => b.count - a.count);
+}
+
+/**
+ * Cut 9 — Router vs Opus three-section breakdown.
+ * Returns { sectionA, sectionB, sectionC } — each an array of structured items.
+ * Episodes lacking `review`, or whose review carries `reviewer_error`, are excluded from all sections.
+ */
+export function buildRouterVsOpus(episodes) {
+  const sectionA = [];
+  const sectionB = [];
+  const sectionC = [];
+
+  for (const ep of episodes) {
+    const rev = ep.review;
+    if (!rev || typeof rev !== 'object' || rev.reviewer_error) continue;
+
+    const pr = ep.primary_rationale || {};
+    const hasRec = hasRecommendation(ep);
+    const recNode = getRecommendedNode(ep);
+    const recChain = getRecommendedChain(ep);
+    const routerRecommendation = recChain.length > 0 ? recChain : recNode;
+    const time = (ep.timestamps || {}).started_at || null;
+    const taskId = String(ep.task_id || '').slice(0, 8);
+    const classification = pr.task_classification || 'other';
+    const nodeChosen = pr.node_chosen || 'direct';
+    const outcomeReviewed = ep.outcome_reviewed || 'unknown';
+
+    if (hasRec) {
+      const isCorrectNoAlt = rev.node_quality === 'correct' && !rev.alternative_better;
+      if (isCorrectNoAlt) {
+        // Section C: router gave + Opus agreed it was fine (correct, no better alternative)
+        sectionC.push({ time, taskId, classification, routerRecommendation, outcomeReviewed });
+      } else {
+        // Section A: router gave + some disagreement or uncertainty (wrong_node / disputable / has alternative)
+        sectionA.push({
+          time,
+          taskId,
+          classification,
+          routerRecommendation,
+          claudeChose: nodeChosen,
+          opusNodeQuality: rev.node_quality || 'n/a',
+          opusChainQuality: rev.chain_quality || 'n/a',
+          outcomeReviewed,
+          opusAlternative: rev.alternative_better || null,
+          opusRootCause: rev.error_root_cause || 'n/a',
+        });
+      }
+    } else if (rev.alternative_better) {
+      // Section B: router silent, Opus identified a better node
+      sectionB.push({
+        time,
+        taskId,
+        classification,
+        opusSuggests: rev.alternative_better,
+        outcomeReviewed,
+        opusReasoning: String(rev.reasoning || '').slice(0, 200),
+      });
+    }
+  }
+
+  return { sectionA, sectionB, sectionC };
+}
+
+/**
+ * Cut 10 — Chain-ignore breakdown.
+ * Distinguishes chain recommendations from node-only recommendations and reports raw
+ * ignore counts + rework counts (rework is counted only among ignored episodes), bucketed by chain length.
+ */
+export function buildChainIgnoreBreakdown(episodes) {
+  const result = {
+    totalChainRecommendations: 0,
+    ignoredChainCount: 0,
+    ignoredChainRework: 0,
+    totalNodeOnlyRecommendations: 0,
+    ignoredNodeOnlyCount: 0,
+    ignoredNodeOnlyRework: 0,
+    breakdownByChainLength: {
+      '1':  { count: 0, ignored: 0, rework: 0 },
+      '2':  { count: 0, ignored: 0, rework: 0 },
+      '3+': { count: 0, ignored: 0, rework: 0 },
+    },
+  };
+
+  for (const ep of episodes) {
+    const pr = ep.primary_rationale || {};
+    const recNode = getRecommendedNode(ep);
+    const recChain = getRecommendedChain(ep);
+    const hasChain = recChain.length > 0;
+    const hasNodeOnly = !hasChain && !!recNode;
+    const nodeChosen = pr.node_chosen || 'direct';
+    const isIgnored = nodeChosen === 'direct';
+    const isRework = ep.outcome_reviewed === 'rework';
+
+    if (hasChain) {
+      result.totalChainRecommendations += 1;
+      const lenBucket = recChain.length === 1 ? '1' : recChain.length === 2 ? '2' : '3+';
+      result.breakdownByChainLength[lenBucket].count += 1;
+      if (isIgnored) {
+        result.ignoredChainCount += 1;
+        result.breakdownByChainLength[lenBucket].ignored += 1;
+        if (isRework) {
+          result.ignoredChainRework += 1;
+          result.breakdownByChainLength[lenBucket].rework += 1;
+        }
+      }
+    } else if (hasNodeOnly) {
+      result.totalNodeOnlyRecommendations += 1;
+      if (isIgnored) {
+        result.ignoredNodeOnlyCount += 1;
+        if (isRework) result.ignoredNodeOnlyRework += 1;
+      }
+    }
+  }
+
+  return result;
+}
+
 /** Full deterministic aggregation: dedup → infer outcomes → group → chains → matrix → missed activations. */
 export function analyze(episodes, options = {}) {
  const deduped = dedupeEpisodes(episodes);
@@ -441,6 +640,20 @@ export function analyze(episodes, options = {}) {
    }
  }

+  // Cuts 8/9/10 — read classificationMap from the archived file (path resolved against process.cwd())
+  // when not passed via options (CLI invocation). Silent fallback to {} on missing/broken file.
+  let canonMapForCuts = classificationMap || {};
+  if (!Object.keys(canonMapForCuts).length) {
+    try {
+      const mapPath = pathResolve('docs/archive/llm-bootstrap-2026-05/routing-docs/observer-classification-map.json');
+      const raw = readFileSync(mapPath, 'utf-8');
+      const parsed = JSON.parse(raw);
+      canonMapForCuts = parsed.map || {};
+    } catch {
+      canonMapForCuts = {};
+    }
+  }
+
  return {
    episodeCount: normal.length,
    v1SkippedCount,
@@ -457,6 +670,9 @@ export function analyze(episodes, options = {}) {
    reviewerCoverage,
    degradedCount,
    costTotals,
+    classCanonCoverage: buildClassCanonCoverage(normal, canonMapForCuts),
+    routerVsOpus: buildRouterVsOpus(normal),
+    chainIgnoreBreakdown: buildChainIgnoreBreakdown(normal),
  };
 }

@@ -6,6 +6,9 @@ import {
  findCausalChains,
  buildFactorMatrix,
  analyze,
+  buildClassCanonCoverage,
+  buildRouterVsOpus,
+  buildChainIgnoreBreakdown,
 } from './brain-retro-analyzer.mjs';

 // Minimal v2 episode for tests.
@@ -717,3 +720,288 @@ describe('analyze — Pass 4 similar_past_outcome_majority axis (project-brain-f
    expect(result.factorMatrix.similar_past_outcome_majority.no_neighbors).toBeDefined();
  });
 });
+
+// ────────────────────────────────────────────────────────────────
+// NEW CUTS: buildClassCanonCoverage, buildRouterVsOpus, buildChainIgnoreBreakdown
+// ────────────────────────────────────────────────────────────────
+
+// Shared classMap fixture (embedded — no external file dependency)
+const testClassMap = {
+  monitoring: ['#34', '#35'],
+  bugfix:     ['#18', '#34'],
+  feature:    ['#19'],
+  release:    ['#37'],
+  planning:   ['#19', '#41', '#42'],
+  other:      [],
+};
+
+// Helper: episode for the new cuts (minimal — no embeddings needed)
+const epC = (overrides = {}) => ({
+  schema_version: 2,
+  task_id: 's1',
+  timestamps: { started_at: '2026-05-19T10:00:00Z', ended_at: '2026-05-19T10:05:00Z' },
+  primary_rationale: {
+    node_chosen: 'direct',
+    task_classification: 'other',
+    recommended_node: null,
+    recommended_chain: null,
+  },
+  outcome_reviewed: 'unknown',
+  ...overrides,
+});
+
+describe('buildClassCanonCoverage', () => {
+  it('returns [] for empty input', () => {
+    expect(buildClassCanonCoverage([], testClassMap)).toEqual([]);
+  });
+
+  it('single monitoring episode with recommended_node=#34, node_chosen=direct, rework', () => {
+    const eps = [epC({
+      primary_rationale: { node_chosen: 'direct', task_classification: 'monitoring', recommended_node: '#34', recommended_chain: null },
+      outcome_reviewed: 'rework',
+    })];
+    const rows = buildClassCanonCoverage(eps, testClassMap);
+    expect(rows).toHaveLength(1);
+    const row = rows[0];
+    expect(row.classification).toBe('monitoring');
+    expect(row.count).toBe(1);
+    expect(row.canonicalNodes).toEqual(['#34', '#35']);
+    expect(row.routerRecommended).toBe(1);   // has recommended_node
+    expect(row.claudeTook).toBe(0);           // node_chosen === 'direct'
+    expect(row.recWithinCanon).toBe(1);       // '#34' is in canonical
+    expect(row.rework).toBe(1);
+  });
+
+  it('classification not in map gets canonicalNodes=[]', () => {
+    const eps = [epC({ primary_rationale: { node_chosen: 'direct', task_classification: 'other', recommended_node: null, recommended_chain: null }, outcome_reviewed: 'success' })];
+    const rows = buildClassCanonCoverage(eps, {});
+    expect(rows[0].canonicalNodes).toEqual([]);
+  });
+
+  it('recommended_chain with numeric ids normalized to #N for canon check', () => {
+    const eps = [epC({
+      primary_rationale: { node_chosen: 'direct', task_classification: 'monitoring', recommended_node: null, recommended_chain: [19, 34] },
+      outcome_reviewed: 'success',
+    })];
+    const rows = buildClassCanonCoverage(eps, testClassMap);
+    // chain [19,34] → normalized ['#19','#34']. '#34' is in monitoring canonical → recWithinCanon=1
+    expect(rows[0].routerRecommended).toBe(1);
+    expect(rows[0].recWithinCanon).toBe(1);
+  });
+
+  it('mixed: 3 release + 2 feature episodes, rows sorted by count desc with correct counters', () => {
+    // 3 release, 2 feature (release > feature by count)
+    const eps = [
+      epC({ primary_rationale: { node_chosen: 'direct', task_classification: 'release', recommended_node: '#37', recommended_chain: null }, outcome_reviewed: 'rework', timestamps: { started_at: '2026-05-19T10:00:00Z' } }),
+      epC({ primary_rationale: { node_chosen: 'direct', task_classification: 'release', recommended_node: '#99', recommended_chain: null }, outcome_reviewed: 'success', timestamps: { started_at: '2026-05-19T10:01:00Z' } }),
+      epC({ primary_rationale: { node_chosen: '#37', task_classification: 'release', recommended_node: '#37', recommended_chain: null }, outcome_reviewed: 'success', timestamps: { started_at: '2026-05-19T10:02:00Z' } }),
+      epC({ primary_rationale: { node_chosen: 'direct', task_classification: 'feature', recommended_node: '#19', recommended_chain: null }, outcome_reviewed: 'success', timestamps: { started_at: '2026-05-19T10:03:00Z' } }),
+      epC({ primary_rationale: { node_chosen: 'direct', task_classification: 'feature', recommended_node: null, recommended_chain: null }, outcome_reviewed: 'success', timestamps: { started_at: '2026-05-19T10:04:00Z' } }),
+    ];
+    const rows = buildClassCanonCoverage(eps, testClassMap);
+    // Sorted by count desc: release=3, feature=2
+    expect(rows[0].classification).toBe('release');
+    expect(rows[0].count).toBe(3);
+    expect(rows[0].routerRecommended).toBe(3);   // all 3 have recommended_node
+    expect(rows[0].claudeTook).toBe(1);           // one has node_chosen='#37'
+    expect(rows[0].recWithinCanon).toBe(2);       // '#37' in release canonical for ep1 and ep3; '#99' not in canonical for ep2
+    expect(rows[0].rework).toBe(1);
+    expect(rows[1].classification).toBe('feature');
+    expect(rows[1].count).toBe(2);
+    expect(rows[1].routerRecommended).toBe(1);   // only 1 has recommended_node
+    expect(rows[1].claudeTook).toBe(0);
+  });
+});
+
+describe('buildRouterVsOpus', () => {
+  const epR = (overrides = {}) => ({
+    schema_version: 4,
+    task_id: 'session-abc-12345',
+    timestamps: { started_at: '2026-05-19T10:00:00Z' },
+    primary_rationale: {
+      node_chosen: 'direct',
+      task_classification: 'other',
+      recommended_node: null,
+      recommended_chain: null,
+    },
+    outcome_reviewed: 'unknown',
+    review: {
+      node_quality: 'correct',
+      chain_quality: 'n/a',
+      alternative_better: null,
+      error_root_cause: 'n/a',
+      reasoning: 'ok',
+    },
+    ...overrides,
+  });
+
+  it('one episode in each of A/B/C → 1/1/1', () => {
+    const eps = [
+      // A: router gave a recommendation, Opus disagreed (wrong_node + better alternative)
+      epR({ primary_rationale: { node_chosen: 'direct', task_classification: 'feature', recommended_node: '#19', recommended_chain: null },
+            review: { node_quality: 'wrong_node', chain_quality: 'n/a', alternative_better: '#37', error_root_cause: 'wrong_skill', reasoning: 'x' }, outcome_reviewed: 'rework' }),
+      // B: router silent, alternative_better set
+      epR({ primary_rationale: { node_chosen: 'direct', task_classification: 'planning', recommended_node: null, recommended_chain: null },
+            review: { node_quality: 'correct', chain_quality: 'n/a', alternative_better: '#41', error_root_cause: 'n/a', reasoning: 'should have used planning' }, outcome_reviewed: 'soft_success',
+            timestamps: { started_at: '2026-05-19T10:01:00Z' } }),
+      // C: router gave, node_quality=correct, no alternative
+      epR({ primary_rationale: { node_chosen: 'direct', task_classification: 'release', recommended_node: '#37', recommended_chain: null },
+            review: { node_quality: 'correct', chain_quality: 'n/a', alternative_better: null, error_root_cause: 'n/a', reasoning: 'direct was fine' }, outcome_reviewed: 'success',
+            timestamps: { started_at: '2026-05-19T10:02:00Z' } }),
+    ];
+    const result = buildRouterVsOpus(eps);
+    expect(result.sectionA).toHaveLength(1);
+    expect(result.sectionB).toHaveLength(1);
+    expect(result.sectionC).toHaveLength(1);
+  });
+
+  it('episode without review is excluded from all three sections', () => {
+    const eps = [
+      epR({ review: undefined, primary_rationale: { node_chosen: 'direct', task_classification: 'other', recommended_node: '#19', recommended_chain: null } }),
+    ];
+    const result = buildRouterVsOpus(eps);
+    expect(result.sectionA).toHaveLength(0);
+    expect(result.sectionB).toHaveLength(0);
+    expect(result.sectionC).toHaveLength(0);
+  });
+
+  it('A: episode with recommended_chain array of strings goes into A with routerRecommendation = the array', () => {
+    const eps = [
+      epR({ primary_rationale: { node_chosen: 'direct', task_classification: 'planning', recommended_node: null, recommended_chain: ['#19', '#41'] },
+            review: { node_quality: 'wrong_node', chain_quality: 'missing_step', alternative_better: '#19', error_root_cause: 'wrong_chain_order', reasoning: 'chain needed' }, outcome_reviewed: 'rework' }),
+    ];
+    const result = buildRouterVsOpus(eps);
+    expect(result.sectionA).toHaveLength(1);
+    expect(Array.isArray(result.sectionA[0].routerRecommendation)).toBe(true);
+    expect(result.sectionA[0].routerRecommendation).toEqual(['#19', '#41']);
+  });
+
+  it('B: router silent AND alternative_better truthy → in B; router silent AND alternative_better=null → not in B', () => {
+    const eps = [
+      epR({ primary_rationale: { node_chosen: 'direct', task_classification: 'other', recommended_node: null, recommended_chain: null },
+            review: { node_quality: 'correct', chain_quality: 'n/a', alternative_better: '#60', error_root_cause: 'n/a', reasoning: 'should use docs' }, outcome_reviewed: 'soft_success' }),
+      epR({ primary_rationale: { node_chosen: 'direct', task_classification: 'other', recommended_node: null, recommended_chain: null },
+            review: { node_quality: 'correct', chain_quality: 'n/a', alternative_better: null, error_root_cause: 'n/a', reasoning: 'fine' }, outcome_reviewed: 'success',
+            timestamps: { started_at: '2026-05-19T10:01:00Z' } }),
+    ];
+    const result = buildRouterVsOpus(eps);
+    expect(result.sectionB).toHaveLength(1);
+    expect(result.sectionB[0].opusSuggests).toBe('#60');
+  });
+
+  it('C: router gave + node_quality=correct + no alternative → in C; same but alternative_better truthy → NOT in C', () => {
+    const inC = epR({ primary_rationale: { node_chosen: 'direct', task_classification: 'release', recommended_node: '#37', recommended_chain: null },
+                      review: { node_quality: 'correct', chain_quality: 'n/a', alternative_better: null, error_root_cause: 'n/a', reasoning: 'fine' }, outcome_reviewed: 'success' });
+    const notInC = epR({ primary_rationale: { node_chosen: 'direct', task_classification: 'release', recommended_node: '#37', recommended_chain: null },
+                         review: { node_quality: 'correct', chain_quality: 'n/a', alternative_better: '#41', error_root_cause: 'n/a', reasoning: 'actually #41 better' }, outcome_reviewed: 'rework',
+                         timestamps: { started_at: '2026-05-19T10:01:00Z' } });
+    const result = buildRouterVsOpus([inC, notInC]);
+    expect(result.sectionC).toHaveLength(1);
+    // The one NOT in C (has alternative_better) should be in A instead
+    expect(result.sectionA).toHaveLength(1);
+  });
+
+  it('sectionA item has all expected shape fields', () => {
+    const eps = [
+      // Must be wrong_node or have alternative to end up in A (not C)
+      epR({ primary_rationale: { node_chosen: 'direct', task_classification: 'feature', recommended_node: '#19', recommended_chain: null },
+            review: { node_quality: 'wrong_node', chain_quality: 'n/a', alternative_better: '#37', error_root_cause: 'wrong_skill', reasoning: 'should be #37' }, outcome_reviewed: 'rework' }),
+    ];
+    const result = buildRouterVsOpus(eps);
+    const item = result.sectionA[0];
+    expect(item).toHaveProperty('time');
+    expect(item).toHaveProperty('taskId');
+    expect(item).toHaveProperty('classification');
+    expect(item).toHaveProperty('routerRecommendation');
+    expect(item).toHaveProperty('claudeChose');
+    expect(item).toHaveProperty('opusNodeQuality');
+    expect(item).toHaveProperty('opusChainQuality');
+    expect(item).toHaveProperty('outcomeReviewed');
+    expect(item).toHaveProperty('opusAlternative');
+    expect(item).toHaveProperty('opusRootCause');
+    expect(item.taskId).toHaveLength(8);  // first 8 chars of task_id
+  });
+});
+
+describe('buildChainIgnoreBreakdown', () => {
+  it('returns all zeros for empty input', () => {
+    const result = buildChainIgnoreBreakdown([]);
+    expect(result.totalChainRecommendations).toBe(0);
+    expect(result.ignoredChainCount).toBe(0);
+    expect(result.ignoredChainRework).toBe(0);
+    expect(result.totalNodeOnlyRecommendations).toBe(0);
+    expect(result.ignoredNodeOnlyCount).toBe(0);
+    expect(result.ignoredNodeOnlyRework).toBe(0);
+    expect(result.breakdownByChainLength['1']).toEqual({ count: 0, ignored: 0, rework: 0 });
+    expect(result.breakdownByChainLength['2']).toEqual({ count: 0, ignored: 0, rework: 0 });
+    expect(result.breakdownByChainLength['3+']).toEqual({ count: 0, ignored: 0, rework: 0 });
+  });
+
+  it('chain-len-4 ep with node_chosen=direct and outcome=rework → ignoredChainCount=1, rework=1, 3+ bucket', () => {
+    const eps = [epC({
+      primary_rationale: { node_chosen: 'direct', task_classification: 'planning', recommended_node: null, recommended_chain: ['#19','#41','#42','#37'] },
+      outcome_reviewed: 'rework',
+    })];
+    const result = buildChainIgnoreBreakdown(eps);
+    expect(result.totalChainRecommendations).toBe(1);
+    expect(result.ignoredChainCount).toBe(1);
+    expect(result.ignoredChainRework).toBe(1);
+    expect(result.breakdownByChainLength['3+']).toEqual({ count: 1, ignored: 1, rework: 1 });
+  });
+
+  it('node-only rec ep with node_chosen=direct → ignoredNodeOnlyCount=1', () => {
+    const eps = [epC({
+      primary_rationale: { node_chosen: 'direct', task_classification: 'monitoring', recommended_node: '#34', recommended_chain: null },
+      outcome_reviewed: 'success',
+    })];
+    const result = buildChainIgnoreBreakdown(eps);
+    expect(result.totalNodeOnlyRecommendations).toBe(1);
+    expect(result.ignoredNodeOnlyCount).toBe(1);
+    expect(result.ignoredNodeOnlyRework).toBe(0);
+    expect(result.totalChainRecommendations).toBe(0);
+  });
+
+  it('chains of length 1, 2, 5 bucketed correctly into 1/2/3+', () => {
+    const eps = [
+      epC({ primary_rationale: { node_chosen: 'direct', task_classification: 'other', recommended_node: null, recommended_chain: ['#19'] }, outcome_reviewed: 'success', timestamps: { started_at: '2026-05-19T10:00:00Z' } }),
+      epC({ primary_rationale: { node_chosen: 'direct', task_classification: 'other', recommended_node: null, recommended_chain: ['#19','#34'] }, outcome_reviewed: 'success', timestamps: { started_at: '2026-05-19T10:01:00Z' } }),
+      epC({ primary_rationale: { node_chosen: 'direct', task_classification: 'other', recommended_node: null, recommended_chain: ['#19','#34','#37','#41','#42'] }, outcome_reviewed: 'rework', timestamps: { started_at: '2026-05-19T10:02:00Z' } }),
+    ];
+    const result = buildChainIgnoreBreakdown(eps);
+    expect(result.totalChainRecommendations).toBe(3);
+    expect(result.breakdownByChainLength['1']).toEqual({ count: 1, ignored: 1, rework: 0 });
+    expect(result.breakdownByChainLength['2']).toEqual({ count: 1, ignored: 1, rework: 0 });
+    expect(result.breakdownByChainLength['3+']).toEqual({ count: 1, ignored: 1, rework: 1 });
+  });
+
+  it('chain-rec ep where node_chosen != direct → in totalChainRecommendations but NOT in ignoredChainCount', () => {
+    const eps = [epC({
+      primary_rationale: { node_chosen: '#19', task_classification: 'feature', recommended_node: null, recommended_chain: ['#19', '#34'] },
+      outcome_reviewed: 'success',
+    })];
+    const result = buildChainIgnoreBreakdown(eps);
+    expect(result.totalChainRecommendations).toBe(1);
+    expect(result.ignoredChainCount).toBe(0);
+    expect(result.breakdownByChainLength['2']).toEqual({ count: 1, ignored: 0, rework: 0 });
+  });
+});
+
+describe('analyze — classCanonCoverage / routerVsOpus / chainIgnoreBreakdown integrated', () => {
+  it('analyze() result includes classCanonCoverage, routerVsOpus, chainIgnoreBreakdown keys', () => {
+    const eps = [
+      ep({ schema_version: 4,
+        primary_rationale: { node_chosen: 'direct', task_classification: 'feature', recommended_node: '#19', recommended_chain: null, triggers_matched: [], boundaries_applied: [], step: 1, candidates_considered: [], hard_floor: { invoked: false, rules: [] } },
+        review: { node_quality: 'correct', chain_quality: 'n/a', alternative_better: null, error_root_cause: 'n/a', reasoning: 'ok' },
+        outcome_reviewed: 'success' }),
+    ];
+    const result = analyze(eps);
+    expect(result.classCanonCoverage).toBeDefined();
+    expect(result.routerVsOpus).toBeDefined();
+    expect(result.chainIgnoreBreakdown).toBeDefined();
+    expect(Array.isArray(result.classCanonCoverage)).toBe(true);
+    expect(result.routerVsOpus).toHaveProperty('sectionA');
+    expect(result.routerVsOpus).toHaveProperty('sectionB');
+    expect(result.routerVsOpus).toHaveProperty('sectionC');
+    expect(result.chainIgnoreBreakdown).toHaveProperty('totalChainRecommendations');
+  });
+});
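
The three `routerVsOpus` sections are returned as plain arrays, so a headline number for the retro still has to be derived from them. A minimal sketch follows, assuming the `{ sectionA, sectionB, sectionC }` shape from the diff; `summarizeRouterVsOpus` is a hypothetical helper that is not part of the commit, and the sample arrays are placeholders. Because A and C together cover every reviewed episode that had a router recommendation, `A / (A + C)` reads as the share of recommendations that Opus disputed or flagged as uncertain.

```javascript
// Hypothetical helper (not in the commit): condenses result.routerVsOpus
// into three headline numbers.
function summarizeRouterVsOpus(routerVsOpus) {
  const a = routerVsOpus.sectionA.length; // router gave, Opus disputed or was unsure
  const b = routerVsOpus.sectionB.length; // router silent, Opus named a better node
  const c = routerVsOpus.sectionC.length; // router gave, Opus rated it correct
  const reviewedWithRecommendation = a + c;
  return {
    reviewedWithRecommendation,
    // null (not 0) when there is nothing to divide by
    disputedShare: reviewedWithRecommendation > 0 ? a / reviewedWithRecommendation : null,
    missedBySilentRouter: b,
  };
}

// Placeholder input: only the array lengths matter for this summary.
const sampleRouterVsOpus = {
  sectionA: new Array(3).fill({}),
  sectionB: new Array(1).fill({}),
  sectionC: new Array(9).fill({}),
};

const routerSummary = summarizeRouterVsOpus(sampleRouterVsOpus);
console.log(routerSummary);
```

Section B is reported as a count rather than a share, since router-silent episodes where Opus raised no alternative are not collected in any section and the denominator is therefore not available from this object alone.
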