feat(observer/analyzer): Pass 3 — dynamics fields + 8 axes

Adds 3 new fields to the v4 episode (`task_meta` block) and 8 new factor-matrix axes capturing turn dynamics: prompt complexity, time- of-day rhythms, inter-prompt cadence, MCP-tool reach, file-mix shape, skill / subagent invocation density. Builds on Pass 1 (4f362a9e) and Pass 2 (2bf25db7) per memory/project_brain_factor_analysis_4passes.md. # observer-transcript-parser.mjs New exported helpers (covered by unit tests): - classifyFilePath(path) — 7-bucket path categorizer with priority ordering (test > norm > spec > config > data > src > other). Handles both POSIX and Windows separators, normalises CRLF-tolerant. - extractFileTypeDistribution(files) — counts per bucket, zero-fills missing categories for stable downstream key shape. - extractMcpServers(turn) — unique mcp__<server>__* fingerprints, non-greedy match preserves multi-word server names (e.g. plugin_brand-voice_box, plugin_finance_bigquery). parseTranscript() now attaches a `task_meta` block to every episode: - prompt_length_chars — strlen of first user prompt. - mcp_servers_used — unique MCP fingerprints in the turn. - file_type_distribution — count by classifyFilePath bucket. # brain-retro-analyzer.mjs (8 new FACTOR_FNS axes) - prompt_length_bucket: short (<100) / medium / long / huge / null. - time_of_day_bucket: night (00-05 UTC) / morning / afternoon / evening. - day_of_week: Sun..Sat (UTC). - inter_prompt_gap_bucket: <1m / 1-10m / 10-60m / 60m+ / null. Computed in analyze() as (current.started_at − previous.ended_at) within the same session, then read off `episode._interPromptGapMin` by the axis fn (same pattern as `_inferredOutcome`). - mcp_server_used: any / none. - file_type_main: dominant bucket from file_type_distribution, with 'mixed' on top-bucket ties and 'none' on empty / missing. - skill_invocations_bucket: 0 / 1 / 2+ (Skill tool_summary count). - subagent_spawns_bucket: 0 / 1 / 2+ (Agent or Task tool_summary count). `time_of_day_bucket` / `day_of_week` reject null / empty timestamps explicitly — `new Date(null)` would coerce to the epoch and falsely bucket as 'night' / 'Thu'. # Tests 24 new tests (RED → GREEN): - observer-transcript-parser.test.mjs: 13 tests covering classifyFilePath (6 bucket smokes), extractFileTypeDistribution (2), extractMcpServers (2), parseTranscript task_meta block (2 — populated + empty-transcript defaults). - brain-retro-analyzer.test.mjs: 9 tests for each new axis + a smoke verifying all 8 axes land via analyze() on minimal v2. Targeted sweep: 3708 tests pass across 65 affected suites (2 worktree- CRLF copies pre-existing failures, unrelated). Factor matrix grew 11 → 19 → 21 → 29 axes across Pass 1+2+3. Older episodes without task_meta surface as 'null' / 'none' buckets — no throws, no schema_minor bump needed (task_meta is purely additive). LEFTHOOK=0 due to quirk #111. Manual gitleaks scan: clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 16:50:04 +03:00
parent 2bf25db72e
commit 4010495d19
4 changed files with 398 additions and 0 deletions
@@ -12,6 +12,9 @@ import {
  extractLastUserPromptText,
  classifyTask,
  extractTokenUsage,
+  extractMcpServers,
+  extractFileTypeDistribution,
+  classifyFilePath,
 } from './observer-transcript-parser.mjs';

 // Build a JSONL transcript string from entry objects.
@@ -1813,3 +1816,110 @@ describe('parseTranscript — schema v4.3 write-block fields (phase 3 deferred #
    expect(cost.reviewer_subagent_usd).toBe(0);
  });
 });
+
+describe('classifyFilePath — Pass 3 path-pattern bucketing (project-brain-factor-analysis-4passes)', () => {
+  it('classifies test files', () => {
+    expect(classifyFilePath('tools/foo.test.mjs')).toBe('test');
+    expect(classifyFilePath('app/tests/Feature/X.php')).toBe('test');
+    expect(classifyFilePath('resources/js/foo.spec.ts')).toBe('test');
+  });
+  it('classifies config files', () => {
+    expect(classifyFilePath('package.json')).toBe('config');
+    expect(classifyFilePath('vite.config.ts')).toBe('config');
+    expect(classifyFilePath('lefthook.yml')).toBe('config');
+    expect(classifyFilePath('.env')).toBe('config');
+    expect(classifyFilePath('tsconfig.json')).toBe('config');
+  });
+  it('classifies spec/plan files under docs/superpowers/', () => {
+    expect(classifyFilePath('docs/superpowers/specs/x.md')).toBe('spec');
+    expect(classifyFilePath('docs/superpowers/plans/x.md')).toBe('spec');
+  });
+  it('classifies normative documents', () => {
+    expect(classifyFilePath('CLAUDE.md')).toBe('norm');
+    expect(classifyFilePath('c:\\моя\\проекты\\портал crm\\Документация\\CLAUDE.md')).toBe('norm');
+    expect(classifyFilePath('docs/Pravila_raboty_Claude_v1_1.md')).toBe('norm');
+    expect(classifyFilePath('docs/Plugin_stack_rules_v1.md')).toBe('norm');
+    expect(classifyFilePath('docs/Tooling_v8_3.md')).toBe('norm');
+    expect(classifyFilePath('C:\\Users\\x\\.claude\\projects\\proj\\memory\\foo.md')).toBe('norm');
+  });
+  it('classifies data files', () => {
+    expect(classifyFilePath('docs/observer/episodes-2026-05.jsonl')).toBe('data');
+    expect(classifyFilePath('db/seed.csv')).toBe('data');
+    expect(classifyFilePath('db/schema.sql')).toBe('data');
+  });
+  it('classifies app/tools source under src', () => {
+    expect(classifyFilePath('app/Http/Controllers/X.php')).toBe('src');
+    expect(classifyFilePath('tools/router-classifier.mjs')).toBe('src');
+    expect(classifyFilePath('resources/js/views/Dashboard.vue')).toBe('src');
+  });
+  it('returns other for paths that fit no category', () => {
+    expect(classifyFilePath('some-random-binary.png')).toBe('other');
+  });
+});
+
+describe('extractFileTypeDistribution — Pass 3 (project-brain-factor-analysis-4passes)', () => {
+  it('counts each path bucket and zero-fills missing categories', () => {
+    const dist = extractFileTypeDistribution([
+      'tools/router-classifier.mjs',
+      'tools/router-classifier.test.mjs',
+      'docs/superpowers/specs/x.md',
+      'CLAUDE.md',
+    ]);
+    expect(dist.src).toBe(1);
+    expect(dist.test).toBe(1);
+    expect(dist.spec).toBe(1);
+    expect(dist.norm).toBe(1);
+    expect(dist.config).toBe(0);
+    expect(dist.data).toBe(0);
+    expect(dist.other).toBe(0);
+  });
+  it('returns all-zero distribution for empty input', () => {
+    const dist = extractFileTypeDistribution([]);
+    for (const v of Object.values(dist)) expect(v).toBe(0);
+  });
+});
+
+describe('extractMcpServers — Pass 3 (project-brain-factor-analysis-4passes)', () => {
+  it('extracts unique mcp__<server>__* prefixes from tool_use entries', () => {
+    const turn = [
+      assistantTurn([
+        { type: 'tool_use', id: 't1', name: 'mcp__github__list_issues', input: {} },
+        { type: 'tool_use', id: 't2', name: 'mcp__github__get_pr', input: {} },
+        { type: 'tool_use', id: 't3', name: 'mcp__playwright__browser_click', input: {} },
+        { type: 'tool_use', id: 't4', name: 'Read', input: { file_path: 'a.txt' } },
+      ], '2026-05-25T10:00:00Z'),
+    ];
+    expect(extractMcpServers(turn).sort()).toEqual(['github', 'playwright']);
+  });
+  it('returns empty array when no mcp tools used', () => {
+    const turn = [assistantTurn([{ type: 'tool_use', id: 't1', name: 'Read', input: { file_path: 'a' } }], '2026-05-25T10:00:00Z')];
+    expect(extractMcpServers(turn)).toEqual([]);
+  });
+});
+
+describe('parseTranscript — Pass 3 task_meta block (project-brain-factor-analysis-4passes)', () => {
+  it('includes prompt_length_chars / mcp_servers_used / file_type_distribution under task_meta', () => {
+    const t = jsonl([
+      userPrompt('добавь функцию X в файл tools/router-classifier.mjs', '2026-05-25T10:00:00Z'),
+      assistantTurn([
+        { type: 'tool_use', id: 't1', name: 'mcp__github__get_pr', input: {} },
+        { type: 'tool_use', id: 't2', name: 'Read', input: { file_path: 'tools/router-classifier.mjs' } },
+        { type: 'tool_use', id: 't3', name: 'Edit', input: { file_path: 'tools/router-classifier.test.mjs', old_string: 'a', new_string: 'b' } },
+      ], '2026-05-25T10:01:00Z'),
+    ]);
+    const ep = parseTranscript(t);
+    expect(ep.task_meta).toBeDefined();
+    expect(ep.task_meta.prompt_length_chars).toBe('добавь функцию X в файл tools/router-classifier.mjs'.length);
+    expect(ep.task_meta.mcp_servers_used).toEqual(['github']);
+    expect(ep.task_meta.file_type_distribution.src).toBe(1);
+    expect(ep.task_meta.file_type_distribution.test).toBe(1);
+  });
+
+  it('task_meta is present even on empty transcript (null-safe defaults)', () => {
+    const ep = parseTranscript('');
+    expect(ep.task_meta).toBeDefined();
+    expect(ep.task_meta.prompt_length_chars).toBe(0);
+    expect(ep.task_meta.mcp_servers_used).toEqual([]);
+    expect(ep.task_meta.file_type_distribution.other).toBe(0);
+  });
+});