58784b182d
Closes the 4-pass factor-analysis expansion plan in
memory/project_brain_factor_analysis_4passes.md. Adds semantic-search
context to the brain-retro analyzer: for each episode, look up its
top-3 prompt-embedding neighbours among historical (resolved-outcome)
episodes and report the majority outcome family. This lets the
factor matrix answer "do prompts that look like THIS one usually
succeed or need rework?"
# New module: tools/observer-embedding-index.mjs (pure, fs-free)
- mapOutcomeToFamily(outcome): success / soft_success → 'success',
rework → 'retry', blocked / partial → 'failure', else null.
- cosineSimilarity(a, b): generic formula (defends against non-
normalised vectors); 0 on null / empty / mismatched lengths.
- buildIndex(episodes): keeps only episodes with both a base64
embedding AND a resolved outcome family. Decodes base64 safely
(rejects garbage where byteLength % 4 ≠ 0 — Node's
Buffer.from('garbage', 'base64') silently strips invalid chars).
- findNearestNeighbors(target, index, k, opts): top-k by descending
cosine. Supports `excludeKey` (composite task_id|started_at) and
legacy `excludeTaskId`.
- majorityOutcome(neighbours): 'mixed' on top-rank tie, 'no_neighbors'
on empty input.
- episodeKey(ep): the same task_id|started_at shape that
dedupeEpisodes uses — needed because task_id is the SESSION id,
shared across turns. task_id alone cannot identify a single turn.
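
A minimal sketch of how the helpers above might fit together. Function names follow the list above, but the bodies are assumptions rather than the committed code, as is the index entry shape (`key`, `vector`, `family`). The legacy `excludeTaskId` option is omitted for brevity.

```javascript
// Illustrative only: bodies are assumed, not copied from the module.

function mapOutcomeToFamily(outcome) {
  switch (outcome) {
    case 'success':
    case 'soft_success':
      return 'success';
    case 'rework':
      return 'retry';
    case 'blocked':
    case 'partial':
      return 'failure';
    default:
      return null; // unresolved or unknown outcome
  }
}

// Generic cosine: divides by both norms, so inputs need not be unit length.
function cosineSimilarity(a, b) {
  if (!a || !b || a.length === 0 || a.length !== b.length) return 0;
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// task_id is the session id, so started_at is needed to identify one turn.
function episodeKey(ep) {
  return `${ep.task_id}|${ep.started_at}`;
}

// Top-k by descending cosine, skipping the entry whose key is excludeKey.
function findNearestNeighbors(target, index, k, opts = {}) {
  return index
    .filter((entry) => entry.key !== opts.excludeKey)
    .map((entry) => ({ ...entry, score: cosineSimilarity(target, entry.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

function majorityOutcome(neighbours) {
  if (!neighbours || neighbours.length === 0) return 'no_neighbors';
  const counts = new Map();
  for (const n of neighbours) {
    counts.set(n.family, (counts.get(n.family) ?? 0) + 1);
  }
  const ranked = [...counts.entries()].sort((x, y) => y[1] - x[1]);
  // A tie for first place is reported as 'mixed' rather than picking one.
  if (ranked.length > 1 && ranked[0][1] === ranked[1][1]) return 'mixed';
  return ranked[0][0];
}
```

In this sketch, two turns of the same session get different keys, so excluding one by `excludeKey` still leaves the other eligible as a neighbour.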
# brain-retro-analyzer.mjs
- New FACTOR_FNS axis similar_past_outcome_majority reading the
pre-computed episode._similarPastOutcomeMajority field.
- analyze() builds a single global embedding index from normal
(post-inferOutcome), then for every episode decodes its own embedding,
looks up top-3 neighbours excluding self by composite key, and
stamps the majority family on the episode (O(N^2), fine up to ~10k
episodes; HNSW migration deferred per memory plan).
- Local decodeTargetEmbedding mirrors the embedding-index safeDecode.
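
The decode-and-stamp flow described above could look roughly like the sketch below. It is a sketch under stated assumptions, not the committed code: the little-endian Float32 layout, the `embedding` and `family` field names, and the `stampSimilarOutcomes` function are all hypothetical. Only `_similarPastOutcomeMajority`, the composite key, and the top-3 lookup come from this commit.

```javascript
// Illustrative only. Assumes embeddings are base64-encoded little-endian
// Float32 arrays; the commit states only the byteLength % 4 check.

function safeDecode(b64) {
  if (typeof b64 !== 'string' || b64.length === 0) return null;
  // Buffer.from ignores invalid base64 characters instead of throwing,
  // so validate the decoded length instead of relying on an exception.
  const buf = Buffer.from(b64, 'base64');
  if (buf.byteLength === 0 || buf.byteLength % 4 !== 0) return null;
  const out = new Float32Array(buf.byteLength / 4);
  for (let i = 0; i < out.length; i++) out[i] = buf.readFloatLE(i * 4);
  return out;
}

function cosine(a, b) {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

function majority(families) {
  if (families.length === 0) return 'no_neighbors';
  const counts = new Map();
  for (const f of families) counts.set(f, (counts.get(f) ?? 0) + 1);
  const ranked = [...counts.entries()].sort((x, y) => y[1] - x[1]);
  return ranked.length > 1 && ranked[0][1] === ranked[1][1] ? 'mixed' : ranked[0][0];
}

// Hypothetical driver: one global index, then a brute-force scan per
// episode, which is where the O(N^2) cost comes from.
function stampSimilarOutcomes(episodes, k = 3) {
  const keyOf = (ep) => `${ep.task_id}|${ep.started_at}`;
  const index = [];
  for (const ep of episodes) {
    const vector = safeDecode(ep.embedding);
    if (vector && ep.family) index.push({ key: keyOf(ep), vector, family: ep.family });
  }
  for (const ep of episodes) {
    const target = safeDecode(ep.embedding);
    const top = target
      ? index
          .filter((e) => e.key !== keyOf(ep) && e.vector.length === target.length)
          .map((e) => ({ family: e.family, score: cosine(target, e.vector) }))
          .sort((x, y) => y.score - x.score)
          .slice(0, k)
      : [];
    ep._similarPastOutcomeMajority = majority(top.map((n) => n.family));
  }
  return episodes;
}
```

Because exclusion uses the composite key rather than `task_id`, other turns of the same session remain valid neighbours; only the episode itself is skipped.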
# Tests
20 new tests (RED -> GREEN):
- observer-embedding-index.test.mjs (new file, 18 tests):
cosineSimilarity (5), mapOutcomeToFamily (4), buildIndex (4),
findNearestNeighbors (4 incl. self-exclusion), majorityOutcome (3).
- brain-retro-analyzer.test.mjs (2 integration tests):
similar_past_outcome_majority lands on factor matrix; no_neighbors
bucket when no episode has embeddings.
Targeted sweep: 632/632 PASS on the 2 directly-affected suites.
Broader tools/ sweep: 7968/7969 PASS. 1 pre-existing test failure in
observer-self-assessment-api.test.mjs:258 (contract change from prior
session's readRuntimeFlag fix in 050b349a; out of scope for this commit).
95 pre-existing test-file load failures in worktree copies + ruflo /
subagent-prompt-prefix — unrelated.
Factor matrix grew 11 -> 19 -> 21 -> 29 -> 30 axes across Pass 1+2+3+4.
LEFTHOOK=0 due to quirk #111. Manual gitleaks scan: clean.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>