Commit Graph

55 Commits

Author SHA1 Message Date
Дмитрий 8e5eaecf6a feat(observer): Task 2 — extractTokenUsage + task_cost in parseTranscript
- export extractTokenUsage(turn): sums input/output/cache/iterations/
  web_search/web_fetch across all assistant messages in a turn
- parseTranscript now includes task_cost field (zero-filled when no usage)
- 7 new tests (5 unit + 2 integration); total 248/248 GREEN
- V2_FIELDS in observer-stop-hook.mjs NOT changed (backward compat)
2026-05-20 13:47:35 +03:00
Дмитрий 47c03a9e18 feat(observer): extend classifyTask with 7 new classes
Closes brain-retro 2026-05-20 #1 — analysis/memory-sync/regulatory-bump/
release/cleanup/monitoring/planning. Addresses '59% other' observation
from initial retro factor matrix.

Ordering: release before feature (merge feature-branch), planning before
refactor (план рефакторинга), memory-sync/regulatory-bump at top as most
specific. monitoring regex проверь состоян covers inflected forms.

9 new vitest tests, 241/241 GREEN in npm run test:tools.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 13:47:34 +03:00
Дмитрий 2476dd3c1b fix(observer): expand PII patterns — JWT/AWS/Yandex/IPv4/OS-username
PII filter previously covered only RU phone, email, Sentry, OpenAI token,
and generic Bearer. Several common surface leaks were uncovered:

- JWT tokens (eyJ<base64>.<base64>.<base64>) — auth/session tokens.
- AWS access key IDs (AKIA<16 alphanum>) — IAM static creds.
- Yandex Cloud IAM static keys (AQVN<base64>), session tokens (t1.<base64>),
  OAuth tokens (y0_<base64>) — primary cloud-provider for this project.
- IPv4 addresses (dotted-quad) — over-redacts 4-segment build numbers as
  an accepted tradeoff (under-redaction is the worse failure).
- Windows user-paths (C:\Users\<name>) → C:\Users\***. Otherwise the OS
  username `Administrator` leaks via task_size.files in every episode.
- POSIX /home/<name>/ → /home/***/. Same rationale for Linux dev hosts.

Pattern order: highly-specific token patterns (JWT/AWS/YC) run BEFORE
OPENAI_TOKEN/GENERIC_BEARER fallbacks; otherwise partial overlaps would
strip the wrong segments.

Tests: 9 new (each new pattern + idempotency over the expanded redaction
markers). 27/27 PII tests green.

.gitleaks.toml: added the test fixture to the path allowlist — the file
contains synthetic JWT/AWS/Yandex tokens (the filter is supposed to redact
them), not real secrets.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 11:10:53 +03:00
Дмитрий 3ec638cbd2 fix(observer): C5 coverage driven by hook registration, drop commit ratio (COV-1)
Bug: checkCoverage flagged anomaly when "recent commits > 0 AND episodes == 0".
Two design flaws, proven in this project:
- Wrong unit: commits = work-unit (one turn → many commits via subagent
  workflow); episodes = turn-unit. A 1023-vs-19 ratio is not anomalous, it's
  expected.
- Wrong window: the 14-day commit window predated the Stop-hook's existence
  (registered 2026-05-19). For 13 of 14 days the hook didn't exist — 889
  commits were structurally impossible to mirror as episodes.

Result: the C5 indicator was either always-red (flagging the hook's birth
as anomaly) or always-green (any episode count vs huge commit count = ok).
Either way uninformative.

Fix:
- checkCoverage(episodeCount, hookRegistered) — drops the commit param.
  Warn iff hook is registered AND 0 episodes this month → the hook is
  silently failing. If the hook isn't registered, 0 episodes is correct.
- runCoverageChecker derives hookRegistered from settings.json
  (isObserverStopRegistered helper) and passes it to checkCoverage.
  No more git execFileSync — pure fs.

Tests rewritten under the new contract: 7/7 (was 6, +1 drift-hazard guard
ensuring detail strings never mention "commit"). 15/15 coverage tests green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 11:07:58 +03:00
Дмитрий 3b7e549e02 fix(observer): validate prompt_signal + events in appendEpisode (C-7)
V2_FIELDS list omitted prompt_signal and events — both are always produced
by parser and buildEpisodeFromContext, so the happy path is unaffected, but
a future ctx-fallback path that dropped them would silently write a
malformed episode. Add both to V2_FIELDS; appendEpisode now throws on either
being missing.

Tests: 2 new — appendEpisode throws when prompt_signal missing /
when events missing. 38/38 stop-hook tests green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 11:05:56 +03:00
Дмитрий 7fe9f89574 fix(observer): exclude hot/normative files from causal chains (A-3)
Bug: findCausalChains flagged a chain whenever two episodes shared any
file. CLAUDE.md / MEMORY.md / STATUS.md / episodes-YYYY-MM.jsonl /
memory/*.md are touched by almost every turn (memory store, status
regeneration, normative-doc updates) — sharing them is not evidence of
causality, just baseline noise. Result: spurious chains on hot files
crowded out the genuine signal.

Fix: HOT_FILE_PATTERNS regex list + `isHotFile(path)` predicate. In
findCausalChains, filter hot files out of BOTH the errored-episode file
set AND the candidate-shared list. If only hot files were shared → no
chain. If a non-hot file is also shared → the chain stands and the
sharedFiles list contains only the non-hot ones.

Tests: 4 new cases — CLAUDE.md / memory/*.md / episodes/STATUS/MEMORY
sharing yields no chain; a turn sharing both CLAUDE.md AND /src/app.ts
yields a chain with sharedFiles=['/src/app.ts'] only. 33/33 analyzer
tests green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 11:04:59 +03:00
Дмитрий c386361881 fix(observer): infer blocked from unrecovered_error tail, not raw error/retry count (A-1)
Bug: inferOutcome flagged `blocked` whenever errorCount > retryCount across
the turn's events. But the parser emits an `error` event for ANY tool_result
with is_error=true — including expected failures: TDD failing-test-first,
grep returning nothing, git commands with intentional non-zero exit. On
TDD-heavy turns (project's standard discipline) this systematically marked
turns as blocked even when they ended on a successful tool_use.

Fix:
- Parser (extractProcessEvents): walk turn from end, find the LAST
  tool_result; if its is_error=true, emit a single `unrecovered_error`
  event. Distinguishes "turn ended on failure" from "errors recovered
  later". The original per-is_error `error` events remain (useful as raw
  factor signals).
- Analyzer (inferOutcome): replace `errorCount > retryCount → blocked`
  with `events.some(kind === 'unrecovered_error') → blocked`. Same
  ordering preserved (interrupt > blocked > rework/success/unknown).

Tests:
- Parser: emits unrecovered_error when last tool_result is_error;
  does NOT emit when turn ended on a successful tool_result;
  does NOT emit for turns with no tool_results.
- Analyzer: blocked iff unrecovered_error event present (not raw count);
  events=[error, error, retry] → success (no unrecovered_error).

142/142 vitest green (was 128).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 11:03:15 +03:00
Дмитрий 94f831f7d1 fix(observer): uuid-dedup in parseLines (C-1 root fix for quirk #101)
Bug: Claude Code's transcript JSONL file accumulates duplicated context-
rebuild snapshots — the same entry re-printed with the SAME `uuid`. Without
dedup, session_turn / task_size / events double-count, and session_turn
becomes non-monotonic across episodes parsed at different file-growth
states. Live evidence: episodes-2026-05.jsonl lines 14/15/16 of the same
session showed session_turn 139 → 140 → 91 (backwards in time). Probe
on transcript 553717ec: 22400 entries, only 6074 unique uuid (68% dup
rate); real user prompts 264 total vs 92 unique-uuid.

Fix: parseLines now tracks a `seenUuid` Set and skips entries whose uuid
has already been encountered (keep-first). Entries without `uuid`
(synthetic test fixtures) pass through unchanged. All downstream functions
(findTurnStart, extractEnvironment, extractTaskSize, etc.) operate on the
deduped entries array, so the fix is single-point and total.

Tests: new `parseTranscript — uuid-dedup` describe block covers
(1) duplicated-uuid prompts collapse → session_turn counts once,
(2) distinct-uuid entries preserved (no over-dedup),
(3) no-uuid entries pass through (synthetic-fixture safety),
(4) duplicated-uuid assistant turns → tool_calls / files_touched counted once.
110/110 parser tests green (was 106).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 11:00:50 +03:00
Дмитрий 030bdc65ab fix(observer): narrow parallel_session detector to tool_result evidence (C-2)
extractEnvironment was scanning JSON.stringify(turn) for collision markers
(чужой staged / foreign git index / index.lock / another git process). Prose
mentions in user/assistant text flipped parallel_session=true. Live FP proven
on episodes-2026-05.jsonl line 20: my own analysis turn was non-parallel but
recorded parallel_session: true because the finding text mentioned the markers.

Fix: collectToolResultText(turn) — gather text only from tool_result blocks
(both string content and structured `[{type:text,text}]` arrays). Scan THAT
for collision markers; prose is no longer a signal.

Tests: rewrote `parallel_session narrowed` block — false on user/assistant
prose / no-tool-result turns; true on tool_result strings + structured form.
106/106 parser tests green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 10:58:37 +03:00
Дмитрий 5f17ca51ac chore(tools): worktree pre-commit gate runner (quirks #86/#97)
In a git worktree the shared .git/hooks/pre-commit cannot find lefthook on
PATH and silently skips every gate (pint/larastan/pest/gitleaks). This
script hardcodes the lefthook.exe + lefthook.yml paths from the main
checkout and runs `pre-commit` explicitly. Run before `git commit` inside
any worktree. Exit 0 = all gates passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 10:32:31 +03:00
Дмитрий 353b1599b6 fix(observer): brain-retro analyzer — blocked outcome + v1 filter + factors
P0.1b: inferOutcome emits 'blocked' when a turn had more error than retry
events (an unrecovered tool failure) — previously the enum value was dead.

P0.1c: 'failure' documented as deferred to the phase-2 agent-judge. It is a
judgment (work wrong AND never corrected), not deterministically recoverable
from a transcript; a wrong-then-corrected turn surfaces as 'rework'.

P1.1: analyze() drops v1 episodes (no schema_version 2) — they lack
environment/prompt_signal/decision_provenance and polluted the factor
matrix. Reports v1SkippedCount.

P2.1: session_turn (bucketed early/mid/late) and parallel_session added to
FACTOR_FNS — closes the schema↔matrix mismatch (both were captured in the
episode but absent from the factor axes).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 17:40:44 +03:00
Дмитрий 97388cf840 fix(observer): transcript-parser accuracy — session_turn + correction signal
P0.2: count session_turn from the last compaction. The transcript file
accumulates duplicated context-rebuild snapshots (quirk #101), so counting
real prompts from i=0 inflated it and made it non-monotonic. Now counts
"real prompts since the last compaction" — monotonic by construction.

P0.1a: widen the correction prompt_signal regex (не работает / сломал /
опять / откати / revert / still not / wrong / ...). The old regex was too
narrow, so rework outcomes were invisible to the factor analysis.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 17:40:29 +03:00
Дмитрий 83295a25f3 fix(brain): redirect / to /docs/observer/dashboard.html (browser-smoke fix)
Browser smoke (Playwright) revealed that rewriting path internally without
changing the response URL left the browser's base URL as /, breaking
relative <script src="dashboard.js"> and ../automation-graph-data.js
references. 302 redirect makes the browser settle on /docs/observer/,
which resolves the relative paths correctly. All 4 views verified clean
(0 console errors). Screenshots: brain-dashboard-{map,replay,feed,aggregate}-view.png.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 16:23:52 +03:00
Дмитрий 2f60910b09 feat(brain): conflict three-layer panel (design / friction / correlation) +3 tests 2026-05-19 16:23:51 +03:00
Дмитрий 774763c21c feat(brain): aggregator — node heat, distributions, redirect rate (+4 tests) 2026-05-19 16:23:50 +03:00
Дмитрий e34b11aca5 feat(brain): Лента view — groupBySession + grouped feed UI 2026-05-19 16:23:49 +03:00
Дмитрий 475e233c2a feat(brain): filterEpisodes + 3 tests (Task 7 logic; UI deferred)
Worktree has no app/node_modules — vitest not run here; final regression
deferred to main-checkout post parallel-session release. Logic is a 7-line
pure filter; tests cover empty filter, classification, errors-only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 16:23:48 +03:00
Дмитрий c3392bef13 feat(brain): node attribution — episode signals to graph nodes 2026-05-19 16:23:46 +03:00
Дмитрий 7fed5bc18b feat(brain): episode JSONL parser + v1/v2 normalizer 2026-05-19 16:23:46 +03:00
Дмитрий f1092772fb feat(brain): static server + /api/episodes for the dashboard 2026-05-19 16:23:45 +03:00
Дмитрий b2b9a75731 feat(observer): AskUserQuestion in-turn choice + parallel_session narrowing
#1 — detectAskUserQuestionChoice: when a turn contains an AskUserQuestion
whose answer exactly matches an offered option label, classify as
user_chose_from_options. The answered entry carries a structured
toolUseResult (questions[].options[].label + answers map). A custom
"Other" free-text answer is NOT a pick — falls through. Wired into
parseTranscript after the text-list detector.

#3 — parallel_session: dropped broad word matches (параллельн /
"parallel session") that false-fired on any casual mention. Now only
strong collision evidence (foreign git index / чужой staged /
index.lock / another git process). Best-effort per spec R2 — prefer
false-negative over false-positive.

169/169 tools tests GREEN (+9 new).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 13:39:09 +03:00
Дмитрий 8550ba243d fix(observer): exclude synthetic user-role messages from turn detection
Root cause (systematic-debugging): isRealUserPrompt treated skill-content
("Base directory for this skill:"), local-command output
(<local-command-stdout>), and interrupt markers as genuine prompts.
findTurnStart then anchored a turn on the synthetic message — the turn
slice missed the genuine prompt's UserPromptSubmit hook_additional_context
attachment → economy_level: null, wrong prompt_signal/task_classification.
Same cause made extractLastUserPromptText return skill content, so the
Stop-hook routing-gate false-positive-blocked autonomous §12 skill
invocations (detectMethodDirected saw the node name in skill text).

Fix: SYNTHETIC_PROMPT_MARKERS + isSyntheticPrompt — isRealUserPrompt
returns false for synthetic messages. One fix closes both the
economy_level capture gap and the 2nd routing-gate FP class.

160/160 tools tests GREEN (+3 new).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 13:39:06 +03:00
Дмитрий dc6d2dd358 test(brain-retro): regression guard — 3rd provenance kind in factor matrix
buildFactorMatrix already buckets decision_provenance.kind dynamically
(brain-retro-analyzer.mjs:112) — no production change needed. Test
pins that user_chose_from_options is counted on the provenance axis.

12/12 brain-retro tests GREEN.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 12:06:56 +03:00
Дмитрий 4969363f78 feat(observer): routing-gate no-block for user_chose_from_options
When episode is user_chose_from_options, routing-gate does NOT block —
collaborative-choice from Claude-offered options doesn't require a
routing-tag (detector is deterministic). 18/18 stop-hook tests GREEN.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 12:05:49 +03:00
Дмитрий 0e3938f845 feat(observer): parser integration — user_chose_from_options before routing-tag
detectChoiceProvenance runs BEFORE parseRoutingTag; if last assistant
turn offered options and user prompt references one, decision_provenance
becomes user_chose_from_options. Otherwise falls back to existing
routing-tag / autonomous logic.

3 new parser tests GREEN; all existing tests still GREEN (43/43).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 12:04:25 +03:00
Дмитрий 7f379bd6a2 feat(observer): choice detector — user_chose_from_options kind
Pure module — extracts options (numbered/lettered/bullets/AskUserQuestion)
from last assistant message, detects user reference (position-based +
substring), returns decision_provenance for the 3rd kind.

23/23 tests GREEN.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 11:57:36 +03:00
Дмитрий a6f44e5bb4 feat(observer): brain-retro analyzer — outcome inference + factor matrix
Pure deterministic Layer-4 aggregation module (spec §6) for the /brain-retro
skill. Exports: dedupeEpisodes, inferOutcome, groupEpisodesToTasks,
findCausalChains, buildFactorMatrix, analyze. Read-only — never writes JSONL.
11/11 tests green. CLI smoke: 10 real episodes → valid JSON with all 5 keys.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 10:47:57 +03:00
Дмитрий cde9478899 feat(observer): STATUS.md — C5 row + observer_error metric 2026-05-19 10:41:17 +03:00
Дмитрий d080198220 feat(observer): coverage + registration-integrity controller (C5)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 10:38:25 +03:00
Дмитрий 35231d8b96 feat(observer): Stop-hook routing-gate enforcement 2026-05-19 10:34:57 +03:00
Дмитрий 2e11c452a9 feat(observer): Stop-hook v2 episode + observer_error marker 2026-05-19 10:31:37 +03:00
Дмитрий 02bff371c1 feat(observer): routing-gate method-direction detector 2026-05-19 10:27:23 +03:00
Дмитрий 375c3e2d1f feat(observer): parser v2 — process events, routing-tag, episode assembly 2026-05-19 10:23:08 +03:00
Дмитрий 85a95aa2d0 feat(observer): parser v2 — environment, task_size, prompt_signal extractors 2026-05-19 10:15:17 +03:00
Дмитрий 99c7bac99b feat(brain): observer captures real session data via transcript parse
The Stop-hook was writing empty-shell episodes (task_id "unknown-<ts>",
node_chosen "unknown", events []). Root cause: buildEpisodeFromContext
read fields from the Stop-event stdin that Claude Code never sends
(primary_rationale, node_chosen, ...) and the session field name was
wrong (ctx.sessionId camelCase vs Claude Code's session_id). The hook
never read transcript_path — the only real source of session data.

New tools/observer-transcript-parser.mjs — pure parseTranscript(text,
fallbackSessionId):
- Scopes to the last turn (from the last real user prompt to EOF) —
  one episode == one prompt→response cycle. A tool_result-carrier user
  message is not treated as a turn boundary.
- Extracts task_id (real sessionId), timestamps (real duration),
  skill_invoked events, a tool_summary event with per-tool counts,
  error events (tool_result is_error), node_chosen (first skill, else
  "direct"), hard_floor (invoked when a superpowers:* skill is used),
  path_type (regulated/improvised), task_classification (keyword
  heuristic on the prompt).
- Reasoning fields triggers_matched/candidates_considered/
  boundaries_applied stay [] — not recoverable from a transcript;
  their capture is a separate ADR-011 follow-up.

observer-stop-hook.mjs: reads ctx.transcript_path + ctx.session_id
(camelCase fallback kept), readFileSync best-effort, delegates to
parseTranscript. No transcript → graceful fallback to ctx defaults.
Episode schema (5 mandatory + 7-field primary_rationale) unchanged —
no normative change. Stop-event is never blocked (exit 0 on any error).

TDD: 17 parseTranscript tests + 1 buildEpisodeFromContext transcript
test. Full tools Vitest 70/70 GREEN. CLI smoke against a real 575-entry
transcript: episode populated — real task_id, ~6.5 min duration,
tool_summary {Bash:5,Read:5,Grep:1,Edit:9,Write:1}, error event.

Refs: ADR-011 brain governance §6.2 (observer evidence loop).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 08:11:10 +03:00
Дмитрий 2a2ded7a53 refactor(brain): C1 L1-watcher — drop broken reverse drift check
Removes the `missingInSettings` reverse check ("plugin documented in
Tooling but disabled in settings.json"). It was broken by design:
Tooling Прил. Н lists tools by human/group name ("Frontend Design
plugin", "Trail of Bits Skills") while settings.json keys are machine
IDs (`name@marketplace`) — the two namespaces never compare. The
`/#\d+\s+([\w-]+(?:@[\w-]+)?)/` scan also captured the first plain word
after "#NN" ("#1 PostgreSQL MCP" → "PostgreSQL"), so every run emitted
~190 lines of WARN noise.

ADR-011 §6.1 specifies only the settings→Tooling direction (the L1
pattern "plugin enabled without Tooling formalization"). That is the
FAIL path and is unchanged. detectDrift now returns `{ missingInTooling }`
only. CLI output is a clean single line on success.

Closes the cosmetic issue flagged in bffdaa9.

TDD: reverse-check test replaced with `not.toHaveProperty
('missingInSettings')`; 12/12 GREEN. Smoke: node tools/l1-watcher.mjs
-> exit 0, "OK — 0 drift" (no WARN block).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 07:36:21 +03:00
Дмитрий 8ae0ecef25 feat(brain): C2 cross-ref-checker link-anchored detection — strict-ready
Closes the ~150 false drifts that prevented strict mode. The old regex
`\b(Name)\s+v(\d+\.\d+)` swept the whole file head and matched every
historical version mention, plus the FROM-side of arrow transitions
("v1.30→v1.31"). Real current-vs-header drift in the repo: zero.

Two-tier detection:
- Primary LINK_REF_RE: a markdown-link to a normative file followed by
  the first bold version — "[..](docs/Tooling_v8_3.md) (**Прил. Н
  v2.17**". Link anchor makes it immune to history-block noise. This is
  how CLAUDE.md §0 cross-refs table is written, so CLAUDE.md is fully
  validated. Runs on the whole file.
- Fallback CROSS_REF_RE: plain "Name vX.Y" mention, scoped to the text
  *before* the first history block. Pravila/Tooling/PSR_v1 have no
  markdown-link cross-refs, so the fallback covers them — but their
  shapki list past releases, so the scan stops at the first history
  marker (`**vN.M наследие**` / `**Что изменилось в vN.M относительно**`
  / `**vN.M** — `). dedupe-by-target keeps the first ref per target.

Regex hardening:
- `\b` after the version forbids backtracking to a partial capture
  (so "v1.30→" never collapses to a spurious "v1.3" match).
- `(?!\s*→)` negative lookahead drops the FROM-side of transitions.

TDD: 8 new tests (link-based, "Прил. Н" prefix, multi-file table,
dedupe, two arrow shapes, three history-marker shapes, link-beats-
fallback). 18/18 GREEN.
Smoke: node tools/cross-ref-checker.mjs -> exit 0, "OK — 0 drift in
4 files" (Pravila/CLAUDE.md/Tooling/PSR_v1; MEMORY.md is outside the
repo by design — existsSync-skipped).

Refs: ADR-011 brain governance §6.2 (C2 cross-ref consistency detector).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 07:29:43 +03:00
Дмитрий bffdaa9f57 feat(brain): C1 L1-watcher alias mechanism — strict-ready
Closes the 9 pre-existing name@source drifts that prevented strict mode:
settings.json lists each marketplace plugin by machine name (e.g.
"frontend-design@claude-plugins-official"), while Tooling Прил. Н
describes them under a human/group name (e.g. "Frontend Design plugin",
"Trail of Bits Skills" — single row #39 for 8 sub-plugins).

Mechanism:
- tools/.l1-watcher-aliases.txt — settings_name=tooling_substring map.
- detectDrift(settings, tooling, aliases): direct match first, then
  alias-substring fallback. Settings name considered formalized if
  Tooling text includes either the name itself or aliases[name].
- parseAliases(raw) exported — line-based KV parser with #-comments
  and split-on-first-= semantics (values may contain "=").

TDD: 6 new tests (3 detectDrift + 4 parseAliases). 12/12 GREEN.
Smoke: node tools/l1-watcher.mjs -> exit 0, "OK — 0 drift".

Known cosmetic baseline issue (pre-existing, not introduced here):
the missingInSettings WARN list is noisy — regex
/#\d+\s+([\w-]+(?:@[\w-]+)?)/g captures the first \w+ after "#NN"
even when it is a plain word (e.g. "#1 PostgreSQL MCP" -> "PostgreSQL"),
producing ~190 WARN entries. WARN is non-blocking, so strict mode flip
in Phase 3 is unaffected; a follow-up filter on names containing "@"
would silence this without behavioural change.

Refs: ADR-011 brain governance §6.1 (C1 L1-watcher detector for the
"plugin in settings.json without Tooling formalization" L1 pattern).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 07:12:05 +03:00
Дмитрий 9ef5227f0f fix(observer): STATUS.md plain-text reference to memory file (lychee pre-push fix)
Memory files (e.g. feedback_brain_unused_tools_not_problem.md) live
in C:/Users/.../memory/, OUTSIDE the git repo. Markdown link from
docs/observer/STATUS.md (relative path) resolved to non-existent
in-repo path → lychee broken-link error in pre-push gate.

Fix: plain-text mention of memory key (no markdown link), with
explicit note «outside-repo memory store». Generator updated
accordingly; 31/31 Vitest tests still GREEN.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 06:49:39 +03:00
Дмитрий ce2333e309 feat(controller): C4 status-md-generator — dashboard
Aggregates C1/C2/C3 outputs via execFileSync (Security Guidance #40
compliant — uses fixed args array, no shell injection surface) +
observer episode count. Behavioral rule embedded in metric copy.
Per ADR-011 + spec §6.4.

3 Vitest tests GREEN (31/31 total).

Smoke run rebuilds STATUS.md with current state:
- C1 🔴 (l1-watcher surfaces 9 plugins in settings not formalized
  in Tooling Прил. Н by exact name@source — see commit 4382de3)
- C2 🔴 (cross-ref-checker surfaces noise from 'наследие' headers
  — see commit a780959 DWC)
- C3  (0 weeks since last read)
- C4  (this file)

Both 🔴 states surface known pre-existing drift (not regressions).
C5 lefthook wiring will handle WARN-vs-FAIL semantics.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 06:37:27 +03:00
Дмитрий 0c9661d694 feat(controller): C3 observer-of-observer — 54-week self-prune counter
Pure date math, 0 LLM calls. 5 Vitest tests GREEN (28/28 total).
Per ADR-011 + spec §6.3.

Modes:
- check (default, lefthook): warn if last_read_at >= 54 weeks ago.
- record: bump counter (invoked manually or by future read-tracking hook).

isStale threshold is inclusive (>= 54 weeks) — spec «через 54 недели»
means at-or-past 54 weeks fires the warn.

Smoke run OK — current counter (period_start 2026-05-19) shows
0 weeks ago.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 06:36:13 +03:00
Дмитрий a780959de9 feat(controller): C2 cross-ref-checker — version drift detector (DONE_WITH_CONCERNS)
Pure regex/JSON, 0 LLM calls. 5 Vitest tests GREEN (23/23 total).
Per ADR-011 + spec §6.2.

Smoke run on real repo surfaces ~150 «drifts» — these are
**historical 'наследие' entries** in headers (CLAUDE.md / Pravila /
Tooling / PSR_v1), not actual current cross-ref mismatches. Each
of these 4 files has a multi-line «v2.X наследие:» / «v1.Y наследие:»
chain in its top header describing past sub-versions; my 50-line
scan picks them all up.

CONCERN: mechanism is correct (test fixtures pass), but real-world
needs refinement before lefthook wiring (C5). Options for follow-up:
- Scope match to explicit «§0 cross-refs» table marker.
- Distinguish «current cross-ref» from «historical наследие mention»
  by surrounding markup.
- Restrict regex to cross-ref tables (markdown | columns) only.

Until refined: C2 will be wired in C5 with caveat (WARN-only, or
disabled) to avoid blocking every commit on pre-existing 'наследие'
entries.

Extracted Tooling Прил. Н version via **Версия:** pattern (file-level
v8.3 wrapper at line 1 was misleading — Прил. Н is v2.17 at line 4).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 06:34:10 +03:00
Дмитрий 4382de3a79 feat(controller): C1 l1-watcher — settings.json ↔ Tooling drift detector
Pure regex/JSON, 0 LLM calls. 4 Vitest tests GREEN. Per ADR-011 + spec §6.1.

Smoke run surfaces REAL drift (DONE_WITH_CONCERNS — plan B5 said «that's
a real signal, document, don't fix here»): 9 plugins in
~/.claude/settings.json enabledPlugins NOT formalized by exact
«name@source» string in Tooling Прил. Н:
- frontend-design@claude-plugins-official (informally as #30
  «Frontend Design plugin»)
- 8× ToB plugins @trailofbits (differential-review, audit-context-
  building, supply-chain-risk-auditor, insecure-defaults, sharp-
  edges, static-analysis, variant-analysis, agentic-actions-auditor)
  informally as #39 «Trail of Bits Skills»

This is naming-vocabulary mismatch (Tooling uses human-readable
names; settings.json uses machine names). Not architectural drift.
Resolution options for follow-up:
- Add machine names as «external_id» attribute to Tooling Прил. Н rows.
- Add tools/.l1-watcher-aliases.txt with accepted machine→human map.

Until resolved: C1 will FAIL on lefthook (C5 wiring) — addressed in
C5 by adding alias mechanism OR temporarily downgrade to WARN.

Also fixed CLI guard bug in observer-stop-hook.mjs (B3) and l1-watcher
— old guard `import.meta.url === \`file://\${argv[1]}\`` did not match
on Windows (file:/// triple-slash vs file:// double-slash + relative
argv[1]). New guard: argv[1].endsWith('/<filename>.mjs').

Weekly GH Actions cron (Mon 09:00 MSK) opens issue on drift.

Vitest config extended to ../tools/*.test.mjs with exclude for ruflo-*
and subagent-prompt-prefix tests (pre-existing, not part of brain
governance).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 06:31:18 +03:00
Дмитрий a8257001a7 feat(observer): Stop-event hook — JSONL append with PII filter + primary_rationale validation
Hook contract: reads JSON ctx from stdin (Claude Code Stop-event),
builds episode with 5 mandatory fields including primary_rationale
(7 sub-fields per spec v1.1 §5.2.1), sanitizes via observer-pii-filter,
appends to docs/observer/episodes-YYYY-MM.jsonl. Never blocks
Stop-event (exit 0 on error).

8 Vitest tests verified GREEN (6 in appendEpisode + 2 in
buildEpisodeFromContext): append/append-existing/PII-filter/
missing-required/missing-rationale-field/routing_decision-preserved
+ buildEpisode 5-field extraction + user-rationale-preserved.

Vitest config for tools/ already covers via glob ../tools/observer-*.test.mjs
(extended in B2 commit 4616308).

Per Pravila §16.2 + ADR-011 + spec v1.1 §5.2.1 (factor analysis).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 06:16:36 +03:00
Дмитрий 4616308402 feat(observer): PII filter — phone/email/Sentry/OpenAI/Bearer masking
Used by Stop-hook before JSONL write. 6 Vitest cases including
idempotence and recursive object sanitization. Per Pravila §16.2 +
ADR-011 + spec §5.4.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 06:11:25 +03:00
Дмитрий ef5da8def8 test(hooks): fix test 5 Windows-compat — PATH=nodeDir not PATH=''
Previous test 5 stripped PATH entirely, which kills node.exe spawn resolution
on Windows (CreateProcess needs PATH to find node). Changed to set PATH to
node's own directory only — node spawns fine, git is not in node-dir → ENOENT
→ hook fail-opens per spec §4.5.

All 5 tests now pass cross-platform.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 10:18:54 +03:00
Дмитрий 78bae4addf feat(hooks): subagent-prompt-prefix — PreToolUse git-safety inject (TDD green)
Per Pravila §15.1 — инжектит cwd/branch/HEAD/worktree-root + правила
поведения в каждый Task-prompt. FAIL-OPEN на любой ошибке (git
не в PATH, malformed stdin, non-Task tools).

Все 5 тестов из subagent-prompt-prefix.test.mjs PASS.
Регистрация в .claude/settings.json — Task 6 плана.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 10:17:04 +03:00
Дмитрий 049eaf0dfc test(hooks): subagent-prompt-prefix — failing tests (TDD red)
5 тестов для Task git-safety inject хука:
- inject SUBAGENT GIT-SAFETY HEADER в Task-prompt
- inject real cwd/branch/HEAD/worktree-root
- passes through non-Task tools
- fail-open on malformed stdin
- fail-open when git unavailable

Tests FAIL — hook implementation в следующем коммите (TDD green-phase).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 10:13:27 +03:00
Дмитрий dd9e37ea3f feat(adr): wire adr-judge as lefthook pre-commit job 9 (Task 5)
adr-judge v0.13.1 vendored from the adr-kit plugin (MIT) -> tools/adr-judge.py (819 lines, Python stdlib only). lefthook pre-commit job 9 runs 'git diff --cached --unified=0 | python tools/adr-judge.py --diff - --adr-dir docs/adr/'.

AK6 resolved: the --llm flag is NOT passed, so adr-judge runs declarative regex only — no Claude Sonnet call, zero economy cost. adr-kit's own git-hook template passes --llm; we deliberately do not, and lefthook keeps sole ownership of .git/hooks (AK1).

Verified: red test — staged @inertiajs/vue3 import in app/resources/js/ blocked with VIOLATION citing ADR-001 line 1, lefthook exit 1. Green test — clean diff, 9/9 jobs pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 04:54:43 +03:00
Дмитрий 22056baabc fix(ruflo): queen-hook isDiscussion — word-boundary guard (review) 2026-05-15 17:25:09 +03:00