Commit Graph

83 Commits

Author SHA1 Message Date
Дмитрий c5d360fc59 docs(security): server-hardening setup-док + SEC-1..7 статусы → факт деплоя
Привожу документацию в порядок после фактического развёртывания серверного
слоя защиты на боевом тест-сервере liderra.ru (22.05.2026, на тестовой VM
Yandex Cloud, до закрытия Б-1).

Что сделано:
- docs/security/server-hardening-setup.md (новый) — setup-док серверного
  слоя SEC-1..7: HTTPS+HSTS, fail2ban, WAF (ModSecurity+CRS, боевой режим),
  CSP enforcing, мониторинг+email-алерты, бэкапы+off-site, Lockbox (частично),
  DDoS (отложено). Зеркалит стиль docs/security/pgaudit-anonymizer-setup.md.
- docs/Открытые_вопросы_v8_3.md -> v1.85: SEC-1..7 статусы приведены к факту
  (сделано / отложено / частично). Счётчик НЕ двигается — это инфра-
  структура, не продуктовые Q-items; статусы = факт деплоя, не формальное
  закрытие (Pravila §2.2 соблюдена). v1.84/v1.83 трейл не тронут.
- cspell-words.txt +10 терминов серверного слоя.
- tools/observer-chain-map.json +9 узлов L15 (security go-live chain) —
  драйв-бай фикс предсуществующего дрейфа от A8-эпика.

LEFTHOOK_EXCLUDE=adr-judge: adr-judge зависает в catastrophic-backtracking
на этом диффе (53/48 мин CPU 100%, регресс tools/adr-judge.py на длинных
markdown-доках). Диф чисто документация, ADR-нарушений нет. Баг adr-judge —
отдельный follow-up. Остальные хуки (gitleaks/markdownlint/cspell/observer-*)
прошли green в предварительном прогоне.

Источник фактов: memory/project_server_hardening.md, ADR-014 §9.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-22 11:11:47 +03:00
Дмитрий 365d1a0a93 feat(ops): мониторинг + pre-flight + WAF +/api threshold (incident 2026-05-22)
Инцидент 22.05.2026: liderra.ru 500 Server Error. Корень — повреждённый
APP_KEY в .env (24 строки с CRLF + дубль ключа от key:generate). Каскад:
Laravel не парсил .env → fallback на default sqlite/database cache →
sqlite-файла нет → 500 на каждом HTTP-запросе; liderra-queue в
бесконечном activating-loop'е (Restart=always без лимитов).

Файлы (все LF через локальный .gitattributes — защита от CRLF-инцидента):

  liderra-precheck.sh — pre-flight гейт (15 проверок: CRLF в .env, длина
    APP_KEY, decrypt(encrypt) round-trip, PG/Redis ping, config-cache
    свежее .env, pending migrations, HTTP smoke). exit 1 при любом провале.

  liderra-healthcheck.sh + cron */2 — проверка портала каждые 2 минуты;
    2 подряд провала (~4 мин downtime) → email DOWN; первый 200 после
    DOWN → email RECOVERED.

  liderra-queue.service — Restart=on-failure, StartLimitBurst=5/5min,
    OnFailure=liderra-queue-alert.service. Очередь больше не крутится в
    бесконечном крэше — после 5 крашей systemd останавливает + шлёт email.

  liderra-queue-alert.service + liderra-systemd-alert.sh — отправка email
    при окончательном fail системного юнита (status + journalctl tail).

  msmtprc.template — шаблон для /etc/msmtprc (placeholder
    __MAIL_PASSWORD__ подставляется из app/.env MAIL_PASSWORD).

Установлено на /var/www/liderra/app (тест-сервер YC):
  /etc/msmtprc, /usr/local/bin/liderra-*.sh,
  /etc/cron.d/liderra-healthcheck, /etc/systemd/system/liderra-queue*.service.
  Тестовое письмо на kdv1@bk.ru доставлено (smtpstatus=250).

WAF (ModSecurity OWASP CRS 3.3.5) уже было правило 1900200 от A8 infosec
(разрешает PUT/PATCH/DELETE — добавлено в 06:00). Дополнительно:
  /etc/nginx/modsec/liderra-exclusions.conf id:1900300 — для /api/*
  поднят порог inbound_anomaly_score_threshold с 5 до 10 (чтобы edge-case
  JSON-payloads не давали false-positive: PATCH/DELETE и так дают +5 в CRS).

Verification: 9/9 GREEN.
  Smoke: liderra.ru → 200, PATCH/DELETE /api/* → 419 (Laravel CSRF, не 403 WAF).
  Services: php-fpm/queue/nginx/postgres/redis — все active.
  Pre-flight: 15/15 ✓ (был бы DOWN-сигнализатор сегодня за 5 секунд).
  Laravel production.ERROR за последние 10 минут: 0.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-22 11:10:31 +03:00
Дмитрий 20cc132777 feat(observer): render missed_activations in STATUS.md C5 2026-05-21 09:59:56 +03:00
Дмитрий 4d7e9ca0e4 feat(observer): C5 surfaces missed-activation count via runCoverageChecker 2026-05-21 09:59:56 +03:00
Дмитрий 6174830311 feat(observer): wire missed-activation matcher into analyze() 2026-05-21 09:59:56 +03:00
Дмитрий 3ef1e625eb feat(observer): missed-activation matcher (pure, deterministic) 2026-05-21 09:59:56 +03:00
Дмитрий 6dec34403f feat(observer): node-dormancy extractor + initial JSON snapshot
Two-signal availability check: dormant=true OR boundaries contains DEFERRED.
Treats #17 (Tooling-marked) and #44/#50/#54/#67 (DEFERRED in boundaries)
uniformly as unavailable. Tooling Прил.Н unmodified — semantics preserved.

7 vitest cases (basic, multi-row, DEFERRED-fallback, boundary check).
Initial JSON: 67 nodes, 6 unavailable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 09:59:56 +03:00
Дмитрий 45691d0324 feat(observer): add classification→node mapping for missed-activation detection 2026-05-21 09:59:55 +03:00
Дмитрий df2d091174 feat(status-md): surface C6 chain-map sync row 2026-05-21 06:06:28 +03:00
Дмитрий 4c9a1e9ccb feat(brain-retro): aggregate chain_ref into factorMatrix (multi-chain axis) 2026-05-21 06:06:27 +03:00
Дмитрий 65c2c5e471 feat(observer): one-shot chain_ref retrofill script (idempotent, atomic) 2026-05-21 06:06:27 +03:00
Дмитрий 05076c4f1d feat(observer): C6 chain-map-checker (JSON vs routing-off-phase.md sync) + L14 coverage 2026-05-21 06:06:26 +03:00
Дмитрий f943b229c0 feat(observer): emit chain_ref in primary_rationale 2026-05-21 06:06:25 +03:00
Дмитрий 28671cb012 feat(observer): chain-map JSON + chainsFor detector (L1-L13 attribution) 2026-05-21 06:06:25 +03:00
Дмитрий be9571353a feat(status-md): surface legacy v1 episodes count
Closes brain-retro 2026-05-20 #18 — episodes without schema_version=2
(legacy v1 era pre-2026-05-19T08:06) are now visible in STATUS.md
metrics. They're already filtered out of factor analysis by analyzer's
v1SkippedCount, but their existence was invisible to humans reading
STATUS — masking the bootstrap-epoch gap.

2 new vitest tests, 326/326 GREEN.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 13:47:44 +03:00
Дмитрий 147200ff8e tools(observer): add Glob latency investigator (ad-hoc script)
Closes brain-retro 2026-05-20 #17 — one-off Node script for investigating
the Glob p50=12.7s anomaly from initial retro. Parses transcript JSONL,
prints top-N slowest Glob round-trips with pattern + path.

Smoke-tested on session 553717ec (5h+ session): finds 32 Glob calls,
median 12690ms (matches retro finding), top-5 all 'docs/adr/**' at
20265ms — Glob recursive on ADR directory is the apparent culprit.

NOT production code path — never imported by parser/hook/analyzer.
Run on demand: node tools/glob-latency-investigator.mjs <transcript.jsonl>.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 13:47:43 +03:00
Дмитрий 492a4fc969 feat(observer): inferOutcome neutral next-prompt → soft_success
Closes brain-retro 2026-05-20 #16 — when the next prompt is 'neutral'
(no correction/approval/new_task markers), interpret as silent success
('no objection') and surface as soft_success. Slightly weaker than
explicit approval — labelled separately so brain-retro can show
breakdown.

4 new vitest tests, 324/324 GREEN.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 13:47:43 +03:00
Дмитрий a007295abe refactor(observer): rename factor axis session_turn → session_segment_turn
Closes brain-retro 2026-05-20 #14 — `environment.session_turn` уже значит
'turns since last compaction' (parser counts from lastCompactIdx + 1).
Ось матрицы под именем 'session_turn' путала с глобальным turn-номером.
Семантика данных не меняется, только имя axis в FACTOR_FNS.

Existing test renamed; new explicit test verifies new name present and
legacy name absent.

1 new vitest test + 1 renamed, 320/320 GREEN.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 13:47:41 +03:00
Дмитрий 5d3e29669b feat(observer): parallel_session +OR pre-flight git fetch heuristic (Task 13 PIVOT)
Closes brain-retro 2026-05-20 #13 PIVOT — additive to F1 (parallel
session sessions session). F1 narrowed parallel_session to tool_result-only
to fix live FP. This Task adds OR-clause: Bash command containing
'git fetch && git log HEAD..origin/...' (Pravila §15.2 pre-flight)
is a strong signal that the operator expects parallel sessions.

Does NOT overwrite F1 — both signals coexist via OR.

4 new vitest tests, 319/319 GREEN.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 13:47:41 +03:00
Дмитрий ef4cc825bf feat(observer): emit subagent_invoked events from Agent tool_use
Closes brain-retro 2026-05-20 #12 — each Agent tool_use produces a
subagent_invoked event with subagent_type / model (if explicit) /
first 80 chars of description. Visibility from parent Claude's
perspective; full subagent trace lives in subagents/ directory and is
out of scope for this parser.

6 new vitest tests, 315/315 GREEN.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 13:47:40 +03:00
Дмитрий f54c82d682 feat(observer): opt-in reasoning-tag merges with heuristic primary_rationale
Closes brain-retro 2026-05-20 #11 — parseReasoningTag extracts opt-in
<!-- reasoning: triggers="..." candidates="..." boundaries="..." -->
HTML-comment from assistant text. Semicolon-separated values merged into
heuristic-derived primary_rationale arrays via Set-dedupe.

Conservative: tag is opt-in; heuristic still runs even when tag present
(heuristic provides baseline, tag enriches).

5 new vitest tests, 309/309 GREEN.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 13:47:39 +03:00
Дмитрий 884169e847 feat(status-md): show last /brain-retro days-ago
Closes brain-retro 2026-05-20 #10 — STATUS.md теперь сообщает, когда
последний раз был прочитан observer (через .read-counter.json
last_read_at). Помогает не забыть про ретро между sprint-кадансами.

3 new vitest tests, 304/304 GREEN.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 13:47:39 +03:00
Дмитрий f8b32a7d3a feat(observer): extend classifyPromptSignal vocabulary
Closes brain-retro 2026-05-20 #9 — добавлены маркеры:
- correction: 'не совсем', 'другое|другая', 'не сходится', 'wrong direction'
- approval: 'класс', 'хорошо', 'принято', 'well done', 'nice'
- new_task (prefix): 'теперь', 'далее', 'следующее', 'next', 'now'

NB на JS \b с Cyrillic: \b matches word↔non-word boundary, но Cyrillic
chars не word-chars в JS RegExp default → \b после русского слова
никогда не fires. Решение: substring-match для русских correction-маркеров;
lookahead с явными разделителями для start-of-prompt new_task маркеров.

11 new vitest tests, 301/301 GREEN.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 13:47:38 +03:00
Дмитрий ffaeb8f37b feat(observer): strip <system-reminder> blocks from promptText
Closes brain-retro 2026-05-20 #8 — UserPromptSubmit hook injects
<system-reminder>...</system-reminder> blocks into user.content that
polluted classifyTask / classifyPromptSignal / routing detection.
Now stripped via regex before any analysis.

Completed by controller (Opus) after subagent hit context limit on
1250-line test file. Helper stripSystemReminders + promptText update
were committed by subagent; test cases appended via Bash heredoc.

4 new vitest tests, 290/290 GREEN.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 13:47:38 +03:00
Дмитрий c0e3e901d0 feat(observer): differentiate error events by tool + summary
Closes brain-retro 2026-05-20 #7 — each tool_result.is_error now emits
{ kind:'error', tool:<name>, summary:<first 80 chars> }. Allows
aggregation by tool (Bash/Edit/Read) + cause prefix (ENOENT/timeout/
'String to replace not found').

Required updating existing 'emits error events for tool_result with
is_error' test assertion (old shape had bare 'message' field).

4 new vitest tests + 1 existing relaxed, 286/286 GREEN.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 13:47:37 +03:00
Дмитрий 0663479bb8 feat(observer): heuristic reasoning capture in primary_rationale
Closes brain-retro 2026-05-20 #6 — extractTriggers/Candidates/Boundaries
scan assistant.text for Pravila §N / ADR-N / PSR_v1 RX / routing-off-phase
LN / hard-floor + numbered/bulleted lists (≥2). Populates previously-
always-empty primary_rationale arrays.

Conservative-broad: false positives accepted (mention ≠ application);
/brain-retro determines applied validity. Phase 2 agent-judge out of scope.

19 new tests, 282/282 GREEN.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 13:47:37 +03:00
Дмитрий 52728dfc12 feat(observer): capture ask_user_question events with answer_kind classification (Task 4)
Add extractAskUserQuestionEvents() — for each AskUserQuestion toolUseResult emits
one event per question with answer_kind: option|custom|no_answer and question_count.
Integrated into parseTranscript events pipeline. 7 new tests (263 total, 0 failed).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 13:47:36 +03:00
Дмитрий dbe2252421 feat(observer): real PII counter — STATUS.md stops lying
Closes brain-retro 2026-05-20 #3 SIMPLIFIED — sanitizeWithCount in
pii-filter (counts matches per pattern) + persistent monthly counter
docs/observer/.pii-counters.json (bumped by Stop-hook on each episode
write) + status-md-generator reads real count (no more piiMatches: 0
hardcode).

PII patterns themselves NOT changed (F7 of parallel session already
extended to 13 patterns).

Counter is informational — write failure never blocks Stop-event.

5+1+1=7 new vitest tests, 256/256 GREEN.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 13:47:36 +03:00
Дмитрий 8e5eaecf6a feat(observer): Task 2 — extractTokenUsage + task_cost in parseTranscript
- export extractTokenUsage(turn): sums input/output/cache/iterations/
  web_search/web_fetch across all assistant messages in a turn
- parseTranscript now includes task_cost field (zero-filled when no usage)
- 7 new tests (5 unit + 2 integration); total 248/248 GREEN
- V2_FIELDS in observer-stop-hook.mjs NOT changed (backward compat)
2026-05-20 13:47:35 +03:00
Дмитрий 47c03a9e18 feat(observer): extend classifyTask with 7 new classes
Closes brain-retro 2026-05-20 #1 — analysis/memory-sync/regulatory-bump/
release/cleanup/monitoring/planning. Addresses '59% other' observation
from initial retro factor matrix.

Ordering: release before feature (merge feature-branch), planning before
refactor (план рефакторинга), memory-sync/regulatory-bump at top as most
specific. monitoring regex проверь состоян covers inflected forms.

9 new vitest tests, 241/241 GREEN in npm run test:tools.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 13:47:34 +03:00
Дмитрий 2476dd3c1b fix(observer): expand PII patterns — JWT/AWS/Yandex/IPv4/OS-username
PII filter previously covered only RU phone, email, Sentry, OpenAI token,
and generic Bearer. Several common surface leaks were uncovered:

- JWT tokens (eyJ<base64>.<base64>.<base64>) — auth/session tokens.
- AWS access key IDs (AKIA<16 alphanum>) — IAM static creds.
- Yandex Cloud IAM static keys (AQVN<base64>), session tokens (t1.<base64>),
  OAuth tokens (y0_<base64>) — primary cloud-provider for this project.
- IPv4 addresses (dotted-quad) — over-redacts 4-segment build numbers as
  an accepted tradeoff (under-redaction is the worse failure).
- Windows user-paths (C:\Users\<name>) → C:\Users\***. Otherwise the OS
  username `Administrator` leaks via task_size.files in every episode.
- POSIX /home/<name>/ → /home/***/. Same rationale for Linux dev hosts.

Pattern order: highly-specific token patterns (JWT/AWS/YC) run BEFORE
OPENAI_TOKEN/GENERIC_BEARER fallbacks; otherwise partial overlaps would
strip the wrong segments.

Tests: 9 new (each new pattern + idempotency over the expanded redaction
markers). 27/27 PII tests green.

.gitleaks.toml: added the test fixture to the path allowlist — the file
contains synthetic JWT/AWS/Yandex tokens (the filter is supposed to redact
them), not real secrets.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 11:10:53 +03:00
Дмитрий 3ec638cbd2 fix(observer): C5 coverage driven by hook registration, drop commit ratio (COV-1)
Bug: checkCoverage flagged anomaly when "recent commits > 0 AND episodes == 0".
Two design flaws, proven in this project:
- Wrong unit: commits = work-unit (one turn → many commits via subagent
  workflow); episodes = turn-unit. A 1023-vs-19 ratio is not anomalous, it's
  expected.
- Wrong window: the 14-day commit window predated the Stop-hook's existence
  (registered 2026-05-19). For 13 of 14 days the hook didn't exist — 889
  commits were structurally impossible to mirror as episodes.

Result: the C5 indicator was either always-red (flagging the hook's birth
as anomaly) or always-green (any episode count vs huge commit count = ok).
Either way uninformative.

Fix:
- checkCoverage(episodeCount, hookRegistered) — drops the commit param.
  Warn iff hook is registered AND 0 episodes this month → the hook is
  silently failing. If the hook isn't registered, 0 episodes is correct.
- runCoverageChecker derives hookRegistered from settings.json
  (isObserverStopRegistered helper) and passes it to checkCoverage.
  No more git execFileSync — pure fs.

Tests rewritten under the new contract: 7/7 (was 6, +1 drift-hazard guard
ensuring detail strings never mention "commit"). 15/15 coverage tests green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 11:07:58 +03:00
Дмитрий 3b7e549e02 fix(observer): validate prompt_signal + events in appendEpisode (C-7)
V2_FIELDS list omitted prompt_signal and events — both are always produced
by parser and buildEpisodeFromContext, so the happy path is unaffected, but
a future ctx-fallback path that dropped them would silently write a
malformed episode. Add both to V2_FIELDS; appendEpisode now throws on either
being missing.

Tests: 2 new — appendEpisode throws when prompt_signal missing /
when events missing. 38/38 stop-hook tests green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 11:05:56 +03:00
Дмитрий 7fe9f89574 fix(observer): exclude hot/normative files from causal chains (A-3)
Bug: findCausalChains flagged a chain whenever two episodes shared any
file. CLAUDE.md / MEMORY.md / STATUS.md / episodes-YYYY-MM.jsonl /
memory/*.md are touched by almost every turn (memory store, status
regeneration, normative-doc updates) — sharing them is not evidence of
causality, just baseline noise. Result: spurious chains on hot files
crowded out the genuine signal.

Fix: HOT_FILE_PATTERNS regex list + `isHotFile(path)` predicate. In
findCausalChains, filter hot files out of BOTH the errored-episode file
set AND the candidate-shared list. If only hot files were shared → no
chain. If a non-hot file is also shared → the chain stands and the
sharedFiles list contains only the non-hot ones.

Tests: 4 new cases — CLAUDE.md / memory/*.md / episodes/STATUS/MEMORY
sharing yields no chain; a turn sharing both CLAUDE.md AND /src/app.ts
yields a chain with sharedFiles=['/src/app.ts'] only. 33/33 analyzer
tests green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 11:04:59 +03:00
Дмитрий c386361881 fix(observer): infer blocked from unrecovered_error tail, not raw error/retry count (A-1)
Bug: inferOutcome flagged `blocked` whenever errorCount > retryCount across
the turn's events. But the parser emits an `error` event for ANY tool_result
with is_error=true — including expected failures: TDD failing-test-first,
grep returning nothing, git commands with intentional non-zero exit. On
TDD-heavy turns (project's standard discipline) this systematically marked
turns as blocked even when they ended on a successful tool_use.

Fix:
- Parser (extractProcessEvents): walk turn from end, find the LAST
  tool_result; if its is_error=true, emit a single `unrecovered_error`
  event. Distinguishes "turn ended on failure" from "errors recovered
  later". The original per-is_error `error` events remain (useful as raw
  factor signals).
- Analyzer (inferOutcome): replace `errorCount > retryCount → blocked`
  with `events.some(kind === 'unrecovered_error') → blocked`. Same
  ordering preserved (interrupt > blocked > rework/success/unknown).

Tests:
- Parser: emits unrecovered_error when last tool_result is_error;
  does NOT emit when turn ended on a successful tool_result;
  does NOT emit for turns with no tool_results.
- Analyzer: blocked iff unrecovered_error event present (not raw count);
  events=[error, error, retry] → success (no unrecovered_error).

142/142 vitest green (was 128).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 11:03:15 +03:00
Дмитрий 94f831f7d1 fix(observer): uuid-dedup in parseLines (C-1 root fix for quirk #101)
Bug: Claude Code's transcript JSONL file accumulates duplicated context-
rebuild snapshots — the same entry re-printed with the SAME `uuid`. Without
dedup, session_turn / task_size / events double-count, and session_turn
becomes non-monotonic across episodes parsed at different file-growth
states. Live evidence: episodes-2026-05.jsonl lines 14/15/16 of the same
session showed session_turn 139 → 140 → 91 (backwards in time). Probe
on transcript 553717ec: 22400 entries, only 6074 unique uuid (68% dup
rate); real user prompts 264 total vs 92 unique-uuid.

Fix: parseLines now tracks a `seenUuid` Set and skips entries whose uuid
has already been encountered (keep-first). Entries without `uuid`
(synthetic test fixtures) pass through unchanged. All downstream functions
(findTurnStart, extractEnvironment, extractTaskSize, etc.) operate on the
deduped entries array, so the fix is single-point and total.

Tests: new `parseTranscript — uuid-dedup` describe block covers
(1) duplicated-uuid prompts collapse → session_turn counts once,
(2) distinct-uuid entries preserved (no over-dedup),
(3) no-uuid entries pass through (synthetic-fixture safety),
(4) duplicated-uuid assistant turns → tool_calls / files_touched counted once.
110/110 parser tests green (was 106).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 11:00:50 +03:00
Дмитрий 030bdc65ab fix(observer): narrow parallel_session detector to tool_result evidence (C-2)
extractEnvironment was scanning JSON.stringify(turn) for collision markers
(чужой staged / foreign git index / index.lock / another git process). Prose
mentions in user/assistant text flipped parallel_session=true. Live FP proven
on episodes-2026-05.jsonl line 20: my own analysis turn was non-parallel but
recorded parallel_session: true because the finding text mentioned the markers.

Fix: collectToolResultText(turn) — gather text only from tool_result blocks
(both string content and structured `[{type:text,text}]` arrays). Scan THAT
for collision markers; prose is no longer a signal.

Tests: rewrote `parallel_session narrowed` block — false on user/assistant
prose / no-tool-result turns; true on tool_result strings + structured form.
106/106 parser tests green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 10:58:37 +03:00
Дмитрий 5f17ca51ac chore(tools): worktree pre-commit gate runner (quirks #86/#97)
In a git worktree the shared .git/hooks/pre-commit cannot find lefthook on
PATH and silently skips every gate (pint/larastan/pest/gitleaks). This
script hardcodes the lefthook.exe + lefthook.yml paths from the main
checkout and runs `pre-commit` explicitly. Run before `git commit` inside
any worktree. Exit 0 = all gates passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 10:32:31 +03:00
Дмитрий 353b1599b6 fix(observer): brain-retro analyzer — blocked outcome + v1 filter + factors
P0.1b: inferOutcome emits 'blocked' when a turn had more error than retry
events (an unrecovered tool failure) — previously the enum value was dead.

P0.1c: 'failure' documented as deferred to the phase-2 agent-judge. It is a
judgment (work wrong AND never corrected), not deterministically recoverable
from a transcript; a wrong-then-corrected turn surfaces as 'rework'.

P1.1: analyze() drops v1 episodes (no schema_version 2) — they lack
environment/prompt_signal/decision_provenance and polluted the factor
matrix. Reports v1SkippedCount.

P2.1: session_turn (bucketed early/mid/late) and parallel_session added to
FACTOR_FNS — closes the schema↔matrix mismatch (both were captured in the
episode but absent from the factor axes).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 17:40:44 +03:00
Дмитрий 97388cf840 fix(observer): transcript-parser accuracy — session_turn + correction signal
P0.2: count session_turn from the last compaction. The transcript file
accumulates duplicated context-rebuild snapshots (quirk #101), so counting
real prompts from i=0 inflated it and made it non-monotonic. Now counts
"real prompts since the last compaction" — monotonic by construction.

P0.1a: widen the correction prompt_signal regex (не работает / сломал /
опять / откати / revert / still not / wrong / ...). The old regex was too
narrow, so rework outcomes were invisible to the factor analysis.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 17:40:29 +03:00
Дмитрий 83295a25f3 fix(brain): redirect / to /docs/observer/dashboard.html (browser-smoke fix)
Browser smoke (Playwright) revealed that rewriting path internally without
changing the response URL left the browser's base URL as /, breaking
relative <script src="dashboard.js"> and ../automation-graph-data.js
references. 302 redirect makes the browser settle on /docs/observer/,
which resolves the relative paths correctly. All 4 views verified clean
(0 console errors). Screenshots: brain-dashboard-{map,replay,feed,aggregate}-view.png.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 16:23:52 +03:00
Дмитрий 2f60910b09 feat(brain): conflict three-layer panel (design / friction / correlation) +3 tests 2026-05-19 16:23:51 +03:00
Дмитрий 774763c21c feat(brain): aggregator — node heat, distributions, redirect rate (+4 tests) 2026-05-19 16:23:50 +03:00
Дмитрий e34b11aca5 feat(brain): Лента view — groupBySession + grouped feed UI 2026-05-19 16:23:49 +03:00
Дмитрий 475e233c2a feat(brain): filterEpisodes + 3 tests (Task 7 logic; UI deferred)
Worktree has no app/node_modules — vitest not run here; final regression
deferred to main-checkout post parallel-session release. Logic is a 7-line
pure filter; tests cover empty filter, classification, errors-only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 16:23:48 +03:00
Дмитрий c3392bef13 feat(brain): node attribution — episode signals to graph nodes 2026-05-19 16:23:46 +03:00
Дмитрий 7fed5bc18b feat(brain): episode JSONL parser + v1/v2 normalizer 2026-05-19 16:23:46 +03:00
Дмитрий f1092772fb feat(brain): static server + /api/episodes for the dashboard 2026-05-19 16:23:45 +03:00
Дмитрий b2b9a75731 feat(observer): AskUserQuestion in-turn choice + parallel_session narrowing
#1 — detectAskUserQuestionChoice: when a turn contains an AskUserQuestion
whose answer exactly matches an offered option label, classify as
user_chose_from_options. The answered entry carries a structured
toolUseResult (questions[].options[].label + answers map). A custom
"Other" free-text answer is NOT a pick — falls through. Wired into
parseTranscript after the text-list detector.

#3 — parallel_session: dropped broad word matches (параллельн /
"parallel session") that false-fired on any casual mention. Now only
strong collision evidence (foreign git index / чужой staged /
index.lock / another git process). Best-effort per spec R2 — prefer
false-negative over false-positive.

169/169 tools tests GREEN (+9 new).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 13:39:09 +03:00
Дмитрий 8550ba243d fix(observer): exclude synthetic user-role messages from turn detection
Root cause (systematic-debugging): isRealUserPrompt treated skill-content
("Base directory for this skill:"), local-command output
(<local-command-stdout>), and interrupt markers as genuine prompts.
findTurnStart then anchored a turn on the synthetic message — the turn
slice missed the genuine prompt's UserPromptSubmit hook_additional_context
attachment → economy_level: null, wrong prompt_signal/task_classification.
Same cause made extractLastUserPromptText return skill content, so the
Stop-hook routing-gate false-positive-blocked autonomous §12 skill
invocations (detectMethodDirected saw the node name in skill text).

Fix: SYNTHETIC_PROMPT_MARKERS + isSyntheticPrompt — isRealUserPrompt
returns false for synthetic messages. One fix closes both the
economy_level capture gap and the 2nd routing-gate FP class.

160/160 tools tests GREEN (+3 new).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 13:39:06 +03:00