liderra/portal

Author	SHA1	Message	Date
Дмитрий	5d3e29669b	feat(observer): parallel_session +OR pre-flight git fetch heuristic (Task 13 PIVOT) Closes brain-retro 2026-05-20 #13 PIVOT; additive to F1 (the parallel_session narrowing). F1 narrowed parallel_session to tool_result-only to fix a live FP. This task adds an OR-clause: a Bash command containing 'git fetch && git log HEAD..origin/...' (Pravila §15.2 pre-flight) is a strong signal that the operator expects parallel sessions. Does NOT overwrite F1; both signals coexist via OR. 4 new vitest tests, 319/319 GREEN. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 13:47:41 +03:00
Дмитрий	ef4cc825bf	feat(observer): emit subagent_invoked events from Agent tool_use Closes brain-retro 2026-05-20 #12 — each Agent tool_use produces a subagent_invoked event with subagent_type / model (if explicit) / first 80 chars of description. Visibility from parent Claude's perspective; full subagent trace lives in subagents/ directory and is out of scope for this parser. 6 new vitest tests, 315/315 GREEN. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 13:47:40 +03:00
Дмитрий	f54c82d682	feat(observer): opt-in reasoning-tag merges with heuristic primary_rationale Closes brain-retro 2026-05-20 #11 — parseReasoningTag extracts opt-in <!-- reasoning: triggers="..." candidates="..." boundaries="..." --> HTML-comment from assistant text. Semicolon-separated values merged into heuristic-derived primary_rationale arrays via Set-dedupe. Conservative: tag is opt-in; heuristic still runs even when tag present (heuristic provides baseline, tag enriches). 5 new vitest tests, 309/309 GREEN. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 13:47:39 +03:00
Дмитрий	f8b32a7d3a	feat(observer): extend classifyPromptSignal vocabulary Closes brain-retro 2026-05-20 #9. Added markers: - correction: 'не совсем', 'другое|другая', 'не сходится', 'wrong direction' - approval: 'класс', 'хорошо', 'принято', 'well done', 'nice' - new_task (prefix): 'теперь', 'далее', 'следующее', 'next', 'now' NB on JS \b with Cyrillic: \b matches a word↔non-word boundary, but Cyrillic characters are not word characters in the default JS RegExp, so \b after a Russian word never fires. Solution: substring match for the Russian correction markers; lookahead with explicit separators for the start-of-prompt new_task markers. 11 new vitest tests, 301/301 GREEN. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 13:47:38 +03:00
Дмитрий	ffaeb8f37b	feat(observer): strip <system-reminder> blocks from promptText Closes brain-retro 2026-05-20 #8 — UserPromptSubmit hook injects <system-reminder>...</system-reminder> blocks into user.content that polluted classifyTask / classifyPromptSignal / routing detection. Now stripped via regex before any analysis. Completed by controller (Opus) after subagent hit context limit on 1250-line test file. Helper stripSystemReminders + promptText update were committed by subagent; test cases appended via Bash heredoc. 4 new vitest tests, 290/290 GREEN. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 13:47:38 +03:00
Дмитрий	c0e3e901d0	feat(observer): differentiate error events by tool + summary Closes brain-retro 2026-05-20 #7: each tool_result.is_error now emits { kind:'error', tool:<name>, summary:<first 80 chars> }. Allows aggregation by tool (Bash/Edit/Read) + cause prefix (ENOENT/timeout/'String to replace not found'). Required updating the existing 'emits error events for tool_result with is_error' test assertion (old shape had a bare 'message' field). 4 new vitest tests + 1 existing relaxed, 286/286 GREEN. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 13:47:37 +03:00
Дмитрий	0663479bb8	feat(observer): heuristic reasoning capture in primary_rationale Closes brain-retro 2026-05-20 #6: extractTriggers/Candidates/Boundaries scan assistant.text for Pravila §N / ADR-N / PSR_v1 RX / routing-off-phase LN / hard-floor + numbered/bulleted lists (≥2). Populates the previously always-empty primary_rationale arrays. Conservative-broad: false positives accepted (mention ≠ application); /brain-retro determines applied validity. Phase 2 agent-judge out of scope. 19 new tests, 282/282 GREEN. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 13:47:37 +03:00
Дмитрий	52728dfc12	feat(observer): capture ask_user_question events with answer_kind classification (Task 4) Add extractAskUserQuestionEvents(): for each AskUserQuestion toolUseResult, emits one event per question with answer_kind: option|custom|no_answer and question_count. Integrated into the parseTranscript events pipeline. 7 new tests (263 total, 0 failed). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 13:47:36 +03:00
Дмитрий	8e5eaecf6a	feat(observer): Task 2: extractTokenUsage + task_cost in parseTranscript - export extractTokenUsage(turn): sums input/output/cache/iterations/web_search/web_fetch across all assistant messages in a turn - parseTranscript now includes task_cost field (zero-filled when no usage) - 7 new tests (5 unit + 2 integration); total 248/248 GREEN - V2_FIELDS in observer-stop-hook.mjs NOT changed (backward compat)	2026-05-20 13:47:35 +03:00
Дмитрий	47c03a9e18	feat(observer): extend classifyTask with 7 new classes Closes brain-retro 2026-05-20 #1: analysis/memory-sync/regulatory-bump/release/cleanup/monitoring/planning. Addresses the '59% other' observation from the initial retro factor matrix. Ordering: release before feature (merge feature-branch), planning before refactor ('план рефакторинга', i.e. "refactoring plan"), memory-sync/regulatory-bump at top as most specific. The monitoring regex 'проверь состоян' (stem of "check the status") covers inflected forms. 9 new vitest tests, 241/241 GREEN in npm run test:tools. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-20 13:47:34 +03:00
Дмитрий	c386361881	fix(observer): infer blocked from unrecovered_error tail, not raw error/retry count (A-1) Bug: inferOutcome flagged `blocked` whenever errorCount > retryCount across the turn's events. But the parser emits an `error` event for ANY tool_result with is_error=true — including expected failures: TDD failing-test-first, grep returning nothing, git commands with intentional non-zero exit. On TDD-heavy turns (project's standard discipline) this systematically marked turns as blocked even when they ended on a successful tool_use. Fix: - Parser (extractProcessEvents): walk turn from end, find the LAST tool_result; if its is_error=true, emit a single `unrecovered_error` event. Distinguishes "turn ended on failure" from "errors recovered later". The original per-is_error `error` events remain (useful as raw factor signals). - Analyzer (inferOutcome): replace `errorCount > retryCount → blocked` with `events.some(kind === 'unrecovered_error') → blocked`. Same ordering preserved (interrupt > blocked > rework/success/unknown). Tests: - Parser: emits unrecovered_error when last tool_result is_error; does NOT emit when turn ended on a successful tool_result; does NOT emit for turns with no tool_results. - Analyzer: blocked iff unrecovered_error event present (not raw count); events=[error, error, retry] → success (no unrecovered_error). 142/142 vitest green (was 128). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 11:03:15 +03:00
Дмитрий	94f831f7d1	fix(observer): uuid-dedup in parseLines (C-1 root fix for quirk #101) Bug: Claude Code's transcript JSONL file accumulates duplicated context-rebuild snapshots: the same entry re-printed with the SAME `uuid`. Without dedup, session_turn / task_size / events double-count, and session_turn becomes non-monotonic across episodes parsed at different file-growth states. Live evidence: episodes-2026-05.jsonl lines 14/15/16 of the same session showed session_turn 139 → 140 → 91 (backwards in time). Probe on transcript 553717ec: 22400 entries, only 6074 unique uuid (68% dup rate); real user prompts 264 total vs 92 unique-uuid. Fix: parseLines now tracks a `seenUuid` Set and skips entries whose uuid has already been encountered (keep-first). Entries without `uuid` (synthetic test fixtures) pass through unchanged. All downstream functions (findTurnStart, extractEnvironment, extractTaskSize, etc.) operate on the deduped entries array, so the fix is single-point and total. Tests: new `parseTranscript — uuid-dedup` describe block covers (1) duplicated-uuid prompts collapse → session_turn counts once, (2) distinct-uuid entries preserved (no over-dedup), (3) no-uuid entries pass through (synthetic-fixture safety), (4) duplicated-uuid assistant turns → tool_calls / files_touched counted once. 110/110 parser tests green (was 106). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 11:00:50 +03:00
Дмитрий	030bdc65ab	fix(observer): narrow parallel_session detector to tool_result evidence (C-2) extractEnvironment was scanning JSON.stringify(turn) for collision markers (чужой staged / foreign git index / index.lock / another git process). Prose mentions in user/assistant text flipped parallel_session=true. Live FP proven on episodes-2026-05.jsonl line 20: my own analysis turn was non-parallel but recorded parallel_session: true because the finding text mentioned the markers. Fix: collectToolResultText(turn) — gather text only from tool_result blocks (both string content and structured `[{type:text,text}]` arrays). Scan THAT for collision markers; prose is no longer a signal. Tests: rewrote `parallel_session narrowed` block — false on user/assistant prose / no-tool-result turns; true on tool_result strings + structured form. 106/106 parser tests green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 10:58:37 +03:00
Дмитрий	97388cf840	fix(observer): transcript-parser accuracy — session_turn + correction signal P0.2: count session_turn from the last compaction. The transcript file accumulates duplicated context-rebuild snapshots (quirk #101), so counting real prompts from i=0 inflated it and made it non-monotonic. Now counts "real prompts since the last compaction" — monotonic by construction. P0.1a: widen the correction prompt_signal regex (не работает / сломал / опять / откати / revert / still not / wrong / ...). The old regex was too narrow, so rework outcomes were invisible to the factor analysis. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 17:40:29 +03:00
Дмитрий	b2b9a75731	feat(observer): AskUserQuestion in-turn choice + parallel_session narrowing #1 — detectAskUserQuestionChoice: when a turn contains an AskUserQuestion whose answer exactly matches an offered option label, classify as user_chose_from_options. The answered entry carries a structured toolUseResult (questions[].options[].label + answers map). A custom "Other" free-text answer is NOT a pick — falls through. Wired into parseTranscript after the text-list detector. #3 — parallel_session: dropped broad word matches (параллельн / "parallel session") that false-fired on any casual mention. Now only strong collision evidence (foreign git index / чужой staged / index.lock / another git process). Best-effort per spec R2 — prefer false-negative over false-positive. 169/169 tools tests GREEN (+9 new). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 13:39:09 +03:00
Дмитрий	8550ba243d	fix(observer): exclude synthetic user-role messages from turn detection Root cause (systematic-debugging): isRealUserPrompt treated skill-content ("Base directory for this skill:"), local-command output (<local-command-stdout>), and interrupt markers as genuine prompts. findTurnStart then anchored a turn on the synthetic message — the turn slice missed the genuine prompt's UserPromptSubmit hook_additional_context attachment → economy_level: null, wrong prompt_signal/task_classification. Same cause made extractLastUserPromptText return skill content, so the Stop-hook routing-gate false-positive-blocked autonomous §12 skill invocations (detectMethodDirected saw the node name in skill text). Fix: SYNTHETIC_PROMPT_MARKERS + isSyntheticPrompt — isRealUserPrompt returns false for synthetic messages. One fix closes both the economy_level capture gap and the 2nd routing-gate FP class. 160/160 tools tests GREEN (+3 new). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 13:39:06 +03:00
Дмитрий	0e3938f845	feat(observer): parser integration — user_chose_from_options before routing-tag detectChoiceProvenance runs BEFORE parseRoutingTag; if last assistant turn offered options and user prompt references one, decision_provenance becomes user_chose_from_options. Otherwise falls back to existing routing-tag / autonomous logic. 3 new parser tests GREEN; all existing tests still GREEN (43/43). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 12:04:25 +03:00
Дмитрий	375c3e2d1f	feat(observer): parser v2 — process events, routing-tag, episode assembly	2026-05-19 10:23:08 +03:00
Дмитрий	85a95aa2d0	feat(observer): parser v2 — environment, task_size, prompt_signal extractors	2026-05-19 10:15:17 +03:00
Дмитрий	99c7bac99b	feat(brain): observer captures real session data via transcript parse The Stop-hook was writing empty-shell episodes (task_id "unknown-<ts>", node_chosen "unknown", events []). Root cause: buildEpisodeFromContext read fields from the Stop-event stdin that Claude Code never sends (primary_rationale, node_chosen, ...) and the session field name was wrong (ctx.sessionId camelCase vs Claude Code's session_id). The hook never read transcript_path, the only real source of session data. New tools/observer-transcript-parser.mjs with a pure parseTranscript(text, fallbackSessionId): - Scopes to the last turn (from the last real user prompt to EOF); one episode == one prompt→response cycle. A tool_result-carrier user message is not treated as a turn boundary. - Extracts task_id (real sessionId), timestamps (real duration), skill_invoked events, a tool_summary event with per-tool counts, error events (tool_result is_error), node_chosen (first skill, else "direct"), hard_floor (invoked when a superpowers:* skill is used), path_type (regulated/improvised), task_classification (keyword heuristic on the prompt). - Reasoning fields triggers_matched/candidates_considered/boundaries_applied stay []; they are not recoverable from a transcript, and their capture is a separate ADR-011 follow-up. observer-stop-hook.mjs: reads ctx.transcript_path + ctx.session_id (camelCase fallback kept), readFileSync best-effort, delegates to parseTranscript. No transcript → graceful fallback to ctx defaults. Episode schema (5 mandatory + 7-field primary_rationale) unchanged; no normative change. Stop-event is never blocked (exit 0 on any error). TDD: 17 parseTranscript tests + 1 buildEpisodeFromContext transcript test. Full tools Vitest 70/70 GREEN. CLI smoke against a real 575-entry transcript: episode populated with real task_id, ~6.5 min duration, tool_summary {Bash:5,Read:5,Grep:1,Edit:9,Write:1}, error event. Refs: ADR-011 brain governance §6.2 (observer evidence loop). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 08:11:10 +03:00
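
Note on commit f8b32a7d3a: the \b quirk with Cyrillic described there can be reproduced in a few lines. The following is only a minimal sketch, not the repository's code: the marker lists are a subset copied from the commit message, and the names CORRECTION_SUBSTRINGS, NEW_TASK_PREFIX, hasCorrectionMarker and isNewTaskPrefix are hypothetical. It assumes the workaround looks roughly like a lowercase substring check plus a start-anchored regex with an explicit-separator lookahead.

```javascript
// In JS, \b is defined via \w, which is ASCII-only ([A-Za-z0-9_]).
// A Cyrillic letter is therefore a "non-word" character, so there is no
// word/non-word transition around a Russian word and \b never matches there.
const naiveApproval = /\bхорошо\b/i;

// Hypothetical subset of the correction markers listed in the commit message.
const CORRECTION_SUBSTRINGS = ['не совсем', 'не сходится'];

// Substring match: no reliance on \b at all.
function hasCorrectionMarker(prompt) {
  const lower = prompt.toLowerCase();
  return CORRECTION_SUBSTRINGS.some((marker) => lower.includes(marker));
}

// Start-of-prompt new_task markers: anchor at the start and require an
// explicit separator (or end of string) after the word instead of \b.
const NEW_TASK_PREFIX = /^\s*(?:теперь|далее|следующее|next|now)(?=[\s,.:;!?]|$)/i;

function isNewTaskPrefix(prompt) {
  return NEW_TASK_PREFIX.test(prompt);
}
```

With these definitions, naiveApproval.test('хорошо, принято') is false even though the word is present, while hasCorrectionMarker('Это не совсем то') and isNewTaskPrefix('Теперь сделай тесты') are both true. The explicit lookahead also keeps 'nowhere' from being read as the 'now' prefix.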