Merge branch 'fix/enforce-9-holes' into main

Brain-retro #5 candidate C: closes 7 of 9 enforce bypasses and defers 2.
Also flips enforce mode from warn-only to enforce at runtime.

Hole fixes:
  1. Remove self-override via assistant text (ce02d1ad)
  2. Task/Agent in MUTATING_TOOLS (7e5c2973)
  5. Tighten nodeMatches to exact/segment match (a846eed9)
  4. Triggers_matched fallback when classifier silent (56829266)
  8. Override-usage monitor in STATUS.md + new module (08e2a969)
  9. Rationalization-audit blocks on 3rd flag + expanded vocab (0ea3b5d7)
  7. `ремонт инфраструктуры` ("infrastructure repair") override requires a justification line (57a7f55b)

Deferred (architectural):
  3. Confidence threshold (separate spec)
  6. Stop-event post-mutation timing (separate spec)

152 enforce-* tests GREEN.

# Conflicts:
#	docs/observer/STATUS.md
#	tools/status-md-generator.mjs
Дмитрий
2026-05-26 11:48:16 +03:00
14 changed files with 452 additions and 25 deletions
+17 -7
@@ -1,6 +1,6 @@
# Brain Status (auto-generated)
Last updated: 2026-05-26T07:52:20.201Z
Last updated: 2026-05-26T08:45:49.087Z
| Контролёр | Состояние | Детали |
|---|---|---|
@@ -8,13 +8,13 @@ Last updated: 2026-05-26T07:52:20.201Z
| C2 Cross-ref consistency | ✅ | [cross-ref-checker] OK — 0 drift in 4 files |
| C3 Observer-of-observer | ✅ | [observer-of-observer] OK — last read 0 week(s) ago |
| C4 Сигнальный статус | ✅ | This file (self-reference) |
| C5 Observer-coverage | ⚠️ | 464 episode(s) this month · Stop-hook + post-commit OK · 21 missed activation(s) — see /brain-retro |
| C5 Observer-coverage | ⚠️ | 465 episode(s) this month · Stop-hook + post-commit OK · 21 missed activation(s) — see /brain-retro |
| C6 Chain map sync | ✅ | [chain-map-checker] OK — 16 chains in sync |
## Метрики (информационные, не алерты)
- Observer evidence: 464 episodes this month, 0 observer_error markers, 74 PII matches before filter
- Legacy v1 episodes (not in factor analysis): 325
- Observer evidence: 465 episodes this month, 0 observer_error markers, 74 PII matches before filter
- Legacy v1 episodes (not in factor analysis): 326
- Last /brain-retro: 0 day(s) ago
- Использование узлов: см. `/brain-retro` (раз в спринт). missed_activations: 21. **Неиспользованные узлы — не алерт, если профильной задачи не было** (Pravila §16.4 v1.36; capability-readiness; см. memory `feedback_brain_unused_tools_not_problem` — outside-repo memory store).
@@ -32,9 +32,9 @@ Baseline дисциплины роутера (этап 2 router discipline overh
| cleanup | 4 | 0.0% | 0.0% |
| refactor | 1 | 0.0% | 0.0% |
Router step distribution: 1: 187, 2: 170, 3: 54, 5: 48
Router step distribution: 1: 188, 2: 170, 3: 54, 5: 48
Boundaries applied (ADR / границы): 65 of 459 эпизодов (14.2%).
Boundaries applied (ADR / границы): 65 of 460 эпизодов (14.1%).
## Активные многоэтапные проекты
@@ -71,9 +71,19 @@ Episodes since last run: 202 / threshold: 10
## Reviewer: субагент vs fallback
0 эпизодов проверено из 464.
0 эпизодов проверено из 465.
## Использование override-фраз
⚠️ Превышен порог override-использования сегодня (≥5/день)
| Фраза | За всё время | За сегодня |
|---|---|---|
| `recovery` | 54 | 44 ⚠️ |
| `без скилов` | 10 | 8 ⚠️ |
| `ремонт инфраструктуры` | 10 | 10 ⚠️ |
## Алерт-индикаторы
✅ — норма ・ ⚠️ — внимание ・ 🔴 — действие требуется ・ ⚪ — не запускалось
@@ -0,0 +1,44 @@
# Enforce Rule #8 Hole 3 — Deferred
**Date:** 2026-05-26
**Source:** brain-retro #5, [candidate C](../../observer/notes/2026-05-26-brain-retro.md)
**Status:** DEFERRED — architectural, requires owner decision before implementation.
## Hole
`tools/enforce-classifier-match.mjs` `decide()`:
```js
if (typeof confidence === 'number' && confidence < CONFIDENCE_THRESHOLD) return { block: false };
```
The rule short-circuits to "no block" whenever a numeric classifier confidence is below 0.7. `confidence` is only set when the LLM classifier path runs (`source: "llm"`); for prefilter / regex sources it is null. The hole 4 fix (commit `56829266`) extended `main()` to fall back to `triggers_matched[0]` as the recommendation when the classifier was silent, and because `decide()` only short-circuits on numeric confidence, this fallback path *does* enforce.
So hole 3 in its narrowest form is partially addressed. The remaining architectural question:
**When the LLM classifier actively ran and returned `confidence < 0.7`, should we trust that signal?**
Currently the rule is skipped in that case. This can be wrong:
- LLM said «task=question, recommended_node=null, confidence=0.4» → fine, skip is correct.
- LLM said «task=feature, recommended_node=#19, confidence=0.4» → we skip, but the recommendation may still be valuable.
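The two cases above hinge on how the gate treats a missing confidence versus a low numeric one. A minimal sketch of that behaviour, assuming the 0.7 threshold quoted above; `confidenceGate` is a hypothetical helper for illustration, not a function in `tools/enforce-classifier-match.mjs`:

```javascript
// Hypothetical illustration of the confidence short-circuit described above.
const CONFIDENCE_THRESHOLD = 0.7;

// Returns 'skip' when the rule is bypassed, 'evaluate' when it goes on to
// compare the recommendation against the turn's tool uses.
function confidenceGate(confidence) {
  if (typeof confidence === 'number' && confidence < CONFIDENCE_THRESHOLD) {
    return 'skip';
  }
  return 'evaluate';
}

const cases = [
  { label: 'LLM, question, confidence 0.4', confidence: 0.4 },
  { label: 'LLM, feature -> #19, confidence 0.4', confidence: 0.4 },
  { label: 'prefilter fallback (no confidence)', confidence: null },
  { label: 'LLM, confidence 0.9', confidence: 0.9 },
];
for (const c of cases) {
  console.log(c.label, '=>', confidenceGate(c.confidence));
}
```

Both 0.4 cases take the same `skip` branch, which is the point of the open question: the gate cannot tell a harmless question from a skipped feature recommendation.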
## Options
| # | Approach | Trade-off |
|---|---|---|
| A | Always run LLM classifier, enforce at all confidence levels | Cost: every turn pays for an LLM call. Latency: +1-3s per turn. Best signal quality. |
| B | Synthetic confidence for triggers (assume 0.8 for prefilter matches) | Cheap. Semantically wrong — prefilter has no probabilistic basis. Falsifies the dataset for downstream analysis. |
| C | New "trust level" field in classifier output (`high` / `low` / `null`) instead of numeric confidence; rule honors `high` regardless of source | Cleanest. Requires changes in classifier (`tools/router-classifier.mjs`), prefilter, episode schema (`schema_version` bump), and tests. Estimated 1-2 days. |
| D | Lower threshold to 0.4 — bias toward enforcement when LLM ran | One-line change. May increase false-positives in genuine "low-stakes" cases. |
**Recommendation:** Option C, planned as Stage 4 of router-discipline-overhaul (see [docs/superpowers/specs/2026-05-23-router-discipline-overhaul-design.md](2026-05-23-router-discipline-overhaul-design.md)). Stage 4 was already planned; this hole is a concrete requirement for it.
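A rough sketch of what Option C could look like at the rule boundary. The `trust` field, its values, and `shouldEvaluate` are assumptions taken from the option table, not an existing schema. How each source assigns a level, and whether `low` should ever enforce, is left open; this sketch skips anything that is not `high`.

```javascript
// Hypothetical Option C sketch: a categorical `trust` field replaces the
// numeric confidence. Assumed values: 'high', 'low', or null.
function shouldEvaluate(classification) {
  if (!classification || !classification.recommended_node) return false;
  // The rule honors 'high' regardless of which source produced it.
  return classification.trust === 'high';
}

const examples = [
  { source: 'llm', recommended_node: '#19', trust: 'high' },
  { source: 'prefilter', recommended_node: 'writing-plans', trust: 'high' },
  { source: 'llm', recommended_node: '#19', trust: 'low' },
  { source: 'prefilter', recommended_node: null, trust: null },
];
for (const e of examples) {
  console.log(e.source, e.trust, '=>', shouldEvaluate(e));
}
```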
## Why deferred now
- Stage 3 (current) ships warn-only enforcement; hole 3 is about how enforce decides what to block. The current "trust LLM at 0.7+" rule is acceptable as the first iteration.
- Cross-cutting change (classifier + schema + tests) would expand this fix-pass beyond the 7-of-9 scope already in flight.
## Re-open trigger
Next brain-retro that shows ≥5 episodes where `node_chosen=direct` AND `recommended_node !== null` AND `confidence < 0.7` (i.e., real recommendations being skipped because of low confidence). Currently no such data — too few LLM-classifier runs to populate this distribution.
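The trigger condition can be expressed as a filter over episode records. This is a sketch under assumptions: the flat episode shape (`node_chosen`, `recommended_node`, `confidence` as top-level fields) and both function names are hypothetical, and the real episode schema may nest these differently.

```javascript
// Hypothetical episode shape: { node_chosen, recommended_node, confidence }.
const REOPEN_THRESHOLD = 5;
const CONFIDENCE_THRESHOLD = 0.7;

function countSkippedRecommendations(episodes) {
  return episodes.filter((e) =>
    e.node_chosen === 'direct' &&
    // `!= null` also excludes undefined, slightly stricter than `!== null`.
    e.recommended_node != null &&
    typeof e.confidence === 'number' &&
    e.confidence < CONFIDENCE_THRESHOLD
  ).length;
}

function shouldReopen(episodes) {
  return countSkippedRecommendations(episodes) >= REOPEN_THRESHOLD;
}

const sample = [
  { node_chosen: 'direct', recommended_node: '#19', confidence: 0.4 },
  { node_chosen: 'direct', recommended_node: null, confidence: 0.4 },
  { node_chosen: 'direct', recommended_node: '#19', confidence: null },
  { node_chosen: '#19', recommended_node: '#19', confidence: 0.5 },
];
console.log(countSkippedRecommendations(sample), shouldReopen(sample));
```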
@@ -0,0 +1,29 @@
# Enforce Rule #8 Hole 6 — Deferred
**Date:** 2026-05-26
**Source:** brain-retro #5, [candidate C](../../observer/notes/2026-05-26-brain-retro.md)
**Status:** DEFERRED — by-definition, requires architectural choice.
## Hole
`enforce-classifier-match.mjs` is a **Stop-event hook**. The Stop event fires AFTER the agent's turn ends, which means all mutations (Edit, Write, Bash) have ALREADY happened. The hook can block the *next* turn (by returning `decision: block` in the Stop payload) but cannot revert the current turn's changes. By the time the hook decides "you should not have done that mutation", the mutation is committed to the working tree.
## Options
| # | Approach | Trade-off |
|---|---|---|
| A | Mirror the rule as a PreToolUse hook on `Edit\|Write\|Bash\|...` | PreToolUse fires before each mutation, so the check can run per tool call even though classifier output is computed once per turn (on UserPromptSubmit). **Downside:** classifier_state may not be written by the time the first PreToolUse fires (race), so the hook must handle "no state yet" gracefully. |
| B | Mutation reversal (snapshot before, restore on block) | Dangerous. File-state restore is hard. Bash side-effects (DB writes, network calls, file deletions) can't be reverted at all. **Not recommended.** |
| C | Accept Stop-timing as best-effort | What we have now. Stop-event block prevents the *next* turn — still useful as cumulative discipline signal (agent sees the block message and adjusts in subsequent turns). Less immediate than A but materially valuable. |
**Recommendation:** Option A, as a follow-up after we have at least 7 days of data on the Stop-event enforce mode (which goes live after this 9-hole fix pass closes). The Stop-event variant is the "first line of defense" and should keep operating. PreToolUse variant adds "early-blocker" for the most-egregious classifier mismatches.
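A minimal sketch of the decision logic an Option A hook might use, including the "no state yet" case. Everything here is an assumption for illustration: `preToolUseDecision`, its argument names, and the choice to fail open when classifier state is missing. The tool list is copied from `MUTATING_TOOLS` in `enforce-classifier-match.mjs`; the confidence gate is left out for brevity.

```javascript
// Hypothetical pure decision function for a PreToolUse variant of the rule.
const MUTATING_TOOLS = new Set([
  'Edit', 'Write', 'MultiEdit', 'NotebookEdit', 'Bash', 'Task', 'Agent',
]);

// classification: classifier state for this turn, or null if not written yet.
// matchedSoFar: whether an earlier tool use this turn matched the recommendation.
function preToolUseDecision({ toolName, classification, matchedSoFar }) {
  if (!MUTATING_TOOLS.has(toolName)) {
    return { block: false, reason: 'non-mutating tool' };
  }
  // Race with UserPromptSubmit: state may not exist yet. Assumed policy: fail open.
  if (!classification) {
    return { block: false, reason: 'no classifier state yet' };
  }
  if (!classification.recommended_node) {
    return { block: false, reason: 'no recommendation' };
  }
  if (matchedSoFar) {
    return { block: false, reason: 'recommended node already used' };
  }
  return {
    block: true,
    reason: `expected ${classification.recommended_node} before ${toolName}`,
  };
}

console.log(preToolUseDecision({ toolName: 'Edit', classification: null, matchedSoFar: false }));
```

Failing open keeps the Stop-event hook as the backstop for turns where the race is lost; failing closed would be stricter but could block the first tool call of a turn for no fault of the agent.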
## Why deferred now
- The 9-hole pass is about closing bypass holes in the existing logic — adding a parallel hook layer is scope creep.
- Option A also needs a careful "no state yet" fallback (PreToolUse can fire before classifier ran for the turn — the classifier hook is on UserPromptSubmit, which races with PreToolUse on the first tool call).
- Stop-event enforce is materially useful as-is, even with this hole — the next turn's cumulative-discipline-block has a clear deterrent effect.
## Re-open trigger
If reviewer-pass data over a multi-week period shows ≥10 episodes where the rule "would have blocked" mutations had it fired earlier (i.e., mutations that completed successfully but used the wrong tool), reconsider Option A.
+28 -10
@@ -1,4 +1,4 @@
#!/usr/bin/env node
#!/usr/bin/env node
/**
* Rule #8 — Classifier-mismatch enforce.
*
@@ -28,7 +28,7 @@ import {
const RULE_KEY = 'classifier-mismatch';
const CONFIDENCE_THRESHOLD = 0.7;
const MUTATING_TOOLS = new Set(['Edit', 'Write', 'MultiEdit', 'NotebookEdit', 'Bash']);
const MUTATING_TOOLS = new Set(['Edit', 'Write', 'MultiEdit', 'NotebookEdit', 'Bash', 'Task', 'Agent']);
/** Normalize a node id: strip "superpowers:" / "skill:" prefix; allow #ID. */
function normalizeNode(s) {
@@ -40,13 +40,22 @@ function nodeMatches(recommendation, toolUse) {
if (!recommendation || !toolUse) return false;
const rec = normalizeNode(recommendation);
if (!rec) return false;
// Hole 5 fix: exact match OR matching last segment after ':' / '#'.
// No generic substring (would match meta-planning to planning).
const matches = (candidate) => {
if (!candidate) return false;
if (candidate === rec) return true;
const recSegs = rec.split(/[:#]/);
const canSegs = candidate.split(/[:#]/);
const recLast = recSegs[recSegs.length - 1];
const canLast = canSegs[canSegs.length - 1];
return recLast === canLast;
};
if (toolUse.name === 'Skill') {
const s = normalizeNode(String(toolUse.input && toolUse.input.skill || ''));
if (s && (s === rec || s.includes(rec) || rec.includes(s))) return true;
return matches(normalizeNode(String(toolUse.input && toolUse.input.skill || '')));
}
if (toolUse.name === 'Task') {
const sub = String(toolUse.input && toolUse.input.subagent_type || '').toLowerCase();
if (sub && rec.includes(sub)) return true;
if (toolUse.name === 'Task' || toolUse.name === 'Agent') {
return matches(String(toolUse.input && toolUse.input.subagent_type || '').toLowerCase());
}
return false;
}
@@ -63,8 +72,8 @@ export function decide({ toolUses, recommendation, confidence, assistantText, ov
const matched = toolUses.some((u) => nodeMatches(recommendation, u));
if (matched) return { block: false };
// Allow explicit override: lines like "override: <reason>" in assistant text.
if (assistantText && /\boverride:\s+\S/i.test(assistantText)) return { block: false };
// NOTE: prior "override:" self-bypass removed (retro #5 hole 1) - assistant
// cannot grant itself an override. User must use a vocabulary phrase.
return {
block: true,
@@ -89,8 +98,17 @@ async function main() {
const state = readRouterState(event.session_id);
const cls = state && state.classification;
const recommendation = cls && (cls.recommended_node || cls.recommendedNode);
let recommendation = cls && (cls.recommended_node || cls.recommendedNode);
const confidence = cls && typeof cls.confidence === 'number' ? cls.confidence : null;
// Hole 4 fix: fall back to triggers_matched[0] when classifier silent.
// Confidence stays null in the fallback path; decide() still evaluates the
// rule on null (only a numeric confidence < 0.7 short-circuits it).
if (!recommendation) {
const triggers = (cls && cls.triggers_matched) || [];
if (Array.isArray(triggers) && triggers.length > 0 && typeof triggers[0] === 'string' && triggers[0].length > 0) {
recommendation = triggers[0];
}
}
const toolUses = turnToolUses(transcript);
const assistantText = lastAssistantText(transcript);
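The hole 5 change to `nodeMatches` replaces substring checks with an exact-or-last-segment comparison. A minimal standalone sketch of that comparison follows; `segmentMatch` is a hypothetical name used only here, and the inputs are assumed to be already normalized.

```javascript
// Standalone copy of the hole 5 matching rule, for illustration only.
function segmentMatch(rec, candidate) {
  if (!rec || !candidate) return false;
  if (candidate === rec) return true;
  // Compare only the last segment after ':' or '#'.
  const last = (s) => s.split(/[:#]/).pop();
  return last(rec) === last(candidate);
}

console.log(segmentMatch('planning', 'meta-planning'));
console.log(segmentMatch('writing-plans', 'superpowers:writing-plans'));
console.log(segmentMatch('a:plans', 'b:plans'));
```

One consequence, under these assumptions: two ids in different namespaces with the same last segment (for example `a:plans` and `b:plans`) still compare equal, because only the final segment is checked.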
+79 -2
@@ -72,14 +72,26 @@ describe('enforce-classifier-match / decide', () => {
expect(r.block).toBe(false);
});
it('allows when explicit "override:" in assistant text', () => {
it('blocks (not allows) when only "override:" in assistant text — self-override removed (hole 1)', () => {
const r = decide({
toolUses: [{ name: 'Edit', input: {} }],
recommendation: 'foo:bar',
confidence: 0.9,
assistantText: 'override: simpler direct edit, foo:bar overkill here\n',
override: null,
});
expect(r.block).toBe(false);
expect(r.block).toBe(true);
});
it('blocks when assistant text has "override: reason" but user prompt has no override phrase (hole 1)', () => {
const r = decide({
toolUses: [{ name: 'Edit', input: {} }],
recommendation: 'superpowers:writing-plans',
confidence: 0.9,
assistantText: 'override: just doing it quick',
override: null,
});
expect(r.block).toBe(true);
});
it('allows when override phrase present', () => {
@@ -91,4 +103,69 @@ describe('enforce-classifier-match / decide', () => {
});
expect(r.block).toBe(false);
});
it('blocks when Task subagent is spawned without matching recommendation (hole 2)', () => {
const r = decide({
toolUses: [{ name: 'Task', input: { subagent_type: 'general-purpose', prompt: 'do stuff' } }],
recommendation: 'superpowers:writing-plans',
confidence: 0.9,
assistantText: '',
override: null,
});
expect(r.block).toBe(true);
});
it('does NOT block when Task subagent matches recommendation (regression — Task should count as match when right type)', () => {
const r = decide({
toolUses: [{ name: 'Task', input: { subagent_type: 'writing-plans', prompt: '...' } }],
recommendation: 'writing-plans',
confidence: 0.9,
assistantText: '',
override: null,
});
expect(r.block).toBe(false);
});
it('does not match meta-planning to planning recommendation (hole 5)', () => {
const r = decide({
toolUses: [{ name: 'Skill', input: { skill: 'meta-planning' } }, { name: 'Edit', input: {} }],
recommendation: 'planning',
confidence: 0.9,
assistantText: '',
override: null,
});
expect(r.block).toBe(true);
});
it('matches superpowers:writing-plans to writing-plans recommendation (regression — keep working)', () => {
expect(decide({
toolUses: [{ name: 'Skill', input: { skill: 'superpowers:writing-plans' } }, { name: 'Edit', input: {} }],
recommendation: 'writing-plans',
confidence: 0.9,
assistantText: '',
override: null,
}).block).toBe(false);
});
it('matches exact-name skill regression — keep working', () => {
expect(decide({
toolUses: [{ name: 'Skill', input: { skill: 'brainstorming' } }, { name: 'Edit', input: {} }],
recommendation: 'brainstorming',
confidence: 0.9,
assistantText: '',
override: null,
}).block).toBe(false);
});
// hole 4: triggers_matched fallback — decide() contract test
it('blocks when recommendation comes from triggers_matched fallback (hole 4, null confidence)', () => {
const r = decide({
toolUses: [{ name: 'Edit', input: {} }],
recommendation: 'superpowers:writing-plans', // would-be from triggers_matched[0]
confidence: null, // no LLM, but triggers present
assistantText: '',
override: null,
});
expect(r.block).toBe(true);
});
});
+10 -1
@@ -200,7 +200,16 @@ export function findOverride(userPrompt, ruleKey, vocab) {
for (const p of v.phrases || []) {
if (!p.phrase || !Array.isArray(p.suppresses)) continue;
if (!lo.includes(p.phrase.toLowerCase())) continue;
if (p.suppresses.includes(ruleKey)) return p;
if (!p.suppresses.includes(ruleKey)) continue;
if (p.requires_justification) {
// Hole 7 fix: master overrides require a line "<prefix> <non-empty>"
// in the same prompt documenting what is being repaired.
const prefix = p.requires_justification.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
const re = new RegExp(prefix + '\\s+(\\S[^\\n]*)', 'i');
const m = userPrompt.match(re);
if (!m || !m[1] || !m[1].trim()) continue;
}
return p;
}
return null;
}
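The justification check added for hole 7 can be exercised on its own. The sketch below isolates the same escape-then-match steps in a hypothetical helper, `hasJustification`; it is not part of `enforce-hook-helpers.mjs`.

```javascript
// Hypothetical helper isolating the hole 7 check: the prompt must contain
// "<prefix><whitespace><non-empty text>".
function hasJustification(userPrompt, prefix) {
  // Escape regex metacharacters so the prefix is matched literally.
  const escaped = prefix.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  const re = new RegExp(escaped + '\\s+(\\S[^\\n]*)', 'i');
  const m = userPrompt.match(re);
  return Boolean(m && m[1] && m[1].trim());
}

const prefix = 'ремонт:';
console.log(hasJustification('ремонт инфраструктуры', prefix));
console.log(hasJustification('ремонт инфраструктуры\nремонт: enforce-hook-helpers.mjs', prefix));
console.log(hasJustification('ремонт инфраструктуры\nремонт: ', prefix));
console.log(hasJustification('ремонт:\nenforce-hook-helpers.mjs', prefix));
console.log(hasJustification('ремонт:enforce-hook-helpers.mjs', prefix));
```

Two properties of this pattern are worth knowing: because `\s+` also matches newlines, text on the line after the prefix counts as a justification, and because at least one whitespace character is required, `ремонт:target` with no space does not.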
+29
@@ -151,6 +151,35 @@ describe('loadOverrideVocab / findOverride', () => {
});
});
describe('findOverride — requires_justification (hole 7)', () => {
const testVocab = {
phrases: [
{
phrase: 'ремонт инфраструктуры',
suppresses: ['classifier-mismatch'],
requires_justification: 'ремонт:',
description: 'master kill — requires justification',
},
],
};
it('rejects when phrase present but justification line missing (hole 7)', () => {
const r = findOverride('ремонт инфраструктуры', 'classifier-mismatch', testVocab);
expect(r).toBeNull();
});
it('accepts when justification line provides target', () => {
const r = findOverride('ремонт инфраструктуры\nремонт: enforce-hook-helpers.mjs', 'classifier-mismatch', testVocab);
expect(r).not.toBeNull();
expect(r.phrase).toBe('ремонт инфраструктуры');
});
it('rejects when justification line empty after the prefix', () => {
const r = findOverride('ремонт инфраструктуры\nремонт: ', 'classifier-mismatch', testVocab);
expect(r).toBeNull();
});
});
describe('isProductionCodePath', () => {
it('classifies tools/*.mjs as production', () => {
expect(isProductionCodePath('tools/router-classifier.mjs')).toBe(true);
+57
@@ -0,0 +1,57 @@
// Brain-retro #5 candidate C, hole 8: override-usage monitor.
//
// Reads override-usage.jsonl (one JSON line per override invocation:
// {ts, session_id, rule, phrase}) and produces a STATUS.md block with
// per-phrase totals + today's count. Warns when any phrase exceeds
// threshold/day (default 5).
//
// Pure — takes raw log string + opts, returns markdown.
export function computeOverrideUsageBlock(rawLog, opts = {}) {
const now = opts.now ? new Date(opts.now) : new Date();
const today = now.toISOString().slice(0, 10);
const threshold = opts.threshold ?? 5;
if (!rawLog || typeof rawLog !== 'string') {
return `## Использование override-фраз\n\nНе использовалось.`;
}
const lines = rawLog.split('\n').filter(Boolean);
if (lines.length === 0) {
return `## Использование override-фраз\n\nНе использовалось.`;
}
const todayCounts = {};
const allCounts = {};
for (const l of lines) {
let e;
try { e = JSON.parse(l); } catch { continue; }
if (!e || typeof e.phrase !== 'string' || !e.phrase) continue;
allCounts[e.phrase] = (allCounts[e.phrase] || 0) + 1;
if (typeof e.ts === 'string' && e.ts.slice(0, 10) === today) {
todayCounts[e.phrase] = (todayCounts[e.phrase] || 0) + 1;
}
}
if (Object.keys(allCounts).length === 0) {
return `## Использование override-фраз\n\nНе использовалось.`;
}
const sorted = Object.entries(allCounts).sort((a, b) => b[1] - a[1]);
const rows = sorted.map(([phrase, total]) => {
const tCount = todayCounts[phrase] || 0;
const warn = tCount >= threshold ? ' ⚠️' : '';
return `| \`${phrase}\` | ${total} | ${tCount}${warn} |`;
}).join('\n');
const anyWarn = Object.values(todayCounts).some((v) => v >= threshold);
const header = anyWarn ? `⚠️ Превышен порог override-использования сегодня (≥${threshold}/день)` : '';
return `## Использование override-фраз
${header}
| Фраза | За всё время | За сегодня |
|---|---|---|
${rows}`;
}
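The monitor derives "today" from `now.toISOString()` (a UTC date) and compares it with the first ten characters of each entry's `ts` string. A small sketch of how those two computations interact; the offset-style timestamp is a hypothetical log format, since the tests in this commit only use `Z` timestamps.

```javascript
// Mirrors the two day computations used by computeOverrideUsageBlock.
const dayOfNow = (now) => new Date(now).toISOString().slice(0, 10); // UTC calendar day
const dayOfEntry = (ts) => ts.slice(0, 10);                         // literal string prefix

const now = '2026-05-25T23:00:00Z';
const utcEntry = '2026-05-25T22:30:00Z';
// Same instant as utcEntry, written with a +03:00 offset (hypothetical format).
const offsetEntry = '2026-05-26T01:30:00+03:00';

console.log(dayOfEntry(utcEntry) === dayOfNow(now));    // true
console.log(dayOfEntry(offsetEntry) === dayOfNow(now)); // false
```

If the writer of `override-usage.jsonl` always emits UTC timestamps, the per-day counts line up with the UTC day; with local-offset timestamps, entries near midnight could land in a different bucket.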
+48
@@ -0,0 +1,48 @@
import { describe, it, expect } from 'vitest';
import { computeOverrideUsageBlock } from './enforce-override-monitor.mjs';
describe('computeOverrideUsageBlock', () => {
const today = '2026-05-26';
const entry = (phrase, dt = today) => JSON.stringify({ ts: `${dt}T01:00:00Z`, session_id: 'x', rule: 'r', phrase });
it('returns placeholder when log empty', () => {
expect(computeOverrideUsageBlock('')).toContain('Не использовалось');
expect(computeOverrideUsageBlock(null)).toContain('Не использовалось');
});
it('lists phrase frequencies and totals', () => {
const log = [entry('recovery'), entry('recovery'), entry('без скилов')].join('\n');
const out = computeOverrideUsageBlock(log, { now: `${today}T05:00:00Z` });
expect(out).toContain('`recovery`');
expect(out).toContain('| 2 |');
expect(out).toContain('без скилов');
});
it('warns when any phrase exceeds 5/day', () => {
const log = Array.from({ length: 7 }, () => entry('recovery')).join('\n');
const out = computeOverrideUsageBlock(log, { now: `${today}T05:00:00Z` });
expect(out).toContain('⚠️');
expect(out).toContain('recovery');
});
it('only counts today for "сегодня" column', () => {
const log = [entry('recovery', '2026-05-25'), entry('recovery', today)].join('\n');
const out = computeOverrideUsageBlock(log, { now: `${today}T05:00:00Z` });
// total=2, today=1
expect(out).toMatch(/`recovery`.*\|\s*2\s*\|\s*1/);
});
it('respects custom threshold', () => {
const log = Array.from({ length: 3 }, () => entry('recovery')).join('\n');
const flagged = computeOverrideUsageBlock(log, { now: `${today}T05:00:00Z`, threshold: 2 });
const notFlagged = computeOverrideUsageBlock(log, { now: `${today}T05:00:00Z`, threshold: 10 });
expect(flagged).toContain('⚠️');
expect(notFlagged).not.toContain('⚠️');
});
it('skips malformed JSON lines silently', () => {
const log = ['not-json', entry('recovery'), '{}'].join('\n');
const out = computeOverrideUsageBlock(log, { now: `${today}T05:00:00Z` });
expect(out).toContain('recovery');
});
});
+2 -1
@@ -35,7 +35,8 @@
{
"phrase": "ремонт инфраструктуры",
"suppresses": ["tdd-gate", "verify-before-commit", "verify-before-push", "writing-plans-required", "skill-required", "memory-sync-coverage", "classifier-mismatch", "coverage-skill-match"],
"description": "Bypass all rules (full opt-out). Use only when literally fixing the enforce-infrastructure itself."
"requires_justification": "ремонт:",
"description": "Bypass all rules (full opt-out). Requires 'ремонт: <what>' line in same prompt."
}
]
}
+32 -1
@@ -22,6 +22,7 @@ import {
lastAssistantText,
turnToolUses,
appendRationalizationFlag,
readRationalizationFlags,
exitDecision,
isProductionCodePath,
} from './enforce-hook-helpers.mjs';
@@ -39,6 +40,12 @@ const RATIONALIZATION_PHRASES = [
'rationalize',
'без церемоний',
'без скила сейчас',
// expanded vocabulary
'давай разок',
'только сейчас',
'один раз без правил',
'на этот раз без',
'я знаю что не надо но',
];
export function findRationalizationPhrases(text) {
@@ -87,14 +94,38 @@ export function audit(transcriptEntries) {
return flags;
}
/**
* Pure decision seam — injectable priorFlagCount for testability.
* Blocks on 3rd flag of the same session (priorFlagCount >= 2).
*/
export function decide({ assistantText, sessionId: _sessionId, override = false, priorFlagCount = 0 }) {
const detected = findRationalizationPhrases(assistantText || '');
if (override) return { block: false, detected };
if (priorFlagCount >= 2 && detected.length > 0) {
return {
block: true,
message: `Rationalization detected (phrase: "${detected[0]}"). This is flag #${priorFlagCount + 1} in this session; blocking to prevent pattern escalation.`,
detected,
};
}
return { block: false, detected };
}
async function main() {
try {
const raw = await readStdin();
const event = parseEventJson(raw);
const transcript = readTranscript(event.transcript_path);
const flags = audit(transcript);
// Count prior flags before appending new ones
const priorFlagCount = readRationalizationFlags(event.session_id).length;
for (const f of flags) appendRationalizationFlag(event.session_id, f.kind, f.evidence);
exitDecision({ block: false });
// Check if we should block based on rationalization phrases specifically
const text = lastAssistantText(transcript);
const decision = decide({ assistantText: text, sessionId: event.session_id, priorFlagCount });
exitDecision(decision.block ? { block: true, message: decision.message } : { block: false });
} catch {
exitDecision({ block: false });
}
+57 -1
@@ -1,5 +1,5 @@
import { describe, it, expect } from 'vitest';
import { findRationalizationPhrases, detectProdEditWithoutTest, audit } from './enforce-rationalization-audit.mjs';
import { findRationalizationPhrases, detectProdEditWithoutTest, audit, decide } from './enforce-rationalization-audit.mjs';
describe('findRationalizationPhrases', () => {
it('detects "just this once" in mixed case', () => {
@@ -78,3 +78,59 @@ describe('audit', () => {
expect(audit(entries)).toEqual([]);
});
});
describe('vocab — new phrases', () => {
it('detects "давай разок"', () => {
expect(findRationalizationPhrases('давай разок без тестов')).toContain('давай разок');
});
it('detects "только сейчас"', () => {
expect(findRationalizationPhrases('только сейчас пропустим')).toContain('только сейчас');
});
it('detects "один раз без правил"', () => {
expect(findRationalizationPhrases('один раз без правил сделаем')).toContain('один раз без правил');
});
it('detects "на этот раз без"', () => {
expect(findRationalizationPhrases('на этот раз без скила')).toContain('на этот раз без');
});
it('detects "я знаю что не надо но"', () => {
expect(findRationalizationPhrases('я знаю что не надо но пропустим')).toContain('я знаю что не надо но');
});
});
describe('decide — escalation on 3rd flag', () => {
const sessionId = 'test-session';
const textWithPhrase = 'just this once';
it('does NOT block when priorFlagCount=0', () => {
const result = decide({ assistantText: textWithPhrase, sessionId, priorFlagCount: 0 });
expect(result.block).toBe(false);
expect(result.detected.length).toBeGreaterThan(0);
});
it('does NOT block when priorFlagCount=1', () => {
const result = decide({ assistantText: textWithPhrase, sessionId, priorFlagCount: 1 });
expect(result.block).toBe(false);
});
it('blocks when priorFlagCount=2 (3rd occurrence)', () => {
const result = decide({ assistantText: textWithPhrase, sessionId, priorFlagCount: 2 });
expect(result.block).toBe(true);
expect(result.message).toMatch(/rationali/i);
});
it('blocks when priorFlagCount=5 (subsequent occurrences)', () => {
const result = decide({ assistantText: textWithPhrase, sessionId, priorFlagCount: 5 });
expect(result.block).toBe(true);
});
it('does NOT block clean text even with priorFlagCount=10', () => {
const result = decide({ assistantText: 'coverage: skill:tdd', sessionId, priorFlagCount: 10 });
expect(result.block).toBe(false);
expect(result.detected).toEqual([]);
});
it('override=true suppresses block even on 3rd flag', () => {
const result = decide({ assistantText: textWithPhrase, sessionId, override: true, priorFlagCount: 2 });
expect(result.block).toBe(false);
});
});
+10 -2
@@ -2,10 +2,12 @@
import { readFileSync, writeFileSync, existsSync } from 'fs';
import { join } from 'path';
import { execFileSync } from 'child_process';
import { homedir } from 'os';
import { runCoverageChecker } from './observer-coverage-checker.mjs';
import { analyze } from './brain-retro-analyzer.mjs';
import { loadRegistry } from './registry-load.mjs';
import { buildClassificationMap, buildDormancyMap } from './registry-to-classification-map.mjs';
import { computeOverrideUsageBlock } from './enforce-override-monitor.mjs';
const PRICING = {
sonnet46: { input_per_mtok: 3.0, output_per_mtok: 15.0 },
@@ -274,7 +276,7 @@ Last updated: ${now}
- Legacy v1 episodes (not in factor analysis): ${observer.v1Episodes || 0}
- Last /brain-retro: ${retroLine}
- Использование узлов: см. \`/brain-retro\` (раз в спринт). missed_activations: ${missed.totalMissed}. **Неиспользованные узлы — не алерт, если профильной задачи не было** (Pravila §16.4 v1.36; capability-readiness; см. memory \`feedback_brain_unused_tools_not_problem\` — outside-repo memory store).
${disciplineBlock}${projectsBlock}${inputs.sessionLengthBlock ? `\n${inputs.sessionLengthBlock}\n` : ''}${inputs.costBlock ? `\n${inputs.costBlock}\n` : ''}${inputs.anomalyBlock ? `\n${inputs.anomalyBlock}\n` : ''}${inputs.selfRetrospectBlock ? `\n${inputs.selfRetrospectBlock}\n` : ''}${inputs.reviewerBlock ? `\n${inputs.reviewerBlock}\n` : ''}
${disciplineBlock}${projectsBlock}${inputs.sessionLengthBlock ? `\n${inputs.sessionLengthBlock}\n` : ''}${inputs.costBlock ? `\n${inputs.costBlock}\n` : ''}${inputs.anomalyBlock ? `\n${inputs.anomalyBlock}\n` : ''}${inputs.selfRetrospectBlock ? `\n${inputs.selfRetrospectBlock}\n` : ''}${inputs.reviewerBlock ? `\n${inputs.reviewerBlock}\n` : ''}${inputs.overrideUsageBlock ? `\n${inputs.overrideUsageBlock}\n` : ''}
## Алерт-индикаторы
✅ — норма ・ ⚠️ — внимание ・ 🔴 — действие требуется ・ ⚪ — не запускалось
@@ -404,17 +406,23 @@ if (process.argv[1] && process.argv[1].replace(/\\/g, '/').endsWith('/status-md-
};
const eps = loadCurrentMonthEpisodes();
let costBlock = null, anomalyBlock = null, selfRetrospectBlock = null, reviewerBlock = null, sessionLengthBlock = null;
let costBlock = null, anomalyBlock = null, selfRetrospectBlock = null, reviewerBlock = null, sessionLengthBlock = null, overrideUsageBlock = null;
try { costBlock = computeCostBlock(eps, PRICING); } catch (err) { console.warn('[status-md-generator] costBlock skipped:', err.message); costBlock = '(нет данных)'; }
try { anomalyBlock = computeAnomalyBlock(eps); } catch (err) { console.warn('[status-md-generator] anomalyBlock skipped:', err.message); anomalyBlock = '(нет данных)'; }
try { selfRetrospectBlock = computeSelfRetrospectBlock(join('docs', 'observer', '.self-retrospect-counter.json')); } catch (err) { console.warn('[status-md-generator] selfRetrospectBlock skipped:', err.message); selfRetrospectBlock = '(нет данных)'; }
try { reviewerBlock = computeReviewerBlock(eps); } catch (err) { console.warn('[status-md-generator] reviewerBlock skipped:', err.message); reviewerBlock = '(нет данных)'; }
try { sessionLengthBlock = computeSessionLengthBlock(eps); } catch (err) { console.warn('[status-md-generator] sessionLengthBlock skipped:', err.message); sessionLengthBlock = '(нет данных)'; }
try {
const logPath = join(homedir(), '.claude', 'runtime', 'override-usage.jsonl');
const raw = existsSync(logPath) ? readFileSync(logPath, 'utf-8') : '';
overrideUsageBlock = computeOverrideUsageBlock(raw);
} catch (err) { console.warn('[status-md-generator] overrideUsageBlock skipped:', err.message); overrideUsageBlock = '(нет данных)'; }
inputs.costBlock = costBlock;
inputs.anomalyBlock = anomalyBlock;
inputs.selfRetrospectBlock = selfRetrospectBlock;
inputs.reviewerBlock = reviewerBlock;
inputs.sessionLengthBlock = sessionLengthBlock;
inputs.overrideUsageBlock = overrideUsageBlock;
const md = renderStatus(inputs);
writeFileSync('docs/observer/STATUS.md', md);
+10
@@ -149,6 +149,16 @@ describe('renderStatus — discipline block (stage 2)', () => {
const md = renderStatus(baseInputs);
expect(md).not.toMatch(/## Метрики дисциплины/);
});
it('coexists: both sessionLengthBlock (brain-retro candidate B) and overrideUsageBlock (enforce hole 8) appear together in template after merge', () => {
const md = renderStatus({
...baseInputs,
sessionLengthBlock: '## Длинные сессии\n\nflagged content',
overrideUsageBlock: '## Использование override-фраз\n\nflagged content',
});
expect(md).toContain('## Длинные сессии');
expect(md).toContain('## Использование override-фраз');
});
});
// ── Phase 3 deferred #3: 4 new helper blocks ─────────────────────────────────