fix(adr-judge): catastrophic backtracking on prose-only Enforcement section

ENFORCEMENT_BLOCK_RE was a single regex with a nested non-greedy
quantifier `(?:.*?\n)*?` compiled with re.DOTALL. Under DOTALL the
inner `.*?` can itself cross newlines, so the group can carve the
same text into "lines" in exponentially many ways. When an ADR has
the `## Enforcement` heading but no fenced ```json block in that
section (prose-only enforcement is legitimate; see ADR-011, where
the prose explicitly says "this section's existence is verified
per-commit"), the regex engine tries all of those splits while
searching for a fence that does not exist across ~50+ lines of
subsequent prose.
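
A minimal sketch of the blow-up, assuming a synthetic prose-only ADR.
The `ADR-999` text and the helpers `prose_only_adr` / `time_search`
are hypothetical and not part of adr-kit; only the pattern is the
pre-fix one. Each extra prose line should roughly double the search
time, so the sizes are kept small:

```python
import re
import time

# Pattern copied verbatim from the pre-fix ENFORCEMENT_BLOCK_RE.
OLD_RE = re.compile(
    r"^##\s+Enforcement\s*$\n+(?:.*?\n)*?```json\s*\n(.*?)\n```",
    re.IGNORECASE | re.MULTILINE | re.DOTALL,
)


def prose_only_adr(n_lines: int) -> str:
    """Build a hypothetical ADR: Enforcement heading, n prose lines, no fence."""
    body = "".join(f"prose line {i}\n" for i in range(n_lines))
    return "# ADR-999\n\n## Enforcement\n\n" + body


def time_search(n_lines: int) -> float:
    """Return the seconds the old regex needs to give up on n prose lines."""
    text = prose_only_adr(n_lines)
    start = time.perf_counter()
    result = OLD_RE.search(text)
    elapsed = time.perf_counter() - start
    assert result is None  # there is no ```json fence to find
    return elapsed


if __name__ == "__main__":
    # Keep n small: the cost grows exponentially with the line count.
    for n in (8, 10, 12, 14, 16):
        print(f"{n:2d} prose lines: {time_search(n) * 1000:.2f} ms")
```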

Observed: lefthook adr-judge job >60s timeout (exit 124) on every
commit, traced to ADR-011 (10337 B) — ADR-016 has the same shape
and would have hung next. Other ADRs (000–010) finish in <0.2 ms
either because they have a fenced JSON block to find or no
`## Enforcement` heading at all.

Fix: decompose into three non-backtracking searches —
1. find `## Enforcement` heading
2. find next `## ` heading (section boundary; falls back to EOF)
3. search ```json fence ONLY within that section

Side benefit: the JSON fence is now correctly scoped to the
Enforcement section, so a ```json block in a later section
(References, Amendment, etc.) is no longer accidentally picked up.
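
The three steps above can be sketched as a standalone function. This
is an illustration under assumptions, not the vendored code: the name
`extract_enforcement` and both sample ADR strings are hypothetical,
and the real parser wraps JSON errors in JudgeError instead of letting
json.JSONDecodeError propagate.

```python
import json
import re
from typing import Dict, Optional

ENFORCEMENT_HEADING_RE = re.compile(
    r"^##\s+Enforcement\s*$", re.IGNORECASE | re.MULTILINE
)
NEXT_SECTION_HEADING_RE = re.compile(r"^##\s+", re.MULTILINE)
JSON_FENCE_RE = re.compile(r"```json\s*\n(.*?)\n```", re.DOTALL)


def extract_enforcement(adr_text: str) -> Optional[Dict]:
    """Return the JSON in the Enforcement section, or None if there is none."""
    # Step 1: find the heading.
    hm = ENFORCEMENT_HEADING_RE.search(adr_text)
    if not hm:
        return None
    # Step 2: the section ends at the next level-2 heading, or at EOF.
    section_start = hm.end()
    nm = NEXT_SECTION_HEADING_RE.search(adr_text, section_start)
    section_end = nm.start() if nm else len(adr_text)
    # Step 3: look for the fence only between those two offsets.
    fm = JSON_FENCE_RE.search(adr_text, section_start, section_end)
    if not fm:
        return None
    return json.loads(fm.group(1))


# Hypothetical samples: prose-only Enforcement with a stray fence later on,
# and an Enforcement section that carries its own fenced JSON.
PROSE_ONLY = (
    "## Enforcement\n\nThis section's existence is verified per-commit.\n\n"
    "## References\n\n```json\n{\"unrelated\": true}\n```\n"
)
WITH_JSON = (
    "## Enforcement\n\n```json\n{\"forbid_import\": []}\n```\n\n## References\n"
)

if __name__ == "__main__":
    print(extract_enforcement(PROSE_ONLY))
    print(extract_enforcement(WITH_JSON))
```

Passing `pos` and `endpos` to `Pattern.search` bounds the scan without
copying the section out of the ADR text.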

Verification:
- Repro `tools/adr-judge-repro.py`: all 13 ADRs parse in <1 ms each
  post-fix (ADR-011 / ADR-016 prose-only sections return None
  correctly; ADR-001 still extracts its forbid_import / require_pattern
  / llm_judge keys).
- End-to-end `python -X utf8 tools/adr-judge.py --diff - --adr-dir docs/adr/`
  with a small diff: exit 0 in <1 s (was: >60 s timeout).
- Lefthook adr-judge job in the preceding brain-retro commit
  (b1398883): 0.25 s, OK.

Note: tools/adr-judge.py is vendored from adr-kit v0.13.1 (per the
lefthook.yml comment "пере-вендорить после /adr-kit:upgrade", i.e.
"re-vendor after /adr-kit:upgrade"). This fix should be reported
upstream; until upstream releases the patched parser, the local
change must be preserved across re-vendoring.

ремонт инфраструктуры
ремонт: catastrophic-backtracking in adr-judge ENFORCEMENT_BLOCK_RE
        blocks every commit > 60 s on prose-only Enforcement sections
        (ADR-011, ADR-016)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Дмитрий
2026-05-27 18:09:38 +03:00
parent b139888376
commit 1e1457eb4c
2 changed files with 57 additions and 31 deletions
+30 -23
@@ -1,6 +1,6 @@
 # Brain Status (auto-generated)
-Last updated: 2026-05-27T05:23:19.894Z
+Last updated: 2026-05-27T15:09:00.708Z
 | Контролёр | Состояние | Детали |
 |---|---|---|
@@ -8,13 +8,13 @@ Last updated: 2026-05-27T05:23:19.894Z
 | C2 Cross-ref consistency | ✅ | [cross-ref-checker] OK — 0 drift in 4 files |
 | C3 Observer-of-observer | ✅ | [observer-of-observer] OK — last read 0 week(s) ago |
 | C4 Сигнальный статус | ✅ | This file (self-reference) |
-| C5 Observer-coverage | ⚠️ | 632 episode(s) this month · Stop-hook + post-commit OK · 21 missed activation(s) — see /brain-retro |
+| C5 Observer-coverage | ⚠️ | 692 episode(s) this month · Stop-hook + post-commit OK · 21 missed activation(s) — see /brain-retro |
 | C6 Chain map sync | ✅ | [chain-map-checker] OK — 16 chains in sync |
 ## Метрики (информационные, не алерты)
-- Observer evidence: 632 episodes this month, 0 observer_error markers, 133 PII matches before filter
-- Legacy v1 episodes (not in factor analysis): 493
+- Observer evidence: 692 episodes this month, 0 observer_error markers, 154 PII matches before filter
+- Legacy v1 episodes (not in factor analysis): 553
 - Last /brain-retro: 0 day(s) ago
 - Использование узлов: см. `/brain-retro` (раз в спринт). missed_activations: 21. **Неиспользованные узлы — не алерт, если профильной задачи не было** (Pravila §16.4 v1.36; capability-readiness; см. memory `feedback_brain_unused_tools_not_problem` — outside-repo memory store).
@@ -24,17 +24,17 @@ Baseline дисциплины роутера (этап 2 router discipline overh
 | Тип задачи | Эпизодов | % с триггер-матчем | % через скил |
 |---|---|---|---|
-| analysis | 26 | 30.8% | 15.4% |
-| monitoring | 26 | 0.0% | 0.0% |
-| bugfix | 18 | 22.2% | 27.8% |
-| planning | 16 | 18.8% | 18.8% |
-| feature | 15 | 13.3% | 0.0% |
-| cleanup | 6 | 0.0% | 0.0% |
+| monitoring | 32 | 0.0% | 0.0% |
+| analysis | 27 | 29.6% | 14.8% |
+| bugfix | 19 | 21.1% | 26.3% |
+| planning | 17 | 17.6% | 17.6% |
+| feature | 16 | 12.5% | 0.0% |
+| cleanup | 7 | 0.0% | 0.0% |
 | refactor | 1 | 0.0% | 0.0% |
-Router step distribution: 1: 272, 2: 233, 3: 60, 5: 60
+Router step distribution: 1: 297, 2: 260, 3: 65, 5: 62
-Boundaries applied (ADR / границы): 73 of 625 эпизодов (11.7%).
+Boundaries applied (ADR / границы): 78 of 684 эпизодов (11.4%).
 ## Активные многоэтапные проекты
@@ -46,16 +46,23 @@ Boundaries applied (ADR / границы): 73 of 625 эпизодов (11.7%).
 ## Длинные сессии
-Ни одной сессии с >50 ходов сегодня (UTC). ✅
+⚠️ Сегодня (2026-05-27 UTC) есть сессии с 50 ходов — корреляция с падением дисциплины роутинга (retro #5 candidate B).
+| session_id | макс. ход | % regulated | последний эпизод |
+|---|---|---|---|
+| `0ade4c82` | 54 | 9% | 2026-05-27T12:49:21.664Z |
+| `b11f6b8d` | 51 | 4% | 2026-05-27T08:32:49.803Z |
+Long sessions correlate with discipline drift. Если % regulated просел в текущей сессии — рассмотри перезапуск.
 ## Стоимость месяца
 | Компонент | Токены (in/out) | USD |
 |---|---|---|
-| Classifier (Sonnet 4.6) | 3036/36816 | $0.56 |
+| Classifier (Sonnet 4.6) | 6458/61654 | $0.94 |
 | Self-assessment (Sonnet 4.6) | 0/0 | $0.00 |
 | Reviewer (Opus 4.7 + fallback) | 0/0 | $0.00 |
-| **Итого** | | **$0.56** |
+| **Итого** | | **$0.94** |
 ## Аномалии классификатора
@@ -63,12 +70,12 @@ Boundaries applied (ADR / границы): 73 of 625 эпизодов (11.7%).
 ## Авто-ретроспектива
-Last self-retrospect: never ⚠️ (542 эпизодов с последнего запуска, порог 10)
-Episodes since last run: 542 / threshold: 10
+Last self-retrospect: never ⚠️ (609 эпизодов с последнего запуска, порог 10)
+Episodes since last run: 609 / threshold: 10
 ## Reviewer: субагент vs fallback
-0 эпизодов проверено из 632.
+0 эпизодов проверено из 692.
 ## Reviewer findings
@@ -110,13 +117,13 @@ Episodes since last run: 542 / threshold: 10
 | Фраза | За всё время | За сегодня |
 |---|---|---|
-| `recovery` | 162 | 68 ⚠️ |
-| `ремонт инфраструктуры` | 116 | 45 ⚠️ |
+| `recovery` | 251 | 157 ⚠️ |
+| `ремонт инфраструктуры` | 159 | 88 ⚠️ |
 | `срочно` | 82 | 39 ⚠️ |
-| `без скилов` | 50 | 24 ⚠️ |
+| `без скилов` | 56 | 30 ⚠️ |
+| `memory dump` | 8 | 6 ⚠️ |
 | `direct ok` | 6 | 2 |
-| `memory dump` | 2 | 0 |
-| `быстрый коммит` | 1 | 0 |
+| `быстрый коммит` | 3 | 2 |
 ## Алерт-индикаторы
+27 -8
@@ -72,10 +72,22 @@ STATUS_BOLD_INLINE_RE = re.compile(
     r"^\s*\*\*\s*Status\s*:?\s*\*\*\s*:?\s*([A-Za-z]+)|^\s*\*\*\s*Status\s*:?\s*([A-Za-z]+)\s*\*\*",
     re.IGNORECASE | re.MULTILINE,
 )
-ENFORCEMENT_BLOCK_RE = re.compile(
-    r"^##\s+Enforcement\s*$\n+(?:.*?\n)*?```json\s*\n(.*?)\n```",
-    re.IGNORECASE | re.MULTILINE | re.DOTALL,
+# Section-bounded Enforcement parsing. The previous single-regex form
+# `^##\s+Enforcement\s*$\n+(?:.*?\n)*?` ```json ... ``` ` with re.DOTALL
+# suffered catastrophic backtracking when an ADR had `## Enforcement` but
+# no fenced JSON block in it (prose-only enforcement is valid — see
+# ADR-011). The nested non-greedy quantifier `(?:.*?\n)*?` with DOTALL
+# exhausted the regex engine searching for a non-existent closing fence
+# through ~50+ lines, producing 60s+ hangs.
+#
+# Fix: decompose into three non-backtracking searches. Side benefit —
+# the JSON fence is now correctly scoped to the Enforcement section, so
+# a ```json block in a later section (e.g. References) is not picked up.
+ENFORCEMENT_HEADING_RE = re.compile(
+    r"^##\s+Enforcement\s*$", re.IGNORECASE | re.MULTILINE
 )
+NEXT_SECTION_HEADING_RE = re.compile(r"^##\s+", re.MULTILINE)
+JSON_FENCE_RE = re.compile(r"```json\s*\n(.*?)\n```", re.DOTALL)
 HUNK_HEADER_RE = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")
@@ -115,13 +127,20 @@ def adr_status(text: str) -> Optional[str]:
 def parse_enforcement(adr_text: str, adr_path: Path) -> Optional[Dict]:
     """Extract and parse the JSON inside an ADR's ## Enforcement section.
-    Returns None when there is no Enforcement section. Raises JudgeError when
-    the section exists but the JSON is malformed.
+    Returns None when there is no Enforcement section, OR the section has
+    no fenced JSON block (prose-only enforcement is valid — see ADR-011).
+    Raises JudgeError when the JSON exists but is malformed.
     """
-    m = ENFORCEMENT_BLOCK_RE.search(adr_text)
-    if not m:
+    hm = ENFORCEMENT_HEADING_RE.search(adr_text)
+    if not hm:
         return None
-    raw = m.group(1)
+    section_start = hm.end()
+    nm = NEXT_SECTION_HEADING_RE.search(adr_text, section_start)
+    section_end = nm.start() if nm else len(adr_text)
+    fm = JSON_FENCE_RE.search(adr_text, section_start, section_end)
+    if not fm:
+        return None
+    raw = fm.group(1)
     try:
         data = json.loads(raw)
     except json.JSONDecodeError as e: