Bug: gitleaks (rule `ru-phone-unmasked`) caught `79135191264` in 3 lines
of docs/observer/episodes-2026-05.jsonl during brain-retro #3 push
(963379c3). Stop-hook PII-filter was not masking bare-format Russian
phone numbers (without the `+` prefix).
Root cause:
const RU_PHONE = /\+7\d{10}/g; // requires literal '+7'
Free-text observer episodes captured the phone `79135191264` in
field-value context (`call client 79135191264` / `phone 79135191264 in
payload`), and the bare format slipped past the existing filter.
Fix:
const RU_PHONE = /(?:\+7|\b7)\d{10}/g;
The `\b7` branch matches the bare format only when the `7` sits at a
word boundary on the left, so a match cannot begin in the middle of a
longer digit sequence (timestamps, IDs, hashes). False-positive guard
verified via test:
'id 1796133619135191264999 not a phone' → unchanged.
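A minimal sketch of the before/after behavior described above. The
helper name `maskPhones` and the `[PHONE]` placeholder are assumptions
made for illustration; the filter's real marker and API may differ.

```typescript
// Old pattern: requires a literal "+7" prefix.
const RU_PHONE_OLD = /\+7\d{10}/g;
// New pattern: "+7", or a bare "7" that starts at a word boundary.
const RU_PHONE = /(?:\+7|\b7)\d{10}/g;

// Hypothetical helper; "[PHONE]" is an assumed placeholder.
function maskPhones(text: string, re: RegExp = RU_PHONE): string {
  return text.replace(re, "[PHONE]");
}

// Bare and prefixed formats are both masked by the new pattern.
const bare = maskPhones("call client 79135191264");
const prefixed = maskPhones("call client +79135191264");
// The old pattern leaves the bare format untouched (the bug).
const missedBefore = maskPhones("call client 79135191264", RU_PHONE_OLD);
// No word boundary exists inside the digit run, so nothing matches.
const longId = maskPhones("id 1796133619135191264999 not a phone");
```

Note that the boundary is checked on the left side only: under this
pattern, a digit run that itself begins with `7` and is longer than 11
digits would still have its first 11 digits masked.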
TDD cycle:
- RED: 3 new tests + 1 sanitizeWithCount test (all 4 fail on the bare
phone format)
- GREEN: regex extended; 24/24 tests in the file pass and the full tools
suite is 373/373 GREEN (0 regressions across 18 files).
Cleanup: applied sanitize() to docs/observer/episodes-2026-05.jsonl;
11 lines touched (3 phone-leak lines + 8 with other PII patterns).
gitleaks now finds 0 leaks in the file.
Per Pravila §5.2 (no PII in commits) + 152-FZ (a phone number is
regulated personal data).
Closes DO-PII-1 (see memory observer-pii-leak-2026-05-23).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes brain-retro 2026-05-20 #3 SIMPLIFIED:
- sanitizeWithCount in pii-filter counts matches per pattern.
- A persistent monthly counter, docs/observer/.pii-counters.json, is
bumped by the Stop-hook on each episode write.
- status-md-generator reads the real count (no more hardcoded
piiMatches: 0).
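A rough sketch of per-pattern counting, assuming a replace callback is
used to tally matches. The two patterns, the pattern names, and the
`[PHONE]` / `[EMAIL]` markers are illustrative stand-ins, and the return
shape is an assumption rather than the actual sanitizeWithCount
signature.

```typescript
type PiiPattern = { name: string; re: RegExp; mask: string };

// Illustrative subset; the real filter carries more patterns.
const PATTERNS: PiiPattern[] = [
  { name: "ru_phone", re: /(?:\+7|\b7)\d{10}/g, mask: "[PHONE]" },
  { name: "email", re: /[\w.+-]+@[\w-]+(?:\.[\w-]+)+/g, mask: "[EMAIL]" },
];

// Redacts the text and reports how many times each pattern matched.
function sanitizeWithCount(text: string): { text: string; counts: Record<string, number> } {
  const counts: Record<string, number> = {};
  let out = text;
  for (const p of PATTERNS) {
    // The callback runs once per match, which gives the count for free.
    out = out.replace(p.re, () => {
      counts[p.name] = (counts[p.name] ?? 0) + 1;
      return p.mask;
    });
  }
  return { text: out, counts };
}

const result = sanitizeWithCount("phone 79135191264, mail a@b.example and c@d.example");
```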
The PII patterns themselves are NOT changed (F7 of a parallel session
already extended the set to 13 patterns).
Counter is informational — write failure never blocks Stop-event.
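One way to keep the counter non-blocking is to swallow every storage
error at the call site. The sketch below assumes a month-to-count
layout for the counter data and uses a hypothetical `CounterStore`
interface in place of real file I/O; neither is taken from the actual
hook.

```typescript
// Hypothetical storage abstraction so the sketch stays self-contained;
// the real hook persists to docs/observer/.pii-counters.json.
interface CounterStore {
  read(): Record<string, number>; // may throw
  write(data: Record<string, number>): void; // may throw
}

// Adds `matches` to the counter for `month` (e.g. "2026-05").
// Returns false instead of throwing, so the caller is never blocked.
function bumpMonthlyCounter(store: CounterStore, month: string, matches: number): boolean {
  try {
    const data = store.read();
    data[month] = (data[month] ?? 0) + matches;
    store.write(data);
    return true;
  } catch {
    return false; // informational only: the failure is swallowed
  }
}

// An in-memory store that works, and one whose write always fails.
let saved: Record<string, number> = {};
const okStore: CounterStore = {
  read: () => ({ ...saved }),
  write: (d) => { saved = d; },
};
const brokenStore: CounterStore = {
  read: () => ({}),
  write: () => { throw new Error("disk full"); },
};

const first = bumpMonthlyCounter(okStore, "2026-05", 3);
const second = bumpMonthlyCounter(okStore, "2026-05", 2);
const failed = bumpMonthlyCounter(brokenStore, "2026-05", 1);
```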
5+1+1=7 new vitest tests, 256/256 GREEN.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
PII filter previously covered only RU phone, email, Sentry, OpenAI token,
and generic Bearer. Several common leak surfaces were not covered:
- JWT tokens (eyJ<base64>.<base64>.<base64>) — auth/session tokens.
- AWS access key IDs (AKIA<16 alphanum>) — IAM static creds.
- Yandex Cloud IAM static keys (AQVN<base64>), session tokens (t1.<base64>),
OAuth tokens (y0_<base64>) — primary cloud-provider for this project.
- IPv4 addresses (dotted-quad) — over-redacts 4-segment build numbers as
an accepted tradeoff (under-redaction is the worse failure).
- Windows user-paths (C:\Users\<name>) → C:\Users\***. Otherwise the OS
username `Administrator` leaks via task_size.files in every episode.
- POSIX /home/<name>/ → /home/***/. Same rationale for Linux dev hosts.
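The path and IPv4 items in the list above could look roughly like the
following. The regexes are written from the descriptions only, and the
`[IP]` marker and the `redactHostDetails` name are assumptions; the
`***` marker for user names is the one stated in the list.

```typescript
// Assumed patterns; the filter's actual regexes may differ.
const WIN_USER_PATH = /([A-Za-z]:\\Users\\)[^\\\/\s"]+/g;
const POSIX_HOME = /\/home\/[^\/\s"]+\//g;
const IPV4 = /\b(?:\d{1,3}\.){3}\d{1,3}\b/g;

// Hypothetical helper that applies the three redactions in sequence.
function redactHostDetails(text: string): string {
  return text
    .replace(WIN_USER_PATH, "$1***") // keep "C:\Users\", hide the name
    .replace(POSIX_HOME, "/home/***/")
    .replace(IPV4, "[IP]");
}

const win = redactHostDetails("C:\\Users\\Administrator\\repo\\tools\\pii-filter.ts");
const posix = redactHostDetails("/home/dev/repo/tools/pii-filter.ts");
// A 4-segment build number is indistinguishable from a dotted quad.
const build = redactHostDetails("build 10.4.2.118 deployed");
// Re-running over already redacted text changes nothing.
const winAgain = redactHostDetails(win);
```

The `build` case shows the accepted tradeoff: a dotted-quad pattern has
no way to tell a version string from an address, so both are redacted.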
Pattern order: highly-specific token patterns (JWT/AWS/YC) run BEFORE
OPENAI_TOKEN/GENERIC_BEARER fallbacks; otherwise partial overlaps would
strip the wrong segments.
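The ordering concern can be reproduced with a small sketch. All three
patterns and their markers here are hypothetical; in particular, the
fallback Bearer pattern is assumed to stop at the first `.`, which is
what makes the overlap partial.

```typescript
type Pattern = { name: string; re: RegExp; mask: string };

// Assumed patterns for illustration only.
const JWT: Pattern = {
  name: "jwt",
  re: /eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+/g,
  mask: "[JWT]",
};
const AWS_KEY: Pattern = { name: "aws", re: /\bAKIA[A-Z0-9]{16}\b/g, mask: "[AWS_KEY]" };
const GENERIC_BEARER: Pattern = {
  name: "bearer",
  re: /Bearer\s+[A-Za-z0-9_-]+/g, // no "." in the class: stops mid-JWT
  mask: "Bearer [TOKEN]",
};

// Applies each pattern in the given order.
function applyInOrder(text: string, patterns: Pattern[]): string {
  return patterns.reduce((acc, p) => acc.replace(p.re, p.mask), text);
}

const input = "Authorization: Bearer eyJhbGciOi.eyJzdWIi.c2lnbmF0dXJl";
// Specific first: the whole three-segment token is replaced.
const specificFirst = applyInOrder(input, [JWT, AWS_KEY, GENERIC_BEARER]);
// Fallback first: only the header segment is consumed, and the JWT
// pattern can no longer match the two segments left behind.
const fallbackFirst = applyInOrder(input, [GENERIC_BEARER, JWT, AWS_KEY]);
```

With the fallback first, the payload and signature segments survive in
the output, which is the failure mode the ordering is meant to prevent.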
Tests: 9 new (each new pattern + idempotency over the expanded redaction
markers). 27/27 PII tests green.
.gitleaks.toml: added the test fixture to the path allowlist — the file
contains synthetic JWT/AWS/Yandex tokens (the filter is supposed to redact
them), not real secrets.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Used by Stop-hook before JSONL write. 6 Vitest cases including
idempotence and recursive object sanitization. Per Pravila §16.2 +
ADR-011 + spec §5.4.
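Recursive sanitization and the idempotence property can be sketched as
follows. The single email pattern and the `[EMAIL]` marker stand in for
the real string-level filter, and the function names are assumptions.

```typescript
// Hypothetical string-level filter standing in for the real one.
function sanitizeString(text: string): string {
  return text.replace(/[\w.+-]+@[\w-]+(?:\.[\w-]+)+/g, "[EMAIL]");
}

// Walks arrays and plain objects, sanitizing every string value.
function sanitize(value: unknown): unknown {
  if (typeof value === "string") return sanitizeString(value);
  if (Array.isArray(value)) return value.map((item) => sanitize(item));
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value)) {
      out[key] = sanitize(inner);
    }
    return out;
  }
  return value; // numbers, booleans, null, undefined pass through
}

const episode = {
  note: "mail ops@corp.example",
  files: ["a.ts"],
  meta: { owner: "dev@corp.example", size: 3 },
};
const once = sanitize(episode);
// Idempotence: sanitizing already sanitized data is a no-op.
const twice = sanitize(once);
```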
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>