Commit Graph

4 Commits

Author SHA1 Message Date
Дмитрий 11822e3803 fix(observer): RU_PHONE regex catches bare 7XXXXXXXXXX (DO-PII-1)
Bug: gitleaks (rule `ru-phone-unmasked`) caught `79135191264` in 3 lines
of docs/observer/episodes-2026-05.jsonl during brain-retro #3 push
(963379c3). Stop-hook PII-filter was not masking bare-format Russian
phone numbers (without the `+` prefix).

Root cause:
  const RU_PHONE = /\+7\d{10}/g;   // requires literal '+7'

Free-text observer episodes captured phone `79135191264` in field-value
context (`call client 79135191264` / `phone 79135191264 in payload`),
slipping past the existing filter.

Fix:
  const RU_PHONE = /(?:\+7|\b7)\d{10}/g;

The `\b7` branch catches bare format with a word-boundary on the left,
avoiding false-positives inside long digit sequences (timestamps, IDs,
hashes). False-positive guard verified via test:
  'id 1796133619135191264999 not a phone' → unchanged.

TDD cycle:
  - RED: 3 new tests + 1 sanitizeWithCount test (4 fails on bare phone)
  - GREEN: regex extended, 24/24 file tests pass, 373/373 full tools
    suite GREEN (0 regressions across 18 files).

Cleanup: applied sanitize() to docs/observer/episodes-2026-05.jsonl;
11 lines touched (3 phone-leak lines + 8 with other PII patterns).
gitleaks now finds 0 leaks in the file.

Pravila §5.2 (no PII in commits) + 152-FZ (phone is regulated PD).
Closes DO-PII-1 (see memory observer-pii-leak-2026-05-23).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 12:26:24 +03:00
Дмитрий dbe2252421 feat(observer): real PII counter — STATUS.md stops lying
Closes brain-retro 2026-05-20 #3 SIMPLIFIED — sanitizeWithCount in
pii-filter (counts matches per pattern) + persistent monthly counter
docs/observer/.pii-counters.json (bumped by Stop-hook on each episode
write) + status-md-generator reads real count (no more piiMatches: 0
hardcode).

PII patterns themselves NOT changed (F7 of parallel session already
extended to 13 patterns).

Counter is informational — write failure never blocks Stop-event.

5+1+1=7 new vitest tests, 256/256 GREEN.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 13:47:36 +03:00
Дмитрий 2476dd3c1b fix(observer): expand PII patterns — JWT/AWS/Yandex/IPv4/OS-username
PII filter previously covered only RU phone, email, Sentry, OpenAI token,
and generic Bearer. Several common surface leaks were uncovered:

- JWT tokens (eyJ<base64>.<base64>.<base64>) — auth/session tokens.
- AWS access key IDs (AKIA<16 alphanum>) — IAM static creds.
- Yandex Cloud IAM static keys (AQVN<base64>), session tokens (t1.<base64>),
  OAuth tokens (y0_<base64>) — primary cloud-provider for this project.
- IPv4 addresses (dotted-quad) — over-redacts 4-segment build numbers as
  an accepted tradeoff (under-redaction is the worse failure).
- Windows user-paths (C:\Users\<name>) → C:\Users\***. Otherwise the OS
  username `Administrator` leaks via task_size.files in every episode.
- POSIX /home/<name>/ → /home/***/. Same rationale for Linux dev hosts.

Pattern order: highly-specific token patterns (JWT/AWS/YC) run BEFORE
OPENAI_TOKEN/GENERIC_BEARER fallbacks; otherwise partial overlaps would
strip the wrong segments.

Tests: 9 new (each new pattern + idempotency over the expanded redaction
markers). 27/27 PII tests green.

.gitleaks.toml: added the test fixture to the path allowlist — the file
contains synthetic JWT/AWS/Yandex tokens (the filter is supposed to redact
them), not real secrets.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 11:10:53 +03:00
Дмитрий 4616308402 feat(observer): PII filter — phone/email/Sentry/OpenAI/Bearer masking
Used by Stop-hook before JSONL write. 6 Vitest cases including
idempotence and recursive object sanitization. Per Pravila §16.2 +
ADR-011 + spec §5.4.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 06:11:25 +03:00