Дмитрий 2476dd3c1b fix(observer): expand PII patterns — JWT/AWS/Yandex/IPv4/OS-username
The PII filter previously covered only RU phone numbers, emails, Sentry
tokens, OpenAI tokens, and generic Bearer tokens. Several common leak
surfaces were not covered:

- JWT tokens (eyJ<base64>.<base64>.<base64>) — auth/session tokens.
- AWS access key IDs (AKIA<16 alphanum>) — IAM static creds.
- Yandex Cloud IAM static keys (AQVN<base64>), session tokens (t1.<base64>),
  OAuth tokens (y0_<base64>) — primary cloud-provider for this project.
- IPv4 addresses (dotted-quad) — over-redacts 4-segment build numbers as
  an accepted tradeoff (under-redaction is the worse failure).
- Windows user-paths (C:\Users\<name>) → C:\Users\***. Otherwise the OS
  username `Administrator` leaks via task_size.files in every episode.
- POSIX /home/<name>/ → /home/***/. Same rationale for Linux dev hosts.
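
A minimal sketch of what patterns like the ones listed above could look like. The regexes, the rule names, the `[REDACTED_*]` markers, and the 16-character length floor on the Yandex patterns are all assumptions for illustration, not the project's actual code; only the token prefixes and the `***` path replacements come from the list above.

```python
import re

B64 = r"[A-Za-z0-9_-]"  # URL-safe base64 alphabet (assumed)

# Hypothetical rule table: (name, compiled pattern, replacement).
RULES = [
    ("JWT", re.compile(rf"\beyJ{B64}+\.{B64}+\.{B64}+"), "[REDACTED_JWT]"),
    ("AWS_KEY_ID", re.compile(r"\bAKIA[A-Z0-9]{16}\b"), "[REDACTED_AWS_KEY]"),
    # Length floor of 16 is a guess to avoid matching short identifiers.
    ("YC_STATIC_KEY", re.compile(rf"\bAQVN{B64}{{16,}}"), "[REDACTED_YC_KEY]"),
    ("YC_SESSION", re.compile(rf"\bt1\.{B64}{{16,}}"), "[REDACTED_YC_SESSION]"),
    ("YC_OAUTH", re.compile(rf"\by0_{B64}{{16,}}"), "[REDACTED_YC_OAUTH]"),
    # Dotted quad; also matches 4-segment build numbers (accepted tradeoff).
    ("IPV4", re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"), "[REDACTED_IP]"),
    # Keep the path prefix, replace only the username segment.
    ("WIN_USER", re.compile(r"([A-Za-z]:\\Users\\)[^\\\s]+"), r"\1***"),
    ("POSIX_HOME", re.compile(r"(/home/)[^/\s]+"), r"\1***"),
]


def redact(text: str) -> str:
    """Apply every rule in table order and return the redacted text."""
    for _name, pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text
```

In this sketch a Windows username containing spaces would only be redacted up to the first space; a real filter may need a looser username class.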

Pattern order: highly-specific token patterns (JWT/AWS/YC) run BEFORE
OPENAI_TOKEN/GENERIC_BEARER fallbacks; otherwise partial overlaps would
strip the wrong segments.
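
The ordering issue can be illustrated with a small sketch. The two regexes below are simplified stand-ins (assumptions), not the real JWT or GENERIC_BEARER definitions: the fallback here stops at the first dot, so running it first consumes only the JWT header and leaves the payload and signature in the output.

```python
import re

# Simplified, hypothetical stand-ins for a specific rule and a fallback rule.
JWT = (
    re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"),
    "[REDACTED_JWT]",
)
GENERIC_BEARER = (
    re.compile(r"(Bearer\s+)[A-Za-z0-9_-]+"),
    r"\1[REDACTED_TOKEN]",
)


def apply_rules(rules, text):
    """Apply (pattern, replacement) pairs strictly in the given order."""
    for pattern, replacement in rules:
        text = pattern.sub(replacement, text)
    return text


line = "Authorization: Bearer eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiIxIn0.c2lnbmF0dXJl"

# Specific rule first: the whole token is replaced in one piece.
specific_first = apply_rules([JWT, GENERIC_BEARER], line)

# Fallback first: only the first segment is replaced, and the JWT rule
# can no longer match the two segments that remain.
fallback_first = apply_rules([GENERIC_BEARER, JWT], line)
```

With these stand-ins, `specific_first` contains no token material, while `fallback_first` still carries the second and third JWT segments.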

Tests: 9 new (each new pattern + idempotency over the expanded redaction
markers). 27/27 PII tests green.
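
An idempotency check of the kind mentioned above could look roughly like this. The rules, markers, and sample strings are hypothetical; the point is only that a second pass over already-redacted text must not change it, which requires that no marker matches any pattern.

```python
import re

# Hypothetical subset of rules; the real filter's table is not shown here.
RULES = [
    (re.compile(r"\bAKIA[A-Z0-9]{16}\b"), "[REDACTED_AWS_KEY]"),
    (re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"), "[REDACTED_IP]"),
    (re.compile(r"(/home/)[^/\s]+"), r"\1***"),
]


def redact(text: str) -> str:
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text


def is_idempotent(sample: str) -> bool:
    """True when redacting twice gives the same result as redacting once."""
    once = redact(sample)
    return redact(once) == once


# Synthetic samples only; none of these are real credentials or hosts.
SAMPLES = [
    "key AKIAIOSFODNN7EXAMPLE from 192.168.1.10",
    "opened /home/dev/app/log.txt",
    "nothing sensitive here",
]
```

Note that the `/home/***` output is matched again on the second pass, but it is replaced with the same text, so the result is stable.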

.gitleaks.toml: added the test fixture to the path allowlist — the file
contains synthetic JWT/AWS/Yandex tokens (the filter is supposed to redact
them), not real secrets.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 11:10:53 +03:00