portal

Author	SHA1	Message	Date
Дмитрий	972be5c58a	ci: fix pre-deploy-checks paths (APP_DIR + backup dir) Canonical paths from deploy.yml: - APP_DIR: /opt/liderra/app → /var/www/liderra/app - Backup dir: /var/backups/postgresql → /home/ubuntu/deploy-backups/ (deploy.yml stores pre-deploy backups as app-pre-deploy-*.tgz) Check 4 now also reports NOTE instead of FAIL when the backup is older than 24h or the dir is missing, since deploy.yml creates a fresh backup itself before rollout. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-29 18:29:38 +03:00
Дмитрий	7c5b7215a1	ci: pre-deploy-checks workflow (Pravila §2.4 via Azure runner) Reproduces the 8 pre-flight checks of the project-local agent prod-deploy-validator via a GitHub Actions runner (Azure), bypassing the YC backbone filter that blocks direct SSH from dev IP 89.144.17.119. Read-only: changes nothing on prod. Returns GO/NO-GO in the exit code. Uses the same LIDERRA_SSH_KEY as deploy.yml. Cross-ref: docs/Pravila_raboty_Claude_v1_1.md §2.4, .claude/agents/prod-deploy-validator.md Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-29 18:27:08 +03:00
Дмитрий	34bcc570ad	fix(setup-logrotate): add 'su postgres postgres' directive for PG logrotate repair: logrotate refused to rotate the PG log because of insecure parent dir permissions /var/log/postgresql/ has permissions drwxrwxr-t (group-writable + sticky). Logrotate refuses to rotate without an explicit su directive in the config. The stock postgresql-common config also uses 'su'; copying that idiom.	2026-05-29 14:48:05 +03:00
Дмитрий	6383da7f12	chore(incident-followup): close 4 loose ends from 29.05 disk-full incident repair: incident-followup cleanup batch, 4 loose ends 1. Larastan baseline regenerated (was 161 errors pre-existing IDE helper drift) 2. Deptrac Mail: [Model, Service] + ADR-005 amend (was 4 pre-existing violations) 3. PG logrotate config in setup-logrotate.yml 4. F1 6 mismatches: RCA updated (algorithm divergence trigger global vs verify per-tenant) +3 cspell words: notifempty, missingok, верифицируется. Ref: docs/incidents/2026-05-29-disk-full-pg-recovery.md §4-5	2026-05-29 14:45:28 +03:00
Дмитрий	8fde6a3b50	ops(prevention): disk-usage-alert workflow, cron every 30min repair: prevent recurrence of 29.05 disk-full incident GitHub Actions cron */30 min: ssh + df -h /. Threshold 85% → warning, 95% → critical (job fails, GitHub notifications fire). Output: GITHUB_STEP_SUMMARY with size/used/avail + likely causes from incident. Future: extend sql-runner whitelist for INSERT into incidents_log (post-Б-1 Sentry/Telegram bot integration).	2026-05-29 13:57:40 +03:00
Дмитрий	ef19b9f256	fix(f1-rebuild): canonical ROW(...) expression matching AuditRebuildChain.php repair: previous rebuild left 6 mismatches on activity_log_y2026_m05 Previous workflow used t::text::bytea (full row). Canonical algorithm uses explicit ROW(col1, ..., NULL::bytea, ..., coln)::text::bytea with COLUMN_CONFIG. Workflow now switches ROW expression by partition family. +6 cspell words: psql/euo/coln/esac/cnt/bytea.	2026-05-29 13:53:18 +03:00
Дмитрий	1c4c22ab5e	fix(f1-rebuild): use shell expansion for PARTITION/FROM_ID in DO block repair: psql \set vars are not expanded inside a server-side plpgsql DO block Section 2 (DO $rebuild$ block) used :'partition' and :from_id, but client-side psql substitution does not work inside DO (server-side parse). Replaced with shell expansion ('$PARTITION', $FROM_ID) before psql runs. Sections 1+3 unchanged (plain psql statements work there).	2026-05-29 13:43:30 +03:00
Дмитрий	1001b89a91	ops(incident-followup): f1-rebuild-via-superuser workflow repair: F1 chain rebuild for 152-FZ integrity Closes deferred item from docs/incidents/2026-05-29-disk-full-pg-recovery.md §4.1. Sequential hash recomputation in a plpgsql DO block via sudo -u postgres psql. Same algorithm as the trigger audit_chain_hash() (post-F1 advisory-lock). Inputs: partition (whitelist), from_id, dry_run/confirm_apply. Safety: partition whitelist, ON_ERROR_STOP, COMMIT only after full loop.	2026-05-29 13:40:11 +03:00
Дмитрий	a21712c9e1	ops(incident-prevention): setup-logrotate workflow for Laravel logs repair: an 8.7G laravel.log filled the disk on 29.05; size-based rotation at 50M with 5 copies is needed Installs /etc/logrotate.d/laravel-liderra: - size 50M (rotate when >= 50MB, not daily) - rotate 5 (keep 5 rotated copies = max ~250MB total) - compress + delaycompress - copytruncate (atomic, does not disturb the Laravel file handle) - su/create www-data:www-data Verified via logrotate --debug + --force. Prevents recurrence of disk-full incident 2026-05-29.	2026-05-29 13:25:40 +03:00
Дмитрий	1e5378da94	ops(incident): allow audit:rebuild-chain в artisan-run whitelist Adds audit:rebuild-chain --partition=<name> --from-id=<n> [--force] to MUTATING_RE regex group. Required to rebuild hash chain on 2 broken partitions (activity_log_y2026_m05 from id=599, balance_transactions_y2026_m05 from id=462) after F1 advisory-lock migration applied. Ref: docs/superpowers/plans/2026-05-29-audit-chain-race-fix.md Step 3.3	2026-05-29 13:15:29 +03:00
Дмитрий	8092bdb024	ops(incident): f1-apply-via-superuser workflow repair: deploy.yml fails on the F1 migration: schema public requires the postgres superuser, and crm_migrator has no privilege for CREATE OR REPLACE FUNCTION Applies F1 audit-chain advisory-lock migration via sudo -u postgres psql, then INSERTs migration row so subsequent php artisan migrate skips it. Workaround for prod deploy where crm_migrator can't modify public schema.	2026-05-29 13:03:05 +03:00
Дмитрий	7f7036f3ab	ops(incident): disk-recover v2: laravel.log 8.7G + sudo bash redirect for PG log repair: v1 freed only 440M (apt clean + nginx gz); the main culprits are laravel.log 8.7G + syslog 525M + playwright cache 440M; sudo truncate on the PG log gave Permission denied, so the workaround is sudo bash -c ': > file' Targeted fixes for v1 issues: - laravel.log 8.7G + laravel.log.1 572M → truncate via sudo bash redirect - syslog 525M → truncate - PG log 497M → workaround via sudo bash redirect (sudo truncate gave Permission denied) - /var/www/.cache/ms-playwright ~440M → removed (dev cache, not needed in prod)	2026-05-29 12:48:04 +03:00
Дмитрий	883908ea78	ops(incident): disk-recover workflow for liderra.ru / 100% full repair: PG is in a PANIC loop because / is at 19G/19G/0; targeted log cleanup is needed so PG can write a checkpoint and finish recovery Diagnose + safe cleanup workflow: - truncate /var/log/postgresql/postgresql-16-main.log (PG in PANIC, inode preserved) - journalctl --vacuum-size=200M - nginx old .gz >3 days - apt-get clean - Laravel storage/logs .log >7 days - generic /var/log *.gz >50M Triggered manually via gh workflow run disk-recover.yml -f confirm_apply=true Guard: confirm_apply must be true.	2026-05-29 12:45:44 +03:00
Дмитрий	f187425835	ops(incident): pg-diagnose workflow for PostgreSQL recovery diagnosis (on main for gh workflow run dispatch) repair: PG has not responded for 20+ min; a diagnostic workflow is needed Read-only SSH-based diagnostic for PG-not-accepting-connections incident: systemctl/journalctl/df/free/uptime + tail /var/log/postgresql/postgresql-16-main.log + WAL size + dmesg + HTTPS probe of liderra.ru. Triggered manually via gh workflow run pg-diagnose.yml. No production mutations. (Cherry-picked from feat/router-gate-hard-wall `8cbb84e1`; gh workflow run requires file on default branch.)	2026-05-29 12:39:18 +03:00
Дмитрий	f97103b05f	fix(review): apply F2 review feedback — sql-runner semicolon guard + RouteSupplierLeadJob original_error log capture Important fix (sql-runner.yml): Reject multi-statement SQL — `SELECT 1; UPDATE supplier_leads ...` was passing READ_RE whitelist and executing the second statement on prod without confirm_mutating=true. Added explicit `";"` guard before regex checks. Minor fix (RouteSupplierLeadJob.php): Capture `$originalError = \$lead->error` BEFORE `\$lead->update(...)`. Laravel mutates the in-memory model, so reading `\$lead->error` after update returns the already-suffixed value, making Log::info `original_error` field useless for debugging. Both findings from F2 review subagent on commit c8c089cb. Test verification: 10/10 Pest GREEN (6 SupplierWebhookFastFail + 4 SingleLeadStorm).	2026-05-29 09:11:28 +03:00
Дмитрий	002b8c4c35	ops(sql-runner): add whitelisted SQL workflow + stuck-leads cleanup doc .github/workflows/sql-runner.yml: general-purpose SQL runner for prod operations via GitHub Actions (workflow_dispatch). Whitelist: SELECT/WITH/EXPLAIN (read-only) + targeted UPDATE/DELETE on 5 tables when confirm_mutating=true. docs/ops/2026-05-29-stage5-stuck-leads-cleanup.md: rollback log template + instructions for cleaning up 2 stuck supplier_leads (id=1110, 1157, ~256k failed_webhook_jobs). Root cause: supplier crm.bp-gr.ru sends a B1+SMS combo, which constraint chk_supplier_projects_b1_not_for_sms forbids (Finding 2 Stage 5). Task 1 plan 2026-05-29-supplier-webhook-fast-fail-and-stuck-cleanup.md. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-29 09:11:26 +03:00
Дмитрий	23c7615284	ci(stage5-investigate): round 3 schema discovery — list columns of activity_log/balance_transactions/supplier_projects/supplier_leads; SELECT * on broken audit rows (ids 597-601 + 460-464) and stuck supplier_leads (1110, 1157) + sample failed_webhook_jobs raw_payload + all B1 supplier_projects	2026-05-29 06:41:01 +03:00
Дмитрий	fdd688dc06	ci(stage5-investigate): round 2 root-cause queries — chain triggers on broken vs healthy partitions + audit_chain_hash function + broken row context (ids 599/462 + neighbours); webhook storm — top supplier_lead_id + supplier_projects with illegal B1+SMS combo + project_id concentration + signal_type distribution + real leads processed last 24h	2026-05-29 06:32:36 +03:00
Дмитрий	ea7cc84a37	ci: stage5 day-1 investigation workflow — diagnose audit:verify-chains failures + failed_webhook_jobs 163k spike (one-shot read-only, hardcoded SQL on incidents_log/failed_jobs/failed_webhook_jobs + direct audit:verify-chains -v artisan call)	2026-05-29 06:24:30 +03:00
Дмитрий	5c02d33cce	feat(stage5): daily monitor workflow + remove non-existent partitions:list from artisan-run whitelist + checklist refinement (GitHub-cron 06:00 UTC daily 29.05-04.06 runs scheduler:check-heartbeats + incidents:watch-failures + migrate:status + 4 SQL signals from incidents_log/project_routing_snapshots/failed_webhook_jobs/scheduler_heartbeats; window auto-stops after 2026-06-05; result to job summary + artifact)	2026-05-29 05:42:30 +03:00
Дмитрий	89f124cd27	fix(artisan-run): pass command via base64 to avoid SSH shell-quote space loss (first dry-run showed 'supplier:rekey-orphansdry-run' — space eaten by printf %q + outer double-quote interaction; base64 encode locally + decode on prod side preserves spaces and special chars cleanly)	2026-05-29 05:13:14 +03:00
Дмитрий	7ec97230af	ci: add artisan-run workflow as ssh-bypass for prod artisan commands (whitelist of read-only/dry-run/inspection commands runs without confirm; mutating commands require confirm_apply=true input; output to job summary + artifact; works while dev IP 89.144.17.119 blocked by YC backbone filter)	2026-05-29 05:07:43 +03:00
Дмитрий	5e103ef5b5	ci(ssh-diagnose): add round 2 — show sshd_config.d/01-claude.conf, full nftables ruleset, ssh.service journal, fail2ban jail.d content, recidive jail check (round 1 showed dev IP not in fail2ban banlist, INPUT policy ACCEPT — narrowing to 01-claude.conf restriction or nftables f2b-table; recidive jail can persist bans beyond regular sshd bantime)	2026-05-29 04:47:10 +03:00
Дмитрий	35243de8ac	ci: add ssh-diagnose workflow to inspect prod sshd block (fail2ban/iptables/sshd_config/hosts.deny — diagnose why dev IP 89.144.17.119 cannot establish SSH banner with prod despite TCP/22 open; read-only workflow_dispatch with 12 queries to job summary)	2026-05-29 04:44:45 +03:00
Дмитрий	14c98c37c2	fix(ci/deploy): drop ON CONFLICT on migrations marker INSERT (table has no UNIQUE) Run 26566803068 created project_routing_snapshots successfully on prod (CREATE TABLE + partitions + RLS + GRANTs all committed). Marker INSERT into migrations table failed: "there is no unique or exclusion constraint matching the ON CONFLICT specification" because Laravel's migrations table has no UNIQUE on `migration` column. Replaced with INSERT...SELECT WHERE NOT EXISTS for idempotency. Table is now LIVE on prod — next workflow run will skip the CREATE block (TABLE_EXISTS check passes) and go straight to the now-fixed marker INSERT. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-28 12:38:52 +03:00
Дмитрий	54360d6f3b	fix(ci/deploy): pre-apply partitioned migrations via postgres superuser + e2e CWD fix Workflow run 26564909645 failed: migration 2026_05_27_120000_create_project_routing_snapshots_table hit 'SET ROLE crm_migrator' failure (pgsql conn = crm_app_user, not member of crm_migrator). Failed SET ROLE poisoned transaction → subsequent CREATE TABLE failed SQLSTATE[25P02]. Fix in deploy.yml: New step 'Pre-apply partitioned migrations via postgres superuser' runs CREATE TABLE + indexes + RLS + GRANTs + partitions + system_settings insert via sudo -u postgres psql, then marks migration as ran in migrations table. Idempotent (checks both migrations table AND information_schema). Established prod pattern (memory: paused_at migration 26.05). Side fix in tools/enforce-override-limit.test.mjs: CLI e2e tests used 'node tools/enforce-override-limit.mjs' without cwd, failed when vitest ran from app/. Added cwd: projectRoot via fileURLToPath(import.meta.url). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-28 12:33:47 +03:00
Дмитрий	81f92ca361	fix(ci/deploy): npm ci --legacy-peer-deps + Node 22 (deploy.yml v1.1) Workflow run 26564332893 failed at 14s — most likely npm ci hit Histoire/Vite peerDep conflict (quirk #74 in feedback_environment.md). --legacy-peer-deps mirrors local install pattern. Also bumped to Node 22 (Node 20 actions deprecated). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-28 11:45:23 +03:00
Дмитрий	7511f4e537	feat(ci): GitHub Actions deploy workflow for liderra.ru — fundamental fix for dev→prod SSH block Adds .github/workflows/deploy.yml — manual workflow_dispatch trigger that: 1) checkouts requested ref (default main) 2) builds frontend (npm ci + npm run build) 3) tarballs app + db excluding .env/storage/vendor/node_modules/bootstrap-cache 4) ssh-deploys via stored secret LIDERRA_SSH_KEY to ubuntu@111.88.246.137 5) extracts overlay + runs /var/www/liderra/redeploy.sh (composer + migrate + restart) 6) backfills today's snapshot (slepok-stage-2 Task 2.12 Step 3) 7) runs smoke tests (migrate:status, snapshots count, service health, portal http) Why this is needed: My dev VM (89.144.17.119) → prod VM (111.88.246.137) traffic passes TCP-handshake but app-layer banner exchange times out. Same VPC, SG 0.0.0.0/0, iptables empty, fail2ban clean — drop happens on YC backbone between specific source/dest pair. GitHub Actions runners come from Azure IPs, NOT affected by this filter. One-time setup needed: GitHub Settings → Secrets → Actions → New secret Name: LIDERRA_SSH_KEY Value: content of ~/.ssh/liderra_deploy (private key, full file) Future deploys: `gh workflow run deploy.yml -f ref=main` from anywhere. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-28 11:34:07 +03:00
Дмитрий	4382de3a79	feat(controller): C1 l1-watcher: settings.json ↔ Tooling drift detector Pure regex/JSON, 0 LLM calls. 4 Vitest tests GREEN. Per ADR-011 + spec §6.1. Smoke run surfaces REAL drift (DONE_WITH_CONCERNS; plan B5 said «that's a real signal, document, don't fix here»): 9 plugins in ~/.claude/settings.json enabledPlugins NOT formalized by exact «name@source» string in Tooling Appendix N: - frontend-design@claude-plugins-official (informally as #30 «Frontend Design plugin») - 8× ToB plugins @trailofbits (differential-review, audit-context-building, supply-chain-risk-auditor, insecure-defaults, sharp-edges, static-analysis, variant-analysis, agentic-actions-auditor) informally as #39 «Trail of Bits Skills» This is naming-vocabulary mismatch (Tooling uses human-readable names; settings.json uses machine names). Not architectural drift. Resolution options for follow-up: - Add machine names as «external_id» attribute to Tooling Appendix N rows. - Add tools/.l1-watcher-aliases.txt with accepted machine→human map. Until resolved: C1 will FAIL on lefthook (C5 wiring), addressed in C5 by adding alias mechanism OR temporarily downgrade to WARN. Also fixed CLI guard bug in observer-stop-hook.mjs (B3) and l1-watcher: old guard `import.meta.url === \`file://\${argv[1]}\`` did not match on Windows (file:/// triple-slash vs file:// double-slash + relative argv[1]). New guard: argv[1].endsWith('/<filename>.mjs'). Weekly GH Actions cron (Mon 09:00 MSK) opens issue on drift. Vitest config extended to ../tools/.test.mjs with exclude for ruflo- and subagent-prompt-prefix tests (pre-existing, not part of brain governance). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 06:31:18 +03:00
Дмитрий	0c36b7a28d	feat(a11y): migrate Pa11y scope from handoff prototypes to live Vue app Closes Audit #3 sole P1 (F-A11Y-PA11Y-SCOPE-01). Pa11y was scanning handoff HTML prototypes from liderra_v8_handoff/concepts/ (3 URLs, ~10 contrast violations), NOT the live Vue app. Audit #2 baseline "0 errors" was inaccurate — real portal was never covered. Changes: - pa11y.config.json: now targets http://localhost:8000/<route> for 7 guest pages (login, register, forgot, 2fa, recovery, 403, 500) - pa11y-handoff.config.json: preserves historical handoff baseline as opt-in (`npm run a11y:handoff`) - package.json: new `a11y:handoff` script; `a11y` repointed to live target - RecoveryCodesView.vue: scoped CSS override fixes Vuetify warning-tonal alert content contrast (2.03:1 → ≥4.5:1, color #0a0700 per Pa11y rec) - .github/workflows/a11y.yml: new CI job with dev-server lifecycle (php artisan serve + curl wait-on + Pa11y + screenshot artifact upload) - docs/audit-baseline-pa11y.md: first live baseline document with per-URL status, ignore selectors rationale, re-run instructions Local verification: - npm run a11y: 7/7 URLs passed (0 violations) - vue-tsc: 0 errors - ESLint: 0 errors - Vitest: 88 files / 683 passed / 3 skipped / 0 failed (no regressions) Plan: docs/superpowers/plans/2026-05-14-audit3-deferred-fixes.md Task 1. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-14 08:25:14 +03:00
Дмитрий	e5848bddec	feat(tooling): Trivy CI workflow prep, disabled until YC Docker (#26)	2026-05-10 08:45:17 +03:00
Дмитрий	53fb1ec27e	feat(ci): Semgrep SAST workflow, push/PR to main (#25)	2026-05-10 08:40:52 +03:00
Дмитрий	cc6e1cba72	fix(configs): Windows-fix format:sql:check + lychee/pa11y/composer/ESLint hygiene + npm-outdated CI (audit P0-03 + 5 P1/O) Closes audit 2026-05-09 (`b6ae8dd`): - P0-03: format:sql:check now uses db/.schema-formatted.tmp.sql instead of /tmp/ (Windows-compatible). + .gitignore: added db/.schema-formatted.tmp.sql. + additionally: web/*/.html removed from npm run links, because the static concepts in web/v8/.html use root-relative links to future Vue routes and lychee cannot resolve them without --root-dir; they are already in exclude_path. Lychee went from 20 → 1 error (the remaining 1 is a pre-existing broken link in docs/superpowers/specs/, out of scope). - P1-02: .lychee.toml excludes root-relative links for web/v8/.html: added a regex pattern for future routes (login/register/legal/dashboard/deals/admin/...). - P1-12: pa11y.config.json paths updated to liderra_v8_handoff/concepts/v8_*.html (login/dashboard/deals). - P1-07: composer audit-offline script (composer audit --locked) for offline mode. - O-refactor-05: ESLint no-restricted-imports forbids explicit import from 'vuetify/components'. - O-stack-08: .github/workflows/dependency-check.yml: weekly (Mon 09:00 UTC) npm outdated with auto-issue. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-09 18:31:17 +03:00
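
The sql-runner guard order described in f97103b05f and 002b8c4c35 (reject any semicolon first, then apply the read-only whitelist, then require confirm_mutating for everything else) can be sketched roughly as below. This is an illustration in Python, not the workflow's actual code: the `READ_RE` pattern and the `classify_sql` function are assumptions, and the per-table UPDATE/DELETE whitelist mentioned in the log is left out.

```python
import re

# Hypothetical read-only whitelist; the real READ_RE in sql-runner.yml
# is not shown in the commit log.
READ_RE = re.compile(r"^\s*(SELECT|WITH|EXPLAIN)\b", re.IGNORECASE)


def classify_sql(sql: str, confirm_mutating: bool = False) -> str:
    """Return 'read' or 'mutating'; raise ValueError if the SQL is rejected."""
    # Guard first: a semicolon may hide a second statement, so reject it
    # before the whitelist regex can match only the leading SELECT.
    if ";" in sql:
        raise ValueError("multi-statement SQL rejected")
    if READ_RE.match(sql):
        return "read"
    if confirm_mutating:
        return "mutating"
    raise ValueError("mutating SQL requires confirm_mutating=true")
```

With this ordering, `SELECT 1; UPDATE supplier_leads ...` is rejected outright instead of passing as a read. The sketch is deliberately strict: it also rejects a harmless trailing semicolon or a semicolon inside a string literal.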

33 Commits
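
The base64 hand-off from 89f124cd27 (encode the artisan command locally, decode it on the prod side so shell quoting cannot eat spaces) can be illustrated with the round trip below. It is a minimal sketch in Python under assumed names (`encode_command`, `decode_command`); the workflow itself presumably does this in shell, and the example command is assembled from the arguments listed in 1e5378da94.

```python
import base64


def encode_command(command: str) -> str:
    """Encode a command line as a single base64 token (no spaces or quotes)."""
    return base64.b64encode(command.encode("utf-8")).decode("ascii")


def decode_command(payload: str) -> str:
    """Recover the original command line on the receiving side."""
    return base64.b64decode(payload).decode("utf-8")


command = "audit:rebuild-chain --partition=activity_log_y2026_m05 --from-id=599"
payload = encode_command(command)

# The payload uses only the base64 alphabet (A-Z, a-z, 0-9, +, /, =), so it
# passes through nested shell quoting as one word.
restored = decode_command(payload)
```

Because the payload contains no whitespace or quote characters, it can be passed through SSH as one argument, and the spaces between options reappear only after decoding on the remote host.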