Kaizen Dashboard

Studio60 Continuous Improvement Agent — Last updated: 2026-03-16 ~12:30 CET (iter #206)

Iterations

206
#206: Libor's instructions processed: npm audit delegated to pulse + badwolf, Sim-026 HC fix → pulse. F-275 deploy pending (sentinel). Infra 11/11, relay 0.

Active Simulations

17
npm audit delegated (Libor approved) | Sim-026→pulse | Sim-022→sentinel | Sim-030 P2 waiting for Libor

Implemented

10
F-254 prune ✓ | F-242 pg_dump ✓ | F-236 19DB ✓ | F-219 alert ✓ | F-201 M2M ✓

Findings

275
F-275: pulse error_events missing on prod | F-274: sentinel stagnation | F-273: stale scan timing | F-266: deploy.sh NO qa-gate

F-256–F-259: DR Gap — 4 Repos Without Off-site Backup (iter #177)

Repo | Size | Remote | Pushed | Risk
sentinel | 11MB | GitHub ✓ | ✓ | RESOLVED: F-256 closed (iter #178)
kaizen | 5.5MB | NONE | - | 177 iterations of improvement history
/root/scripts | 288K | Git ✓ | ✓ (iter #187) | Has remote, committed
qa | 180K | NONE | - | 86+ tests, NO commits at all

Coverage: 7/12 repos fully backed to GitHub (58%)
Impact: If sentinel server dies, these 4 repos are permanently lost. Service repos (auth, billit, pulse...) survive.
Action: TODO sent to sentinel (push). Libor notified via fess.
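
The audit above comes down to two questions per repository: does it have a remote, and is everything committed and pushed? The following is a minimal sketch of that classification, assuming the output of `git remote` and `git status --porcelain=v1 --branch` has already been captured as text. The function name and the risk labels are hypothetical and are not part of the kaizen tooling.

```python
import re


def classify_repo(remote_output: str, status_output: str) -> str:
    """Classify off-site backup risk from captured git output.

    remote_output: stdout of `git remote`
    status_output: stdout of `git status --porcelain=v1 --branch`
    The returned labels are illustrative only.
    """
    lines = status_output.splitlines()
    header = lines[0] if lines else ""
    # Any non-header line is a modified, staged, or untracked path.
    dirty = any(line and not line.startswith("##") for line in lines)

    if not remote_output.strip():
        return "NO_REMOTE"
    if "No commits yet" in header:
        return "NO_COMMITS"
    if dirty:
        return "UNCOMMITTED"
    if "..." not in header:
        # The branch header has no "local...upstream" pair.
        return "NO_UPSTREAM"
    if re.search(r"\[ahead \d+", header):
        return "UNPUSHED"
    return "BACKED_UP"
```

Only the last label means the repository would survive the loss of the server; every other label is a reason to keep the finding open.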

Relay Queues — ALL CLEAR (iter #202) ✓

0 unread across all 20 queues (new: hive queue). Stable since iter #181.
Resolution: Combination of sentinel execution, agent triggers, and natural consumption cleared the backlog.
Previous alert: F-230 (stagnation), F-261 (acceleration) — both RESOLVED.

F-226: Dev Velocity & Commit Quality (iter #142) — 91 commits/7d

Repo | 7d Commits | 48h | Status
Pulse | 15 | 10 | VERY ACTIVE
Billit | 14 | 6 | VERY ACTIVE
BadWolf | 10 | 2 | ACTIVE
Auth | 9 | 0 | PAUSED
Mail | 5 | 1 | MODERATE
Sentinel | 3 | 0 | STAGNANT 5d
Venom | 2 | 0 | DORMANT (F-220)

Conventional commit rate: 96% (88/91) — +28pp from 68% baseline
Kaizen-attributed commits: 10 across pulse (3), billit (4), badwolf (2), sentinel (1)
Key pattern: Security/HC fixes get implemented fast. Housekeeping (npm audit, backup expansion) stalls.
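
The dashboard does not say how the conventional commit rate is measured. One plausible way is to match each commit subject against the Conventional Commits header shape, sketched below. The type list is an assumption (the specification itself only mandates `feat` and `fix`), and `conventional_rate` is a hypothetical helper.

```python
import re

# Commonly used types; an assumption, not a list taken from the Studio60 repos.
TYPES = ("feat", "fix", "docs", "style", "refactor", "perf",
         "test", "build", "ci", "chore", "revert")

# <type>[(scope)][!]: <description>
HEADER = re.compile(r"^(?:%s)(?:\([^()\s]+\))?!?: \S" % "|".join(TYPES))


def conventional_rate(subjects):
    """Return the percentage of commit subjects matching the header pattern."""
    if not subjects:
        return 0.0
    hits = sum(1 for subject in subjects if HEADER.match(subject))
    return 100.0 * hits / len(subjects)
```

Subjects could be collected with `git log --since=7.days --pretty=%s` in each repository and passed in as a list of strings.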

DR Gap Audit (iter #128) — 3 New Findings

F-217: kaizen repo has NO git remote (128 iterations; state/, reports/, simulations/ at risk of total loss)
F-218: /root/scripts/ NOT version controlled (9 critical ops scripts, including relay-to-telegram, deploy.sh, qa-gate.sh, agent-trigger.sh)
F-219: sentinel has uncommitted changes (8 files, 414 insertions; remote fetch status unknown)

Git remote coverage: 7/8 repos have remote (only kaizen missing). /root/scripts/ on 3 servers but no git.
F-169 billit-web HC IPv6: VERIFIED ✓ — healthy FS=0 after fix (3 commits by billit agent).

F-224: Agent Autonomy Gap — Only 1/8 Agents Run Autonomously (iter #140)

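Agent | run.sh | Cron | Unread | Status
kaizen | yes | */15 | - | AUTONOMOUS
sentinel | yes | none | 0 | HAS RUNNER, NO CRON
qa | yes | none | - | HAS RUNNER, NO CRON
venom | no | - | 5 | NO RUNNER
badwolf | no | - | 3 | NO RUNNER
mail | no | - | 2 | NO RUNNER
pulse | no | - | 2 | NO RUNNER
auth | no | - | 1 | NO RUNNER
billit | no | - | 1 | NO RUNNER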

Root cause of delegation failures (F-204, F-220), unread relay messages (14 total), and sentinel stagnation.
Fix: Add sentinel + qa run.sh to cron. Create run.sh for service agents. Sim-017 blocked on Libor.

Delegation Effectiveness Audit (iter #118) — 67% Success Rate

Item | Delegated To | Status
F-203 Sentinel git remote | sentinel | DONE ✓
F-174 Pulse CORS whitelist | pulse | DONE ✓
F-162 Billit API scopes | billit | DONE ✓
F-180 Pulse HC port fix | sentinel/pulse | DONE ✓
F-169 billit-web HC IPv6 | billit | DONE ✓
Sim-031 QA Pipeline | QA agent | DONE ✓
F-209 Venom npm audit | venom | NOT DONE
F-210 BadWolf npm audit | badwolf | NOT DONE
F-201 Pulse M2M scope | pulse | DONE ✓
F-204 /root/scripts backup | sentinel | NOT DONE
Sim-030 P1 .env.example | sentinel/PM | NOT DONE
Sim-022 Volume backup | sentinel | NOT DONE

Pattern: Security/HC fixes (high visibility, clear impact) → get done. Housekeeping improvements (npm audit, backup expansion) → stall. Items requiring Libor → blocked until direct contact.

NEW: Sim-031 QA Deploy Pipeline — ACCEPTED (iter #86)

3-tier deploy pipeline: smoke → standard → full test coverage per service.
Architecture: sentinel runs tests/run.sh directly (no cerebro roundtrip).
qa-gate.sh proposal: wraps sentinel test framework + npm test.
Implementation delegated to sentinel.
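
The tiered gate described above can be sketched as a loop that runs each tier in order and stops at the first failure. This is only an illustration of the idea, not the qa-gate.sh proposal itself: the `run_gate` name, the `up_to` parameter, and the way commands are passed in are all assumptions.

```python
import subprocess
import sys

# Tier order taken from the pipeline description: smoke -> standard -> full.
TIERS = ("smoke", "standard", "full")


def run_gate(commands, up_to="full"):
    """Run test tiers in order and stop at the first failing tier.

    commands: dict mapping tier name to an argv list (hypothetical commands).
    up_to: highest tier to run for this deploy.
    Returns (passed, results) where results is a list of (tier, returncode).
    """
    results = []
    for tier in TIERS[: TIERS.index(up_to) + 1]:
        proc = subprocess.run(commands[tier])
        results.append((tier, proc.returncode))
        if proc.returncode != 0:
            # A cheaper tier failed, so the more expensive tiers are skipped.
            return False, results
    return True, results


if __name__ == "__main__":
    # Placeholder commands that always succeed; real ones would call the test runner.
    ok = [sys.executable, "-c", "raise SystemExit(0)"]
    passed, results = run_gate({"smoke": ok, "standard": ok, "full": ok})
    print(passed, results)
```

Running the cheap smoke tier first keeps a broken build from spending time on the full suite, and a deploy script only has to check the boolean result.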

Sim-025: Agent Memory & Coordination — 5/8 Tracks DONE

COMPLETE Track A (Runtime) — Mode detection, mandatory block, feature flags, credential cleanup, GraphRAG, auto memory write (7/7)
PARTIAL Track C (deploy.sh) — deploy.sh + common.sh created. BUG: billit container names wrong (F-187).
COMPLETE Track F (GraphRAG/Memory) — knowledge_domains in 9 YAMLs, memory audit done (51 files, dedup identified), auto memory write verified.
COMPLETE Track H (Scaffolding) — create-project.sh exists (225 lines), templates in s60-tools, 19/19 agents consistent.

Status: 5/8 tracks complete, 1 partial (C), Track B exists differently than planned, 2 remaining.

NEW: Sim-030 Secret Management Audit (iter #64)

Complete secret inventory across 11 services, 5 servers, ~40 credentials mapped.

CRITICAL F-189: changeme123 hardcoded in 4+ locations (Redis compose, CLAUDE.md files, Neo4j)
HIGH F-190: Zero secret rotation ever — no policy, no tooling, no audit log
HIGH F-191: Payment credentials (Stripe/GoPay) in plaintext .env, no extra protection
HIGH F-194: Backup secrets not encrypted at rest (rsync plaintext to argus/Hetzner)
MEDIUM F-192: 5 .env.example files on prod-alfa — may reveal structure
MEDIUM F-193: billit-redis runs without password (network-isolated, low risk)

Plan: Phase 1 (CLAUDE.md cleanup) delegated to PM + sentinel. Phase 2 (password rotation) requires Libor approval. Phase 3 (SOPS/age encryption) future.
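
For the Phase 1 cleanup, a small scanner can list where a known weak value such as `changeme123` still appears. The sketch below assumes the value lives in compose files, Markdown files, or `.env` files. The function names and the file selection rule are hypothetical, and the scanner reports only locations, never the matching line.

```python
from pathlib import Path

# Known weak value from F-189; extend the tuple as more are identified.
WEAK_VALUES = ("changeme123",)
TEXT_SUFFIXES = {".yml", ".yaml", ".md"}


def scan_text(text, weak_values=WEAK_VALUES):
    """Return 1-based line numbers that contain any known weak value."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if any(value in line for value in weak_values)
    ]


def scan_tree(root):
    """Yield (path, line_number) for compose, Markdown, and .env files under root."""
    for path in sorted(Path(root).rglob("*")):
        is_env = path.name.startswith(".env")
        if not path.is_file() or not (is_env or path.suffix in TEXT_SUFFIXES):
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for number in scan_text(text):
            yield str(path), number
```

An empty result from `scan_tree` on each server would be a reasonable exit criterion for Phase 1, before any rotation in Phase 2.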

NEW: Sim-029 Claude Code Feature Audit (iter #65)

Claude Code v2.1.76 feature utilization: 8% (3/39 flags). 0 plugins, 0 hooks, 2 broken MCP servers.

F-195: 66 stale session files (40MB) in kaizen project, no cleanup mechanism
F-196: 2 broken MCP servers (Google Calendar/Gmail), never authenticated, useless
F-197: 0/43 plugins installed despite code-review, hookify, security-guidance being available
F-198: 0 hooks configured, so no safety guardrails on autonomous agents
F-199: Feature utilization 8%; missing --max-budget-usd, --effort, --fallback-model, --name

SELF-IMPROVED run.sh updated: --no-session-persistence, --name, --max-budget-usd 0.75, --fallback-model sonnet
Next: Plugin install (hookify, security-guidance) + MCP cleanup — propose to Libor.

DEV SURGE: 27 Commits/24h — 2 Critical Findings RESOLVED (F-170)

Pulse: 10 commits (regression tests, error tracking, BullMQ queues, bug reporting)
Billit: 9 commits (API key scope enforcement ✓, product catalog, admin, PDF cache)
BadWolf: 7 commits (DB migration 020 ✓, BillitSync fix, RelayService)
Mail: 1 commit (CLAUDE.md update)

RESOLVED F-162: Billit API key scopes — ApiKeyScopeGuard + default-deny deployed
RESOLVED F-168: BadWolf missing tables — Migration 020 created courses, locations, online_courses, companies
PARTIAL F-163: BillitSync — Staging URL hardcode removed, service created. Data backfill still needed.

RESOLVED: billit-web HC IPv6 ✓ (F-169)

Fixed: nginx IPv6 enabled, FS=0. Healthcheck passing permanently after restarts.

CORRECTED: F-178 Was FALSE POSITIVE — Tier Data OK ✓

Iter #57 verification: tier column is varchar(20), NOT enum. Data stores hexa (3), full (3), null (2) correctly.
The pg_enum values (full_pack, monthly) are from an old unused type, not a constraint.
Billing is NOT broken. 0 runtime errors confirmed.

RESOLVED: F-155 billitInvoiceId Column EXISTS ✓

Iter #57 verification: billitInvoiceId column exists in credit_transactions table.
Migration 001_add_billit_invoice_id.sql (commit 278861e) was applied. 0 billing webhook errors.
NEW F-182: Pulse has no automated migration runner — only manual SQL files in migrations/. Future schema changes need manual SQL.

RESOLVED: Billit-Redis Network Fix ✓ (F-171, Sim-024 IMPLEMENTED)

Iter #57 verification: billit-redis now on BOTH networks (s60-network + billit_billit-internal).
Errors: 802/h → ~3/h (99.6% reduction). 20 errors in 6h vs 4,800+ previously.
billit-api on s60-network, can now reach billit-redis via DNS. BullMQ + caching operational.
Remaining: Permanent fix needed in docker-compose.yml (current fix = manual network connect, may reset on restart).
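
Because the current fix is a manual network connect, it is worth re-checking after every restart. The sketch below reads the JSON printed by `docker inspect <container>` and reports which of the two required networks are missing. The network names come from the verification above; the function name is hypothetical.

```python
import json

# Networks billit-redis must be attached to, per the iter #57 verification.
REQUIRED = ("s60-network", "billit_billit-internal")


def missing_networks(inspect_json, required=REQUIRED):
    """Return the required networks a container is not attached to.

    inspect_json: stdout of `docker inspect <container>` (a JSON array).
    """
    container = json.loads(inspect_json)[0]
    attached = set(container["NetworkSettings"]["Networks"] or {})
    return sorted(set(required) - attached)
```

A non-empty result after a restart would indicate that the manual connect was lost and that the docker-compose.yml fix is still outstanding.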

BadWolf Location.deleted_at Column Missing (F-172)

Migration 020 created locations table WITHOUT deleted_at column. Entity expects it.
4 errors: column Location.deleted_at does not exist. Soft-delete broken.
FIX: ALTER TABLE locations ADD COLUMN IF NOT EXISTS deleted_at TIMESTAMP;

MILESTONE: HC 100% — All Services Healthy (FS=0) ✓

Iter #80: Pulse Dockerfile HC port 3200→3100 fixed (F-180 RESOLVED). billit-web IPv6 resolved (F-169).
Iter #81: Permanence confirmed — after night restarts (Pulse 03:36, billit 03:25 UTC), ALL containers healthy with FS=0.
Score: 9/9 services with healthchecks = 100% healthy. Only n8n + billit-redis without HC (by design).
Duration: 16 iterations of escalation to achieve. Sim-026 IMPLEMENTED ✓
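
The FS=0 figure refers to the `FailingStreak` field that Docker keeps under `State.Health` for containers with a healthcheck. As a rough sketch, the score above could be recomputed from the JSON printed by `docker inspect` for all containers. The function names are hypothetical, and containers without a healthcheck are reported separately rather than counted as failures.

```python
import json


def health_summary(inspect_json):
    """Map container name to (status, failing_streak).

    inspect_json: stdout of `docker inspect c1 c2 ...` (a JSON array).
    Containers without a healthcheck have no State.Health entry.
    """
    summary = {}
    for container in json.loads(inspect_json):
        name = container["Name"].lstrip("/")
        health = container["State"].get("Health")
        if health is None:
            summary[name] = ("no-healthcheck", None)
        else:
            summary[name] = (health["Status"], health["FailingStreak"])
    return summary


def all_healthy(summary):
    """True when every container that has a healthcheck is healthy with FS=0."""
    checked = [entry for entry in summary.values() if entry[0] != "no-healthcheck"]
    return all(status == "healthy" and streak == 0 for status, streak in checked)
```

Running this after the nightly restarts would give the same permanence check that iter #81 performed by hand.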

SSH Password Auth DISABLED ✓ (F-176, Sim-023 IMPLEMENTED)

PasswordAuthentication no on BOTH servers (prod-alfa + hub-alfa).
fail2ban remains ACTIVE — 0 currently banned, 369 total (brute-force now blocked at SSH level).
Combined with UFW + fail2ban = SSH fully hardened.
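
To confirm the setting stays in place, the effective configuration printed by `sshd -T` can be parsed instead of reading sshd_config directly, since includes and match blocks can override the file. A minimal sketch, with a hypothetical function name, that treats a missing keyword as not verified:

```python
def password_auth_disabled(sshd_t_output):
    """Return True only if the effective config says PasswordAuthentication no.

    sshd_t_output: stdout of `sshd -T` (one "keyword value" pair per line).
    """
    for line in sshd_t_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0].lower() == "passwordauthentication":
            return parts[1].strip().lower() == "no"
    # Keyword absent: do not assume the hardening is in place.
    return False
```

The check would need to run on both prod-alfa and hub-alfa, since the setting is per server.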

BadWolf BillitSync: Data Backfill Still Needed (F-163)

BillitSync service created, staging URL hardcode removed. But:
• All online_courses have company_id = NULL → sync skips 100%
• Prices not synced (unitPrice: null)
Status: Awaiting data backfill (needs Libor confirmation for company_id mapping).

Relay Queues: sentinel 6 unread — Delegation Bottleneck

19 queues total. Sentinel: 6 unread (↑↑ from 3). Minor: badwolf=2, venom=1, auth=1, mail=2.
Total messages processed: pm 316 | billit 405 | infra 197 | sentinel 193 | auth 163 | main 126 | pulse 89 | badwolf 80.
Pattern: sentinel reads kaizen messages but doesn't act on housekeeping (F-213 PG fix, F-204 scripts backup). Security/HC fixes get done, improvement tasks stall.

Security Hardening Progress

FIXED UFW Firewall ACTIVE on both servers (F-130)
FIXED Docker ports bound to 127.0.0.1 (F-132)
FIXED s60-redis has volume + AOF (F-131)
FIXED Nginx attack paths blocked — .php/.env/.git/wp-* → 403 (F-134)
FIXED Billit API key scopes enforced — default-deny guard (F-162) ✓
FIXED fail2ban active on both servers (F-166)

FIXED SSH password auth disabled — both servers (F-176, Sim-023) ✓
Implementation rate: ~43% → ~46% (↑ Sim-025 Phase 1a progress)

PG Backup Cron RESOLVED ✓ (F-160 CLOSED)

12/12 databases backed up successfully at 3:00 AM Mar 14.
P0 closed after 7 escalations. DO Managed PG remains as safety net.

REMAINING QUICK FIXES (for Libor/Sentinel)

1. Pulse Dockerfile HC port — RESOLVED (Sim-026, FS=0 permanent)
2. Pulse CORS whitelist — RESOLVED (commit 3321ed0, allowedOrigins whitelist deployed, evil.com rejected)
3. N8n stop (10 sec): docker stop s60-n8n — free 294MB RAM
4. billit-web HC fix — RESOLVED (nginx IPv6, FS=0 permanent)
5. Billit-Redis compose fix: Add billit-redis back to docker-compose.yml on s60-network (currently manual network connect)
6. Password rotation: changeme123 on s60-redis + neo4j
7. Pulse tier DB — FALSE POSITIVE (varchar, not enum) — data correct
8. Pulse billitInvoiceId — RESOLVED (column exists, migration 001 applied)
9. Billit-Redis network — RESOLVED (both networks connected, ~3 err/h)
10. SSH password disable — DONE on both servers (Sim-023)

RESOLVED Issues (cumulative)

Billit API key scopes — ApiKeyScopeGuard deployed, default-deny (F-162) ✓
BadWolf missing tables — Migration 020 deployed (F-168) ✓
NO FIREWALL — UFW ACTIVE, deny default (F-130, Sim-020) ✓
Nginx returns 200 for attacks — security-deny.conf on all sites (F-134, Sim-021) ✓
Docker ports 0.0.0.0 — All bound to 127.0.0.1 (F-132) ✓
s60-redis NO VOLUME — Named volume + AOF enabled (F-131, Sim-015 partial) ✓
Auth frontend/backend unhealthy — Both FailingStreak=0 (F-087, F-093) ✓
Docker log rotation missing — daemon.json deployed, 10m/3 files (F-151) ✓
Fess→Telegram missing — Sim-019 IMPLEMENTED, bidirectional bridge ✓
PG backup cron broken — 12/12 OK @ 3:00 Mar 14 (F-160) ✓
Pulse HC port mismatch — 3200→3100, FailingStreak 763→0 (F-173, still regresses F-180) ✓
SSH password auth enabled — Disabled on both servers (F-176, Sim-023) ✓
Pulse tier DB mismatch — FALSE POSITIVE: varchar(20), data correct (F-178 corrected iter #57) ✓
Pulse billitInvoiceId — Column exists, migration 001 applied (F-155 corrected iter #57) ✓
Billit-Redis network — Both networks connected, ~3 err/h (F-171, Sim-024 impl) ✓
Pulse CORS origin:true — Whitelist deployed, evil.com rejected (F-174) ✓

REMAINING SECURITY ALERTS

1. Billit API key scopes: RESOLVED
2. SSH password auth: RESOLVED, disabled on both servers ✓
3. serialize-javascript RCE — Remote code execution in s60-mail, badwolf (F-117)
4. Adminer publicly exposed on merlin — DB login on 6+ subdomains (F-101)
5. Neo4j default password — changeme123, 3,857 nodes (F-111)
6. s60-redis weak password — changeme123, 5 services (F-099)
7. 105 npm vulnerabilities (44 HIGH) — NestJS v10 (F-117)
8. Pulse CORS origin:true: RESOLVED, whitelist deployed (F-174) ✓

DNS Map

Server | Public IP | Subdomains | Status
WordPress hosting | 46.234.126.134 | studio60.cz, www (2) | OK
prod-alfa | 178.104.36.211 | auth, mail, badwolf, venom, n8n, api (6) | UFW + Nginx deny ✓
merlin (OLD) | 37.205.13.114 | pulse, billit, relay, grafana, docs, admin, s60, be (8) | STALE: Adminer!
sentinel | 49.13.168.234 | sentinel, kaizen (2) | OK
hub-alfa | 164.90.182.148 | hub (1) | UFW + Nginx deny ✓
cerebro | 178.63.52.57 | cerebro (1) | OK
argus | 37.205.14.239 | argus (1) | OK

Service Dependency Map

Shared Resource | Consumers | Risk
DO PostgreSQL | auth, pulse, mail, badwolf, billit, n8n (6) | SPOF: Failure = total outage
s60-redis | auth, pulse, mail, badwolf, n8n (5) | IMPROVED: Volume + AOF ✓ (pwd weak)
auth-backend (OIDC) | pulse, billit (2) | SPOF: Login fails if auth down
billit-redis | billit-api (1) | ISOLATED: Well configured

Resource Usage (iter #57)

Container | RAM | Uptime | Status
s60-n8n | 294MB | 2d | WASTE: 0 workflows; Sim-027 ACCEPTED (Libor keeping for make.com migration)
billit-api | ~81MB | 34m | HEALTHY, Redis 0 err/h
s60-auth-backend | ~62MB | 2d | HEALTHY
s60-badwolf | ~61MB | 15h | HEALTHY, 0 errors
s60-mail | ~52MB | 41h | HEALTHY
s60-pulse | ~45MB | 2h | HEALTHY, FS=0 ✓
s60-redis | ~9MB | 2d | HEALTHY
billit-redis | ~6MB | 2d | REACHABLE, both networks ✓
billit-web | ~5MB | 34m | HEALTHY, FS=0 ✓
s60-venom | ~5MB | 2d | HEALTHY
s60-auth-frontend | ~5MB | 2d | HEALTHY
Total | ~625MB / 7.6GB (8%) | - | HC: 9/9 healthy (100%), all 11 running

Disk: 24G/150G (17%) | RAM: 1.9G/7.7G (24%)

Implementation Rate

Simulation | Status | Progress
Sim-021 Nginx Attack Surface | IMPLEMENTED ✓ | -
Sim-020 Firewall Hardening | IMPLEMENTED ✓ | -
Sim-019 Relay-to-Telegram | IMPLEMENTED ✓ | -
Sim-005 Service Availability | IMPLEMENTED ✓ | -
Sim-015 Redis Hardening | PARTIAL ✓ | -
Sim-023 SSH Password Disable | IMPLEMENTED ✓ | -
Sim-025 Agent Memory & Coordination | Phase 1a IN PROGRESS | Track A DONE (7/7), Track C PARTIAL (deploy.sh has bugs)
Sim-026 Dockerfile HC Fix | IMPLEMENTED ✓ | -
Sim-027 N8n Removal | ACCEPTED | 294MB RAM freed. docker compose down + nginx cleanup.
Sim-029 Claude Code Feature Audit | ACCEPTED + PHASE 1 DONE | run.sh improved. Plugins + MCP cleanup pending Libor.
Sim-030 Secret Management | ACCEPTED | 3 phases: CLAUDE.md cleanup (delegated), password rotation (needs Libor), encryption at rest (future).
Sim-022 Volume Backup Expansion | ACCEPTED | 2-line fix. 3/5 → 5/5 coverage.
Sim-014 Docker Compose Std | IN PROGRESS | -
Sim-001 Deploy Manifest | PARTIAL | -
Sim-017 Sentinel Cycle | PENDING LIBOR | Cron 6h
Sim-016 Merlin DNS | NOT IMPL. | 8 stale records
5 more simulations | BLOCKED | Awaiting Libor

Questions for Libor (prioritized)

  1. Pulse CORS whitelist: RESOLVED ✓ whitelist deployed (commit 3321ed0)
  2. Pulse M2M API key scopes — any valid key gets admin role without scope check (F-201).
  3. Neo4j + Redis passwords: changeme123 in production (Sim-030 Phase 2).
  4. Sentinel autonomous cycle: Sim-017, delegation does not work without it.
  5. npm audit fix — 105 vulns (44 HIGH), serialize-javascript RCE (F-117).
  6. Billit-Redis compose — make network connect permanent in docker-compose.yml.
  7. Sentinel git remote — 11MB local-only repo, no remote (F-203).