# Next Steps — Operator Runbook

**Last Updated:** 2026-03-26

**Purpose:** Single runbook of copy-paste commands for all remaining operator/LAN/creds steps. Use after automated steps are done.

**References:** [REMAINING_WORK_DETAILED_STEPS.md](REMAINING_WORK_DETAILED_STEPS.md), [WAVE2_WAVE3_OPERATOR_CHECKLIST.md](WAVE2_WAVE3_OPERATOR_CHECKLIST.md), [INFRA_DEPLOYMENT_LOCKED_AND_LOADED.md](../03-deployment/INFRA_DEPLOYMENT_LOCKED_AND_LOADED.md).

**Single fixes checklist (required + optional):** [FIXES_PREPARED.md](../04-configuration/FIXES_PREPARED.md).

**Full fixes (validators, block/tx, Sentries, RPCs, network, optional):** [FULL_FIXES_PREPARED.md](../04-configuration/FULL_FIXES_PREPARED.md).

**All next steps (consolidated):** [NEXT_STEPS_ALL.md](NEXT_STEPS_ALL.md).

**Dev/Codespaces (76.53.10.40):** [DEV_CODESPACES_NEXT_STEPS_CHECKLIST.md](../04-configuration/DEV_CODESPACES_NEXT_STEPS_CHECKLIST.md).

**Dev/Codespaces completion evidence:** [DEV_CODESPACES_COMPLETION_20260207.md](../04-configuration/verification-evidence/DEV_CODESPACES_COMPLETION_20260207.md).

---

## Completed in this session (2026-03-26)

| Item | Result |
|------|--------|
| NPMplus CT recovery | Port `81` on `192.168.11.167` accepted TCP but stalled at HTTP; `pct reboot 10233` on `r630-01` restored the expected `301`. |
| NPMplus proxy host update | `NPM_URL=https://192.168.11.167:81 bash scripts/nginx-proxy-manager/update-npmplus-proxy-hosts-api.sh` completed with **39 hosts updated, 0 failed**. |
| Sankofa routing | `sankofa.nexus`, `www.sankofa.nexus`, `phoenix.sankofa.nexus`, `www.phoenix.sankofa.nexus`, `studio.sankofa.nexus`, and `the-order.sankofa.nexus` now pass in the public E2E profile. |
| Public E2E verification | Latest run `bash scripts/verify/verify-end-to-end-routing.sh --profile=public` exited `0`; **Failed: 0**, **DNS passed: 37**, **HTTPS passed: 22**. DBIS, Mifos, and MIM4U public endpoints also passed. Evidence: `docs/04-configuration/verification-evidence/e2e-verification-20260326_115013/`. |
| Private E2E verification | Latest run `bash scripts/verify/verify-end-to-end-routing.sh --profile=private` exited `0`; **Failed: 0**, **DNS passed: 4**. Private HTTP and WS RPC endpoints all passed. Evidence: `docs/04-configuration/verification-evidence/e2e-verification-20260326_120939/`. |
| NPMplus backup | Fresh backup completed at `backups/npmplus/backup-20260326_115622.tar.gz`. |
| Blockscout verification | `./scripts/verify/run-contract-verification-with-proxy.sh` completed; contracts were submitted or skipped if already verified. |
| Private RPC redirect fix | `rpc-http-prv.d-bis.org` now responds with JSON-RPC `200` after updating the NPMplus host to stop forcing HTTPS redirects on POSTs. |
| `.env` handling | NPM-only runs should use targeted `NPM_EMAIL` / `NPM_PASSWORD` extraction when exporting the full `.env` causes `Argument list too long`. |

**Still from LAN:** no public or private E2E follow-up was needed in the latest runs; only re-run the maintenance section if those endpoints regress.

---

## Completed in this session (2026-02-20)

| Item | Result |
|------|--------|
| Completable tasks | `run-completable-tasks-from-anywhere.sh` — config validation OK, on-chain 45/45, run-all-validation --skip-genesis OK, reconcile-env --print. |
| Doc consolidation | NEXT_STEPS_INDEX, DOCUMENTATION_CONSOLIDATION_PLAN; batches and root cleanup recorded in [ARCHIVE_CANDIDATES.md](ARCHIVE_CANDIDATES.md) ("Last reviewed" set). |

## Completed in previous session (2026-02-19)

| Item | Result |
|------|--------|
| Completable tasks | `run-completable-tasks-from-anywhere.sh` — config, 46 on-chain, validation passed. |
| Operator script | `run-all-operator-tasks-from-lan.sh` — W0-1 skipped (off-LAN); Blockscout verify attempted (Blockscout unreachable). |
| RPC 2101 verify | `verify-rpc-2101-approve-and-sync.sh` — ✅ Chain 138, 19 peers, 5 validators, blocks advancing. |
| 502 script | `address-all-remaining-502s.sh` — backends 10130/10150/10151 OK; Besu 2101 restarted (finish from LAN for NPMplus). |
| Optional Phase 9 | Smart accounts kit (informational) — ran; next: deploy EntryPoint/AccountFactory/Paymaster. |
| E2E verification | `verify-end-to-end-routing.sh` with E2E_ACCEPT_502_INTERNAL=1 — run (report in verification-evidence). |

**Still from LAN:** NPMplus backup, Blockscout verification, full 502/NPMplus proxy update.

**Runbooks:** [OPERATOR_READY_CHECKLIST.md](OPERATOR_READY_CHECKLIST.md), [../04-configuration/NPMPLUS_QUICK_REF.md](../04-configuration/NPMPLUS_QUICK_REF.md), [../04-configuration/EXPLORER_LINKS_AND_ISSUES_DIAGNOSTIC.md](../04-configuration/EXPLORER_LINKS_AND_ISSUES_DIAGNOSTIC.md) (`scripts/verify/check-explorer-links.sh`).

---

## Completed in previous session (2026-02-06)

| Item | Result |
|------|--------|
| Validation | `run-all-validation.sh --skip-genesis` — passed |
| W1-1 dry-run | `setup-ssh-key-auth.sh --dry-run` — steps printed |
| W1-2 dry-run | `firewall-proxmox-8006.sh --dry-run` — UFW commands printed (ADMIN_CIDR=192.168.11.0/24) |
| NPMplus backup | `backup-npmplus.sh` — ran successfully (local + on host); backup pulled to `backups/npmplus/backup-20260206_171756.tar.gz` |
| Bridge dry-run | `run-send-cross-chain.sh 0.01 --dry-run` — simulated (real run when PRIVATE_KEY/LINK ready) |
| .env NPM | NPM_URL/NPM_HOST set to 192.168.11.167:81 (use .167 if .166 refuses) |
| **Copy to host** | Scripts copied to **root@192.168.11.11:/tmp/proxmox-scripts-run** (wave0, backup, secure-validator-keys, create-missing-containers, schedule cron scripts, daily-weekly-checks) |
| **Wave 0 on host** | Ran on r630-01: W0-1 (19 NPMplus proxy hosts updated), W0-3 (backup); backup also on host at `.../backups/npmplus/backup-20260206_171756.tar.gz` |
| **Backup pulled** | Host backup copied to local `backups/npmplus/backup-20260206_171756.tar.gz` |
| **Validator keys** | `secure-validator-keys.sh --dry-run` run on host — 1000–1002 would be secured; 1003–1004 not running, skipped. Use `--apply` on host when ready. |
| **Cron scripts on host** | schedule-npmplus-backup-cron.sh and schedule-daily-weekly-cron.sh (and daily-weekly-checks.sh) copied; use `--show` then `--install` from `/tmp/proxmox-scripts-run` if you want cron there (note: /tmp may be cleared on reboot; for permanent cron, clone repo to a persistent path on the host). |
| **Cron installed on host** | NPMplus backup cron (03:00) and daily/weekly cron (08:00 daily, Sun 09:00 weekly) installed on root@192.168.11.11. Logs: `/tmp/proxmox-scripts-run/logs/npmplus-backup.log`, `daily-weekly-checks.log`. |
| **Validator keys applied** | `secure-validator-keys.sh` run on host (no --dry-run): VMIDs 1000, 1001, 1002 secured (chmod 600/700, chown besu); 1003, 1004 not running, skipped. |

---

## Wave 0 — Gates

### W0-2: sendCrossChain (real)

**When:** PRIVATE_KEY and LINK (or fee token) approved in `.env`; you are ready to broadcast.

```bash
cd /path/to/proxmox
# Optional: dry-run first
bash scripts/bridge/run-send-cross-chain.sh 0.01 --dry-run
# Real (no --dry-run)
bash scripts/bridge/run-send-cross-chain.sh 0.01
# Or with recipient: bash scripts/bridge/run-send-cross-chain.sh 0.01 0xYourRecipientAddress
```

Bridge contract (reference): `0xcacfd227A040002e49e2e01626363071324f820a`. Ensure `CCIPWETH9_BRIDGE_CHAIN138` and `RPC_URL_138`/`CHAIN138_RPC` in `.env`.

### W0-3: NPMplus backup (re-run anytime)

Backup already ran once; re-run when NPMplus is up and you want a fresh backup:

```bash
cd /path/to/proxmox
bash scripts/verify/backup-npmplus.sh
```

From a host without NPM API access, use: `bash scripts/run-via-proxmox-ssh.sh wave0 --host 192.168.11.11` (r630-01) to run W0-1 + W0-3 on the host.

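
The 2026-03-26 notes recommend extracting only `NPM_EMAIL` / `NPM_PASSWORD` when exporting the full `.env` fails with `Argument list too long`. One way this might look is sketched below. The helper name `env_get` is hypothetical, and the sketch assumes `.env` holds simple single-line `KEY=value` entries (optionally quoted or prefixed with `export`) and that the NPM scripts read these two variables from the environment, as the `NPM_URL=... NPM_PASSWORD='...'` invocations elsewhere in this runbook suggest.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repo): print the value of one KEY from a
# dotenv-style file without exporting anything else from that file.
env_get() {
  local key="$1" file="${2:-.env}"
  # Keep the last matching assignment, drop the "KEY=" prefix,
  # then strip one pair of surrounding double or single quotes.
  grep -E "^(export[[:space:]]+)?${key}=" "$file" | tail -n 1 \
    | sed -E "s/^(export[[:space:]]+)?${key}=//" \
    | sed -E "s/^\"(.*)\"\$/\1/; s/^'(.*)'\$/\1/"
}

# Example use (commented out; assumes the script reads both variables from the environment):
# NPM_EMAIL="$(env_get NPM_EMAIL .env)" NPM_PASSWORD="$(env_get NPM_PASSWORD .env)" \
#   NPM_URL=https://192.168.11.167:81 \
#   bash scripts/nginx-proxy-manager/update-npmplus-proxy-hosts-api.sh
```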

---

## Crontab (install on jump host or Proxmox node)

```bash
cd /path/to/proxmox
# Show lines
bash scripts/maintenance/schedule-npmplus-backup-cron.sh --show
bash scripts/maintenance/schedule-daily-weekly-cron.sh --show
# Install
bash scripts/maintenance/schedule-npmplus-backup-cron.sh --install
bash scripts/maintenance/schedule-daily-weekly-cron.sh --install
```

---

## Wave 1 — Security (run on each Proxmox host or via SSH)

### W1-1: SSH key-based auth (disable password)

**Prerequisite:** Deploy SSH keys to all hosts (`ssh-copy-id root@<host>`); test login; have break-glass access.

```bash
cd /path/to/proxmox
# On each Proxmox host (or: ssh root@192.168.11.11 'cd /path/to/proxmox && bash scripts/security/setup-ssh-key-auth.sh --apply')
bash scripts/security/setup-ssh-key-auth.sh --apply
```

### W1-2: Firewall — restrict Proxmox API port 8006

**Prerequisite:** Run on host where UFW is used (or apply equivalent iptables). Default CIDR: 192.168.11.0/24.

```bash
cd /path/to/proxmox
# Dry-run (already done)
bash scripts/security/firewall-proxmox-8006.sh --dry-run
# Apply (allow only ADMIN_CIDR)
bash scripts/security/firewall-proxmox-8006.sh --apply
# Or with custom CIDR: bash scripts/security/firewall-proxmox-8006.sh --apply 192.168.11.0/24
```

Then verify that `https://<host>:8006` is reachable only from allowed IPs.

### W1-19: Secure validator keys (on Proxmox host as root)

```bash
cd /path/to/proxmox
bash scripts/secure-validator-keys.sh --dry-run   # review
bash scripts/secure-validator-keys.sh             # apply (chmod 600, chown besu)
```

---

## VMIDs 2506, 2507, 2508 — Destroyed 2026-02-08

Containers 2506, 2507, 2508 were **removed and destroyed** on all Proxmox hosts. Script: `scripts/destroy-vmids-2506-2508.sh`. Besu RPC range is **2500–2505** only. See [MISSING_CONTAINERS_LIST.md](../03-deployment/MISSING_CONTAINERS_LIST.md).

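
For the W1-2 verification step in the Wave 1 section, it can help to know beforehand whether the machine you are testing from falls inside `ADMIN_CIDR`. The sketch below is a pure-bash IPv4 membership check. The function names `ip_to_int` and `ip_in_cidr` are hypothetical and not part of the repo scripts, and the sketch assumes well-formed dotted-quad input without leading zeros. From an address that matches, a request to port 8006 should get an answer. From one that does not, it should time out or be refused.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not part of the repo) for reasoning about ADMIN_CIDR.

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Return 0 if the IP ($1) lies inside the CIDR ($2), e.g. 192.168.11.0/24.
ip_in_cidr() {
  local ip="$1" net="${2%/*}" bits="${2#*/}"
  # Build the netmask for the prefix length, clamped to 32 bits.
  local mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  (( ($(ip_to_int "$ip") & mask) == ($(ip_to_int "$net") & mask) ))
}

# Example use (commented out): decide what result to expect before probing port 8006.
# if ip_in_cidr 192.168.11.50 "${ADMIN_CIDR:-192.168.11.0/24}"; then
#   echo "expect port 8006 to answer from this address"
# else
#   echo "expect port 8006 to be blocked from this address"
# fi
```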

---

## Dev/Codespaces (76.53.10.40) — Full completion

**Single ordered checklist:** [04-configuration/DEV_CODESPACES_NEXT_STEPS_CHECKLIST.md](../04-configuration/DEV_CODESPACES_NEXT_STEPS_CHECKLIST.md) — Phases 1–7 (fourth NPMplus, dev VM, UDM port forward, Cloudflare tunnel, NPMplus proxy hosts, projects/dotenv, verification).

**Key commands (after fourth NPMplus and dev VM exist):**

| Step | Command |
|------|---------|
| Create fourth NPMplus LXC (10236 @ 192.168.11.170) | `bash scripts/npmplus/create-npmplus-fourth-container.sh` |
| Create dev VM (5700 @ 192.168.11.59) | `bash scripts/create-dev-vm-5700.sh` |
| Setup dev VM users + Gitea | `ssh root@192.168.11.11 "pct exec 5700 -- bash -s" < scripts/setup-dev-vm-users-and-gitea.sh` |
| Tunnel + DNS (set CLOUDFLARE_TUNNEL_ID_DEV_CODESPACES in .env first) | `bash scripts/cloudflare/configure-dev-codespaces-tunnel-and-dns.sh` |
| Fourth NPMplus proxy hosts | `NPM_URL=https://192.168.11.170:81 NPM_PASSWORD='...' bash scripts/nginx-proxy-manager/update-npmplus-fourth-proxy-hosts.sh` |

UDM Pro: add port forward 76.53.10.40 → 192.168.11.170 (80/81/443), optional 22 → 192.168.11.59. See [UDM_PRO_DEV_CODESPACES_PORT_FORWARD.md](../04-configuration/UDM_PRO_DEV_CODESPACES_PORT_FORWARD.md).

---

## Wave 2 & Wave 3 — Full checklist

Use the ordered checklist:

- **[WAVE2_WAVE3_OPERATOR_CHECKLIST.md](WAVE2_WAVE3_OPERATOR_CHECKLIST.md)** — W2-1 (monitoring) through W2-8 (NPMplus HA), then W3-1 (CCIP Fleet), W3-2 (Phase 4 isolation).


Summary:

| Wave | Tasks |
|------|--------|
| W2-1 | Monitoring stack (Prometheus, Grafana, Loki, Alertmanager) |
| W2-2 | Grafana via Cloudflare Access; alerts |
| W2-3 | VLAN enablement (UDM Pro, Proxmox bridge) |
| W2-4 | Phase 3 CCIP: Ops/Admin (5400–5401); NAT; scripts |
| W2-5 | Phase 4 sovereign tenant VLANs |
| W2-6 | ~~2506–2508~~ Destroyed 2026-02-08 (RPC 2500–2505 only) |
| W2-7 | DBIS services (10100–10151) |
| W2-8 | NPMplus HA (optional) |
| W3-1 | CCIP Fleet (commit/execute/RMN nodes) |
| W3-2 | Phase 4 tenant isolation enforcement |

---

## Explorer SSL (manual)

If **explorer.d-bis.org** shows "Your connection isn't private":

1. Open NPMplus: **https://192.168.11.167:81** (credentials: `NPM_EMAIL`, `NPM_PASSWORD` from `.env`).
2. SSL Certificates → Add Let's Encrypt for `explorer.d-bis.org` (DNS Challenge + Cloudflare credential if needed).
3. Proxy Hosts → explorer.d-bis.org → SSL tab → assign cert, Force SSL, Save.

See [EXPLORER_TROUBLESHOOTING.md](../04-configuration/EXPLORER_TROUBLESHOOTING.md).

---

## E2E 502s (when public domains return 502)

From **LAN** (SSH to Proxmox + reach NPMplus):

| Goal | Command |
|------|---------|
| Fix all 502 backends + NPMplus proxy + RPC diagnostics | `./scripts/maintenance/address-all-remaining-502s.sh` |
| Also Besu config fix + E2E at end | `./scripts/maintenance/address-all-remaining-502s.sh --run-besu-fix --e2e` |
| Re-run E2E only | `./scripts/verify/verify-end-to-end-routing.sh` |

**Runbook:** [502_DEEP_DIVE_ROOT_CAUSES_AND_FIXES.md](502_DEEP_DIVE_ROOT_CAUSES_AND_FIXES.md).

---

## Remaining (operator only)

- **W0-2** — sendCrossChain real (when PRIVATE_KEY/LINK ready).
- **W1-1 / W1-2** — SSH key auth and firewall 8006 `--apply` on each Proxmox host (after keys deployed / CIDR decided).
- **Cron** — ✅ Installed on root@192.168.11.11 (NPMplus 03:00; daily 08:00; weekly Sun 09:00). Re-install if you move repo to a permanent path.
- **Validator keys** — ✅ Applied on host for 1000–1002; 1003–1004 skipped (not running). Re-run when 1003/1004 are up if needed.
- **2506–2508** — Destroyed 2026-02-08; no action.
- **Wave 2 / 3** — Monitoring, VLAN, CCIP, NPMplus HA, Phase 4 per WAVE2_WAVE3_OPERATOR_CHECKLIST.
- **Explorer SSL** — Let's Encrypt for explorer.d-bis.org in NPMplus UI (see above). One-time (and after NPMplus restore if certs lost).
- **Explorer VM 5000 thin pool** — If thin1-r630-02 is >85% or full, migrate VMID 5000 to thin5 per [BLOCKSCOUT_FIX_RUNBOOK.md](../03-deployment/BLOCKSCOUT_FIX_RUNBOOK.md) § "Fix: Migrate VM 5000 to thin5". Weekly cron now checks thin pool (138a); act when it warns or fails.
- **NPMplus cert 134 (cross-all.defi-oracle.io)** — If verification reports "cert files missing" for cert ID 134: in NPMplus at https://192.168.11.167:81 → SSL Certificates → find cross-all.defi-oracle.io → re-save or request Let's Encrypt again to restore cert files on disk.
- **Dev/Codespaces (76.53.10.40)** — Complete all phases in [DEV_CODESPACES_NEXT_STEPS_CHECKLIST.md](../04-configuration/DEV_CODESPACES_NEXT_STEPS_CHECKLIST.md): fourth NPMplus (10236), dev VM (5700), UDM port forward, Cloudflare tunnel, NPMplus fourth proxy hosts, Let's Encrypt, rsync/dotenv, verification.

---

## After running "complete all next steps"

1. **Automated (workspace):** `bash scripts/run-all-next-steps.sh` — report in `docs/04-configuration/verification-evidence/NEXT_STEPS_RUN_*.md`.
2. **Validators + tx-pool:** `bash scripts/fix-all-validators-and-txpool.sh` (requires SSH to .10, .11).
3. **Flush stuck tx (if any):** `bash scripts/flush-stuck-tx-rpc-and-validators.sh --full` (clears RPC 2101 + validators 1000–1004).
4. **Verify from LAN:** From a host on 192.168.11.x run `bash scripts/monitoring/monitor-blockchain-health.sh` and `bash scripts/skip-stuck-transactions.sh`. See [NEXT_STEPS_COMPLETION_RUN_20260208.md](../04-configuration/verification-evidence/NEXT_STEPS_COMPLETION_RUN_20260208.md) § Verify from LAN.

---

## Quick command index

| Goal | Command |
|------|---------|
| **Run all automated next steps** | `bash scripts/run-all-next-steps.sh` (validation, E2E, explorer check, dry-runs; report in verification-evidence/NEXT_STEPS_RUN_*.md) |
| W0-2 real | `bash scripts/bridge/run-send-cross-chain.sh 0.01` |
| W0-3 backup | `bash scripts/verify/backup-npmplus.sh` |
| W0 from LAN | `bash scripts/run-wave0-from-lan.sh` |
| W1-1 apply | `bash scripts/security/setup-ssh-key-auth.sh --apply` (on each host) |
| W1-2 apply | `bash scripts/security/firewall-proxmox-8006.sh --apply` |
| NPMplus cron | `bash scripts/maintenance/schedule-npmplus-backup-cron.sh --install` |
| Daily/weekly cron | `bash scripts/maintenance/schedule-daily-weekly-cron.sh --install` |
| Validator keys | On Proxmox: `bash scripts/secure-validator-keys.sh` (after --dry-run) |
| Wave 0 via SSH | `bash scripts/run-via-proxmox-ssh.sh wave0 --host 192.168.11.11` |
| Request cert (via SSH) | `bash scripts/run-via-proxmox-ssh.sh request-cert --host 192.168.11.11` |
| Fourth NPMplus container | `bash scripts/npmplus/create-npmplus-fourth-container.sh` |
| Dev VM create | `bash scripts/create-dev-vm-5700.sh` |
| Dev/Codespaces tunnel+DNS | `bash scripts/cloudflare/configure-dev-codespaces-tunnel-and-dns.sh` (set CLOUDFLARE_TUNNEL_ID_DEV_CODESPACES in .env) |
| Fourth NPMplus proxy hosts | `NPM_URL=https://192.168.11.170:81 NPM_PASSWORD='...' bash scripts/nginx-proxy-manager/update-npmplus-fourth-proxy-hosts.sh` |
| **Address all 502s (LAN)** | `./scripts/maintenance/address-all-remaining-502s.sh` (use `--run-besu-fix --e2e` for full flow) |
| E2E routing (after NPMplus/DNS change) | `bash scripts/verify/verify-end-to-end-routing.sh` |
| Explorer E2E from LAN (after frontend/Blockscout deploy) | `bash explorer-monorepo/scripts/e2e-test-explorer.sh` |
| Blockscout migrations (version/config change) | On r630-02: `bash scripts/fix-blockscout-ssl-and-migrations.sh` — see [BLOCKSCOUT_FIX_RUNBOOK.md](../03-deployment/BLOCKSCOUT_FIX_RUNBOOK.md) |
| When decommissioning RPC used by explorer | Update Blockscout RPC URL on VM 5000; restart Blockscout — see [OPERATIONAL_RUNBOOKS.md](../03-deployment/OPERATIONAL_RUNBOOKS.md) § "When decommissioning or changing RPC nodes" |
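
The 2026-03-26 notes say to re-run the maintenance section only if the public or private E2E endpoints regress. A small wrapper for that check might look like the sketch below. It relies only on the `--profile=public` / `--profile=private` flags and the exit-code-`0` convention recorded in this runbook. The names `check_profiles` and `E2E_CMD` are hypothetical, and the follow-up command in the usage comment is a suggestion, not documented behaviour.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper (not part of the repo): run the E2E script once per
# profile and report which profiles exited non-zero.
# E2E_CMD is overridable so the wrapper can be exercised outside the repo.
E2E_CMD="${E2E_CMD:-bash scripts/verify/verify-end-to-end-routing.sh}"

check_profiles() {
  local failed=()
  local profile
  for profile in "$@"; do
    # $E2E_CMD is intentionally unquoted so "bash path/to/script" splits into words.
    if ! $E2E_CMD "--profile=${profile}" >/dev/null 2>&1; then
      failed+=("$profile")
    fi
  done
  # Print the failed profiles (space-separated); succeed only if none failed.
  echo "${failed[*]}"
  (( ${#failed[@]} == 0 ))
}

# Example use (commented out), run from the repo root on the LAN:
# if ! regressed="$(check_profiles public private)"; then
#   echo "E2E regressed for: ${regressed}"
#   ./scripts/maintenance/address-all-remaining-502s.sh --run-besu-fix --e2e
# fi
```

Keeping the check separate from the fix means the 502 maintenance script runs only when a profile actually fails.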