# Public sector live deployment checklist (Complete Credential, SMOA, Phoenix)
**Last updated:** 2026-03-26

**Related:** [PUBLIC_SECTOR_TENANCY_MARKETPLACE_AND_DEPLOYMENT_BASELINE.md](../02-architecture/PUBLIC_SECTOR_TENANCY_MARKETPLACE_AND_DEPLOYMENT_BASELINE.md), [COMPLETE_CREDENTIAL_EIDAS_PROGRAM_REPOS.md](../11-references/COMPLETE_CREDENTIAL_EIDAS_PROGRAM_REPOS.md), [DEPLOY_CONFIRM_AND_FULL_E2E_RUNBOOK.md](../00-meta/DEPLOY_CONFIRM_AND_FULL_E2E_RUNBOOK.md), [`config/public-sector-program-manifest.json`](../../config/public-sector-program-manifest.json)

This checklist tracks **proxmox-repo automation** and **sibling repos** (`../complete-credential`, `../smoa`). Rows marked **Done (session)** were executed from an operator host with LAN access unless noted.

---
## Execution log (2026-03-23)
| Action | Result |
|--------|--------|
| Sankofa `api` + `portal` (workstation) | API: `websocket.ts` imports `logger`; GraphQL/schema fixes under `api/src`. Portal: Apollo + dashboard GraphQL, UI primitives, **`root: true`** `.eslintrc.json` with **`@typescript-eslint` + strict `no-explicit-any` / `no-console` / a11y / `import/order`** (optional hardening: lib clients typed with `unknown`, form `htmlFor`/`id`, escaped entities). **`pnpm exec tsc --noEmit`** + **`pnpm build`** clean. **Deploy:** sync `portal/` (+ lockfile) to CT **7801**, `pnpm install && pnpm build`, restart `sankofa-portal`; sync **7800** API if needed |
## Execution log (2026-03-26)
| Action | Result |
|--------|--------|
| `./scripts/run-all-operator-tasks-from-lan.sh` (live, no `--dry-run`) | Exit 0 (~36 min); W0-1 NPMplus RPC/proxy host updates; W0-3 live NPMplus backup; Blockscout verification step ran |
| NPMplus update script | Some hosts logged duplicate-create then PUT recovery; `rpc.tw-core.d-bis.org` and `*.tw-core.d-bis.org` showed repeated failures — **review those rows in NPM UI** if traffic depends on them |
| `scripts/maintenance/diagnose-vm-health-via-proxmox-ssh.sh` | Completed: Phoenix CTs **7800–7803** running on r630-01; NPMplus **10233** up; port 81 check OK |
| `scripts/maintenance/npmplus-verify-port81.sh` | **Restored** in repo; loopback :81 returns HTTP 301 (redirect) — treated as reachable |
| `pct exec 7800` / `7801`: `ss -tlnp` | **As of 2026-03-26 session:** no listeners. **As of 2026-03-23 follow-up:** **7800** API can reach `active` + `/health` on **:4000** when `sankofa-api` is deployed; **7801** portal needs **current** portal tree + successful **`pnpm build`** on the CT (see 2026-03-23 log row above) |
| `pct exec 7802` Keycloak | `http://127.0.0.1:8080/` → **200**; `/health/ready` → 404 (version may use different health path) |
| `./scripts/run-completable-tasks-from-anywhere.sh` | Exit 0 |
| `E2E_ACCEPT_502_INTERNAL=1 ./scripts/verify/verify-end-to-end-routing.sh` | 0 failed; report `docs/04-configuration/verification-evidence/e2e-verification-20260325_182512/` |
| `./scripts/verify/run-contract-verification-with-proxy.sh` | Exit 0 |
| `complete-credential` Phase 1 compose + `run-phase1-synthetic.sh` | OK (operator console 8087 = 200) |
| `../smoa`: `./gradlew :app:assembleDebug` | BUILD SUCCESSFUL; APK: `smoa/app/build/outputs/apk/debug/app-debug.apk` |
| `scripts/deployment/sync-sankofa-portal-7801.sh` + NPM alignment | Portal tree synced to CT **7801**, `pnpm install` + `pnpm build`, `sankofa-portal` **active** (`*:3000`). NPM proxy IDs **3–6**: `sankofa.nexus` / `www` → **192.168.11.51:3000**; `phoenix.sankofa.nexus` / `www` → **192.168.11.50:4000**. Repeatable deploy: `./scripts/deployment/sync-sankofa-portal-7801.sh` (`--dry-run` first). |
| `validate-config-files.sh` / `run-completable-tasks-from-anywhere.sh` | Exit 0 |
---
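
The 2026-03-26 log above treats an HTTP 301 from the NPMplus admin port as reachable, and reads the Keycloak `/health/ready` 404 as a path issue rather than an outage. A minimal sketch of that triage rule follows. `classify_http_status` is a hypothetical helper and the verdict names are assumptions; this is not the logic of `npmplus-verify-port81.sh`.

```bash
# Hypothetical helper: map an HTTP status code to a coarse reachability verdict.
# The verdict names are illustrative assumptions, not values used by repo scripts.
classify_http_status() {
  case "$1" in
    2[0-9][0-9]|3[0-9][0-9]) echo "reachable" ;;             # e.g. 200, or 301 on loopback :81
    ''|000)                  echo "no-response" ;;           # curl prints 000 when nothing answered
    [45][0-9][0-9])          echo "responding-with-error" ;; # e.g. 404 on /health/ready
    *)                       echo "unknown" ;;
  esac
}
```

Under this rule, `classify_http_status 301` prints `reachable` and `classify_http_status 404` prints `responding-with-error`, which separates "the service answers but the path is wrong" from "nothing is listening".
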
## Execution log (2026-03-25)
| Action | Result |
|--------|--------|
| RPC `192.168.11.221:8545` / `192.168.11.211:8545` | HTTP 201 |
| SSH `root@192.168.11.10` / `.11` | OK (BatchMode) |
| `./scripts/run-completable-tasks-from-anywhere.sh` | Exit 0 |
| `./scripts/verify/check-contracts-on-chain-138.sh` | **64/64** present |
| `E2E_ACCEPT_502_INTERNAL=1 ./scripts/verify/verify-end-to-end-routing.sh` | 37 domains, 0 failed; report under `docs/04-configuration/verification-evidence/e2e-verification-20260325_165153/` |
| `https://phoenix.sankofa.nexus/`, `https://sankofa.nexus/` | HTTP 200 |
| `http://192.168.11.50:4000/health`, `http://192.168.11.51:3000`, `http://192.168.11.52:8080/health/ready` | No HTTP response from operator host (hosts respond to ping; services may be down, firewalled, or not bound); **re-check on Proxmox / in-container** |
| `./scripts/verify/backup-npmplus.sh --dry-run` | OK |
| `./scripts/verify/run-contract-verification-with-proxy.sh` | Exit 0 |
| `./scripts/run-all-operator-tasks-from-lan.sh --dry-run` | Printed wave0 + verify sequence |
| `cd smom-dbis-138 && forge test --match-path 'test/e2e/*.sol'` | Exit 0 |
| `cd ../smoa && ./gradlew smoaVerify --no-daemon` | Exit 0 |
| `complete-credential`: `git submodule status` | Submodules present on commits |
| `docker compose -f integration/docker-compose.phase1.yml config` | Valid |
| `docker compose -f integration/docker-compose.phase1.yml up -d` | All Phase 1 containers up |
| Rebuild + recreate `cc-operator-console`; `./integration/run-phase1-synthetic.sh` | OK |
---
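
Where the log above records hosts that ping but return no HTTP response, the suggested next step is an in-container check. The sketch below only prints the `pct exec <vmid> -- ss -tlnp` commands (the same check recorded in the 2026-03-26 log) so an operator can review them before running anything. `listener_check_cmd` and `print_phoenix_listener_checks` are hypothetical names, and the Proxmox node address is an assumption the caller must supply.

```bash
# Hypothetical helper: print the command that lists TCP listeners inside one CT.
# $1 = Proxmox node reachable over SSH (assumption: supplied by the operator)
# $2 = container VMID
listener_check_cmd() {
  printf 'ssh -o BatchMode=yes root@%s pct exec %s -- ss -tlnp\n' "$1" "$2"
}

# Print one command per Phoenix CT named in this checklist
# (7800 API, 7801 portal, 7802 Keycloak). Nothing is executed here.
print_phoenix_listener_checks() {
  for vmid in 7800 7801 7802; do
    listener_check_cmd "$1" "$vmid"
  done
}
```

Calling `print_phoenix_listener_checks <pve-node>` emits three lines that can be pasted into a shell on the operator host once the node hosting the CTs is confirmed.
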
## Checklist
| ID | Task | Status |
|----|------|--------|
| A1 | LAN / VPN; Proxmox SSH | Done (session) |
| A2 | Root `.env` + `smom-dbis-138/.env` for operator | Operator to confirm secrets present |
| A3 | `config/public-sector-program-manifest.json` valid | Done (completable) |
| B1 | NPMplus proxy + TLS for public FQDNs | **Done (2026-03-26):** `run-wave0-from-lan.sh` / update script applied; spot-check `rpc.tw-core` / `*.tw-core` in NPM if needed |
| B2 | `scripts/verify/backup-npmplus.sh` (live) | **Done (2026-03-26)** — W0-3 as part of `run-all-operator-tasks-from-lan.sh` |
| B3 | `scripts/maintenance/npmplus-verify-port81.sh` | **Done** — script restored; SSH `pct exec 10233` loopback :81 |
| C1 | Phoenix stack VMIDs 7800–7803 per `SERVICE_DESCRIPTIONS.md` | **7802 Keycloak:** HTTP 200 on `/` inside CT. **7800 API:** listener **:4000** (`/health` OK). **7801 portal:** `sankofa-portal` active, Next on **:3000** (sync via `scripts/deployment/sync-sankofa-portal-7801.sh`) |
| C2 | Keycloak realms: admin / tenant / org-unit RBAC | Product + IdP work — not automated here |
| C3 | Phoenix API + portal wired; GraphQL `/graphql`, `/health` | **API:** `curl -sS http://192.168.11.50:4000/health`. **Portal:** `curl -sS http://192.168.11.51:3000/` (Next HTML). NPM: apex `sankofa` / `phoenix` hosts → **.51:3000** / **.50:4000** (not Blockscout) |
| C4 | Service catalog SKUs + entitlements (billing optional) | Product — see tenancy baseline G2 |
| D1 | SMOA LXC per `smoa/backend/docs/LXC-PROXMOX-CONTAINERS.md` | Deploy on Proxmox |
| D2 | SMOA API behind NPM | After D1 |
| D3 | Release APK + download URL or MDM | **Debug APK built (2026-03-26):** `../smoa/app/build/outputs/apk/debug/app-debug.apk` — publish via CI signed release + NPM/static URL or MDM |
| D4 | Device E2E against prod API | After D2–D3 |
| E1 | `complete-credential` submodules initialized | Done (session) |
| E2 | Phase 1 Docker stack local/CI | Done (session) — not yet Proxmox production |
| E3 | `./integration/run-phase1-synthetic.sh` after console rebuild | Done (session) |
| E4 | Production slice / dedicated LXC for `cc-*` | Architecture choice (profile A/B/C) |
| F1 | Chain 138 on-chain contract check | Done (session) |
| F2 | Blockscout verification | Done (session) |
| F3 | Public E2E routing | Done (session, 502-tolerant flag) |
| G1 | Logs, metrics, DB backups for Phoenix + SMOA + CC DBs | Operational runbooks |
| G2 | Incident ownership per stack | Process |
---
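
Item A2 asks the operator to confirm that secrets are present in the root `.env` and `smom-dbis-138/.env`. One way to do that without echoing any values is sketched below. The helper names are hypothetical, and this checklist does not list the required keys, so the caller passes them in.

```bash
# Hypothetical helper: return 0 if KEY has a non-empty value in an env file.
# $1 = env file, $2 = key name. Note that KEY="" still counts as present.
env_has_key() {
  [ -f "$1" ] || return 2
  grep -Eq "^(export[[:space:]]+)?$2=.+" "$1"
}

# Report missing keys by name only; values are never printed.
# $1 = env file, remaining arguments = key names (assumed, supplied by the operator).
report_missing_keys() {
  env_file=$1
  shift
  missing=0
  for key in "$@"; do
    if ! env_has_key "$env_file" "$key"; then
      echo "missing: $key ($env_file)"
      missing=1
    fi
  done
  return "$missing"
}
```

A call such as `report_missing_keys .env KEY_ONE KEY_TWO` (placeholder key names) exits non-zero if any key is absent or empty, which makes it usable as a pre-flight gate before the operator scripts.
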
## Quick commands (repo root unless noted)
```bash
./scripts/run-completable-tasks-from-anywhere.sh
source scripts/lib/load-project-env.sh && ./scripts/verify/check-contracts-on-chain-138.sh
E2E_ACCEPT_502_INTERNAL=1 ./scripts/verify/verify-end-to-end-routing.sh
./scripts/verify/run-contract-verification-with-proxy.sh
./scripts/deployment/sync-sankofa-portal-7801.sh --dry-run # then run without --dry-run (portal → CT 7801)
./scripts/verify/backup-npmplus.sh --dry-run # then run without --dry-run
```
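
Several commands above are meant to be run with `--dry-run` first and live second. A small wrapper can enforce that order. This is a sketch only: `dry_run_then_live` is a hypothetical function, and it assumes the wrapped script accepts `--dry-run` as its last argument, as the two scripts above do.

```bash
# Hypothetical wrapper: run a command with --dry-run, then live only on confirmation.
# Usage (from the repo root): dry_run_then_live ./scripts/verify/backup-npmplus.sh
dry_run_then_live() {
  # Stop early if the rehearsal itself fails.
  "$@" --dry-run || { echo "dry run failed; not running live" >&2; return 1; }

  printf 'Run live: %s ? [y/N] ' "$*" >&2
  read -r answer || answer=""   # EOF on stdin counts as "no"

  case "$answer" in
    y|Y) "$@" ;;
    *)   echo "skipped live run" >&2 ;;
  esac
}
```
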
**Complete Credential (sibling clone):**
```bash
cd ../complete-credential
docker compose -f integration/docker-compose.phase1.yml up -d --build
./integration/run-phase1-synthetic.sh
```
**SMOA:**
```bash
cd ../smoa && ./gradlew smoaVerify --no-daemon
```
---
## Follow-ups
1. **Phoenix LAN services:** Curl `192.168.11.50:4000/health` and `192.168.11.51:3000/`; if portal is down, run `sync-sankofa-portal-7801.sh` or `systemctl status sankofa-portal` on CT **7801**.
2. **Operator full wave:** `./scripts/run-all-operator-tasks-from-lan.sh` only when NPM RPC fix + backup + verify are intentionally desired (mutates NPM).
3. **Production Complete Credential:** Move from laptop Docker to **dedicated LXC** and NPM routes per deployment profile.
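
Follow-up 1 can be sketched as a probe plus a remediation hint. The function names below are hypothetical, the 5-second timeout is an assumption, and the hint text only restates the steps listed above.

```bash
# Fetch only the HTTP status code for a URL; falls back to 000 when nothing answers.
http_code() {
  code=$(curl -sS -o /dev/null -m 5 -w '%{http_code}' "$1" 2>/dev/null) || true
  echo "${code:-000}"
}

# Hypothetical helper: turn a service label and status code into a next step.
# $1 = service label (api or portal), $2 = HTTP status code
remediation_hint() {
  status=${2:-000}
  case "$status" in
    2[0-9][0-9]|3[0-9][0-9])
      echo "$1: HTTP $status, no action" ;;
    *)
      case "$1" in
        portal) echo "portal: HTTP $status, check 'systemctl status sankofa-portal' on CT 7801 or rerun sync-sankofa-portal-7801.sh" ;;
        *)      echo "$1: HTTP $status, re-check inside the container" ;;
      esac ;;
  esac
}

# Probe the two LAN endpoints named in follow-up 1 (requires LAN access).
check_phoenix_lan() {
  remediation_hint api "$(http_code http://192.168.11.50:4000/health)"
  remediation_hint portal "$(http_code http://192.168.11.51:3000/)"
}
```

Running `check_phoenix_lan` from an operator host with LAN access prints one line per service. The functions only report, so the sync script is never triggered implicitly.
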