Add Sankofa consolidated hub operator tooling

This commit is contained in:
defiQUG
2026-04-13 21:41:14 -07:00
parent 49740f1a59
commit b7eebb87b3
42 changed files with 2635 additions and 14 deletions


@@ -0,0 +1,96 @@
# Non-chain ecosystem — hyperscaler-style design and deployment
**Status:** Architecture / target operating model
**Last updated:** 2026-04-13
**Scope:** Everything **except** blockchain-adjacent guests and services (Besu validators and RPC lanes, Blockscout-style explorers, bridge relayers, Chain 138 deploy paths, token-aggregation **runtime** tied to chain RPC). Those stay on their own **chain plane** with chain-specific runbooks. This document is the **application and edge plane** for Sankofa, Phoenix, DBIS core, portals, NPM, identity, and supporting data.
---
## 1. What “ecosystem” means here
A coherent **platform**: operators and clients interact through a small number of **managed surfaces** (DNS, TLS, APIs, portals), backed by **clear boundaries** (identity, data, observability, change management). Hyperscalers do not run “one random VM per microsite”; they run **regional edge**, **shared app runtimes**, **managed data**, and **global control planes** with strict contracts.
Your non-chain ecosystem should **feel** like that: fewer hand-crafted snowflakes, more **repeatable cells** (LXC or VM patterns), **declared** upstreams, and **observable** health—not a flat list of unrelated CTs.
---
## 2. Hyperscaler concepts mapped to this program
| Hyperscaler idea | Plain language | This ecosystem (non-chain) |
|------------------|----------------|----------------------------|
| **Region** | Geography / failure domain | **LAN site** (e.g. VLAN 11 + Proxmox cluster) — one “region” today; multi-site is a later region pair. |
| **Availability zone** | Independent power/network within a region | **Proxmox nodes** (e.g. r630-01 vs r630-04) — place **stateless edge** and **burst** workloads across nodes; keep **tightly coupled** DB + app tiers co-located unless latency and HA analysis say otherwise. |
| **Edge / front door** | TLS termination, routing, WAF | **NPMplus** (and optional Cloudflare in front) — single place for certs, forced HTTPS, and upstream policy. |
| **API gateway / mesh ingress** | One front for many backends | **Phoenix API hub** (nginx Tier 1 today; optional BFF Tier 2) — `/graphql`, `/api`, consistent headers, rate limits, `TRUST_PROXY` alignment for `dbis_core`. |
| **Managed Kubernetes / App Service** | Standard runtime for web APIs | **LXC templates**: one pattern for “Node + systemd”, one for “nginx static only”, one for “Postgres only” — same packages, same hardening checklist. |
| **Identity (IdP)** | Central auth | **Keycloak** — realms, clients, MFA policy; portals are **clients**, not bespoke login servers. |
| **Managed database** | Durable state, backups, PITR | **Postgres** for Phoenix / portal data — backups, restore drills, connection limits documented. |
| **Service directory** | What runs where | **`ALL_VMIDS_ENDPOINTS.md`** + `config/ip-addresses.conf` + (when adopted) **hub env** — treat as **service catalog**, not tribal knowledge. |
| **Observability** | Metrics, logs, traces | **Per-cell**: node_exporter or similar where you standardize; **aggregator** (Grafana/Loki stack when you add it) — same pattern as “send logs to the regional pipeline.” |
| **Landing zone / policy** | Guardrails before workloads land | **`PROXMOX_OPS_APPLY`**, `PROXMOX_OPS_ALLOWED_VMIDS`, dry-run scripts, `proxmox-production-guard.sh` — “no mutation without contract,” similar to Azure Policy / SCP ideas at small scale. |
| **IaC / GitOps** | Desired state from repo | **This repo**: scripts + `config/` + runbooks; optional future **declarative** host config (e.g. cloud-init templates per role) so new CTs are **cloned from role**, not artisanal. |
---
## 3. Target cell types (non-chain)
Design the fleet from a **small menu** of cell types; anything that does not fit forces a design review.
1. **Edge-static cell** — nginx only; multiple `server_name` or `map $host`; static `root` per product line. Lowest RAM. Good for marketing, entity microsites (exported), status pages, and **SPAs that only talk to APIs** (no server-only NextAuth on that host). **IRU / marketplace discovery** often stays **dynamic** (SSR or browser app against `dbis_core`) until a deliberate static-export pipeline exists—do not assume static-first fits all catalog UX.
2. **Edge-SSR cell** — one Node process (or small cluster later) for NextAuth / server components; **one** cell per “SSR family,” not one per brand, where host-based routing suffices.
3. **API hub cell** — nginx (or future BFF) only; upstreams to Phoenix Apollo and `dbis_core` over LAN. **Prefer placement** on a node with headroom (see [SANKOFA_R630_01_CONSOLIDATION_AND_HUB_PLACEMENT_GOAL.md](../03-deployment/SANKOFA_R630_01_CONSOLIDATION_AND_HUB_PLACEMENT_GOAL.md)).
4. **Data cell** — Postgres (and optional read replica pattern later); no arbitrary co-install of app servers.
5. **Identity cell** — Keycloak; isolated upgrades and backup story.
6. **Operator / control** — NPM, IT read API, inventory jobs — same hardening and backup discipline as “regional tooling” accounts in public cloud.
**Anti-pattern:** one-off CTs that mix “random nginx + cron + manual edits” without a role name in the catalog.
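The edge-static cell's host-based routing (item 1) can be sketched in nginx. This is a sketch only: hostnames and document roots are illustrative assumptions, not the repo's actual config.

```nginx
# Sketch: one nginx server routing many static FQDNs via map $host.
# Hostnames and roots below are illustrative, not the repo's real values.
map $host $site_root {
    default               /var/www/default;
    sankofa.nexus         /var/www/sankofa-apex;
    www.sankofa.nexus     /var/www/sankofa-apex;
    status.sankofa.nexus  /var/www/status;
}

server {
    listen 80;
    server_name _;          # catch-all; routing happens in the map above
    root $site_root;

    location / {
        try_files $uri $uri/ /index.html;   # SPA-friendly fallback
    }
}
```

Adding a microsite then means one `map` line and one directory, not one new CT.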
---
## 4. Practices to adopt (hyperscaler-aligned)
- **Single edge story:** NPM (and DNS) as the **only** public entry contract; internal IPs are implementation details.
- **Hub-and-spoke APIs:** clients talk to **one** Phoenix-facing origin where possible; backends stay private on LAN. **CORS** allowlists must include **every browser origin** that calls the API (portal, admin, studio, marketplace SPAs)—not only hostnames served by the static web hub.
- **Blast radius:** consolidating statics **reduces** attack surface and cert sprawl; moving hubs off overloaded nodes **reduces** correlated failure under load.
- **Versioned change:** runbooks + script `--dry-run` first; VMID allowlists for mutations.
- **Observability contract:** every cell exposes **`/health`** (or documented equivalent) and logs to a **single** retention policy.
- **Naming:** FQDN → owner → cell type in docs (already directionally in `FQDN_EXPECTED_CONTENT.md` / E2E lists).
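The `/health` contract above can be exercised with a small read-only sweep. A sketch, not a repo script: the cell list and the `dbis-core` / `keycloak` IPs are illustrative placeholders (only the hub `:8080` address appears elsewhere in these docs), and `DRY_RUN=1` prints commands instead of touching the network.

```shell
# Sketch: sweep every cell's documented health endpoint.
# The cell list is an illustrative placeholder -- the real catalog
# is ALL_VMIDS_ENDPOINTS.md.
CELLS='
api-hub     http://192.168.11.50:8080/health
dbis-core   http://192.168.11.51:3000/health
keycloak    http://192.168.11.52:8080/realms/master
'

check_cells() {
  local name url rc=0
  while read -r name url; do
    [ -n "$name" ] || continue          # skip blank lines
    if [ "${DRY_RUN:-0}" = 1 ]; then
      echo "would: curl -fsS --max-time 5 $url   # $name"
    else
      curl -fsS --max-time 5 -o /dev/null "$url" \
        || { echo "FAIL $name $url" >&2; rc=1; }
    fi
  done <<< "$CELLS"
  return $rc
}

DRY_RUN=1 check_cells
```

Run without `DRY_RUN` from an operator host to get a pass/fail exit code per sweep.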
---
## 5. Explicit exclusion (blockchain plane)
Do **not** fold these into the “hyperscaler-style non-chain cell” menu without a **dedicated** chain runbook merge:
- Besu validators, sentries, core/public RPC CTs
- Blockscout / explorer stacks
- CCIP / relay / XDC-zero **chain** workers
- Chain 138 deploy RPC paths and token-aggregation **as chain execution** (writes, signers, keeper paths)
**Boundary nuance:** a **read-only** token-aggregation or quote service that only calls **public** RPC may be operated like an **edge-adjacent** app cell; anything holding **keys**, executing **writes**, or coupling to **validator** timing stays on the **chain plane**.
They remain a **separate plane** with different SLOs, upgrade windows, and safety rules. The **non-chain** ecosystem **integrates** with them only via **documented APIs and RPC URLs**, not by sharing generic web cells.
---
## 6. Related documents
- [SANKOFA_PHOENIX_CONSOLIDATED_FRONTEND_AND_API.md](./SANKOFA_PHOENIX_CONSOLIDATED_FRONTEND_AND_API.md)
- [SANKOFA_R630_01_CONSOLIDATION_AND_HUB_PLACEMENT_GOAL.md](../03-deployment/SANKOFA_R630_01_CONSOLIDATION_AND_HUB_PLACEMENT_GOAL.md)
- [SANKOFA_PHOENIX_CANONICAL_BOUNDARIES_AND_TAXONOMY.md](./SANKOFA_PHOENIX_CANONICAL_BOUNDARIES_AND_TAXONOMY.md)
- [PROXMOX_LOAD_BALANCING_RUNBOOK.md](../04-configuration/PROXMOX_LOAD_BALANCING_RUNBOOK.md)
- [PUBLIC_SECTOR_TENANCY_MARKETPLACE_AND_DEPLOYMENT_BASELINE.md](./PUBLIC_SECTOR_TENANCY_MARKETPLACE_AND_DEPLOYMENT_BASELINE.md) (tenancy and catalog vs marketing)
- [NON_CHAIN_ECOSYSTEM_PLAN_REVIEW_AND_GAPS.md](./NON_CHAIN_ECOSYSTEM_PLAN_REVIEW_AND_GAPS.md) (inconsistencies, P0/P1 backlog, NPM/WebSocket/`TRUST_PROXY`)
---
## 7. Adoption (incremental)
You do not need a “big bang.” Order of operations:
1. Name current CTs against the **cell types** in section 3; mark gaps.
2. Stand up **one** edge-static or API-hub cell on a **non-r630-01** node as a template.
3. Migrate **lowest-risk** FQDNs (static marketing) first; then API hub; then SSR if needed.
4. Retire redundant CTs after rollback window; update inventory and `get_host_for_vmid`.
Fill a short decision log in [SANKOFA_R630_01_CONSOLIDATION_AND_HUB_PLACEMENT_GOAL.md](../03-deployment/SANKOFA_R630_01_CONSOLIDATION_AND_HUB_PLACEMENT_GOAL.md) as you execute.


@@ -0,0 +1,166 @@
# Non-chain ecosystem plan — detailed review, gaps, and inconsistencies
**Purpose:** Critical review of the consolidated Phoenix / web hub / r630-01 offload / hyperscaler-style documents and scripts as of **2026-04-13**. Use this as a **remediation backlog**; update linked docs when items close.
**Scope reviewed:**
[NON_CHAIN_ECOSYSTEM_HYPERSCALER_STYLE_MODEL.md](./NON_CHAIN_ECOSYSTEM_HYPERSCALER_STYLE_MODEL.md),
[SANKOFA_PHOENIX_CONSOLIDATED_FRONTEND_AND_API.md](./SANKOFA_PHOENIX_CONSOLIDATED_FRONTEND_AND_API.md),
[SANKOFA_R630_01_CONSOLIDATION_AND_HUB_PLACEMENT_GOAL.md](../03-deployment/SANKOFA_R630_01_CONSOLIDATION_AND_HUB_PLACEMENT_GOAL.md),
`scripts/deployment/install-sankofa-api-hub-nginx-on-pve.sh`,
`scripts/verify/verify-sankofa-consolidated-hub-lan.sh`,
`config/ip-addresses.conf` hub defaults,
`scripts/lib/load-project-env.sh` `get_host_for_vmid`.
---
## 1. Cross-document consistency
| Topic | Hyper-scaler model | Consolidated hub doc | r630-01 goal doc | Verdict |
|-------|---------------------|----------------------|------------------|---------|
| Chain vs non-chain boundary | Explicit exclusion list | Matches | Matches | **Aligned** |
| API hub Tier 1 | Gateway row | Tier 1 nginx | Phase 2 move hub off 7800 | **Aligned**; live state (hub on **7800**) is **interim** per r630 doc |
| Web hub | Edge-static / SSR cells | Options A/B/C | Phase 1 | **Aligned** |
| Load relief | Fewer cells + placement | “Moving hubs” note | Non-goal: nginx CPU on same node | **Aligned** |
| NPM | Single edge story | Fewer upstream IPs possible | NPM repoint | **Partial gap:** NPM often still **one row per FQDN**; “fewer rows” is **upstream IP convergence**, not necessarily fewer proxy host records (see §4.1). |
---
## 2. Technical gaps (must fix in implementation, not only docs)
### 2.1 `TRUST_PROXY` and client IP for `dbis_core` (high)
**Issue:** Tier-1 nginx forwards `X-Forwarded-For` / `X-Real-IP`, but `dbis_core` IRU rate limits and abuse logic require **`TRUST_PROXY=1`** (and the correct **trusted hop** count: NPM → hub → app). If `dbis_core` does not trust the hub IP, it sees **only the hub's** LAN address for all users.
**Remediation:** Document in cutover checklist: set `TRUST_PROXY=1` on `dbis_core` **and** restrict trusted proxy list to **NPM** and **API hub** subnets/IPs. Add integration test: rate limit key changes when `X-Forwarded-For` varies.
**Doc fix:** Already mentioned in consolidated §3.3; add explicit **“before NPM → hub cutover”** gate in [SANKOFA_PHOENIX_CONSOLIDATED_FRONTEND_AND_API.md](./SANKOFA_PHOENIX_CONSOLIDATED_FRONTEND_AND_API.md) operator checklist.
**Repo (2026-04-13):** `dbis_core` supports **`TRUST_PROXY_HOPS`** (110) so Express `trust proxy` matches NPM-only vs NPM→hub→app; see `dbis_core/.env.example`. IP allowlisting for proxies remains an ops/network task.
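To make the trusted-hop requirement concrete, here is a minimal sketch (not repo code) of how a proxy-aware app derives the client IP from `X-Forwarded-For` plus the socket peer, given a hop count. It is a simplified model of Express's `trust proxy`; `192.168.11.10` stands in for the NPM LAN IP (assumption), `192.168.11.50` is the hub per these docs.

```shell
client_ip_for_hops() {
  # $1 = X-Forwarded-For header, $2 = immediate TCP peer, $3 = trusted hop count.
  # Chain (left -> right): client, then each proxy that forwarded the request,
  # then the socket peer -- mirroring how Express walks `trust proxy` hops.
  local IFS=', '
  read -r -a chain <<< "$1"
  chain+=("$2")
  local n=${#chain[@]} hops="$3"
  local idx=$(( n - 1 - hops ))
  (( idx < 0 )) && idx=0
  printf '%s\n' "${chain[$idx]}"
}

# One trusted hop (NPM-only): every user keys as NPM's address.
client_ip_for_hops "203.0.113.9, 192.168.11.10" "192.168.11.50" 1   # -> 192.168.11.10
# Two trusted hops (NPM -> hub -> app): the real client IP is recovered.
client_ip_for_hops "203.0.113.9, 192.168.11.10" "192.168.11.50" 2   # -> 203.0.113.9
```

This is exactly the property the integration test in the remediation should assert: the rate-limit key changes when `X-Forwarded-For` varies.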
### 2.2 GraphQL WebSocket through NPM + hub (high)
**Issue:** `graphql-ws` requires **Upgrade** end-to-end. NPM custom locations must **allow WebSockets**; hub nginx already sets `Upgrade` / `Connection` to Apollo. If NPM strips or times out upgrades, subscriptions break **silently** for some clients.
**Remediation:** Add explicit E2E: `wscat` or Apollo subscription smoke **through public URL** after any NPM port/path change. Document NPM “Websockets support” toggle if applicable.
**Repo:** `scripts/verify/smoke-phoenix-graphql-wss-public.sh` (curl **HTTP 101** upgrade on `wss://…/graphql-ws`; use `PHOENIX_WSS_INCLUDE_LAN=1` for hub `:8080`).
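The HTTP 101 check can also be reproduced by hand with `curl`. A sketch: `ws_upgrade_probe` is a hypothetical helper (not the repo script), the URL follows these docs, and `DRY_RUN=1` prints the command instead of calling out.

```shell
# Sketch: hand-rolled WebSocket upgrade probe, mirroring what the repo's
# smoke-phoenix-graphql-wss-public.sh asserts (HTTP 101 on /graphql-ws).
ws_upgrade_probe() {
  local url="$1"
  local cmd=(curl -sS -o /dev/null -w '%{http_code}'
    -H 'Connection: Upgrade'
    -H 'Upgrade: websocket'
    -H 'Sec-WebSocket-Version: 13'
    -H 'Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ=='
    -H 'Sec-WebSocket-Protocol: graphql-transport-ws'
    "$url")
  if [ "${DRY_RUN:-0}" = 1 ]; then
    printf '%s\n' "${cmd[*]}"      # show what would run; no network
  else
    [ "$("${cmd[@]}")" = 101 ] || { echo "no 101 upgrade from $url" >&2; return 1; }
  fi
}

DRY_RUN=1 ws_upgrade_probe 'https://phoenix.sankofa.nexus/graphql-ws'
```

Run without `DRY_RUN` after any NPM port/path change; anything other than 101 means NPM or the hub dropped the upgrade.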
### 2.3 CORS and browser origins (medium)
**Issue:** Consolidated doc says CORS allowlist “web hub FQDNs only.” Browsers calling **`https://phoenix.sankofa.nexus/graphql`** from **`https://portal.sankofa.nexus`** are **cross-origin**; allowlist must include **portal**, **admin**, **studio**, and any SPA origins that call the API—not only the web hub static hostnames.
**Remediation:** Replace wording with **“all documented browser origins that invoke Phoenix or `dbis_core` from the browser.”** Cross-ref [SANKOFA_MARKETPLACE_SURFACES.md](../03-deployment/SANKOFA_MARKETPLACE_SURFACES.md) for IRU public routes.
### 2.4 Health check path in operator checklist (low — doc error)
**Issue:** Cutover checklist suggested `GET /api/v1/health`; `dbis_core` exposes **`/health`** and **`/v1/health`**, not under `/api/v1/`.
**Remediation:** Checklist corrected in consolidated doc to **`/health` via hub** (`/api/` prefix does not apply to root health).
### 2.5 Dual public paths (4000 vs 8080) during migration (medium)
**Issue:** While both ports are open, **clients can bypass** hub policies (CORS, future WAF) by targeting **:4000** directly, since filtering exists only at NPM. The hyperscaler model prefers **one** ingress.
**Remediation:** After NPM cutover to **8080**, **firewall** Phoenix **:4000** to **localhost + hub IP only** on CT 7800, or bind Apollo to **127.0.0.1** only (application config change—needs Phoenix runbook).
**Repo (2026-04-13):** `scripts/deployment/ensure-sankofa-phoenix-apollo-bind-loopback-7800.sh` sets **`HOST=127.0.0.1`** for Fastify on **7800** when hub upstream is **127.0.0.1:4000**.
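A quick post-cutover assertion that nothing but loopback still listens on `:4000` can be scripted against `ss` output. Sketch only (the helper name is hypothetical); it takes `ss -ltnH` output as an argument so the parsing is testable offline.

```shell
# Sketch: flag any :4000 listener not bound to loopback. Feed it the output
# of `ss -ltnH` from CT 7800; prints offenders and returns 1 if Apollo is
# still reachable from the LAN after the cutover.
non_loopback_4000() {
  local bad
  bad=$(printf '%s\n' "$1" \
    | awk '$4 ~ /:4000$/ && $4 !~ /^127\./ && $4 !~ /^\[::1\]/ { print $4 }')
  [ -z "$bad" ] || { printf '%s\n' "$bad"; return 1; }
}

# Live usage on the CT (assumption: iproute2's ss is installed):
#   non_loopback_4000 "$(ss -ltnH)" && echo "Apollo is loopback-only"
```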
### 2.6 Stock `nginx` package disabled on 7800 (medium)
**Issue:** The installer's `systemctl disable nginx` disables the stock **Debian `nginx.service`**; operators expecting `nginx` for ad-hoc static files on that CT lose it. This is intentional today for the **dedicated** `sankofa-phoenix-api-hub.service`.
**Remediation:** Document on CT 7800: **only** `sankofa-phoenix-api-hub` serves nginx; do not re-enable stock unit without conflict check.
### 2.7 `proxy_pass` URI and trailing slashes (low)
**Issue:** `location /api/` + `proxy_pass http://dbis_core_rest;` preserves the URI prefix—correct for `dbis_core` mounted at `/api/v1`. If any route is mounted at root on the upstream, a mismatch is possible.
**Remediation:** Keep; add note: new BFF routes must use **distinct prefixes** (`/bff/`) to avoid colliding with Apollo or `dbis_core`.
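The trailing-slash behavior is easiest to see side by side. A sketch (upstream names follow the text; the commented variant is the one to avoid here):

```nginx
# proxy_pass WITHOUT a URI keeps the matched prefix:
# /api/v1/users is forwarded as /api/v1/users -- correct for
# dbis_core mounted at /api/v1.
location /api/ {
    proxy_pass http://dbis_core_rest;
}

# proxy_pass WITH a URI substitutes the prefix:
# /api/v1/users would be forwarded as /v1/users -- only for
# upstreams that mount their routes at root.
# location /api/ {
#     proxy_pass http://dbis_core_rest/;
# }

# New BFF routes take a distinct prefix to avoid colliding with
# Apollo or dbis_core:
location /bff/ {
    proxy_pass http://bff_upstream;
}
```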
---
## 3. Inventory and automation gaps
### 3.1 `get_host_for_vmid` omits explicit Sankofa VMIDs (medium)
**Issue:** Sankofa stack VMIDs **7800–7806** fell through to **default** `*)` → r630-01. Behavior matched inventory but was **implicit**—easy to break if default changes.
**Remediation:** Add explicit `7800|7801|7802|7803|7806` case arm to `get_host_for_vmid` with comment “Sankofa Phoenix stack — verify with `pct list` when migrating.”
**Repo (2026-04-13):** Explicit **7800–7806** arm on r630-01 in `scripts/lib/load-project-env.sh` (includes gov portals 7804 and studio 7805).
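The shape of the explicit arm can be sketched as below. This is a simplified assumption, not a copy of `scripts/lib/load-project-env.sh`; the point is that the stack stays pinned even if the `*)` default ever changes.

```shell
# Sketch (simplified; the real function lives in scripts/lib/load-project-env.sh).
get_host_for_vmid() {
  case "$1" in
    # Sankofa Phoenix stack, incl. gov portals 7804 and studio 7805 --
    # verify with `pct list` when migrating.
    7800|7801|7802|7803|7804|7805|7806) echo "r630-01" ;;
    *) echo "r630-01" ;;   # current default; may change independently
  esac
}

get_host_for_vmid 7803   # -> r630-01
```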
### 3.2 Fleet scripts and hub env vars (medium)
**Issue:** `IP_SANKOFA_PHOENIX_API_HUB` / `SANKOFA_PHOENIX_API_HUB_PORT` exist in `ip-addresses.conf`, but **`update-npmplus-proxy-hosts-api.sh`** (and friends) may still **hardcode** or use only `IP_SANKOFA_PHOENIX_API` + `4000`.
**Remediation:** Grep fleet scripts; add optional branch: when `SANKOFA_PHOENIX_API_HUB_PORT=8080` and flag file or env `SANKOFA_NPM_USE_API_HUB=1`, emit upstream **:8080**. Until then, document **manual** NPM row for hub cutover.
**Repo (2026-04-13):** `update-npmplus-proxy-hosts-api.sh` uses **`SANKOFA_NPM_PHOENIX_PORT`** (default `SANKOFA_PHOENIX_API_PORT`) and **`IP_SANKOFA_NPM_PHOENIX_API`** for `phoenix.sankofa.nexus` / `www.phoenix`. See [SANKOFA_API_HUB_NPM_CUTOVER_AND_POST_CUTOVER_RUNBOOK.md](../03-deployment/SANKOFA_API_HUB_NPM_CUTOVER_AND_POST_CUTOVER_RUNBOOK.md).
### 3.3 `PROXMOX_HOST` for install script (low)
**Issue:** `install-sankofa-api-hub-nginx-on-pve.sh` defaults `PROXMOX_HOST` to r630-01. For hub on **r630-04**, operator must export `PROXMOX_HOST`—easy to miss.
**Remediation:** The script header already mentions this; add a **one-line echo** of the resolved host at the start of `--apply` (partially done); extend dry-run to print the `get_host_for_vmid` suggestion when `SANKOFA_API_HUB_TARGET_NODE` is set (future env).
**Repo (2026-04-13):** Header states **PROXMOX_HOST = PVE node**; dry-run prints **`get_host_for_vmid`** when `load-project-env.sh` is sourced.
---
## 4. Hyperscaler model — internal tensions
### 4.1 “Single edge” vs NPM reality
**Tension:** Model says NPM is the **only** public entry contract. Technically true for TLS, but **NPM** often implements **one proxy host per FQDN**. Hyperscalers use **one ALB** with many rules. **Semantic alignment:** treat NPM as **ALB-equivalent**; “single edge” means **single trust and cert pipeline**, not literally one row.
### 4.2 Static-first IRU / marketplace
**Tension:** [SANKOFA_PHOENIX_CONSOLIDATED_FRONTEND_AND_API.md](./SANKOFA_PHOENIX_CONSOLIDATED_FRONTEND_AND_API.md) suggests static export for IRU/marketplace **where compatible**. Today much of partner discovery is **dynamic** (`dbis_core` + Phoenix marketplace), so static-first is **over-optimistic** without a “dynamic shell + CDN” alternative.
**Remediation:** In NON_CHAIN doc §3, clarify **Edge-static** is for **marketing and post-login SPAs that only call APIs**; **IRU public catalog** may remain **Edge-SSR** or **API-driven SPA** until a static export pipeline exists.
### 4.3 Token-aggregation and “chain plane” boundary
**Tension:** [NON_CHAIN_ECOSYSTEM_HYPERSCALER_STYLE_MODEL.md](./NON_CHAIN_ECOSYSTEM_HYPERSCALER_STYLE_MODEL.md) excludes **token-aggregation runtime tied to chain RPC**, yet many deployments colocate **token-aggregation** with **explorer** or **info** nginx—a **hybrid**. Risk: teams mis-classify a service and consolidate the wrong CT.
**Remediation:** Add one line: **“Token-aggregation API that only proxies to public RPC may be treated as edge-adjacent; workers that hold keys or execute chain writes stay chain-plane.”**
### 4.4 Postgres coupling
**Tension:** The r630 doc says the stack is **tightly coupled** for latency; hyperscaler “managed DB” often implies **network separation**. Acceptable as a **single-AZ** pattern; document **when** splitting the Phoenix API from the **7803** Postgres requires **read replicas** or a **connection pooler** (PgBouncer) first.
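If that split ever happens, a pooler goes in first. A minimal transaction-pooling PgBouncer sketch; all names, IPs, and limits here are illustrative assumptions, not repo config:

```ini
; Sketch only: PgBouncer in front of the 7803 Postgres before moving the
; Phoenix API to another node. Host, dbname, and limits are illustrative.
[databases]
phoenix = host=192.168.11.53 port=5432 dbname=phoenix

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
pool_mode = transaction
max_client_conn = 200
default_pool_size = 20
auth_type = scram-sha-256
auth_file = /etc/pgbouncer/userlist.txt
```

Transaction pooling caps upstream connections so the API tier can scale out without exhausting Postgres `max_connections`.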
---
## 5. Missing runbook sections (add over time)
| Missing item | Why it matters |
|--------------|----------------|
| **Backup/restore** before hub install and before `pct migrate` | Hub nginx does not replace backup discipline for Postgres / Keycloak. |
| **Keycloak redirect URIs** when origins move to web hub IP/hostnames | OIDC failures post-cutover. |
| **Certificate issuance** when many FQDNs share one upstream IP | NPM still requests certs per host; rate limits / ACME. |
| **Rollback:** restore NPM upstream + `systemctl start nginx` on 7800? | Dual-stack rollback path. |
| **SLO / error budget** | Hyperscaler practice; currently implicit. |
| **CI for `nginx -t`** on example configs | GitHub Actions: `.github/workflows/validate-sankofa-nginx-examples.yml` (Gitea: mirror or add equivalent workflow). |
---
## 6. Document maintenance items (quick fixes)
1. **Consolidated doc §5** — ensure artifact table always lists **`install-sankofa-api-hub-nginx-on-pve.sh`** and **`verify-sankofa-consolidated-hub-lan.sh`** next to other operator scripts.
2. **Consolidated §3.2 Tier 1** — prefer **LAN upstream to `dbis_core`** as the default narrative (colocated `127.0.0.1:3000` is the special case). **Clarified** in repo.
3. **Decision log** — the “Web hub pattern” row stays **TBD / interim** until a web hub is chosen, unlike the already-filled API-tier row. **Updated** in repo.
4. **This file** linked from [NON_CHAIN_ECOSYSTEM_HYPERSCALER_STYLE_MODEL.md](./NON_CHAIN_ECOSYSTEM_HYPERSCALER_STYLE_MODEL.md) §6 and [MASTER_INDEX.md](../MASTER_INDEX.md).
---
## 7. Prioritized remediation backlog
| Priority | Item | Owner / status |
|----------|------|--------|
| P0 | Verify `TRUST_PROXY` + **`TRUST_PROXY_HOPS`** + production trust boundaries for `dbis_core` when using hub | **LAN:** `TRUST_PROXY=1` on **10150/10151** via `ensure-dbis-api-trust-proxy-on-ct.sh`; validate rate-limit keys from two public IPs |
| P0 | WebSocket E2E through NPM after hub port change | **Done:** `smoke-phoenix-graphql-wss-public.sh` → **HTTP 101**; `pnpm run verify:phoenix-graphql-ws-subscription` → **connection_ack** (remove unused `@fastify/websocket` on 7800 if RSV1; see runbook). |
| P1 | CORS / allowed origins list includes all browser callers | App + API |
| P1 | Firewall or bind Apollo to localhost after NPM → 8080 | **Done:** `ensure-sankofa-phoenix-apollo-bind-loopback-7800.sh` on **7800** (or use firewall plan if HOST cannot be set) |
| P2 | Explicit `get_host_for_vmid` entries for 7800–7806 | **Done** in `load-project-env.sh` — re-verify on migrate |
| P2 | NPM fleet **`SANKOFA_NPM_PHOENIX_PORT`** / **`IP_SANKOFA_NPM_PHOENIX_API`** | **Done** in `update-npmplus-proxy-hosts-api.sh` |
| P3 | Backup/rollback runbook sections | [SANKOFA_API_HUB_NPM_CUTOVER_AND_POST_CUTOVER_RUNBOOK.md](../03-deployment/SANKOFA_API_HUB_NPM_CUTOVER_AND_POST_CUTOVER_RUNBOOK.md) §0 / §5 |
| P3 | Clarify static-first vs dynamic IRU in NON_CHAIN §3 | Docs |
---
## 8. Conclusion
The plan is **directionally sound**: chain plane separation, cell typing, phased offload from r630-01, and Tier-1 API hub are **consistent**. The largest **gaps** are **operational truth** items (client IP trust, WebSockets, CORS wording, dual-port exposure) and **automation drift** (NPM scripts vs new env vars, implicit VMID→host). Closing **P0–P1** before wide NPM cutover matches how hyperscalers treat **ingress migrations**: prove identity and transport contracts first, then shift traffic.


@@ -0,0 +1,158 @@
# Sankofa Phoenix — consolidated non-chain frontend and API hub
**Status:** Architecture proposal (resource conservation)
**Last updated:** 2026-04-13
**LAN status (operator):** Tier-1 API hub **nginx on VMID 7800** listening **`http://192.168.11.50:8080`** (`sankofa-phoenix-api-hub.service`). Apollo (Fastify) binds **`127.0.0.1:4000`** only (`HOST=127.0.0.1` in `/opt/sankofa-api/.env`; apply: `scripts/deployment/ensure-sankofa-phoenix-apollo-bind-loopback-7800.sh`). NPM → **:8080** + WebSocket upgrades is live for `phoenix.sankofa.nexus` (fleet 2026-04-13). Install hub: `scripts/deployment/install-sankofa-api-hub-nginx-on-pve.sh` with `PROXMOX_OPS_APPLY=1` + `PROXMOX_OPS_ALLOWED_VMIDS=7800`. Readiness: `scripts/verify/verify-sankofa-consolidated-hub-lan.sh`, hub GraphQL `scripts/verify/smoke-phoenix-api-hub-lan.sh`, WebSocket upgrade `scripts/verify/smoke-phoenix-graphql-wss-public.sh` (`pnpm run verify:phoenix-graphql-wss`), graphql-ws handshake `pnpm run verify:phoenix-graphql-ws-subscription`, hub `/graphql-ws` headers `scripts/deployment/ensure-sankofa-phoenix-api-hub-graphql-ws-proxy-headers-7800.sh`.
**r630-01 load goal:** consolidating frontends and **moving hub LXCs** to quieter nodes is what reduces guest count and hypervisor pressure — see [SANKOFA_R630_01_CONSOLIDATION_AND_HUB_PLACEMENT_GOAL.md](../03-deployment/SANKOFA_R630_01_CONSOLIDATION_AND_HUB_PLACEMENT_GOAL.md).
**Ecosystem shape (non-chain, hyperscaler-style):** [NON_CHAIN_ECOSYSTEM_HYPERSCALER_STYLE_MODEL.md](./NON_CHAIN_ECOSYSTEM_HYPERSCALER_STYLE_MODEL.md) (cell types, edge vs chain plane).
**Scope:** Non-blockchain Sankofa / Phoenix surfaces only. **Out of scope:** Chain 138 explorer, Besu/RPC, CCIP/relayers, token-aggregation compute — keep those on dedicated LXCs/VMs per existing runbooks.
---
## 1. Problem
Today, multiple LXCs/VMIDs often run **one primary workload each** (portal, corporate web, Phoenix API, DBIS API, gov dev shells, etc.). Each Node or Next process carries **base RAM** (V8 heap, file watchers in dev, separate copies of dependencies). Nginx-only static sites are cheap; **many separate Node servers are not**.
This document defines a **consolidated runtime** that:
1. Puts **all non-chain web frontends** behind **one LAN endpoint** (one LXC or one Docker host — your choice), using **static-first** or **one Node process** where SSR is required.
2. Puts **all Phoenix-facing backend traffic** behind **one logical API** (one public origin and port): GraphQL (current Phoenix), REST/BFF (`dbis_core` and future middleware), health, and webhooks.
Canonical surface taxonomy remains [SANKOFA_PHOENIX_CANONICAL_BOUNDARIES_AND_TAXONOMY.md](./SANKOFA_PHOENIX_CANONICAL_BOUNDARIES_AND_TAXONOMY.md). Consolidation changes **packaging**, not the names of visitor vs client vs operator paths.
---
## 2. Single “web hub” LXC (frontends)
### 2.1 Option A — Static-first (lowest RAM)
**When:** Marketing pages, IRU/marketplace **after** static export, simple entity microsites, post-login SPAs that call the API hub only.
- Build: `next build` with `output: 'export'` **where compatible** (no server-only APIs on those routes).
- Serve: **nginx** with one `server` per FQDN (`server_name`) or one server + `map $host $site_root` → different `root` directories under `/var/www/...`.
- **NPM:** All affected FQDNs point to the **same** upstream `http://<WEB_HUB_IP>:80`.
**Tradeoff:** NextAuth / OIDC callback flows and server components need either **client-only OIDC** (PKCE) against Keycloak or a **small** SSR slice (see option B).
### 2.2 Option B — One Node process for all SSR Next apps (moderate RAM)
**When:** Portal (`portal.sankofa.nexus`), admin, or any app that must keep `getServerSideProps`, NextAuth, or middleware.
- **Monorepo** (e.g. Turborepo/Nx): multiple Next “apps” merged into **one deployable** using:
- **Next multi-zone** (primary + mounted sub-apps), or
- **Single Next 15 app** with `middleware.ts` rewriting by `Host`, or
- **Single custom server** (less ideal) proxying to child apps — avoid unless necessary.
**Outcome:** One `node` process (or one `standalone` output + one PID supervisor) on **one port** (e.g. 3000). Nginx in front optional (TLS termination usually at NPM).
### 2.3 Option C — Hybrid (practical migration)
- **nginx:** static corporate apex, static entity sites, docs mirrors.
- **One Node:** portal + Phoenix “shell” that must stay dynamic.
Still **fewer** LXCs than “one LXC per microsite.”
### 2.4 What stays out of this box
- Blockscout / explorer stacks
- `info.defi-oracle.io`, MEV GUI, relay health — separate nginx LXCs as today unless you explicitly merge **static** mirrors only
- Keycloak — **keep separate** (identity is its own security domain)
---
## 3. Single consolidated API (Phoenix hub)
### 3.1 Responsibilities
| Path family | Today (typical) | Hub role |
|-------------|-----------------|----------|
| `/graphql`, `/graphql-ws` | Phoenix VMID 7800 :4000 | **Reverse proxy** to existing Apollo until merged in code |
| `/api/v1/*`, `/api-docs` | `dbis_core` (e.g. :3000) | **Reverse proxy** mount |
| `/health` | Multiple | **Aggregate** (optional): hub returns 200 only if subgraphs pass |
| Future BFF | N/A | **Implement in hub** (session, composition, rate limits) |
**Naming:** Introduce an internal service name e.g. `sankofa-phoenix-hub-api`. Public FQDN can remain `phoenix.sankofa.nexus` or split to `api.phoenix.sankofa.nexus` for clarity; NPM decides.
### 3.2 Implementation tiers (phased)
**Tier 1 — Thin hub (fastest, lowest risk)**
One process: **nginx** or **Caddy**. **Typical production pattern:** hub on its own LXC or same CT as Apollo — `proxy_pass` Phoenix to **`127.0.0.1:4000`** when colocated, and `dbis_core` to **`IP_DBIS_API:3000`** (LAN) as in `install-sankofa-api-hub-nginx-on-pve.sh`. **Single public port** (e.g. 443 behind NPM → **8080** on the hub). Before NPM sends public traffic to the hub, validate **`TRUST_PROXY`** and trusted proxy hops for `dbis_core` (see [NON_CHAIN_ECOSYSTEM_PLAN_REVIEW_AND_GAPS.md](./NON_CHAIN_ECOSYSTEM_PLAN_REVIEW_AND_GAPS.md) §2.1).
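Condensed sketch of that Tier-1 layout; the repo's `config/nginx/sankofa-phoenix-api-hub.example.conf` is authoritative, and the `dbis_core` IP here is an illustrative stand-in for `IP_DBIS_API`:

```nginx
# Sketch: Tier-1 path -> upstream hub (colocated-with-Apollo case).
upstream phoenix_apollo { server 127.0.0.1:4000; }      # colocated Apollo
upstream dbis_core_rest { server 192.168.11.51:3000; }  # IP_DBIS_API (illustrative)

server {
    listen 8080;   # NPM terminates TLS and forwards here

    location /graphql {
        proxy_pass http://phoenix_apollo;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }

    location /graphql-ws {
        proxy_pass http://phoenix_apollo;
        proxy_http_version 1.1;                       # required for Upgrade
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }

    location /api/ {
        proxy_pass http://dbis_core_rest;   # no URI part: /api/ prefix preserved
    }
}
```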
**Tier 2 — Application hub**
Single **Node** (Fastify/Express) app: validates JWT once, applies rate limits, `proxy` to subgraphs, adds **BFF** routes (`/bff/portal/...`).
**Tier 3 — Monolith (long-term)**
Merge routers and schema into one codebase — only after boundaries and ownership are clear.
### 3.3 Middleware cross-cutting
Centralize in the hub:
- **CORS** allowlist (origins = **all documented browser origins** that call Phoenix or `dbis_core` from the browser, not only web hub FQDNs — see [NON_CHAIN_ECOSYSTEM_PLAN_REVIEW_AND_GAPS.md](./NON_CHAIN_ECOSYSTEM_PLAN_REVIEW_AND_GAPS.md) §2.3)
- **Rate limiting** (especially IRU public POST — align with `dbis_core` **`TRUST_PROXY=1`** and a **trusted proxy list** that includes NPM and this hub's LAN IP; otherwise rate limits see only the hub)
- **Request ID** propagation
- **mTLS** or IP allowlist for operator-only routes (optional)
---
## 4. NPM and inventory
After cutover:
- **Fewer distinct upstream IPs** in NPM (many FQDNs can point at the **same** `IP:port`); NPM may still use **one proxy host record per FQDN** for TLS—equivalent to one ALB with many listener rules, not literally one row total. Host-based routing then lives in **web hub** nginx (`server_name` / `map`) or in **Next** `middleware.ts`.
- Update [ALL_VMIDS_ENDPOINTS.md](../04-configuration/ALL_VMIDS_ENDPOINTS.md) and `get_host_for_vmid` in `scripts/lib/load-project-env.sh` when VMIDs are **retired** or **replaced** by hub VMIDs.
- **`config/ip-addresses.conf`** defines optional hub variables that **default to the current discrete CT IPs** (`IP_SANKOFA_WEB_HUB` → portal IP, `IP_SANKOFA_PHOENIX_API_HUB` → Phoenix API IP). Override in `.env` when hub LXCs exist.
---
## 5. Concrete file references in this repo
| Artifact | Purpose |
|----------|---------|
| [config/nginx/sankofa-non-chain-frontends.example.conf](../../config/nginx/sankofa-non-chain-frontends.example.conf) | Example **host → static root** nginx for web hub |
| [config/nginx/sankofa-phoenix-api-hub.example.conf](../../config/nginx/sankofa-phoenix-api-hub.example.conf) | Example **path → upstream** for API hub (Tier 1); tune `upstream` to LAN or `127.0.0.1` when colocated |
| [config/nginx/sankofa-hub-main.example.conf](../../config/nginx/sankofa-hub-main.example.conf) | Top-level `nginx.conf` for web hub CT (`-c` for systemd) |
| [config/nginx/sankofa-api-hub-main.example.conf](../../config/nginx/sankofa-api-hub-main.example.conf) | Top-level `nginx.conf` for API hub CT |
| [config/systemd/sankofa-non-chain-web-hub-nginx.service.example](../../config/systemd/sankofa-non-chain-web-hub-nginx.service.example) | systemd unit for web hub nginx |
| [config/systemd/sankofa-phoenix-api-hub-nginx.service.example](../../config/systemd/sankofa-phoenix-api-hub-nginx.service.example) | systemd unit for API hub nginx |
| [config/compose/sankofa-consolidated-runtime.example.yml](../../config/compose/sankofa-consolidated-runtime.example.yml) | Optional Docker Compose sketch (API hub container only) |
| [scripts/verify/check-sankofa-consolidated-nginx-examples.sh](../../scripts/verify/check-sankofa-consolidated-nginx-examples.sh) | **`nginx -t`** on example snippets (host `nginx` or **Docker** fallback) |
| [scripts/deployment/plan-sankofa-consolidated-hub-cutover.sh](../../scripts/deployment/plan-sankofa-consolidated-hub-cutover.sh) | Read-only cutover reminder + resolved env from `load-project-env.sh` |
| [scripts/deployment/install-sankofa-api-hub-nginx-on-pve.sh](../../scripts/deployment/install-sankofa-api-hub-nginx-on-pve.sh) | Tier-1 hub install on CT (`--dry-run` / `--apply` + `PROXMOX_OPS_*`) |
| [scripts/verify/verify-sankofa-consolidated-hub-lan.sh](../../scripts/verify/verify-sankofa-consolidated-hub-lan.sh) | Read-only LAN smoke (Phoenix, portal, dbis `/health`, Keycloak realm) |
---
## 6. Operator cutover checklist (complete in order)
1. Run `bash scripts/verify/check-sankofa-consolidated-nginx-examples.sh` (CI or laptop).
2. Provision **one** non-chain web hub LXC and/or **one** API hub LXC (or colocate nginx on an existing CT — document the choice).
3. Copy and edit nginx snippets from `config/nginx/` into `/etc/sankofa-web-hub/` and `/etc/sankofa-phoenix-api-hub/` per systemd examples; install **systemd** units from `config/systemd/*.example` (drop `.example`, adjust paths).
4. Set **`.env`** overrides: `IP_SANKOFA_WEB_HUB`, `SANKOFA_WEB_HUB_PORT`, `IP_SANKOFA_PHOENIX_API_HUB`, `SANKOFA_PHOENIX_API_HUB_PORT` (see `plan-sankofa-consolidated-hub-cutover.sh` output after `source scripts/lib/load-project-env.sh`).
5. **Dry-run** NPM upstream changes; then apply during a maintenance window. Confirm **WebSocket** (GraphQL subscriptions) through NPM if clients use `graphql-ws`.
6. Smoke: `curl -fsS http://<API_HUB>:<PORT>/health`, a GraphQL POST to `/graphql`, and **`dbis_core`** health through the hub. Simplest: `curl` **`http://<hub>:<port>/api-docs`** (proxied); use **`GET /health`** on upstream `:3000` through `/api/` only if it is mounted there — see [NON_CHAIN_ECOSYSTEM_PLAN_REVIEW_AND_GAPS.md](./NON_CHAIN_ECOSYSTEM_PLAN_REVIEW_AND_GAPS.md) §2.4.
7. Update inventory docs and VMID table; decommission retired CTs only after rollback window. Optionally **bind Apollo to 127.0.0.1:4000** or firewall **:4000** from LAN once NPM uses hub only ([NON_CHAIN_ECOSYSTEM_PLAN_REVIEW_AND_GAPS.md](./NON_CHAIN_ECOSYSTEM_PLAN_REVIEW_AND_GAPS.md) §2.5).
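The smoke commands in step 6 can be bundled into one helper; a sketch (the `HUB` / `PORT` defaults are illustrative — take the real values from your resolved `.env` overrides):

```shell
# Hub smoke sketch: health, GraphQL, and the dbis_core path route.
# HUB/PORT defaults are placeholders; override from your .env.
HUB="${HUB:-192.168.11.50}"
PORT="${PORT:-8080}"

smoke_hub() {
  curl -fsS "http://${HUB}:${PORT}/health" && echo              # hub health
  curl -fsS -X POST -H 'content-type: application/json' \
       -d '{"query":"{ __typename }"}' \
       "http://${HUB}:${PORT}/graphql" && echo                  # GraphQL via hub
  curl -fsS -o /dev/null -w 'api-docs %{http_code}\n' \
       "http://${HUB}:${PORT}/api-docs"                         # dbis_core via path route
}
# smoke_hub   # run from a LAN host during the maintenance window
```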
---
## 7. Related docs
- [SANKOFA_PHOENIX_CANONICAL_BOUNDARIES_AND_TAXONOMY.md](./SANKOFA_PHOENIX_CANONICAL_BOUNDARIES_AND_TAXONOMY.md)
- [SANKOFA_MARKETPLACE_SURFACES.md](../03-deployment/SANKOFA_MARKETPLACE_SURFACES.md)
- [ENTITY_INSTITUTIONS_WEB_PORTAL_COMPLETION.md](../03-deployment/ENTITY_INSTITUTIONS_WEB_PORTAL_COMPLETION.md)
- [SERVICE_DESCRIPTIONS.md](./SERVICE_DESCRIPTIONS.md)
- [NON_CHAIN_ECOSYSTEM_PLAN_REVIEW_AND_GAPS.md](./NON_CHAIN_ECOSYSTEM_PLAN_REVIEW_AND_GAPS.md) (gaps, inconsistencies, P0/P1 backlog)
---
## 8. Decision log (fill when adopted)
| Decision | Choice | Date |
|----------|--------|------|
| Web hub pattern | **TBD** (interim: discrete CTs; target: A / B / C) | |
| API hub Tier | **1** (nginx on VMID 7800, LAN 2026-04-13) | 2026-04-13 |
| Public API hostname | phoenix.sankofa.nexus (NPM → **8080** hub; Apollo **127.0.0.1:4000**) | 2026-04-13 |
| Retired VMIDs | none | |

@@ -1,10 +1,18 @@
# Sankofa Services - Service Descriptions
**Last Updated:** 2026-03-25
**Last Updated:** 2026-04-13
**Status:** Active Documentation
---
## Consolidated runtime (optional)
To reduce LXC count for **non-chain** web and to expose **one** Phoenix-facing API origin (GraphQL + `dbis_core` REST behind path routes), see [SANKOFA_PHOENIX_CONSOLIDATED_FRONTEND_AND_API.md](./SANKOFA_PHOENIX_CONSOLIDATED_FRONTEND_AND_API.md). `config/ip-addresses.conf` adds `IP_SANKOFA_WEB_HUB` and `IP_SANKOFA_PHOENIX_API_HUB` (defaulting to today's portal and Phoenix API IPs until you set hub LXCs in `.env`). Blockchain-adjacent stacks (explorer, RPC, relayers) stay **out** of this consolidation.
For **how** the non-chain fleet should be designed (edge cells, API hub, IdP, data) in hyperscaler-style terms—**excluding** the blockchain plane—see [NON_CHAIN_ECOSYSTEM_HYPERSCALER_STYLE_MODEL.md](./NON_CHAIN_ECOSYSTEM_HYPERSCALER_STYLE_MODEL.md).
---
## Brand and Product Relationship
### Company and Product Analogy
@@ -41,8 +49,8 @@ This document describes the purpose and function of each service in the Sankofa
- **Purpose:** Cloud infrastructure management portal (API service)
- **VMID:** 7800
- **IP:** 192.168.11.50
- **Port:** 4000
- **External Access:** https://phoenix.sankofa.nexus, https://www.phoenix.sankofa.nexus
- **Port:** **4000** (Apollo direct) and **`8080`** (optional Tier-1 **API hub** nginx: `/graphql` → 4000, `/api``dbis_core` on `IP_DBIS_API:3000`)
- **External Access:** https://phoenix.sankofa.nexus, https://www.phoenix.sankofa.nexus (NPM upstream may stay **4000** until you cut over to **8080**)
**Details:**
- GraphQL API service for Phoenix cloud platform

@@ -0,0 +1,92 @@
# Sankofa Phoenix API hub — NPM cutover and post-cutover
**Purpose:** Ordered steps when moving public `phoenix.sankofa.nexus` traffic from direct Apollo (`:4000`) to Tier-1 nginx on the Phoenix stack (`:8080` by default). Complements [SANKOFA_PHOENIX_CONSOLIDATED_FRONTEND_AND_API.md](../02-architecture/SANKOFA_PHOENIX_CONSOLIDATED_FRONTEND_AND_API.md) and [SANKOFA_R630_01_CONSOLIDATION_AND_HUB_PLACEMENT_GOAL.md](./SANKOFA_R630_01_CONSOLIDATION_AND_HUB_PLACEMENT_GOAL.md).
**Not covered here:** corporate apex, portal SSO, or Keycloak realm edits (see portal/Keycloak runbooks).
---
## 0. Preconditions
- API hub installed and healthy on LAN: `curl -sS "http://${IP_SANKOFA_PHOENIX_API:-192.168.11.50}:8080/health"` and a GraphQL POST to `/graphql` succeed.
- Backup: NPM export or UI backup, plus application-level backup if you change Phoenix/dbis systemd or env on the CT.
---
## 1. `dbis_core` (rate limits and `req.ip`)
1. Set `TRUST_PROXY=1` on the `dbis_core` process (see `scripts/deployment/ensure-dbis-api-trust-proxy-on-ct.sh` for VMIDs **10150** / **10151**).
2. **`TRUST_PROXY_HOPS`** (optional; default `1` in code): Express counts **reverse proxies that terminate the TCP connection to Node** — typically **one** (either NPM **or** the API hub), even when browsers traversed Cloudflare → NPM → hub → `dbis_core`. Raise hops only if your stack adds **another** reverse proxy **in series** directly in front of the same listener (uncommon). When unsure, leave unset and validate `req.ip` / rate-limit keys with two real client IPs.
3. Ensure **`ALLOWED_ORIGINS`** lists every browser origin that calls the API (portal, admin, studio, marketing SPAs as applicable). Production forbids `*`.
4. Restart `dbis_core` and confirm logs show no CORS or startup validation errors.
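A read-only spot check of those env keys on the CTs can look like this (run on the PVE host; the `/opt/dbis_core/.env` path is an assumption — adjust to wherever `dbis_core` actually loads its environment):

```shell
# Print TRUST_PROXY / TRUST_PROXY_HOPS / ALLOWED_ORIGINS from a dbis_core CT.
# The env-file path below is a guess for illustration only.
check_dbis_proxy_env() {
  local vmid="$1"
  echo "== CT ${vmid} =="
  pct exec "${vmid}" -- sh -c \
    'grep -E "^(TRUST_PROXY|TRUST_PROXY_HOPS|ALLOWED_ORIGINS)=" /opt/dbis_core/.env || echo unset'
}
# for vmid in 10150 10151; do check_dbis_proxy_env "$vmid"; done
```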
---
## 2. NPM fleet (`update-npmplus-proxy-hosts-api.sh`)
1. In repo `.env` (operator workstation), set:
- `SANKOFA_NPM_PHOENIX_PORT=8080`
- Optionally `IP_SANKOFA_NPM_PHOENIX_API=…` if the hub listens on a different LAN IP than `IP_SANKOFA_PHOENIX_API`.
2. Run the fleet script with valid `NPM_*` credentials (same as other NPM updates).
3. In NPM UI, confirm `phoenix.sankofa.nexus` and `www.phoenix.sankofa.nexus` forward WebSockets (subscriptions use `/graphql-ws`).
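A raw upgrade probe, independent of the project's smoke scripts, should return `HTTP/1.1 101` when NPM forwards WebSockets correctly; a sketch:

```shell
# Manual WebSocket upgrade probe; the first response line should read
# "HTTP/1.1 101 ..." when the upgrade path works end to end.
ws_probe() {
  local url="${1:-https://phoenix.sankofa.nexus/graphql-ws}"
  curl -isS --http1.1 --max-time "${PHOENIX_WSS_CURL_MAXTIME:-8}" \
    -H 'Connection: Upgrade' -H 'Upgrade: websocket' \
    -H 'Sec-WebSocket-Version: 13' \
    -H "Sec-WebSocket-Key: $(openssl rand -base64 16)" \
    "$url" | head -n 1
}
# ws_probe                                          # public, through NPM
# ws_probe "http://192.168.11.50:8080/graphql-ws"   # LAN, hub direct
```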
---
## 3. Verification
| Check | Command or action |
|--------|-------------------|
| Public HTTPS health | `curl -fsS "https://phoenix.sankofa.nexus/health"` (or the hub-exposed health path you standardized) |
| GraphQL | POST `https://phoenix.sankofa.nexus/graphql` with a trivial query |
| WebSocket upgrade (TLS + hub) | `bash scripts/verify/smoke-phoenix-graphql-wss-public.sh` expects **HTTP 101** via `curl --http1.1`; set `PHOENIX_WSS_INCLUDE_LAN=1` to also probe hub `:8080`, and `PHOENIX_WSS_CURL_MAXTIME` (default **8**s per probe — curl otherwise waits on the open WS). Full handshake: `pnpm run verify:phoenix-graphql-ws-subscription` (`connection_init` → `connection_ack`). If Node clients report **RSV1** on `/graphql-ws`, CT **7800** must not register `@fastify/websocket` alongside standalone `ws` — apply `scripts/deployment/ensure-sankofa-phoenix-graphql-ws-remove-fastify-websocket-7800.sh`. If the process crashes on WS disconnect, `websocket.ts` must import `logger``scripts/deployment/ensure-sankofa-phoenix-websocket-ts-import-logger-7800.sh`. Hub nginx headers: `scripts/deployment/ensure-sankofa-phoenix-api-hub-graphql-ws-proxy-headers-7800.sh` (`Accept-Encoding ""`, `proxy_buffering off` in `/graphql-ws`). Optional host guard: `scripts/deployment/ensure-sankofa-phoenix-7800-nft-dport-4000-guard.sh` + `config/nftables/sankofa-phoenix-7800-guard-dport-4000.nft`. |
| IRU / public limits | Hit a rate-limited route from two different public IPs and confirm keys differ (validates forwarded client IP) |
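For the forwarded-IP check, one hedged approach is to inspect rate-limit response headers from two different networks (the `RateLimit-*` header names assume a standard express-rate-limit setup; verify against your build):

```shell
# Dump any rate-limit response headers from the public GraphQL route.
# Run from two networks (e.g. office vs LTE): remaining counts should
# decrement independently if forwarded client IPs are keyed correctly.
rate_headers() {
  curl -sS -o /dev/null -D - -X POST \
    -H 'content-type: application/json' -d '{"query":"{ __typename }"}' \
    "https://phoenix.sankofa.nexus/graphql" \
    | grep -i 'ratelimit' || echo "no rate-limit headers exposed"
}
```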
---
## 4. Post-cutover hardening (dual path)
After NPM points at `:8080` and traffic is stable:
- **Bind Apollo to loopback** (recommended when hub upstream is `127.0.0.1:4000`):
`PROXMOX_OPS_APPLY=1` `PROXMOX_OPS_ALLOWED_VMIDS=7800` `bash scripts/deployment/ensure-sankofa-phoenix-apollo-bind-loopback-7800.sh --apply --vmid 7800`
Confirm VLAN cannot connect to `:4000`; hub `:8080` and public `https://phoenix.sankofa.nexus` still work. **Alternative:** host firewall on CT 7800 — see `scripts/deployment/plan-phoenix-apollo-port-4000-restrict-7800.sh --ssh`.
- **Hub `/graphql-ws` proxy headers** (idempotent; safe with existing installs):
`PROXMOX_OPS_APPLY=1` `PROXMOX_OPS_ALLOWED_VMIDS=7800` `bash scripts/deployment/ensure-sankofa-phoenix-api-hub-graphql-ws-proxy-headers-7800.sh --apply --vmid 7800`
- **Hub nginx `ExecReload`** (systemd, idempotent):
`PROXMOX_OPS_APPLY=1` `PROXMOX_OPS_ALLOWED_VMIDS=7800` `bash scripts/deployment/ensure-sankofa-phoenix-api-hub-systemd-exec-reload-7800.sh --apply --vmid 7800`
- **Phoenix API DB migrations** (after DB auth works):
`PROXMOX_OPS_APPLY=1` `PROXMOX_OPS_ALLOWED_VMIDS=7800` `bash scripts/deployment/ensure-sankofa-phoenix-api-db-migrate-up-7800.sh --apply --vmid 7800`
- **Phoenix API `.env` LAN parity** (Keycloak + Sankofa Postgres host, dedupe passwords, `NODE_ENV` policy, `TERMINATE_TLS_AT_EDGE`):
`source scripts/lib/load-project-env.sh` then
`PROXMOX_OPS_APPLY=1` `PROXMOX_OPS_ALLOWED_VMIDS=7800` `bash scripts/deployment/ensure-sankofa-phoenix-api-env-lan-parity-7800.sh --apply --vmid 7800`
Default appends **`NODE_ENV=development`** until `DB_PASSWORD` / `KEYCLOAK_CLIENT_SECRET` meet production length; use **`PHOENIX_API_NODE_ENV=production`** only after secrets and TLS policy are ready.
If Postgres returns **28P01** (auth failed), align **`DB_USER`** (typically **`sankofa`**, not `postgres`) and **`DB_PASSWORD`** with the **`sankofa`** role on VMID **7803** (`ALTER USER … PASSWORD` on the Postgres CT), then run **`ensure-sankofa-phoenix-api-db-migrate-up-7800.sh`** so **`audit_logs`** exists — see [ALL_VMIDS_ENDPOINTS.md](../04-configuration/ALL_VMIDS_ENDPOINTS.md).
For **`PHOENIX_API_NODE_ENV=production`** without local certs: run **`ensure-sankofa-phoenix-tls-config-terminate-at-edge-7800.sh`** first and keep **`TERMINATE_TLS_AT_EDGE=1`** in `.env`.
- Inventory: [ALL_VMIDS_ENDPOINTS.md](../04-configuration/ALL_VMIDS_ENDPOINTS.md) (Phoenix row + VMID 7800 table).
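After the loopback bind, the listening sockets on CT 7800 can be verified from the PVE host; a sketch (the expected rows are assumptions about the default binds):

```shell
# Show who owns :4000 and :8080 inside CT 7800.
# Expected after hardening: 127.0.0.1:4000 (Apollo) plus the hub nginx on :8080.
# Any 0.0.0.0:4000 row means the loopback bind did not take effect.
verify_bind_7800() {
  pct exec 7800 -- ss -ltn | awk 'NR == 1 || $4 ~ /:(4000|8080)$/'
}
```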
---
## 5. Rollback
1. Unset `SANKOFA_NPM_PHOENIX_PORT` or set it back to `4000` (or your direct Apollo port).
2. Re-run the NPM fleet script.
3. If `dbis_core` had `TRUST_PROXY_HOPS=2` only for the hub path, reduce hops or disable trust proxy per your direct topology.
---
## 6. References
- Installer: `scripts/deployment/install-sankofa-api-hub-nginx-on-pve.sh`
- Hub graphql-ws headers (live CT): `scripts/deployment/ensure-sankofa-phoenix-api-hub-graphql-ws-proxy-headers-7800.sh`
- Phoenix `websocket.ts` logger import (prevents crash on disconnect): `scripts/deployment/ensure-sankofa-phoenix-websocket-ts-import-logger-7800.sh`
- Phoenix API `.env` LAN parity: `scripts/deployment/ensure-sankofa-phoenix-api-env-lan-parity-7800.sh`
- Phoenix API DB migrate up (CT 7800): `scripts/deployment/ensure-sankofa-phoenix-api-db-migrate-up-7800.sh`
- Phoenix TLS (terminate at edge, production without local certs): `scripts/deployment/ensure-sankofa-phoenix-tls-config-terminate-at-edge-7800.sh`
- Hub unit `ExecReload`: `scripts/deployment/ensure-sankofa-phoenix-api-hub-systemd-exec-reload-7800.sh`
- LAN smoke: `scripts/verify/verify-sankofa-consolidated-hub-lan.sh`
- Hub GraphQL smoke: `scripts/verify/smoke-phoenix-api-hub-lan.sh`
- Public / LAN WebSocket upgrade smoke: `scripts/verify/smoke-phoenix-graphql-wss-public.sh`
- Loopback bind for Apollo: `scripts/deployment/ensure-sankofa-phoenix-apollo-bind-loopback-7800.sh`
- Read-only plan (firewall alternative): `scripts/deployment/plan-phoenix-apollo-port-4000-restrict-7800.sh` (`--ssh` on LAN)
- Example config syntax: `scripts/verify/check-sankofa-consolidated-nginx-examples.sh`
- Gap review: `docs/02-architecture/NON_CHAIN_ECOSYSTEM_PLAN_REVIEW_AND_GAPS.md`

@@ -0,0 +1,96 @@
# Goal: relieve r630-01 via consolidation + hub placement (not nginx alone)
**Status:** Operator goal / runbook
**Last updated:** 2026-04-13
## 1. What you are optimizing for
**Primary goal:** reduce **guest count** and **steady-state CPU / pressure** on **r630-01** (`192.168.11.11`) by:
1. **Retiring CTs** that only existed to serve **small, non-chain web** surfaces (static or low-SSR), after those surfaces are merged into a **single web hub** guest (or static export + nginx).
2. **Placing new hub LXCs** (nginx-only or low-RAM) on **less busy nodes** (typically **r630-03 / r630-04** per health reports), instead of stacking more edge services on r630-01.
3. **Optionally migrating** existing Sankofa / Phoenix / DBIS-related CTs **off** r630-01 when they are **not** chain-critical for that node.
**Non-goal:** expecting the **API hub nginx** colocated on VMID **7800** to materially lower r630-01 load. That pattern is for **routing simplicity** and a path to **fewer public upstreams**; load relief comes from **fewer guests** and **better placement**, not from reverse proxy CPU.
---
## 2. Current anchor facts (from inventory docs)
Treat `pct list` on each node as authoritative when planning; the table below is a **documentation snapshot** of common r630-01-adjacent workloads:
| Area | Typical on r630-01 today | Notes |
|------|---------------------------|--------|
| Sankofa Phoenix stack | **7800** API, **7801** portal, **7802** Keycloak, **7803** Postgres, **7806** public web | Tightly coupled for latency; migrations need cutover windows |
| DBIS API | **10150** (`IP_DBIS_API`) | Often co-dependent with Phoenix / portal flows |
| NPMplus | **10233** / **10234** (see `ALL_VMIDS_ENDPOINTS.md`) | Edge; may stay on r630-01 or follow your NPM HA policy |
| Chain-critical | **2101**, **2103** (Besu core lanes) | **Do not** “consolidate away” without chain runbooks |
---
## 3. Phased execution (explicit consolidation + placement)
### Phase 0 — Measure (read-only)
1. Latest cluster health JSON: `bash scripts/verify/poll-lxc-cluster-health.sh` (writes `reports/status/lxc_cluster_health_*.json`).
2. Rebalance **plan only**:
`bash scripts/verify/plan-lxc-rebalance-from-health-report.sh --source r630-01 --target r630-04 --limit 12`
Adjust `--target` to the node with **headroom** (load, PSI, storage). Review exclusions (chain-critical / infra patterns) in the script output.
3. Record **which VMIDs must stay** on r630-01 vs **candidates to move** in your change ticket.
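Alongside the health JSON, a quick manual per-node snapshot helps sanity-check the planner's picture before choosing `--target`; a sketch (assumes root SSH to the PVE hosts):

```shell
# Per-node load + guest-count snapshot (read-only).
node_snapshot() {
  local node
  for node in r630-01 r630-03 r630-04; do
    echo "== ${node} =="
    ssh "root@${node}" 'uptime; pct list | tail -n +2 | wc -l' || true
  done
}
# node_snapshot   # compare load averages and CT counts across nodes
```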
### Phase 1 — Consolidate **non-chain web** (fewer guests)
1. Architecture: [SANKOFA_PHOENIX_CONSOLIDATED_FRONTEND_AND_API.md](../02-architecture/SANKOFA_PHOENIX_CONSOLIDATED_FRONTEND_AND_API.md) (static-first vs one Node process).
2. Build static exports (or one monorepo SSR host) so **multiple FQDNs** can share **one nginx** `server_name` / `map $host` pattern (`config/nginx/sankofa-non-chain-frontends.example.conf`).
3. **Provision the web hub LXC on the target node** (not r630-01 if the goal is offload). Use a **new IP** from your IPAM; update `.env` overrides `IP_SANKOFA_WEB_HUB` / port when ready.
4. NPM dry-run → apply: point marketing / microsite hosts at the web hub upstream.
**Outcome:** retire legacy one-site-one-CT guests **after** TTL / rollback window.
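Before pointing NPM at the web hub, each consolidated FQDN can be exercised against the hub with an explicit `Host` header to confirm the `map $host` routing; a sketch (hostnames in the usage line are examples — substitute your real `server_name` list):

```shell
# Per-host status-code check against the web hub's map $host routing.
web_hub_check() {
  local hub="${1:?hub ip}" port="${2:?hub port}"; shift 2
  local host
  for host in "$@"; do
    printf '%-32s -> ' "$host"
    curl -fsS -o /dev/null -w '%{http_code}\n' \
      -H "Host: ${host}" "http://${hub}:${port}/" || echo 'FAIL'
  done
}
# web_hub_check "$IP_SANKOFA_WEB_HUB" "$SANKOFA_WEB_HUB_PORT" sankofa.nexus www.sankofa.nexus
```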
### Phase 2 — API hub **placement** (avoid piling onto r630-01)
**Today:** Tier-1 API hub nginx may be colocated on **7800** (same CT as Apollo) for a fast LAN proof — that does **not** reduce r630-01 guest count.
**Target pattern for load relief:**
1. Create a **small** Debian LXC on **r630-03 or r630-04** (dedicated “phoenix-api-hub” VMID), **only** nginx + `sankofa-phoenix-api-hub.service`.
2. Upstreams in that hub: `proxy_pass` to **LAN IPs** of **7800:4000** (GraphQL) and **10150:3000** (`dbis_core`) — cross-node proxy is fine on VLAN 11.
3. Run `install-sankofa-api-hub-nginx-on-pve.sh` with `--vmid <new-hub-vmid>` on the **target** node's PVE host (set `PROXMOX_HOST` if not r630-01).
4. NPM: point `phoenix.sankofa.nexus` to **hub IP:8080** (or keep **4000** direct until validated). Before declaring success, run **WebSocket** smoke (`graphql-ws` through NPM) and confirm **`dbis_core` `TRUST_PROXY`** + trusted proxy list include the hub (see [NON_CHAIN_ECOSYSTEM_PLAN_REVIEW_AND_GAPS.md](../02-architecture/NON_CHAIN_ECOSYSTEM_PLAN_REVIEW_AND_GAPS.md) §2.12.2).
5. **Disable / remove** hub nginx from **7800** if you no longer want dual stacks (maintenance window; validate `systemctl stop sankofa-phoenix-api-hub` on 7800 only after NPM uses the new hub).
**Outcome:** Phoenix CT can stay on r630-01 for DB locality, while **edge proxy RAM/CPU** sits on a lighter node — or later migrate 7800 itself after Phase 3.
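From the new hub CT, both cross-node upstreams should answer before NPM cutover; a sketch run on the target node's PVE host (`IP_DBIS_API` comes from `load-project-env.sh`; the `/health` paths follow the runbooks above):

```shell
# Reachability of the hub's two upstreams from inside the new hub CT.
hub_upstream_check() {
  local vmid="${1:?hub vmid}"
  pct exec "$vmid" -- curl -fsS -o /dev/null -w 'apollo    %{http_code}\n' \
    "http://192.168.11.50:4000/health"
  pct exec "$vmid" -- curl -fsS -o /dev/null -w 'dbis_core %{http_code}\n' \
    "http://${IP_DBIS_API:?source load-project-env.sh first}:3000/health"
}
# hub_upstream_check <new-hub-vmid>
```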
### Phase 3 — Migrate heavy CTs (optional, highest impact)
Use **scoped** `pct migrate` from the planner output. Rules from project safety:
- Named VMID list, **dry-run** first, maintenance window, rollback IP/NPM plan.
- After any move: update `get_host_for_vmid` in `scripts/lib/load-project-env.sh` and [ALL_VMIDS_ENDPOINTS.md](../04-configuration/ALL_VMIDS_ENDPOINTS.md).
### Phase 4 — Retire + verify
1. Destroy **only** CTs that are fully replaced (config backups, DNS, NPM rows removed).
2. Re-run health poll + E2E verifier profile for public hosts you moved.
---
## 4. Decision record (fill as you execute)
| Decision | Choice | Date |
|----------|--------|------|
| Web hub target node | r630-0? | |
| API hub target node (nginx-only LXC) | r630-0? | |
| NPM phoenix upstream | :4000 direct / :8080 hub | |
| VMIDs retired after consolidation | | |
---
## 5. Related references
- [NON_CHAIN_ECOSYSTEM_HYPERSCALER_STYLE_MODEL.md](../02-architecture/NON_CHAIN_ECOSYSTEM_HYPERSCALER_STYLE_MODEL.md) (cell types, edge plane vs chain plane)
- [SANKOFA_PHOENIX_CONSOLIDATED_FRONTEND_AND_API.md](../02-architecture/SANKOFA_PHOENIX_CONSOLIDATED_FRONTEND_AND_API.md)
- [PROXMOX_LOAD_BALANCING_RUNBOOK.md](../04-configuration/PROXMOX_LOAD_BALANCING_RUNBOOK.md)
- [ALL_VMIDS_ENDPOINTS.md](../04-configuration/ALL_VMIDS_ENDPOINTS.md)
- `scripts/deployment/install-sankofa-api-hub-nginx-on-pve.sh`
- `scripts/verify/verify-sankofa-consolidated-hub-lan.sh`

@@ -334,7 +334,7 @@ The following VMIDs were the older `25xx` RPC identities before the `21xx/22xx/2
| VMID | IP Address | Hostname | Status | Endpoints | Purpose |
|------|------------|----------|--------|-----------|---------|
| 7800 | 192.168.11.50 | sankofa-api-1 | ✅ Running | GraphQL: 4000, Health: /health | Phoenix API (Cloud Platform Portal) |
| 7800 | 192.168.11.50 | sankofa-api-1 | ✅ Running | **Apollo :4000** loopback-only (`HOST=127.0.0.1`); **Tier-1 hub :8080** (`/graphql`→127.0.0.1:4000); hub `/health` | Phoenix API (Cloud Platform Portal) |
| 7801 | 192.168.11.51 | sankofa-portal-1 | ✅ Running | Web: 3000 | Hybrid cloud **client portal** (`portal.sankofa.nexus` / `admin.sankofa.nexus` when NPM routes); not the long-term corporate apex app — see `IP_SANKOFA_PUBLIC_WEB` / `sync-sankofa-public-web-to-ct.sh` |
| 7802 | 192.168.11.52 | sankofa-keycloak-1 | ✅ Running | Keycloak: 8080, Admin: /admin | Identity and Access Management |
| 7803 | 192.168.11.53 | sankofa-postgres-1 | ✅ Running | PostgreSQL: 5432 | Database Service |
@@ -346,8 +346,8 @@ The following VMIDs were the older `25xx` RPC identities before the `21xx/22xx/2
- `sankofa.nexus` / `www.sankofa.nexus`**`IP_SANKOFA_PUBLIC_WEB`:`SANKOFA_PUBLIC_WEB_PORT`** (canonical current target: **7806** `192.168.11.63:3000`). Fleet script: `scripts/nginx-proxy-manager/update-npmplus-proxy-hosts-api.sh`. **`www`** → **301** → apex `https://sankofa.nexus` (`$request_uri`). ✅
- `portal.sankofa.nexus` / `admin.sankofa.nexus`**`IP_SANKOFA_CLIENT_SSO`:`SANKOFA_CLIENT_SSO_PORT`** (typical: 7801 `:3000`). NextAuth / OIDC public URL: **`https://portal.sankofa.nexus`**. ✅ when NPM proxy rows exist (fleet script creates/updates them).
- `dash.sankofa.nexus` → Set **`IP_SANKOFA_DASH`** (+ `SANKOFA_DASH_PORT`) in `config/ip-addresses.conf` to enable upstream in the fleet script; IP allowlist at NPM is operator policy. 🔶 until dash app + env are set.
- `phoenix.sankofa.nexus`Routes to `http://192.168.11.50:4000` (Phoenix API/VMID 7800) ✅
- `www.phoenix.sankofa.nexus` → Same upstream; **301** to **`https://phoenix.sankofa.nexus`**. ✅
- `phoenix.sankofa.nexus`NPM upstream **`http://192.168.11.50:8080`** (Tier-1 API hub on VMID **7800**; WebSocket upgrades **on**). Apollo listens on **`127.0.0.1:4000`** only (not reachable from VLAN); hub proxies to loopback. ✅ (2026-04-13 fleet + loopback bind)
- `www.phoenix.sankofa.nexus` → Same **:8080** upstream; **301** to **`https://phoenix.sankofa.nexus`**. ✅
- `the-order.sankofa.nexus` / `www.the-order.sankofa.nexus` → OSJ management portal (secure auth). App source: **the_order** at `~/projects/the_order`. NPMplus default upstream: **order-haproxy** `http://192.168.11.39:80` (VMID **10210**), which proxies to Sankofa portal `http://192.168.11.51:3000` (7801). Fallback: set `THE_ORDER_UPSTREAM_IP` / `THE_ORDER_UPSTREAM_PORT` to `.51` / `3000` if HAProxy is offline. **`www.the-order.sankofa.nexus`** → **301** **`https://the-order.sankofa.nexus`** (same as `www.sankofa` / `www.phoenix`).
- `studio.sankofa.nexus` → Routes to `http://192.168.11.72:8000` (Sankofa Studio / VMID 7805; app-owned `/``/studio/` redirect)
@@ -614,13 +614,14 @@ This section lists all endpoints that should be configured in NPMplus, extracted
| `secure.mim4u.org` | `192.168.11.37` | `http` | `80` | ❌ No | MIM4U Secure Portal (VMID 7810) |
| `training.mim4u.org` | `192.168.11.37` | `http` | `80` | ❌ No | MIM4U Training Portal (VMID 7810) |
| **Sankofa Phoenix Services** |
| *(optional hub)* | **`IP_SANKOFA_WEB_HUB`** / **`IP_SANKOFA_PHOENIX_API_HUB`** (default in `config/ip-addresses.conf` = portal / Phoenix API until `.env` overrides) | `http` | per hub nginx | ❌ No | Consolidated non-chain web + path API hub — see `docs/02-architecture/SANKOFA_PHOENIX_CONSOLIDATED_FRONTEND_AND_API.md` |
| `sankofa.nexus` | **`IP_SANKOFA_PUBLIC_WEB`** (`192.168.11.63` on VMID 7806 in the current deployment) | `http` | **`SANKOFA_PUBLIC_WEB_PORT`** (`3000`) | ❌ No | Corporate apex; fleet script `update-npmplus-proxy-hosts-api.sh` |
| `www.sankofa.nexus` | same as apex | `http` | same | ❌ No | **301**`https://sankofa.nexus` |
| `portal.sankofa.nexus` | **`IP_SANKOFA_CLIENT_SSO`** (typ. `.51` / 7801) | `http` | **`SANKOFA_CLIENT_SSO_PORT`** (`3000`) | ❌ No | Client SSO portal; `NEXTAUTH_URL=https://portal.sankofa.nexus` |
| `admin.sankofa.nexus` | same as portal | `http` | same | ❌ No | Client access admin (same upstream until split) |
| `dash.sankofa.nexus` | **`IP_SANKOFA_DASH`** (set in `ip-addresses.conf`) | `http` | **`SANKOFA_DASH_PORT`** | ❌ No | Operator dash — row omitted from fleet script until `IP_SANKOFA_DASH` set |
| `phoenix.sankofa.nexus` | `192.168.11.50` | `http` | `4000` | ❌ No | Phoenix API - Cloud Platform Portal (VMID 7800) ✅ **Deployed** |
| `www.phoenix.sankofa.nexus` | `192.168.11.50` | `http` | `4000` | ❌ No | Phoenix API (VMID 7800) ✅ **Deployed** |
| `phoenix.sankofa.nexus` | `192.168.11.50` | `http` | **`8080`** (Tier-1 **API hub** nginx; `/graphql`→**127.0.0.1:4000**, `/api`→dbis_core); **WebSocket: yes** | ❌ No | NPM fleet: `SANKOFA_NPM_PHOENIX_PORT=8080`; Apollo **not** on `0.0.0.0:4000` (loopback bind); break-glass: `pct exec 7800``curl http://127.0.0.1:4000/health` |
| `www.phoenix.sankofa.nexus` | `192.168.11.50` | `http` | **`8080`** | ❌ No | Same; **301** → apex HTTPS |
| `the-order.sankofa.nexus`, `www.the-order.sankofa.nexus` | `192.168.11.39` (10210 HAProxy; default) or `192.168.11.51` (direct portal if env override) | `http` | `80` or `3000` | ❌ No | NPM → **.39:80** by default; HAProxy → **.51:3000** |
| `studio.sankofa.nexus` | `192.168.11.72` | `http` | `8000` | ❌ No | Sankofa Studio (FusionAI Creator) — VMID 7805 |
@@ -648,7 +649,7 @@ Some domains use path-based routing in NPM configs:
| `sankofa.nexus`, `www.sankofa.nexus` | **Public web:** **7806**, 192.168.11.63:3000 (`IP_SANKOFA_PUBLIC_WEB`) | 192.168.11.140 (Blockscout) |
| `portal.sankofa.nexus`, `admin.sankofa.nexus` | **7801**, 192.168.11.51:3000 (`IP_SANKOFA_CLIENT_SSO`) | 192.168.11.140 (Blockscout) |
| `dash.sankofa.nexus` | Set **`IP_SANKOFA_DASH`** when operator dash exists | 192.168.11.140 (Blockscout) |
| `phoenix.sankofa.nexus`, `www.phoenix.sankofa.nexus` | 7800, 192.168.11.50:4000 | 192.168.11.140 (Blockscout) |
| `phoenix.sankofa.nexus`, `www.phoenix.sankofa.nexus` | **7800**, `192.168.11.50:8080` (NPM → hub); Apollo **:4000** on same CT behind hub | 192.168.11.140 (Blockscout) |
| `the-order.sankofa.nexus`, `www.the-order.sankofa.nexus` | 10210, 192.168.11.39:80 | 192.168.11.140 (Blockscout) |
| `studio.sankofa.nexus` | 7805, 192.168.11.72:8000 | — |

@@ -88,7 +88,9 @@
| **Sankofa / Phoenix public vs portal vs admin endpoints (fix list)** | [03-deployment/SANKOFA_PHOENIX_PUBLIC_PORTAL_ADMIN_ENDPOINT_CORRECTION_TASKS.md](03-deployment/SANKOFA_PHOENIX_PUBLIC_PORTAL_ADMIN_ENDPOINT_CORRECTION_TASKS.md) | — |
| **Sankofa marketplace surfaces** (native vs partner offerings; IRU catalog vs portal SSO vs Studio landing) | [03-deployment/SANKOFA_MARKETPLACE_SURFACES.md](03-deployment/SANKOFA_MARKETPLACE_SURFACES.md) | — |
| **Entity institutions** (Aseret, TAJ, Solace Bank Group — web/portal completion tracker) | [03-deployment/ENTITY_INSTITUTIONS_WEB_PORTAL_COMPLETION.md](03-deployment/ENTITY_INSTITUTIONS_WEB_PORTAL_COMPLETION.md) | Code: `~/projects/Aseret_Bank`, `~/projects/TAJ_PSFO/web`, `~/projects/Solace_Bank_Group/web`; static: [`solace-bank-group-portal/`](../solace-bank-group-portal/) |
| **Sankofa / Phoenix consolidated runtime** (single non-chain web hub + single API hub — resource model) | [02-architecture/SANKOFA_PHOENIX_CONSOLIDATED_FRONTEND_AND_API.md](02-architecture/SANKOFA_PHOENIX_CONSOLIDATED_FRONTEND_AND_API.md) | Examples + systemd: `config/nginx/sankofa-*.example.conf`, `config/systemd/sankofa-*-hub-nginx.service.example`, [`config/compose/sankofa-consolidated-runtime.example.yml`](../config/compose/sankofa-consolidated-runtime.example.yml); verify [`scripts/verify/check-sankofa-consolidated-nginx-examples.sh`](../scripts/verify/check-sankofa-consolidated-nginx-examples.sh); plan [`scripts/deployment/plan-sankofa-consolidated-hub-cutover.sh`](../scripts/deployment/plan-sankofa-consolidated-hub-cutover.sh) |
| **Non-chain ecosystem (hyperscaler-style cells, excl. blockchain plane)** | [02-architecture/NON_CHAIN_ECOSYSTEM_HYPERSCALER_STYLE_MODEL.md](02-architecture/NON_CHAIN_ECOSYSTEM_HYPERSCALER_STYLE_MODEL.md) | Edge, API hub, IdP, data cells; chain CTs stay separate |
| **Non-chain plan — gap analysis & backlog** | [02-architecture/NON_CHAIN_ECOSYSTEM_PLAN_REVIEW_AND_GAPS.md](02-architecture/NON_CHAIN_ECOSYSTEM_PLAN_REVIEW_AND_GAPS.md) | `TRUST_PROXY`, WebSockets, CORS, NPM vs ALB, `get_host_for_vmid`, dual-port exposure |
| **Sankofa / Phoenix consolidated runtime** (single non-chain web hub + single API hub — resource model) | [02-architecture/SANKOFA_PHOENIX_CONSOLIDATED_FRONTEND_AND_API.md](02-architecture/SANKOFA_PHOENIX_CONSOLIDATED_FRONTEND_AND_API.md); **r630-01 offload goal (phases + placement):** [03-deployment/SANKOFA_R630_01_CONSOLIDATION_AND_HUB_PLACEMENT_GOAL.md](03-deployment/SANKOFA_R630_01_CONSOLIDATION_AND_HUB_PLACEMENT_GOAL.md); **NPM hub cutover:** [03-deployment/SANKOFA_API_HUB_NPM_CUTOVER_AND_POST_CUTOVER_RUNBOOK.md](03-deployment/SANKOFA_API_HUB_NPM_CUTOVER_AND_POST_CUTOVER_RUNBOOK.md) | Examples + systemd: `config/nginx/sankofa-*.example.conf`, `config/systemd/sankofa-*-hub-nginx.service.example`, [`config/compose/sankofa-consolidated-runtime.example.yml`](../config/compose/sankofa-consolidated-runtime.example.yml); `bash scripts/verify/check-sankofa-consolidated-nginx-examples.sh`; `bash scripts/verify/verify-sankofa-consolidated-hub-lan.sh`; `bash scripts/verify/smoke-phoenix-api-hub-lan.sh`; **`pnpm run verify:phoenix-graphql-wss`** (HTTP 101 WS upgrade); **`pnpm run verify:phoenix-graphql-ws-subscription`** (`connection_ack`); [`scripts/deployment/ensure-sankofa-phoenix-graphql-ws-remove-fastify-websocket-7800.sh`](../scripts/deployment/ensure-sankofa-phoenix-graphql-ws-remove-fastify-websocket-7800.sh); [`scripts/deployment/ensure-sankofa-phoenix-7800-nft-dport-4000-guard.sh`](../scripts/deployment/ensure-sankofa-phoenix-7800-nft-dport-4000-guard.sh); [`scripts/deployment/ensure-sankofa-phoenix-api-hub-graphql-ws-proxy-headers-7800.sh`](../scripts/deployment/ensure-sankofa-phoenix-api-hub-graphql-ws-proxy-headers-7800.sh); [`scripts/deployment/ensure-sankofa-phoenix-api-env-lan-parity-7800.sh`](../scripts/deployment/ensure-sankofa-phoenix-api-env-lan-parity-7800.sh); [`scripts/deployment/ensure-sankofa-phoenix-api-db-migrate-up-7800.sh`](../scripts/deployment/ensure-sankofa-phoenix-api-db-migrate-up-7800.sh); plan [`scripts/deployment/plan-sankofa-consolidated-hub-cutover.sh`](../scripts/deployment/plan-sankofa-consolidated-hub-cutover.sh); **Apollo loopback on 7800:** [`scripts/deployment/ensure-sankofa-phoenix-apollo-bind-loopback-7800.sh`](../scripts/deployment/ensure-sankofa-phoenix-apollo-bind-loopback-7800.sh); **Firewall plan (read-only):** [`scripts/deployment/plan-phoenix-apollo-port-4000-restrict-7800.sh`](../scripts/deployment/plan-phoenix-apollo-port-4000-restrict-7800.sh); **API hub install (PVE):** [`scripts/deployment/install-sankofa-api-hub-nginx-on-pve.sh`](../scripts/deployment/install-sankofa-api-hub-nginx-on-pve.sh); **dbis `TRUST_PROXY` on CT:** [`scripts/deployment/ensure-dbis-api-trust-proxy-on-ct.sh`](../scripts/deployment/ensure-dbis-api-trust-proxy-on-ct.sh); CI: [`.github/workflows/validate-sankofa-nginx-examples.yml`](../.github/workflows/validate-sankofa-nginx-examples.yml) |
| **IP conflict resolutions** | [reports/status/IP_CONFLICTS_RESOLUTION_COMPLETE.md](../reports/status/IP_CONFLICTS_RESOLUTION_COMPLETE.md), `scripts/resolve-ip-conflicts.sh` | — |
| **Wormhole AI docs (LLM / MCP / RAG)** | [04-configuration/WORMHOLE_AI_RESOURCES_LLM_PLAYBOOK.md](04-configuration/WORMHOLE_AI_RESOURCES_LLM_PLAYBOOK.md), [04-configuration/WORMHOLE_AI_RESOURCES_RAG.md](04-configuration/WORMHOLE_AI_RESOURCES_RAG.md), `scripts/doc/sync-wormhole-ai-resources.sh`, `scripts/verify/verify-wormhole-ai-docs-setup.sh`, [`mcp-wormhole-docs/`](../mcp-wormhole-docs/) | Wormhole protocol reference only — not Chain 138 canonical addresses (use [11-references/EXPLORER_TOKEN_LIST_CROSSCHECK.md](11-references/EXPLORER_TOKEN_LIST_CROSSCHECK.md), CCIP runbooks for 138) |