# Phase 1 — Reality mapping runbook
**Last updated:** 2026-03-28
**Purpose:** Operational steps for [dbis_chain_138_technical_master_plan.md](../../dbis_chain_138_technical_master_plan.md) Sections 3 and 19.1–19.3: inventory Proxmox, Besu, and optional Hyperledger CTs, and record dependency context.
**Outputs:** Timestamped report under `reports/phase1-discovery/` (created by the orchestrator script).
**Pass / fail semantics:** the orchestrator still writes a full evidence report when a critical section fails, but it now exits **non-zero** and appends a final **Critical failure summary** section. Treat the markdown as evidence capture, not automatic proof of success.
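
One way to act on these semantics in a wrapper or CI job is to check both signals: the exit status and the summary heading. The sketch below is illustrative only. `classify_phase1_run` is a hypothetical helper, not part of the repo, and it assumes the summary heading contains the literal text `Critical failure summary`. Even a clean result is reported as `REVIEW`, because the report is evidence rather than proof.

```bash
# Hypothetical helper (not part of the repo): classify a finished run.
#   $1 = exit status of run-phase1-discovery.sh
#   $2 = path to the markdown report written by that run
# Assumes the summary heading contains the text "Critical failure summary".
classify_phase1_run() {
  if [ "$1" -ne 0 ]; then
    echo "FAIL"      # non-zero exit: at least one critical section failed
  elif grep -q "Critical failure summary" "$2"; then
    echo "FAIL"      # summary present despite exit 0: treat as a failure
  else
    echo "REVIEW"    # exit 0 still calls for a human read of the evidence
  fi
}

# Usage sketch ("$report" is whichever report the run just wrote):
#   bash scripts/verify/run-phase1-discovery.sh; classify_phase1_run "$?" "$report"
```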
---
## Prerequisites
- Run from the repo root; `jq` is recommended for the template audit.
- **LAN:** SSH keys to Proxmox nodes (default `192.168.11.10`, `.11`, `.12` from `config/ip-addresses.conf`).
- Optional: `curl` for RPC probe.
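
A quick preflight can confirm that the tools listed above are on `PATH` before a run. This is a minimal sketch: `missing_tools` is a hypothetical helper, and the orchestrator may perform its own checks. Calling `missing_tools ssh jq curl` prints nothing when all three are installed and otherwise prints each missing name on its own line.

```bash
# Hypothetical preflight helper (not part of the repo):
# print each named tool that cannot be found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}
```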
---
## One-command orchestrator
```bash
bash scripts/verify/run-phase1-discovery.sh
```
Optional Hyperledger container smoke checks (SSH to r630-02, `pct exec`):
```bash
HYPERLEDGER_PROBE=1 bash scripts/verify/run-phase1-discovery.sh
```
Each run writes:
- `reports/phase1-discovery/phase1-discovery-YYYYMMDD_HHMMSS.md`: human-readable report with embedded diagram and command output.
- `reports/phase1-discovery/phase1-discovery-YYYYMMDD_HHMMSS.log`: log mirror of the same content.
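
Because the `YYYYMMDD_HHMMSS` timestamp sorts lexicographically, the newest report can be picked out by file name alone. A minimal sketch, assuming the naming pattern above; `latest_phase1_report` is a hypothetical helper, not a repo script. It prints nothing and returns non-zero when no report exists.

```bash
# Hypothetical helper (not part of the repo): print the newest markdown report.
#   $1 = report directory (defaults to reports/phase1-discovery)
latest_phase1_report() {
  dir="${1:-reports/phase1-discovery}"
  latest=""
  for f in "$dir"/phase1-discovery-*.md; do
    [ -e "$f" ] || continue   # skip the unexpanded pattern when nothing matches
    latest="$f"               # globs expand in sorted order, so the last match is newest
  done
  [ -n "$latest" ] && printf '%s\n' "$latest"
}
```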
Critical sections for exit status:
- Proxmox template audit
- `pvecm` / `pvesm` / `pct list` / `qm list`
- Chain 138 core RPC quick probe
- `check-chain138-rpc-health.sh`
- `verify-besu-enodes-and-ips.sh`
- optional Hyperledger CT probe when `HYPERLEDGER_PROBE=1`
See also `reports/phase1-discovery/README.md`.
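
For the RPC quick probe, a manual spot check can confirm that an endpoint answers as Chain 138. The sketch below is an assumption about what a useful check looks like, not a copy of the orchestrator's probe. `is_chain138_reply` is a hypothetical helper that parses the reply to the standard `eth_chainId` JSON-RPC call and succeeds only when the result equals 138 (`0x8a`). `RPC_URL` in the usage comment is a placeholder to set from your own configuration.

```bash
# Hypothetical helper (not part of the repo):
# succeed only if a JSON-RPC eth_chainId reply reports chain 138.
is_chain138_reply() {
  # Pull the hex "result" string out of the reply without requiring jq.
  hex=$(printf '%s' "$1" |
    sed -n 's/.*"result"[[:space:]]*:[[:space:]]*"\(0x[0-9a-fA-F][0-9a-fA-F]*\)".*/\1/p')
  # An empty match (error reply, missing result) fails before the arithmetic runs.
  [ -n "$hex" ] && [ "$(($hex))" -eq 138 ]
}

# Usage sketch (RPC_URL is a placeholder, not a value from this runbook):
#   reply=$(curl -s -X POST -H 'Content-Type: application/json' \
#     --data '{"jsonrpc":"2.0","method":"eth_chainId","params":[],"id":1}' "$RPC_URL")
#   is_chain138_reply "$reply" && echo "chain 138 confirmed"
```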
---
## Dependency graph (logical)
Ingress → RPC/sentries/validators → explorer; CCIP relay on r630-01 uses public RPC; FireFly/Fabric/Indy are optional DLT sides for the Section 18 flow.
```mermaid
flowchart TB
subgraph edge [EdgeIngress]
CF[Cloudflare_DNS]
NPM[NPMplus_LXC]
end
subgraph besu [Chain138_Besu]
RPCpub[RPC_public_2201]
RPCcore[RPC_core_2101]
Val[Validators_1000_1004]
Sen[Sentries_1500_1508]
end
subgraph observe [Observability]
BS[Blockscout_5000]
end
subgraph relay [CrossChain]
CCIP[CCIP_relay_r63001_host]
end
subgraph dlt [Hyperledger_optional]
FF[FireFly_6200_6201]
Fab[Fabric_6000_plus]
Indy[Indy_6400_plus]
end
CF --> NPM
NPM --> RPCpub
NPM --> RPCcore
NPM --> BS
RPCpub --> Sen
RPCcore --> Sen
Sen --> Val
CCIP --> RPCpub
FF --> Fab
FF --> Indy
```
**References:** [PROXMOX_VE_OPERATIONAL_DEPLOYMENT_TEMPLATE.md](PROXMOX_VE_OPERATIONAL_DEPLOYMENT_TEMPLATE.md), [ALL_VMIDS_ENDPOINTS.md](../04-configuration/ALL_VMIDS_ENDPOINTS.md), [NETWORK_CONFIGURATION_MASTER.md](../11-references/NETWORK_CONFIGURATION_MASTER.md).
---
## Manual follow-ups
| Task | Command / doc |
|------|----------------|
| Template vs live VMIDs | `bash scripts/verify/audit-proxmox-operational-template.sh` |
| Besu configs | `bash scripts/audit-besu-configs.sh` (review before running; LAN) |
| IP audit | `bash scripts/audit-all-vm-ips.sh` |
| Node role constitution | [DBIS_NODE_ROLE_MATRIX.md](../02-architecture/DBIS_NODE_ROLE_MATRIX.md) |
---
## ML110 documentation reconciliation
The **physical inventory** summary must match the host's **live** role:
- If `192.168.11.10` still runs **Proxmox** and hosts guests, state that explicitly.
- If migration to **OPNsense/pfSense WAN aggregator** is in progress or complete, align with [NETWORK_CONFIGURATION_MASTER.md](../11-references/NETWORK_CONFIGURATION_MASTER.md) and [PHYSICAL_HARDWARE_INVENTORY.md](../02-architecture/PHYSICAL_HARDWARE_INVENTORY.md).
Use `pvecm status` and `pct list` on `.10` from the orchestrator output as evidence.
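
To turn the `pct list` evidence into a quick answer, count the container rows in the captured output. A minimal sketch: `count_pct_guests` is a hypothetical helper that reads `pct list` output on stdin and counts lines beginning with a numeric VMID, which skips the header row. A non-zero count on `.10` supports stating that the host still runs guests; it does not cover VMs, which `qm list` reports separately.

```bash
# Hypothetical helper (not part of the repo):
# count container rows in `pct list` output read from stdin.
count_pct_guests() {
  # Rows start with a numeric VMID; the header line does not.
  # grep -c exits non-zero on a zero count, so normalise the status.
  grep -c '^[[:space:]]*[0-9]' || true
}

# Usage sketch (the root login is an assumption about your SSH setup):
#   ssh root@192.168.11.10 pct list | count_pct_guests
```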
---
## Related
- [DBIS_NODE_ROLE_MATRIX.md](../02-architecture/DBIS_NODE_ROLE_MATRIX.md)
- [DBIS_PHASE2_PROXMOX_SOVEREIGNIZATION_ROADMAP.md](../02-architecture/DBIS_PHASE2_PROXMOX_SOVEREIGNIZATION_ROADMAP.md)
- [DBIS_PHASE3_E2E_PRODUCTION_SIMULATION_RUNBOOK.md](DBIS_PHASE3_E2E_PRODUCTION_SIMULATION_RUNBOOK.md)