# DBIS Phase 2 — Proxmox sovereignization roadmap
**Last updated:** 2026-03-28
**Purpose:** Close the gap between **today's** Proxmox footprint (2-3 active cluster nodes, ZFS/LVM-backed guests, VLAN 11 LAN) and the **target** in [dbis_chain_138_technical_master_plan.md](../../dbis_chain_138_technical_master_plan.md) Sections 4-5 and 8 (multi-node HA, Ceph-backed storage, stronger segmentation, standardized templates).
**Current ground truth:** [PROXMOX_VE_OPERATIONAL_DEPLOYMENT_TEMPLATE.md](../03-deployment/PROXMOX_VE_OPERATIONAL_DEPLOYMENT_TEMPLATE.md), [config/proxmox-operational-template.json](../../config/proxmox-operational-template.json), [STORAGE_GROWTH_AND_HEALTH.md](../04-configuration/STORAGE_GROWTH_AND_HEALTH.md).
---
## Current state (summary)
| Area | As deployed (typical) | Master plan target |
|------|----------------------|-------------------|
| Cluster | Corosync cluster **h** on ml110 + r630-01 + r630-02 (ml110 **may** be repurposed — verify Phase 1) | 3+ control-oriented nodes, odd quorum, HA services |
| Storage | Local ZFS / LVM thin pools per host | Ceph OSD tier + pools for VM disks and/or RBD |
| Network | Primary **192.168.11.0/24**, VLAN 11, UDM Pro edge, NPMplus ingress | Additional VLANs: storage replication, validator-only, identity, explicit DMZ mapping |
| Workloads | Chain 138 Besu validators/RPC, Hyperledger CTs, apps — see [DBIS_NODE_ROLE_MATRIX.md](DBIS_NODE_ROLE_MATRIX.md) | Same roles, **template-standardized** provisioning |
---
## Milestone 1 — Cluster quorum and fleet expansion
- Bring **r630-03+** online per [R630_13_NODE_DOD_HA_MASTER_PLAN.md](R630_13_NODE_DOD_HA_MASTER_PLAN.md) and [11-references/13_NODE_AND_ASSETS_BRING_ONLINE_CHECKLIST.md](../11-references/13_NODE_AND_ASSETS_BRING_ONLINE_CHECKLIST.md).
- Maintain an **odd** node count for Corosync quorum; add a QDevice if the count is temporarily even during the ml110 migration ([UDM_PRO_PROXMOX_CLUSTER.md](../04-configuration/UDM_PRO_PROXMOX_CLUSTER.md)).
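
The odd-count rule follows from majority voting: quorum needs more than half of all expected votes, so an even-sized cluster tolerates no more failures than the odd-sized cluster one node smaller. The sketch below works through that arithmetic. It assumes one vote per node and one vote for a QDevice; the function names are illustrative and not part of any Proxmox tooling.

```python
def quorum_threshold(total_votes: int) -> int:
    """Votes needed for quorum: a strict majority of all expected votes."""
    return total_votes // 2 + 1


def tolerated_failures(nodes: int, qdevice: bool = False) -> int:
    """Number of vote holders that can be lost while the rest stay quorate.

    Assumes every node carries one vote and an optional QDevice adds one more.
    """
    total_votes = nodes + (1 if qdevice else 0)
    return total_votes - quorum_threshold(total_votes)


if __name__ == "__main__":
    # (nodes, qdevice) combinations relevant to the ml110 migration window.
    for nodes, qdevice in [(2, False), (2, True), (3, False), (4, False), (5, False)]:
        label = f"{nodes} nodes" + (" + qdevice" if qdevice else "")
        print(f"{label}: tolerates {tolerated_failures(nodes, qdevice)} failure(s)")
```
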
---
## Milestone 2 — ML110 migration / WAN aggregator
- **Before** repurposing ml110 to OPNsense/pfSense ([ML110_OPNSENSE_PFSENSE_WAN_AGGREGATOR.md](../11-references/ML110_OPNSENSE_PFSENSE_WAN_AGGREGATOR.md)): migrate all remaining CTs/VMs to the R630s ([NETWORK_CONFIGURATION_MASTER.md](../11-references/NETWORK_CONFIGURATION_MASTER.md)).
- Re-document the **physical inventory** row for `.10` after cutover ([PHYSICAL_HARDWARE_INVENTORY.md](PHYSICAL_HARDWARE_INVENTORY.md)).
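
One way to gate the cutover is to confirm that no guest is still placed on ml110. The sketch below filters a guest inventory shaped like the JSON that `pvesh get /cluster/resources --type vm --output-format json` is expected to return (entries with `node`, `type`, and `vmid`). The sample records, VMIDs, and guest names are hypothetical and stand in for real cluster output.

```python
import json


def guests_on_node(resources: list[dict], node: str) -> list[dict]:
    """Return the qemu/lxc entries that are still placed on the given node."""
    return [
        r for r in resources
        if r.get("node") == node and r.get("type") in ("qemu", "lxc")
    ]


# Hypothetical sample; in practice, load the JSON emitted by the cluster API.
SAMPLE = json.loads("""
[
  {"vmid": 9001, "type": "lxc",  "node": "ml110",   "name": "example-ct"},
  {"vmid": 9002, "type": "qemu", "node": "r630-01", "name": "example-vm"},
  {"vmid": 9003, "type": "lxc",  "node": "r630-02", "name": "example-ct-2"}
]
""")

if __name__ == "__main__":
    remaining = guests_on_node(SAMPLE, "ml110")
    if remaining:
        # Anything listed here must be migrated before ml110 is repurposed.
        for guest in remaining:
            print(f"still on ml110: {guest['type']} {guest['vmid']} ({guest['name']})")
    else:
        print("ml110 is empty; safe to proceed with cutover checks")
```
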
---
## Milestone 3 — Ceph introduction (decision + prerequisites)
- **Decision record:** whether Ceph replaces or complements ZFS/LVM for new workloads; minimum network (10G storage net, jumbo frames if used), disk layout, and JBOD attachment per [HARDWARE_INVENTORY_MASTER.md](../11-references/HARDWARE_INVENTORY_MASTER.md).
- Pilot: non-production pool → migrate one test CT → expand OSD count.
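
For the decision record, a rough capacity estimate helps compare a pilot against an expanded OSD count. The sketch below assumes a replicated pool with 3 copies (the usual Ceph default) and plans to stay under an 85% fill ceiling, which matches Ceph's default nearfull warning ratio. The OSD counts and disk sizes are hypothetical; real figures belong in the hardware inventory.

```python
def usable_tib(osd_count: int, osd_size_tib: float,
               replicas: int = 3, fill_ceiling: float = 0.85) -> float:
    """Estimate usable capacity of a replicated pool in TiB.

    raw capacity / replica count, scaled by the fill ceiling that
    operators plan to stay under.
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    if osd_count < replicas:
        # A replicated pool cannot place every copy on a distinct OSD.
        raise ValueError("need at least as many OSDs as replicas")
    raw_tib = osd_count * osd_size_tib
    return raw_tib / replicas * fill_ceiling


if __name__ == "__main__":
    # Hypothetical pilot (3 OSDs) versus expanded layout (6 OSDs), 4 TiB each.
    for osds in (3, 6):
        print(f"{osds} OSDs x 4 TiB -> {usable_tib(osds, 4.0):.1f} TiB usable")
```
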
---
## Milestone 4 — Network segmentation (incremental)
Map master plan layers to implementable steps:
1. Dedicated **storage replication** VLAN (Ceph backhaul or ZFS sync).
2. **Validator / P2P** constraints (firewall rules between sentry and RPC tiers — align [CHAIN138_CANONICAL_NETWORK_ROLES_VALIDATORS_SENTRY_AND_RPC.md](CHAIN138_CANONICAL_NETWORK_ROLES_VALIDATORS_SENTRY_AND_RPC.md)).
3. **Identity / Indy** tier isolation when multi-entity governance requires it.
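
Before new VLANs are created, the address plan can be checked for collisions with the existing 192.168.11.0/24 LAN. The sketch below uses Python's standard `ipaddress` module. Only the VLAN 11 subnet comes from this document; the other segment names and CIDRs are hypothetical placeholders for whatever the network master document assigns.

```python
import ipaddress
from itertools import combinations

# Only the first entry is from the current deployment; the rest are
# hypothetical placeholders for the planned segments.
PLAN = {
    "lan-vlan11": "192.168.11.0/24",
    "storage-replication": "10.20.0.0/24",
    "validator-p2p": "10.30.0.0/24",
    "identity": "10.40.0.0/24",
}


def overlapping_segments(plan: dict[str, str]) -> list[tuple[str, str]]:
    """Return every pair of segment names whose subnets overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in plan.items()}
    return [
        (a, b) for a, b in combinations(nets, 2)
        if nets[a].overlaps(nets[b])
    ]


if __name__ == "__main__":
    clashes = overlapping_segments(PLAN)
    for a, b in clashes:
        print(f"overlap: {a} ({PLAN[a]}) and {b} ({PLAN[b]})")
    if not clashes:
        print("no overlapping subnets in the plan")
```
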
---
## Milestone 5 — VM / CT templates (Section 7 of master plan)
- Align [PROXMOX_VM_CREATION_RUNBOOK.md](../03-deployment/PROXMOX_VM_CREATION_RUNBOOK.md) with template types: Identity (Indy/Aries), Settlement (Besu), Institutional (Fabric), Workflow (FireFly), Observability (Explorer/monitoring).
- Encode **preferred_node** and sizing in [DBIS_NODE_ROLE_MATRIX.md](DBIS_NODE_ROLE_MATRIX.md) and sync [proxmox-operational-template.json](../../config/proxmox-operational-template.json).
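
Keeping the role matrix and the JSON template in sync is easier if each entry is validated against the five template types. The sketch below shows one possible check. The template keys mirror the types listed above, but the field names other than `preferred_node` (`template`, `cores`, `memory_mib`, `disk_gib`) are assumptions and do not reflect the actual schema of `proxmox-operational-template.json`.

```python
# Template types from the master plan mapped to their stacks.
TEMPLATE_TYPES = {
    "identity": "Indy/Aries",
    "settlement": "Besu",
    "institutional": "Fabric",
    "workflow": "FireFly",
    "observability": "Explorer/monitoring",
}

# Hypothetical required fields; only preferred_node is named in this roadmap.
REQUIRED_FIELDS = ("template", "preferred_node", "cores", "memory_mib", "disk_gib")


def validate_role(entry: dict) -> list[str]:
    """Return a list of problems found in one role-matrix entry."""
    problems = [f"missing field: {f}" for f in REQUIRED_FIELDS if f not in entry]
    template = entry.get("template")
    if template is not None and template not in TEMPLATE_TYPES:
        problems.append(f"unknown template: {template}")
    return problems


if __name__ == "__main__":
    # Hypothetical entry with illustrative sizing values.
    example = {
        "template": "settlement",
        "preferred_node": "r630-01",
        "cores": 4,
        "memory_mib": 8192,
        "disk_gib": 200,
    }
    print(validate_role(example) or "entry is complete")
```
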
---
## Milestone 6 — Backup and DR alignment (master plan Sections 8, 16)
- Define an hourly/daily snapshot policy per guest tier; document cross-site replication targets (RPO/RTO) outside this file once they are available.
- Reference: existing backup scripts for NPMplus and operator checklist.
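
A per-tier snapshot policy can be verified by comparing each guest's newest snapshot against the recovery point objective of its tier. The sketch below assumes two tiers with hourly and daily targets. The tier names, guest names, and timestamps are hypothetical, and the real RPO values are expected to come from the DR documentation once it exists.

```python
from datetime import datetime, timedelta

# Hypothetical tiers reflecting the hourly/daily split described above.
RPO_BY_TIER = {
    "hourly": timedelta(hours=1),
    "daily": timedelta(days=1),
}


def stale_guests(last_snapshot: dict[str, datetime],
                 tier_of: dict[str, str],
                 now: datetime) -> list[str]:
    """Return guests whose newest snapshot is older than their tier's RPO."""
    return sorted(
        guest for guest, taken in last_snapshot.items()
        if now - taken > RPO_BY_TIER[tier_of[guest]]
    )


if __name__ == "__main__":
    now = datetime(2026, 3, 28, 12, 0)
    # Hypothetical guests and snapshot times.
    snapshots = {
        "validator-a": datetime(2026, 3, 28, 11, 30),
        "validator-b": datetime(2026, 3, 28, 9, 0),
        "explorer": datetime(2026, 3, 27, 18, 0),
    }
    tiers = {"validator-a": "hourly", "validator-b": "hourly", "explorer": "daily"}
    print(stale_guests(snapshots, tiers, now))
```
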
---
## Related
- [PHASE1_DISCOVERY_RUNBOOK.md](../03-deployment/PHASE1_DISCOVERY_RUNBOOK.md)
- [DBIS_PHASE3_E2E_PRODUCTION_SIMULATION_RUNBOOK.md](../03-deployment/DBIS_PHASE3_E2E_PRODUCTION_SIMULATION_RUNBOOK.md)