Cluster Migration Plan - LXC Containers to pve2
Date: $(date)
Status: 📋 Planning Phase
Cluster Overview
Current Cluster Status
Cluster Name: h
Nodes: 3 (ml110, pve, pve2)
Status: ✅ Quorate (all nodes online)
Node Resources
| Node | CPUs | RAM | RAM Used | RAM % | Disk | Disk Used | Disk % | Status |
|---|---|---|---|---|---|---|---|---|
| ml110 | 6 | 125.67 GB | 35.61 GB | 28.3% | 93.93 GB | 7.21 GB | 7.7% | 🟢 Online |
| pve | 32 | 503.79 GB | 5.62 GB | 1.1% | 538.78 GB | 2.06 GB | 0.4% | 🟢 Online |
| pve2 | 56 | 251.77 GB | 4.49 GB | 1.8% | 222.90 GB | 1.97 GB | 0.9% | 🟢 Online |
Analysis:
- ml110 is heavily loaded (28.3% RAM, 9.4% CPU) with all 25 containers
- pve2 has abundant resources (1.8% RAM, 0.06% CPU) - ideal migration target
- pve also has capacity but pve2 has more CPUs (56 vs 32)
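The per-node figures above can be pulled on demand. A minimal sketch, assuming the `pvesh` CLI is available on the cluster node and using a hypothetical `CLUSTER_HOST` variable (defaulting to the ml110 address used by the commands later in this plan):

```bash
#!/usr/bin/env bash
# Sketch: print per-node resource usage (CPU, RAM, disk) for the cluster.
# CLUSTER_HOST is a hypothetical variable, not part of the existing scripts.
CLUSTER_HOST="${CLUSTER_HOST:-root@192.168.11.10}"

show_node_resources() {
  # --type node restricts the output to node entries
  ssh "$CLUSTER_HOST" "pvesh get /cluster/resources --type node"
}
```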
Current Container Distribution
All containers are currently on ml110 node (25 total):
Infrastructure Services (Keep on ml110)
- 100: proxmox-mail-gateway
- 101: proxmox-datacenter-manager
- 102: cloudflared
- 103: omada
- 104: gitea
- 105: nginxproxymanager
- 130: monitoring-1
Besu Blockchain Nodes (High Priority for Migration)
Validators (High resource usage - 8GB RAM each):
- 1000: besu-validator-1
- 1001: besu-validator-2
- 1002: besu-validator-3
- 1003: besu-validator-4
- 1004: besu-validator-5
Sentries (Moderate resource usage - 4GB RAM each):
- 1500: besu-sentry-1
- 1501: besu-sentry-2
- 1502: besu-sentry-3
- 1503: besu-sentry-4
RPC Nodes (Very high resource usage - 16GB RAM each):
- 2500: besu-rpc-1
- 2501: besu-rpc-2
- 2502: besu-rpc-3
Application Services (Medium Priority)
- 3000-3003: ml110 containers (4 containers)
- 3500: oracle-publisher-1
- 3501: ccip-monitor-1
- 5000: blockscout-1 (database intensive)
- 6200: firefly-1
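To confirm the distribution above against the live cluster, a small helper can list the containers defined on each node. A sketch assuming the same `pvesh` CLI and a hypothetical `CLUSTER_HOST` variable:

```bash
#!/usr/bin/env bash
# Sketch: list the LXC containers defined on a given node.
CLUSTER_HOST="${CLUSTER_HOST:-root@192.168.11.10}"

list_node_containers() {
  local node=$1
  ssh "$CLUSTER_HOST" "pvesh get /nodes/$node/lxc"
}

# Usage: list_node_containers ml110
```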
Migration Strategy
Phase 1: High Resource Containers (Priority 1)
Target: Move high-resource Besu nodes to pve2
Containers to Migrate:
- Besu RPC nodes (2500-2502) - 16GB RAM each = 48GB total
- Besu Validators (1000-1004) - 8GB RAM each = 40GB total
- Blockscout (5000) - Database intensive
Expected Impact:
- Frees ~88GB of allocated RAM on ml110; actual usage should drop from 35.61GB to roughly 10GB
- Reduces ml110 CPU load significantly
- Utilizes pve2's 56 CPUs and 251GB RAM capacity
Migration Order:
- Start with RPC nodes (one at a time to minimize disruption)
- Then validators (can migrate in parallel if needed)
- Finally Blockscout (database may take longer)
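The ordered, one-at-a-time flow above can be sketched as a loop that migrates each Phase 1 container and waits for it to report running before moving on. The VMIDs come from this plan; the polling helper and `CLUSTER_HOST` variable are assumptions, not part of the existing scripts:

```bash
#!/usr/bin/env bash
# Sketch: Phase 1 migration, one container at a time.
CLUSTER_HOST="${CLUSTER_HOST:-root@192.168.11.10}"

migrate_and_wait() {
  local vmid=$1
  ssh "$CLUSTER_HOST" "pct migrate $vmid pve2 --restart"
  # pct status prints "status: running" once the container is back up
  until ssh "$CLUSTER_HOST" "pct status $vmid" | grep -q running; do
    sleep 10
  done
}

phase1() {
  # RPC nodes first, then validators, then Blockscout
  for vmid in 2500 2501 2502 1000 1001 1002 1003 1004 5000; do
    migrate_and_wait "$vmid"
  done
}
```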
Phase 2: Medium Resource Containers (Priority 2)
Containers to Migrate:
- Besu Sentries (1500-1503) - 4GB RAM each = 16GB total
- Oracle Publisher (3500)
- CCIP Monitor (3501)
- Firefly (6200)
- ml110 containers (3000-3003) - if needed
Phase 3: Keep on ml110 (Infrastructure)
Containers to Keep:
- Infrastructure services (100-105) - Core infrastructure
- Monitoring (130) - Should remain on primary node
Migration Commands
Single Container Migration
```bash
# Migrate a single container
ssh root@192.168.11.10 "pct migrate <VMID> pve2 --restart"

# Example: Migrate besu-rpc-1
ssh root@192.168.11.10 "pct migrate 2500 pve2 --restart"
```
Using Migration Script
```bash
# Dry run to see what would be migrated
./scripts/migrate-containers-to-pve2.sh --dry-run

# Execute migration
./scripts/migrate-containers-to-pve2.sh
```
Batch Migration
```bash
# Migrate all RPC nodes
for vmid in 2500 2501 2502; do
  ssh root@192.168.11.10 "pct migrate $vmid pve2 --restart"
  sleep 30  # Wait between migrations
done

# Migrate all validators
for vmid in 1000 1001 1002 1003 1004; do
  ssh root@192.168.11.10 "pct migrate $vmid pve2 --restart"
  sleep 30
done
```
Migration Considerations
Pre-Migration Checklist
- Verify cluster is quorate
- Verify target node (pve2) is online
- Check available storage on pve2
- Verify network connectivity between nodes
- Plan maintenance window if needed
- Backup critical containers (if needed)
- Notify users of potential brief service interruption
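The first few checklist items can be automated. A minimal sketch, assuming `pvecm` and `pvesh` are available on the node reached via a hypothetical `CLUSTER_HOST` variable:

```bash
#!/usr/bin/env bash
# Sketch: automate the quorum, target-node, and storage checks.
CLUSTER_HOST="${CLUSTER_HOST:-root@192.168.11.10}"

precheck() {
  # Cluster must be quorate ("Quorate: Yes" in pvecm status output)
  ssh "$CLUSTER_HOST" "pvecm status" | grep -q "Quorate:.*Yes" || return 1
  # Target node must respond
  ssh "$CLUSTER_HOST" "pvesh get /nodes/pve2/status" >/dev/null || return 1
  # Show storage on pve2 for a manual free-space check
  ssh "$CLUSTER_HOST" "pvesh get /nodes/pve2/storage"
}
```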
During Migration
- Restart migration - With `--restart`, each container is stopped, migrated, and restarted on the target node (Proxmox does not support live migration of running LXC containers)
- Downtime - Expect a brief service interruption while each container restarts
- Storage migration - Container disk images are copied to the target node
- Restart - The container starts on pve2 automatically once the migration completes
Post-Migration Verification
```bash
# Verify containers are on pve2
ssh root@192.168.11.10 "pvesh get /nodes/pve2/lxc"

# Check container status
ssh root@192.168.11.10 "pct status 2500"

# Verify network connectivity from container
ssh root@192.168.11.10 "pct exec 2500 -- ping -c 3 192.168.11.250"
```
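The per-container checks above can be looped over every migrated VMID. A sketch (the helper name and `CLUSTER_HOST` variable are assumptions):

```bash
#!/usr/bin/env bash
# Sketch: report the status of each migrated container.
CLUSTER_HOST="${CLUSTER_HOST:-root@192.168.11.10}"

verify_migrated() {
  local failed=0 vmid
  for vmid in "$@"; do
    if ssh "$CLUSTER_HOST" "pct status $vmid" | grep -q running; then
      echo "CT $vmid: running"
    else
      echo "CT $vmid: NOT running"
      failed=1
    fi
  done
  return "$failed"
}

# Usage: verify_migrated 2500 2501 2502 1000 1001 1002 1003 1004 5000
```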
Rollback Plan
If migration fails or issues occur:
```bash
# Migrate container back to ml110
ssh root@192.168.11.10 "pct migrate <VMID> ml110 --restart"
```
Expected Results
After Phase 1 Migration
ml110:
- RAM usage: ~35.61GB → ~10GB (reduced by ~25GB)
- CPU usage: ~9.4% → ~3-4%
- Containers: 25 → ~14 containers
pve2:
- RAM usage: ~4.49GB → up to ~90GB (the ~88GB allocated to migrated containers; actual usage will likely be lower)
- CPU usage: ~0.06% → ~5-10%
- Containers: 0 → ~11 containers
Resource Distribution
| Node | Containers | RAM Usage | Status |
|---|---|---|---|
| ml110 | ~14 | ~10GB (8%) | ✅ Balanced |
| pve | 0 | ~5.6GB (1.1%) | ✅ Available |
| pve2 | ~11 | ~90GB (36%) | ✅ Well utilized |
Next Steps
- ✅ Review and approve migration plan
- ⏳ Execute Phase 1 migrations (RPC nodes, validators, Blockscout)
- ⏳ Verify all containers are running correctly on pve2
- ⏳ Monitor resource usage on both nodes
- ⏳ Execute Phase 2 migrations if needed
- ⏳ Document final container distribution
Related Scripts
- `scripts/analyze-cluster-migration.sh` - Analyze cluster and container distribution
- `scripts/migrate-containers-to-pve2.sh` - Execute container migrations
- `scripts/get-container-distribution.sh` - List containers by node
Last Updated: $(date)
Status: Ready for execution