Migrate LXC Containers from r630-01 to r630-02
Purpose: Free space on r630-01’s LVM thin pool (data) by moving selected containers to r630-02. Use after the pool is near or at 100% (e.g. to stabilise 2101 and other Besu nodes).
Hosts:
- Source: r630-01 — 192.168.11.11 (pool `data`, 200G; ~74.48% after all migrations: 5200–5202, 6000–6002, 6400–6402, 5700)
- Target: r630-02 — 192.168.11.12 (pools: `thin1-r630-02`, `thin2`–`thin6`; see `pvesm status` on r630-02)
Completed 2026-02-15: CTs 5200, 5201, 5202 (Cacti), 6000, 6001, 6002 (Fabric), 6400, 6401, 6402 (Indy), 5700 (dev-vm) migrated from r630-01 to r630-02 (backup → copy → destroy on source → restore on target → start). Storage: Cacti → thin1-r630-02; Fabric → thin2; Indy + dev-vm → thin6. r630-01 pool dropped to 74.48%. Cluster migration (pct migrate) was not used (aliased volumes / storage mismatch). Script: scripts/maintenance/migrate-ct-r630-01-to-r630-02.sh.
1. Check cluster (optional)
If both nodes are in the same Proxmox cluster, you can use live migration and skip backup/restore:
ssh root@192.168.11.11 "pvecm status"
ssh root@192.168.11.12 "pvecm status"
If both show the same cluster and the other node is listed, migrate with:
# From r630-01 (or from any cluster node)
pct migrate <VMID> r630-02 --restart
CLI caveat: pct migrate may fail if the CT references storages that do not exist on the target (e.g. local-lvm on r630-02) or if the source storage ID is inactive on the target (e.g. thin1 on r630-02 vs thin1-r630-02). Remove stale unusedN volumes only after verifying with lvs that they are not the same LV as rootfs (see incident note below).
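The incident check above can be sketched as a guard before any `--delete unusedN`. This is a minimal sketch: the helper name is hypothetical, and the config lines follow the `pct config` output format shown in the comments.

```shell
# Guard before `pct set <vmid> --delete unused0`: refuse when unused0 and
# rootfs reference the same disk name (possibly on different storage IDs).
# Config lines follow `pct config` format, e.g.:
#   rootfs: thin1:vm-3500-disk-0,size=20G
#   unused0: thin1-r630-02:vm-3500-disk-0
same_disk() {
  local cfg="$1" rootfs_disk unused_disk
  rootfs_disk=$(printf '%s\n' "$cfg" | awk -F'[:,]' '/^rootfs:/ {gsub(/ /,"",$3); print $3}')
  unused_disk=$(printf '%s\n' "$cfg" | awk -F'[:,]' '/^unused0:/ {gsub(/ /,"",$3); print $3}')
  [ -n "$rootfs_disk" ] && [ "$rootfs_disk" = "$unused_disk" ]
}

cfg="rootfs: thin1:vm-3500-disk-0,size=20G
unused0: thin1-r630-02:vm-3500-disk-0"
if same_disk "$cfg"; then
  echo "REFUSE: unused0 names the same LV as rootfs"
fi
```

In practice, feed it `$(ssh root@<node> "pct config <VMID>")` and only delete `unused0` when the guard does not trigger; cross-check with `lvs` as noted above.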
Recommended (PVE API, maps rootfs to target pool): use pvesh from the source node so disks land on e.g. thin5:
ssh root@192.168.11.11 "pvesh create /nodes/r630-01/lxc/<VMID>/migrate --target r630-02 --target-storage thin5 --restart 1"
This is the path that succeeded for 3501 (ccip-monitor) on 2026-03-28.
Storage will be copied to the target. The source volume is removed after a successful migrate. Do not use pct set <vmid> --delete unused0 when unused0 and rootfs both name vm-<id>-disk-0 on different storages — Proxmox can delete the only root LV (Oracle publisher 3500 incident, 2026-03-28).
If the nodes are not in a cluster, use the backup/restore method below.
2. Migration by backup/restore (standalone nodes)
Use this when there is no cluster or when you prefer a full backup before moving.
Prerequisites
- SSH as `root` to both 192.168.11.11 and 192.168.11.12
- Enough free space on r630-01 for the backup (or use a temporary NFS/shared path)
- Enough free space on r630-02 in the chosen storage (e.g. `thin1`)
Steps (one container)
Replace <VMID> (e.g. 5200) and <TARGET_STORAGE> (e.g. thin1) as needed.
1. Stop the container on r630-01
ssh root@192.168.11.11 "pct stop <VMID>"
2. Create backup on r630-01
ssh root@192.168.11.11 "vzdump <VMID> --mode stop --compress zstd --dumpdir /var/lib/vz/dump"
Backup file will be under /var/lib/vz/dump/ (e.g. vzdump-lxc-<VMID>-*.tar.zst).
3. Copy backup to r630-02
BACKUP=$(ssh root@192.168.11.11 "ls -t /var/lib/vz/dump/vzdump-lxc-<VMID>-*.tar.zst 2>/dev/null | head -1")
scp "root@192.168.11.11:$BACKUP" /tmp/
scp "/tmp/$(basename $BACKUP)" root@192.168.11.12:/var/lib/vz/dump/
Or from r630-01:
BACKUP=$(ls -t /var/lib/vz/dump/vzdump-lxc-<VMID>-*.tar.zst 2>/dev/null | head -1)
scp "$BACKUP" root@192.168.11.12:/var/lib/vz/dump/
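Before restoring, it is worth confirming the copy is intact by comparing checksums on both sides. A minimal sketch, simulated locally; the real invocation runs `sha256sum` over ssh on each host (shown in the comments):

```shell
# Compare source and target checksums before `pct restore`.
same_sum() { [ "$1" = "$2" ]; }

# In practice:
#   SRC=$(ssh root@192.168.11.11 "sha256sum $BACKUP" | awk '{print $1}')
#   DST=$(ssh root@192.168.11.12 "sha256sum /var/lib/vz/dump/$(basename "$BACKUP")" | awk '{print $1}')
# Simulated locally with identical bytes:
SRC=$(printf 'dump-bytes' | sha256sum | awk '{print $1}')
DST=$(printf 'dump-bytes' | sha256sum | awk '{print $1}')
same_sum "$SRC" "$DST" && echo "checksums match"
```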
4. Restore on r630-02
ssh root@192.168.11.12 "pct restore <VMID> /var/lib/vz/dump/$(basename $BACKUP) --storage <TARGET_STORAGE>"
If the config has a fixed rootfs size (e.g. 50G), use:
ssh root@192.168.11.12 "pct restore <VMID> /var/lib/vz/dump/vzdump-lxc-<VMID>-*.tar.zst --storage thin1 --rootfs thin1:50"
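The fixed size can be read from the source config instead of being hard-coded. A sketch, assuming the `rootfs: <storage>:<volume>,size=<N>G` line format from `pct config`; the helper name is illustrative:

```shell
# Extract the rootfs size (in GiB) from a CT config line such as
#   rootfs: data:vm-5200-disk-0,size=50G
size_from_config() {
  printf '%s\n' "$1" | sed -n 's/^rootfs:.*size=\([0-9][0-9]*\)G.*/\1/p'
}

cfg="rootfs: data:vm-5200-disk-0,size=50G"
SIZE=$(size_from_config "$cfg")
echo "$SIZE"   # prints 50
# then: pct restore <VMID> <dump> --storage thin1 --rootfs "thin1:$SIZE"
```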
5. Start container on r630-02
ssh root@192.168.11.12 "pct start <VMID>"
6. Free space on r630-01 (destroy original)
Only after you have verified the container works on r630-02:
ssh root@192.168.11.11 "pct destroy <VMID> --purge 1"
7. Update docs and scripts
- Update any references that assume the container runs on r630-01 (e.g. `config/ip-addresses.conf` comments, runbooks, maintenance scripts). The IP does not change; only the Proxmox host changes.
- If something (e.g. NPM, firewall) was keyed by host, point it at the same IP (unchanged).
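Leftover host references can be found with a repo-wide grep. A sketch; the helper name is hypothetical, and the example runs against a throwaway file (in practice point it at `config/ip-addresses.conf`, runbooks, and maintenance scripts):

```shell
# List lines that mention both the old host and the given VMID.
scan_refs() {   # scan_refs <vmid> <file>...
  grep -Hn "r630-01" "$@" 2>/dev/null | grep "$1"
}

# Example against a throwaway file:
tmp=$(mktemp)
echo "# CT 5200 (cacti-1) runs on r630-01" > "$tmp"
scan_refs 5200 "$tmp"
rm -f "$tmp"
```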
3. Good candidates to migrate
Containers that free meaningful space on r630-01 and are reasonable to run on r630-02 (same LAN, same IP after move).
| VMID | Name / role | Approx. size (virtual) | Notes |
|---|---|---|---|
| 5200 | cacti-1 | 50G | ✅ Migrated (thin1-r630-02) |
| 5201 | cacti-alltra-1 | 50G | ✅ Migrated (thin1-r630-02) |
| 5202 | cacti-hybx-1 | 50G | ✅ Migrated (thin1-r630-02) |
| 6000 | fabric-1 | 50G | ✅ Migrated (thin2) |
| 6001 | fabric-alltra-1 | 100G | ✅ Migrated (thin2) |
| 6002 | fabric-hybx-1 | 100G | ✅ Migrated (thin2) |
| 6400 | indy-1 | 50G | ✅ Migrated (thin6) |
| 6401 | indy-alltra-1 | 100G | ✅ Migrated (thin6) |
| 6402 | indy-hybx-1 | 100G | ✅ Migrated (thin6) |
| 5700 | dev-vm | 400G (thin) | ✅ Migrated (thin6) |
| 3500 | oracle-publisher-1 | 20G thin1 (was) | 2026-03-28: root LV accidentally removed; CT recreated on r630-02 thin5 (fresh template). Redeploy app + .env. |
| 3501 | ccip-monitor-1 | 20G | 2026-03-28: migrated to r630-02 thin5 via pvesh … /migrate --target-storage thin5. Networking: unprivileged Ubuntu image may leave eth0 DOWN after migrate; unprivileged cannot be toggled later. Mitigation: on r630-02 install scripts/maintenance/pct-lxc-3501-net-up.sh to /usr/local/sbin/ and optional @reboot cron (see script header). |
High impact (larger disks):
- 5700 (dev-vm) — 400G virtual (only ~5% used). Migrating it frees a lot of thin pool potential; actual freed space depends on usage. Consider moving to r630-02 to avoid future pool pressure.
Do not migrate (keep on r630-01 for now):
- 2101 (Core RPC) — critical; fix pool first, then decide.
- 2500–2505 (RPC nodes) — same pool pressure; migrate only after pool is healthy or after moving other CTs.
- 10130, 10150, 10151 (DBIS) — core apps; move only with a clear plan.
- 1000–1502 (validators/sentries) — chain consensus; treat as critical.
4. Check storage on r630-02
Before restoring, confirm target storage name and space:
ssh root@192.168.11.12 "pvesm status"
ssh root@192.168.11.12 "lvs -o lv_name,data_percent,size"
Use a pool that has free space (e.g. thin1 at <85% or another thin*).
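Picking a pool can be scripted by filtering the `pvesm status` output. A sketch, assuming the usage percentage is the last column (e.g. `41.32%`); the sample numbers below are illustrative:

```shell
# Print thin pools whose usage is below 85%; feed it `pvesm status` output.
under_85() {
  awk '$1 ~ /^thin/ { pct = $NF; sub(/%$/, "", pct); if (pct + 0 < 85) print $1, pct "%" }'
}

# Example with captured sample output (Name Type Status Total Used Avail %):
printf '%s\n' \
  'thin1 lvmthin active 242018304 100003840 142014464 41.32%' \
  'thin2 lvmthin active 242018304 230003840  12014464 95.04%' | under_85
```

Real usage: `ssh root@192.168.11.12 "pvesm status" | under_85`.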
5. Scripted single-CT migration
From project root you can run (script below):
./scripts/maintenance/migrate-ct-r630-01-to-r630-02.sh <VMID> [target_storage]
Example:
./scripts/maintenance/migrate-ct-r630-01-to-r630-02.sh 5200 thin1
See the script for exact steps (stop, vzdump, scp, restore, start, optional destroy on source).
Unprivileged CTs: vzdump often fails with a tar `Permission denied` error under lxc-usernsexec. For those guests, prefer the `pvesh … /migrate --target-storage` path from section 1 instead of this script.
5a. Reprovision Oracle Publisher (VMID 3500) on r630-02
After a fresh LXC template or data loss, from project root (LAN, secrets loaded):
source scripts/lib/load-project-env.sh # or ensure PRIVATE_KEY / smom-dbis-138/.env
./scripts/deployment/provision-oracle-publisher-lxc-3500.sh
Uses web3 6.x (POA middleware). If on-chain updateAnswer fails, use a PRIVATE_KEY for an EOA allowed on the aggregator contract.
5b. r630-02 disk / VG limits (cannot automate)
Each thin1–thin6 VG on r630-02 is a single ~231 GiB SSD with ~124 MiB vg_free. There is no space to lvextend pools until you grow the partition/PV or add hardware. Guest fstrim and migration to thin5 reduce data usage only within existing pools.
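The lack of headroom can be confirmed from `vgs` output. A sketch, assuming `vgs --noheadings -o vg_name,vg_free --units m`, which prints sizes like `124.00m`; the threshold and helper name are illustrative:

```shell
# Flag VGs with less than 1 GiB free, i.e. no room to lvextend a pool.
no_headroom() {
  awk '{ free = $2; sub(/m$/, "", free); if (free + 0 < 1024) print $1 }'
}

# Example with sample `vgs --noheadings -o vg_name,vg_free --units m` output:
printf '%s\n' '  thin1 124.00m' '  thin5 124.00m' | no_headroom
```

Real usage: `ssh root@192.168.11.12 "vgs --noheadings -o vg_name,vg_free --units m" | no_headroom`.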
6. References
- 502_DEEP_DIVE_ROOT_CAUSES_AND_FIXES.md — LVM thin pool full, 2101/2500–2505
- BLOCKSCOUT_FIX_RUNBOOK.md — Migrate VM 5000 to thin5 (same-host example)
- ALL_VMIDS_ENDPOINTS.md · `config/proxmox-operational-template.json` — VMID list and IPs
- Proxmox: Backup and Restore