# Remaining Steps - Proxmox VE Configuration

**Date:** 2025-01-20
**Status:** Blocking tasks complete; pre-production and optional steps remaining

---

## ✅ Completed Tasks

1. ✅ **Hostname Migration**
   - r630-01: `pve` → `r630-01` ✅
   - r630-02: `pve2` → `r630-02` ✅

2. ✅ **IP Address Audit**
   - 34 VMs/containers scanned
   - 0 IP conflicts ✅
   - All IPs documented ✅

3. ✅ **Storage Configuration**
   - r630-01: thin1 (200GB) + local-lvm (200GB) enabled ✅
   - r630-02: thin1 (113GB available) + thin2-thin6 enabled ✅
   - Total: more than 2.4TB available ✅

---

## ⚠️ HIGH PRIORITY - Remaining Steps

### 1. Update Cluster Configuration ⚠️ CRITICAL

**Issue:** Cluster still shows old hostnames (`pve`, `pve2`) instead of new hostnames (`r630-01`, `r630-02`)

**Current Status:**
```
Cluster nodes:
- Node 1: ml110 ✅
- Node 2: pve (should be r630-01) ⚠️
- Node 3: pve2 (should be r630-02) ⚠️
```

**Action Required:**
```bash
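# Option 1: Regenerate node certificates after the hostname change.
# This refreshes certificates only; it does not rename nodes in the cluster
# configuration, so cluster reconfiguration may still be required.
pvecm updatecerts --force

# Option 2: Verify whether the hostname changes alone are sufficient.
# The cluster may pick up the new names on the next quorum change.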
```

**Verification:**
```bash
pvecm status
pvecm nodes
# Should show r630-01 and r630-02
```

**Impact:** Cluster operations may keep referencing the old hostnames, which can cause confusion wherever nodes are addressed by name (for example, as migration targets).
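
**Example (sketch):** As a rough aid for this step, the helpers below compare the node names recorded in the corosync configuration with the names you expect. This assumes the usual `/etc/pve/corosync.conf` layout, where each node entry carries a `name:` line. `corosync_node_names` and `missing_nodes` are hypothetical names introduced for illustration, and both helpers only read data.

```bash
# Hypothetical read-only helpers; nothing here changes the cluster.

# Print the node names found in a corosync.conf (default: /etc/pve/corosync.conf).
# Assumes each node entry contains a line of the form "name: <hostname>".
corosync_node_names() {
    local conf=${1:-/etc/pve/corosync.conf}
    awk '$1 == "name:" { print $2 }' "$conf"
}

# Read node names on stdin and print every expected name (given as arguments)
# that is absent from that list.
missing_nodes() {
    local names expected
    names=$(cat)
    for expected in "$@"; do
        printf '%s\n' "$names" | grep -qxF -- "$expected" || printf '%s\n' "$expected"
    done
}
```

Running `corosync_node_names | missing_nodes r630-01 r630-02` on any cluster node should print nothing once both new names are present; any name it prints is still missing from the cluster configuration.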

---

### 2. Verify VMs on r630-02 Storage ⚠️ CRITICAL

**Issue:** Storage on r630-02 still holds volumes for VMIDs 100, 101, 102, 103, 104, 105, 130, 5000, and 6200 (on thin1) and for VMID 7800 (on thin4), but `pct list` and `qm list` show no guests.

**Action Required:**
```bash
ssh root@192.168.11.12

# Check all storage for VMs
pvesm list thin1
pvesm list thin4

# Check if VMs are registered
pct list
qm list

# Check VM configurations
ls -la /etc/pve/nodes/r630-02/lxc/
ls -la /etc/pve/nodes/r630-02/qemu-server/

# Check whether the guests are registered on a different node
pvesh get /cluster/resources --type vm --output-format json | jq .
```

**Possible Issues:**
- VMs may be registered on a different node (ml110 or r630-01)
- VMs may be orphaned (volumes exist on storage but no guest is registered)
- VMs may need to be re-registered

**Impact:** The status of these VMs should be understood before any new VMs are started
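
**Example (sketch):** The checks above can be combined into a small read-only report that maps each volume back to a guest config. One possibility worth ruling out, stated here as an assumption rather than a diagnosis, is that configs were left under an old node directory (such as `pve2`) after the hostname change. The helper names below are hypothetical, and the volume naming pattern `vm-<VMID>-disk-N` is assumed for the thin pools.

```bash
# Hypothetical read-only helpers for tracing volumes back to guest configs.

# Read `pvesm list <storage>` output on stdin and print the unique guest IDs
# embedded in volume names such as "thin1:vm-100-disk-0".
volume_vmids() {
    grep -oE '(vm|base)-[0-9]+-disk' | grep -oE '[0-9]+' | sort -un
}

# Print the node directory (or directories) holding a config file for a guest ID.
# $1 = guest ID, $2 = config root (default: /etc/pve/nodes)
guest_node() {
    local vmid=$1 root=${2:-/etc/pve/nodes} f
    for f in "$root"/*/lxc/"$vmid".conf "$root"/*/qemu-server/"$vmid".conf; do
        [ -e "$f" ] || continue
        # Path layout is <root>/<node>/<type>/<vmid>.conf
        basename "$(dirname "$(dirname "$f")")"
    done
}

# For each guest ID on stdin, print the node that registers it, or UNREGISTERED.
report_guests() {
    local root=${1:-/etc/pve/nodes} id node
    while read -r id; do
        node=$(guest_node "$id" "$root")
        printf '%s\t%s\n' "$id" "${node:-UNREGISTERED}"
    done
}
```

For example, `pvesm list thin1 | volume_vmids | report_guests` prints one line per VMID found on thin1. IDs reported as `UNREGISTERED` have volumes but no config on any node, while IDs listed under another node name are registered elsewhere.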

---

### 3. Test Storage Performance ⚠️ RECOMMENDED

**Action Required:**
```bash
# On r630-01
ssh root@192.168.11.11
# Create test container
pct create 9999 local:vztmpl/ubuntu-22.04-standard_22.04-1_amd64.tar.zst \
    --storage local-lvm --hostname test-storage --net0 name=eth0,bridge=vmbr0
# Test performance
# Delete test container
pct destroy 9999

# On r630-02
ssh root@192.168.11.12
# Create test container
pct create 9999 local:vztmpl/ubuntu-22.04-standard_22.04-1_amd64.tar.zst \
    --storage thin1 --hostname test-storage --net0 name=eth0,bridge=vmbr0
# Test performance
# Delete test container
pct destroy 9999
```

**Purpose:** Verify storage is working correctly before deploying production VMs
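
**Example (sketch):** The "Test performance" step above is left open. One possible way to fill it is a rough sequential write with `dd` inside the test container, as sketched below. The `run` wrapper, the `storage_write_test` name, and the choice of `dd` as the benchmark are assumptions made for illustration; a dedicated benchmarking tool would give more representative numbers.

```bash
# Hypothetical helper. Set DRY_RUN=1 to print the commands instead of running them.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# $1 = unused container ID, $2 = target storage, $3 = container template volume
storage_write_test() {
    local vmid=$1 storage=$2 template=$3
    run pct create "$vmid" "$template" --storage "$storage" \
        --hostname test-storage --net0 name=eth0,bridge=vmbr0 || return 1
    run pct start "$vmid"
    # Rough sequential write: 1 GiB of zeros, flushed to disk before dd reports.
    run pct exec "$vmid" -- dd if=/dev/zero of=/root/ddtest bs=1M count=1024 conv=fsync
    run pct stop "$vmid"
    run pct destroy "$vmid"
}
```

Running `DRY_RUN=1 storage_write_test 9999 thin1 local:vztmpl/ubuntu-22.04-standard_22.04-1_amd64.tar.zst` first shows the exact commands without touching the host. Without `DRY_RUN`, `dd` reports the write throughput, and the container is stopped before it is destroyed.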

---

## 📋 OPTIONAL - Optimization Steps

### 4. Distribute VMs Across Hosts 📋 RECOMMENDED

**Current State:** All 34 VMs/containers run on ml110, which is overloaded

**Recommended Distribution:**
- **ml110:** Keep 10-15 lightweight/management VMs
- **r630-01:** Migrate 10-15 medium workload VMs
- **r630-02:** Migrate 10-15 heavy workload VMs (best CPU - 56 cores)

**Migration Commands:**
```bash
# From ml110 to r630-01 (add --restart for running containers)
pct migrate <VMID> r630-01 --target-storage local-lvm

# From ml110 to r630-02
pct migrate <VMID> r630-02 --target-storage thin2  # or thin3, thin5, thin6
```

**Benefits:**
- Better performance (ml110 CPU is slower)
- Better resource utilization
- Improved redundancy

**Estimated Time:** 1-2 hours (depending on VM sizes)
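
**Example (sketch):** When moving 10-15 containers per host, it may help to generate the commands as a reviewable plan before running anything. The helper below is a hypothetical illustration that only prints text. It uses `--target-storage`, the option name documented for `pct migrate`, and assumes `--restart` because running containers are migrated in restart mode.

```bash
# Hypothetical helper: prints one `pct migrate` command per container ID so the
# plan can be reviewed before anything is executed.
# $1 = target node, $2 = target storage, remaining arguments = container IDs
print_migration_plan() {
    local target=$1 storage=$2 id
    shift 2
    for id in "$@"; do
        # --restart is an assumption: it covers containers that are running.
        printf 'pct migrate %s %s --target-storage %s --restart\n' \
            "$id" "$target" "$storage"
    done
}
```

For example, `print_migration_plan r630-01 local-lvm <VMID> <VMID>` prints one command per ID. The target must match the node name the cluster currently reports, so check `pvecm nodes` first if the cluster still lists the old hostnames.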

---

### 5. Implement Storage Monitoring 📋 RECOMMENDED

**Action Required:**
```bash
# Set up storage alerts (manual or via monitoring system)
# Monitor:
# - Storage usage >80%
# - Thin pool metadata usage
# - Storage growth trends
```

**Tools:**
- Proxmox built-in monitoring
- External monitoring (Prometheus, Grafana)
- Custom scripts

**Purpose:** Proactive alerting before storage issues occur
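
**Example (sketch):** As one option under "Custom scripts", the 80% usage check can be approximated by filtering `pvesm status` output. The helper name is hypothetical, and the column layout (storage name first, usage percentage last) is an assumption that should be confirmed against the actual output on each host.

```bash
# Hypothetical helper: read `pvesm status` output on stdin and print every
# storage whose usage exceeds the threshold (default 80 percent).
# Assumes the first column is the storage name and the last column is a
# percentage such as "83.40%"; the header line is skipped.
storage_over_threshold() {
    local threshold=${1:-80}
    awk -v limit="$threshold" '
        NR > 1 {
            pct = $NF
            sub(/%$/, "", pct)
            if (pct + 0 > limit + 0) print $1, $NF
        }'
}
```

Usage would be `pvesm status | storage_over_threshold 80`, for instance from a cron job that mails any output. Thin pool metadata usage is not covered by this sketch; `lvs -o lv_name,data_percent,metadata_percent` is one way to inspect it.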

---

### 6. Security Hardening 📋 RECOMMENDED

**Actions:**
1. **Update Passwords**
   ```bash
   # Change weak passwords on r630-01 and r630-02
   passwd root
   ```

2. **SSH Key Authentication**
   ```bash
   # Set up SSH keys instead of passwords
   ssh-copy-id root@192.168.11.11
   ssh-copy-id root@192.168.11.12
   ```

3. **Firewall Configuration**
   ```bash
   # Review and configure firewall rules
   # Restrict access where needed
   ```

4. **Access Control Review**
   - Review user permissions
   - Implement least privilege
   - Audit access logs
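
**Example (sketch):** After SSH keys are in place, it can be useful to confirm whether password logins are still accepted. The hypothetical helper below filters the effective configuration printed by `sshd -T` (lowercase `key value` lines). It only reports findings and changes nothing.

```bash
# Hypothetical helper: read `sshd -T` output on stdin and list settings that
# still allow password-based logins.
password_login_findings() {
    awk '
        $1 == "passwordauthentication" && $2 == "yes" { print "PasswordAuthentication is yes" }
        $1 == "permitrootlogin"        && $2 == "yes" { print "PermitRootLogin is yes" }
    '
}
```

Run it as root with `sshd -T | password_login_findings` on r630-01 and r630-02. Confirm that key-based login works before tightening either setting, and note that cluster nodes are generally expected to reach each other as root over SSH, so disabling root login entirely is likely to interfere with cluster operations.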

---

### 7. Network Optimization 📋 OPTIONAL

**Actions:**
1. **VLAN Migration** (Planned)
   - Segment network by service type
   - Improve security
   - Better traffic management

2. **Network Monitoring**
   - Monitor bandwidth usage
   - Track performance
   - Alert on issues

---

### 8. Documentation Updates 📋 OPTIONAL

**Actions:**
1. Update any scripts/configs that reference old hostnames (`pve`, `pve2`)
2. Update documentation with new hostnames
3. Update inventory files if needed

**Search for references:**
```bash
# -w matches pve/pve2 as whole words only, so pvecm, pvesm, etc. are skipped
grep -rnwE 'pve2?' --exclude-dir=.git scripts/ config/ docs/
```

---

## 🚀 Ready to Start VMs

### Pre-Start Checklist Status

- [x] Hostnames migrated ✅
- [x] IP addresses audited ✅
- [x] No IP conflicts ✅
- [x] Storage enabled on r630-01 ✅
- [x] Storage enabled on r630-02 ✅
- [x] Proxmox services operational ✅
- [ ] **Cluster configuration updated** ⚠️
- [ ] **VMs on r630-02 verified** ⚠️
- [ ] **Storage tested** ⚠️

### Can Start VMs Now?

**Yes, but it is recommended to first:**
1. Update cluster configuration first (prevents confusion)
2. Verify r630-02 VMs (understand existing state)
3. Test storage (ensure it works)

**Blockers:** None. All blocking tasks are complete; the unchecked items above are recommended before production use.

---

## 📊 Priority Summary

### 🔴 CRITICAL (Do Before Production)
1. Update cluster configuration
2. Verify r630-02 VMs status

### ⚠️ HIGH PRIORITY (Recommended)
3. Test storage performance
4. Distribute VMs across hosts

### 📋 RECOMMENDED (For Optimization)
5. Implement monitoring
6. Security hardening
7. Network optimization
8. Documentation updates

---

## 🎯 Quick Action Plan

### Immediate (15-30 minutes)
1. Update cluster configuration
2. Verify r630-02 VMs

### Short-term (1-2 hours)
3. Test storage
4. Plan VM distribution

### Long-term (Ongoing)
5. Implement monitoring
6. Security hardening
7. Network optimization
8. Documentation updates

---

## 📝 Commands Reference

### Cluster Management
```bash
# Check cluster status
pvecm status

# List nodes
pvecm nodes

# Regenerate node certificates (may be needed after hostname changes)
pvecm updatecerts --force
```

### VM Verification
```bash
# List all VMs on a node
pct list
qm list

# Check storage contents
pvesm list <storage-name>

# Check VM configurations
ls -la /etc/pve/nodes/<node>/lxc/
ls -la /etc/pve/nodes/<node>/qemu-server/
```

### Storage Testing
```bash
# Create test container
pct create 9999 local:vztmpl/ubuntu-22.04-standard_22.04-1_amd64.tar.zst \
    --storage <storage-name> --hostname test

# Destroy test container
pct destroy 9999
```

---

**Last Updated:** 2025-01-20
**Status:** Blocking tasks complete; pre-production and optional steps remaining