
Proxmox VE Fix Complete - pve and pve2

Date: 2025-01-20
Status: ALL ISSUES RESOLVED


Issues Fixed

Root Cause

The primary issue was hostname resolution failure. The pve-cluster service could not resolve the hostname "pve" or "pve2" to a non-loopback IP address, causing:

  • pve-cluster service to fail
  • /etc/pve filesystem not mounting
  • SSL certificates not accessible
  • pveproxy workers crashing

Error Message

Unable to resolve node name 'pve' to a non-loopback IP address - missing entry in '/etc/hosts' or DNS?
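One way to reproduce this condition before touching anything is to ask the system resolver what the node name maps to and reject loopback answers. The sketch below assumes `getent` is available on the host. The function names `is_loopback` and `check_node_name` are illustrative and are not taken from the repository scripts.

```shell
#!/bin/sh
# Illustrative helpers (hypothetical names, not from the repository scripts).

# Return 0 if the given address is an IPv4 or IPv6 loopback address.
is_loopback() {
  case "$1" in
    127.*|::1) return 0 ;;
    *)         return 1 ;;
  esac
}

# Resolve a name through the system resolver (sources and order as
# configured in nsswitch.conf) and report whether the first answer
# is a non-loopback address.
check_node_name() {
  name="$1"
  ip="$(getent hosts "$name" | awk 'NR == 1 { print $1 }')"
  if [ -z "$ip" ]; then
    echo "$name: does not resolve"
    return 2
  elif is_loopback "$ip"; then
    echo "$name: resolves to loopback ($ip)"
    return 1
  fi
  echo "$name: resolves to $ip"
}

# Example (run on the Proxmox host): check_node_name "$(hostname)"
```

A return code of 1 or 2 from `check_node_name` corresponds to the situation the error message describes.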

Fixes Applied

1. Hostname Resolution Fix

Script: scripts/fix-proxmox-hostname-resolution.sh

What it did:

  • Added proper entries to /etc/hosts on both hosts
  • Ensured hostnames resolve to their actual IP addresses (not loopback)
  • Mapped both the current hostname (pve or pve2) and the intended hostname (r630-01 or r630-02) to each host's IP address

Results:

  • pve-cluster service started successfully on both hosts
  • /etc/pve filesystem is now mounted
  • SSL certificates are accessible
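The implementation of the fix script is not reproduced in this document. As a rough sketch of the idea, an idempotent hosts-file update could look like the following. The function name `ensure_hosts_entry` and its argument order are assumptions made for illustration only.

```shell
#!/bin/sh
# Hypothetical sketch; the real fix script may work differently.

# ensure_hosts_entry FILE IP NAME [ALIAS...]
# Removes the given names from any existing line (for example a
# "127.0.1.1 pve" entry), then appends one line mapping IP to all names.
# Running it twice leaves the file unchanged after the first run.
ensure_hosts_entry() {
  file="$1"; ip="$2"
  shift 2
  tmp="$(mktemp)" || return 1
  awk -v names="$*" '
    BEGIN { n = split(names, want, " ") }
    NF == 0 || /^[[:space:]]*#/ { print; next }   # keep blanks and comments
    {
      out = $1; kept = 0; dropped = 0
      for (i = 2; i <= NF; i++) {
        hit = 0
        for (j = 1; j <= n; j++) if ($i == want[j]) hit = 1
        if (hit) dropped++; else { out = out " " $i; kept++ }
      }
      if (dropped == 0) print $0        # untouched line, keep as written
      else if (kept > 0) print out      # keep the remaining names only
    }' "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
  printf '%s    %s\n' "$ip" "$*" >> "$tmp"
  cat "$tmp" > "$file"   # overwrite in place to preserve owner and mode
  rm -f "$tmp"
}

# Example (as root on pve):
# ensure_hosts_entry /etc/hosts 192.168.11.11 pve pve.sankofa.nexus r630-01 r630-01.sankofa.nexus
```

Stripping only the matching names, rather than deleting whole lines, avoids losing an unrelated alias such as `localhost` that happens to share a line with the node name.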

2. SSL and Cluster Service Fix

Script: scripts/fix-proxmox-ssl-cluster.sh

What it did:

  • Regenerated SSL certificates
  • Restarted all Proxmox services in correct order
  • Verified service status

Results:

  • All services running
  • Web interface accessible (HTTP 200)
  • No worker exit errors
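The exact sequence used by the script is not listed here. The sketch below shows one plausible order, assuming that pve-cluster is restarted first so that /etc/pve is available, that certificates are regenerated with `pvecm updatecerts --force`, and that the remaining daemons follow. The `run` wrapper and the `DRY_RUN` variable are hypothetical additions that allow the sequence to be printed without executing it.

```shell
#!/bin/sh
# Hypothetical restart sequence; set DRY_RUN=1 to print commands only.

run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

regenerate_and_restart() {
  # pve-cluster (pmxcfs) first: it provides /etc/pve.
  run systemctl restart pve-cluster || return 1
  # Regenerate node certificates once /etc/pve is mounted again.
  run pvecm updatecerts --force || return 1
  # Then the daemons that read configuration and certificates from /etc/pve.
  for svc in pvedaemon pvestatd pveproxy; do
    run systemctl restart "$svc" || return 1
  done
}

# Example: DRY_RUN=1 regenerate_and_restart
```

The function stops at the first failing step, so a failed pve-cluster restart does not cascade into restarts of services that depend on it.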

Current Status

pve (192.168.11.11 - r630-01)

| Service | Status | Notes |
|---------|--------|-------|
| pve-cluster | Active (running) | Cluster filesystem mounted |
| pvestatd | Active (running) | Status daemon working |
| pvedaemon | Active (running) | API daemon working |
| pveproxy | Active (running) | Web interface accessible |
| Web Interface | Accessible | HTTP Status: 200 |
| Port 8006 | Listening | Workers running normally |

pve2 (192.168.11.12 - r630-02)

| Service | Status | Notes |
|---------|--------|-------|
| pve-cluster | Active (running) | Cluster filesystem mounted |
| pvestatd | Active (running) | Status daemon working |
| pvedaemon | Active (running) | API daemon working |
| pveproxy | Active (running) | Web interface accessible |
| Web Interface | Accessible | HTTP Status: 200 |
| Port 8006 | Listening | Workers running normally |

/etc/hosts Configuration

pve (192.168.11.11)

192.168.11.11    pve pve.sankofa.nexus r630-01 r630-01.sankofa.nexus

pve2 (192.168.11.12)

192.168.11.12    pve2 pve2.sankofa.nexus r630-02 r630-02.sankofa.nexus

Key Point: Each node's hostname must resolve to its actual IP address (pve to 192.168.11.11, pve2 to 192.168.11.12), not to a loopback address such as 127.0.0.1. This is required for pve-cluster to function.


Cluster Status

Both nodes are in a cluster:

  • Cluster Name: h
  • Config Version: 3
  • Transport: knet
  • Status: Operational
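The values above can be re-checked on either node with `pvecm status`. As a small sketch, the helper below reads that output on stdin and succeeds when it contains a `Quorate: Yes` line. Treating quorum as the test for "operational" is an assumption of this example, and the function name `is_quorate` is illustrative.

```shell
#!/bin/sh
# is_quorate: reads `pvecm status` output on stdin and succeeds
# if it contains a line reporting "Quorate: Yes".
is_quorate() {
  grep -Eq '^Quorate:[[:space:]]+Yes'
}

# Example (on a cluster node):
# pvecm status | is_quorate && echo "cluster is quorate"
```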

Verification

Web Interface Access

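# pve (prints only the HTTP status code; -k skips certificate verification)
curl -k -s -o /dev/null -w '%{http_code}\n' https://192.168.11.11:8006/
# Prints: 200 ✅

# pve2
curl -k -s -o /dev/null -w '%{http_code}\n' https://192.168.11.12:8006/
# Prints: 200 ✅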

Service Status

# Check services on pve
ssh root@192.168.11.11 "systemctl status pve-cluster pvestatd pvedaemon pveproxy"

# Check services on pve2
ssh root@192.168.11.12 "systemctl status pve-cluster pvestatd pvedaemon pveproxy"
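For a scripted check, `systemctl is-active` gives a one-word state per service that is easier to evaluate than the full `status` output. The following is a minimal sketch assuming root SSH access as in the commands above. The function name `check_services` is hypothetical.

```shell
#!/bin/sh
# check_services HOST
# Prints one "host service state" line per Proxmox service and
# returns non-zero if any service is not reported as active.
check_services() {
  host="$1"
  rc=0
  for svc in pve-cluster pvestatd pvedaemon pveproxy; do
    state="$(ssh "root@$host" systemctl is-active "$svc" 2>/dev/null)"
    [ -n "$state" ] || state="unknown"   # e.g. SSH connection failed
    echo "$host $svc $state"
    [ "$state" = "active" ] || rc=1
  done
  return "$rc"
}

# Example:
# for h in 192.168.11.11 192.168.11.12; do check_services "$h"; done
```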

No Worker Exits

# Check for worker exit errors
ssh root@192.168.11.11 "journalctl -u pveproxy -n 50 | grep 'worker exit'"
# Should print nothing: grep finds no recent worker exit lines ✅
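Because `grep` exits non-zero when nothing matches, the command above looks like a failure in scripts even when the result is good. A counting variant avoids that. This is a small sketch, and the function name `count_worker_exits` is an assumption for illustration.

```shell
#!/bin/sh
# count_worker_exits: reads journal text on stdin and prints how many
# lines mention "worker exit". Always exits 0, including for a count of 0.
count_worker_exits() {
  grep -c 'worker exit' || true
}

# Example:
# ssh root@192.168.11.11 "journalctl -u pveproxy -n 50" | count_worker_exits
```

A printed count of 0 corresponds to the healthy state described above.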

Scripts Created

  1. scripts/diagnose-proxmox-hosts.sh

    • Comprehensive diagnostic tool
    • Tests connectivity, SSH, and all Proxmox services
    • Usage: ./scripts/diagnose-proxmox-hosts.sh [pve|pve2|both]
  2. scripts/fix-proxmox-hostname-resolution.sh

    • Fixes hostname resolution issues
    • Updates /etc/hosts with correct entries
    • Usage: ./scripts/fix-proxmox-hostname-resolution.sh
  3. scripts/fix-proxmox-ssl-cluster.sh

    • Fixes SSL and cluster service issues
    • Regenerates certificates and restarts services
    • Usage: ./scripts/fix-proxmox-ssl-cluster.sh [pve|pve2|both]

Lessons Learned

  1. Hostname Resolution is Critical

    • Proxmox VE requires hostnames to resolve to non-loopback IPs
    • /etc/hosts must have proper entries
    • DNS alone may not be sufficient
  2. Service Dependencies

    • pve-cluster must be running before other services
    • /etc/pve filesystem must be mounted for SSL certificates
    • Services must be started in correct order
  3. Cluster Filesystem

    • pmxcfs (Proxmox Cluster File System) is required
    • It provides /etc/pve as a FUSE filesystem
    • Without it, SSL certificates and configuration are inaccessible
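Since everything else depends on /etc/pve being mounted, it can be worth checking the mount directly before looking at the other services. The sketch below reads the `/proc/mounts` format and treats a `fuse` filesystem type at /etc/pve as evidence that pmxcfs is serving it. The function names and the optional file argument (useful for testing against a saved copy) are assumptions of this example.

```shell
#!/bin/sh
# fs_type_of MOUNTPOINT [MOUNTS_FILE]
# Prints the filesystem type mounted at MOUNTPOINT, reading the
# /proc/mounts format (device, mount point, type, options, ...).
# If several entries match, the last one (the topmost mount) wins.
fs_type_of() {
  awk -v mp="$1" '$2 == mp { t = $3 } END { if (t != "") print t }' \
    "${2:-/proc/mounts}"
}

# pve_fs_mounted [MOUNTS_FILE]
# Succeeds only if /etc/pve is currently a FUSE mount.
pve_fs_mounted() {
  case "$(fs_type_of /etc/pve "$1")" in
    fuse*) return 0 ;;
    *)     return 1 ;;
  esac
}

# Example (on a Proxmox host):
# pve_fs_mounted || echo "/etc/pve is not mounted; check pve-cluster first"
```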

Next Steps

  1. Monitor Services

    • Watch for any worker exit errors
    • Verify web interface remains accessible
  2. Consider Hostname Migration

    • Current hostnames: pve, pve2
    • Correct hostnames: r630-01, r630-02
    • Migration can be done later if needed (see HOSTNAME_MIGRATION_GUIDE.md)
  3. Document Cluster Configuration

    • Document cluster setup
    • Note any cluster-specific requirements


Last Updated: 2025-01-20
Status: All Issues Resolved
Both hosts are now fully operational!