Proxmox Troubleshooting Guide

Common Issues and Solutions

Provider Not Connecting

Symptoms

  • Provider logs show connection errors
  • ProviderConfig status is not Ready
  • VM creation fails with connection errors

Solutions

  1. Verify Endpoint:

    curl -k https://your-proxmox:8006/api2/json/version
    
  2. Check Credentials:

    kubectl get secret proxmox-credentials -n crossplane-system -o yaml
    
  3. Test Authentication:

    curl -k \
      --data-urlencode "username=root@pam" \
      --data-urlencode "password=your-password" \
      https://your-proxmox:8006/api2/json/access/ticket
    
  4. Check Provider Logs:

    kubectl logs -n crossplane-system -l app=crossplane-provider-proxmox --tail=100
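
The endpoint check in step 1 can be wrapped in a small helper that interprets the HTTP status code. This is a sketch only: `check_endpoint` is a hypothetical name, and it assumes the version endpoint answers unauthenticated requests with 401 (as it does on typical Proxmox VE installs), which still shows that the API is reachable.

```shell
# Hypothetical helper: classify the HTTP status returned by the version endpoint.
# Usage: check_endpoint your-proxmox
check_endpoint() {
  host="$1"
  # -w '%{http_code}' prints only the status code; curl reports 000 when
  # no HTTP response was received (DNS failure, refused connection, timeout).
  code=$(curl -k -s -o /dev/null -w '%{http_code}' --max-time 10 \
    "https://${host}:8006/api2/json/version")
  case "$code" in
    200) echo "reachable" ;;
    401) echo "reachable, authentication required" ;;
    000) echo "unreachable" ;;
    *)   echo "unexpected HTTP status: $code" ;;
  esac
}
```

An "unreachable" result points at DNS, firewall, or port 8006, while a 401 moves the investigation on to credentials (steps 2 and 3).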
    

VM Creation Fails

Symptoms

  • VM resource stuck in Creating state
  • Error messages in VM resource status
  • No VM appears in Proxmox

Solutions

  1. Check VM Resource:

    kubectl describe proxmoxvm <vm-name>
    
  2. Verify Site Configuration:

    • Site must exist in ProviderConfig
    • Endpoint must be reachable
    • Node name must match actual Proxmox node
  3. Check Proxmox Resources:

    • Storage pool must exist
    • Network bridge must exist
    • OS template must exist
  4. Check Proxmox Logs:

    • Log into Proxmox Web UI
    • Check System Log
    • Review task history

VM Status Not Updating

Symptoms

  • VM status remains unknown
  • IP address not populated
  • State not reflecting actual VM state

Solutions

  1. Check Provider Connectivity:

    kubectl logs -n crossplane-system -l app=crossplane-provider-proxmox | grep -i error
    
  2. Verify VM Exists in Proxmox:

    • Check Proxmox Web UI
    • Verify VM ID matches
  3. Check Reconciliation:

    kubectl get proxmoxvm <vm-name> -o yaml | grep -A 5 conditions
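
The `grep -A 5 conditions` output can be hard to scan. The sketch below prints one condition per line using `kubectl`'s JSONPath output. It assumes the ProxmoxVM resource exposes the usual Crossplane-style `Ready` and `Synced` conditions under `.status.conditions`; `vm_conditions` and `vm_is_ready` are hypothetical helper names.

```shell
# Print each status condition as: TYPE<TAB>STATUS<TAB>REASON
# Usage: vm_conditions <vm-name>
vm_conditions() {
  kubectl get proxmoxvm "$1" \
    -o jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.status}{"\t"}{.reason}{"\n"}{end}'
}

# Exit 0 only when a Ready condition with status True is present.
# Usage: vm_is_ready <vm-name> && echo "VM is ready"
vm_is_ready() {
  vm_conditions "$1" |
    awk -F '\t' '$1 == "Ready" && $2 == "True" { found = 1 } END { exit !found }'
}
```

A `Synced` condition that is `True` alongside a `Ready` condition that stays `False` usually means the provider can reach Proxmox but the VM itself has not come up.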
    

Storage Issues

Symptoms

  • VM creation fails with storage errors
  • "Storage not found" errors
  • Insufficient storage errors

Solutions

  1. List Available Storage:

    # Via Proxmox API (TICKET is the value returned by /access/ticket)
    curl -k -b "PVEAuthCookie=TICKET" \
      https://your-proxmox:8006/api2/json/storage
    
  2. Check Storage Capacity:

    • Log into Proxmox Web UI
    • Check Storage section
    • Verify available space
  3. Update Storage Name:

    • Verify actual storage pool name
    • Update VM manifest if needed
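
With shell access to the Proxmox node, the name and capacity checks above can also be done from the command line. The sketch below parses the table printed by `pvesm status`; `storage_has_space` is a hypothetical helper, and it assumes the size columns are reported in KiB with the available space in the sixth column.

```shell
# Check that a storage pool exists, is active, and has at least MIN_GIB free.
# Usage (on the Proxmox node): storage_has_space local-lvm 20
storage_has_space() {
  pvesm status | awk -v name="$1" -v min_gib="$2" '
    NR > 1 && $1 == name {
      seen = 1
      avail_gib = $6 / 1024 / 1024   # assumed KiB -> GiB
      if ($3 != "active") {
        print name ": not active (" $3 ")"; rc = 1
      } else if (avail_gib < min_gib) {
        printf "%s: only %.1f GiB available\n", name, avail_gib; rc = 1
      } else {
        printf "%s: %.1f GiB available\n", name, avail_gib
      }
    }
    END {
      if (!seen) { print name ": storage not found"; rc = 1 }
      exit rc
    }'
}
```

A "storage not found" result here corresponds to the "Storage not found" symptom: the name in the VM manifest does not match any pool on that node.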

Network Issues

Symptoms

  • VM created but no network connectivity
  • IP address not assigned
  • Network bridge errors

Solutions

  1. Verify Network Bridge:

    # Via Proxmox API (TICKET is the value returned by /access/ticket;
    # replace ML110-01 with your node name)
    curl -k -b "PVEAuthCookie=TICKET" \
      https://your-proxmox:8006/api2/json/nodes/ML110-01/network
    
  2. Check Network Configuration:

    • Verify bridge name in VM manifest
    • Check bridge exists on node
    • Verify bridge is active
  3. Check DHCP:

    • Verify DHCP server is running
    • Check network configuration
    • Review VM network settings
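
To confirm that a bridge exists and is active without going through the API, the check can be run directly on the Proxmox node with `ip` from iproute2. This is a minimal sketch; `bridge_state` is a hypothetical helper and `vmbr0` is only an example bridge name.

```shell
# Print the operational state of a bridge (e.g. UP, DOWN, UNKNOWN),
# or "missing" when the interface does not exist on this node.
# Usage (on the Proxmox node): bridge_state vmbr0
bridge_state() {
  # Brief output has the form: NAME STATE MAC <FLAGS>
  state=$(ip -br link show dev "$1" 2>/dev/null | awk '{ print $2 }')
  echo "${state:-missing}"
}
```

A result of "missing" means the bridge name in the VM manifest does not exist on the node; a state of `DOWN` means it exists but is not active.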

Authentication Failures

Symptoms

  • 401 Unauthorized errors
  • Authentication failed messages
  • Token/ticket errors

Solutions

  1. Verify Credentials:

    • Check username format: user@realm
    • Verify password is correct
    • Check token format if using API tokens: user@realm!token-id plus the token secret
  2. Test Authentication:

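    # Password auth (returns a ticket on success)
    curl -k \
      --data-urlencode "username=root@pam" \
      --data-urlencode "password=your-password" \
      https://your-proxmox:8006/api2/json/access/ticket
    
    # Token auth (API token format: user@realm!token-id=secret)
    curl -k -H 'Authorization: PVEAPIToken=root@pam!token-id=token-secret' \
      https://your-proxmox:8006/api2/json/version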
    
  3. Check Permissions:

    • Verify user has VM creation permissions
    • Check token permissions
    • Review Proxmox user roles
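
Proxmox API tokens are sent in an `Authorization` header of the form `PVEAPIToken=USER@REALM!TOKENID=SECRET`, whereas tickets are sent as a `PVEAuthCookie` cookie. The sketch below builds the token header and queries the API with it. `pve_token_header` and `pve_get` are hypothetical helpers, and the host, token ID, and secret in the usage lines are placeholders.

```shell
# Build the Authorization header for an API token.
# Usage: pve_token_header root@pam token-id token-secret
pve_token_header() {
  printf 'Authorization: PVEAPIToken=%s!%s=%s\n' "$1" "$2" "$3"
}

# Send an authenticated GET request.
# Usage: pve_get HOST API_PATH HEADER
pve_get() {
  curl -k -s -H "$3" "https://$1:8006/api2/json$2"
}

# Example: show the effective permissions of the token, which helps
# when checking whether it is allowed to create VMs.
#   header=$(pve_token_header root@pam token-id token-secret)
#   pve_get your-proxmox /access/permissions "$header"
```

An empty or minimal permissions result with a valid token suggests the token was created with privilege separation and has not been granted its own roles.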

Provider Pod Issues

Symptoms

  • Provider pod not starting
  • Provider pod crashing
  • Provider pod in Error state

Solutions

  1. Check Pod Status:

    kubectl get pods -n crossplane-system -l app=crossplane-provider-proxmox
    kubectl describe pod -n crossplane-system -l app=crossplane-provider-proxmox
    
  2. Check Pod Logs:

    kubectl logs -n crossplane-system -l app=crossplane-provider-proxmox --tail=100
    
  3. Check Image:

    kubectl get deployment -n crossplane-system crossplane-provider-proxmox -o yaml | grep image
    
  4. Verify Resources:

    kubectl get deployment -n crossplane-system crossplane-provider-proxmox -o yaml | grep -A 5 resources
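
When the pod is crash-looping, the logs of the current container are often empty and the useful output is in the previous container instance. A sketch that lists pods with restarts so their previous logs can be pulled; `provider_restarts` and `crashing_pods` are hypothetical helper names, and the label selector is the one used throughout this guide.

```shell
# Print each provider pod as: NAME<TAB>RESTART_COUNT (first container only)
provider_restarts() {
  kubectl get pods -n crossplane-system -l app=crossplane-provider-proxmox \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[0].restartCount}{"\n"}{end}'
}

# Print the names of pods that have restarted at least once.
crashing_pods() {
  provider_restarts | awk -F '\t' '$2 > 0 { print $1 }'
}

# Example: fetch the logs of the crashed container instance.
#   for pod in $(crashing_pods); do
#     kubectl logs -n crossplane-system "$pod" --previous --tail=100
#   done
```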
    

Diagnostic Commands

Check Provider Health

# Provider status
kubectl get deployment -n crossplane-system crossplane-provider-proxmox

# Provider logs
kubectl logs -n crossplane-system -l app=crossplane-provider-proxmox --tail=50

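# Provider metrics (port-forward blocks until interrupted, so run it in one
# terminal and the curl command in a second terminal)
kubectl port-forward -n crossplane-system deployment/crossplane-provider-proxmox 8080:8080
curl http://localhost:8080/metrics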

Check VM Resources

# List all VMs
kubectl get proxmoxvm

# Get VM details
kubectl get proxmoxvm <vm-name> -o yaml

# Check VM events
kubectl describe proxmoxvm <vm-name>

Check ProviderConfig

# List ProviderConfigs
kubectl get providerconfig

# Get ProviderConfig details
kubectl get providerconfig proxmox-provider-config -o yaml

# Check ProviderConfig status
kubectl describe providerconfig proxmox-provider-config

Escalation Procedures

Level 1: Basic Troubleshooting

  1. Check provider logs
  2. Verify credentials
  3. Test connectivity
  4. Review VM resource status

Level 2: Advanced Troubleshooting

  1. Check Proxmox Web UI
  2. Review Proxmox logs
  3. Verify network connectivity
  4. Check resource availability

Level 3: Infrastructure Issues

  1. Contact Proxmox administrator
  2. Check infrastructure status
  3. Review network configuration
  4. Verify DNS resolution

Prevention

  1. Regular Monitoring: Set up alerts for provider health
  2. Resource Verification: Verify resources before deployment
  3. Credential Rotation: Rotate credentials regularly
  4. Backup Configuration: Back up ProviderConfig and secrets
  5. Documentation: Keep documentation up to date
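
The configuration backup can be sketched as a short export of the two objects named in this guide. `backup_provider_config` is a hypothetical helper; the object names are the ones used in the commands above and may differ in your cluster.

```shell
# Export the ProviderConfig and credentials secret as YAML into a directory.
# Usage: backup_provider_config ./proxmox-backup
backup_provider_config() {
  dir="$1"
  mkdir -p "$dir" || return 1
  kubectl get providerconfig proxmox-provider-config -o yaml \
    > "$dir/providerconfig.yaml" &&
  kubectl get secret proxmox-credentials -n crossplane-system -o yaml \
    > "$dir/secret.yaml" &&
  # The secret is only base64-encoded, not encrypted: restrict access.
  chmod 600 "$dir/secret.yaml"
}
```

Because the exported secret contains recoverable credentials, store the backup directory in an encrypted location rather than in the repository.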