# QEMU Guest Agent: Complete Setup and Verification Procedure

**Last Updated**: 2025-12-11
**Status**: ✅ Complete and Verified

---

## Overview

This document describes how to ensure the QEMU Guest Agent is properly configured in all VMs across the Sankofa Phoenix infrastructure. The guest agent is critical for:

- Graceful VM shutdown/restart
- VM lock prevention
- Guest OS command execution
- IP address detection
- Resource monitoring

---

## Architecture

### Two-Level Configuration

1. **Proxmox Level** (`agent: 1` in VM config)
   - Configured by Crossplane provider automatically
   - Enables guest agent communication channel

2. **Guest OS Level** (package + service)
   - `qemu-guest-agent` package installed
   - `qemu-guest-agent` service running
   - Configured via cloud-init in all templates

Both levels are required: the Proxmox setting only opens the communication channel, and the agent stays unavailable until the service is running inside the guest.

---

## Automatic Configuration

### ✅ Crossplane Provider (Automatic)

The Crossplane provider **automatically** sets `agent: 1` during:

- **VM Creation** (`pkg/proxmox/client.go:317`)
- **VM Cloning** (`pkg/proxmox/client.go:242`)
- **VM Updates** (`pkg/proxmox/client.go:671`)

**No manual intervention required** - this is handled by the provider.

### ✅ Cloud-Init Templates (Automatic)

All VM templates include enhanced guest agent configuration:

1. **Package Installation**: `qemu-guest-agent` in packages list
2. **Service Enablement**: `systemctl enable qemu-guest-agent`
3. **Service Start**: `systemctl start qemu-guest-agent`
4. **Verification**: Automatic retry logic with status checks
5. **Error Handling**: Automatic installation if package missing

**Templates Updated**:

- ✅ `examples/production/basic-vm.yaml`
- ✅ `examples/production/medium-vm.yaml`
- ✅ `examples/production/large-vm.yaml`
- ✅ `crossplane-provider-proxmox/examples/vm-example.yaml`
- ✅ `gitops/infrastructure/claims/vm-claim-example.yaml`
- ✅ All 29 production VM templates (via enhancement script)

---

## Verification Procedures

In the commands below, replace `<VMID>` with the numeric ID of the VM.

### 1. Check Proxmox Configuration

**On Proxmox Node:**

```bash
# Check if guest agent is enabled in VM config
qm config <VMID> | grep agent

# Expected output:
# agent: 1
```

**If not enabled:**

```bash
qm set <VMID> --agent 1
```

### 2. Check Guest OS Package

**On Proxmox Node (requires working guest agent):**

```bash
# Check if package is installed
# (qm guest exec wraps the guest's output in JSON; the match appears in "out-data")
qm guest exec <VMID> -- dpkg -l | grep qemu-guest-agent

# Expected output contains:
# ii qemu-guest-agent <version> amd64 Guest communication agent for QEMU
```

**If not installed (via console/SSH):**

```bash
apt-get update
apt-get install -y qemu-guest-agent
systemctl enable qemu-guest-agent
systemctl start qemu-guest-agent
```

### 3. Check Guest OS Service

**On Proxmox Node:**

```bash
# Check service status
qm guest exec <VMID> -- systemctl status qemu-guest-agent

# Expected output:
# ● qemu-guest-agent.service - QEMU Guest Agent
#   Loaded: loaded (...)
#   Active: active (running) since ...
```

**If not running (inside the guest via console/SSH, because `qm guest exec` itself needs a running agent):**

```bash
systemctl enable qemu-guest-agent
systemctl start qemu-guest-agent
```

### 4. Comprehensive Check Script

**Use the automated check script:**

```bash
# On Proxmox node
/usr/local/bin/complete-vm-100-guest-agent-check.sh

# Or for any VM:
VMID=100 /usr/local/bin/complete-vm-100-guest-agent-check.sh
```

**Script checks:**

- ✅ VM exists and is running
- ✅ Proxmox guest agent config (`agent: 1`)
- ✅ Package installation
- ✅ Service status
- ✅ Provides clear error messages

---

## Troubleshooting

### Issue: "No QEMU guest agent configured"

**Symptoms:**

- `qm guest exec` commands fail
- Proxmox shows "No Guest Agent" in UI

**Causes:**

1. Guest agent not enabled in Proxmox config
2. Package not installed in guest OS
3. Service not running in guest OS
4. VM needs restart after configuration

**Solutions:**

1. **Enable in Proxmox:**

   ```bash
   qm set <VMID> --agent 1
   ```

2. **Install in Guest OS:**

   ```bash
   # Via console or SSH
   apt-get update
   apt-get install -y qemu-guest-agent
   systemctl enable qemu-guest-agent
   systemctl start qemu-guest-agent
   ```

3. **Restart VM:**

   ```bash
   qm shutdown <VMID>   # Graceful (requires working agent)
   # OR
   qm stop <VMID>       # Force stop
   qm start <VMID>
   ```

### Issue: VM Lock Issues

**Symptoms:**

- `qm` commands fail with lock errors
- VM appears stuck

**Solution:**

```bash
# Check for locks
ls -la /var/lock/qemu-server/lock-<VMID>.conf

# Remove lock (if safe)
qm unlock <VMID>

# Force stop if needed
qm stop <VMID> --skiplock
```

### Issue: Guest Agent Not Starting

**Symptoms:**

- Package installed but service not running
- Service fails to start

**Diagnosis (inside the guest):**

```bash
# Check service logs
journalctl -u qemu-guest-agent -n 50

# Check service status
systemctl status qemu-guest-agent -l
```

**Common Causes:**

- Missing dependencies
- Permission issues
- VM needs restart

**Solution:**

```bash
# Reinstall package
apt-get remove --purge qemu-guest-agent
apt-get install -y qemu-guest-agent

# Restart service
systemctl restart qemu-guest-agent

# If still failing, restart VM
```

---

## Best Practices

### 1. Always Include Guest Agent in Templates

**Required cloud-init configuration:**

```yaml
packages:
  - qemu-guest-agent

runcmd:
  - systemctl enable qemu-guest-agent
  - systemctl start qemu-guest-agent
  - |
    # Verification with retry.
    # runcmd entries run under sh, so use seq instead of bash's {1..30},
    # and break instead of exit so later runcmd entries still run.
    for i in $(seq 1 30); do
      if systemctl is-active --quiet qemu-guest-agent; then
        echo "✅ Guest agent running"
        break
      fi
      sleep 1
    done
```

### 2. Verify After VM Creation

**Always verify guest agent after creating a VM:**

```bash
# Wait for cloud-init to complete (usually 1-2 minutes)
sleep 120

# Check status
qm guest exec <VMID> -- systemctl status qemu-guest-agent
```

### 3. Monitor Guest Agent Status

**Regular monitoring:**

```bash
# Check all VMs
for vmid in $(qm list | tail -n +2 | awk '{print $1}'); do
  echo "VM $vmid:"
  qm config "$vmid" | grep agent || echo "  ⚠️ Agent not configured"
  qm guest exec "$vmid" -- systemctl is-active qemu-guest-agent 2>/dev/null && echo "  ✅ Running" || echo "  ❌ Not running"
done
```

### 4. Document Exceptions

If a VM cannot have guest agent (rare), document why:

- Legacy OS without support
- Special security requirements
- Known limitations

---

## Scripts and Tools

### Available Scripts

1. **`scripts/complete-vm-100-guest-agent-check.sh`**
   - Comprehensive check for VM 100
   - Installed on both Proxmox nodes
   - Location: `/usr/local/bin/complete-vm-100-guest-agent-check.sh`

2. **`scripts/copy-script-to-proxmox-nodes.sh`**
   - Copies scripts to Proxmox nodes
   - Uses SSH with password from `.env`

3. **`scripts/enhance-guest-agent-verification.py`**
   - Enhanced all 29 VM templates
   - Adds robust verification logic

### Usage

**Copy script to Proxmox nodes:**

```bash
bash scripts/copy-script-to-proxmox-nodes.sh
```

**Run check on Proxmox node:**

```bash
ssh root@<proxmox-node> /usr/local/bin/complete-vm-100-guest-agent-check.sh
```

---

## Verification Checklist

### For New VMs

- [ ] VM created with Crossplane provider (automatic `agent: 1`)
- [ ] Cloud-init template includes `qemu-guest-agent` package
- [ ] Cloud-init includes service enable/start commands
- [ ] Wait for cloud-init to complete (1-2 minutes)
- [ ] Verify package installed: `qm guest exec <VMID> -- dpkg -l | grep qemu-guest-agent`
- [ ] Verify service running: `qm guest exec <VMID> -- systemctl status qemu-guest-agent`
- [ ] Test graceful shutdown: `qm shutdown <VMID>`

### For Existing VMs

- [ ] Check Proxmox config: `qm config <VMID> | grep agent`
- [ ] Enable if missing: `qm set <VMID> --agent 1`
- [ ] Check package: `qm guest exec <VMID> -- dpkg -l | grep qemu-guest-agent`
- [ ] Install if missing (inside the guest via console/SSH): `apt-get install -y qemu-guest-agent`
- [ ] Check service: `qm guest exec <VMID> -- systemctl status qemu-guest-agent`
- [ ] Start if stopped (inside the guest via console/SSH): `systemctl start qemu-guest-agent`
- [ ] Restart VM if needed: `qm shutdown <VMID>` or `qm stop <VMID> && qm start <VMID>`

---

## Summary

✅ **Automatic Configuration:**

- Crossplane provider sets `agent: 1` automatically
- All templates include guest agent in cloud-init

✅ **Verification:**

- Use check scripts on Proxmox nodes
- Verify both Proxmox config and guest OS service

✅ **Troubleshooting:**

- Enable in Proxmox: `qm set <VMID> --agent 1`
- Install in guest: `apt-get install -y qemu-guest-agent`
- Start service: `systemctl start qemu-guest-agent`
- Restart VM if needed

✅ **Best Practices:**

- Always include in templates
- Verify after creation
- Monitor regularly
- Document exceptions

---

**Related Documents:**

- `docs/GUEST_AGENT_CONFIGURATION_ANALYSIS.md`
- `docs/VM_100_GUEST_AGENT_FIXED.md`
- `docs/GUEST_AGENT_VERIFICATION_ENHANCEMENT_COMPLETE.md`
- `docs/SCRIPT_COPIED_TO_PROXMOX_NODES.md`
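
---

## Appendix: Stricter Agent Config Check (Sketch)

The Proxmox-level check in this document greps the VM config for `agent`, which also matches `agent: 0` and does not distinguish the long form `agent: enabled=1,...` that Proxmox accepts. The following is a minimal sketch of a stricter check, not part of the existing scripts. The function name `agent_enabled` is hypothetical; it reads VM config text on stdin and succeeds only when the `agent:` line enables the agent.

```shell
#!/bin/sh
# agent_enabled (hypothetical helper): reads `qm config` output on stdin.
# Exit status 0 if the "agent:" line enables the guest agent, 1 otherwise.
agent_enabled() {
    awk -F': *' '
        $1 == "agent" {
            found = 0
            # The value is a comma-separated option list, e.g.
            # "1", "0", "enabled=1,fstrim_cloned_disks=1", "1,type=virtio".
            n = split($2, opts, ",")
            for (i = 1; i <= n; i++) {
                if (opts[i] == "1" || opts[i] == "enabled=1") {
                    found = 1
                }
            }
        }
        END { exit(found ? 0 : 1) }
    '
}
```

On a Proxmox node it could be used in place of the plain grep, for example `qm config "$vmid" | agent_enabled || echo "Agent not configured"`. Missing `agent:` lines and `agent: 0` are both treated as "not enabled".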