# QEMU Guest Agent: Complete Setup and Verification Procedure

**Last Updated**: 2025-12-11
**Status**: ✅ Complete and Verified

---

## Overview

This document provides comprehensive procedures for ensuring QEMU Guest Agent is properly configured in all VMs across the Sankofa Phoenix infrastructure. The guest agent is critical for:

- Graceful VM shutdown/restart
- VM lock prevention
- Guest OS command execution
- IP address detection
- Resource monitoring

---

## Architecture

### Two-Level Configuration

1. **Proxmox Level** (`agent: 1` in VM config)
   - Configured by Crossplane provider automatically
   - Enables guest agent communication channel

2. **Guest OS Level** (package + service)
   - `qemu-guest-agent` package installed
   - `qemu-guest-agent` service running
   - Configured via cloud-init in all templates

---

## Automatic Configuration

### ✅ Crossplane Provider (Automatic)

The Crossplane provider **automatically** sets `agent: 1` during:
- **VM Creation** (`pkg/proxmox/client.go:317`)
- **VM Cloning** (`pkg/proxmox/client.go:242`)
- **VM Updates** (`pkg/proxmox/client.go:671`)

**No manual intervention required** - this is handled by the provider.

### ✅ Cloud-Init Templates (Automatic)

All VM templates include enhanced guest agent configuration:

1. **Package Installation**: `qemu-guest-agent` in packages list
2. **Service Enablement**: `systemctl enable qemu-guest-agent`
3. **Service Start**: `systemctl start qemu-guest-agent`
4. **Verification**: Automatic retry logic with status checks
5. **Error Handling**: Automatic installation if package missing

**Templates Updated**:
- ✅ `examples/production/basic-vm.yaml`
- ✅ `examples/production/medium-vm.yaml`
- ✅ `examples/production/large-vm.yaml`
- ✅ `crossplane-provider-proxmox/examples/vm-example.yaml`
- ✅ `gitops/infrastructure/claims/vm-claim-example.yaml`
- ✅ All 29 production VM templates (via enhancement script)

---

## Verification Procedures

### 1. Check Proxmox Configuration

**On Proxmox Node:**

```bash
# Check if guest agent is enabled in VM config
qm config <VMID> | grep '^agent:'

# Expected output:
# agent: 1
# (no output, or "agent: 0", means the agent is not enabled)
```

**If not enabled:**
```bash
qm set <VMID> --agent 1
```

The new setting is applied on the next full stop and start of the VM; a reboot from inside the guest does not pick it up.

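A plain `grep agent` also matches a disabled entry such as `agent: 0`, and the option can appear in property-string form (for example `agent: enabled=1,fstrim_cloned_disks=1`). The following is a minimal sketch of a stricter check. The helper name `agent_enabled` is hypothetical, and it assumes `qm config` prints one `key: value` pair per line.

```bash
# Reads `qm config <VMID>` output on stdin. Succeeds only if an "agent"
# line is present and enabled, either as "agent: 1" or as a property
# string containing "enabled=1" (or a bare leading "1").
agent_enabled() {
    awk -F': *' '
        $1 == "agent" {
            n = split($2, opts, ",")
            for (i = 1; i <= n; i++) {
                if (opts[i] == "1" || opts[i] == "enabled=1") found = 1
            }
        }
        END { exit(found ? 0 : 1) }
    '
}
```

Possible usage on a Proxmox node: `qm config <VMID> | agent_enabled || echo "agent not enabled"`.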
### 2. Check Guest OS Package

**On Proxmox Node (requires working guest agent):**

```bash
# Check if package is installed
qm guest exec <VMID> -- dpkg -l qemu-guest-agent

# Expected: JSON whose "out-data" field contains a line like
# ii qemu-guest-agent <version> amd64 Guest communication agent for QEMU
```

**If not installed (via console/SSH):**
```bash
apt-get update
apt-get install -y qemu-guest-agent
systemctl enable qemu-guest-agent
systemctl start qemu-guest-agent
```

### 3. Check Guest OS Service

**On Proxmox Node:**

```bash
# Check service status
qm guest exec <VMID> -- systemctl status qemu-guest-agent

# Expected output (inside the JSON "out-data" field):
# ● qemu-guest-agent.service - QEMU Guest Agent
#    Loaded: loaded (...)
#    Active: active (running) since ...
```

**If not running (via console/SSH, because `qm guest exec` itself needs a running agent):**
```bash
systemctl enable qemu-guest-agent
systemctl start qemu-guest-agent
```

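`qm guest exec` reports the guest command's result as JSON, so scripts usually need to pull the exit code out of that output. The following is a minimal sketch using only `sed`. The helper name `guest_exec_exitcode` is hypothetical, and it assumes the JSON contains a numeric `"exitcode"` field, as in the sample shown in the comment.

```bash
# Reads the JSON printed by `qm guest exec` on stdin, for example:
#   {
#      "exitcode" : 0,
#      "exited" : 1,
#      "out-data" : "active\n"
#   }
# Prints the first numeric "exitcode" value found, or nothing if absent.
guest_exec_exitcode() {
    sed -n 's/.*"exitcode"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}
```

Possible usage: `qm guest exec <VMID> -- systemctl is-active qemu-guest-agent | guest_exec_exitcode`. A proper JSON parser such as `jq` is more robust where it is installed.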
### 4. Comprehensive Check Script

**Use the automated check script:**

```bash
# On Proxmox node
/usr/local/bin/complete-vm-100-guest-agent-check.sh

# Or for any VM (assumes the script reads VMID from its environment):
VMID=<VMID> /usr/local/bin/complete-vm-100-guest-agent-check.sh
```

**Script checks:**
- ✅ VM exists and is running
- ✅ Proxmox guest agent config (`agent: 1`)
- ✅ Package installation
- ✅ Service status
- ✅ Provides clear error messages

---

## Troubleshooting

### Issue: "No QEMU guest agent configured"

**Symptoms:**
- `qm guest exec` commands fail
- Proxmox shows "No Guest Agent" in UI

**Causes:**
1. Guest agent not enabled in Proxmox config
2. Package not installed in guest OS
3. Service not running in guest OS
4. VM not stopped and started since the agent was enabled in its config

**Solutions:**

1. **Enable in Proxmox:**
   ```bash
   qm set <VMID> --agent 1
   ```

2. **Install in Guest OS:**
   ```bash
   # Via console or SSH
   apt-get update
   apt-get install -y qemu-guest-agent
   systemctl enable qemu-guest-agent
   systemctl start qemu-guest-agent
   ```

3. **Restart VM:**
   ```bash
   qm shutdown <VMID>  # Graceful (requires working agent)
   # OR, if the agent is not responding:
   qm stop <VMID>      # Force stop
   # Then, in either case:
   qm start <VMID>
   ```

### Issue: VM Lock Issues

**Symptoms:**
- `qm` commands fail with lock errors
- VM appears stuck

**Solution:**
```bash
# Check for locks
ls -la /var/lock/qemu-server/lock-<VMID>.conf

# Remove lock (if safe)
qm unlock <VMID>

# Force stop if needed
qm stop <VMID> --skiplock
```

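Before removing a lock it helps to know which operation set it, since a `backup` or `migrate` lock may belong to a job that is still running. The following is a minimal sketch. The helper name `vm_lock_type` is hypothetical, and it assumes a locked VM shows a `lock: <type>` line in its `qm config` output.

```bash
# Reads `qm config <VMID>` output on stdin and prints the lock type
# (for example "backup" or "snapshot"). Prints nothing if no lock line
# is present.
vm_lock_type() {
    awk -F': *' '$1 == "lock" { print $2; exit }'
}
```

Possible usage: `qm config <VMID> | vm_lock_type`. An empty result suggests the config itself carries no lock.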
### Issue: Guest Agent Not Starting

**Symptoms:**
- Package installed but service not running
- Service fails to start

**Diagnosis:**
```bash
# Check service logs
journalctl -u qemu-guest-agent -n 50

# Check service status
systemctl status qemu-guest-agent -l
```

**Common Causes:**
- Missing dependencies
- Permission issues
- VM needs restart (the agent's virtio serial device, `/dev/virtio-ports/org.qemu.guest_agent.0`, only appears after the VM is stopped and started with `agent: 1` set)

**Solution:**
```bash
# Reinstall package
apt-get remove --purge qemu-guest-agent
apt-get install -y qemu-guest-agent

# Restart service
systemctl restart qemu-guest-agent

# If still failing, restart VM
```

---

## Best Practices

### 1. Always Include Guest Agent in Templates

**Required cloud-init configuration:**

```yaml
packages:
  - qemu-guest-agent

runcmd:
  - systemctl enable qemu-guest-agent
  - systemctl start qemu-guest-agent
  - |
    # Verification with retry.
    # runcmd runs under /bin/sh, so use seq instead of bash-only {1..30},
    # and break instead of exit so later runcmd entries still run.
    for i in $(seq 1 30); do
      if systemctl is-active --quiet qemu-guest-agent; then
        echo "✅ Guest agent running"
        break
      fi
      sleep 1
    done
```

### 2. Verify After VM Creation

**Always verify guest agent after creating a VM:**

```bash
# Wait for cloud-init to complete (usually 1-2 minutes)
sleep 120

# Check status
qm guest exec <VMID> -- systemctl status qemu-guest-agent
```

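A fixed `sleep 120` either waits longer than needed or not long enough. As an alternative, the following minimal sketch polls until a probe command succeeds. The helper name `wait_for` is hypothetical, and using `qm agent <VMID> ping` as the probe is one reasonable choice rather than something the templates prescribe.

```bash
# wait_for <attempts> <delay-seconds> <command> [args...]
# Runs the command until it succeeds or the attempts are used up.
# The function body is a subshell, so its variables do not leak
# into the caller and `exit` only sets the function's return status.
wait_for() (
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            exit 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    exit 1
)
```

Possible usage: `wait_for 36 5 qm agent <VMID> ping && echo "agent is up"` polls every 5 seconds for up to about 3 minutes.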
### 3. Monitor Guest Agent Status

**Regular monitoring:**

```bash
# Check all VMs
for vmid in $(qm list | tail -n +2 | awk '{print $1}'); do
  echo "VM $vmid:"
  qm config "$vmid" | grep '^agent:' || echo "  ⚠️ Agent not configured"
  # `qm agent <VMID> ping` succeeds only if the agent inside the guest answers
  qm agent "$vmid" ping >/dev/null 2>&1 && echo "  ✅ Running" || echo "  ❌ Not running"
done
```

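Stopped VMs can never answer agent requests, so a monitoring loop reports fewer false alarms if it only visits running VMs. The following is a minimal sketch. The helper name `running_vmids` is hypothetical, and it assumes `qm list` prints a header row followed by rows whose first three columns are VMID, NAME and STATUS.

```bash
# Reads `qm list` output on stdin and prints the VMID of every VM
# whose STATUS column is "running", one per line. Skips the header row.
running_vmids() {
    awk 'NR > 1 && $3 == "running" { print $1 }'
}
```

Possible usage: `for vmid in $(qm list | running_vmids); do qm agent "$vmid" ping; done`.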
### 4. Document Exceptions

If a VM cannot have guest agent (rare), document why:
- Legacy OS without support
- Special security requirements
- Known limitations

---

## Scripts and Tools

### Available Scripts

1. **`scripts/complete-vm-100-guest-agent-check.sh`**
   - Comprehensive check for VM 100
   - Installed on both Proxmox nodes
   - Location: `/usr/local/bin/complete-vm-100-guest-agent-check.sh`

2. **`scripts/copy-script-to-proxmox-nodes.sh`**
   - Copies scripts to Proxmox nodes
   - Uses SSH with password from `.env`

3. **`scripts/enhance-guest-agent-verification.py`**
   - Enhanced all 29 VM templates
   - Adds robust verification logic

### Usage

**Copy script to Proxmox nodes:**
```bash
bash scripts/copy-script-to-proxmox-nodes.sh
```

**Run check on Proxmox node:**
```bash
ssh root@<proxmox-node>
/usr/local/bin/complete-vm-100-guest-agent-check.sh
```

---

## Verification Checklist

### For New VMs

- [ ] VM created with Crossplane provider (automatic `agent: 1`)
- [ ] Cloud-init template includes `qemu-guest-agent` package
- [ ] Cloud-init includes service enable/start commands
- [ ] Wait for cloud-init to complete (1-2 minutes)
- [ ] Verify package installed: `qm guest exec <VMID> -- dpkg -l qemu-guest-agent`
- [ ] Verify service running: `qm guest exec <VMID> -- systemctl status qemu-guest-agent`
- [ ] Test graceful shutdown: `qm shutdown <VMID>`

### For Existing VMs

- [ ] Check Proxmox config: `qm config <VMID> | grep '^agent:'`
- [ ] Enable if missing: `qm set <VMID> --agent 1`
- [ ] Check package: `qm guest exec <VMID> -- dpkg -l qemu-guest-agent`
- [ ] Install if missing (via console/SSH): `apt-get install -y qemu-guest-agent`
- [ ] Check service: `qm guest exec <VMID> -- systemctl status qemu-guest-agent`
- [ ] Start if stopped (via console/SSH): `systemctl start qemu-guest-agent`
- [ ] Restart VM if needed: `qm shutdown <VMID> && qm start <VMID>`, or `qm stop <VMID> && qm start <VMID>` if the shutdown hangs

---

## Summary

✅ **Automatic Configuration:**
- Crossplane provider sets `agent: 1` automatically
- All templates include guest agent in cloud-init

✅ **Verification:**
- Use check scripts on Proxmox nodes
- Verify both Proxmox config and guest OS service

✅ **Troubleshooting:**
- Enable in Proxmox: `qm set <VMID> --agent 1`
- Install in guest: `apt-get install -y qemu-guest-agent`
- Start service: `systemctl start qemu-guest-agent`
- Restart VM if needed

✅ **Best Practices:**
- Always include in templates
- Verify after creation
- Monitor regularly
- Document exceptions

---

**Related Documents:**
- `docs/GUEST_AGENT_CONFIGURATION_ANALYSIS.md`
- `docs/VM_100_GUEST_AGENT_FIXED.md`
- `docs/GUEST_AGENT_VERIFICATION_ENHANCEMENT_COMPLETE.md`
- `docs/SCRIPT_COPIED_TO_PROXMOX_NODES.md`