Initial commit: loc_az_hci (smom-dbis-138 excluded via .gitignore)

Author: defiQUG
Date: 2026-02-08 09:04:46 -08:00
Commit: c39465c2bd
Co-authored-by: Cursor <cursoragent@cursor.com>

386 changed files with 50649 additions and 0 deletions

@@ -0,0 +1,216 @@
# Migration Guide: Hard-coded IPs → Guest Agent Discovery
**Date:** 2025-11-27
**Purpose:** Guide for updating remaining scripts to use guest-agent IP discovery
## Quick Reference
### Before
```bash
VMS=(
"100 cloudflare-tunnel 192.168.1.60"
"101 k3s-master 192.168.1.188"
)
read -r vmid name ip <<< "$vm_spec"
ssh "${VM_USER}@${ip}" ...
```
### After
```bash
source "$PROJECT_ROOT/scripts/lib/proxmox_vm_helpers.sh"
VMS=(
"100 cloudflare-tunnel"
"101 k3s-master"
)
read -r vmid name <<< "$vm_spec"
ip="$(get_vm_ip_or_warn "$vmid" "$name" || true)"
[[ -z "$ip" ]] && continue
ssh "${VM_USER}@${ip}" ...
```
## Step-by-Step Migration
### Step 1: Add Helper Library
At the top of your script (after loading .env):
```bash
# Import helper library
if [ -f "$PROJECT_ROOT/scripts/lib/proxmox_vm_helpers.sh" ]; then
source "$PROJECT_ROOT/scripts/lib/proxmox_vm_helpers.sh"
else
log_error "Helper library not found. Run this script on Proxmox host or via SSH."
exit 1
fi
```
### Step 2: Update VM Array
Remove IPs, keep only VMID and NAME:
```bash
# Before
VMS=(
"100 cloudflare-tunnel 192.168.1.60"
)
# After
VMS=(
"100 cloudflare-tunnel"
)
```
### Step 3: Update Loop Logic
```bash
# Before
for vm_spec in "${VMS[@]}"; do
read -r vmid name ip <<< "$vm_spec"
ssh "${VM_USER}@${ip}" ...
done
# After
for vm_spec in "${VMS[@]}"; do
read -r vmid name <<< "$vm_spec"
# Ensure guest agent is enabled
ensure_guest_agent_enabled "$vmid" || true
# Get IP from guest agent
ip="$(get_vm_ip_or_warn "$vmid" "$name" || true)"
if [[ -z "$ip" ]]; then
log_warn "Skipping VM $vmid ($name): no IP from guest agent"
continue
fi
ssh "${VM_USER}@${ip}" ...
done
```
### Step 4: For Bootstrap Scripts (QGA Installation)
Use fallback IPs:
```bash
# Fallback IPs for bootstrap
declare -A FALLBACK_IPS=(
["100"]="192.168.1.60"
["101"]="192.168.1.188"
)
for vm_spec in "${VMS[@]}"; do
read -r vmid name <<< "$vm_spec"
# Try guest agent first, fallback to hardcoded
ip="$(get_vm_ip_or_fallback "$vmid" "$name" "${FALLBACK_IPS[$vmid]:-}" || true)"
[[ -z "$ip" ]] && continue
# Install QGA using discovered/fallback IP
ssh "${VM_USER}@${ip}" "sudo apt install -y qemu-guest-agent"
done
```
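The helper library itself is not shown in this guide. As a rough reference, the fallback logic can be sketched like this; `get_vm_ip_or_warn` is stubbed out below (simulating a VM whose agent is not running), and the real implementation in `scripts/lib/proxmox_vm_helpers.sh` may differ:

```bash
# Sketch only -- the actual helper in scripts/lib/proxmox_vm_helpers.sh may differ.

# Stub standing in for the real guest-agent lookup; returns nothing,
# simulating a VM whose guest agent is not yet reachable.
get_vm_ip_or_warn() {
    return 1
}

get_vm_ip_or_fallback() {
    local vmid="$1" name="$2" fallback="${3:-}"
    local ip
    ip="$(get_vm_ip_or_warn "$vmid" "$name" || true)"
    if [[ -z "$ip" && -n "$fallback" ]]; then
        # Warning goes to stderr so callers capturing stdout get only the IP
        echo "[WARN] VM $vmid ($name): guest agent unavailable, using fallback $fallback" >&2
        ip="$fallback"
    fi
    [[ -n "$ip" ]] && echo "$ip"
}

get_vm_ip_or_fallback 100 cloudflare-tunnel 192.168.1.60   # prints 192.168.1.60
```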
## Scripts Already Updated
- `scripts/deploy/configure-vm-services.sh`
- `scripts/deploy/add-ssh-keys-to-vms.sh`
- `scripts/deploy/verify-cloud-init.sh`
- `scripts/infrastructure/install-qemu-guest-agent.sh`
- `scripts/fix/fix-vm-ssh-via-console.sh`
- `scripts/ops/ssh-test-all.sh` (example)
## Scripts Needing Update
📋 High Priority:
- `scripts/troubleshooting/diagnose-vm-issues.sh`
- `scripts/troubleshooting/test-all-access-paths.sh`
- `scripts/deploy/deploy-vms-via-api.sh` (IPs needed for creation, discovery after)
📋 Medium Priority:
- `scripts/vm-management/**/*.sh` (many scripts)
- `scripts/infrastructure/**/*.sh` (various)
📋 Low Priority:
- Documentation scripts
- One-time setup scripts
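To find scripts that still embed addresses, a simple recursive grep works; the helper below is illustrative only, and hits need manual review since some (e.g. fallback tables) are intentional:

```bash
# List *.sh files under a directory that contain literal IPv4 addresses.
# Illustrative helper, not part of the repo; review hits by hand.
find_hardcoded_ips() {
    grep -rn --include='*.sh' -E '([0-9]{1,3}\.){3}[0-9]{1,3}' "$1" || true
}

# Example: find_hardcoded_ips scripts/
```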
## Testing
After updating a script:
1. **Ensure jq is installed on Proxmox host:**
```bash
ssh root@192.168.1.206 "apt update && apt install -y jq"
```
2. **Ensure QEMU Guest Agent is installed in VMs:**
```bash
./scripts/infrastructure/install-qemu-guest-agent.sh
```
3. **Test the script:**
```bash
./scripts/your-updated-script.sh
```
4. **Verify IP discovery:**
- Script should discover IPs automatically
- No hard-coded IPs in output
- Graceful handling if guest agent unavailable
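To spot-check discovery by hand, query the agent directly on the Proxmox host with `qm agent <vmid> network-get-interfaces` and pull out the first non-loopback IPv4. The helper below parses that JSON with grep purely for illustration (the helper library uses jq; the exact JSON shape can also vary by Proxmox and agent version):

```bash
# Rough spot-check: extract the first non-loopback IPv4 from guest-agent
# output. Real parsing should use jq; grep keeps this example dependency-free.
first_guest_ipv4() {
    grep -o '"ip-address" *: *"[0-9][0-9.]*"' \
        | grep -v '127\.0\.0\.1' \
        | head -n 1 \
        | grep -o '[0-9][0-9.]*'
}

# Sample shaped like the agent's response (shortened, assumed format):
sample='[{"name":"lo","ip-addresses":[{"ip-address":"127.0.0.1","ip-address-type":"ipv4"}]},
{"name":"eth0","ip-addresses":[{"ip-address":"192.168.1.60","ip-address-type":"ipv4"}]}]'

echo "$sample" | first_guest_ipv4   # prints 192.168.1.60

# On the Proxmox host, run against a live VM instead:
# qm agent 100 network-get-interfaces | first_guest_ipv4
```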
## Common Patterns
### Pattern 1: Simple SSH Loop
```bash
for vm_spec in "${VMS[@]}"; do
read -r vmid name <<< "$vm_spec"
ip="$(get_vm_ip_or_warn "$vmid" "$name" || true)"
[[ -z "$ip" ]] && continue
ssh "${VM_USER}@${ip}" "command"
done
```
### Pattern 2: Collect IPs First
```bash
declare -A VM_IPS
for vm_spec in "${VMS[@]}"; do
read -r vmid name <<< "$vm_spec"
ip="$(get_vm_ip_or_warn "$vmid" "$name" || true)"
[[ -n "$ip" ]] && VM_IPS["$vmid"]="$ip"
done
# Use collected IPs
if [[ -n "${VM_IPS[100]:-}" ]]; then
do_something "${VM_IPS[100]}"
fi
```
### Pattern 3: Bootstrap with Fallback
```bash
declare -A FALLBACK_IPS=(
["100"]="192.168.1.60"
)
for vm_spec in "${VMS[@]}"; do
read -r vmid name <<< "$vm_spec"
ip="$(get_vm_ip_or_fallback "$vmid" "$name" "${FALLBACK_IPS[$vmid]:-}" || true)"
[[ -z "$ip" ]] && continue
# Use IP for bootstrap
done
```
## Benefits After Migration
1. ✅ No IP maintenance in scripts
2. ✅ Works with DHCP, dynamic IPs
3. ✅ Single source of truth (guest agent)
4. ✅ Easier to add new VMs
5. ✅ Better error handling
---
**Next:** Update remaining scripts following this pattern. Start with high-priority scripts.

scripts/README.md

@@ -0,0 +1,225 @@
# Scripts Directory
This directory contains all automation scripts for the Azure Stack HCI project. Scripts are organized by function for easy navigation and maintenance.
## Directory Structure
```
scripts/
├── deploy/ # Deployment scripts
├── infrastructure/ # Infrastructure setup scripts
├── maintenance/ # Maintenance scripts
│ ├── backup/ # Backup scripts
│ ├── update/ # Update scripts
│ └── cleanup/ # Cleanup scripts
├── vm-management/ # VM management scripts
│ ├── create/ # VM creation scripts
│ ├── configure/ # VM configuration scripts
│ └── monitor/ # VM monitoring scripts
├── testing/ # Testing scripts
├── health/ # Health check scripts
├── validate/ # Validation scripts
├── recovery/ # Recovery scripts
├── monitoring/ # Monitoring scripts
├── quality/ # Quality assurance scripts
├── docs/ # Documentation scripts
├── utils/ # Utility scripts
└── azure-arc/ # Azure Arc scripts
```
## Script Categories
### Deployment Scripts (`deploy/`)
Scripts for deploying the complete infrastructure:
- `complete-deployment.sh` - Complete deployment automation
- `deploy-all-services.sh` - Deploy all HC Stack services
- `deploy-start.sh` - Start deployment process
- `deploy-without-azure.sh` - Deploy without Azure integration
### Infrastructure Scripts (`infrastructure/`)
Scripts for setting up infrastructure components:
- `setup-k3s.sh` - Install and configure K3s
- `setup-git-server.sh` - Deploy Git server (Gitea/GitLab)
- `setup-cloudflare-tunnel.sh` - Configure Cloudflare Tunnel
- `setup-observability.sh` - Set up monitoring stack
- `setup-guest-agent.sh` - Install QEMU guest agent
- `download-ubuntu-cloud-image.sh` - Download Ubuntu cloud images
- `verify-proxmox-image.sh` - Verify Proxmox image integrity
- `fix-corrupted-image.sh` - Fix corrupted images
- `recreate-vms-from-template.sh` - Recreate VMs from template
- `auto-complete-template-setup.sh` - Automate template setup
- `automate-all-setup.sh` - Complete automation script
### VM Management Scripts (`vm-management/`)
#### Create (`vm-management/create/`)
Scripts for creating VMs:
- `create-all-vms.sh` - Create all service VMs
- `create-first-vm.sh` - Create first VM
- `create-vms-from-iso.sh` - Create VMs from ISO
- `create-vms-from-template.sh` - Create VMs from template
- `create-vms-via-ssh.sh` - Create VMs via SSH
- `create-vm-from-image.sh` - Create VM from disk image
- `create-vm-template.sh` - Create VM template
- `create-proxmox-template.sh` - Create Proxmox template
- `create-template-quick.sh` - Quick template creation
- `create-template-via-api.sh` - Create template via API
#### Configure (`vm-management/configure/`)
Scripts for configuring VMs:
- `setup-vms-complete.sh` - Complete VM setup
- `complete-vm-setup.sh` - Finish VM setup
- `complete-all-vm-tasks.sh` - Complete all VM tasks
- `apply-install-scripts.sh` - Apply installation scripts
- `fix-vm-config.sh` - Fix VM configuration
- `fix-vm-creation.sh` - Fix VM creation issues
- `fix-all-vm-configs.sh` - Fix all VM configurations
- `fix-boot-config.sh` - Fix boot configuration
- `fix-floppy-boot.sh` - Fix floppy boot issues
- `fix-guest-agent.sh` - Fix guest agent issues
- `final-vm-config-fix.sh` - Final VM configuration fix
- `set-boot-order-api.sh` - Set boot order via API
- `attach-iso-webui-guide.sh` - Guide for attaching ISO
- `manual-steps-guide.sh` - Manual steps guide
#### Monitor (`vm-management/monitor/`)
Scripts for monitoring VMs:
- `check-vm-status.sh` - Check VM status
- `check-vm-readiness.sh` - Check VM readiness
- `check-vm-disk-sizes.sh` - Check VM disk sizes
- `check-and-recreate.sh` - Check and recreate VMs
- `monitor-and-complete.sh` - Monitor and complete setup
### Utility Scripts (`utils/`)
General utility scripts:
- `prerequisites-check.sh` - Check system prerequisites
- `test-proxmox-connection.sh` - Test Proxmox connection
- `test-cloudflare-connection.sh` - Test Cloudflare connection
### Azure Arc Scripts (`azure-arc/`)
Scripts for Azure Arc integration:
- `onboard-proxmox-hosts.sh` - Onboard Proxmox hosts to Azure Arc
- `onboard-vms.sh` - Onboard VMs to Azure Arc
- `resource-bridge-setup.sh` - Set up Azure Arc Resource Bridge
### Quality Scripts (`quality/`)
Scripts for quality assurance:
- `lint-scripts.sh` - Lint all scripts with shellcheck
- `validate-scripts.sh` - Validate script syntax and dependencies
### Documentation Scripts (`docs/`)
Scripts for documentation management:
- `generate-docs-index.sh` - Generate documentation index
- `validate-docs.sh` - Validate documentation
- `update-diagrams.sh` - Update diagrams
## Script Standards
All scripts should follow these standards:
1. **Shebang**: `#!/bin/bash`
2. **Error Handling**: `set -euo pipefail` for immediate exit on errors, unset variables, and failed pipelines
3. **Logging**: Use consistent logging functions
4. **Documentation**: Include header with description and usage
5. **Parameters**: Use consistent parameter handling
6. **Versioning**: Include version information
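A minimal skeleton that follows these standards might look like the following; the script name, version, and default target are placeholders:

```bash
#!/bin/bash
# example-script.sh - One-line description of what the script does (placeholder).
# Usage:   ./example-script.sh [target]
# Version: 0.1.0 (placeholder)
set -euo pipefail

# Consistent logging functions, matching the style used across the repo
GREEN='\033[0;32m'
RED='\033[0;31m'
NC='\033[0m'
log_info()  { echo -e "${GREEN}[INFO]${NC} $1"; }
log_error() { echo -e "${RED}[ERROR]${NC} $1" >&2; }

main() {
    local target="${1:-default-target}"
    log_info "Processing $target"
}

main "$@"
```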
## Running Scripts
### Prerequisites Check
Before running any scripts, check prerequisites:
```bash
./scripts/utils/prerequisites-check.sh
```
### Testing Connections
Test connections before deployment:
```bash
# Test Proxmox
./scripts/utils/test-proxmox-connection.sh
# Test Cloudflare
./scripts/utils/test-cloudflare-connection.sh
```
### Deployment
Run complete deployment:
```bash
./scripts/deploy/complete-deployment.sh
```
### VM Management
Create VMs:
```bash
./scripts/vm-management/create/create-all-vms.sh
```
Monitor VMs:
```bash
./scripts/vm-management/monitor/check-vm-status.sh
```
## Script Dependencies
Many scripts depend on:
- Environment variables from `.env` file
- Proxmox API access
- Azure CLI authentication
- Network connectivity
Ensure these are configured before running scripts.
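A small pre-flight guard (hypothetical, not part of the repo) can verify the expected variables before a script proceeds:

```bash
# Fail fast when variables expected from .env are missing.
# Hypothetical helper; the repo's scripts validate variables individually.
require_env() {
    local missing=0 var
    for var in "$@"; do
        if [ -z "${!var:-}" ]; then
            echo "[ERROR] Required variable not set: $var" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Typical use near the top of a script, after loading .env:
# set -a; . ./.env; set +a
# require_env PROXMOX_HOST TENANT_ID SUBSCRIPTION_ID || exit 1
```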
## Troubleshooting Scripts
If a script fails:
1. Check prerequisites: `./scripts/utils/prerequisites-check.sh`
2. Verify environment variables: `cat .env`
3. Check script logs and error messages
4. Review script documentation in header
5. Test individual components
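Steps 3-5 can often be shortened with bash's own tooling; this snippet (illustrative only) syntax-checks a script without running it, before a traced re-run:

```bash
# Parse a script without executing it; catches quoting and block errors early.
check_syntax() {
    bash -n "$1"
}

# Self-contained demo against a throwaway script
demo="$(mktemp)"
printf '#!/bin/bash\necho ok\n' > "$demo"
if check_syntax "$demo"; then
    echo "syntax OK: $demo"
fi
rm -f "$demo"

# For runtime failures, re-run with tracing to a log:
# bash -x ./scripts/deploy/complete-deployment.sh 2> trace.log
```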
## Contributing
When adding new scripts:
1. Place in appropriate directory
2. Follow script standards
3. Add to this README
4. Include documentation header
5. Test thoroughly
## Additional Resources
- [Project README](../README.md)
- [Documentation](../docs/)
- [Deployment Guide](../docs/deployment/deployment-guide.md)


@@ -0,0 +1,169 @@
#!/bin/bash
source ~/.bashrc
# Azure Arc Onboarding Script for Proxmox Hosts
# Installs Azure Connected Machine Agent and connects Proxmox nodes to Azure
set -e
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'
# Azure configuration (set via environment variables)
RESOURCE_GROUP="${RESOURCE_GROUP:-HC-Stack}"
TENANT_ID="${TENANT_ID:-}"
LOCATION="${LOCATION:-eastus}"
SUBSCRIPTION_ID="${SUBSCRIPTION_ID:-}"
CLOUD="${CLOUD:-AzureCloud}"
TAGS="${TAGS:-type=proxmox}"
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
check_root() {
if [ "$EUID" -ne 0 ]; then
log_error "Please run as root"
exit 1
fi
}
validate_config() {
if [ -z "$TENANT_ID" ] || [ -z "$SUBSCRIPTION_ID" ] || [ -z "$RESOURCE_GROUP" ]; then
log_error "Required Azure configuration missing"
log_info "Required environment variables:"
log_info " TENANT_ID - Azure tenant ID"
log_info " SUBSCRIPTION_ID - Azure subscription ID"
log_info " RESOURCE_GROUP - Azure resource group name"
log_info " LOCATION - Azure region (default: eastus)"
exit 1
fi
}
check_azure_cli() {
if ! command -v az &> /dev/null; then
log_error "Azure CLI not found. Please install it first:"
log_info " curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash"
exit 1
fi
if ! az account show &>/dev/null; then
log_error "Azure CLI not authenticated. Please run: az login"
exit 1
fi
}
install_arc_agent() {
log_info "Installing Azure Connected Machine Agent..."
# Check if already installed
if command -v azcmagent &> /dev/null; then
log_warn "Azure Arc agent already installed"
azcmagent version
return
fi
# Download and install agent
log_info "Downloading Azure Arc agent installer..."
wget -q https://aka.ms/azcmagent -O /tmp/install_linux_azcmagent.sh
chmod +x /tmp/install_linux_azcmagent.sh
log_info "Running installer..."
/tmp/install_linux_azcmagent.sh
# Verify installation
if command -v azcmagent &> /dev/null; then
log_info "Azure Arc agent installed successfully"
azcmagent version
else
log_error "Failed to install Azure Arc agent"
exit 1
fi
}
connect_to_azure() {
log_info "Connecting machine to Azure Arc..."
# Check if already connected
if azcmagent show &>/dev/null; then
log_warn "Machine already connected to Azure Arc"
azcmagent show
read -p "Reconnect? (y/N): " -n 1 -r
echo
if [[ ! $REPLY =~ ^[Yy]$ ]]; then
return
fi
azcmagent disconnect --force-local-only
fi
# Connect to Azure
log_info "Connecting to Azure..."
log_info " Resource Group: $RESOURCE_GROUP"
log_info " Location: $LOCATION"
log_info " Subscription: $SUBSCRIPTION_ID"
if azcmagent connect \
--resource-group "$RESOURCE_GROUP" \
--tenant-id "$TENANT_ID" \
--location "$LOCATION" \
--subscription-id "$SUBSCRIPTION_ID" \
--cloud "$CLOUD" \
--tags "$TAGS" \
--correlation-id "proxmox-onboarding-$(date +%s)"; then
log_info "Successfully connected to Azure Arc"
else
log_error "Failed to connect to Azure Arc"
exit 1
fi
}
verify_connection() {
log_info "Verifying Azure Arc connection..."
# Show agent status
azcmagent show
# Verify in Azure Portal (via Azure CLI)
log_info "Verifying registration in Azure..."
MACHINE_NAME=$(hostname)
if az connectedmachine show \
--resource-group "$RESOURCE_GROUP" \
--name "$MACHINE_NAME" &>/dev/null; then
log_info "Machine found in Azure Portal"
az connectedmachine show \
--resource-group "$RESOURCE_GROUP" \
--name "$MACHINE_NAME" \
--query "{name:name, location:location, status:status}" -o table
else
log_warn "Machine not yet visible in Azure Portal (may take a few minutes)"
fi
}
main() {
log_info "Starting Azure Arc onboarding for Proxmox host..."
check_root
validate_config
check_azure_cli
install_arc_agent
connect_to_azure
verify_connection
log_info "Azure Arc onboarding completed successfully!"
log_info "View your machine in Azure Portal:"
log_info " https://portal.azure.com/#view/Microsoft_Azure_HybridCompute/MachinesBlade"
}
main "$@"

scripts/azure-arc/onboard-vms.sh

@@ -0,0 +1,205 @@
#!/bin/bash
source ~/.bashrc
# Azure Arc Onboarding Script for Proxmox VMs
# Onboards VMs running inside Proxmox to Azure Arc
set -e
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'
# Azure configuration
RESOURCE_GROUP="${RESOURCE_GROUP:-HC-Stack}"
TENANT_ID="${TENANT_ID:-}"
LOCATION="${LOCATION:-eastus}"
SUBSCRIPTION_ID="${SUBSCRIPTION_ID:-}"
CLOUD="${CLOUD:-AzureCloud}"
VM_TAGS="${VM_TAGS:-type=proxmox-vm,environment=hybrid}"
# VM configuration
VM_IP="${VM_IP:-}"
VM_USER="${VM_USER:-root}"
SSH_KEY="${SSH_KEY:-}"
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
validate_config() {
if [ -z "$TENANT_ID" ] || [ -z "$SUBSCRIPTION_ID" ] || [ -z "$RESOURCE_GROUP" ]; then
log_error "Required Azure configuration missing"
log_info "Required environment variables:"
log_info " TENANT_ID, SUBSCRIPTION_ID, RESOURCE_GROUP"
exit 1
fi
if [ -z "$VM_IP" ]; then
log_error "VM_IP must be set"
log_info "Usage: VM_IP=192.168.1.188 VM_USER=ubuntu ./onboard-vms.sh"
exit 1
fi
}
check_connectivity() {
log_info "Checking connectivity to VM: $VM_IP"
if ! ping -c 1 -W 2 "$VM_IP" &> /dev/null; then
log_error "Cannot reach VM at $VM_IP"
exit 1
fi
log_info "VM is reachable"
}
detect_os() {
# Log to stderr so command substitution captures only the OS ID
log_info "Detecting VM operating system..." >&2
if [ -n "$SSH_KEY" ]; then
SSH_CMD="ssh -i $SSH_KEY -o StrictHostKeyChecking=no $VM_USER@$VM_IP"
else
SSH_CMD="ssh -o StrictHostKeyChecking=no $VM_USER@$VM_IP"
fi
OS_TYPE=$($SSH_CMD "cat /etc/os-release | grep '^ID=' | cut -d'=' -f2 | tr -d '\"' || echo 'unknown'")
log_info "Detected OS: $OS_TYPE" >&2
echo "$OS_TYPE"
}
install_arc_agent_remote() {
local os_type=$1
log_info "Installing Azure Arc agent on VM..."
# Create installation script
cat > /tmp/install_arc_agent.sh <<'EOF'
#!/bin/bash
set -e
# Check if already installed
if command -v azcmagent &> /dev/null; then
echo "Azure Arc agent already installed"
azcmagent version
exit 0
fi
# Download and install
wget -q https://aka.ms/azcmagent -O /tmp/install_linux_azcmagent.sh
chmod +x /tmp/install_linux_azcmagent.sh
sudo /tmp/install_linux_azcmagent.sh
# Verify
if command -v azcmagent &> /dev/null; then
echo "Azure Arc agent installed successfully"
azcmagent version
else
echo "Failed to install Azure Arc agent"
exit 1
fi
EOF
# Copy and execute on remote VM
if [ -n "$SSH_KEY" ]; then
scp -i "$SSH_KEY" -o StrictHostKeyChecking=no /tmp/install_arc_agent.sh "$VM_USER@$VM_IP:/tmp/"
ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "$VM_USER@$VM_IP" "chmod +x /tmp/install_arc_agent.sh && sudo /tmp/install_arc_agent.sh"
else
scp -o StrictHostKeyChecking=no /tmp/install_arc_agent.sh "$VM_USER@$VM_IP:/tmp/"
ssh -o StrictHostKeyChecking=no "$VM_USER@$VM_IP" "chmod +x /tmp/install_arc_agent.sh && sudo /tmp/install_arc_agent.sh"
fi
log_info "Azure Arc agent installed on VM"
}
connect_vm_to_azure() {
log_info "Connecting VM to Azure Arc..."
# Create connection script
cat > /tmp/connect_arc.sh <<EOF
#!/bin/bash
set -e
# Check if already connected
if sudo azcmagent show &>/dev/null; then
echo "VM already connected to Azure Arc"
sudo azcmagent show
exit 0
fi
# Connect
if sudo azcmagent connect \\
--resource-group "$RESOURCE_GROUP" \\
--tenant-id "$TENANT_ID" \\
--location "$LOCATION" \\
--subscription-id "$SUBSCRIPTION_ID" \\
--cloud "$CLOUD" \\
--tags "$VM_TAGS" \\
--correlation-id "proxmox-vm-onboarding-\$(date +%s)"; then
echo "Successfully connected to Azure Arc"
sudo azcmagent show
else
echo "Failed to connect to Azure Arc"
exit 1
fi
EOF
# Copy and execute on remote VM
if [ -n "$SSH_KEY" ]; then
scp -i "$SSH_KEY" -o StrictHostKeyChecking=no /tmp/connect_arc.sh "$VM_USER@$VM_IP:/tmp/"
ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "$VM_USER@$VM_IP" "chmod +x /tmp/connect_arc.sh && /tmp/connect_arc.sh"
else
scp -o StrictHostKeyChecking=no /tmp/connect_arc.sh "$VM_USER@$VM_IP:/tmp/"
ssh -o StrictHostKeyChecking=no "$VM_USER@$VM_IP" "chmod +x /tmp/connect_arc.sh && /tmp/connect_arc.sh"
fi
log_info "VM connected to Azure Arc"
}
verify_vm_connection() {
log_info "Verifying VM connection in Azure..."
VM_HOSTNAME=$($SSH_CMD "hostname" 2>/dev/null || echo "unknown")
if command -v az &> /dev/null; then
if az connectedmachine show \
--resource-group "$RESOURCE_GROUP" \
--name "$VM_HOSTNAME" &>/dev/null; then
log_info "VM found in Azure Portal"
az connectedmachine show \
--resource-group "$RESOURCE_GROUP" \
--name "$VM_HOSTNAME" \
--query "{name:name, location:location, status:status}" -o table
else
log_warn "VM not yet visible in Azure Portal (may take a few minutes)"
fi
fi
}
main() {
log_info "Starting Azure Arc onboarding for Proxmox VM..."
validate_config
check_connectivity
OS_TYPE=$(detect_os)
install_arc_agent_remote "$OS_TYPE"
connect_vm_to_azure
verify_vm_connection
log_info "VM onboarding completed successfully!"
log_info "View your VMs in Azure Portal:"
log_info " https://portal.azure.com/#view/Microsoft_Azure_HybridCompute/MachinesBlade"
}
main "$@"


@@ -0,0 +1,209 @@
#!/bin/bash
source ~/.bashrc
# Azure Arc Resource Bridge Setup Script
# Deploys Azure Arc Resource Bridge for Proxmox VM lifecycle management
# This uses a K3s-based approach for the Resource Bridge
set -e
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'
# Azure configuration
RESOURCE_GROUP="${RESOURCE_GROUP:-HC-Stack}"
TENANT_ID="${TENANT_ID:-}"
LOCATION="${LOCATION:-eastus}"
SUBSCRIPTION_ID="${SUBSCRIPTION_ID:-}"
CLUSTER_NAME="${CLUSTER_NAME:-proxmox-arc-bridge}"
# K3s configuration
K3S_NODE_IP="${K3S_NODE_IP:-}"
K3S_USER="${K3S_USER:-root}"
SSH_KEY="${SSH_KEY:-}"
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
validate_config() {
if [ -z "$TENANT_ID" ] || [ -z "$SUBSCRIPTION_ID" ] || [ -z "$RESOURCE_GROUP" ]; then
log_error "Required Azure configuration missing"
exit 1
fi
if [ -z "$K3S_NODE_IP" ]; then
log_error "K3S_NODE_IP must be set (IP of node where K3s will run)"
exit 1
fi
if ! command -v az &> /dev/null; then
log_error "Azure CLI not found"
exit 1
fi
if ! command -v kubectl &> /dev/null; then
log_error "kubectl not found"
exit 1
fi
}
check_k3s_installed() {
log_info "Checking K3s installation on $K3S_NODE_IP..."
if [ -n "$SSH_KEY" ]; then
SSH_CMD="ssh -i $SSH_KEY -o StrictHostKeyChecking=no $K3S_USER@$K3S_NODE_IP"
else
SSH_CMD="ssh -o StrictHostKeyChecking=no $K3S_USER@$K3S_NODE_IP"
fi
if $SSH_CMD "command -v k3s &>/dev/null"; then
log_info "K3s is installed"
$SSH_CMD "k3s --version"
return 0
else
log_warn "K3s not found. Please install K3s first using k3s-install.sh"
return 1
fi
}
get_k3s_kubeconfig() {
log_info "Retrieving K3s kubeconfig..."
# Get kubeconfig from remote K3s node
if [ -n "$SSH_KEY" ]; then
ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "$K3S_USER@$K3S_NODE_IP" \
"sudo cat /etc/rancher/k3s/k3s.yaml" > /tmp/k3s-kubeconfig.yaml
else
ssh -o StrictHostKeyChecking=no "$K3S_USER@$K3S_NODE_IP" \
"sudo cat /etc/rancher/k3s/k3s.yaml" > /tmp/k3s-kubeconfig.yaml
fi
# Update server URL to use node IP
sed -i "s/127.0.0.1/$K3S_NODE_IP/g" /tmp/k3s-kubeconfig.yaml
export KUBECONFIG=/tmp/k3s-kubeconfig.yaml
# Verify connection
if kubectl cluster-info &>/dev/null; then
log_info "Successfully connected to K3s cluster"
kubectl get nodes
else
log_error "Failed to connect to K3s cluster"
exit 1
fi
}
onboard_k8s_to_arc() {
log_info "Onboarding Kubernetes cluster to Azure Arc..."
# Check if already onboarded
if az connectedk8s show \
--resource-group "$RESOURCE_GROUP" \
--name "$CLUSTER_NAME" &>/dev/null; then
log_warn "Cluster already onboarded to Azure Arc"
return
fi
# Install Azure Arc extensions for Kubernetes
log_info "Installing Azure Arc extensions..."
az extension add --name connectedk8s --upgrade || true
az extension add --name k8s-extension --upgrade || true
# Connect cluster to Azure Arc
log_info "Connecting cluster to Azure Arc..."
az connectedk8s connect \
--resource-group "$RESOURCE_GROUP" \
--name "$CLUSTER_NAME" \
--location "$LOCATION" \
--tags "type=proxmox-resource-bridge"
log_info "Waiting for cluster to be connected..."
sleep 30
# Verify connection
if az connectedk8s show \
--resource-group "$RESOURCE_GROUP" \
--name "$CLUSTER_NAME" \
--query "connectivityStatus" -o tsv | grep -q "Connected"; then
log_info "Cluster successfully connected to Azure Arc"
else
log_error "Cluster connection failed or still pending"
log_info "Check status: az connectedk8s show -g $RESOURCE_GROUP -n $CLUSTER_NAME"
fi
}
install_gitops_extension() {
log_info "Installing GitOps extension for Azure Arc Kubernetes..."
# Install GitOps extension
az k8s-extension create \
--resource-group "$RESOURCE_GROUP" \
--cluster-name "$CLUSTER_NAME" \
--cluster-type connectedClusters \
--extension-type microsoft.flux \
--name flux \
--scope cluster \
--release-namespace flux-system
log_info "GitOps extension installed"
log_info "This may take a few minutes to complete. Check status with:"
log_info " az k8s-extension show -g $RESOURCE_GROUP -c $CLUSTER_NAME -t connectedClusters -n flux"
}
create_custom_location() {
log_info "Creating custom location for Resource Bridge..."
CUSTOM_LOCATION_NAME="${CLUSTER_NAME}-location"
# Get cluster ID
CLUSTER_ID=$(az connectedk8s show \
--resource-group "$RESOURCE_GROUP" \
--name "$CLUSTER_NAME" \
--query "id" -o tsv)
# Create custom location
az customlocation create \
--resource-group "$RESOURCE_GROUP" \
--name "$CUSTOM_LOCATION_NAME" \
--host-resource-id "$CLUSTER_ID" \
--namespace arc-resource-bridge \
--location "$LOCATION"
log_info "Custom location created: $CUSTOM_LOCATION_NAME"
}
main() {
log_info "Starting Azure Arc Resource Bridge setup..."
validate_config
if ! check_k3s_installed; then
log_error "K3s must be installed first. Run k3s-install.sh"
exit 1
fi
get_k3s_kubeconfig
onboard_k8s_to_arc
install_gitops_extension
create_custom_location
log_info "Azure Arc Resource Bridge setup completed!"
log_info "Next steps:"
log_info " 1. Configure Proxmox custom provider for VM lifecycle control"
log_info " 2. Set up GitOps repository for declarative deployments"
log_info " 3. View cluster in Azure Portal:"
log_info " https://portal.azure.com/#view/Microsoft_Azure_HybridCompute/KubernetesBlade"
}
main "$@"


@@ -0,0 +1,87 @@
#!/bin/bash
# Automated Gitea Setup via API
# This script attempts to configure Gitea programmatically
set -euo pipefail
GITEA_IP="${GITEA_IP:-192.168.1.121}"
GITEA_URL="http://${GITEA_IP}:3000"
ADMIN_USER="${ADMIN_USER:-admin}"
ADMIN_EMAIL="${ADMIN_EMAIL:-admin@hc-stack.local}"
ADMIN_PASSWORD="${ADMIN_PASSWORD:-admin123}"
echo "=== Automated Gitea Setup ==="
echo ""
# Check if Gitea is already configured
echo "Checking Gitea status..."
STATUS=$(curl -s "${GITEA_URL}/api/v1/version" 2>&1 || echo "not_ready")
if echo "$STATUS" | grep -q "version"; then
echo "✓ Gitea is already configured"
echo "Access: ${GITEA_URL}"
exit 0
fi
echo "Gitea needs initial setup. Attempting automated configuration..."
echo ""
# Try to configure via setup API
SETUP_RESPONSE=$(curl -s -X POST "${GITEA_URL}/api/v1/setup" \
-H "Content-Type: application/json" \
-d "{
\"db_type\": \"postgres\",
\"db_host\": \"db:5432\",
\"db_user\": \"gitea\",
\"db_passwd\": \"gitea\",
\"db_name\": \"gitea\",
\"ssl_mode\": \"disable\",
\"repo_root_path\": \"/data/git/repositories\",
\"lfs_root_path\": \"/data/git/lfs\",
\"log_root_path\": \"/data/gitea/log\",
\"run_user\": \"git\",
\"domain\": \"${GITEA_IP}\",
\"ssh_port\": 2222,
\"http_port\": 3000,
\"app_name\": \"Gitea\",
\"enable_federated_avatar\": false,
\"enable_open_id_sign_in\": false,
\"enable_open_id_sign_up\": false,
\"default_allow_create_organization\": true,
\"default_enable_timetracking\": true,
\"no_reply_address\": \"noreply.hc-stack.local\",
\"admin_name\": \"${ADMIN_USER}\",
\"admin_email\": \"${ADMIN_EMAIL}\",
\"admin_passwd\": \"${ADMIN_PASSWORD}\",
\"admin_confirm_passwd\": \"${ADMIN_PASSWORD}\"
}" 2>&1)
if echo "$SETUP_RESPONSE" | grep -q "success\|created"; then
echo "✓ Gitea configured successfully!"
echo ""
echo "Access: ${GITEA_URL}"
echo "Username: ${ADMIN_USER}"
echo "Password: ${ADMIN_PASSWORD}"
echo ""
echo "⚠️ Please change the default password after first login"
else
echo "⚠️ Automated setup failed or Gitea requires manual configuration"
echo ""
echo "Please complete setup manually:"
echo "1. Open: ${GITEA_URL}"
echo "2. Complete the installation form"
echo "3. Use the following settings:"
echo " - Database Type: PostgreSQL"
echo " - Database Host: db:5432"
echo " - Database User: gitea"
echo " - Database Password: gitea"
echo " - Database Name: gitea"
echo " - Repository Root: /data/git/repositories"
echo " - SSH Server Domain: ${GITEA_IP}"
echo " - SSH Port: 2222"
echo " - HTTP Port: 3000"
echo " - Gitea Base URL: ${GITEA_URL}"
echo ""
echo "Response: $SETUP_RESPONSE"
fi


@@ -0,0 +1,68 @@
#!/bin/bash
# Complete All Remaining Configuration Steps
# This script orchestrates the completion of all remaining tasks
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Colors
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
RED='\033[0;31m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_info "=== Completing All Remaining Configuration Steps ==="
echo ""
# 1. Gitea Setup
log_info "Step 1: Configuring Gitea..."
if [ -f "$SCRIPT_DIR/automate-gitea-setup.sh" ]; then
"$SCRIPT_DIR/automate-gitea-setup.sh"
else
log_warn "Gitea setup script not found. Please configure manually."
fi
echo ""
# 2. Create GitOps Repository
log_info "Step 2: Creating GitOps repository in Gitea..."
# This will be done via API if Gitea is configured
echo ""
# 3. Configure Flux GitRepository
log_info "Step 3: Configuring Flux GitRepository..."
# This will be done after Gitea repository is created
echo ""
# 4. Cloudflare Tunnel
log_info "Step 4: Cloudflare Tunnel..."
log_warn "Cloudflare Tunnel requires interactive authentication."
log_info "Run: ./scripts/configure/complete-cloudflare-tunnel.sh"
echo ""
log_info "=== Configuration Steps Summary ==="
echo ""
log_info "Completed:"
log_info " ✓ Gitea automated setup attempted"
log_info " ✓ GitOps repository structure created"
log_info " ✓ Flux Kustomizations configured"
echo ""
log_warn "Manual steps required:"
log_info " 1. Verify Gitea setup: http://192.168.1.121:3000"
log_info " 2. Complete Cloudflare Tunnel: ./scripts/configure/complete-cloudflare-tunnel.sh"
log_info " 3. Push GitOps manifests to repository"
echo ""


@@ -0,0 +1,53 @@
#!/bin/bash
# Complete Cloudflare Tunnel Setup
# This script provides step-by-step instructions for completing Cloudflare Tunnel
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
VM_USER="${VM_USER:-ubuntu}"
SSH_KEY="${SSH_KEY:-$HOME/.ssh/id_ed25519_proxmox}"
PROXMOX_HOST="${PROXMOX_HOST:-192.168.1.206}"
VM_IP="${VM_IP:-192.168.1.244}"
echo "=== Complete Cloudflare Tunnel Setup ==="
echo ""
echo "This requires interactive browser authentication."
echo ""
echo "Steps:"
echo ""
echo "1. SSH to VM 100:"
echo " ssh -i $SSH_KEY root@${PROXMOX_HOST}"
echo " ssh -i $SSH_KEY ${VM_USER}@${VM_IP}"
echo ""
echo "2. Authenticate with Cloudflare:"
echo " cloudflared tunnel login"
echo " (This will open a browser window for authentication)"
echo ""
echo "3. Create tunnel:"
echo " cloudflared tunnel create azure-stack-hci"
echo ""
echo "4. Get tunnel ID:"
echo " cloudflared tunnel list"
echo ""
echo "5. Update config.yml with tunnel ID:"
echo " sudo nano /etc/cloudflared/config.yml"
echo " (Replace 'tunnel: \$TUNNEL_TOKEN' with 'tunnel: <tunnel-id>')"
echo ""
echo "6. Restart service:"
echo " sudo systemctl restart cloudflared"
echo " sudo systemctl status cloudflared"
echo ""
echo "7. Verify tunnel is running:"
echo " cloudflared tunnel info <tunnel-id>"
echo ""
echo "8. Configure DNS in Cloudflare Dashboard:"
echo " - grafana.d-bis.org → CNAME to <tunnel-id>.cfargotunnel.com"
echo " - prometheus.d-bis.org → CNAME to <tunnel-id>.cfargotunnel.com"
echo " - git.d-bis.org → CNAME to <tunnel-id>.cfargotunnel.com"
echo " - proxmox-ml110.d-bis.org → CNAME to <tunnel-id>.cfargotunnel.com"
echo " - proxmox-r630.d-bis.org → CNAME to <tunnel-id>.cfargotunnel.com"
echo ""


@@ -0,0 +1,52 @@
#!/bin/bash
# Complete Cloudflare Tunnel Setup via Proxmox Host
# This script provides commands to complete Cloudflare Tunnel setup
set -euo pipefail
PROXMOX_HOST="${PROXMOX_HOST:-192.168.1.206}"
VM_IP="${VM_IP:-192.168.1.244}"
VM_USER="${VM_USER:-ubuntu}"
SSH_KEY="${SSH_KEY:-$HOME/.ssh/id_ed25519_proxmox}"
echo "=== Complete Cloudflare Tunnel Setup ==="
echo ""
echo "This requires interactive browser authentication."
echo ""
echo "Steps to complete via Proxmox host:"
echo ""
echo "1. SSH to Proxmox host:"
echo " ssh -i $SSH_KEY root@${PROXMOX_HOST}"
echo ""
echo "2. SSH to VM 100:"
echo " ssh -i $SSH_KEY ${VM_USER}@${VM_IP}"
echo ""
echo "3. Authenticate with Cloudflare (interactive):"
echo " cloudflared tunnel login"
echo " (This will open a browser window - follow the prompts)"
echo ""
echo "4. Create tunnel:"
echo " cloudflared tunnel create azure-stack-hci"
echo ""
echo "5. Get tunnel ID:"
echo " cloudflared tunnel list"
echo ""
echo "6. Update config.yml with tunnel ID:"
echo " sudo nano /etc/cloudflared/config.yml"
echo " (Replace the tunnel: line with the actual tunnel ID)"
echo ""
echo "7. Restart service:"
echo " sudo systemctl restart cloudflared"
echo " sudo systemctl status cloudflared"
echo ""
echo "8. Verify tunnel:"
echo " cloudflared tunnel info <tunnel-id>"
echo ""
echo "9. Configure DNS in Cloudflare Dashboard:"
echo " - grafana.d-bis.org → CNAME to <tunnel-id>.cfargotunnel.com"
echo " - prometheus.d-bis.org → CNAME to <tunnel-id>.cfargotunnel.com"
echo " - git.d-bis.org → CNAME to <tunnel-id>.cfargotunnel.com"
echo " - proxmox-ml110.d-bis.org → CNAME to <tunnel-id>.cfargotunnel.com"
echo " - proxmox-r630.d-bis.org → CNAME to <tunnel-id>.cfargotunnel.com"
echo ""

@@ -0,0 +1,49 @@
#!/bin/bash
# Gitea First-Time Setup Helper
# This script provides instructions and API calls for Gitea setup
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
VM_USER="${VM_USER:-ubuntu}"
SSH_KEY="${SSH_KEY:-$HOME/.ssh/id_ed25519_proxmox}"
GITEA_IP="${GITEA_IP:-192.168.1.121}"
GITEA_URL="http://${GITEA_IP}:3000"
echo "=== Gitea First-Time Setup Helper ==="
echo ""
echo "Gitea URL: $GITEA_URL"
echo ""
echo "Since Gitea requires interactive first-time setup, please:"
echo ""
echo "1. Open your browser and navigate to: $GITEA_URL"
echo ""
echo "2. Complete the installation form:"
echo " - Database Type: PostgreSQL"
echo " - Database Host: db:5432"
echo " - Database User: gitea"
echo " - Database Password: gitea"
echo " - Database Name: gitea"
echo " - Repository Root Path: /data/git/repositories"
echo " - Git LFS Root Path: /data/git/lfs"
echo " - Run As Username: git"
echo " - SSH Server Domain: ${GITEA_IP}"
echo " - SSH Port: 2222"
echo " - HTTP Port: 3000"
echo " - Gitea Base URL: $GITEA_URL"
echo ""
echo "3. Create the initial administrator account"
echo ""
echo "4. After setup, you can use the API:"
echo " - Create repositories via API"
echo " - Create users via API"
echo " - Configure webhooks"
echo ""
echo "API Documentation: $GITEA_URL/api/swagger"
echo ""
echo "To check if Gitea is ready:"
echo " curl -s $GITEA_URL/api/v1/version"
echo ""

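The readiness probe at the end (`curl .../api/v1/version`) is worth looping until Gitea answers before driving the API; a sketch of such a helper (function name and defaults are illustrative, not part of the script above):

```bash
#!/bin/bash
# Poll a URL until it responds successfully, with bounded attempts.
wait_for_http() {
    local url=$1 attempts=${2:-30} delay=${3:-2}
    local i
    for ((i = 1; i <= attempts; i++)); do
        if curl -sf --max-time 5 "$url" >/dev/null 2>&1; then
            return 0
        fi
        sleep "$delay"
    done
    return 1
}

# Assumed usage:
#   wait_for_http "http://${GITEA_IP}:3000/api/v1/version" 30 2 || exit 1
```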
@@ -0,0 +1,165 @@
#!/bin/bash
source ~/.bashrc
# Add SSH Keys to VMs via Proxmox API
# Configures SSH keys for ubuntu user in all VMs
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
PVE_USERNAME="${PVE_USERNAME:-root@pam}"
PVE_PASSWORD="${PVE_ROOT_PASS:-}"
PROXMOX_URL="${PROXMOX_ML110_URL:-https://192.168.1.206:8006}"
PROXMOX_NODE="${PROXMOX_NODE:-pve}"
SSH_KEY_FILE="$HOME/.ssh/id_ed25519_proxmox.pub"
get_api_token() {
local response=$(curl -s -k --connect-timeout 10 --max-time 15 \
-d "username=$PVE_USERNAME&password=$PVE_PASSWORD" \
"$PROXMOX_URL/api2/json/access/ticket" 2>&1)
if echo "$response" | grep -q '"data"'; then
local ticket=$(echo "$response" | grep -o '"ticket":"[^"]*' | cut -d'"' -f4)
local csrf_token=$(echo "$response" | grep -o '"CSRFPreventionToken":"[^"]*' | cut -d'"' -f4)
echo "$ticket|$csrf_token"
else
echo ""
fi
}
add_ssh_key_to_vm() {
local vmid=$1
local name=$2
log_info "Adding SSH key to VM $vmid ($name)..."
local tokens=$(get_api_token)
local ticket=$(echo "$tokens" | cut -d'|' -f1)
local csrf_token=$(echo "$tokens" | cut -d'|' -f2)
if [ -z "$ticket" ] || [ -z "$csrf_token" ]; then
log_error "Failed to get API tokens"
return 1
fi
if [ ! -f "$SSH_KEY_FILE" ]; then
log_error "SSH key file not found: $SSH_KEY_FILE"
return 1
fi
# URL-encode the SSH key: the Proxmox API stores the sshkeys config value
# URL-encoded (a base64 blob here would be written verbatim into
# authorized_keys and never match); python3 does the percent-encoding
local ssh_key_enc
ssh_key_enc=$(python3 -c 'import sys, urllib.parse; print(urllib.parse.quote(sys.stdin.read().strip(), safe=""))' < "$SSH_KEY_FILE")
# Add SSH key via cloud-init
local result=$(curl -s -k -X PUT -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
--data-urlencode "sshkeys=$ssh_key_enc" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/config" 2>&1)
if echo "$result" | grep -q '"data"'; then
log_info "✓ SSH key added to VM $vmid"
return 0
else
log_error "Failed to add SSH key: $result"
return 1
fi
}
reboot_vm() {
local vmid=$1
local name=$2
log_info "Rebooting VM $vmid ($name) to apply SSH key..."
local tokens=$(get_api_token)
local ticket=$(echo "$tokens" | cut -d'|' -f1)
local csrf_token=$(echo "$tokens" | cut -d'|' -f2)
curl -s -k -X POST -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/status/reboot" > /dev/null
log_info "VM $vmid rebooted"
}
main() {
log_info "Adding SSH Keys to VMs"
echo ""
if [ ! -f "$SSH_KEY_FILE" ]; then
log_error "SSH key file not found: $SSH_KEY_FILE"
log_info "Run: ./scripts/utils/setup-ssh-keys.sh"
exit 1
fi
local vms=(
"100 cloudflare-tunnel"
"101 k3s-master"
"102 git-server"
"103 observability"
)
# Add SSH keys
for vm_spec in "${vms[@]}"; do
read -r vmid name <<< "$vm_spec"
add_ssh_key_to_vm "$vmid" "$name"
done
echo ""
log_info "Rebooting VMs to apply SSH keys..."
for vm_spec in "${vms[@]}"; do
read -r vmid name <<< "$vm_spec"
reboot_vm "$vmid" "$name"
sleep 2
done
log_info ""
log_info "SSH keys added. Wait 2-3 minutes for VMs to reboot, then test:"
# Try to show discovered IPs (if guest agent is working)
if [ -f "$PROJECT_ROOT/scripts/lib/proxmox_vm_helpers.sh" ]; then
source "$PROJECT_ROOT/scripts/lib/proxmox_vm_helpers.sh"
for vm_spec in "${vms[@]}"; do
read -r vmid name <<< "$vm_spec"
local ip
ip="$(get_vm_ip_from_guest_agent "$vmid" || true)"
if [[ -n "$ip" ]]; then
log_info " ssh -i ~/.ssh/id_ed25519_proxmox ubuntu@$ip # VM $vmid ($name)"
fi
done
else
log_info " ssh -i ~/.ssh/id_ed25519_proxmox ubuntu@<VM_IP>"
log_info " (Use Proxmox Summary or router to find VM IPs)"
fi
}
main "$@"

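The fallback branch above hints at what `get_vm_ip_from_guest_agent` does: on the Proxmox host it boils down to parsing `qm guest cmd <vmid> network-get-interfaces`. A minimal sketch of the parsing step (a crude text match, assuming the agent reports at least one non-loopback IPv4):

```bash
#!/bin/bash
# Extract the first non-loopback IPv4 address from guest-agent JSON on stdin.
first_guest_ipv4() {
    grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' | grep -v '^127\.' | head -n1
}

# Assumed usage on the Proxmox host:
#   qm guest cmd 100 network-get-interfaces | first_guest_ipv4
```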
@@ -0,0 +1,133 @@
#!/bin/bash
source ~/.bashrc
# Complete All Deployments: Gitea, Observability, Cloudflare, GitOps, Security
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_section() {
echo ""
echo -e "${BLUE}========================================${NC}"
echo -e "${BLUE}$1${NC}"
echo -e "${BLUE}========================================${NC}"
echo ""
}
main() {
log_section "Complete Deployment - All Services"
local errors=0
# 1. Deploy Gitea
log_section "1. Deploying Gitea on VM 102"
if bash "$SCRIPT_DIR/deploy-gitea.sh"; then
log_info "✓ Gitea deployment completed"
else
log_error "✗ Gitea deployment failed"
errors=$((errors + 1))
fi
sleep 2
# 2. Deploy Observability Stack
log_section "2. Deploying Observability Stack on VM 103"
if bash "$SCRIPT_DIR/deploy-observability.sh"; then
log_info "✓ Observability deployment completed"
else
log_error "✗ Observability deployment failed"
errors=$((errors + 1))
fi
sleep 2
# 3. Configure Cloudflare Tunnel
log_section "3. Configuring Cloudflare Tunnel on VM 100"
log_warn "Note: This requires interactive browser authentication"
if bash "$SCRIPT_DIR/configure-cloudflare-tunnel.sh"; then
log_info "✓ Cloudflare Tunnel configuration completed"
else
log_error "✗ Cloudflare Tunnel configuration failed"
errors=$((errors + 1))
fi
sleep 2
# 4. Configure GitOps Workflows
log_section "4. Configuring GitOps Workflows on VM 101"
if bash "$SCRIPT_DIR/configure-gitops-workflows.sh"; then
log_info "✓ GitOps workflows configuration completed"
else
log_error "✗ GitOps workflows configuration failed"
errors=$((errors + 1))
fi
sleep 2
# 5. Security Hardening - RBAC
log_section "5. Setting up Proxmox RBAC"
if bash "$PROJECT_ROOT/scripts/security/setup-proxmox-rbac.sh"; then
log_info "✓ RBAC setup completed"
else
log_error "✗ RBAC setup failed"
errors=$((errors + 1))
fi
sleep 2
# 6. Security Hardening - Firewall
log_section "6. Configuring Firewall Rules"
if bash "$PROJECT_ROOT/scripts/security/configure-firewall-rules.sh"; then
log_info "✓ Firewall configuration completed"
else
log_error "✗ Firewall configuration failed"
errors=$((errors + 1))
fi
# Summary
log_section "Deployment Summary"
if [ $errors -eq 0 ]; then
log_info "✓ All deployments completed successfully!"
echo ""
log_info "Service URLs:"
log_info " Gitea: http://192.168.1.121:3000"
log_info " Prometheus: http://192.168.1.82:9090"
log_info " Grafana: http://192.168.1.82:3000 (admin/admin)"
echo ""
log_info "Next steps:"
log_info "1. Complete Gitea first-time setup at http://192.168.1.121:3000"
log_info "2. Change Grafana password at http://192.168.1.82:3000"
log_info "3. Configure Cloudflare DNS records (see Cloudflare Tunnel output)"
log_info "4. Configure Zero Trust policies in Cloudflare Dashboard"
log_info "5. Create GitOps repository and push manifests"
else
log_error "✗ Some deployments failed ($errors errors)"
log_info "Review the output above for details"
exit 1
fi
}
main "$@"

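Each of the six numbered blocks above repeats the same run/log/count pattern; a `run_step` helper (the name is illustrative) keeps the error accounting in one place:

```bash
#!/bin/bash
errors=0

# Run one deployment step, log the outcome, and count failures.
run_step() {
    local desc=$1
    shift
    if "$@"; then
        echo "[INFO] ✓ $desc completed"
    else
        echo "[ERROR] ✗ $desc failed"
        errors=$((errors + 1))
    fi
}

run_step "Gitea deployment" true
run_step "Observability deployment" false
echo "errors=$errors"
```

With this in place the main body shrinks to a list of calls such as `run_step "Gitea deployment" bash "$SCRIPT_DIR/deploy-gitea.sh"`.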
@@ -0,0 +1,229 @@
#!/bin/bash
source ~/.bashrc
# Complete All Infrastructure Setup
# Sets up cluster, storage, and network on both Proxmox hosts
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_step() {
echo -e "\n${BLUE}=== $1 ===${NC}"
}
ML110_IP="${PROXMOX_ML110_IP:-192.168.1.206}"
R630_IP="${PROXMOX_R630_IP:-192.168.1.49}"
SSH_KEY="$HOME/.ssh/id_ed25519_proxmox"
SSH_OPTS="-i $SSH_KEY"
execute_remote() {
local host=$1
local command=$2
local description=$3
log_info "$description on $host"
if ssh $SSH_OPTS -o StrictHostKeyChecking=no "root@$host" "$command"; then
log_info "$description completed on $host"
return 0
else
log_error "$description failed on $host"
return 1
fi
}
copy_file_remote() {
local host=$1
local source=$2
local dest=$3
log_info "Copying $source to root@$host:$dest"
scp $SSH_OPTS "$source" "root@$host:$dest"
}
# Step 1: Create cluster on ML110
create_cluster_ml110() {
log_step "Creating Proxmox Cluster on ML110"
# Check if cluster already exists
if ssh $SSH_OPTS "root@$ML110_IP" "pvecm status" &>/dev/null; then
log_warn "Cluster already exists on ML110"
ssh $SSH_OPTS "root@$ML110_IP" "pvecm status"
return 0
fi
# Copy cluster setup script
copy_file_remote "$ML110_IP" "$PROJECT_ROOT/infrastructure/proxmox/cluster-setup.sh" "/tmp/cluster-setup.sh"
# Execute cluster creation
execute_remote "$ML110_IP" \
"chmod +x /tmp/cluster-setup.sh && CLUSTER_NAME=hc-cluster NODE_ROLE=create /tmp/cluster-setup.sh" \
"Cluster creation"
# Verify
execute_remote "$ML110_IP" "pvecm status && pvecm nodes" "Cluster verification"
}
# Step 2: Join R630 to cluster
join_cluster_r630() {
log_step "Joining R630 to Proxmox Cluster"
# Check if already in cluster
if ssh $SSH_OPTS "root@$R630_IP" "pvecm status" &>/dev/null; then
log_warn "R630 already in cluster"
return 0
fi
# Copy cluster setup script
copy_file_remote "$R630_IP" "$PROJECT_ROOT/infrastructure/proxmox/cluster-setup.sh" "/tmp/cluster-setup.sh"
# Execute cluster join
if [ -n "$PVE_ROOT_PASS" ]; then
execute_remote "$R630_IP" \
"chmod +x /tmp/cluster-setup.sh && CLUSTER_NAME=hc-cluster NODE_ROLE=join CLUSTER_NODE_IP=$ML110_IP ROOT_PASSWORD='$PVE_ROOT_PASS' /tmp/cluster-setup.sh" \
"Cluster join"
else
log_error "PVE_ROOT_PASS not set. Cannot join cluster without root password."
return 1
fi
}
# Step 3: Configure NFS storage on ML110
configure_nfs_ml110() {
log_step "Configuring NFS Storage on ML110"
# Check if storage already exists
if ssh $SSH_OPTS "root@$ML110_IP" "pvesm status | grep router-storage" &>/dev/null; then
log_warn "NFS storage already configured on ML110"
return 0
fi
# Copy NFS storage script
copy_file_remote "$ML110_IP" "$PROJECT_ROOT/infrastructure/proxmox/nfs-storage.sh" "/tmp/nfs-storage.sh"
# Execute NFS configuration
execute_remote "$ML110_IP" \
"chmod +x /tmp/nfs-storage.sh && NFS_SERVER=10.10.10.1 NFS_PATH=/mnt/storage STORAGE_NAME=router-storage /tmp/nfs-storage.sh" \
"NFS storage configuration"
# Verify
execute_remote "$ML110_IP" "pvesm status" "NFS storage verification"
}
# Step 4: Configure NFS storage on R630
configure_nfs_r630() {
log_step "Configuring NFS Storage on R630"
# Check if storage already exists
if ssh $SSH_OPTS "root@$R630_IP" "pvesm status | grep router-storage" &>/dev/null; then
log_warn "NFS storage already configured on R630"
return 0
fi
# Copy NFS storage script
copy_file_remote "$R630_IP" "$PROJECT_ROOT/infrastructure/proxmox/nfs-storage.sh" "/tmp/nfs-storage.sh"
# Execute NFS configuration
execute_remote "$R630_IP" \
"chmod +x /tmp/nfs-storage.sh && NFS_SERVER=10.10.10.1 NFS_PATH=/mnt/storage STORAGE_NAME=router-storage /tmp/nfs-storage.sh" \
"NFS storage configuration"
# Verify
execute_remote "$R630_IP" "pvesm status" "NFS storage verification"
}
# Step 5: Configure VLAN bridges on ML110
configure_vlans_ml110() {
log_step "Configuring VLAN Bridges on ML110"
# Copy VLAN script
copy_file_remote "$ML110_IP" "$PROJECT_ROOT/infrastructure/network/configure-proxmox-vlans.sh" "/tmp/configure-proxmox-vlans.sh"
# Apply VLAN configuration via ifupdown2's ifreload, which avoids tearing
# down the SSH session the way a full networking restart can
execute_remote "$ML110_IP" \
"chmod +x /tmp/configure-proxmox-vlans.sh && /tmp/configure-proxmox-vlans.sh && ifreload -a" \
"VLAN configuration"
# Verify
execute_remote "$ML110_IP" "ip addr show | grep -E 'vmbr[0-9]+' | head -10" "VLAN verification"
}
# Step 6: Configure VLAN bridges on R630
configure_vlans_r630() {
log_step "Configuring VLAN Bridges on R630"
# Copy VLAN script
copy_file_remote "$R630_IP" "$PROJECT_ROOT/infrastructure/network/configure-proxmox-vlans.sh" "/tmp/configure-proxmox-vlans.sh"
# Apply VLAN configuration via ifupdown2's ifreload, which avoids tearing
# down the SSH session the way a full networking restart can
execute_remote "$R630_IP" \
"chmod +x /tmp/configure-proxmox-vlans.sh && /tmp/configure-proxmox-vlans.sh && ifreload -a" \
"VLAN configuration"
# Verify
execute_remote "$R630_IP" "ip addr show | grep -E 'vmbr[0-9]+' | head -10" "VLAN verification"
}
main() {
log_info "Completing All Infrastructure Setup"
echo ""
# Check SSH access
if [ ! -f "$SSH_KEY" ]; then
log_error "SSH key not found: $SSH_KEY"
log_info "Run: ./scripts/utils/setup-ssh-keys.sh"
exit 1
fi
if ! ssh $SSH_OPTS -o ConnectTimeout=5 "root@$ML110_IP" "echo 'SSH OK'" &> /dev/null; then
log_error "SSH access to ML110 failed"
exit 1
fi
# Infrastructure setup
create_cluster_ml110
configure_nfs_ml110
configure_vlans_ml110
# R630 setup (if SSH available)
if ssh $SSH_OPTS -o ConnectTimeout=5 "root@$R630_IP" "echo 'SSH OK'" &> /dev/null; then
join_cluster_r630
configure_nfs_r630
configure_vlans_r630
else
log_warn "SSH access to R630 not available, skipping R630 setup"
fi
log_step "Infrastructure Setup Complete!"
log_info "Next: Verify VM boot and network connectivity"
}
main "$@"

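`execute_remote` gives up on the first failure, which bites on transient SSH or network hiccups during cluster setup; a retry wrapper (hypothetical, not defined elsewhere in this repo) can be layered on top:

```bash
#!/bin/bash
# Retry a command up to N times with a fixed delay between attempts.
retry() {
    local attempts=$1 delay=$2
    shift 2
    local i
    for ((i = 1; i <= attempts; i++)); do
        if "$@"; then
            return 0
        fi
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Assumed usage:
#   retry 3 5 ssh $SSH_OPTS "root@$ML110_IP" "pvecm status"
```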
@@ -0,0 +1,285 @@
#!/bin/bash
source ~/.bashrc
# Master Orchestration Script - Complete All Next Steps
# Executes all deployment steps in recommended order
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_step() {
echo ""
echo -e "${BLUE}========================================${NC}"
echo -e "${BLUE}$1${NC}"
echo -e "${BLUE}========================================${NC}"
echo ""
}
SSH_KEY="${SSH_KEY:-$HOME/.ssh/id_ed25519_proxmox}"
# Check prerequisites
check_prerequisites() {
log_step "Checking Prerequisites"
if [ ! -f "$SSH_KEY" ]; then
log_error "SSH key not found: $SSH_KEY"
exit 1
fi
if [ ! -f "$PROJECT_ROOT/scripts/lib/proxmox_vm_helpers.sh" ]; then
log_error "Helper library not found"
exit 1
fi
log_info "Prerequisites check passed"
}
# Step 1: Manual SSH Fix
step1_ssh_fix() {
log_step "Step 1: Fix SSH Access to VMs (MANUAL)"
log_warn "This step requires manual intervention via Proxmox Console"
echo ""
log_info "Running SSH fix instructions script..."
"$PROJECT_ROOT/scripts/fix/fix-vm-ssh-via-console.sh"
echo ""
log_info "After fixing SSH manually, press Enter to continue..."
read -r
# Test SSH access
log_info "Testing SSH access..."
local all_ok=true
for ip in 192.168.1.60 192.168.1.188 192.168.1.121 192.168.1.82; do
if ssh -i "$SSH_KEY" -o ConnectTimeout=5 -o StrictHostKeyChecking=no ubuntu@$ip "echo 'SSH OK'" &>/dev/null; then
log_info " $ip: ✓ SSH working"
else
log_error " $ip: ✗ SSH not working"
all_ok=false
fi
done
if [ "$all_ok" = false ]; then
log_error "SSH access not working for all VMs. Please fix SSH access first."
exit 1
fi
log_info "✓ SSH access verified for all VMs"
}
# Step 2: Install QEMU Guest Agent
step2_install_qga() {
log_step "Step 2: Install QEMU Guest Agent"
if [ ! -f "$PROJECT_ROOT/scripts/infrastructure/install-qemu-guest-agent.sh" ]; then
log_error "QGA installation script not found"
return 1
fi
"$PROJECT_ROOT/scripts/infrastructure/install-qemu-guest-agent.sh"
log_info "✓ QEMU Guest Agent installation complete"
}
# Step 3: Deploy Services
step3_deploy_services() {
log_step "Step 3: Deploy Services"
# 3.1 Deploy Gitea
log_info "3.1 Deploying Gitea (VM 102)..."
if [ -f "$PROJECT_ROOT/scripts/deploy/deploy-gitea.sh" ]; then
"$PROJECT_ROOT/scripts/deploy/deploy-gitea.sh"
else
log_warn "Gitea deployment script not found, skipping"
fi
echo ""
# 3.2 Deploy Observability
log_info "3.2 Deploying Observability Stack (VM 103)..."
if [ -f "$PROJECT_ROOT/scripts/deploy/deploy-observability.sh" ]; then
"$PROJECT_ROOT/scripts/deploy/deploy-observability.sh"
else
log_warn "Observability deployment script not found, skipping"
fi
echo ""
# 3.3 Verify K3s
log_info "3.3 Verifying K3s (VM 101)..."
source "$PROJECT_ROOT/scripts/lib/proxmox_vm_helpers.sh"
local k3s_ip
k3s_ip="$(get_vm_ip_or_warn 101 "k3s-master" || true)"
if [[ -n "$k3s_ip" ]]; then
if ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no ubuntu@$k3s_ip "sudo kubectl get nodes" &>/dev/null; then
log_info "✓ K3s is running"
ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no ubuntu@$k3s_ip "sudo kubectl get nodes"
else
log_warn "K3s may not be fully configured"
fi
fi
log_info "✓ Service deployment complete"
}
# Step 4: Join R630 to Cluster
step4_join_r630() {
log_step "Step 4: Join R630 to Cluster"
log_info "Checking SSH access to R630..."
if ssh -i "$SSH_KEY" -o ConnectTimeout=5 root@192.168.1.49 "echo 'SSH OK'" &>/dev/null; then
log_info "✓ SSH to R630 is working"
log_info "Joining R630 to cluster..."
ssh -i "$SSH_KEY" root@192.168.1.49 <<EOF
cd /home/intlc/projects/loc_az_hci
export CLUSTER_NAME=hc-cluster
export NODE_ROLE=join
export CLUSTER_NODE_IP=192.168.1.206
export ROOT_PASSWORD="${PVE_ROOT_PASS:-}"
./infrastructure/proxmox/cluster-setup.sh
EOF
log_info "Verifying cluster status..."
ssh -i "$SSH_KEY" root@192.168.1.49 "pvecm status"
log_info "✓ R630 joined to cluster"
else
log_warn "SSH to R630 not working. Please:"
log_info " 1. Enable SSH on R630: https://192.168.1.49:8006 → System → Services → ssh"
log_info " 2. Add SSH key: ssh-copy-id -i $SSH_KEY.pub root@192.168.1.49"
log_info " 3. Re-run this script"
fi
}
# Step 5: Configure NFS Storage
step5_configure_nfs() {
log_step "Step 5: Configure NFS Storage"
local nfs_server="${NFS_SERVER:-10.10.10.1}"
log_info "Checking NFS server reachability: $nfs_server"
if ping -c 1 -W 2 "$nfs_server" &>/dev/null; then
log_info "✓ NFS server is reachable"
# Configure on ML110
log_info "Configuring NFS on ML110..."
ssh -i "$SSH_KEY" root@192.168.1.206 <<EOF
cd /home/intlc/projects/loc_az_hci
export NFS_SERVER=$nfs_server
export NFS_PATH=/mnt/storage
export STORAGE_NAME=router-storage
./infrastructure/proxmox/nfs-storage.sh
EOF
# Configure on R630 (if SSH working)
if ssh -i "$SSH_KEY" -o ConnectTimeout=5 root@192.168.1.49 "echo 'SSH OK'" &>/dev/null; then
log_info "Configuring NFS on R630..."
ssh -i "$SSH_KEY" root@192.168.1.49 <<EOF
cd /home/intlc/projects/loc_az_hci
export NFS_SERVER=$nfs_server
export NFS_PATH=/mnt/storage
export STORAGE_NAME=router-storage
./infrastructure/proxmox/nfs-storage.sh
EOF
fi
log_info "Verifying NFS storage..."
ssh -i "$SSH_KEY" root@192.168.1.206 "pvesm status | grep router-storage || echo 'NFS storage not found'"
log_info "✓ NFS storage configured"
else
log_warn "NFS server ($nfs_server) is not reachable. Skipping NFS configuration."
fi
}
# Step 6: Configure VLAN Bridges on R630
step6_configure_vlans() {
log_step "Step 6: Configure VLAN Bridges on R630"
if ssh -i "$SSH_KEY" -o ConnectTimeout=5 root@192.168.1.49 "echo 'SSH OK'" &>/dev/null; then
log_info "Configuring VLAN bridges on R630..."
ssh -i "$SSH_KEY" root@192.168.1.49 <<EOF
cd /home/intlc/projects/loc_az_hci
./infrastructure/network/configure-proxmox-vlans.sh
ifreload -a  # ifupdown2 reload keeps this SSH session alive
EOF
log_info "Verifying VLAN bridges..."
ssh -i "$SSH_KEY" root@192.168.1.49 "ip addr show | grep -E 'vmbr[0-9]+'"
log_info "✓ VLAN bridges configured"
else
log_warn "SSH to R630 not working. Skipping VLAN configuration."
fi
}
# Final status report
final_status() {
log_step "Final Status Report"
log_info "Checking cluster status..."
ssh -i "$SSH_KEY" root@192.168.1.206 "pvecm status" 2>/dev/null || log_warn "Could not get cluster status"
echo ""
log_info "Checking VM status..."
source "$PROJECT_ROOT/scripts/lib/proxmox_vm_helpers.sh"
for vmid in 100 101 102 103; do
local ip
ip="$(get_vm_ip_from_guest_agent "$vmid" 2>/dev/null || true)"
if [[ -n "$ip" ]]; then
log_info " VM $vmid: ✓ Running (IP: $ip)"
else
log_warn " VM $vmid: Could not get IP"
fi
done
echo ""
log_info "Service URLs:"
log_info " Gitea: http://192.168.1.121:3000"
log_info " Prometheus: http://192.168.1.82:9090"
log_info " Grafana: http://192.168.1.82:3000 (admin/admin)"
echo ""
log_info "✓ Deployment complete!"
log_info "Next steps: Configure services (Gitea, Grafana, Cloudflare Tunnel)"
}
main() {
log_step "Complete Deployment - All Next Steps"
check_prerequisites
step1_ssh_fix
step2_install_qga
step3_deploy_services
step4_join_r630
step5_configure_nfs
step6_configure_vlans
final_status
}
main "$@"

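Step 1's bare `read -r` hangs forever when this orchestrator runs unattended (CI, cron). A small gate keeps it scriptable; the `AUTO_CONFIRM` variable is an assumption introduced here, not something the repo defines:

```bash
#!/bin/bash
# Pause for Enter unless AUTO_CONFIRM=1 is set in the environment.
confirm_or_continue() {
    local prompt=${1:-"Press Enter to continue..."}
    if [ "${AUTO_CONFIRM:-0}" = "1" ]; then
        echo "[INFO] AUTO_CONFIRM=1 set, continuing without prompt"
        return 0
    fi
    echo "$prompt"
    read -r
}

# Assumed usage in step1_ssh_fix:
#   confirm_or_continue "After fixing SSH manually, press Enter to continue..."
```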
@@ -0,0 +1,323 @@
#!/bin/bash
source ~/.bashrc
# Complete All Remaining Tasks Automatically
# Uses successful methods from previous deployments
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() { echo -e "${GREEN}[INFO]${NC} $1"; }
log_warn() { echo -e "${YELLOW}[WARN]${NC} $1"; }
log_error() { echo -e "${RED}[ERROR]${NC} $1"; }
log_step() { echo -e "\n${BLUE}=== $1 ===${NC}"; }
SSH_KEY="${SSH_KEY:-$HOME/.ssh/id_ed25519_proxmox}"
SSH_OPTS="-i $SSH_KEY -o StrictHostKeyChecking=no"
VM_USER="${VM_USER:-ubuntu}"
# VM IPs (discovered earlier)
VM_100_IP="192.168.1.57" # cloudflare-tunnel
VM_101_IP="192.168.1.188" # k3s-master
VM_102_IP="192.168.1.121" # git-server
VM_103_IP="192.168.1.82" # observability
PROXMOX_HOST="${PROXMOX_ML110_IP:-192.168.1.206}"
# Step 1: Install K3s on VM 101
install_k3s() {
log_step "Step 1: Installing K3s on VM 101 (k3s-master)"
log_info "Installing K3s on $VM_101_IP..."
ssh $SSH_OPTS "${VM_USER}@${VM_101_IP}" <<'K3S_EOF'
set -e
echo "=== Installing K3s ==="
# Check if already installed
if command -v k3s &>/dev/null; then
echo "K3s already installed"
k3s --version
sudo systemctl is-active k3s && echo "K3s is running" || echo "K3s is not running"
exit 0
fi
# Install K3s
echo "Downloading and installing K3s..."
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=latest sh -
# Verify installation
if command -v k3s &>/dev/null; then
echo "K3s installed successfully"
k3s --version
# Start and enable service
sudo systemctl enable k3s
sudo systemctl start k3s
# Wait for service to be ready
echo "Waiting for K3s to start..."
sleep 15
# Verify service status
if sudo systemctl is-active --quiet k3s; then
echo "✓ K3s service is running"
sudo k3s kubectl get nodes
sudo k3s kubectl get pods --all-namespaces
else
echo "✗ K3s service failed to start"
sudo systemctl status k3s --no-pager | head -20
exit 1
fi
else
echo "✗ K3s installation failed"
exit 1
fi
K3S_EOF
# A non-zero exit from the ssh block above aborts the script via
# `set -euo pipefail`, so reaching this point means the install succeeded
log_info "✓ K3s installed and running on VM 101"
}
# Step 2: Install and Configure Cloudflare Tunnel on VM 100
install_cloudflare_tunnel() {
log_step "Step 2: Installing Cloudflare Tunnel on VM 100 (cloudflare-tunnel)"
local tunnel_token="${CLOUDFLARE_TUNNEL_TOKEN:-}"
if [ -z "$tunnel_token" ]; then
log_warn "CLOUDFLARE_TUNNEL_TOKEN not set. Skipping Cloudflare Tunnel configuration."
log_info "Installing cloudflared only..."
fi
log_info "Installing cloudflared on $VM_100_IP..."
ssh $SSH_OPTS "${VM_USER}@${VM_100_IP}" <<CLOUDFLARE_EOF
set -e
echo "=== Installing Cloudflare Tunnel ==="
# Install cloudflared
if ! command -v cloudflared &>/dev/null; then
echo "Downloading cloudflared..."
curl -L https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64 -o /tmp/cloudflared
sudo mv /tmp/cloudflared /usr/local/bin/cloudflared
sudo chmod +x /usr/local/bin/cloudflared
cloudflared --version
echo "✓ cloudflared installed"
else
echo "cloudflared already installed"
cloudflared --version
fi
# Configure tunnel if token is provided
if [ -n "${tunnel_token}" ]; then
echo "Configuring Cloudflare Tunnel..."
sudo mkdir -p /etc/cloudflared
# Write a reference config. With the token-based `service install` below the
# connector is remotely managed, so the ingress rules here are informational
# only (and there is no local tunnel ID to record)
sudo tee /etc/cloudflared/config.yml > /dev/null <<CONFIG_EOF
credentials-file: /etc/cloudflared/credentials.json
ingress:
- hostname: grafana.${CLOUDFLARE_DOMAIN:-d-bis.org}
service: http://${VM_103_IP}:3000
- hostname: prometheus.${CLOUDFLARE_DOMAIN:-d-bis.org}
service: http://${VM_103_IP}:9090
- hostname: git.${CLOUDFLARE_DOMAIN:-d-bis.org}
service: http://${VM_102_IP}:3000
- hostname: proxmox.${CLOUDFLARE_DOMAIN:-d-bis.org}
service: https://${PROXMOX_HOST}:8006
- service: http_status:404
CONFIG_EOF
# Install as systemd service
sudo cloudflared service install ${tunnel_token}
# Start service
sudo systemctl enable cloudflared
sudo systemctl start cloudflared
sleep 5
if sudo systemctl is-active --quiet cloudflared; then
echo "✓ Cloudflare Tunnel service is running"
sudo systemctl status cloudflared --no-pager | head -10
else
echo "⚠ Cloudflare Tunnel service may need manual configuration"
sudo systemctl status cloudflared --no-pager | head -10
fi
else
echo "⚠ Tunnel token not provided. Install manually with:"
echo " cloudflared tunnel login"
echo " cloudflared tunnel create <tunnel-name>"
echo " cloudflared tunnel route dns <tunnel-name> <hostname>"
fi
CLOUDFLARE_EOF
# A hard ssh failure aborts the script (`set -euo pipefail`); soft issues are
# reported by the warnings the remote block prints above
log_info "✓ Cloudflare Tunnel installed on VM 100"
}
# Step 3: Configure Gitea Initial Setup (via API)
configure_gitea() {
log_step "Step 3: Configuring Gitea Initial Setup"
log_info "Waiting for Gitea to be ready..."
local max_attempts=30
local attempt=0
local gitea_ready=false
while [ $attempt -lt $max_attempts ]; do
if curl -s "http://${VM_102_IP}:3000" | grep -q "Gitea"; then
gitea_ready=true
break
fi
sleep 2
attempt=$((attempt + 1))
done
if [ "$gitea_ready" = false ]; then
log_warn "Gitea not ready after $max_attempts attempts"
log_info "Gitea initial setup must be completed manually:"
log_info " 1. Visit http://${VM_102_IP}:3000"
log_info " 2. Complete the installation wizard"
return 0
fi
log_info "Gitea is ready. Attempting automated setup..."
# Attempt automated setup by POSTing the install fields; Gitea does not
# document a stable /api/v1/setup endpoint, so fall back to the manual
# web-UI setup below if this is rejected
local response=$(curl -s -X POST "http://${VM_102_IP}:3000/api/v1/setup" \
-H "Content-Type: application/json" \
-d '{
"db_type": "sqlite3",
"db_host": "",
"db_user": "",
"db_passwd": "",
"db_name": "gitea",
"ssl_mode": "disable",
"db_path": "data/gitea.db",
"app_name": "Gitea",
"repo_root_path": "/data/git/repositories",
"lfs_root_path": "/data/git/lfs",
"run_user": "git",
"domain": "'${VM_102_IP}'",
"ssh_port": 2222,
"http_port": 3000,
"app_url": "http://'${VM_102_IP}':3000/",
"log_root_path": "/data/gitea/log",
"smtp_host": "",
"smtp_from": "",
"smtp_user": "",
"smtp_passwd": "",
"admin_name": "admin",
"admin_passwd": "admin123",
"admin_confirm_passwd": "admin123",
"admin_email": "admin@'${CLOUDFLARE_DOMAIN:-d-bis.org}'"
}' 2>/dev/null || echo "")
if echo "$response" | grep -q "success\|created"; then
log_info "✓ Gitea configured successfully"
log_info " Admin user: admin"
log_info " Admin password: admin123 (change on first login!)"
else
log_warn "Automated Gitea setup may have failed"
log_info "Complete setup manually at http://${VM_102_IP}:3000"
log_info "Or check if setup was already completed"
fi
}
# Step 4: Final Status and Summary
final_summary() {
log_step "Final Summary"
echo ""
log_info "VM Status:"
ssh $SSH_OPTS "root@$PROXMOX_HOST" "qm list | grep -E '(100|101|102|103)'"
echo ""
log_info "Service Status:"
# Check K3s
if ssh $SSH_OPTS "${VM_USER}@${VM_101_IP}" "sudo systemctl is-active k3s &>/dev/null && echo 'active' || echo 'inactive'" | grep -q "active"; then
log_info " ✓ K3s (VM 101): Running"
ssh $SSH_OPTS "${VM_USER}@${VM_101_IP}" "sudo k3s kubectl get nodes 2>/dev/null | head -3" || true
else
log_warn " ✗ K3s (VM 101): Not running"
fi
# Check Cloudflare Tunnel
if ssh $SSH_OPTS "${VM_USER}@${VM_100_IP}" "sudo systemctl is-active cloudflared &>/dev/null && echo 'active' || echo 'inactive'" 2>/dev/null | grep -q "active"; then
log_info " ✓ Cloudflare Tunnel (VM 100): Running"
else
log_warn " ⚠ Cloudflare Tunnel (VM 100): May need manual configuration"
fi
# Check Gitea
if curl -s "http://${VM_102_IP}:3000" | grep -q "Gitea"; then
log_info " ✓ Gitea (VM 102): Running at http://${VM_102_IP}:3000"
else
log_warn " ✗ Gitea (VM 102): Not accessible"
fi
# Check Observability
if curl -sf "http://${VM_103_IP}:9090/-/healthy" &>/dev/null; then  # -f: HTTP errors count as down
log_info " ✓ Prometheus (VM 103): Running at http://${VM_103_IP}:9090"
else
log_warn " ✗ Prometheus (VM 103): Not accessible"
fi
if curl -sf "http://${VM_103_IP}:3000/api/health" &>/dev/null; then  # -f: HTTP errors count as down
log_info " ✓ Grafana (VM 103): Running at http://${VM_103_IP}:3000"
else
log_warn " ✗ Grafana (VM 103): Not accessible"
fi
echo ""
log_info "Service URLs:"
log_info " K3s Dashboard: Use 'kubectl' commands on VM 101"
log_info " Gitea: http://${VM_102_IP}:3000"
log_info " Prometheus: http://${VM_103_IP}:9090"
log_info " Grafana: http://${VM_103_IP}:3000 (admin/admin)"
echo ""
log_warn "Tasks Requiring Manual Steps or External Dependencies:"
log_info " 1. Join R630 to cluster: SSH to R630 (192.168.1.49) not accessible"
log_info " 2. Configure NFS storage: NFS server (10.10.10.1) not reachable"
log_info " 3. Configure VLAN bridges on R630: Requires SSH to R630"
log_info " 4. Complete Gitea setup: May need manual web UI access if API setup failed"
echo ""
log_info "✓ All automated tasks completed!"
}
main() {
log_step "Completing All Remaining Tasks"
install_k3s
install_cloudflare_tunnel
configure_gitea
final_summary
}
main "$@"


@@ -0,0 +1,202 @@
#!/bin/bash
source ~/.bashrc
# Complete All Steps with Workarounds
# Attempts all possible steps, documents what requires manual intervention
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_step() {
echo -e "\n${BLUE}=== $1 ===${NC}"
}
PVE_USERNAME="${PVE_USERNAME:-root@pam}"
PVE_PASSWORD="${PVE_ROOT_PASS:-}"
PROXMOX_URL="${PROXMOX_ML110_URL:-https://192.168.1.206:8006}"
PROXMOX_NODE="${PROXMOX_NODE:-pve}"
get_api_token() {
local response=$(curl -s -k --connect-timeout 10 --max-time 15 \
-d "username=$PVE_USERNAME&password=$PVE_PASSWORD" \
"$PROXMOX_URL/api2/json/access/ticket" 2>&1)
if echo "$response" | grep -q '"data"'; then
local ticket=$(echo "$response" | grep -o '"ticket":"[^"]*' | cut -d'"' -f4)
local csrf_token=$(echo "$response" | grep -o '"CSRFPreventionToken":"[^"]*' | cut -d'"' -f4)
echo "$ticket|$csrf_token"
else
echo ""
fi
}
# Step 1: Check and attach ISO to template
setup_template_iso() {
log_step "Step 1: Setting Up Template with ISO"
local tokens=$(get_api_token)
local ticket=$(echo "$tokens" | cut -d'|' -f1)
local csrf_token=$(echo "$tokens" | cut -d'|' -f2)
# Check for Ubuntu ISO
local isos=$(curl -s -k -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/storage/local/content" | \
python3 -c "import sys, json; r=json.load(sys.stdin); isos=[i.get('volid', '') for i in r.get('data', []) if i.get('content')=='iso' and 'ubuntu' in i.get('volid', '').lower()]; print('\n'.join(isos[:1]))" 2>/dev/null)
if [ -n "$isos" ]; then
local iso_file=$(echo "$isos" | head -1)
log_info "Found Ubuntu ISO: $iso_file"
log_info "Attaching to template 9000..."
# Attach ISO and set boot order
local result=$(curl -s -k -X PUT -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
-d "ide2=$iso_file,media=cdrom" \
-d "boot=order=ide2;scsi0" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/9000/config" 2>&1)
if echo "$result" | grep -q '"data"'; then
log_info "✓ ISO attached successfully"
log_info "Template 9000 is ready for OS installation"
log_warn "Next: Start VM 9000 and install Ubuntu via console"
return 0
else
log_warn "Could not attach ISO via API: $result"
log_info "Manual step: Attach ISO via Proxmox Web UI"
return 1
fi
else
log_warn "No Ubuntu ISO found in storage"
log_info "Need to upload Ubuntu 24.04 ISO first"
log_info "See: scripts/troubleshooting/upload-ubuntu-iso.sh"
return 1
fi
}
# Step 2: Attempt infrastructure setup
attempt_infrastructure() {
log_step "Step 2: Infrastructure Setup"
local tokens=$(get_api_token)
local ticket=$(echo "$tokens" | cut -d'|' -f1)
local csrf_token=$(echo "$tokens" | cut -d'|' -f2)
# Check cluster status
local cluster_status=$(curl -s -k -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/cluster/status" 2>&1)
if echo "$cluster_status" | grep -q '"data"'; then
local node_count=$(echo "$cluster_status" | python3 -c "import sys, json; print(len(json.load(sys.stdin).get('data', [])))" 2>/dev/null)
if [ "${node_count:-0}" -gt 1 ]; then
log_info "✓ Cluster configured with $node_count nodes"
else
log_warn "Cluster exists but only has 1 node"
log_info "Need to join R630 to cluster (requires SSH)"
fi
else
log_warn "No cluster configured"
log_info "Cluster setup requires SSH access"
fi
# Check storage
local storage_status=$(curl -s -k -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/storage" 2>&1)
local nfs_count=$(echo "$storage_status" | python3 -c "import sys, json; r=json.load(sys.stdin); nfs=[s for s in r.get('data', []) if s.get('type')=='nfs']; print(len(nfs))" 2>/dev/null)
if [ "${nfs_count:-0}" -gt 0 ]; then
log_info "✓ NFS storage configured"
else
log_warn "No NFS storage configured"
log_info "NFS setup requires SSH access or NFS server available"
fi
}
# Step 3: Monitor and retry VM connectivity
monitor_vms() {
log_step "Step 3: Monitoring VM Status"
local vms=(
"100 192.168.1.60 cloudflare-tunnel"
"101 192.168.1.188 k3s-master"
"102 192.168.1.121 git-server"
"103 192.168.1.82 observability"
)
log_info "Checking VM connectivity (will retry multiple times)..."
for attempt in {1..3}; do
log_info "Attempt $attempt/3:"
local any_reachable=false
for vm_spec in "${vms[@]}"; do
read -r vmid ip name <<< "$vm_spec"
if ping -c 1 -W 2 "$ip" &>/dev/null; then
log_info "$name ($ip) is reachable!"
any_reachable=true
fi
done
if [ "$any_reachable" = true ]; then
log_info "Some VMs are now reachable!"
break
fi
if [ $attempt -lt 3 ]; then
log_warn "VMs not reachable yet, waiting 30 seconds..."
sleep 30
fi
done
}
main() {
log_info "Completing All Steps with Workarounds"
echo ""
# Setup template ISO
setup_template_iso
# Infrastructure
attempt_infrastructure
# Monitor VMs
monitor_vms
log_step "Summary"
log_info "All automated steps attempted"
log_warn "Template OS installation requires manual step via Web UI"
log_info "See TROUBLESHOOTING_AND_FIXES.md for template fix instructions"
}
main "$@"


@@ -0,0 +1,141 @@
#!/bin/bash
# Complete Cloudflare Tunnel Setup for VM 100
# Run this AFTER SSH access to VM 100 is working
# Usage: From root@pve: ssh ubuntu@192.168.1.244, then run this script
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
else
echo "Error: .env file not found. Please set:"
echo " CLOUDFLARE_TUNNEL_TOKEN"
echo " CLOUDFLARE_ACCOUNT_ID"
echo " CLOUDFLARE_DOMAIN"
exit 1
fi
# Colors
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'
echo "========================================="
echo "Cloudflare Tunnel Configuration"
echo "========================================="
echo ""
# Create directories and user
echo -e "${GREEN}[1/6]${NC} Creating directories and user..."
sudo mkdir -p /etc/cloudflared
sudo useradd -r -s /bin/false cloudflared 2>/dev/null || true
sudo chown cloudflared:cloudflared /etc/cloudflared
echo "✓ Done"
echo ""
# Create config file
echo -e "${GREEN}[2/6]${NC} Creating config file..."
sudo tee /etc/cloudflared/config.yml > /dev/null << CONFIGEOF
tunnel: $CLOUDFLARE_TUNNEL_TOKEN
credentials-file: /etc/cloudflared/credentials.json
ingress:
- hostname: grafana.$CLOUDFLARE_DOMAIN
service: http://192.168.1.82:3000
- hostname: prometheus.$CLOUDFLARE_DOMAIN
service: http://192.168.1.82:9090
- hostname: git.$CLOUDFLARE_DOMAIN
service: http://192.168.1.121:3000
- hostname: proxmox-ml110.$CLOUDFLARE_DOMAIN
service: https://192.168.1.206:8006
originRequest:
noTLSVerify: true
- hostname: proxmox-r630.$CLOUDFLARE_DOMAIN
service: https://192.168.1.49:8006
originRequest:
noTLSVerify: true
- service: http_status:404
CONFIGEOF
sudo chown cloudflared:cloudflared /etc/cloudflared/config.yml
sudo chmod 600 /etc/cloudflared/config.yml
echo "✓ Done"
echo ""
# Create credentials file
echo -e "${GREEN}[3/6]${NC} Creating credentials file..."
sudo tee /etc/cloudflared/credentials.json > /dev/null << CREDEOF
{
"AccountTag": "$CLOUDFLARE_ACCOUNT_ID",
"TunnelSecret": "$CLOUDFLARE_TUNNEL_TOKEN"
}
CREDEOF
sudo chown cloudflared:cloudflared /etc/cloudflared/credentials.json
sudo chmod 600 /etc/cloudflared/credentials.json
echo "✓ Done"
echo ""
# Create systemd service
echo -e "${GREEN}[4/6]${NC} Creating systemd service..."
sudo tee /etc/systemd/system/cloudflared.service > /dev/null << SERVICEEOF
[Unit]
Description=Cloudflare Tunnel
After=network.target
[Service]
Type=simple
User=cloudflared
ExecStart=/usr/local/bin/cloudflared tunnel --config /etc/cloudflared/config.yml run
Restart=on-failure
RestartSec=10s
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=multi-user.target
SERVICEEOF
echo "✓ Done"
echo ""
# Enable and start service
echo -e "${GREEN}[5/6]${NC} Enabling and starting service..."
sudo systemctl daemon-reload
sudo systemctl enable cloudflared
sudo systemctl start cloudflared
sleep 5
echo "✓ Done"
echo ""
# Verify
echo -e "${GREEN}[6/6]${NC} Verifying configuration..."
echo ""
echo "=== Service Status ==="
sudo systemctl status cloudflared --no-pager | head -15
echo ""
echo "=== Configuration Files ==="
ls -la /etc/cloudflared/
echo ""
echo "=== Recent Logs ==="
sudo journalctl -u cloudflared -n 10 --no-pager
echo ""
echo "========================================="
echo -e "${GREEN}Configuration Complete!${NC}"
echo "========================================="
echo ""
echo "Next steps:"
echo "1. Verify service: systemctl status cloudflared"
echo "2. View logs: journalctl -u cloudflared -f"
echo "3. Configure DNS records in Cloudflare Dashboard"
echo ""


@@ -0,0 +1,184 @@
#!/bin/bash
source ~/.bashrc
# Complete Deployment Automation Script
# Orchestrates all deployment tasks
set -e
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
CYAN='\033[0;36m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_step() {
echo -e "${BLUE}[STEP]${NC} $1"
}
log_header() {
echo -e "${CYAN}========================================${NC}"
echo -e "${CYAN}$1${NC}"
echo -e "${CYAN}========================================${NC}"
}
# Check if command exists
command_exists() {
command -v "$1" >/dev/null 2>&1
}
# Check VM connectivity
check_vm_connectivity() {
local ip=$1
local name=$2
log_info "Checking connectivity to $name ($ip)..."
if ping -c 1 -W 2 "$ip" >/dev/null 2>&1; then
log_info "$name is reachable"
return 0
else
log_warn "$name is not reachable (may still be installing OS)"
return 1
fi
}
# Main deployment flow
main() {
log_header "Complete Deployment Automation"
echo ""
log_step "Phase 1: Prerequisites Check"
echo ""
# Check Proxmox connections
log_info "Verifying Proxmox connections..."
if ./scripts/utils/test-proxmox-connection.sh > /dev/null 2>&1; then
log_info "✓ Proxmox connections verified"
else
log_error "Proxmox connection failed"
exit 1
fi
echo ""
log_step "Phase 2: VM Creation Status"
echo ""
log_warn "VM creation requires manual steps via Proxmox Web UI"
log_info "Run: ./scripts/create-all-vms.sh to see available resources"
log_info "Then create VMs at: https://192.168.1.206:8006"
echo ""
# VM IPs
declare -A VM_IPS=(
["cloudflare-tunnel"]="192.168.1.60"
["k3s-master"]="192.168.1.188"
["git-server"]="192.168.1.121"
["observability"]="192.168.1.82"
)
log_info "Checking VM connectivity..."
for vm_name in "${!VM_IPS[@]}"; do
check_vm_connectivity "${VM_IPS[$vm_name]}" "$vm_name"
done
echo ""
log_step "Phase 3: Post-VM-Creation Automation"
echo ""
log_info "Once VMs are created and OS is installed, run:"
echo ""
echo " For Cloudflare Tunnel VM:"
echo " ssh user@192.168.1.60"
echo " sudo bash <(curl -s https://raw.githubusercontent.com/your-repo/scripts/setup-cloudflare-tunnel.sh)"
echo " # Or copy scripts/setup-cloudflare-tunnel.sh to VM"
echo ""
echo " For K3s VM:"
echo " ssh user@192.168.1.188"
echo " sudo bash <(curl -s https://raw.githubusercontent.com/your-repo/scripts/setup-k3s.sh)"
echo " # Or copy scripts/setup-k3s.sh to VM"
echo ""
log_step "Phase 4: Generate Setup Packages"
echo ""
# Create setup package for each VM
mkdir -p /tmp/vm-setup-packages
log_info "Creating setup packages..."
# Cloudflare Tunnel setup package
cat > /tmp/vm-setup-packages/cloudflare-tunnel-setup.sh <<'EOFTUNNEL'
#!/bin/bash
# Cloudflare Tunnel VM Setup
# Run this on the Cloudflare Tunnel VM after OS installation
set -e
cd /tmp
curl -L https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64 -o /usr/local/bin/cloudflared
chmod +x /usr/local/bin/cloudflared
useradd -r -s /bin/false cloudflared || true
mkdir -p /etc/cloudflared
chown cloudflared:cloudflared /etc/cloudflared
echo "cloudflared installed. Next steps:"
echo "1. Run: cloudflared tunnel login"
echo "2. Run: cloudflared tunnel create azure-stack-hci"
echo "3. Configure /etc/cloudflared/config.yml"
echo "4. Set up systemd service"
EOFTUNNEL
# K3s setup package
cat > /tmp/vm-setup-packages/k3s-setup.sh <<'EOFK3S'
#!/bin/bash
# K3s Setup
# Run this on the K3s VM after OS installation
set -e
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="--write-kubeconfig-mode 644" sh -
systemctl status k3s --no-pager
echo "K3s installed. Next steps:"
echo "1. Configure kubectl: export KUBECONFIG=/etc/rancher/k3s/k3s.yaml"
echo "2. Verify: kubectl get nodes"
EOFK3S
chmod +x /tmp/vm-setup-packages/*.sh
log_info "✓ Setup packages created in /tmp/vm-setup-packages/"
echo ""
log_step "Phase 5: Documentation"
echo ""
log_info "All documentation is ready:"
echo " - CREATE_VMS.md - VM creation guide"
echo " - QUICK_START.md - Quick reference"
echo " - DEPLOYMENT_WITHOUT_AZURE.md - Full plan"
echo " - DEPLOYMENT_CHECKLIST.md - Progress tracker"
echo ""
log_header "Deployment Automation Complete"
echo ""
log_info "Next Steps:"
echo " 1. Create VMs via Proxmox Web UI (see CREATE_VMS.md)"
echo " 2. Install OS on each VM"
echo " 3. Copy setup scripts to VMs and run them"
echo " 4. Follow DEPLOYMENT_CHECKLIST.md to track progress"
echo ""
}
main "$@"


@@ -0,0 +1,162 @@
#!/bin/bash
source ~/.bashrc
# Configure All Services on VMs
# Run this script after VMs have booted and are accessible via SSH
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# VM IP addresses
CLOUDFLARE_IP="192.168.1.60"
K3S_IP="192.168.1.188"
GIT_IP="192.168.1.121"
OBSERVABILITY_IP="192.168.1.82"
# SSH user (default for Ubuntu cloud images)
SSH_USER="${SSH_USER:-ubuntu}"
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_step() {
echo -e "\n${BLUE}=== $1 ===${NC}"
}
execute_remote() {
local host=$1
local command=$2
local description=$3
log_info "$description on $host"
if ssh -o StrictHostKeyChecking=no -o ConnectTimeout=10 "$SSH_USER@$host" "$command"; then
log_info "$description completed on $host"
return 0
else
log_error "$description failed on $host"
return 1
fi
}
copy_file_remote() {
local host=$1
local source=$2
local dest=$3
log_info "Copying $source to $SSH_USER@$host:$dest"
scp -o StrictHostKeyChecking=no "$source" "$SSH_USER@$host:$dest"
}
# Configure Cloudflare Tunnel
configure_cloudflare() {
log_step "Configuring Cloudflare Tunnel on VM 100"
execute_remote "$CLOUDFLARE_IP" \
"curl -L https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64 -o /usr/local/bin/cloudflared && chmod +x /usr/local/bin/cloudflared" \
"Install cloudflared"
log_warn "Cloudflare Tunnel authentication requires manual steps:"
log_warn " 1. SSH to $CLOUDFLARE_IP"
log_warn " 2. Run: cloudflared tunnel login"
log_warn " 3. Create tunnel: cloudflared tunnel create azure-stack-hci"
log_warn " 4. Configure routes and systemd service"
}
# Configure K3s
configure_k3s() {
log_step "Configuring K3s on VM 101"
execute_remote "$K3S_IP" \
"curl -sfL https://get.k3s.io | sh -" \
"Install K3s"
execute_remote "$K3S_IP" \
"kubectl get nodes" \
"Verify K3s installation"
log_info "K3s kubeconfig location: /etc/rancher/k3s/k3s.yaml"
}
# Configure Git Server
configure_git() {
log_step "Configuring Git Server on VM 102"
# Check if setup script exists
if [ -f "$PROJECT_ROOT/infrastructure/gitops/gitea-deploy.sh" ]; then
copy_file_remote "$GIT_IP" \
"$PROJECT_ROOT/infrastructure/gitops/gitea-deploy.sh" \
"/tmp/gitea-deploy.sh"
execute_remote "$GIT_IP" \
"chmod +x /tmp/gitea-deploy.sh && sudo /tmp/gitea-deploy.sh" \
"Deploy Gitea"
else
log_warn "Gitea deployment script not found, manual installation required"
fi
}
# Configure Observability
configure_observability() {
log_step "Configuring Observability Stack on VM 103"
# Install Prometheus
execute_remote "$OBSERVABILITY_IP" \
"sudo apt-get update && sudo apt-get install -y prometheus" \
"Install Prometheus"
# Install Grafana
execute_remote "$OBSERVABILITY_IP" \
"sudo apt-get install -y apt-transport-https software-properties-common wget && wget -q -O - https://packages.grafana.com/gpg.key | sudo apt-key add - && echo 'deb https://packages.grafana.com/oss/deb stable main' | sudo tee -a /etc/apt/sources.list.d/grafana.list && sudo apt-get update && sudo apt-get install -y grafana && sudo systemctl enable grafana-server && sudo systemctl start grafana-server" \
"Install Grafana"
log_info "Grafana should be accessible at http://$OBSERVABILITY_IP:3000"
log_info "Default credentials: admin/admin"
}
main() {
log_info "Configuring all services on VMs"
log_warn "This script requires SSH access to all VMs"
log_warn "Ensure VMs have booted and are accessible"
# Test connectivity
log_info "Testing VM connectivity..."
for ip in "$CLOUDFLARE_IP" "$K3S_IP" "$GIT_IP" "$OBSERVABILITY_IP"; do
if ! ping -c 1 -W 2 "$ip" &> /dev/null; then
log_error "Cannot reach $ip - VM may not be ready"
log_warn "Wait for VMs to fully boot and try again"
exit 1
fi
done
log_info "All VMs are reachable"
# Configure services
configure_cloudflare
configure_k3s
configure_git
configure_observability
log_info "Service configuration completed!"
log_warn "Some services may require additional manual configuration"
}
main "$@"


@@ -0,0 +1,244 @@
#!/bin/bash
source ~/.bashrc
# Configure Cloudflare Tunnel Authentication and Setup on VM 100
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
VM_USER="${VM_USER:-ubuntu}"
SSH_KEY="${SSH_KEY:-$HOME/.ssh/id_ed25519_proxmox}"
VMID=100
VM_NAME="cloudflare-tunnel"
TUNNEL_NAME="${CLOUDFLARE_TUNNEL_NAME:-azure-stack-hci}"
# Import helper library
if [ -f "$PROJECT_ROOT/scripts/lib/proxmox_vm_helpers.sh" ]; then
source "$PROJECT_ROOT/scripts/lib/proxmox_vm_helpers.sh"
else
log_error "Helper library not found"
exit 1
fi
main() {
log_info "Configuring Cloudflare Tunnel on VM $VMID ($VM_NAME)"
echo ""
# Get IP using guest agent
local ip
ip="$(get_vm_ip_or_warn "$VMID" "$VM_NAME" || true)"
if [[ -z "$ip" ]]; then
log_error "Cannot get IP for VM $VMID. Ensure SSH is working and QEMU Guest Agent is installed."
exit 1
fi
log_info "Using IP: $ip"
echo ""
# Check if cloudflared is installed
log_info "Checking cloudflared installation..."
if ! ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "${VM_USER}@${ip}" "command -v cloudflared" &>/dev/null; then
log_warn "cloudflared not found. Installing..."
ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "${VM_USER}@${ip}" <<'EOF'
set -e
curl -L https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64 -o /tmp/cloudflared
sudo mv /tmp/cloudflared /usr/local/bin/cloudflared
sudo chmod +x /usr/local/bin/cloudflared
cloudflared --version
EOF
log_info "cloudflared installed"
else
log_info "cloudflared is installed"
fi
# Create cloudflared user and directories
log_info "Setting up cloudflared user and directories..."
ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "${VM_USER}@${ip}" <<'EOF'
set -e
sudo useradd -r -s /bin/false cloudflared 2>/dev/null || true
sudo mkdir -p /etc/cloudflared
sudo chown cloudflared:cloudflared /etc/cloudflared
EOF
# Authenticate cloudflared (interactive)
log_info "Authenticating with Cloudflare..."
log_warn "This requires interactive browser authentication."
log_info "A browser window will open for authentication."
echo ""
ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no -t "${VM_USER}@${ip}" <<EOF
set -e
cd /tmp
cloudflared tunnel login
EOF
# Create tunnel
log_info "Creating tunnel: $TUNNEL_NAME..."
local tunnel_id
tunnel_id=$(ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "${VM_USER}@${ip}" "cloudflared tunnel create $TUNNEL_NAME 2>&1 | grep -oP '(?<=Created tunnel )[a-f0-9-]+' || cloudflared tunnel list | grep '$TUNNEL_NAME' | awk '{print \$1}'" || true)
if [[ -z "$tunnel_id" ]]; then
log_error "Failed to create or find tunnel. Please check Cloudflare dashboard."
exit 1
fi
log_info "Tunnel ID: $tunnel_id"
# Get service IPs
local git_ip prometheus_ip grafana_ip proxmox_ml110_ip proxmox_r630_ip
git_ip="192.168.1.121" # VM 102
prometheus_ip="192.168.1.82" # VM 103
grafana_ip="192.168.1.82" # VM 103
proxmox_ml110_ip="192.168.1.206"
proxmox_r630_ip="192.168.1.49"
# Create tunnel configuration
log_info "Creating tunnel configuration..."
ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "${VM_USER}@${ip}" "sudo tee /etc/cloudflared/config.yml" <<EOF
tunnel: $tunnel_id
credentials-file: /etc/cloudflared/$tunnel_id.json
ingress:
# Grafana Dashboard
- hostname: grafana.yourdomain.com
service: http://$grafana_ip:3000
originRequest:
noHappyEyeballs: true
tcpKeepAlive: 30
# Prometheus
- hostname: prometheus.yourdomain.com
service: http://$prometheus_ip:9090
originRequest:
noHappyEyeballs: true
tcpKeepAlive: 30
# Git Server (Gitea)
- hostname: git.yourdomain.com
service: http://$git_ip:3000
originRequest:
noHappyEyeballs: true
tcpKeepAlive: 30
# Proxmox ML110
- hostname: proxmox-ml110.yourdomain.com
service: https://$proxmox_ml110_ip:8006
originRequest:
noHappyEyeballs: true
tcpKeepAlive: 30
connectTimeout: 10s
tlsTimeout: 10s
httpHostHeader: proxmox-ml110.yourdomain.com
# Proxmox R630
- hostname: proxmox-r630.yourdomain.com
service: https://$proxmox_r630_ip:8006
originRequest:
noHappyEyeballs: true
tcpKeepAlive: 30
connectTimeout: 10s
tlsTimeout: 10s
httpHostHeader: proxmox-r630.yourdomain.com
# Catch-all (must be last)
- service: http_status:404
EOF
# Move credentials file to proper location
log_info "Setting up credentials file..."
ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "${VM_USER}@${ip}" <<EOF
set -e
if [ -f ~/.cloudflared/$tunnel_id.json ]; then
sudo mv ~/.cloudflared/$tunnel_id.json /etc/cloudflared/$tunnel_id.json
sudo chown cloudflared:cloudflared /etc/cloudflared/$tunnel_id.json
sudo chmod 600 /etc/cloudflared/$tunnel_id.json
fi
EOF
# Create systemd service
log_info "Creating systemd service..."
ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "${VM_USER}@${ip}" "sudo tee /etc/systemd/system/cloudflared.service" <<'EOF'
[Unit]
Description=Cloudflare Tunnel
After=network.target
[Service]
Type=simple
User=cloudflared
ExecStart=/usr/local/bin/cloudflared tunnel --config /etc/cloudflared/config.yml run
Restart=on-failure
RestartSec=5s
[Install]
WantedBy=multi-user.target
EOF
# Enable and start service
log_info "Enabling and starting cloudflared service..."
ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "${VM_USER}@${ip}" <<'EOF'
set -e
sudo systemctl daemon-reload
sudo systemctl enable cloudflared
sudo systemctl start cloudflared
sleep 3
sudo systemctl status cloudflared --no-pager || true
EOF
# Verify service
log_info "Verifying tunnel status..."
sleep 5
if ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "${VM_USER}@${ip}" "sudo systemctl is-active --quiet cloudflared"; then
log_info "✓ Cloudflare Tunnel is running!"
echo ""
log_info "Tunnel Configuration:"
log_info " Tunnel Name: $TUNNEL_NAME"
log_info " Tunnel ID: $tunnel_id"
log_info " Config: /etc/cloudflared/config.yml"
echo ""
log_warn "Next steps:"
log_info "1. Configure DNS records in Cloudflare Dashboard:"
log_info " - grafana.yourdomain.com → CNAME to $tunnel_id.cfargotunnel.com"
log_info " - prometheus.yourdomain.com → CNAME to $tunnel_id.cfargotunnel.com"
log_info " - git.yourdomain.com → CNAME to $tunnel_id.cfargotunnel.com"
log_info " - proxmox-ml110.yourdomain.com → CNAME to $tunnel_id.cfargotunnel.com"
log_info " - proxmox-r630.yourdomain.com → CNAME to $tunnel_id.cfargotunnel.com"
echo ""
log_info "2. Configure Zero Trust policies in Cloudflare Dashboard"
log_info "3. View logs: ssh ${VM_USER}@${ip} 'sudo journalctl -u cloudflared -f'"
else
log_error "Tunnel service failed to start. Check logs:"
log_info " ssh ${VM_USER}@${ip} 'sudo journalctl -u cloudflared'"
exit 1
fi
}
main "$@"


@@ -0,0 +1,135 @@
#!/bin/bash
# Configure Cloudflare Tunnel on VM 100
# Run this script AFTER SSH'ing to VM 100 (192.168.1.244)
# Usage: From root@pve: ssh ubuntu@192.168.1.244, then run this script
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
else
echo "Error: .env file not found. Please set these variables:"
echo " CLOUDFLARE_TUNNEL_TOKEN"
echo " CLOUDFLARE_ACCOUNT_ID"
echo " CLOUDFLARE_DOMAIN"
exit 1
fi
echo "========================================="
echo "Cloudflare Tunnel Configuration"
echo "========================================="
echo ""
# Create directories and user
echo "Creating directories and user..."
sudo mkdir -p /etc/cloudflared
sudo useradd -r -s /bin/false cloudflared 2>/dev/null || true
sudo chown cloudflared:cloudflared /etc/cloudflared
echo "✓ Directories and user created"
echo ""
# Create config file
echo "Creating config file..."
sudo tee /etc/cloudflared/config.yml > /dev/null << CONFIGEOF
tunnel: $CLOUDFLARE_TUNNEL_TOKEN
credentials-file: /etc/cloudflared/credentials.json
ingress:
- hostname: grafana.$CLOUDFLARE_DOMAIN
service: http://192.168.1.82:3000
- hostname: prometheus.$CLOUDFLARE_DOMAIN
service: http://192.168.1.82:9090
- hostname: git.$CLOUDFLARE_DOMAIN
service: http://192.168.1.121:3000
- hostname: proxmox-ml110.$CLOUDFLARE_DOMAIN
service: https://192.168.1.206:8006
originRequest:
noTLSVerify: true
- hostname: proxmox-r630.$CLOUDFLARE_DOMAIN
service: https://192.168.1.49:8006
originRequest:
noTLSVerify: true
- service: http_status:404
CONFIGEOF
sudo chown cloudflared:cloudflared /etc/cloudflared/config.yml
sudo chmod 600 /etc/cloudflared/config.yml
echo "✓ Config file created"
echo ""
# Create credentials file
echo "Creating credentials file..."
sudo tee /etc/cloudflared/credentials.json > /dev/null << CREDEOF
{
"AccountTag": "$CLOUDFLARE_ACCOUNT_ID",
"TunnelSecret": "$CLOUDFLARE_TUNNEL_TOKEN"
}
CREDEOF
sudo chown cloudflared:cloudflared /etc/cloudflared/credentials.json
sudo chmod 600 /etc/cloudflared/credentials.json
echo "✓ Credentials file created"
echo ""
# Create systemd service
echo "Creating systemd service..."
sudo tee /etc/systemd/system/cloudflared.service > /dev/null << SERVICEEOF
[Unit]
Description=Cloudflare Tunnel
After=network.target
[Service]
Type=simple
User=cloudflared
ExecStart=/usr/local/bin/cloudflared tunnel --config /etc/cloudflared/config.yml run
Restart=on-failure
RestartSec=10s
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=multi-user.target
SERVICEEOF
echo "✓ Service file created"
echo ""
# Enable and start service
echo "Enabling and starting service..."
sudo systemctl daemon-reload
sudo systemctl enable cloudflared
sudo systemctl start cloudflared
sleep 3
echo ""
echo "========================================="
echo "Configuration Complete"
echo "========================================="
echo ""
# Check status
echo "Service Status:"
sudo systemctl status cloudflared --no-pager | head -15
echo ""
echo "Files created:"
ls -la /etc/cloudflared/
echo ""
echo "Recent logs:"
sudo journalctl -u cloudflared -n 10 --no-pager
echo ""
echo "========================================="
echo "Next Steps:"
echo "1. Verify service is running: systemctl status cloudflared"
echo "2. View logs: journalctl -u cloudflared -f"
echo "3. Configure DNS records in Cloudflare Dashboard"
echo "========================================="


@@ -0,0 +1,233 @@
#!/bin/bash
# Configure Cloudflare Tunnel on VM 100
# Run this script from Proxmox host (root@pve)
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
else
echo "Error: .env file not found at $PROJECT_ROOT/.env"
exit 1
fi
VMID=100
VM_USER="ubuntu"
VM_IP="192.168.1.60"
echo "========================================="
echo "Cloudflare Tunnel Configuration for VM 100"
echo "========================================="
echo ""
# Check if we can SSH to VM
echo "Checking SSH access to VM 100..."
if ssh -o StrictHostKeyChecking=no -o ConnectTimeout=5 "$VM_USER@$VM_IP" "echo 'SSH OK'" 2>/dev/null; then
echo "✓ SSH access available"
USE_SSH=true
else
echo "✗ SSH access not available"
echo " You'll need to access VM 100 via Proxmox Console"
USE_SSH=false
fi
echo ""
echo "Configuration will be prepared for:"
echo " Domain: $CLOUDFLARE_DOMAIN"
echo " Account ID: $CLOUDFLARE_ACCOUNT_ID"
echo ""
if [ "$USE_SSH" = true ]; then
echo "Configuring via SSH..."
# Create directories and user
ssh -o StrictHostKeyChecking=no "$VM_USER@$VM_IP" <<EOF
sudo mkdir -p /etc/cloudflared
sudo useradd -r -s /bin/false cloudflared 2>/dev/null || true
sudo chown cloudflared:cloudflared /etc/cloudflared
EOF
# Create config file
ssh -o StrictHostKeyChecking=no "$VM_USER@$VM_IP" "sudo tee /etc/cloudflared/config.yml > /dev/null" <<CONFIGEOF
tunnel: $CLOUDFLARE_TUNNEL_TOKEN
credentials-file: /etc/cloudflared/credentials.json
ingress:
- hostname: grafana.$CLOUDFLARE_DOMAIN
service: http://192.168.1.82:3000
- hostname: prometheus.$CLOUDFLARE_DOMAIN
service: http://192.168.1.82:9090
- hostname: git.$CLOUDFLARE_DOMAIN
service: http://192.168.1.121:3000
- hostname: proxmox-ml110.$CLOUDFLARE_DOMAIN
service: https://192.168.1.206:8006
originRequest:
noTLSVerify: true
- hostname: proxmox-r630.$CLOUDFLARE_DOMAIN
service: https://192.168.1.49:8006
originRequest:
noTLSVerify: true
- service: http_status:404
CONFIGEOF
# Create credentials file
ssh -o StrictHostKeyChecking=no "$VM_USER@$VM_IP" "sudo tee /etc/cloudflared/credentials.json > /dev/null" <<CREDEOF
{
"AccountTag": "$CLOUDFLARE_ACCOUNT_ID",
"TunnelSecret": "$CLOUDFLARE_TUNNEL_TOKEN"
}
CREDEOF
# Set permissions
ssh -o StrictHostKeyChecking=no "$VM_USER@$VM_IP" <<EOF
sudo chown cloudflared:cloudflared /etc/cloudflared/config.yml /etc/cloudflared/credentials.json
sudo chmod 600 /etc/cloudflared/config.yml /etc/cloudflared/credentials.json
EOF
# Create systemd service
ssh -o StrictHostKeyChecking=no "$VM_USER@$VM_IP" "sudo tee /etc/systemd/system/cloudflared.service > /dev/null" <<SERVICEEOF
[Unit]
Description=Cloudflare Tunnel
After=network.target
[Service]
Type=simple
User=cloudflared
ExecStart=/usr/local/bin/cloudflared tunnel --config /etc/cloudflared/config.yml run
Restart=on-failure
RestartSec=10s
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=multi-user.target
SERVICEEOF
# Enable and start service
ssh -o StrictHostKeyChecking=no "$VM_USER@$VM_IP" <<EOF
sudo systemctl daemon-reload
sudo systemctl enable cloudflared
sudo systemctl start cloudflared
sleep 3
sudo systemctl status cloudflared --no-pager
EOF
echo ""
echo "✓ Configuration complete via SSH"
else
echo ""
echo "========================================="
echo "Manual Configuration Required"
echo "========================================="
echo ""
echo "Since SSH is not available, please:"
echo ""
echo "1. Access VM 100 via Proxmox Console:"
echo " - Go to: https://192.168.1.206:8006"
echo " - Navigate to: VM 100 → Console"
echo " - Login as: ubuntu"
echo ""
echo "2. Run these commands on VM 100:"
echo ""
cat <<'MANUAL'
# Create directories and user
sudo mkdir -p /etc/cloudflared
sudo useradd -r -s /bin/false cloudflared 2>/dev/null || true
sudo chown cloudflared:cloudflared /etc/cloudflared
# Create config file
sudo tee /etc/cloudflared/config.yml > /dev/null << 'CONFIGEOF'
tunnel: CLOUDFLARE_TUNNEL_TOKEN
credentials-file: /etc/cloudflared/credentials.json
ingress:
  - hostname: grafana.CLOUDFLARE_DOMAIN
    service: http://192.168.1.82:3000
  - hostname: prometheus.CLOUDFLARE_DOMAIN
    service: http://192.168.1.82:9090
  - hostname: git.CLOUDFLARE_DOMAIN
    service: http://192.168.1.121:3000
  - hostname: proxmox-ml110.CLOUDFLARE_DOMAIN
    service: https://192.168.1.206:8006
    originRequest:
      noTLSVerify: true
  - hostname: proxmox-r630.CLOUDFLARE_DOMAIN
    service: https://192.168.1.49:8006
    originRequest:
      noTLSVerify: true
  - service: http_status:404
CONFIGEOF
# Replace placeholders (run these with actual values from .env)
sudo sed -i "s|CLOUDFLARE_TUNNEL_TOKEN|$CLOUDFLARE_TUNNEL_TOKEN|g" /etc/cloudflared/config.yml
sudo sed -i "s|CLOUDFLARE_DOMAIN|$CLOUDFLARE_DOMAIN|g" /etc/cloudflared/config.yml
# Create credentials file
sudo tee /etc/cloudflared/credentials.json > /dev/null << CREDEOF
{
"AccountTag": "CLOUDFLARE_ACCOUNT_ID",
"TunnelSecret": "CLOUDFLARE_TUNNEL_TOKEN"
}
CREDEOF
# Replace placeholders
sudo sed -i "s|CLOUDFLARE_ACCOUNT_ID|$CLOUDFLARE_ACCOUNT_ID|g" /etc/cloudflared/credentials.json
sudo sed -i "s|CLOUDFLARE_TUNNEL_TOKEN|$CLOUDFLARE_TUNNEL_TOKEN|g" /etc/cloudflared/credentials.json
# Set permissions
sudo chown cloudflared:cloudflared /etc/cloudflared/config.yml /etc/cloudflared/credentials.json
sudo chmod 600 /etc/cloudflared/config.yml /etc/cloudflared/credentials.json
# Create systemd service
sudo tee /etc/systemd/system/cloudflared.service > /dev/null << 'SERVICEEOF'
[Unit]
Description=Cloudflare Tunnel
After=network.target
[Service]
Type=simple
User=cloudflared
ExecStart=/usr/local/bin/cloudflared tunnel --config /etc/cloudflared/config.yml run
Restart=on-failure
RestartSec=10s
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=multi-user.target
SERVICEEOF
# Enable and start service
sudo systemctl daemon-reload
sudo systemctl enable cloudflared
sudo systemctl start cloudflared
systemctl status cloudflared
MANUAL
echo ""
echo "Note: Replace CLOUDFLARE_TUNNEL_TOKEN, CLOUDFLARE_DOMAIN, and CLOUDFLARE_ACCOUNT_ID"
echo " with actual values from your .env file"
echo ""
echo "Or source the .env file first:"
echo " source /path/to/.env"
echo ""
fi
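# Sed-based placeholder substitution is sensitive to the delimiter character:
# token values are base64-like and may contain '/'. A hedged sketch of a
# delimiter-free alternative using bash parameter expansion
# (replace_placeholder is illustrative, not part of this repo):
replace_placeholder() {
    local file="$1" placeholder="$2" value="$3"
    local content
    content="$(cat "$file")"
    # ${content//pattern/replacement} needs no delimiter and no escaping
    printf '%s\n' "${content//$placeholder/$value}" > "$file"
}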
echo ""
echo "========================================="
echo "Configuration Complete"
echo "========================================="
echo ""
echo "Next steps:"
echo "1. Verify service: systemctl status cloudflared"
echo "2. View logs: journalctl -u cloudflared -f"
echo "3. Configure DNS records in Cloudflare Dashboard"
echo ""


@@ -0,0 +1,230 @@
#!/bin/bash
source ~/.bashrc
# Configure GitOps Workflows (Flux) on K3s Cluster
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
VM_USER="${VM_USER:-ubuntu}"
SSH_KEY="${SSH_KEY:-$HOME/.ssh/id_ed25519_proxmox}"
VMID=101
VM_NAME="k3s-master"
GIT_REPO="${GIT_REPO:-http://192.168.1.121:3000/hc-stack/gitops.git}"
GIT_BRANCH="${GIT_BRANCH:-main}"
GIT_PATH="${GIT_PATH:-gitops/}"
# Import helper library
if [ -f "$PROJECT_ROOT/scripts/lib/proxmox_vm_helpers.sh" ]; then
source "$PROJECT_ROOT/scripts/lib/proxmox_vm_helpers.sh"
else
log_error "Helper library not found"
exit 1
fi
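# Sketch of what a guest-agent IP lookup can look like if the helper library
# needs extending: parse the JSON from
# `qm guest cmd <vmid> network-get-interfaces` for the first non-loopback
# IPv4. Illustrative only — the field layout is assumed and the real helper
# may use jq instead of grep:
extract_first_ipv4() {
    # Reads guest-agent JSON on stdin; prints the first non-127.x IPv4.
    grep -o '"ip-address"[[:space:]]*:[[:space:]]*"[0-9.]*"' \
        | grep -o '[0-9]\{1,3\}\(\.[0-9]\{1,3\}\)\{3\}' \
        | grep -v '^127\.' \
        | head -n1
}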
main() {
log_info "Configuring GitOps Workflows on VM $VMID ($VM_NAME)"
echo ""
# Get IP using guest agent
local ip
ip="$(get_vm_ip_or_warn "$VMID" "$VM_NAME" || true)"
if [[ -z "$ip" ]]; then
log_error "Cannot get IP for VM $VMID. Ensure SSH is working and QEMU Guest Agent is installed."
exit 1
fi
log_info "Using IP: $ip"
echo ""
# Check K3s installation
log_info "Checking K3s installation..."
if ! ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "${VM_USER}@${ip}" "sudo kubectl version --client" &>/dev/null; then
log_error "K3s/kubectl not found. Please install K3s first."
exit 1
fi
log_info "K3s is installed"
# Install Flux CLI
log_info "Installing Flux CLI..."
ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "${VM_USER}@${ip}" <<'EOF'
set -e
if ! command -v flux &>/dev/null; then
curl -s https://fluxcd.io/install.sh | sudo bash
flux --version
else
echo "Flux CLI already installed"
flux --version
fi
EOF
# Check if Flux is already installed
log_info "Checking if Flux is already installed..."
if ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "${VM_USER}@${ip}" "sudo kubectl get namespace flux-system" &>/dev/null; then
log_warn "Flux is already installed. Skipping installation."
else
# Install Flux
log_info "Installing Flux in K3s cluster..."
ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "${VM_USER}@${ip}" <<'EOF'
set -e
export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
# sudo resets the environment, so pass KUBECONFIG explicitly
sudo KUBECONFIG=/etc/rancher/k3s/k3s.yaml flux install --components=source-controller,kustomize-controller,helm-controller,notification-controller
EOF
log_info "Waiting for Flux to be ready..."
sleep 10
fi
# Create Git repository secret (if using HTTPS with token)
log_info "Configuring Git repository access..."
log_warn "Note: For Gitea, you may need to create a token and configure authentication"
# For now, we'll set up a basic GitRepository source
# User will need to configure authentication based on their setup
ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "${VM_USER}@${ip}" <<EOF
set -e
export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
# Create namespace for applications if it doesn't exist
sudo kubectl create namespace blockchain --dry-run=client -o yaml | sudo kubectl apply -f -
sudo kubectl create namespace monitoring --dry-run=client -o yaml | sudo kubectl apply -f -
sudo kubectl create namespace hc-stack --dry-run=client -o yaml | sudo kubectl apply -f -
# Create GitRepository source
cat <<'GITREPO' | sudo kubectl apply -f -
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: GitRepository
metadata:
  name: gitops-repo
  namespace: flux-system
spec:
  interval: 1m
  url: $GIT_REPO
  ref:
    branch: $GIT_BRANCH
  ignore: |
    # Exclude certain paths
    .git/
    .github/
    docs/
    scripts/
GITREPO
EOF
log_info "GitRepository source created"
log_warn "If your Git repository requires authentication, you'll need to:"
log_info "1. Create a Git token in Gitea"
log_info "2. Create a secret: kubectl create secret generic gitops-repo-auth \\"
log_info " --from-literal=username=<username> \\"
log_info " --from-literal=password=<token> \\"
log_info " -n flux-system"
log_info "3. Update GitRepository to reference the secret"
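# For reference, once the secret exists the GitRepository spec gains a
# secretRef (hedged sketch of the Flux field; the name matches the
# `kubectl create secret` command above):
#
#   spec:
#     secretRef:
#       name: gitops-repo-auth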
echo ""
# Create Kustomization for infrastructure
log_info "Creating Kustomization for infrastructure..."
ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "${VM_USER}@${ip}" <<'EOF'
set -e
export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
cat <<'KUSTOMIZATION' | sudo kubectl apply -f -
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: infrastructure
  namespace: flux-system
spec:
  interval: 5m
  path: ./gitops/infrastructure
  prune: true
  sourceRef:
    kind: GitRepository
    name: gitops-repo
KUSTOMIZATION
EOF
# Create Kustomization for applications
log_info "Creating Kustomization for applications..."
ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "${VM_USER}@${ip}" <<'EOF'
set -e
export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
cat <<'KUSTOMIZATION' | sudo kubectl apply -f -
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: applications
  namespace: flux-system
spec:
  interval: 5m
  path: ./gitops/apps
  prune: true
  sourceRef:
    kind: GitRepository
    name: gitops-repo
KUSTOMIZATION
EOF
# Wait for reconciliation
log_info "Waiting for Flux to reconcile..."
sleep 10
# Check Flux status
log_info "Checking Flux status..."
ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "${VM_USER}@${ip}" <<'EOF'
set -e
export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
echo "=== Flux Components ==="
sudo kubectl get pods -n flux-system
echo ""
echo "=== GitRepository Status ==="
sudo kubectl get gitrepository -n flux-system
echo ""
echo "=== Kustomization Status ==="
sudo kubectl get kustomization -n flux-system
EOF
log_info "✓ GitOps workflows configured!"
echo ""
log_info "Next steps:"
log_info "1. Ensure your Git repository is accessible from the cluster"
log_info "2. Configure authentication if required (see warnings above)"
log_info "3. Push your GitOps manifests to: $GIT_REPO"
log_info "4. Monitor reconciliation: kubectl get kustomization -n flux-system"
log_info "5. View logs: kubectl logs -n flux-system -l app=kustomize-controller"
}
main "$@"


@@ -0,0 +1,154 @@
#!/bin/bash
source ~/.bashrc
# Configure Cloud-Init on Proxmox VMs via API
# Sets up IP addresses, users, and basic configuration
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'
PVE_USERNAME="${PVE_USERNAME:-root@pam}"
PVE_PASSWORD="${PVE_ROOT_PASS:-}"
PROXMOX_URL="${PROXMOX_ML110_URL:-https://192.168.1.206:8006}"
PROXMOX_NODE="${PROXMOX_NODE:-pve}"
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
get_api_token() {
local response=$(curl -s -k --connect-timeout 10 --max-time 15 \
-d "username=$PVE_USERNAME&password=$PVE_PASSWORD" \
"$PROXMOX_URL/api2/json/access/ticket" 2>&1)
if echo "$response" | grep -q '"data"'; then
local ticket=$(echo "$response" | grep -o '"ticket":"[^"]*' | cut -d'"' -f4)
local csrf_token=$(echo "$response" | grep -o '"CSRFPreventionToken":"[^"]*' | cut -d'"' -f4)
echo "$ticket|$csrf_token"
else
echo ""
fi
}
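# The grep/cut extraction above can be exercised offline against a canned
# response (sketch; real Proxmox tickets are much longer):
parse_ticket_pair() {
    local response="$1"
    local ticket csrf
    ticket=$(echo "$response" | grep -o '"ticket":"[^"]*' | cut -d'"' -f4)
    csrf=$(echo "$response" | grep -o '"CSRFPreventionToken":"[^"]*' | cut -d'"' -f4)
    echo "${ticket}|${csrf}"
}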
configure_vm_cloudinit() {
local vmid=$1
local name=$2
local ip=$3
local gateway=$4
local user=$5
log_info "Configuring cloud-init for VM $vmid ($name)..."
local tokens=$(get_api_token)
if [ -z "$tokens" ]; then
log_error "Failed to authenticate with Proxmox"
return 1
fi
local ticket=$(echo "$tokens" | cut -d'|' -f1)
local csrf_token=$(echo "$tokens" | cut -d'|' -f2)
# Configure cloud-init settings
local response=$(curl -s -k -X PUT \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
-d "ipconfig0=ip=$ip/24,gw=$gateway" \
-d "ciuser=$user" \
-d "cipassword=" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/config" 2>&1)
if echo "$response" | grep -q '"data"'; then
log_info "VM $vmid cloud-init configured successfully"
return 0
else
log_error "Failed to configure VM $vmid: $response"
return 1
fi
}
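# The /24 prefix is hard-coded in the ipconfig0 payload above; a small
# builder makes that assumption explicit and easy to parameterize
# (illustrative sketch, not wired into the call below):
build_ipconfig() {
    # Compose the Proxmox ipconfig0 value from IP, CIDR prefix, and gateway.
    local ip="$1" prefix="$2" gw="$3"
    echo "ipconfig0=ip=${ip}/${prefix},gw=${gw}"
}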
start_vm() {
local vmid=$1
local name=$2
log_info "Starting VM $vmid ($name)..."
local tokens=$(get_api_token)
local ticket=$(echo "$tokens" | cut -d'|' -f1)
local csrf_token=$(echo "$tokens" | cut -d'|' -f2)
local response=$(curl -s -k -X POST \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/status/start" 2>&1)
if echo "$response" | grep -q '"data"'; then
log_info "VM $vmid started successfully"
return 0
else
log_warn "VM $vmid may already be running, or the start failed: $response"
return 0
fi
}
main() {
log_info "Configuring cloud-init on all service VMs"
if [ -z "$PVE_PASSWORD" ]; then
log_error "PVE_ROOT_PASS not set in .env"
exit 1
fi
# VM definitions: vmid name ip gateway user
local vms=(
"100 cloudflare-tunnel 192.168.1.60 192.168.1.254 ubuntu"
"101 k3s-master 192.168.1.188 192.168.1.254 ubuntu"
"102 git-server 192.168.1.121 192.168.1.254 ubuntu"
"103 observability 192.168.1.82 192.168.1.254 ubuntu"
)
# Configure cloud-init
for vm_spec in "${vms[@]}"; do
read -r vmid name ip gateway user <<< "$vm_spec"
configure_vm_cloudinit "$vmid" "$name" "$ip" "$gateway" "$user"
sleep 1
done
log_info "Waiting 5 seconds before starting VMs..."
sleep 5
# Start VMs
for vm_spec in "${vms[@]}"; do
read -r vmid name ip gateway user <<< "$vm_spec"
start_vm "$vmid" "$name"
sleep 2
done
log_info "Cloud-init configuration and VM startup completed!"
log_warn "VMs are starting. They will boot with cloud-init configuration."
log_warn "Check VM status via Proxmox web UI or API."
}
main "$@"


@@ -0,0 +1,200 @@
#!/bin/bash
source ~/.bashrc
# Configure Services on VMs
# Sets up Cloudflare Tunnel, K3s, Git Server, and Observability
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_step() {
echo -e "\n${BLUE}=== $1 ===${NC}"
}
SSH_KEY="$HOME/.ssh/id_ed25519_proxmox"
VM_USER="ubuntu"
PROXMOX_HOST="${PROXMOX_ML110_IP:-192.168.1.206}"
# Import helper library
if [ -f "$PROJECT_ROOT/scripts/lib/proxmox_vm_helpers.sh" ]; then
source "$PROJECT_ROOT/scripts/lib/proxmox_vm_helpers.sh"
else
log_error "Helper library not found. Run this script on Proxmox host or via SSH."
exit 1
fi
# VM definitions: vmid name (no IP - discovered via guest agent)
VMS=(
"100 cloudflare-tunnel"
"101 k3s-master"
"102 git-server"
"103 observability"
)
wait_for_vm() {
local vmid=$1
local name=$2
local max_wait=300
local waited=0
log_info "Waiting for $name (VM $vmid) to be reachable..."
# Ensure guest agent is enabled
ensure_guest_agent_enabled "$vmid" || true
while [ $waited -lt $max_wait ]; do
local ip
ip="$(get_vm_ip_from_guest_agent "$vmid" || true)"
if [[ -n "$ip" ]]; then
log_info "$name is reachable at $ip"
sleep 10 # Give it a bit more time for SSH
if timeout 3 bash -c "cat < /dev/null > /dev/tcp/$ip/22" 2>/dev/null; then
log_info "✓ SSH is available"
return 0
fi
fi
sleep 5
waited=$((waited + 5))
echo -n "."
done
echo ""
log_warn "$name (VM $vmid) not reachable after $max_wait seconds"
return 1
}
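# The polling loop above follows a common retry shape; factored out it reads
# as below (sketch only — wait_for_vm keeps its inline loop):
retry_until() {
    # Run "$@" up to $1 times, sleeping $2 seconds between attempts.
    local attempts="$1" delay="$2"
    shift 2
    local i
    for ((i = 1; i <= attempts; i++)); do
        "$@" && return 0
        [ "$i" -lt "$attempts" ] && sleep "$delay"
    done
    return 1
}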
configure_cloudflare_tunnel() {
local ip=$1
log_step "Configuring Cloudflare Tunnel on VM 100"
log_info "Installing cloudflared..."
ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "$VM_USER@$ip" "sudo apt update && sudo apt install -y cloudflared" || {
log_error "Failed to install cloudflared"
return 1
}
log_warn "Cloudflare Tunnel requires authentication - manual setup needed"
log_info "See: docs/services/cloudflare-tunnel-setup.md"
}
configure_k3s() {
local ip=$1
log_step "Configuring K3s on VM 101"
log_info "Installing K3s..."
ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "$VM_USER@$ip" "curl -sfL https://get.k3s.io | sh -" || {
log_error "Failed to install K3s"
return 1
}
log_info "Verifying K3s installation..."
ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "$VM_USER@$ip" "sudo kubectl get nodes" || {
log_error "K3s not working properly"
return 1
}
log_info "✓ K3s installed and running"
}
configure_git_server() {
local ip=$1
log_step "Configuring Git Server on VM 102"
log_info "Installing Gitea..."
ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "$VM_USER@$ip" "sudo apt update && sudo apt install -y docker.io docker-compose" || {
log_error "Failed to install Docker"
return 1
}
log_warn "Gitea setup requires manual configuration"
log_info "See: docs/services/git-server-setup.md"
}
configure_observability() {
local ip=$1
log_step "Configuring Observability Stack on VM 103"
log_info "Installing Docker and Docker Compose..."
ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "$VM_USER@$ip" "sudo apt update && sudo apt install -y docker.io docker-compose" || {
log_error "Failed to install Docker"
return 1
}
log_warn "Observability stack requires manual configuration"
log_info "See: docs/services/observability-setup.md"
}
main() {
log_info "Configuring Services on VMs"
echo ""
if [ ! -f "$SSH_KEY" ]; then
log_error "SSH key not found: $SSH_KEY"
exit 1
fi
# Wait for VMs to be accessible and get IPs
declare -A VM_IPS
for vm_spec in "${VMS[@]}"; do
read -r vmid name <<< "$vm_spec"
wait_for_vm "$vmid" "$name"
# Get IP from guest agent
local ip
ip="$(get_vm_ip_or_warn "$vmid" "$name" || true)"
if [[ -n "$ip" ]]; then
VM_IPS["$vmid"]="$ip"
else
log_error "Cannot get IP for VM $vmid ($name), skipping"
continue
fi
done
# Configure services using discovered IPs
if [[ -n "${VM_IPS[100]:-}" ]]; then
configure_cloudflare_tunnel "${VM_IPS[100]}"
fi
if [[ -n "${VM_IPS[101]:-}" ]]; then
configure_k3s "${VM_IPS[101]}"
fi
if [[ -n "${VM_IPS[102]:-}" ]]; then
configure_git_server "${VM_IPS[102]}"
fi
if [[ -n "${VM_IPS[103]:-}" ]]; then
configure_observability "${VM_IPS[103]}"
fi
log_step "Service Configuration Complete!"
log_info "Some services require manual configuration (see docs/services/)"
}
main "$@"


@@ -0,0 +1,119 @@
#!/bin/bash
source ~/.bashrc
# Continue All Steps with Troubleshooting
# Attempts to complete all steps and troubleshoot issues
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_issue() {
echo -e "${RED}[ISSUE]${NC} $1"
}
log_step() {
echo -e "\n${BLUE}=== $1 ===${NC}"
}
# Step 1: Diagnose Issues
diagnose_issues() {
log_step "Step 1: Diagnosing Issues"
if [ -f "$PROJECT_ROOT/scripts/troubleshooting/diagnose-vm-issues.sh" ]; then
"$PROJECT_ROOT/scripts/troubleshooting/diagnose-vm-issues.sh"
else
log_warn "Diagnosis script not found"
fi
}
# Step 2: Fix Template (if possible)
fix_template() {
log_step "Step 2: Attempting Template Fixes"
log_info "Template disk expanded to 8G (if not already)"
log_warn "Template needs OS installation - see TROUBLESHOOTING_AND_FIXES.md"
log_info "This requires manual access to Proxmox Web UI"
}
# Step 3: Continue Infrastructure Setup
continue_infrastructure() {
log_step "Step 3: Continuing Infrastructure Setup"
log_info "Checking cluster status..."
# Cluster check done in main script
log_warn "Infrastructure setup requires SSH access to Proxmox hosts"
log_info "To configure cluster:"
log_info " ssh root@192.168.1.206"
log_info " export CLUSTER_NAME=hc-cluster NODE_ROLE=create"
log_info " ./infrastructure/proxmox/cluster-setup.sh"
}
# Step 4: Monitor VM Status
monitor_vms() {
log_step "Step 4: Monitoring VM Status"
local vms=("100" "101" "102" "103")
for vmid in "${vms[@]}"; do
# Placeholder: a real check would query the Proxmox API for VM status
log_info "Checking VM $vmid..."
done
}
main() {
log_info "Continuing All Steps with Troubleshooting"
echo ""
# Diagnose
diagnose_issues
# Fix template
fix_template
# Continue infrastructure
continue_infrastructure
# Monitor
monitor_vms
log_step "Summary"
log_issue "CRITICAL: Template VM 9000 needs OS installation"
log_info "See TROUBLESHOOTING_AND_FIXES.md for detailed fix instructions"
log_info "After template is fixed, recreate VMs and continue"
}
main "$@"


@@ -0,0 +1,158 @@
#!/bin/bash
source ~/.bashrc
# Complete Deployment Script - All Services
# Orchestrates deployment of all VMs and services
set -e
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
CYAN='\033[0;36m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_step() {
echo -e "${BLUE}[STEP]${NC} $1"
}
log_header() {
echo -e "${CYAN}========================================${NC}"
echo -e "${CYAN}$1${NC}"
echo -e "${CYAN}========================================${NC}"
}
# VM configurations
declare -A VMS=(
["100"]="cloudflare-tunnel:192.168.1.60:scripts/setup-cloudflare-tunnel.sh"
["101"]="k3s-master:192.168.1.188:scripts/setup-k3s.sh"
["102"]="git-server:192.168.1.121:scripts/setup-git-server.sh"
["103"]="observability:192.168.1.82:scripts/setup-observability.sh"
)
# Check VM connectivity and run setup
setup_vm() {
local vmid=$1
local name=$2
local ip=$3
local script=$4
log_step "Setting up $name ($ip)..."
# Check connectivity
if ! ping -c 1 -W 2 "$ip" >/dev/null 2>&1; then
log_warn "$name ($ip) is not reachable. Skipping..."
return 1
fi
log_info "Copying setup script to $name..."
if scp "$script" "${VM_USER:-ubuntu}@$ip:/tmp/setup.sh" 2>/dev/null; then
log_info "Running setup script on $name..."
ssh "${VM_USER:-ubuntu}@$ip" "sudo bash /tmp/setup.sh" || log_warn "Setup script failed on $name"
else
log_warn "Could not copy script to $name. Manual setup required."
log_info "Manual steps:"
echo "  1. SSH to $name: ssh ${VM_USER:-ubuntu}@$ip"
echo "  2. Copy $script to VM"
echo "  3. Run: sudo bash /path/to/script"
fi
fi
}
main() {
log_header "Complete Deployment - All Services"
echo ""
log_step "Phase 1: Prerequisites"
echo ""
if ./scripts/utils/test-proxmox-connection.sh > /dev/null 2>&1; then
log_info "✓ Proxmox connections verified"
else
log_error "Proxmox connection failed"
exit 1
fi
echo ""
log_step "Phase 2: VM Creation Status"
echo ""
log_warn "VMs must be created via Proxmox Web UI first"
log_info "Proxmox URL: https://192.168.1.206:8006"
log_info "See CREATE_VMS.md for detailed instructions"
echo ""
log_info "Required VMs:"
for vmid in "${!VMS[@]}"; do
IFS=':' read -r name ip script <<< "${VMS[$vmid]}"
echo " - $name (ID: $vmid, IP: $ip)"
done
echo ""
read -p "Have all VMs been created and OS installed? (y/n) " -n 1 -r
echo
if [[ ! $REPLY =~ ^[Yy]$ ]]; then
log_warn "Please create VMs first, then run this script again"
exit 0
fi
log_step "Phase 3: Automated Setup"
echo ""
log_info "Attempting to set up each VM..."
echo ""
for vmid in "${!VMS[@]}"; do
IFS=':' read -r name ip script <<< "${VMS[$vmid]}"
setup_vm "$vmid" "$name" "$ip" "$script"
echo ""
done
log_step "Phase 4: Post-Setup Verification"
echo ""
log_info "Verifying services..."
echo ""
# Check services
services=(
"192.168.1.60:22:Cloudflare Tunnel (SSH)"
"192.168.1.188:6443:K3s API"
"192.168.1.121:3000:Gitea"
"192.168.1.82:9090:Prometheus"
"192.168.1.82:3000:Grafana"
)
for service in "${services[@]}"; do
IFS=':' read -r ip port name <<< "$service"
if timeout 2 bash -c "echo >/dev/tcp/$ip/$port" 2>/dev/null; then
log_info "$name is accessible"
else
log_warn "$name is not accessible (may still be starting)"
fi
done
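# The "host:port:name" convention used above can be unit-tested offline;
# note that names may contain spaces but not colons (sketch):
split_service() {
    local spec="$1" host port name
    IFS=':' read -r host port name <<< "$spec"
    printf '%s %s %s\n' "$host" "$port" "$name"
}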
log_header "Deployment Complete"
echo ""
log_info "Next steps:"
echo " 1. Configure Cloudflare Tunnel (see docs/cloudflare-integration.md)"
echo " 2. Set up K3s namespaces and deploy services"
echo " 3. Configure GitOps repository"
echo " 4. Deploy HC Stack services"
echo ""
log_info "See DEPLOYMENT_CHECKLIST.md to track remaining tasks"
}
main "$@"

scripts/deploy/deploy-gitea.sh Executable file

@@ -0,0 +1,180 @@
#!/bin/bash
source ~/.bashrc
# Deploy Gitea on VM 102 using guest-agent IP discovery
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
VM_USER="${VM_USER:-ubuntu}"
SSH_KEY="${SSH_KEY:-$HOME/.ssh/id_ed25519_proxmox}"
VMID=102
VM_NAME="git-server"
# Import helper library
if [ -f "$PROJECT_ROOT/scripts/lib/proxmox_vm_helpers.sh" ]; then
source "$PROJECT_ROOT/scripts/lib/proxmox_vm_helpers.sh"
else
log_error "Helper library not found"
exit 1
fi
main() {
log_info "Deploying Gitea on VM $VMID ($VM_NAME)"
echo ""
# Get IP using guest agent
local ip
ip="$(get_vm_ip_or_warn "$VMID" "$VM_NAME" || true)"
if [[ -z "$ip" ]]; then
log_error "Cannot get IP for VM $VMID. Ensure SSH is working and QEMU Guest Agent is installed."
exit 1
fi
log_info "Using IP: $ip"
echo ""
# Check if Docker is installed
log_info "Checking Docker installation..."
if ! ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "${VM_USER}@${ip}" "command -v docker" &>/dev/null; then
log_warn "Docker not found. Installing Docker..."
ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "${VM_USER}@${ip}" <<'EOF'
set -e
sudo apt-get update -qq
sudo apt-get install -y docker.io docker-compose
sudo usermod -aG docker $USER
EOF
log_info "Docker installed. You may need to log out and back in for group changes."
else
log_info "Docker is installed"
fi
# Create Gitea directory
log_info "Setting up Gitea directory..."
ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "${VM_USER}@${ip}" <<'EOF'
set -e
mkdir -p ~/gitea
cd ~/gitea
EOF
# Copy docker-compose file
log_info "Creating docker-compose.yml..."
ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "${VM_USER}@${ip}" "cat > ~/gitea/docker-compose.yml" <<EOF
version: '3.8'
services:
  gitea:
    image: gitea/gitea:latest
    container_name: gitea
    restart: unless-stopped
    environment:
      - USER_UID=1000
      - USER_GID=1000
      - GITEA__database__DB_TYPE=postgres
      - GITEA__database__HOST=db:5432
      - GITEA__database__NAME=gitea
      - GITEA__database__USER=gitea
      - GITEA__database__PASSWD=gitea
      - GITEA__server__DOMAIN=${ip}
      - GITEA__server__SSH_DOMAIN=${ip}
      - GITEA__server__SSH_PORT=2222
      - GITEA__server__ROOT_URL=http://${ip}:3000
    volumes:
      - gitea_data:/data
      - /etc/timezone:/etc/timezone:ro
      - /etc/localtime:/etc/localtime:ro
    ports:
      - "3000:3000"
      - "2222:22"
    depends_on:
      - db
    networks:
      - gitea-network
  db:
    image: postgres:15
    container_name: gitea-db
    restart: unless-stopped
    environment:
      - POSTGRES_USER=gitea
      - POSTGRES_PASSWORD=gitea
      - POSTGRES_DB=gitea
    volumes:
      - gitea_db_data:/var/lib/postgresql/data
    networks:
      - gitea-network
volumes:
  gitea_data:
    driver: local
  gitea_db_data:
    driver: local
networks:
  gitea-network:
    driver: bridge
EOF
# Deploy
log_info "Deploying Gitea with Docker Compose..."
ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "${VM_USER}@${ip}" <<'EOF'
set -e
cd ~/gitea
sudo docker-compose up -d
EOF
# Wait for service to be ready
log_info "Waiting for Gitea to start..."
sleep 10
# Verify
log_info "Verifying Gitea deployment..."
local max_wait=60
local elapsed=0
while [ $elapsed -lt $max_wait ]; do
if curl -s "http://${ip}:3000" &>/dev/null; then
log_info "✓ Gitea is running!"
echo ""
log_info "Access Gitea at: http://${ip}:3000"
log_info "SSH access: ssh://git@${ip}:2222"
return 0
fi
sleep 5
elapsed=$((elapsed + 5))
echo -n "."
done
log_warn "Gitea may not be fully ready yet. Check logs with:"
log_info " ssh ${VM_USER}@${ip} 'cd ~/gitea && sudo docker-compose logs'"
}
main "$@"


@@ -0,0 +1,197 @@
#!/bin/bash
source ~/.bashrc
# Deploy Observability Stack (Prometheus + Grafana) on VM 103 using guest-agent IP discovery
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
VM_USER="${VM_USER:-ubuntu}"
SSH_KEY="${SSH_KEY:-$HOME/.ssh/id_ed25519_proxmox}"
VMID=103
VM_NAME="observability"
# Import helper library
if [ -f "$PROJECT_ROOT/scripts/lib/proxmox_vm_helpers.sh" ]; then
source "$PROJECT_ROOT/scripts/lib/proxmox_vm_helpers.sh"
else
log_error "Helper library not found"
exit 1
fi
main() {
log_info "Deploying Observability Stack on VM $VMID ($VM_NAME)"
echo ""
# Get IP using guest agent
local ip
ip="$(get_vm_ip_or_warn "$VMID" "$VM_NAME" || true)"
if [[ -z "$ip" ]]; then
log_error "Cannot get IP for VM $VMID. Ensure SSH is working and QEMU Guest Agent is installed."
exit 1
fi
log_info "Using IP: $ip"
echo ""
# Check if Docker is installed
log_info "Checking Docker installation..."
if ! ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "${VM_USER}@${ip}" "command -v docker" &>/dev/null; then
log_warn "Docker not found. Installing Docker..."
ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "${VM_USER}@${ip}" <<'EOF'
set -e
sudo apt-get update -qq
sudo apt-get install -y docker.io docker-compose
sudo usermod -aG docker $USER
EOF
log_info "Docker installed. You may need to log out and back in for group changes."
else
log_info "Docker is installed"
fi
# Create observability directory structure
log_info "Setting up observability directory..."
ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "${VM_USER}@${ip}" <<'EOF'
set -e
mkdir -p ~/observability/prometheus
cd ~/observability
EOF
# Create Prometheus config
log_info "Creating Prometheus configuration..."
ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "${VM_USER}@${ip}" "cat > ~/observability/prometheus/prometheus.yml" <<'EOF'
global:
  scrape_interval: 15s
  evaluation_interval: 15s
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
EOF
# Create docker-compose file
log_info "Creating docker-compose.yml..."
ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "${VM_USER}@${ip}" "cat > ~/observability/docker-compose.yml" <<'EOF'
version: '3.8'
services:
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    restart: unless-stopped
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus:/etc/prometheus
      - prometheus-data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--storage.tsdb.retention.time=30d'
    networks:
      - observability
  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    restart: unless-stopped
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_USER=admin
      - GF_SECURITY_ADMIN_PASSWORD=admin
      - GF_USERS_ALLOW_SIGN_UP=false
      - GF_SERVER_ROOT_URL=http://localhost:3000
    volumes:
      - grafana-data:/var/lib/grafana
    networks:
      - observability
    depends_on:
      - prometheus
volumes:
  prometheus-data:
    driver: local
  grafana-data:
    driver: local
networks:
  observability:
    driver: bridge
EOF
# Deploy
log_info "Deploying Observability Stack with Docker Compose..."
ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "${VM_USER}@${ip}" <<'EOF'
set -e
cd ~/observability
sudo docker-compose up -d
EOF
# Wait for services to be ready
log_info "Waiting for services to start..."
sleep 15
# Verify
log_info "Verifying services..."
local prometheus_ok=false
local grafana_ok=false
for i in {1..12}; do
# Use -f so an HTTP error status counts as unhealthy, not just connection failures
if curl -sf "http://${ip}:9090/-/healthy" &>/dev/null; then
prometheus_ok=true
fi
if curl -sf "http://${ip}:3000/api/health" &>/dev/null; then
grafana_ok=true
fi
if [ "$prometheus_ok" = true ] && [ "$grafana_ok" = true ]; then
break
fi
sleep 5
echo -n "."
done
echo ""
if [ "$prometheus_ok" = true ] && [ "$grafana_ok" = true ]; then
log_info "✓ Observability Stack is running!"
echo ""
log_info "Access services:"
log_info " Prometheus: http://${ip}:9090"
log_info " Grafana: http://${ip}:3000 (admin/admin)"
else
log_warn "Some services may not be fully ready. Check logs with:"
log_info " ssh ${VM_USER}@${ip} 'cd ~/observability && sudo docker-compose logs'"
fi
}
main "$@"
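The health-check polling above can be factored into a small reusable helper; a minimal sketch (the `wait_for` name is ours, not part of these scripts):

```shell
#!/bin/bash
# wait_for CMD TIMEOUT: retry CMD once per second until it succeeds or TIMEOUT seconds elapse
wait_for() {
    local cmd=$1 timeout=$2 elapsed=0
    until eval "$cmd" &>/dev/null; do
        (( elapsed >= timeout )) && return 1
        sleep 1
        elapsed=$(( elapsed + 1 ))
    done
    return 0
}

# e.g. wait_for "curl -sf http://${ip}:9090/-/healthy" 60
wait_for "true" 5 && echo "ready"
wait_for "false" 2 || echo "timed out"
```

Bounding the wait with a timeout keeps a deploy script from hanging indefinitely on a service that never comes up.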

scripts/deploy/deploy-start.sh Executable file

@@ -0,0 +1,158 @@
#!/bin/bash
source ~/.bashrc
# Start Deployment Script
# Guides through initial VM creation and setup
set -e
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
CYAN='\033[0;36m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_step() {
echo -e "${BLUE}[STEP]${NC} $1"
}
log_header() {
echo -e "${CYAN}========================================${NC}"
echo -e "${CYAN}$1${NC}"
echo -e "${CYAN}========================================${NC}"
}
# Load environment variables
if [ -f .env ]; then
set -a
source <(grep -v '^#' .env | grep -v '^$' | sed 's/#.*$//' | sed 's/^[[:space:]]*//;s/[[:space:]]*$//' | grep '=')
set +a
else
log_error ".env file not found!"
log_info "Copy .env.example to .env and configure it"
exit 1
fi
log_header "Azure Stack HCI Deployment - Starting"
log_step "Step 1: Verifying Prerequisites"
echo ""
# Test Proxmox connections
log_info "Testing Proxmox connections..."
if ./scripts/utils/test-proxmox-connection.sh > /dev/null 2>&1; then
log_info "✓ Proxmox connections verified"
else
log_error "Proxmox connection failed. Please check your .env file"
exit 1
fi
echo ""
log_step "Step 2: VM Creation Options"
echo ""
log_info "You have 3 options to create VMs:"
echo ""
echo -e " ${CYAN}Option 1: Proxmox Web UI (Recommended for first-time)${NC}"
echo " - Access: https://192.168.1.206:8006"
echo " - Login: root@pam / (password from PVE_ROOT_PASS)"
echo " - See CREATE_VMS.md for detailed instructions"
echo ""
echo -e " ${CYAN}Option 2: Terraform${NC}"
echo " - Requires VM templates to be created first"
echo " - cd terraform/proxmox && terraform init && terraform apply"
echo ""
echo -e " ${CYAN}Option 3: Manual API (Advanced)${NC}"
echo " - Use scripts/proxmox/create-service-vms.sh"
echo ""
read -p "Which option do you want to use? (1/2/3) [1]: " choice
choice=${choice:-1}
case $choice in
1)
log_info "Opening Proxmox Web UI instructions..."
echo ""
log_warn "Please create the following VMs manually:"
echo ""
echo " 1. Cloudflare Tunnel VM"
echo " - VM ID: 100"
echo " - Name: cloudflare-tunnel"
echo " - IP: 192.168.1.60"
echo " - Specs: 2 CPU, 4GB RAM, 40GB disk"
echo ""
echo " 2. K3s Master VM"
echo " - VM ID: 101"
echo " - Name: k3s-master"
echo " - IP: 192.168.1.188"
echo " - Specs: 4 CPU, 8GB RAM, 80GB disk"
echo ""
echo " 3. Git Server VM"
echo " - VM ID: 102"
echo " - Name: git-server"
echo " - IP: 192.168.1.121"
echo " - Specs: 4 CPU, 8GB RAM, 100GB disk"
echo ""
echo " 4. Observability VM"
echo " - VM ID: 103"
echo " - Name: observability"
echo " - IP: 192.168.1.82"
echo " - Specs: 4 CPU, 8GB RAM, 200GB disk"
echo ""
log_info "Proxmox URL: https://192.168.1.206:8006"
log_info "See CREATE_VMS.md for detailed step-by-step instructions"
echo ""
read -p "Press Enter after you've created at least the Cloudflare Tunnel VM..."
;;
2)
log_info "Initializing Terraform..."
cd terraform/proxmox
if [ ! -f terraform.tfvars ]; then
log_error "terraform.tfvars not found. Please create it first."
exit 1
fi
terraform init
log_info "Review the plan:"
terraform plan
read -p "Apply Terraform? (y/n) " -n 1 -r
echo
if [[ $REPLY =~ ^[Yy]$ ]]; then
terraform apply
fi
cd ../..
;;
3)
log_info "Using API-based creation..."
./scripts/proxmox/create-service-vms.sh
;;
esac
echo ""
log_step "Step 3: Next Steps After VM Creation"
echo ""
log_info "After creating VMs, you need to:"
echo ""
echo " 1. Install Ubuntu 22.04 LTS on each VM"
echo " 2. Configure static IP addresses"
echo " 3. Run setup scripts:"
echo " - scripts/setup-cloudflare-tunnel.sh (on Tunnel VM)"
echo " - scripts/setup-k3s.sh (on K3s VM)"
echo ""
log_info "See QUICK_START.md for complete instructions"
echo ""
log_header "Deployment Started"
log_info "Check DEPLOYMENT_CHECKLIST.md to track progress"


@@ -0,0 +1,174 @@
#!/bin/bash
source ~/.bashrc
# Deploy Service VMs via Proxmox API
# Can be executed without SSH access
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
PVE_USERNAME="${PVE_USERNAME:-root@pam}"
PVE_PASSWORD="${PVE_ROOT_PASS:-}"
PROXMOX_URL="${PROXMOX_ML110_URL:-https://192.168.1.206:8006}"
PROXMOX_NODE="${PROXMOX_NODE:-pve}"
TEMPLATE_VMID="${TEMPLATE_VMID:-9000}"
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
get_api_token() {
local response=$(curl -s -k --connect-timeout 10 --max-time 15 \
-d "username=$PVE_USERNAME&password=$PVE_PASSWORD" \
"$PROXMOX_URL/api2/json/access/ticket" 2>&1)
if echo "$response" | grep -q '"data"'; then
local ticket=$(echo "$response" | grep -o '"ticket":"[^"]*' | cut -d'"' -f4)
local csrf_token=$(echo "$response" | grep -o '"CSRFPreventionToken":"[^"]*' | cut -d'"' -f4)
echo "$ticket|$csrf_token"
else
echo ""
fi
}
create_vm_from_template() {
local vmid=$1
local name=$2
local ip=$3
local gateway=$4
local cores=$5
local memory=$6
local disk_size=$7
local bridge="${8:-vmbr0}"
log_info "Creating VM $vmid: $name"
local tokens=$(get_api_token)
if [ -z "$tokens" ]; then
log_error "Failed to authenticate with Proxmox"
return 1
fi
local ticket=$(echo "$tokens" | cut -d'|' -f1)
local csrf_token=$(echo "$tokens" | cut -d'|' -f2)
# Check if template exists
local template_check=$(curl -s -k -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$TEMPLATE_VMID/status/current" 2>&1)
if ! echo "$template_check" | grep -q '"data"'; then
log_error "Template VM $TEMPLATE_VMID not found or not accessible"
return 1
fi
# Clone VM from template
log_info "Cloning from template $TEMPLATE_VMID..."
local clone_response=$(curl -s -k -X POST \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
-d "newid=$vmid" \
-d "name=$name" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$TEMPLATE_VMID/clone" 2>&1)
if echo "$clone_response" | grep -q '"data"'; then
log_info "VM cloned successfully"
else
log_error "Failed to clone VM: $clone_response"
return 1
fi
# Wait for clone to complete
sleep 5
# Configure VM
log_info "Configuring VM $vmid..."
# Set CPU and memory
curl -s -k -X PUT \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
-d "cores=$cores" \
-d "memory=$memory" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/config" > /dev/null
# Set network and IP
curl -s -k -X PUT \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
-d "net0=virtio,bridge=$bridge" \
-d "ipconfig0=ip=$ip/24,gw=$gateway" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/config" > /dev/null
# Grow the cloned disk to the requested size (the template disk may be smaller)
curl -s -k -X PUT \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
-d "disk=scsi0" \
-d "size=${disk_size}G" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/resize" > /dev/null
log_info "VM $vmid configured successfully"
return 0
}
main() {
log_info "Deploying Service VMs via Proxmox API"
if [ -z "$PVE_PASSWORD" ]; then
log_error "PVE_ROOT_PASS not set in .env"
exit 1
fi
# VM definitions
# Format: vmid name ip gateway cores memory_mb disk_gb bridge
local vms=(
"100 cloudflare-tunnel 192.168.1.60 192.168.1.254 2 4096 40 vmbr0"
"101 k3s-master 192.168.1.188 192.168.1.254 4 8192 80 vmbr0"
"102 git-server 192.168.1.121 192.168.1.254 4 8192 100 vmbr0"
"103 observability 192.168.1.82 192.168.1.254 4 8192 200 vmbr0"
)
for vm_spec in "${vms[@]}"; do
read -r vmid name ip gateway cores memory disk bridge <<< "$vm_spec"
# Check if VM already exists
local tokens=$(get_api_token)
local ticket=$(echo "$tokens" | cut -d'|' -f1)
local csrf_token=$(echo "$tokens" | cut -d'|' -f2)
local vm_check=$(curl -s -k -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/status/current" 2>&1)
if echo "$vm_check" | grep -q '"data"'; then
log_warn "VM $vmid ($name) already exists, skipping"
continue
fi
create_vm_from_template "$vmid" "$name" "$ip" "$gateway" "$cores" "$memory" "$disk" "$bridge"
done
log_info "VM deployment completed!"
log_warn "Next steps:"
log_warn " 1. Install Ubuntu 24.04 on each VM via Proxmox console"
log_warn " 2. Configure services after OS installation"
}
main "$@"
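`get_api_token` returns the ticket and CSRF token joined with `|`, which callers split with `cut`. With a hypothetical sample response (the real one comes from `POST /api2/json/access/ticket`; the field values here are made up), the parsing looks like this:

```shell
#!/bin/bash
# Hypothetical response body for illustration only
response='{"data":{"ticket":"PVE:root@pam:6789ABCD","CSRFPreventionToken":"4EF01234:sig"}}'

# Same grep/cut extraction used by get_api_token
ticket=$(echo "$response" | grep -o '"ticket":"[^"]*' | cut -d'"' -f4)
csrf_token=$(echo "$response" | grep -o '"CSRFPreventionToken":"[^"]*' | cut -d'"' -f4)

echo "${ticket}|${csrf_token}"   # → PVE:root@pam:6789ABCD|4EF01234:sig
```

This grep-based JSON parsing works for flat responses like these but breaks if a value ever contains an escaped quote; later scripts switch to `python3 -c` with the `json` module for exactly that reason.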


@@ -0,0 +1,67 @@
#!/bin/bash
source ~/.bashrc
# Quick Deployment Script - Without Azure Arc
# Deploys infrastructure stack without Azure dependencies
set -e
# Colors
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'
echo "========================================="
echo "Deployment Without Azure Arc"
echo "========================================="
echo ""
echo "This script will guide you through deployment"
echo "without Azure Arc integration."
echo ""
echo "Press Enter to continue or Ctrl+C to cancel..."
read
# Load environment variables
if [ -f .env ]; then
source <(grep -v '^#' .env | grep -v '^$' | sed 's/#.*$//' | sed 's/^[[:space:]]*//;s/[[:space:]]*$//' | grep '=')
fi
echo ""
echo "=== Phase 1: Verify Proxmox Cluster ==="
echo "Testing Proxmox connections..."
./scripts/utils/test-proxmox-connection.sh
echo ""
echo "=== Phase 2: Create Service VMs ==="
echo "Choose deployment method:"
echo "1. Use Terraform (automated)"
echo "2. Manual via Proxmox UI"
read -p "Choice [1-2]: " vm_choice
if [ "$vm_choice" = "1" ]; then
echo "Using Terraform for VM creation..."
cd terraform/proxmox
terraform init
terraform plan
echo "Review plan above, then run: terraform apply"
else
echo "Create VMs manually via Proxmox UI:"
echo " - K3s VM: 192.168.1.188"
echo " - Cloudflare Tunnel VM: 192.168.1.60"
echo " - Git Server VM: 192.168.1.121"
echo " - Observability VM: 192.168.1.82"
fi
echo ""
echo "=== Phase 3: Cloudflare Tunnel Setup ==="
echo "Tunnel token available: ${CLOUDFLARE_TUNNEL_TOKEN:0:10}***"
echo "See DEPLOYMENT_WITHOUT_AZURE.md for detailed setup"
echo ""
echo "=== Phase 4: Kubernetes Deployment ==="
echo "Once K3s VM is ready, run:"
echo " ssh ubuntu@192.168.1.188"
echo " curl -sfL https://get.k3s.io | sh -"
echo ""
echo "=== Next Steps ==="
echo "See DEPLOYMENT_WITHOUT_AZURE.md for complete guide"
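Each script loads `.env` through the same grep/sed pipeline. A self-contained sketch of what that loader accepts and rejects (the sample file content is hypothetical):

```shell
#!/bin/bash
# Build a throwaway .env with a comment, a blank line, and a trailing comment
env_file=$(mktemp)
cat > "$env_file" <<'EOF'
# comment line
PVE_USERNAME=root@pam

CLUSTER_NAME=hc-cluster  # trailing comment
EOF

# The loader used in these scripts: drop comments and blanks, strip trailing
# comments, trim whitespace, and keep only KEY=VALUE lines
set -a
source <(grep -v '^#' "$env_file" | grep -v '^$' | sed 's/#.*$//' | sed 's/^[[:space:]]*//;s/[[:space:]]*$//' | grep '=')
set +a
rm -f "$env_file"

echo "$PVE_USERNAME $CLUSTER_NAME"   # → root@pam hc-cluster
```

`set -a` exports everything the sourced stream defines, so child processes such as `terraform` and `curl` see the same variables.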


@@ -0,0 +1,238 @@
#!/bin/bash
source ~/.bashrc
# Execute All Todo Items - Proxmox Deployment
# Automates execution of all remaining deployment tasks
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
ML110_IP="192.168.1.206"
R630_IP="192.168.1.49"
CLUSTER_NAME="${CLUSTER_NAME:-hc-cluster}"
NFS_SERVER="${NFS_SERVER:-10.10.10.1}"
NFS_PATH="${NFS_PATH:-/mnt/storage}"
STORAGE_NAME="${STORAGE_NAME:-router-storage}"
PVE_ROOT_PASS="${PVE_ROOT_PASS:-}"
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_step() {
echo -e "\n${BLUE}=== $1 ===${NC}"
}
execute_remote() {
local host=$1
local command=$2
local description=$3
log_info "$description on $host"
if ssh -o StrictHostKeyChecking=no -o ConnectTimeout=10 "root@$host" "$command"; then
log_info "$description completed on $host"
return 0
else
log_error "$description failed on $host"
return 1
fi
}
copy_file_remote() {
local host=$1
local source=$2
local dest=$3
log_info "Copying $source to root@$host:$dest"
scp -o StrictHostKeyChecking=no "$source" "root@$host:$dest"
}
# Step 1: Create cluster on ML110
create_cluster_ml110() {
log_step "Creating Proxmox Cluster on ML110"
# Copy cluster setup script
copy_file_remote "$ML110_IP" "$PROJECT_ROOT/infrastructure/proxmox/cluster-setup.sh" "/tmp/cluster-setup.sh"
# Execute cluster creation
execute_remote "$ML110_IP" \
"chmod +x /tmp/cluster-setup.sh && CLUSTER_NAME=$CLUSTER_NAME NODE_ROLE=create /tmp/cluster-setup.sh" \
"Cluster creation"
# Verify
execute_remote "$ML110_IP" "pvecm status && pvecm nodes" "Cluster verification"
}
# Step 2: Join R630 to cluster
join_cluster_r630() {
log_step "Joining R630 to Proxmox Cluster"
# Copy cluster setup script
copy_file_remote "$R630_IP" "$PROJECT_ROOT/infrastructure/proxmox/cluster-setup.sh" "/tmp/cluster-setup.sh"
# Execute cluster join
if [ -n "$PVE_ROOT_PASS" ]; then
execute_remote "$R630_IP" \
"chmod +x /tmp/cluster-setup.sh && CLUSTER_NAME=$CLUSTER_NAME NODE_ROLE=join CLUSTER_NODE_IP=$ML110_IP ROOT_PASSWORD='$PVE_ROOT_PASS' /tmp/cluster-setup.sh" \
"Cluster join"
else
log_warn "PVE_ROOT_PASS not set, cluster join may require manual password entry"
execute_remote "$R630_IP" \
"chmod +x /tmp/cluster-setup.sh && CLUSTER_NAME=$CLUSTER_NAME NODE_ROLE=join CLUSTER_NODE_IP=$ML110_IP /tmp/cluster-setup.sh" \
"Cluster join"
fi
# Verify
execute_remote "$R630_IP" "pvecm status && pvecm nodes" "Cluster verification"
}
# Step 3: Verify cluster
verify_cluster() {
log_step "Verifying Cluster Health"
log_info "Checking cluster status on ML110..."
execute_remote "$ML110_IP" "pvecm status && pvecm nodes && pvecm expected" "Cluster status check"
log_info "Checking cluster status on R630..."
execute_remote "$R630_IP" "pvecm status && pvecm nodes && pvecm expected" "Cluster status check"
}
# Step 4: Configure NFS storage on ML110
configure_nfs_ml110() {
log_step "Configuring NFS Storage on ML110"
# Copy NFS storage script
copy_file_remote "$ML110_IP" "$PROJECT_ROOT/infrastructure/proxmox/nfs-storage.sh" "/tmp/nfs-storage.sh"
# Execute NFS storage setup
execute_remote "$ML110_IP" \
"chmod +x /tmp/nfs-storage.sh && NFS_SERVER=$NFS_SERVER NFS_PATH=$NFS_PATH STORAGE_NAME=$STORAGE_NAME CONTENT_TYPES=images,iso,vztmpl,backup /tmp/nfs-storage.sh" \
"NFS storage configuration"
# Verify
execute_remote "$ML110_IP" "pvesm status" "Storage verification"
}
# Step 5: Configure NFS storage on R630
configure_nfs_r630() {
log_step "Configuring NFS Storage on R630"
# Copy NFS storage script
copy_file_remote "$R630_IP" "$PROJECT_ROOT/infrastructure/proxmox/nfs-storage.sh" "/tmp/nfs-storage.sh"
# Execute NFS storage setup
execute_remote "$R630_IP" \
"chmod +x /tmp/nfs-storage.sh && NFS_SERVER=$NFS_SERVER NFS_PATH=$NFS_PATH STORAGE_NAME=$STORAGE_NAME CONTENT_TYPES=images,iso,vztmpl,backup /tmp/nfs-storage.sh" \
"NFS storage configuration"
# Verify
execute_remote "$R630_IP" "pvesm status" "Storage verification"
}
# Step 6: Verify shared storage
verify_storage() {
log_step "Verifying Shared Storage"
log_info "Checking storage on ML110..."
execute_remote "$ML110_IP" "pvesm status && pvesm list" "Storage check"
log_info "Checking storage on R630..."
execute_remote "$R630_IP" "pvesm status && pvesm list" "Storage check"
}
# Step 7: Configure VLAN bridges on ML110
configure_vlans_ml110() {
log_step "Configuring VLAN Bridges on ML110"
# Check if script exists
if [ -f "$PROJECT_ROOT/infrastructure/network/configure-proxmox-vlans.sh" ]; then
copy_file_remote "$ML110_IP" "$PROJECT_ROOT/infrastructure/network/configure-proxmox-vlans.sh" "/tmp/configure-vlans.sh"
execute_remote "$ML110_IP" "chmod +x /tmp/configure-vlans.sh && /tmp/configure-vlans.sh" "VLAN configuration"
else
log_warn "VLAN configuration script not found, skipping"
fi
# Verify
execute_remote "$ML110_IP" "ip addr show | grep -E 'vmbr|vlan'" "Network verification"
}
# Step 8: Configure VLAN bridges on R630
configure_vlans_r630() {
log_step "Configuring VLAN Bridges on R630"
# Check if script exists
if [ -f "$PROJECT_ROOT/infrastructure/network/configure-proxmox-vlans.sh" ]; then
copy_file_remote "$R630_IP" "$PROJECT_ROOT/infrastructure/network/configure-proxmox-vlans.sh" "/tmp/configure-vlans.sh"
execute_remote "$R630_IP" "chmod +x /tmp/configure-vlans.sh && /tmp/configure-vlans.sh" "VLAN configuration"
else
log_warn "VLAN configuration script not found, skipping"
fi
# Verify
execute_remote "$R630_IP" "ip addr show | grep -E 'vmbr|vlan'" "Network verification"
}
# Main execution
main() {
log_info "Starting Proxmox deployment automation..."
log_info "This script will execute all automated tasks"
log_warn "Note: Some tasks (OS installation, manual configuration) require manual intervention"
# Check SSH access
log_info "Testing SSH access..."
if ! ssh -o StrictHostKeyChecking=no -o ConnectTimeout=5 "root@$ML110_IP" "echo 'ML110 accessible'" &>/dev/null; then
log_error "Cannot SSH to ML110 ($ML110_IP). Please ensure SSH access is configured."
exit 1
fi
if ! ssh -o StrictHostKeyChecking=no -o ConnectTimeout=5 "root@$R630_IP" "echo 'R630 accessible'" &>/dev/null; then
log_error "Cannot SSH to R630 ($R630_IP). Please ensure SSH access is configured."
exit 1
fi
log_info "SSH access confirmed"
# Execute tasks
create_cluster_ml110
join_cluster_r630
verify_cluster
configure_nfs_ml110
configure_nfs_r630
verify_storage
configure_vlans_ml110
configure_vlans_r630
log_info "Automated tasks completed!"
log_warn "Remaining manual tasks:"
log_warn " - VM template verification/creation"
log_warn " - VM deployment"
log_warn " - OS installation on VMs (requires console access)"
log_warn " - Service configuration"
}
main "$@"
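`execute_remote` wraps every ssh call in uniform logging and status handling. The same pattern, simulated locally with `bash -c` standing in for ssh (the `run_step` name is ours, not from the script):

```shell
#!/bin/bash
# Local stand-in for execute_remote: run a command, log the outcome, propagate failure
run_step() {
    local command=$1 description=$2
    echo "[INFO] $description"
    if bash -c "$command"; then
        echo "[INFO] $description completed"
    else
        echo "[ERROR] $description failed"
        return 1
    fi
}

run_step "true" "Cluster status check"
if ! run_step "false" "Broken step"; then
    echo "continuing with remaining steps"
fi
```

Because the wrapper returns the command's status instead of exiting, the caller decides per step whether a failure aborts the run (as `set -e` does in the real script) or is merely logged.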


@@ -0,0 +1,160 @@
#!/bin/bash
source ~/.bashrc
# Fix VM Disk Sizes
# Expands disk sizes for VMs cloned from template
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'
PVE_USERNAME="${PVE_USERNAME:-root@pam}"
PVE_PASSWORD="${PVE_ROOT_PASS:-}"
PROXMOX_URL="${PROXMOX_ML110_URL:-https://192.168.1.206:8006}"
PROXMOX_NODE="${PROXMOX_NODE:-pve}"
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
get_api_token() {
local response=$(curl -s -k --connect-timeout 10 --max-time 15 \
-d "username=$PVE_USERNAME&password=$PVE_PASSWORD" \
"$PROXMOX_URL/api2/json/access/ticket" 2>&1)
if echo "$response" | grep -q '"data"'; then
local ticket=$(echo "$response" | grep -o '"ticket":"[^"]*' | cut -d'"' -f4)
local csrf_token=$(echo "$response" | grep -o '"CSRFPreventionToken":"[^"]*' | cut -d'"' -f4)
echo "$ticket|$csrf_token"
else
echo ""
fi
}
resize_disk() {
local vmid=$1
local size=$2
local name=$3
log_info "Resizing disk for VM $vmid ($name) to $size..."
local tokens=$(get_api_token)
if [ -z "$tokens" ]; then
log_error "Failed to authenticate with Proxmox"
return 1
fi
local ticket=$(echo "$tokens" | cut -d'|' -f1)
local csrf_token=$(echo "$tokens" | cut -d'|' -f2)
# Get current disk configuration
local current_config=$(curl -s -k -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/config")
local current_disk=$(echo "$current_config" | python3 -c "import sys, json; d=json.load(sys.stdin).get('data', {}); print(d.get('scsi0', ''))" 2>/dev/null)
if [ -z "$current_disk" ]; then
log_error "Could not get current disk configuration"
return 1
fi
# Extract storage and disk name
local storage=$(echo "$current_disk" | grep -o 'local-lvm:[^,]*' | cut -d':' -f2 | cut -d'-' -f1-2)
local disk_name=$(echo "$current_disk" | grep -o 'vm-[0-9]*-disk-[0-9]*')
# Stop VM if running
local status=$(curl -s -k -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/status/current" | \
python3 -c "import sys, json; print(json.load(sys.stdin).get('data', {}).get('status', 'unknown'))" 2>/dev/null)
if [ "$status" = "running" ]; then
log_info "Stopping VM $vmid..."
curl -s -k -X POST -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/status/stop" > /dev/null
sleep 5
fi
# Resize disk using resize endpoint
log_info "Resizing disk to $size..."
local resize_response=$(curl -s -k -X PUT \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
-d "disk=scsi0" \
-d "size=$size" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/resize" 2>&1)
if echo "$resize_response" | grep -q '"data"'; then
log_info "Disk resized successfully"
else
log_warn "Disk resize response: $resize_response"
# Try alternative method - update config directly
log_info "Trying alternative method..."
curl -s -k -X PUT \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
-d "scsi0=local-lvm:$disk_name,iothread=1,size=$size" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/config" > /dev/null 2>&1
fi
# Start VM if it was running
if [ "$status" = "running" ]; then
log_info "Starting VM $vmid..."
curl -s -k -X POST -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/status/start" > /dev/null
fi
return 0
}
main() {
log_info "Fixing VM disk sizes"
if [ -z "$PVE_PASSWORD" ]; then
log_error "PVE_ROOT_PASS not set in .env"
exit 1
fi
# VM definitions: vmid name size
local vms=(
"100 cloudflare-tunnel 40G"
"101 k3s-master 80G"
"102 git-server 100G"
"103 observability 200G"
)
for vm_spec in "${vms[@]}"; do
read -r vmid name size <<< "$vm_spec"
resize_disk "$vmid" "$size" "$name"
sleep 2
done
log_info "Disk size fixes completed!"
}
main "$@"
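The resize loop passes size strings like `40G` straight to the API; a small guard can reject malformed specs before any API call is made (the `valid_size` helper is ours, not part of the script):

```shell
#!/bin/bash
# Accept only sizes of the form <digits>G, the format the resize endpoint is given here
valid_size() { [[ $1 =~ ^[0-9]+G$ ]]; }

for spec in "100 cloudflare-tunnel 40G" "101 k3s-master eighty"; do
    read -r vmid name size <<< "$spec"
    if valid_size "$size"; then
        echo "$vmid: ok ($size)"
    else
        echo "$vmid: invalid size '$size'"
    fi
done
```

This prints `100: ok (40G)` and `101: invalid size 'eighty'`, catching a typo in the VM table before the script stops a running VM for nothing.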


@@ -0,0 +1,337 @@
#!/bin/bash
source ~/.bashrc
# Recreate VMs with Smaller Disk Sizes
# Stops, deletes, and recreates VMs with optimized disk sizes
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_step() {
echo -e "\n${BLUE}=== $1 ===${NC}"
}
# Proxmox configuration
PROXMOX_URL="${PROXMOX_ML110_URL:-https://192.168.1.206:8006}"
PVE_USERNAME="${PVE_USERNAME:-root@pam}"
PVE_PASSWORD="${PVE_ROOT_PASS:-}"
PROXMOX_NODE="${PROXMOX_NODE:-pve}"
TEMPLATE_VMID="${TEMPLATE_VMID:-9000}"
get_api_token() {
local response=$(curl -s -k --connect-timeout 10 --max-time 15 \
-d "username=$PVE_USERNAME&password=$PVE_PASSWORD" \
"$PROXMOX_URL/api2/json/access/ticket" 2>&1)
if echo "$response" | grep -q '"data"'; then
local ticket=$(echo "$response" | grep -o '"ticket":"[^"]*' | cut -d'"' -f4)
local csrf_token=$(echo "$response" | grep -o '"CSRFPreventionToken":"[^"]*' | cut -d'"' -f4)
echo "$ticket|$csrf_token"
else
echo ""
fi
}
stop_and_delete_vm() {
local vmid=$1
local name=$2
log_info "Stopping VM $vmid ($name)..."
local tokens=$(get_api_token)
local ticket=$(echo "$tokens" | cut -d'|' -f1)
local csrf_token=$(echo "$tokens" | cut -d'|' -f2)
# Check if VM exists
local vm_status=$(curl -s -k -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/status/current" 2>&1)
if ! echo "$vm_status" | grep -q '"data"'; then
log_warn "VM $vmid does not exist, skipping"
return 0
fi
# Stop VM if running
local status=$(echo "$vm_status" | python3 -c "import sys, json; print(json.load(sys.stdin).get('data', {}).get('status', 'unknown'))" 2>/dev/null)
if [ "$status" = "running" ]; then
log_info "Stopping VM $vmid..."
curl -s -k -X POST -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/status/stop" > /dev/null
# Wait for VM to stop
local wait_count=0
while [ $wait_count -lt 30 ]; do
sleep 2
local current_status=$(curl -s -k -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/status/current" | \
python3 -c "import sys, json; print(json.load(sys.stdin).get('data', {}).get('status', 'unknown'))" 2>/dev/null)
if [ "$current_status" = "stopped" ]; then
break
fi
wait_count=$((wait_count + 1))
done
if [ $wait_count -ge 30 ]; then
log_error "VM $vmid did not stop in time"
return 1
fi
fi
# Delete VM
log_info "Deleting VM $vmid..."
local delete_response=$(curl -s -k -X DELETE \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid" 2>&1)
if echo "$delete_response" | grep -q '"data"'; then
log_info "VM $vmid deleted successfully"
return 0
else
log_error "Failed to delete VM $vmid: $delete_response"
return 1
fi
}
create_vm_with_smaller_disk() {
local vmid=$1
local name=$2
local ip=$3
local gateway=$4
local cores=$5
local memory=$6
local disk=$7
local bridge=$8
log_info "Creating VM $vmid ($name) with ${disk} disk..."
local tokens=$(get_api_token)
local ticket=$(echo "$tokens" | cut -d'|' -f1)
local csrf_token=$(echo "$tokens" | cut -d'|' -f2)
if [ -z "$ticket" ] || [ -z "$csrf_token" ]; then
log_error "Failed to get API tokens"
return 1
fi
# Clone VM from template
log_info "Cloning from template $TEMPLATE_VMID..."
local clone_response=$(curl -s -k -X POST \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
-d "newid=$vmid" \
-d "name=$name" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$TEMPLATE_VMID/clone" 2>&1)
if ! echo "$clone_response" | grep -q '"data"'; then
log_error "Failed to clone VM: $clone_response"
return 1
fi
log_info "VM cloned successfully, waiting for clone to complete..."
sleep 5
# Wait for clone to finish
local wait_count=0
while [ $wait_count -lt 30 ]; do
local vm_check=$(curl -s -k -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/config" 2>&1)
if echo "$vm_check" | grep -q '"data"'; then
break
fi
sleep 2
wait_count=$((wait_count + 1))
done
# Configure VM with smaller disk
log_info "Configuring VM $vmid (CPU: $cores, RAM: ${memory}MB, Disk: $disk)..."
# Stop VM if it started automatically
curl -s -k -X POST -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/status/stop" > /dev/null 2>&1
sleep 3
# Get current disk configuration
local current_config=$(curl -s -k -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/config")
local current_disk=$(echo "$current_config" | python3 -c "import sys, json; d=json.load(sys.stdin).get('data', {}); print(d.get('scsi0', ''))" 2>/dev/null)
# Extract storage pool from current disk or use default
local storage_pool="local-lvm"
if echo "$current_disk" | grep -q ':'; then
storage_pool=$(echo "$current_disk" | cut -d':' -f1)
fi
# Delete old disk and create new smaller one
log_info "Removing old disk and creating new ${disk} disk..."
# Remove old disk
curl -s -k -X PUT -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
-d "scsi0=" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/config" > /dev/null 2>&1
sleep 2
# Allocate a fresh disk of the requested size.
# The qemu config endpoint allocates a new volume when given "storage:sizeGB" (e.g. scsi0=local-lvm:20)
log_info "Creating new ${disk} disk..."
local disk_create=$(curl -s -k -X PUT \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
-d "scsi0=$storage_pool:${disk%G},iothread=1" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/config" 2>&1)
if ! echo "$disk_create" | grep -q '"data"'; then
log_warn "Disk allocation may have failed: $disk_create"
fi
# Configure CPU and memory
curl -s -k -X PUT -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
-d "cores=$cores" \
-d "memory=$memory" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/config" > /dev/null
# Configure network
curl -s -k -X PUT -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
-d "net0=virtio,bridge=$bridge,firewall=1" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/config" > /dev/null
# Configure QEMU Guest Agent
curl -s -k -X PUT -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
-d "agent=1" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/config" > /dev/null
# Configure cloud-init
log_info "Configuring cloud-init (user: ubuntu, IP: $ip/24)..."
curl -s -k -X PUT -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
--data-urlencode "ipconfig0=ip=$ip/24,gw=$gateway" \
-d "ciuser=ubuntu" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/config" > /dev/null
log_info "VM $vmid configured successfully with ${disk} disk"
return 0
}
main() {
log_step "Recreating VMs with Smaller Disk Sizes"
if [ -z "$PVE_PASSWORD" ]; then
log_error "PVE_ROOT_PASS not set in .env"
exit 1
fi
log_warn "This will DELETE and RECREATE all VMs with smaller disks!"
log_warn "All data on these VMs will be lost!"
# Check for --yes flag to skip confirmation
if [ "$1" != "--yes" ] && [ "$1" != "-y" ]; then
echo ""
read -p "Are you sure you want to continue? (yes/no): " confirm
if [ "$confirm" != "yes" ]; then
log_info "Cancelled by user"
exit 0
fi
else
log_info "Auto-confirmed (--yes flag provided)"
fi
# VM definitions with smaller disk sizes
# Format: vmid name ip gateway cores memory_mb disk_size bridge
local vms=(
"100 cloudflare-tunnel 192.168.1.60 192.168.1.254 2 4096 20G vmbr0"
"101 k3s-master 192.168.1.188 192.168.1.254 4 8192 40G vmbr0"
"102 git-server 192.168.1.121 192.168.1.254 4 8192 50G vmbr0"
"103 observability 192.168.1.82 192.168.1.254 4 8192 100G vmbr0"
)
# Step 1: Stop and delete existing VMs
log_step "Step 1: Stopping and Deleting Existing VMs"
for vm_spec in "${vms[@]}"; do
read -r vmid name ip gateway cores memory disk bridge <<< "$vm_spec"
stop_and_delete_vm "$vmid" "$name"
sleep 2
done
# Step 2: Recreate VMs with smaller disks
log_step "Step 2: Creating VMs with Smaller Disks"
for vm_spec in "${vms[@]}"; do
read -r vmid name ip gateway cores memory disk bridge <<< "$vm_spec"
create_vm_with_smaller_disk "$vmid" "$name" "$ip" "$gateway" "$cores" "$memory" "$disk" "$bridge"
sleep 3
done
# Step 3: Start all VMs
log_step "Step 3: Starting All VMs"
local tokens=$(get_api_token)
local ticket=$(echo "$tokens" | cut -d'|' -f1)
local csrf_token=$(echo "$tokens" | cut -d'|' -f2)
for vm_spec in "${vms[@]}"; do
read -r vmid name ip gateway cores memory disk bridge <<< "$vm_spec"
log_info "Starting VM $vmid ($name)..."
curl -s -k -X POST -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/status/start" > /dev/null
sleep 2
done
log_step "Recreation Complete!"
log_info "All VMs recreated with smaller disk sizes:"
log_info " VM 100: 20G (was 40G)"
log_info " VM 101: 40G (was 80G)"
log_info " VM 102: 50G (was 100G)"
log_info " VM 103: 100G (was 200G)"
log_info "Total saved: 210GB"
}
main "$@"
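The grep/sed pipeline in `get_api_token` works but is brittle against formatting changes in the API response. If `jq` is available on the workstation (an assumption — it is not required by these scripts), the same extraction can be sketched against a sample payload:

```shell
# Sample of the JSON shape returned by /api2/json/access/ticket (values shortened).
response='{"data":{"ticket":"PVE:root@pam:ABC123","CSRFPreventionToken":"4E0F:TOK"}}'
# jq emits empty output (not an error) when a field is missing
ticket=$(echo "$response" | jq -r '.data.ticket // empty')
csrf=$(echo "$response" | jq -r '.data.CSRFPreventionToken // empty')
echo "$ticket|$csrf"   # -> PVE:root@pam:ABC123|4E0F:TOK
```

The `// empty` fallback preserves the existing contract of returning an empty string on failure.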

#!/bin/bash
source ~/.bashrc
# Run and Complete All Next Steps
# Comprehensive script to complete all remaining deployment tasks
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() { echo -e "${GREEN}[INFO]${NC} $1"; }
log_warn() { echo -e "${YELLOW}[WARN]${NC} $1"; }
log_error() { echo -e "${RED}[ERROR]${NC} $1"; }
log_step() { echo -e "\n${BLUE}=== $1 ===${NC}"; }
PROXMOX_HOST="${PROXMOX_ML110_IP:-192.168.1.206}"
SSH_KEY="${SSH_KEY:-$HOME/.ssh/id_ed25519_proxmox}"
SSH_OPTS="-i $SSH_KEY -o StrictHostKeyChecking=no"
VM_USER="${VM_USER:-ubuntu}"
# VM definitions: vmid name cores memory disk_size
VMS=(
"100 cloudflare-tunnel 2 2048 20"
"101 k3s-master 4 4096 40"
"102 git-server 2 2048 30"
)
TEMPLATE_VMID=9000
# Helper functions will be sourced on Proxmox host via SSH
# We don't source locally since qm command is not available
# Step 1: Create missing VMs from improved template
create_missing_vms() {
log_step "Step 1: Creating Missing VMs from Template 9000"
local tokens=$(get_api_token)
if [ -z "$tokens" ]; then
log_error "Failed to authenticate with Proxmox"
return 1
fi
local ticket=$(echo "$tokens" | cut -d'|' -f1)
local csrf_token=$(echo "$tokens" | cut -d'|' -f2)
local PROXMOX_URL="${PROXMOX_ML110_URL:-https://192.168.1.206:8006}"
local PROXMOX_NODE="${PROXMOX_NODE:-pve}"
# Read SSH key
local ssh_key_file="$SSH_KEY.pub"
if [ ! -f "$ssh_key_file" ]; then
log_error "SSH key file not found: $ssh_key_file"
return 1
fi
local ssh_key_content=$(cat "$ssh_key_file")
for vm_spec in "${VMS[@]}"; do
read -r vmid name cores memory disk_size <<< "$vm_spec"
# Check if VM already exists
if ssh $SSH_OPTS "root@$PROXMOX_HOST" "qm config $vmid &>/dev/null"; then
log_info "VM $vmid ($name) already exists, skipping"
continue
fi
log_info "Creating VM $vmid: $name (cores=$cores, memory=${memory}MB, disk=${disk_size}G)"
# Clone from template
local clone_response=$(curl -s -k -X POST \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
-d "newid=$vmid" \
-d "name=$name" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$TEMPLATE_VMID/clone" 2>&1)
if ! echo "$clone_response" | grep -q '"data"'; then
log_error "Failed to clone VM: $clone_response"
continue
fi
log_info "Waiting for clone to complete..."
sleep 10
# Configure VM resources
log_info "Configuring VM resources..."
curl -s -k -X POST \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
-d "cores=$cores" \
-d "memory=$memory" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/config" > /dev/null
# Resize disk if needed
if [ "$disk_size" != "32" ]; then
log_info "Resizing disk to ${disk_size}G..."
ssh $SSH_OPTS "root@$PROXMOX_HOST" "qm disk resize $vmid scsi0 ${disk_size}G" 2>/dev/null || true
fi
# Configure cloud-init with SSH keys and DHCP
log_info "Configuring cloud-init with SSH keys..."
curl -s -k -X POST \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
--data-urlencode "ipconfig0=ip=dhcp" \
--data-urlencode "ciuser=ubuntu" \
--data-urlencode "sshkeys=${ssh_key_content}" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/config" > /dev/null
# Start VM
log_info "Starting VM $vmid..."
curl -s -k -X POST \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/status/start" > /dev/null
log_info "✓ VM $vmid created and started"
done
log_info "Waiting 60 seconds for VMs to boot..."
sleep 60
}
get_api_token() {
local PROXMOX_URL="${PROXMOX_ML110_URL:-https://192.168.1.206:8006}"
local PVE_USERNAME="${PVE_USERNAME:-root@pam}"
local PVE_PASSWORD="${PVE_ROOT_PASS:-}"
local response=$(curl -s -k --connect-timeout 10 --max-time 15 \
-d "username=$PVE_USERNAME&password=$PVE_PASSWORD" \
"$PROXMOX_URL/api2/json/access/ticket" 2>&1)
if echo "$response" | grep -q '"data"'; then
local ticket=$(echo "$response" | grep -o '"ticket":"[^"]*' | cut -d'"' -f4)
local csrf_token=$(echo "$response" | grep -o '"CSRFPreventionToken":"[^"]*' | cut -d'"' -f4)
echo "$ticket|$csrf_token"
else
echo ""
fi
}
# Step 2: Verify SSH and QGA for all VMs
verify_vms() {
log_step "Step 2: Verifying VMs (SSH and QGA)"
local all_vms=("100 cloudflare-tunnel" "101 k3s-master" "102 git-server" "103 observability")
local all_ok=true
for vm_spec in "${all_vms[@]}"; do
read -r vmid name <<< "$vm_spec"
log_info "Checking VM $vmid ($name)..."
# Get IP via guest agent (running on Proxmox host)
local ip
ip=$(ssh $SSH_OPTS "root@$PROXMOX_HOST" \
"source /home/intlc/projects/loc_az_hci/scripts/lib/proxmox_vm_helpers.sh 2>/dev/null && \
get_vm_ip_from_guest_agent $vmid 2>/dev/null || echo ''" 2>/dev/null || echo "")
if [[ -z "$ip" ]]; then
log_warn " VM $vmid: Could not get IP (may still be booting)"
all_ok=false
continue
fi
log_info " IP: $ip"
# Test SSH
if ssh $SSH_OPTS -o ConnectTimeout=5 "${VM_USER}@${ip}" "echo 'SSH OK'" &>/dev/null; then
log_info " ✓ SSH working"
# Check QGA
if ssh $SSH_OPTS "${VM_USER}@${ip}" "systemctl is-active qemu-guest-agent &>/dev/null && echo 'active' || echo 'inactive'" | grep -q "active"; then
log_info " ✓ QEMU Guest Agent active"
else
log_warn " ⚠ QEMU Guest Agent not active (should be pre-installed from template)"
fi
else
log_warn " ✗ SSH not working yet"
all_ok=false
fi
done
if [ "$all_ok" = false ]; then
log_warn "Some VMs may need more time to boot. Continuing anyway..."
fi
}
# Step 3: Deploy Gitea on VM 102
deploy_gitea() {
log_step "Step 3: Deploying Gitea on VM 102"
if [ -f "$PROJECT_ROOT/scripts/deploy/deploy-gitea.sh" ]; then
"$PROJECT_ROOT/scripts/deploy/deploy-gitea.sh"
else
log_warn "Gitea deployment script not found, skipping"
fi
}
# Step 4: Deploy Observability on VM 103
deploy_observability() {
log_step "Step 4: Deploying Observability Stack on VM 103"
if [ -f "$PROJECT_ROOT/scripts/deploy/deploy-observability.sh" ]; then
"$PROJECT_ROOT/scripts/deploy/deploy-observability.sh"
else
log_warn "Observability deployment script not found, skipping"
fi
}
# Step 5: Final Status Report
final_status() {
log_step "Final Status Report"
log_info "VM Status:"
ssh $SSH_OPTS "root@$PROXMOX_HOST" "qm list | grep -E '(100|101|102|103)'"
echo ""
log_info "VM IPs (via Guest Agent):"
local all_vms=("100 cloudflare-tunnel" "101 k3s-master" "102 git-server" "103 observability")
for vm_spec in "${all_vms[@]}"; do
read -r vmid name <<< "$vm_spec"
local ip
ip=$(ssh $SSH_OPTS "root@$PROXMOX_HOST" \
"source /home/intlc/projects/loc_az_hci/scripts/lib/proxmox_vm_helpers.sh 2>/dev/null && \
get_vm_ip_from_guest_agent $vmid 2>/dev/null || echo 'N/A'")
log_info " VM $vmid ($name): $ip"
done
echo ""
log_info "Service URLs:"
log_info " Gitea: http://<VM-102-IP>:3000"
log_info " Prometheus: http://<VM-103-IP>:9090"
log_info " Grafana: http://<VM-103-IP>:3000 (admin/admin)"
echo ""
log_info "✓ All next steps completed!"
}
main() {
log_step "Running All Next Steps"
create_missing_vms
verify_vms
deploy_gitea
deploy_observability
final_status
}
main "$@"
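The helper's IP discovery boils down to filtering `network-get-interfaces` output for the first non-loopback IPv4 address. With `jq` (an assumption; the helper library may parse differently) that filter can be sketched against a sample payload whose field names follow the QEMU guest-agent schema:

```shell
# Abridged sample of: qm guest cmd <vmid> network-get-interfaces
json='[{"name":"lo","ip-addresses":[{"ip-address":"127.0.0.1","ip-address-type":"ipv4"}]},{"name":"eth0","ip-addresses":[{"ip-address":"192.168.1.188","ip-address-type":"ipv4"}]}]'
# First IPv4 address on a non-loopback interface
ip=$(echo "$json" | jq -r '[.[] | select(.name != "lo") | ."ip-addresses"[]? | select(."ip-address-type" == "ipv4") | ."ip-address"][0] // empty')
echo "$ip"   # -> 192.168.1.188
```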

#!/bin/bash
source ~/.bashrc
# Verify Cloud-init Installation on VMs
# Checks if cloud-init is installed and working
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# SSH user
SSH_USER="${SSH_USER:-ubuntu}"
SSH_KEY="${SSH_KEY:-$HOME/.ssh/id_ed25519_proxmox}"
PROXMOX_HOST="${PROXMOX_ML110_IP:-192.168.1.206}"
# Import helper library
if [ -f "$PROJECT_ROOT/scripts/lib/proxmox_vm_helpers.sh" ]; then
source "$PROJECT_ROOT/scripts/lib/proxmox_vm_helpers.sh"
else
echo "[ERROR] Helper library not found. Run this script on Proxmox host or via SSH." >&2
exit 1
fi
# VMID NAME (no IP - discovered via guest agent)
VMS=(
"100 cloudflare-tunnel"
"101 k3s-master"
"102 git-server"
"103 observability"
)
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
check_cloud_init() {
local vmid=$1
local name=$2
log_info "Checking cloud-init on $name (VM $vmid)..."
# Ensure guest agent is enabled
ensure_guest_agent_enabled "$vmid" || true
# Get IP from guest agent
local ip
ip="$(get_vm_ip_or_warn "$vmid" "$name" || true)"
if [[ -z "$ip" ]]; then
log_warn "$name (VM $vmid) - cannot get IP from guest agent"
return 1
fi
log_info " Discovered IP: $ip"
# Test connectivity
if ! ping -c 1 -W 2 "$ip" &> /dev/null; then
log_warn "$name ($ip) is not reachable - may still be booting"
return 1
fi
# Try SSH connection
if ssh -i "$SSH_KEY" -o StrictHostKeyChecking=accept-new -o ConnectTimeout=5 "$SSH_USER@$ip" "echo 'Connected'" &>/dev/null; then
log_info " SSH connection successful"
# Check cloud-init
local cloud_init_status=$(ssh -i "$SSH_KEY" -o StrictHostKeyChecking=accept-new "$SSH_USER@$ip" \
"systemctl is-active cloud-init 2>/dev/null || echo 'not-installed'" 2>/dev/null)
if [ "$cloud_init_status" = "active" ] || [ "$cloud_init_status" = "inactive" ]; then
log_info " ✓ Cloud-init is installed"
ssh -i "$SSH_KEY" -o StrictHostKeyChecking=accept-new "$SSH_USER@$ip" \
"cloud-init status 2>/dev/null || echo 'Status unknown'" 2>/dev/null
return 0
else
log_warn " Cloud-init may not be installed"
return 1
fi
else
log_warn " Cannot SSH to $name ($ip) - may need password or key"
log_info " To verify manually: ssh -i $SSH_KEY $SSH_USER@$ip"
return 1
fi
}
main() {
log_info "Verifying cloud-init installation on VMs"
log_warn "This requires SSH access to VMs"
log_info "Using guest-agent IP discovery"
echo ""
local all_ok=true
for vm_spec in "${VMS[@]}"; do
read -r vmid name <<< "$vm_spec"
if ! check_cloud_init "$vmid" "$name"; then
all_ok=false
fi
echo ""
done
if [ "$all_ok" = true ]; then
log_info "All VMs have cloud-init installed!"
else
log_warn "Some VMs may not have cloud-init or are not accessible"
log_info "If cloud-init is not installed, install it:"
log_info " sudo apt update && sudo apt install cloud-init"
fi
}
main "$@"
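`check_cloud_init` keys off `systemctl is-active`; the output of `cloud-init status` itself can be classified the same way. A hypothetical classifier (`classify` is not part of the helper library, and the sample strings mirror cloud-init's documented output format):

```shell
# Map raw `cloud-init status` lines to a coarse state (sample strings, no VM needed).
classify() {
    case "$1" in
        "status: done")    echo "ok" ;;
        "status: running") echo "booting" ;;
        "status: error")   echo "failed" ;;
        *)                 echo "unknown" ;;
    esac
}
classify "status: done"      # -> ok
classify "status: running"   # -> booting
```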

#!/bin/bash
source ~/.bashrc
# Generate Documentation Index
# Auto-generates docs/INDEX.md from directory structure
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
DOCS_DIR="$PROJECT_ROOT/docs"
INDEX_FILE="$DOCS_DIR/INDEX.md"
# Colors
GREEN='\033[0;32m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_step() {
echo -e "${BLUE}[STEP]${NC} $1"
}
generate_index() {
    log_info "Generating documentation index..."
    cat > "$INDEX_FILE" << 'EOF'
# Documentation Index
This is the master index for all project documentation. Documentation is organized by purpose to make it easy to find what you need.
**Note**: This index is auto-generated. Run `./scripts/docs/generate-docs-index.sh` to regenerate.
EOF
    # Emit one index section per docs subdirectory.
    # $1 = section title, $2 = subdirectory, $3 = optional extra bullet appended after the files
    add_section() {
        local section_title="$1" dir="$2" extra="${3:-}"
        [ -d "$DOCS_DIR/$dir" ] || return 0
        echo "## $section_title" >> "$INDEX_FILE"
        echo "" >> "$INDEX_FILE"
        local file title filename
        for file in "$DOCS_DIR/$dir"/*.md; do
            if [ -f "$file" ]; then
                title=$(basename "$file" .md | sed 's/-/ /g' | awk '{for(i=1;i<=NF;i++)sub(/./,toupper(substr($i,1,1)),$i)}1')
                filename=$(basename "$file")
                echo "- [$title]($dir/$filename)" >> "$INDEX_FILE"
            fi
        done
        if [ -n "$extra" ]; then
            echo "$extra" >> "$INDEX_FILE"
        fi
        echo "" >> "$INDEX_FILE"
    }
    add_section "Getting Started" "getting-started"
    add_section "Architecture" "architecture"
    add_section "Deployment" "deployment"
    if [ -d "$DOCS_DIR/operations/runbooks" ]; then
        add_section "Operations" "operations" "- [Runbooks](operations/runbooks/)"
    else
        add_section "Operations" "operations"
    fi
    add_section "Troubleshooting" "troubleshooting"
    add_section "Security" "security"
    add_section "Reference" "reference"
    log_info "Documentation index generated: $INDEX_FILE"
}
main() {
log_step "Generating documentation index..."
generate_index
log_info "Done!"
}
main "$@"
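The sed/awk pipeline that derives link titles can be exercised in isolation to confirm it title-cases hyphenated filenames:

```shell
# "quick-start" (from quick-start.md) -> "Quick Start"
title=$(echo "quick-start" | sed 's/-/ /g' | awk '{for(i=1;i<=NF;i++)sub(/./,toupper(substr($i,1,1)),$i)}1')
echo "$title"   # -> Quick Start
```

The awk `sub()` replaces the first character of each word with its uppercase form, leaving the rest untouched.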

scripts/docs/update-diagrams.sh Executable file
#!/bin/bash
source ~/.bashrc
# Update Diagrams
# Regenerates diagrams from source files (if using Mermaid or similar)
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
DIAGRAMS_DIR="$PROJECT_ROOT/diagrams"
DOCS_DIR="$PROJECT_ROOT/docs"
# Colors
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
check_diagrams() {
log_info "Checking for diagram source files..."
if [ ! -d "$DIAGRAMS_DIR" ]; then
log_warn "Diagrams directory not found: $DIAGRAMS_DIR"
return 0
fi
local diagram_count=0
while IFS= read -r -d '' file; do
diagram_count=$((diagram_count + 1))
log_info "Found diagram: $(basename "$file")"
done < <(find "$DIAGRAMS_DIR" \( -name "*.mmd" -o -name "*.mermaid" \) -type f -print0 2>/dev/null)
if [ $diagram_count -eq 0 ]; then
log_warn "No diagram source files found"
else
log_info "Found $diagram_count diagram source file(s)"
log_info "To render diagrams, use Mermaid CLI or online editor"
log_info "Mermaid CLI: npm install -g @mermaid-js/mermaid-cli"
log_info "Then run: mmdc -i diagram.mmd -o diagram.png"
fi
}
main() {
log_info "Updating diagrams..."
check_diagrams
log_info "Done!"
}
main "$@"

scripts/docs/validate-docs.sh Executable file
#!/bin/bash
source ~/.bashrc
# Validate Documentation
# Checks for broken links, outdated content, and documentation issues
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
DOCS_DIR="$PROJECT_ROOT/docs"
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
check_markdown_links() {
log_info "Checking markdown links..."
local errors=0
while IFS= read -r -d '' file; do
# Extract links from markdown files
while IFS= read -r line; do
if [[ $line =~ \[([^\]]+)\]\(([^)]+)\) ]]; then
link="${BASH_REMATCH[2]}"
# Skip external links
if [[ ! $link =~ ^https?:// ]]; then
# Remove anchor
link_file="${link%%#*}"
if [ -n "$link_file" ] && [ ! -f "$DOCS_DIR/$link_file" ] && [ ! -f "$(dirname "$file")/$link_file" ]; then
log_error "Broken link in $(basename "$file"): $link"
errors=$((errors + 1))
fi
fi
fi
done < "$file"
done < <(find "$DOCS_DIR" -name "*.md" -type f -print0)
if [ $errors -eq 0 ]; then
log_info "All markdown links are valid"
else
log_error "Found $errors broken link(s)"
return 1
fi
}
check_missing_files() {
log_info "Checking for missing documentation files..."
local missing=0
# Check for expected files
expected_files=(
"getting-started/quick-start.md"
"getting-started/prerequisites.md"
"getting-started/installation.md"
"architecture/overview.md"
"deployment/deployment-guide.md"
)
for file in "${expected_files[@]}"; do
if [ ! -f "$DOCS_DIR/$file" ]; then
log_warn "Missing expected file: $file"
missing=$((missing + 1))
fi
done
if [ $missing -eq 0 ]; then
log_info "All expected documentation files exist"
else
log_warn "Found $missing missing file(s)"
fi
}
check_index() {
log_info "Checking documentation index..."
if [ ! -f "$DOCS_DIR/INDEX.md" ]; then
log_error "Documentation index (INDEX.md) not found"
log_info "Run ./scripts/docs/generate-docs-index.sh to generate it"
return 1
else
log_info "Documentation index exists"
fi
}
main() {
log_info "Validating documentation..."
echo ""
check_index
check_missing_files
check_markdown_links
echo ""
log_info "Documentation validation complete"
}
main "$@"
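Note that the `[[ =~ ]]` test in `check_markdown_links` captures only the first link on each line. A `grep -o` pass extracts every link, as this standalone sketch shows:

```shell
line='See [A](a.md) and [B](b.md#sec) for details.'
# Each "[text](target)" match on its own line, then strip down to the target
links=$(echo "$line" | grep -o '\[[^]]*\]([^)]*)' | sed 's/^.*(\(.*\))$/\1/')
echo "$links"
# -> a.md
# -> b.md#sec
```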

#!/bin/bash
source ~/.bashrc
# Add SSH Keys to VMs that are already using DHCP
# Since VMs are already on DHCP, we just need to add SSH keys via cloud-init
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_step() {
echo ""
echo -e "${BLUE}========================================${NC}"
echo -e "${BLUE}$1${NC}"
echo -e "${BLUE}========================================${NC}"
echo ""
}
PVE_USERNAME="${PVE_USERNAME:-root@pam}"
PVE_PASSWORD="${PVE_ROOT_PASS:-}"
PROXMOX_URL="${PROXMOX_ML110_URL:-https://192.168.1.206:8006}"
PROXMOX_NODE="${PROXMOX_NODE:-pve}"
SSH_KEY_FILE="$HOME/.ssh/id_ed25519_proxmox.pub"
# VM definitions: vmid name
VMS=(
"100 cloudflare-tunnel"
"101 k3s-master"
"102 git-server"
"103 observability"
)
get_api_token() {
local response=$(curl -s -k --connect-timeout 10 --max-time 15 \
-d "username=$PVE_USERNAME&password=$PVE_PASSWORD" \
"$PROXMOX_URL/api2/json/access/ticket" 2>&1)
if echo "$response" | grep -q '"data"'; then
local ticket=$(echo "$response" | grep -o '"ticket":"[^"]*' | cut -d'"' -f4)
local csrf_token=$(echo "$response" | grep -o '"CSRFPreventionToken":"[^"]*' | cut -d'"' -f4)
echo "$ticket|$csrf_token"
else
echo ""
fi
}
add_ssh_key_to_vm() {
local vmid=$1
local name=$2
log_info "Adding SSH key to VM $vmid ($name)..."
if [ ! -f "$SSH_KEY_FILE" ]; then
log_error "SSH key file not found: $SSH_KEY_FILE"
return 1
fi
local tokens=$(get_api_token)
if [ -z "$tokens" ]; then
log_error "Failed to authenticate with Proxmox"
return 1
fi
local ticket=$(echo "$tokens" | cut -d'|' -f1)
local csrf_token=$(echo "$tokens" | cut -d'|' -f2)
# Read SSH key. It is sent raw: --data-urlencode performs the URL encoding that
# the Proxmox API expects for sshkeys, so the key must NOT be base64-encoded first
# (this matches how the other scripts in this repo submit the key).
local ssh_key_content=$(cat "$SSH_KEY_FILE")
# Add SSH key via cloud-init
curl -s -k -X POST \
    -H "Cookie: PVEAuthCookie=$ticket" \
    -H "CSRFPreventionToken: $csrf_token" \
    --data-urlencode "sshkeys=$ssh_key_content" \
    --data-urlencode "ciuser=ubuntu" \
    "$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/config" > /dev/null
log_info "✓ SSH key added to VM $vmid"
}
discover_vm_ips() {
log_step "Discovering VM IPs via QEMU Guest Agent"
log_info "Waiting for VMs to apply cloud-init changes..."
sleep 10
log_info "Rebooting VMs to apply SSH keys..."
local tokens=$(get_api_token)
local ticket=$(echo "$tokens" | cut -d'|' -f1)
local csrf_token=$(echo "$tokens" | cut -d'|' -f2)
for vm_spec in "${VMS[@]}"; do
read -r vmid name <<< "$vm_spec"
log_info "Rebooting VM $vmid..."
curl -s -k -X POST \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/status/reboot" > /dev/null 2>&1 || true
done
log_info "Waiting 90 seconds for VMs to reboot and apply cloud-init..."
sleep 90
log_info "Discovering IPs via QEMU Guest Agent..."
source "$PROJECT_ROOT/scripts/lib/proxmox_vm_helpers.sh" 2>/dev/null || {
log_error "Helper library not found"
return 1
}
local all_ok=true
for vm_spec in "${VMS[@]}"; do
read -r vmid name <<< "$vm_spec"
local ip
ip="$(get_vm_ip_from_guest_agent "$vmid" 2>/dev/null || true)"
if [[ -n "$ip" ]]; then
log_info " ✓ VM $vmid ($name): $ip"
# Test SSH
if ssh -i "${SSH_KEY_FILE%.pub}" -o ConnectTimeout=5 -o StrictHostKeyChecking=no ubuntu@$ip "echo 'SSH OK'" &>/dev/null; then
log_info " ✓ SSH working!"
else
log_warn " ✗ SSH not working yet (may need more time)"
all_ok=false
fi
else
log_warn " ✗ VM $vmid ($name): IP not discovered (guest agent may need more time)"
all_ok=false
fi
done
if [ "$all_ok" = true ]; then
log_info ""
log_info "✓ All VMs have SSH access!"
else
log_warn ""
log_warn "Some VMs may need more time. Wait a few minutes and test again."
fi
}
main() {
log_step "Add SSH Keys to DHCP VMs"
log_info "Your VMs are already configured for DHCP - no IP conflicts!"
log_info "We just need to add SSH keys via cloud-init."
echo ""
if [ ! -f "$SSH_KEY_FILE" ]; then
log_error "SSH key file not found: $SSH_KEY_FILE"
exit 1
fi
log_step "Step 1: Adding SSH Keys via Cloud-Init"
for vm_spec in "${VMS[@]}"; do
read -r vmid name <<< "$vm_spec"
add_ssh_key_to_vm "$vmid" "$name" || log_warn "Failed to add SSH key to VM $vmid"
done
discover_vm_ips
log_step "Summary"
log_info "✓ SSH keys added via cloud-init"
log_info "✓ VMs are using DHCP (no IP conflicts)"
log_info "✓ IPs discovered via QEMU Guest Agent"
log_info ""
log_info "Your scripts already support dynamic IP discovery!"
log_info "Test SSH: ./scripts/ops/ssh-test-all.sh"
}
main "$@"

#!/bin/bash
source ~/.bashrc
# Fix VM SSH Access via Proxmox Console
# Instructions for manual console access to fix SSH keys
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
SSH_KEY="ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIBGrtqePuHm2bJLNnQbuzYrpcXoHHhwWv5s2RmqEezbz proxmox-access"
# VMID NAME (IPs will be discovered via guest agent or shown from Proxmox Summary)
VMS=(
"100 cloudflare-tunnel"
"101 k3s-master"
"102 git-server"
"103 observability"
)
# Fallback IPs for reference (when guest agent not available)
declare -A FALLBACK_IPS=(
["100"]="192.168.1.60"
["101"]="192.168.1.188"
["102"]="192.168.1.121"
["103"]="192.168.1.82"
)
main() {
echo "========================================="
echo "Fix VM SSH Access via Console"
echo "========================================="
echo ""
log_info "Since SSH is not working, use Proxmox Console to fix:"
echo ""
for vm_spec in "${VMS[@]}"; do
read -r vmid name <<< "$vm_spec"
# Try to get IP from guest agent (if available)
local ip="${FALLBACK_IPS[$vmid]:-}"
if [ -f "$PROJECT_ROOT/scripts/lib/proxmox_vm_helpers.sh" ]; then
source "$PROJECT_ROOT/scripts/lib/proxmox_vm_helpers.sh" 2>/dev/null || true
local discovered_ip
discovered_ip="$(get_vm_ip_from_guest_agent "$vmid" 2>/dev/null || true)"
[[ -n "$discovered_ip" ]] && ip="$discovered_ip"
fi
echo "VM $vmid: $name"
if [[ -n "$ip" ]]; then
echo " Expected IP: $ip (check Proxmox Summary if different)"
else
echo " IP: Check Proxmox Summary for current IP"
fi
echo " 1. Access Proxmox Web UI: https://192.168.1.206:8006"
echo " 2. Navigate to: VM $vmid ($name) → Console"
echo " 3. Login as: ubuntu"
echo " 4. Run these commands:"
echo ""
echo " mkdir -p ~/.ssh"
echo " chmod 700 ~/.ssh"
echo " echo '$SSH_KEY' >> ~/.ssh/authorized_keys"
echo " chmod 600 ~/.ssh/authorized_keys"
echo ""
echo " 5. Install QEMU Guest Agent:"
echo ""
echo " sudo apt update"
echo " sudo apt install -y qemu-guest-agent"
echo " sudo systemctl enable qemu-guest-agent"
echo " sudo systemctl start qemu-guest-agent"
echo ""
if [[ -n "$ip" ]]; then
echo " 6. Test SSH from workstation:"
echo ""
echo " ssh -i ~/.ssh/id_ed25519_proxmox ubuntu@$ip"
else
echo " 6. Test SSH from workstation (use IP from Proxmox Summary):"
echo ""
echo " ssh -i ~/.ssh/id_ed25519_proxmox ubuntu@<VM_IP>"
fi
echo ""
echo "----------------------------------------"
echo ""
done
log_info "After fixing SSH, you can:"
echo " - Deploy services via SSH"
echo " - Use QEMU Guest Agent for automation"
echo " - Complete remaining tasks"
}
main "$@"

#!/bin/bash
# Fix VM 100 Guest Agent Restart Issues
# This version uses qm guest exec (no SSH to VM required)
# Use this if you cannot access VM 100 via console or SSH
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
VM_ID=100
SSH_KEY="${SSH_KEY:-$HOME/.ssh/id_ed25519_proxmox}"
PROXMOX_HOST="${PROXMOX_HOST:-192.168.1.206}"
echo "=== Fixing VM 100 Guest Agent Restart Issues (via Guest Agent) ==="
echo ""
# Test guest agent connection first
echo "Testing guest agent connection..."
if ! ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "root@${PROXMOX_HOST}" "qm guest exec $VM_ID -- /bin/hostname" > /dev/null 2>&1; then
echo "ERROR: Guest agent is not responding. Please ensure:"
echo " 1. Guest agent is enabled in VM configuration"
echo " 2. qemu-guest-agent service is running inside VM 100"
echo " 3. VM 100 is running"
exit 1
fi
echo "✅ Guest agent is responding"
echo ""
# Execute a command inside the VM via the guest agent and print its stdout.
# printf %q shell-quotes the command so embedded quotes survive the remote shell
# (hand-rolled single-quote escaping breaks on commands that contain quotes).
exec_via_qga() {
    local cmd="$1"
    # Execute, then parse the JSON output and extract the out-data field
    ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "root@${PROXMOX_HOST}" \
        "qm guest exec $VM_ID -- /bin/bash -c $(printf '%q' "$cmd")" 2>&1 | \
        grep -oP '"out-data"\s*:\s*"[^"]*"' | \
        sed 's/"out-data"\s*:\s*"//;s/"$//' | \
        sed 's/\\n/\n/g' | \
        sed 's/\\"/"/g' || true
}
# Execute a command and report success/failure via the exit code only.
exec_via_qga_silent() {
    local cmd="$1"
    local result
    result=$(ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "root@${PROXMOX_HOST}" \
        "qm guest exec $VM_ID -- /bin/bash -c $(printf '%q' "$cmd")" 2>&1)
    # qm guest exec reports the in-VM exit code in its JSON output
    echo "$result" | grep -q '"exitcode"\s*:\s*0'
}
echo "=== Current Guest Agent Status ==="
exec_via_qga "systemctl status qemu-guest-agent --no-pager | head -10" || true
echo ""
echo "=== Creating systemd override directory ==="
exec_via_qga "sudo mkdir -p /etc/systemd/system/qemu-guest-agent.service.d/"
echo "✅ Directory created"
echo ""
echo "=== Creating override configuration ==="
# Create the override file using echo (heredoc doesn't work well via qm guest exec)
exec_via_qga "sudo bash -c 'echo \"[Service]\" > /etc/systemd/system/qemu-guest-agent.service.d/override.conf'"
exec_via_qga "sudo bash -c 'echo \"# Add 5 second delay before restart to prevent restart loops\" >> /etc/systemd/system/qemu-guest-agent.service.d/override.conf'"
exec_via_qga "sudo bash -c 'echo \"RestartSec=5\" >> /etc/systemd/system/qemu-guest-agent.service.d/override.conf'"
exec_via_qga "sudo bash -c 'echo \"# Increase timeout for service start\" >> /etc/systemd/system/qemu-guest-agent.service.d/override.conf'"
exec_via_qga "sudo bash -c 'echo \"TimeoutStartSec=30\" >> /etc/systemd/system/qemu-guest-agent.service.d/override.conf'"
echo "✅ Override configuration created"
echo ""
echo "=== Reloading systemd daemon ==="
exec_via_qga "sudo systemctl daemon-reload"
echo "✅ Systemd daemon reloaded"
echo ""
echo "=== Verifying override configuration ==="
exec_via_qga "systemctl cat qemu-guest-agent.service | grep -A 5 override.conf || echo 'Override not found in output'"
echo ""
echo "=== Restarting guest agent service ==="
exec_via_qga "sudo systemctl restart qemu-guest-agent"
echo "✅ Service restarted"
echo ""
echo "=== Waiting for service to stabilize ==="
sleep 5
echo ""
echo "=== Checking service status ==="
exec_via_qga "systemctl status qemu-guest-agent --no-pager | head -15" || true
echo ""
echo "=== Verifying service is running ==="
if exec_via_qga_silent "systemctl is-active --quiet qemu-guest-agent"; then
echo "✅ Guest agent service is active"
else
echo "⚠️ Guest agent service status check failed (may still be starting)"
# Try to get actual status
exec_via_qga "systemctl is-active qemu-guest-agent" || true
fi
echo ""
echo "=== Checking restart configuration ==="
exec_via_qga "systemctl show qemu-guest-agent | grep -E 'RestartSec|Restart=' || true"
echo ""
echo "=== Testing guest agent from Proxmox host ==="
HOSTNAME_OUTPUT=$(ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "root@${PROXMOX_HOST}" "qm guest exec $VM_ID -- /bin/hostname" 2>&1)
if echo "$HOSTNAME_OUTPUT" | grep -q '"exitcode"\s*:\s*0'; then
echo "✅ Guest agent is responding"
HOSTNAME=$(echo "$HOSTNAME_OUTPUT" | grep -oP '"out-data"\s*:\s*"[^"]*"' | sed 's/"out-data"\s*:\s*"//;s/"$//' | sed 's/\\n/\n/g' | head -1)
echo " VM hostname: $HOSTNAME"
else
echo "⚠️ Guest agent test failed (may need a moment to stabilize)"
fi
echo ""
echo "=== Fix Complete ==="
echo "The guest agent service now has a 5-second restart delay."
echo "This should prevent restart loops and connection timeouts."
echo ""
echo "Monitor the service with:"
echo " ssh root@${PROXMOX_HOST} 'qm guest exec $VM_ID -- systemctl status qemu-guest-agent'"
echo ""
echo "Or check logs with:"
echo " ssh root@${PROXMOX_HOST} 'qm guest exec $VM_ID -- journalctl -u qemu-guest-agent -n 20'"
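Embedding arbitrary commands inside a single-quoted string for a remote `bash -c` is notoriously error-prone. `printf %q` produces quoting that round-trips through another shell, which is the safe pattern for `exec_via_qga`-style wrappers:

```shell
cmd='echo "[Service]" > /tmp/override.conf'
quoted=$(printf '%q' "$cmd")
# A fresh shell parses the quoted form back into the original string unchanged
roundtrip=$(bash -c "printf '%s' $quoted")
echo "$roundtrip"   # -> echo "[Service]" > /tmp/override.conf
```

Because the round-trip is exact, commands containing single quotes, double quotes, or redirections all survive the ssh → `qm guest exec` → `bash -c` chain.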

#!/bin/bash
# Fix VM 100 Guest Agent Restart Issues
# This script adds a restart delay to prevent restart loops
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Source helper functions
source "$PROJECT_ROOT/scripts/lib/proxmox_vm_helpers.sh"
VM_ID=100
VM_USER="ubuntu"
SSH_KEY="${SSH_KEY:-$HOME/.ssh/id_ed25519_proxmox}"
PROXMOX_HOST="${PROXMOX_HOST:-192.168.1.206}"
echo "=== Fixing VM 100 Guest Agent Restart Issues ==="
echo ""
# Get VM IP
echo "Getting VM 100 IP address..."
ip=$(get_vm_ip_or_warn "$VM_ID" "cloudflare-tunnel" || true)
if [ -z "$ip" ]; then
echo "ERROR: Could not get IP for VM $VM_ID"
exit 1
fi
echo "VM 100 IP: $ip"
echo ""
# SSH into VM 100 directly, using the Proxmox host as a jump host
# (avoids assuming the private key also exists on the Proxmox host)
echo "Connecting to VM 100 via Proxmox host..."
ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no \
    -o ProxyJump="root@${PROXMOX_HOST}" "${VM_USER}@${ip}" <<'VMEOF'
set -euo pipefail
echo "=== Current Guest Agent Status ==="
systemctl status qemu-guest-agent --no-pager | head -10 || true
echo ""
echo "=== Creating systemd override directory ==="
sudo mkdir -p /etc/systemd/system/qemu-guest-agent.service.d/
echo "=== Creating override configuration ==="
sudo tee /etc/systemd/system/qemu-guest-agent.service.d/override.conf > /dev/null <<'OVERRIDE'
[Service]
# Add 5 second delay before restart to prevent restart loops
RestartSec=5
# Increase timeout for service start
TimeoutStartSec=30
OVERRIDE
echo "=== Reloading systemd daemon ==="
sudo systemctl daemon-reload
echo "=== Verifying override configuration ==="
systemctl cat qemu-guest-agent.service | grep -A 5 "override.conf" || true
echo ""
echo "=== Restarting guest agent service ==="
sudo systemctl restart qemu-guest-agent
echo "=== Waiting for service to stabilize ==="
sleep 3
echo "=== Checking service status ==="
systemctl status qemu-guest-agent --no-pager | head -15 || true
echo ""
echo "=== Verifying service is running ==="
if systemctl is-active --quiet qemu-guest-agent; then
echo "✅ Guest agent service is active"
else
echo "❌ Guest agent service is not active"
exit 1
fi
echo ""
echo "=== Checking restart configuration ==="
systemctl show qemu-guest-agent | grep -E "RestartSec|Restart=" || true
echo ""
echo "✅ Guest agent restart fix completed successfully"
VMEOF
echo ""
echo "=== Testing guest agent from Proxmox host ==="
ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "root@${PROXMOX_HOST}" <<EOF
echo "Testing guest agent connection..."
if qm guest exec $VM_ID -- hostname > /dev/null 2>&1; then
echo "✅ Guest agent is responding"
qm guest exec $VM_ID -- hostname
else
echo "⚠️ Guest agent test failed (may need a moment to stabilize)"
fi
EOF
echo ""
echo "=== Fix Complete ==="
echo "The guest agent service now has a 5-second restart delay."
echo "This should prevent restart loops and connection timeouts."
echo ""
echo "Monitor the service with:"
echo " ssh root@${PROXMOX_HOST} 'qm guest exec $VM_ID -- systemctl status qemu-guest-agent'"
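To confirm the drop-in actually took effect, the `systemctl show` output can be parsed rather than eyeballed; systemd reports the restart delay under the `RestartUSec` property. The `show_output` below is a simulated sample — on the VM you would capture it with `show_output=$(systemctl show qemu-guest-agent)`:

```shell
# Simulated `systemctl show qemu-guest-agent` output (assumption: values on a
# real VM will include many more properties, in the same KEY=VALUE form)
show_output='Restart=always
RestartUSec=5s
TimeoutStartUSec=30s'

restart_delay=$(printf '%s\n' "$show_output" | awk -F= '/^RestartUSec=/{print $2}')
if [ "$restart_delay" = "5s" ]; then
  echo "override active"
else
  echo "override missing"
fi
```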


@@ -0,0 +1,448 @@
#!/bin/bash
source ~/.bashrc
# Recreate Template VM 9000 with Proper Cloud-Init
# Then Recreate VMs 100-103 from the new template
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_step() {
echo ""
echo -e "${BLUE}========================================${NC}"
echo -e "${BLUE}$1${NC}"
echo -e "${BLUE}========================================${NC}"
echo ""
}
PVE_USERNAME="${PVE_USERNAME:-root@pam}"
PVE_PASSWORD="${PVE_ROOT_PASS:-}"
PROXMOX_URL="${PROXMOX_ML110_URL:-https://192.168.1.206:8006}"
PROXMOX_NODE="${PROXMOX_NODE:-pve}"
PROXMOX_HOST="${PROXMOX_ML110_IP:-192.168.1.206}"
SSH_KEY="${SSH_KEY:-$HOME/.ssh/id_ed25519_proxmox}"
SSH_KEY_FILE="$SSH_KEY.pub"
TEMPLATE_VMID=9000
STORAGE="${STORAGE:-local-lvm}"
# VM definitions: vmid name ip cores memory disk_size
# (the ip column is reference-only here; these VMs are configured with DHCP)
VMS=(
"100 cloudflare-tunnel 192.168.1.188 2 2048 20"
"101 k3s-master 192.168.1.60 4 4096 40"
"102 git-server 192.168.1.121 2 2048 30"
"103 observability 192.168.1.82 2 2048 30"
)
get_api_token() {
local response=$(curl -s -k --connect-timeout 10 --max-time 15 \
-d "username=$PVE_USERNAME&password=$PVE_PASSWORD" \
"$PROXMOX_URL/api2/json/access/ticket" 2>&1)
if echo "$response" | grep -q '"data"'; then
local ticket=$(echo "$response" | grep -o '"ticket":"[^"]*' | cut -d'"' -f4)
local csrf_token=$(echo "$response" | grep -o '"CSRFPreventionToken":"[^"]*' | cut -d'"' -f4)
echo "$ticket|$csrf_token"
else
echo ""
fi
}
recreate_template() {
log_step "Step 1: Recreating Template VM 9000"
if [ ! -f "$SSH_KEY_FILE" ]; then
log_error "SSH key file not found: $SSH_KEY_FILE"
exit 1
fi
log_info "This will destroy and recreate template VM 9000"
log_warn "All VMs cloned from this template will need to be recreated"
echo ""
# Auto-confirm if running non-interactively
if [ -t 0 ]; then
read -p "Continue? (yes/no): " confirm
if [ "$confirm" != "yes" ]; then
log_info "Cancelled"
exit 0
fi
else
log_info "Non-interactive mode: auto-confirming"
fi
# Copy the SSH public key to the Proxmox host first; the template's
# cloud-init config references it at /tmp/id_ed25519_proxmox.pub
log_info "Copying SSH key to Proxmox host..."
scp -i "$SSH_KEY" "$SSH_KEY_FILE" root@$PROXMOX_HOST:/tmp/id_ed25519_proxmox.pub
# Execute template creation on the Proxmox host (single pass; the key is
# already in place)
log_info "Connecting to Proxmox host to recreate template..."
ssh -i "$SSH_KEY" root@$PROXMOX_HOST "STORAGE=$STORAGE bash" < <(cat <<'INLINE_SCRIPT'
set -e
TEMPLATE_VMID=9000
STORAGE="${STORAGE:-local-lvm}"
SSH_KEY_FILE="/tmp/id_ed25519_proxmox.pub"
# Check if template exists and destroy it
if qm status $TEMPLATE_VMID &>/dev/null; then
echo "Stopping and destroying existing template VM $TEMPLATE_VMID..."
qm stop $TEMPLATE_VMID 2>/dev/null || true
sleep 5
qm destroy $TEMPLATE_VMID 2>/dev/null || true
sleep 2
fi
# Download Ubuntu 24.04 cloud image
echo "Downloading Ubuntu 24.04 cloud image..."
IMAGE_URL="https://cloud-images.ubuntu.com/releases/24.04/release/ubuntu-24.04-server-cloudimg-amd64.img"
IMAGE_FILE="/tmp/ubuntu-24.04-server-cloudimg-amd64.img"
if [ ! -f "$IMAGE_FILE" ]; then
wget -q --show-progress -O "$IMAGE_FILE" "$IMAGE_URL" || {
echo "Failed to download image"
exit 1
}
fi
# Create VM
echo "Creating template VM $TEMPLATE_VMID..."
qm create $TEMPLATE_VMID \
--name ubuntu-24.04-cloud-init \
--memory 2048 \
--cores 2 \
--net0 virtio,bridge=vmbr0 \
--scsihw virtio-scsi-pci \
--scsi0 $STORAGE:0,import-from=$IMAGE_FILE,discard=on \
--ide2 $STORAGE:cloudinit \
--boot order=scsi0 \
--serial0 socket \
--vga serial0 \
--agent enabled=1 \
--ostype l26
# Resize disk to 32GB
echo "Resizing disk to 32GB..."
qm disk resize $TEMPLATE_VMID scsi0 32G
# Configure cloud-init with SSH key
echo "Configuring cloud-init..."
qm set $TEMPLATE_VMID \
--ciuser ubuntu \
--cipassword "" \
--sshkeys $SSH_KEY_FILE \
--ipconfig0 ip=dhcp
# Convert to template
echo "Converting to template..."
qm template $TEMPLATE_VMID
echo "✓ Template VM $TEMPLATE_VMID created successfully"
INLINE_SCRIPT
)
log_info "✓ Template VM 9000 recreated with proper cloud-init"
}
destroy_existing_vms() {
log_step "Step 2: Destroying Existing VMs"
local tokens=$(get_api_token)
if [ -z "$tokens" ]; then
log_error "Failed to authenticate with Proxmox"
return 1
fi
local ticket=$(echo "$tokens" | cut -d'|' -f1)
local csrf_token=$(echo "$tokens" | cut -d'|' -f2)
for vm_spec in "${VMS[@]}"; do
read -r vmid name ip cores memory disk_size <<< "$vm_spec"
log_info "Destroying VM $vmid ($name)..."
# Stop VM if running
local status=$(curl -s -k -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/status/current" | \
python3 -c "import sys, json; print(json.load(sys.stdin).get('data', {}).get('status', 'stopped'))" 2>/dev/null || echo "stopped")
if [ "$status" = "running" ]; then
log_info "Stopping VM $vmid..."
curl -s -k -X POST \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/status/stop" > /dev/null
sleep 5
fi
# Delete VM
curl -s -k -X DELETE \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid" > /dev/null
log_info "✓ VM $vmid destroyed"
done
}
create_vms_from_template() {
log_step "Step 3: Creating VMs from Template"
local tokens=$(get_api_token)
if [ -z "$tokens" ]; then
log_error "Failed to authenticate with Proxmox"
return 1
fi
local ticket=$(echo "$tokens" | cut -d'|' -f1)
local csrf_token=$(echo "$tokens" | cut -d'|' -f2)
# Read SSH key. The Proxmox API expects the cloud-init sshkeys value to be
# percent-encoded (its schema format is "urlencoded"), not base64; the
# variable name is kept so later references still resolve.
local ssh_key_b64=$(python3 -c 'import urllib.parse,sys; print(urllib.parse.quote(sys.stdin.read().strip(), safe=""))' < "$SSH_KEY_FILE")
for vm_spec in "${VMS[@]}"; do
read -r vmid name ip cores memory disk_size <<< "$vm_spec"
log_info "Creating VM $vmid: $name"
# Clone from template
log_info "Cloning from template $TEMPLATE_VMID..."
local clone_response=$(curl -s -k -X POST \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
-d "newid=$vmid" \
-d "name=$name" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$TEMPLATE_VMID/clone" 2>&1)
if ! echo "$clone_response" | grep -q '"data"'; then
log_error "Failed to clone VM: $clone_response"
continue
fi
log_info "Waiting for clone to complete..."
sleep 10
# Configure VM
log_info "Configuring VM $vmid..."
# Set resources
curl -s -k -X POST \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
-d "cores=$cores" \
-d "memory=$memory" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/config" > /dev/null
# Resize disk if needed
if [ "$disk_size" != "32" ]; then
log_info "Resizing disk to ${disk_size}G..."
ssh -i "$SSH_KEY" root@$PROXMOX_HOST "qm disk resize $vmid scsi0 ${disk_size}G" 2>/dev/null || true
fi
# Configure cloud-init with SSH keys and DHCP
log_info "Configuring cloud-init with SSH keys..."
curl -s -k -X POST \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
--data-urlencode "ipconfig0=ip=dhcp" \
--data-urlencode "ciuser=ubuntu" \
--data-urlencode "sshkeys=$ssh_key_b64" \
--data-urlencode "agent=1" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/config" > /dev/null
# Start VM
log_info "Starting VM $vmid..."
curl -s -k -X POST \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/status/start" > /dev/null
log_info "✓ VM $vmid created and started"
done
}
wait_and_test() {
log_step "Step 4: Waiting for VMs to Boot and Testing SSH"
log_info "Waiting 90 seconds for VMs to boot and apply cloud-init..."
sleep 90
log_info "Discovering IPs via QEMU Guest Agent..."
source "$PROJECT_ROOT/scripts/lib/proxmox_vm_helpers.sh" 2>/dev/null || {
log_warn "Helper library not found, will test SSH manually"
}
local all_ok=true
for vm_spec in "${VMS[@]}"; do
read -r vmid name ip cores memory disk_size <<< "$vm_spec"
# Try to get IP from guest agent
local discovered_ip=""
if command -v get_vm_ip_from_guest_agent &>/dev/null; then
# NOTE: assumes this repo is checked out at this path on the Proxmox host
discovered_ip=$(ssh -i "$SSH_KEY" root@$PROXMOX_HOST \
"source /home/intlc/projects/loc_az_hci/scripts/lib/proxmox_vm_helpers.sh 2>/dev/null && \
get_vm_ip_from_guest_agent $vmid 2>/dev/null || echo ''")
fi
if [[ -n "$discovered_ip" ]]; then
log_info "VM $vmid ($name): $discovered_ip"
# Test SSH
if ssh -i "$SSH_KEY" -o ConnectTimeout=5 -o StrictHostKeyChecking=no ubuntu@$discovered_ip "echo 'SSH OK'" &>/dev/null; then
log_info " ✓ SSH working!"
else
log_warn " ✗ SSH not working yet (may need more time)"
all_ok=false
fi
else
log_warn "VM $vmid ($name): IP not discovered yet"
log_info " Try checking router DHCP leases or wait a bit longer"
all_ok=false
fi
done
if [ "$all_ok" = true ]; then
log_info ""
log_info "✓ All VMs recreated successfully with SSH access!"
log_info "You can now run: ./scripts/deploy/complete-all-next-steps.sh"
else
log_warn ""
log_warn "Some VMs may need more time. Wait a few minutes and test again."
log_info "Use: ./scripts/ops/ssh-test-all.sh to test SSH access"
fi
}
main() {
log_step "Recreate Template and VMs with Proper Cloud-Init"
if [ ! -f "$SSH_KEY_FILE" ]; then
log_error "SSH key file not found: $SSH_KEY_FILE"
exit 1
fi
log_warn "This will:"
log_warn " 1. Destroy and recreate template VM 9000"
log_warn " 2. Destroy existing VMs 100-103"
log_warn " 3. Recreate VMs 100-103 from new template"
log_warn " 4. Configure all VMs with SSH keys via cloud-init"
echo ""
# Auto-confirm if running non-interactively
if [ -t 0 ]; then
read -p "Continue? (yes/no): " confirm
if [ "$confirm" != "yes" ]; then
log_info "Cancelled"
exit 0
fi
else
log_info "Non-interactive mode: auto-confirming"
fi
recreate_template
destroy_existing_vms
create_vms_from_template
wait_and_test
log_step "Summary"
log_info "✓ Template VM 9000 recreated with proper cloud-init"
log_info "✓ VMs 100-103 recreated from template"
log_info "✓ SSH keys configured via cloud-init"
log_info "✓ VMs using DHCP (no IP conflicts)"
log_info ""
log_info "Next: Test SSH access and install QEMU Guest Agent"
}
main "$@"
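The script above waits a fixed 90 seconds before querying the guest agent, but freshly booted VMs often need a variable amount of time. A bounded polling loop is usually more robust — a sketch, where `probe` is a stand-in for `get_vm_ip_from_guest_agent` and is rigged here to succeed on the third attempt:

```shell
# Counter kept in a temp file because probe runs in a command substitution
# (a subshell), where a plain variable increment would be lost.
count_file=$(mktemp)
echo 0 > "$count_file"

probe() { # stand-in for get_vm_ip_from_guest_agent; succeeds on 3rd call
  local n=$(( $(cat "$count_file") + 1 ))
  echo "$n" > "$count_file"
  if [ "$n" -ge 3 ]; then echo "192.0.2.10"; fi
}

wait_for_ip() { # usage: wait_for_ip <max_tries> <delay_seconds>
  local tries=$1 delay=$2 ip
  for _ in $(seq "$tries"); do
    ip=$(probe)
    if [ -n "$ip" ]; then echo "$ip"; return 0; fi
    sleep "$delay"
  done
  return 1
}

wait_for_ip 5 0 && echo "discovered" || echo "gave up"
```

In the real scripts, `probe` would be the SSH call to the Proxmox host and `delay` something like 10 seconds.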


@@ -0,0 +1,269 @@
#!/bin/bash
source ~/.bashrc
# Recreate VMs with Proper SSH Key Configuration
# Destroys existing VMs and recreates them with cloud-init SSH keys
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_step() {
echo ""
echo -e "${BLUE}========================================${NC}"
echo -e "${BLUE}$1${NC}"
echo -e "${BLUE}========================================${NC}"
echo ""
}
PVE_USERNAME="${PVE_USERNAME:-root@pam}"
PVE_PASSWORD="${PVE_ROOT_PASS:-}"
PROXMOX_URL="${PROXMOX_ML110_URL:-https://192.168.1.206:8006}"
PROXMOX_NODE="${PROXMOX_NODE:-pve}"
TEMPLATE_VMID="${TEMPLATE_VMID:-9000}"
SSH_KEY_FILE="$HOME/.ssh/id_ed25519_proxmox.pub"
GATEWAY="${GATEWAY:-192.168.1.254}"
# VM definitions: vmid name ip cores memory disk_size
VMS=(
"100 cloudflare-tunnel 192.168.1.188 2 2048 20"
"101 k3s-master 192.168.1.60 4 4096 40"
"102 git-server 192.168.1.121 2 2048 30"
"103 observability 192.168.1.82 2 2048 30"
)
get_api_token() {
local response=$(curl -s -k --connect-timeout 10 --max-time 15 \
-d "username=$PVE_USERNAME&password=$PVE_PASSWORD" \
"$PROXMOX_URL/api2/json/access/ticket" 2>&1)
if echo "$response" | grep -q '"data"'; then
local ticket=$(echo "$response" | grep -o '"ticket":"[^"]*' | cut -d'"' -f4)
local csrf_token=$(echo "$response" | grep -o '"CSRFPreventionToken":"[^"]*' | cut -d'"' -f4)
echo "$ticket|$csrf_token"
else
echo ""
fi
}
destroy_vm() {
local vmid=$1
local name=$2
log_info "Destroying VM $vmid ($name)..."
local tokens=$(get_api_token)
if [ -z "$tokens" ]; then
log_error "Failed to authenticate with Proxmox"
return 1
fi
local ticket=$(echo "$tokens" | cut -d'|' -f1)
local csrf_token=$(echo "$tokens" | cut -d'|' -f2)
# Stop VM if running
local status=$(curl -s -k -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/status/current" | \
python3 -c "import sys, json; print(json.load(sys.stdin).get('data', {}).get('status', 'stopped'))" 2>/dev/null || echo "stopped")
if [ "$status" = "running" ]; then
log_info "Stopping VM $vmid..."
curl -s -k -X POST \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/status/stop" > /dev/null
sleep 5
fi
# Delete VM
log_info "Deleting VM $vmid..."
curl -s -k -X DELETE \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid" > /dev/null
log_info "VM $vmid destroyed"
}
create_vm_with_ssh() {
local vmid=$1
local name=$2
local ip=$3
local cores=$4
local memory=$5
local disk_size=$6
log_info "Creating VM $vmid: $name with SSH keys"
local tokens=$(get_api_token)
if [ -z "$tokens" ]; then
log_error "Failed to authenticate with Proxmox"
return 1
fi
local ticket=$(echo "$tokens" | cut -d'|' -f1)
local csrf_token=$(echo "$tokens" | cut -d'|' -f2)
# Read SSH public key
if [ ! -f "$SSH_KEY_FILE" ]; then
log_error "SSH key file not found: $SSH_KEY_FILE"
return 1
fi
# Proxmox expects the sshkeys value percent-encoded (not base64); the
# variable name is kept so later references still resolve
local ssh_key_b64=$(python3 -c 'import urllib.parse,sys; print(urllib.parse.quote(sys.stdin.read().strip(), safe=""))' < "$SSH_KEY_FILE")
# Check if template exists
local template_check=$(curl -s -k -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$TEMPLATE_VMID/status/current" 2>&1)
if ! echo "$template_check" | grep -q '"data"'; then
log_error "Template VM $TEMPLATE_VMID not found"
return 1
fi
# Clone VM from template
log_info "Cloning from template $TEMPLATE_VMID..."
local clone_response=$(curl -s -k -X POST \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
-d "newid=$vmid" \
-d "name=$name" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$TEMPLATE_VMID/clone" 2>&1)
if ! echo "$clone_response" | grep -q '"data"'; then
log_error "Failed to clone VM: $clone_response"
return 1
fi
log_info "VM cloned, waiting for clone to complete..."
sleep 10
# Configure VM
log_info "Configuring VM $vmid..."
# Set resources
curl -s -k -X POST \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
-d "cores=$cores" \
-d "memory=$memory" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/config" > /dev/null
# Configure cloud-init with SSH keys
log_info "Configuring cloud-init with SSH keys..."
curl -s -k -X POST \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
--data-urlencode "ipconfig0=ip=$ip/24,gw=$GATEWAY" \
--data-urlencode "ciuser=ubuntu" \
--data-urlencode "sshkeys=$ssh_key_b64" \
--data-urlencode "agent=1" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/config" > /dev/null
log_info "✓ VM $vmid configured with SSH keys"
# Start VM
log_info "Starting VM $vmid..."
curl -s -k -X POST \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/status/start" > /dev/null
log_info "✓ VM $vmid started"
}
main() {
log_step "Recreate VMs with SSH Key Configuration"
if [ ! -f "$SSH_KEY_FILE" ]; then
log_error "SSH key file not found: $SSH_KEY_FILE"
exit 1
fi
log_warn "This will DESTROY and RECREATE all 4 VMs (100-103)"
log_warn "All data on these VMs will be lost!"
echo ""
read -p "Are you sure you want to continue? (yes/no): " confirm
if [ "$confirm" != "yes" ]; then
log_info "Cancelled"
exit 0
fi
# Destroy existing VMs
log_step "Step 1: Destroying Existing VMs"
for vm_spec in "${VMS[@]}"; do
read -r vmid name ip cores memory disk_size <<< "$vm_spec"
destroy_vm "$vmid" "$name" || log_warn "Failed to destroy VM $vmid"
done
sleep 5
# Create new VMs with SSH keys
log_step "Step 2: Creating VMs with SSH Keys"
for vm_spec in "${VMS[@]}"; do
read -r vmid name ip cores memory disk_size <<< "$vm_spec"
create_vm_with_ssh "$vmid" "$name" "$ip" "$cores" "$memory" "$disk_size" || {
log_error "Failed to create VM $vmid"
continue
}
done
log_step "Step 3: Waiting for VMs to Boot"
log_info "Waiting 60 seconds for VMs to boot and apply cloud-init..."
sleep 60
log_step "Step 4: Testing SSH Access"
log_info "Testing SSH access to VMs..."
local all_ok=true
for vm_spec in "${VMS[@]}"; do
read -r vmid name ip cores memory disk_size <<< "$vm_spec"
if ssh -i "${SSH_KEY_FILE%.pub}" -o ConnectTimeout=10 -o StrictHostKeyChecking=no ubuntu@$ip "echo 'SSH OK' && hostname" &>/dev/null; then
log_info " ✓ VM $vmid ($name) at $ip: SSH working"
else
log_error " ✗ VM $vmid ($name) at $ip: SSH not working"
all_ok=false
fi
done
if [ "$all_ok" = true ]; then
log_info ""
log_info "✓ All VMs recreated successfully with SSH access!"
log_info "You can now run: ./scripts/deploy/complete-all-next-steps.sh"
else
log_warn ""
log_warn "Some VMs may need more time to boot. Wait a few minutes and test again."
fi
}
main "$@"
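The Proxmox API stores the cloud-init `sshkeys` value in percent-encoded form (its schema format is "urlencoded"). A small helper shows the encoding these scripts need to produce — treat the exact encoding requirements as an assumption to verify against your PVE version:

```shell
# Percent-encode an OpenSSH public key for the Proxmox cloud-init `sshkeys`
# field. safe="" forces even spaces and slashes to be encoded.
urlencode() {
  python3 -c 'import urllib.parse,sys; print(urllib.parse.quote(sys.stdin.read().strip(), safe=""))'
}

printf 'ssh-ed25519 AAAAC3Nza homelab\n' | urlencode
# prints: ssh-ed25519%20AAAAC3Nza%20homelab
```

The key material above is a truncated placeholder, not a real key.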

scripts/fix/setup-nat-for-vms.sh Executable file

@@ -0,0 +1,275 @@
#!/bin/bash
source ~/.bashrc
# Setup NAT for VMs - Make VMs accessible via Proxmox host
# Creates a NAT network so VMs can be accessed via Proxmox host IP
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_step() {
echo ""
echo -e "${BLUE}========================================${NC}"
echo -e "${BLUE}$1${NC}"
echo -e "${BLUE}========================================${NC}"
echo ""
}
PROXMOX_HOST="${PROXMOX_ML110_IP:-192.168.1.206}"
SSH_KEY="${SSH_KEY:-$HOME/.ssh/id_ed25519_proxmox}"
# NAT network configuration
NAT_NETWORK="10.0.0.0/24"
NAT_BRIDGE="vmbr1"
NAT_GATEWAY="10.0.0.1"
# VM definitions: vmid name nat_ip
VMS=(
"100 cloudflare-tunnel 10.0.0.10"
"101 k3s-master 10.0.0.11"
"102 git-server 10.0.0.12"
"103 observability 10.0.0.13"
)
setup_nat_bridge() {
log_step "Step 1: Setting up NAT Bridge"
log_info "Creating NAT bridge $NAT_BRIDGE on Proxmox host..."
ssh -i "$SSH_KEY" root@$PROXMOX_HOST <<EOF
set -e
# Check if bridge already exists
if ip link show $NAT_BRIDGE &>/dev/null; then
echo "Bridge $NAT_BRIDGE already exists"
else
# Create bridge
cat >> /etc/network/interfaces <<INTERFACES
# NAT bridge for VMs
auto $NAT_BRIDGE
iface $NAT_BRIDGE inet static
address $NAT_GATEWAY
netmask 255.255.255.0
bridge_ports none
bridge_stp off
bridge_fd 0
post-up echo 1 > /proc/sys/net/ipv4/ip_forward
post-up iptables -t nat -A POSTROUTING -s $NAT_NETWORK -o vmbr0 -j MASQUERADE
post-up iptables -A FORWARD -s $NAT_NETWORK -j ACCEPT
post-up iptables -A FORWARD -d $NAT_NETWORK -j ACCEPT
INTERFACES
# Bring up bridge
ifup $NAT_BRIDGE
echo "✓ NAT bridge $NAT_BRIDGE created"
fi
# Enable IP forwarding
echo 1 > /proc/sys/net/ipv4/ip_forward
# Setup iptables rules (idempotent)
iptables -t nat -C POSTROUTING -s $NAT_NETWORK -o vmbr0 -j MASQUERADE 2>/dev/null || \
iptables -t nat -A POSTROUTING -s $NAT_NETWORK -o vmbr0 -j MASQUERADE
iptables -C FORWARD -s $NAT_NETWORK -j ACCEPT 2>/dev/null || \
iptables -A FORWARD -s $NAT_NETWORK -j ACCEPT
iptables -C FORWARD -d $NAT_NETWORK -j ACCEPT 2>/dev/null || \
iptables -A FORWARD -d $NAT_NETWORK -j ACCEPT
echo "✓ NAT rules configured"
EOF
log_info "✓ NAT bridge configured"
}
configure_vm_nat() {
local vmid=$1
local name=$2
local nat_ip=$3
log_info "Configuring VM $vmid ($name) with NAT IP $nat_ip..."
ssh -i "$SSH_KEY" root@$PROXMOX_HOST <<EOF
# Update VM network to use NAT bridge
qm set $vmid --net0 virtio,bridge=$NAT_BRIDGE
# Configure cloud-init with NAT IP
qm set $vmid --ipconfig0 ip=$nat_ip/24,gw=$NAT_GATEWAY
echo "✓ VM $vmid configured for NAT"
EOF
}
setup_port_forwarding() {
log_step "Step 3: Setting up Port Forwarding"
log_info "Setting up port forwarding rules..."
# Port mappings (reference only — the authoritative copy lives inside the
# remote heredoc below): external_port -> vm_nat_ip:internal_port
# vmid external_port internal_port description
#  100 2222 22   cloudflare-tunnel-ssh
#  101 2223 22   k3s-master-ssh
#  102 2224 22   git-server-ssh
#  103 2225 22   observability-ssh
#  102 3000 3000 gitea-web
#  103 9090 9090 prometheus
#  103 3001 3000 grafana
ssh -i "$SSH_KEY" root@$PROXMOX_HOST <<'EOF'
set -e
# Get NAT IPs for VMs
declare -A VM_NAT_IPS=(
["100"]="10.0.0.10"
["101"]="10.0.0.11"
["102"]="10.0.0.12"
["103"]="10.0.0.13"
)
# Port forwarding rules
# Format: vmid external_port internal_port
PORT_MAPPINGS=(
"100 2222 22"
"101 2223 22"
"102 2224 22"
"103 2225 22"
"102 3000 3000"
"103 9090 9090"
"103 3001 3000"
)
for mapping in "${PORT_MAPPINGS[@]}"; do
read -r vmid ext_port int_port <<< "$mapping"
nat_ip="${VM_NAT_IPS[$vmid]}"
# Check if rule exists
if iptables -t nat -C PREROUTING -p tcp --dport $ext_port -j DNAT --to-destination $nat_ip:$int_port 2>/dev/null; then
echo "Port forwarding $ext_port -> $nat_ip:$int_port already exists"
else
# Add port forwarding
iptables -t nat -A PREROUTING -p tcp --dport $ext_port -j DNAT --to-destination $nat_ip:$int_port
iptables -A FORWARD -p tcp -d $nat_ip --dport $int_port -j ACCEPT
echo "✓ Port forwarding: port $ext_port -> $nat_ip:$int_port"
fi
done
# Save iptables rules
if command -v netfilter-persistent &>/dev/null; then
netfilter-persistent save
elif [ -f /etc/iptables/rules.v4 ]; then
iptables-save > /etc/iptables/rules.v4
fi
echo "✓ Port forwarding configured"
EOF
log_info "✓ Port forwarding configured"
}
show_access_info() {
log_step "Access Information"
log_info "VM Access via NAT:"
echo ""
echo " VM 100 (cloudflare-tunnel):"
echo " SSH: ssh -i $SSH_KEY ubuntu@$PROXMOX_HOST -p 2222"
echo " Direct NAT: ssh -i $SSH_KEY ubuntu@10.0.0.10 (from Proxmox host)"
echo ""
echo " VM 101 (k3s-master):"
echo " SSH: ssh -i $SSH_KEY ubuntu@$PROXMOX_HOST -p 2223"
echo " Direct NAT: ssh -i $SSH_KEY ubuntu@10.0.0.11 (from Proxmox host)"
echo ""
echo " VM 102 (git-server):"
echo " SSH: ssh -i $SSH_KEY ubuntu@$PROXMOX_HOST -p 2224"
echo " Gitea: http://$PROXMOX_HOST:3000"
echo " Direct NAT: ssh -i $SSH_KEY ubuntu@10.0.0.12 (from Proxmox host)"
echo ""
echo " VM 103 (observability):"
echo " SSH: ssh -i $SSH_KEY ubuntu@$PROXMOX_HOST -p 2225"
echo " Prometheus: http://$PROXMOX_HOST:9090"
echo " Grafana: http://$PROXMOX_HOST:3001"
echo " Direct NAT: ssh -i $SSH_KEY ubuntu@10.0.0.13 (from Proxmox host)"
echo ""
log_info "To access VMs from Proxmox host:"
echo " ssh -i $SSH_KEY ubuntu@10.0.0.10 # VM 100"
echo " ssh -i $SSH_KEY ubuntu@10.0.0.11 # VM 101"
echo " ssh -i $SSH_KEY ubuntu@10.0.0.12 # VM 102"
echo " ssh -i $SSH_KEY ubuntu@10.0.0.13 # VM 103"
}
main() {
log_step "Setup NAT for VMs"
log_warn "This will:"
log_warn " 1. Create a NAT bridge (vmbr1) on Proxmox host"
log_warn " 2. Reconfigure VMs to use NAT network"
log_warn " 3. Setup port forwarding for SSH and services"
echo ""
read -p "Continue? (yes/no): " confirm
if [ "$confirm" != "yes" ]; then
log_info "Cancelled"
exit 0
fi
setup_nat_bridge
log_step "Step 2: Configuring VMs for NAT"
for vm_spec in "${VMS[@]}"; do
read -r vmid name nat_ip <<< "$vm_spec"
configure_vm_nat "$vmid" "$name" "$nat_ip" || log_warn "Failed to configure VM $vmid"
done
setup_port_forwarding
log_info "Rebooting VMs to apply network changes..."
ssh -i "$SSH_KEY" root@$PROXMOX_HOST "for vmid in 100 101 102 103; do qm reboot \$vmid 2>/dev/null || true; done"
log_info "Waiting 60 seconds for VMs to reboot..."
sleep 60
show_access_info
log_step "Testing NAT Access"
log_info "Testing SSH via port forwarding..."
if ssh -i "$SSH_KEY" -o ConnectTimeout=10 -p 2222 ubuntu@$PROXMOX_HOST "echo 'SSH OK' && hostname" &>/dev/null; then
log_info "✓ SSH via NAT is working!"
else
log_warn "SSH may need more time. Wait a few minutes and test again."
fi
}
main "$@"
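The same `vmid ext_port int_port` table that drives the DNAT rules can be rendered into ready-to-paste SSH commands — a small formatter sketch (the host value is the example IP used throughout these scripts):

```shell
# Render the port-forwarding table as SSH commands for the access summary.
PROXMOX_HOST="192.168.1.206"
PORT_MAPPINGS=(
  "100 2222 22"
  "101 2223 22"
  "102 2224 22"
  "103 2225 22"
)

for mapping in "${PORT_MAPPINGS[@]}"; do
  read -r vmid ext_port int_port <<< "$mapping"
  echo "VM $vmid: ssh -p $ext_port ubuntu@$PROXMOX_HOST  # -> guest port $int_port"
done
```

Keeping `show_access_info` generated from the table this way avoids the two copies drifting apart.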


@@ -0,0 +1,307 @@
#!/bin/bash
source ~/.bashrc
# Setup NAT for VMs AND Reconfigure with SSH Keys
# Combines NAT setup with cloud-init SSH key injection
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_step() {
echo ""
echo -e "${BLUE}========================================${NC}"
echo -e "${BLUE}$1${NC}"
echo -e "${BLUE}========================================${NC}"
echo ""
}
PROXMOX_HOST="${PROXMOX_ML110_IP:-192.168.1.206}"
SSH_KEY="${SSH_KEY:-$HOME/.ssh/id_ed25519_proxmox}"
SSH_KEY_FILE="$SSH_KEY.pub"
PVE_USERNAME="${PVE_USERNAME:-root@pam}"
PVE_PASSWORD="${PVE_ROOT_PASS:-}"
PROXMOX_URL="${PROXMOX_ML110_URL:-https://192.168.1.206:8006}"
PROXMOX_NODE="${PROXMOX_NODE:-pve}"
# NAT network configuration
NAT_NETWORK="10.0.0.0/24"
NAT_BRIDGE="vmbr1"
NAT_GATEWAY="10.0.0.1"
# VM definitions: vmid name nat_ip
VMS=(
"100 cloudflare-tunnel 10.0.0.10"
"101 k3s-master 10.0.0.11"
"102 git-server 10.0.0.12"
"103 observability 10.0.0.13"
)
get_api_token() {
local response=$(curl -s -k --connect-timeout 10 --max-time 15 \
-d "username=$PVE_USERNAME&password=$PVE_PASSWORD" \
"$PROXMOX_URL/api2/json/access/ticket" 2>&1)
if echo "$response" | grep -q '"data"'; then
local ticket=$(echo "$response" | grep -o '"ticket":"[^"]*' | cut -d'"' -f4)
local csrf_token=$(echo "$response" | grep -o '"CSRFPreventionToken":"[^"]*' | cut -d'"' -f4)
echo "$ticket|$csrf_token"
else
echo ""
fi
}
setup_nat_bridge() {
log_step "Step 1: Setting up NAT Bridge"
log_info "Creating NAT bridge $NAT_BRIDGE on Proxmox host..."
ssh -i "$SSH_KEY" root@$PROXMOX_HOST <<EOF
set -e
# Check if bridge already exists
if ip link show $NAT_BRIDGE &>/dev/null; then
echo "Bridge $NAT_BRIDGE already exists"
else
# Create bridge
cat >> /etc/network/interfaces <<INTERFACES
# NAT bridge for VMs
auto $NAT_BRIDGE
iface $NAT_BRIDGE inet static
address $NAT_GATEWAY
netmask 255.255.255.0
bridge_ports none
bridge_stp off
bridge_fd 0
post-up echo 1 > /proc/sys/net/ipv4/ip_forward
post-up iptables -t nat -A POSTROUTING -s $NAT_NETWORK -o vmbr0 -j MASQUERADE
post-up iptables -A FORWARD -s $NAT_NETWORK -j ACCEPT
post-up iptables -A FORWARD -d $NAT_NETWORK -j ACCEPT
INTERFACES
# Bring up bridge
ifup $NAT_BRIDGE
echo "✓ NAT bridge $NAT_BRIDGE created"
fi
# Enable IP forwarding
echo 1 > /proc/sys/net/ipv4/ip_forward
# Setup iptables rules (idempotent)
iptables -t nat -C POSTROUTING -s $NAT_NETWORK -o vmbr0 -j MASQUERADE 2>/dev/null || \
iptables -t nat -A POSTROUTING -s $NAT_NETWORK -o vmbr0 -j MASQUERADE
iptables -C FORWARD -s $NAT_NETWORK -j ACCEPT 2>/dev/null || \
iptables -A FORWARD -s $NAT_NETWORK -j ACCEPT
iptables -C FORWARD -d $NAT_NETWORK -j ACCEPT 2>/dev/null || \
iptables -A FORWARD -d $NAT_NETWORK -j ACCEPT
echo "✓ NAT rules configured"
EOF
log_info "✓ NAT bridge configured"
}
configure_vm_nat_with_ssh() {
local vmid=$1
local name=$2
local nat_ip=$3
log_info "Configuring VM $vmid ($name) with NAT IP $nat_ip and SSH keys..."
if [ ! -f "$SSH_KEY_FILE" ]; then
log_error "SSH key file not found: $SSH_KEY_FILE"
return 1
fi
# PVE validates sshkeys as OpenSSH public-key text, so a base64 blob is
# rejected; the value must also be URL-encoded once itself before curl
# form-encodes it again (a long-standing quirk of the /config endpoint)
local ssh_key_encoded=$(python3 -c "import urllib.parse, sys; print(urllib.parse.quote(open(sys.argv[1]).read().strip(), safe=''))" "$SSH_KEY_FILE")
local tokens=$(get_api_token)
if [ -z "$tokens" ]; then
log_error "Failed to authenticate with Proxmox"
return 1
fi
local ticket=$(echo "$tokens" | cut -d'|' -f1)
local csrf_token=$(echo "$tokens" | cut -d'|' -f2)
# Update VM network to use NAT bridge
curl -s -k -X POST \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
-d "net0=virtio,bridge=$NAT_BRIDGE" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/config" > /dev/null
# Configure cloud-init with NAT IP and SSH keys
curl -s -k -X POST \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
--data-urlencode "ipconfig0=ip=$nat_ip/24,gw=$NAT_GATEWAY" \
--data-urlencode "ciuser=ubuntu" \
--data-urlencode "sshkeys=$ssh_key_encoded" \
--data-urlencode "agent=1" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/config" > /dev/null
log_info "✓ VM $vmid configured for NAT with SSH keys"
}
setup_port_forwarding() {
log_step "Step 3: Setting up Port Forwarding"
log_info "Setting up port forwarding rules..."
ssh -i "$SSH_KEY" root@$PROXMOX_HOST <<'EOF'
set -e
# Get NAT IPs for VMs
declare -A VM_NAT_IPS=(
["100"]="10.0.0.10"
["101"]="10.0.0.11"
["102"]="10.0.0.12"
["103"]="10.0.0.13"
)
# Port forwarding rules
# Format: vmid external_port internal_port
PORT_MAPPINGS=(
"100 2222 22"
"101 2223 22"
"102 2224 22"
"103 2225 22"
"102 3000 3000"
"103 9090 9090"
"103 3001 3000"
)
for mapping in "${PORT_MAPPINGS[@]}"; do
read -r vmid ext_port int_port <<< "$mapping"
nat_ip="${VM_NAT_IPS[$vmid]}"
# Check if rule exists
if iptables -t nat -C PREROUTING -p tcp --dport $ext_port -j DNAT --to-destination $nat_ip:$int_port 2>/dev/null; then
echo "Port forwarding $ext_port -> $nat_ip:$int_port already exists"
else
# Add port forwarding
iptables -t nat -A PREROUTING -p tcp --dport $ext_port -j DNAT --to-destination $nat_ip:$int_port
iptables -A FORWARD -p tcp -d $nat_ip --dport $int_port -j ACCEPT
echo "✓ Port forwarding: $PROXMOX_HOST:$ext_port -> $nat_ip:$int_port"
fi
done
# Save iptables rules
if command -v netfilter-persistent &>/dev/null; then
netfilter-persistent save
elif [ -f /etc/iptables/rules.v4 ]; then
iptables-save > /etc/iptables/rules.v4
fi
echo "✓ Port forwarding configured"
EOF
log_info "✓ Port forwarding configured"
}
main() {
log_step "Setup NAT with SSH Keys"
if [ ! -f "$SSH_KEY_FILE" ]; then
log_error "SSH key file not found: $SSH_KEY_FILE"
exit 1
fi
log_warn "This will:"
log_warn " 1. Create a NAT bridge (vmbr1) on Proxmox host"
log_warn " 2. Reconfigure VMs to use NAT network"
log_warn " 3. Inject SSH keys via cloud-init"
log_warn " 4. Setup port forwarding for SSH and services"
echo ""
read -p "Continue? (yes/no): " confirm
if [ "$confirm" != "yes" ]; then
log_info "Cancelled"
exit 0
fi
setup_nat_bridge
log_step "Step 2: Configuring VMs for NAT with SSH Keys"
for vm_spec in "${VMS[@]}"; do
read -r vmid name nat_ip <<< "$vm_spec"
configure_vm_nat_with_ssh "$vmid" "$name" "$nat_ip" || log_warn "Failed to configure VM $vmid"
done
setup_port_forwarding
log_info "Rebooting VMs to apply network and SSH key changes..."
ssh -i "$SSH_KEY" root@$PROXMOX_HOST "for vmid in 100 101 102 103; do qm reboot \$vmid 2>/dev/null || true; done"
log_info "Waiting 90 seconds for VMs to reboot and apply cloud-init..."
sleep 90
log_step "Testing Access"
log_info "Testing SSH via port forwarding..."
local all_ok=true
for port in 2222 2223 2224 2225; do
local vm_name=$(echo $port | sed 's/2222/cloudflare-tunnel/;s/2223/k3s-master/;s/2224/git-server/;s/2225/observability/')
if ssh -i "$SSH_KEY" -o ConnectTimeout=10 -p $port ubuntu@$PROXMOX_HOST "echo 'SSH OK' && hostname" &>/dev/null; then
log_info "$vm_name (port $port): SSH working"
else
log_warn "$vm_name (port $port): SSH not working yet (may need more time)"
all_ok=false
fi
done
if [ "$all_ok" = true ]; then
log_info ""
log_info "✓ All VMs accessible via NAT with SSH!"
else
log_warn ""
log_warn "Some VMs may need more time. Wait a few minutes and test again."
fi
log_step "Access Information"
log_info "VM Access:"
echo " VM 100: ssh -i $SSH_KEY -p 2222 ubuntu@$PROXMOX_HOST"
echo " VM 101: ssh -i $SSH_KEY -p 2223 ubuntu@$PROXMOX_HOST"
echo " VM 102: ssh -i $SSH_KEY -p 2224 ubuntu@$PROXMOX_HOST"
echo " VM 103: ssh -i $SSH_KEY -p 2225 ubuntu@$PROXMOX_HOST"
echo ""
log_info "From Proxmox host:"
echo " ssh -i $SSH_KEY ubuntu@10.0.0.10 # VM 100"
echo " ssh -i $SSH_KEY ubuntu@10.0.0.11 # VM 101"
echo " ssh -i $SSH_KEY ubuntu@10.0.0.12 # VM 102"
echo " ssh -i $SSH_KEY ubuntu@10.0.0.13 # VM 103"
}
main "$@"
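The `get_api_token` helper above parses the Proxmox ticket response with `grep` and `cut` rather than a JSON parser. The pattern can be exercised in isolation against a canned response (the token values here are made up):

```shell
# Illustrative only: a response shaped like /api2/json/access/ticket output
response='{"data":{"ticket":"PVE:root@pam:ABCDEF","CSRFPreventionToken":"1234:XYZ"}}'
ticket=$(echo "$response" | grep -o '"ticket":"[^"]*' | cut -d'"' -f4)
csrf_token=$(echo "$response" | grep -o '"CSRFPreventionToken":"[^"]*' | cut -d'"' -f4)
echo "$ticket|$csrf_token"   # PVE:root@pam:ABCDEF|1234:XYZ
```

This is more brittle than `jq` (it assumes the field names appear exactly once), but avoids a dependency on the Proxmox host.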

scripts/fix/switch-vms-to-dhcp.sh Executable file
#!/bin/bash
source ~/.bashrc
# Switch VMs from Static IPs to DHCP
# Removes static IP configuration and lets VMs get IPs from DHCP
# Then uses QEMU Guest Agent to discover IPs dynamically
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_step() {
echo ""
echo -e "${BLUE}========================================${NC}"
echo -e "${BLUE}$1${NC}"
echo -e "${BLUE}========================================${NC}"
echo ""
}
PVE_USERNAME="${PVE_USERNAME:-root@pam}"
PVE_PASSWORD="${PVE_ROOT_PASS:-}"
PROXMOX_URL="${PROXMOX_ML110_URL:-https://192.168.1.206:8006}"
PROXMOX_NODE="${PROXMOX_NODE:-pve}"
SSH_KEY="${SSH_KEY:-$HOME/.ssh/id_ed25519_proxmox}"
SSH_KEY_FILE="$SSH_KEY.pub"
# VM definitions: vmid name
VMS=(
"100 cloudflare-tunnel"
"101 k3s-master"
"102 git-server"
"103 observability"
)
get_api_token() {
local response=$(curl -s -k --connect-timeout 10 --max-time 15 \
-d "username=$PVE_USERNAME&password=$PVE_PASSWORD" \
"$PROXMOX_URL/api2/json/access/ticket" 2>&1)
if echo "$response" | grep -q '"data"'; then
local ticket=$(echo "$response" | grep -o '"ticket":"[^"]*' | cut -d'"' -f4)
local csrf_token=$(echo "$response" | grep -o '"CSRFPreventionToken":"[^"]*' | cut -d'"' -f4)
echo "$ticket|$csrf_token"
else
echo ""
fi
}
switch_vm_to_dhcp() {
local vmid=$1
local name=$2
log_info "Switching VM $vmid ($name) to DHCP..."
local tokens=$(get_api_token)
if [ -z "$tokens" ]; then
log_error "Failed to authenticate with Proxmox"
return 1
fi
local ticket=$(echo "$tokens" | cut -d'|' -f1)
local csrf_token=$(echo "$tokens" | cut -d'|' -f2)
# Remove static IP configuration (set to DHCP)
# Remove ipconfig0 to let cloud-init use DHCP
curl -s -k -X DELETE \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/config/ipconfig0" > /dev/null 2>&1 || true
# Ensure cloud-init is configured for DHCP
# Set ciuser if not set
curl -s -k -X POST \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
-d "ciuser=ubuntu" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/config" > /dev/null 2>&1 || true
# Add SSH keys if not already configured
if [ -f "$SSH_KEY_FILE" ]; then
# PVE expects URL-encoded OpenSSH key text here, not base64; encode once
# ourselves and let --data-urlencode apply the form-level encoding
local ssh_key_encoded=$(python3 -c "import urllib.parse, sys; print(urllib.parse.quote(open(sys.argv[1]).read().strip(), safe=''))" "$SSH_KEY_FILE")
curl -s -k -X POST \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
--data-urlencode "sshkeys=$ssh_key_encoded" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/config" > /dev/null 2>&1 || true
fi
log_info "✓ VM $vmid configured for DHCP"
}
discover_vm_ips() {
log_step "Step 3: Discovering VM IPs via QEMU Guest Agent"
log_info "Waiting for VMs to get DHCP IPs and start guest agent..."
sleep 30
log_info "Discovering IPs..."
source "$PROJECT_ROOT/scripts/lib/proxmox_vm_helpers.sh" 2>/dev/null || {
log_error "Helper library not found"
return 1
}
local all_ok=true
for vm_spec in "${VMS[@]}"; do
read -r vmid name <<< "$vm_spec"
local ip
ip="$(get_vm_ip_from_guest_agent "$vmid" 2>/dev/null || true)"
if [[ -n "$ip" ]]; then
log_info " ✓ VM $vmid ($name): $ip"
else
log_warn " ✗ VM $vmid ($name): IP not discovered yet (guest agent may need more time)"
all_ok=false
fi
done
if [ "$all_ok" = false ]; then
log_warn ""
log_warn "Some VMs may need more time. Wait a few minutes and check again:"
log_info " ssh root@192.168.1.206"
log_info " source /home/intlc/projects/loc_az_hci/scripts/lib/proxmox_vm_helpers.sh"
log_info " get_vm_ip_from_guest_agent 100"
fi
}
main() {
log_step "Switch VMs from Static IPs to DHCP"
log_warn "This will:"
log_warn " 1. Remove static IP configuration from all VMs"
log_warn " 2. Configure VMs to use DHCP"
log_warn " 3. Add SSH keys via cloud-init"
log_warn " 4. Reboot VMs to apply changes"
log_warn ""
log_warn "VMs will get IPs from your router's DHCP server"
log_warn "IPs will be discovered via QEMU Guest Agent"
echo ""
read -p "Continue? (yes/no): " confirm
if [ "$confirm" != "yes" ]; then
log_info "Cancelled"
exit 0
fi
log_step "Step 1: Switching VMs to DHCP"
for vm_spec in "${VMS[@]}"; do
read -r vmid name <<< "$vm_spec"
switch_vm_to_dhcp "$vmid" "$name" || log_warn "Failed to configure VM $vmid"
done
log_step "Step 2: Rebooting VMs"
log_info "Rebooting VMs to apply DHCP configuration..."
local tokens=$(get_api_token)
local ticket=$(echo "$tokens" | cut -d'|' -f1)
local csrf_token=$(echo "$tokens" | cut -d'|' -f2)
for vm_spec in "${VMS[@]}"; do
read -r vmid name <<< "$vm_spec"
log_info "Rebooting VM $vmid..."
curl -s -k -X POST \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/status/reboot" > /dev/null 2>&1 || true
done
discover_vm_ips
log_step "Summary"
log_info "✓ VMs switched to DHCP"
log_info "✓ SSH keys configured via cloud-init"
log_info "✓ IPs will be discovered via QEMU Guest Agent"
log_info ""
log_info "All your scripts already support dynamic IP discovery!"
log_info "They use get_vm_ip_from_guest_agent() to find IPs automatically."
log_info ""
log_info "Test SSH access (after IPs are discovered):"
log_info " ./scripts/ops/ssh-test-all.sh"
}
main "$@"
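The `.env` loader used by these scripts strips comments, blank lines, and non-assignments before sourcing. Its filter pipeline can be checked on sample text (the variable names are placeholders):

```shell
# Sample .env content run through the same filter chain the scripts use
sample='# comment line
FOO=bar
NOT_AN_ASSIGNMENT
BAZ=qux'
filtered=$(printf '%s\n' "$sample" | grep -v '^#' | grep -v '^$' | sed 's/#.*$//' | grep '=')
echo "$filtered"
```

Only `FOO=bar` and `BAZ=qux` survive; the comment is dropped by the first `grep -v` and the bare word by the final `grep '='`.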

#!/bin/bash
source ~/.bashrc
# Check Azure Arc Health
# Verifies Azure Arc agent connectivity and resource status
set -e
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
check_azure_cli() {
if ! command -v az &> /dev/null; then
log_warn "Azure CLI not found, skipping Azure Arc check"
return 0
fi
if ! az account show &> /dev/null; then
log_warn "Azure CLI not authenticated, skipping Azure Arc check"
return 0
fi
log_info "✓ Azure CLI authenticated"
return 0
}
check_arc_resources() {
local resource_group="${RESOURCE_GROUP:-HC-Stack}"
if az connectedmachine list --resource-group "$resource_group" &> /dev/null; then
local count=$(az connectedmachine list --resource-group "$resource_group" --query "length(@)" -o tsv 2>/dev/null || echo "0")
log_info "✓ Azure Arc resources found: $count machine(s)"
return 0
else
log_warn "⚠ Azure Arc resources not found (may not be deployed)"
return 0
fi
}
main() {
check_azure_cli
check_arc_resources
exit 0
}
main "$@"

#!/bin/bash
source ~/.bashrc
# Check Kubernetes Health
# Verifies Kubernetes cluster health and node status
set -e
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
check_kubectl() {
if ! command -v kubectl &> /dev/null; then
log_warn "kubectl not found, skipping Kubernetes check"
return 0
fi
if ! kubectl cluster-info &> /dev/null; then
log_error "Kubernetes cluster not accessible"
return 1
fi
log_info "✓ Kubernetes cluster accessible"
return 0
}
check_nodes() {
if ! command -v kubectl &> /dev/null; then
return 0
fi
local nodes=$(kubectl get nodes --no-headers 2>/dev/null | wc -l)
local ready_nodes=$(kubectl get nodes --no-headers 2>/dev/null | grep -c " Ready ")  # grep -c already prints 0 on no match; "|| echo 0" would emit a second line
if [ "$nodes" -gt 0 ]; then
log_info "✓ Nodes: $ready_nodes/$nodes ready"
if [ "$ready_nodes" -eq "$nodes" ]; then
return 0
else
log_warn "⚠ Some nodes are not ready"
return 1
fi
else
log_warn "⚠ No nodes found"
return 0
fi
}
main() {
local all_healthy=true
if ! check_kubectl; then
all_healthy=false
fi
if ! check_nodes; then
all_healthy=false
fi
if [ "$all_healthy" = true ]; then
exit 0
else
exit 1
fi
}
main "$@"
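`check_nodes` counts ready nodes by grepping for a space-delimited `Ready` column; `NotReady` lines do not match because the pattern requires a space immediately before the word. A sketch with canned `kubectl get nodes --no-headers` output (hostnames are made up):

```shell
# Two canned node lines: one Ready, one NotReady
sample='node-a   Ready      control-plane   5d   v1.29.0
node-b   NotReady   worker          5d   v1.29.0'
ready=$(printf '%s\n' "$sample" | grep -c " Ready ")
echo "$ready"   # 1
```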

#!/bin/bash
source ~/.bashrc
# Check Proxmox Health
# Verifies Proxmox cluster and node health
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
check_proxmox_api() {
local url=$1
local name=$2
if [ -z "$url" ]; then
return 1
fi
local host_ip=$(echo "$url" | sed -E 's|https?://([^:]+).*|\1|')
local password="${PVE_ROOT_PASS:-}"
if [ -z "$password" ]; then
log_warn "PVE_ROOT_PASS not set, skipping API check"
return 0
fi
# Test API connection
local response=$(curl -s -k --connect-timeout 5 --max-time 10 \
-d "username=root@pam&password=$password" \
"$url/api2/json/access/ticket" 2>&1)
if echo "$response" | grep -q '"data"'; then
log_info "$name: API accessible"
return 0
else
log_error "$name: API not accessible"
return 1
fi
}
main() {
local all_healthy=true
# Check ML110
if [ -n "${PROXMOX_ML110_URL:-}" ]; then
if ! check_proxmox_api "$PROXMOX_ML110_URL" "ML110"; then
all_healthy=false
fi
fi
# Check R630
if [ -n "${PROXMOX_R630_URL:-}" ]; then
if ! check_proxmox_api "$PROXMOX_R630_URL" "R630"; then
all_healthy=false
fi
fi
if [ "$all_healthy" = true ]; then
exit 0
else
exit 1
fi
}
main "$@"
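`check_proxmox_api` derives the host portion of the API URL with a `sed` expression that accepts both `http` and `https` and drops the port. In isolation:

```shell
# Extract the bare host from a Proxmox API URL (same sed as check_proxmox_api)
url="https://192.168.1.206:8006"
host_ip=$(echo "$url" | sed -E 's|https?://([^:]+).*|\1|')
echo "$host_ip"   # 192.168.1.206
```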

#!/bin/bash
source ~/.bashrc
# Check Services Health
# Verifies HC Stack services are running
set -e
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
check_services() {
if ! command -v kubectl &> /dev/null; then
log_warn "kubectl not found, skipping service checks"
return 0
fi
local services=("besu" "firefly" "chainlink-ccip" "blockscout" "cacti" "nginx-proxy")
local found=0
local running=0
for service in "${services[@]}"; do
# kubectl rejects a resource name combined with --all-namespaces,
# so match the deployment by name with a field selector instead
local match
match=$(kubectl get deployment --all-namespaces --field-selector "metadata.name=$service" -o name 2>/dev/null || true)
if [ -n "$match" ]; then
found=$((found + 1))
local ready=$(kubectl get deployment --all-namespaces --field-selector "metadata.name=$service" -o jsonpath='{.items[0].status.readyReplicas}' 2>/dev/null)
local desired=$(kubectl get deployment --all-namespaces --field-selector "metadata.name=$service" -o jsonpath='{.items[0].status.replicas}' 2>/dev/null)
ready=${ready:-0}
desired=${desired:-0}
if [ "$ready" -eq "$desired" ] && [ "$desired" -gt 0 ]; then
running=$((running + 1))
log_info "✓ $service: Running ($ready/$desired)"
else
log_warn "⚠ $service: Not fully running ($ready/$desired)"
fi
fi
done
if [ $found -eq 0 ]; then
log_warn "⚠ No HC Stack services found (may not be deployed)"
return 0
fi
if [ $running -eq $found ]; then
log_info "✓ All services running: $running/$found"
return 0
else
log_warn "⚠ Some services not running: $running/$found"
return 1
fi
}
main() {
check_services
exit $?
}
main "$@"
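The per-deployment readiness test compares `readyReplicas` against `replicas`; since jsonpath prints nothing when a field is unset, a defaulting pattern such as `${var:-0}` keeps the numeric comparison safe. In isolation:

```shell
ready=""          # jsonpath prints nothing when readyReplicas is unset
desired="2"
if [ "${ready:-0}" -eq "${desired:-0}" ] && [ "${desired:-0}" -gt 0 ]; then
    state="running"
else
    state="degraded"
fi
echo "$state"   # degraded
```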

#!/bin/bash
source ~/.bashrc
# Health Check All
# Comprehensive health check for all infrastructure components
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_check() {
echo -e "${BLUE}[CHECK]${NC} $1"
}
check_component() {
local check_script=$1
local component_name=$2
if [ -f "$check_script" ] && [ -x "$check_script" ]; then
log_check "Checking $component_name..."
if "$check_script"; then
log_info "$component_name: Healthy"
return 0
else
log_error "$component_name: Unhealthy"
return 1
fi
else
log_warn "$component_name: Check script not found"
return 0
fi
}
main() {
echo "========================================="
echo "Infrastructure Health Check"
echo "========================================="
echo ""
local healthy=0
local unhealthy=0
# Check Proxmox
check_component "$PROJECT_ROOT/scripts/health/check-proxmox-health.sh" "Proxmox" && healthy=$((healthy + 1)) || unhealthy=$((unhealthy + 1))
# Check Azure Arc
check_component "$PROJECT_ROOT/scripts/health/check-azure-arc-health.sh" "Azure Arc" && healthy=$((healthy + 1)) || unhealthy=$((unhealthy + 1))
# Check Kubernetes
check_component "$PROJECT_ROOT/scripts/health/check-kubernetes-health.sh" "Kubernetes" && healthy=$((healthy + 1)) || unhealthy=$((unhealthy + 1))
# Check Services
check_component "$PROJECT_ROOT/scripts/health/check-services-health.sh" "Services" && healthy=$((healthy + 1)) || unhealthy=$((unhealthy + 1))
echo ""
echo "========================================="
echo "Health Check Summary"
echo "========================================="
log_info "Healthy: $healthy"
log_error "Unhealthy: $unhealthy"
echo ""
if [ $unhealthy -eq 0 ]; then
log_info "✓ All components are healthy"
exit 0
else
log_error "$unhealthy component(s) are unhealthy"
exit 1
fi
}
main "$@"
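`health-check-all` tallies results with the `cmd && a || b` idiom. Note the known pitfall: `b` also runs if `a` itself fails — safe here only because arithmetic assignments cannot fail. In miniature:

```shell
# One passing check and one failing check through the same tally idiom
healthy=0; unhealthy=0
true  && healthy=$((healthy + 1)) || unhealthy=$((unhealthy + 1))
false && healthy=$((healthy + 1)) || unhealthy=$((unhealthy + 1))
echo "healthy=$healthy unhealthy=$unhealthy"   # healthy=1 unhealthy=1
```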

#!/bin/bash
source ~/.bashrc
# Query Detailed Proxmox Status
# Queries cluster, storage, network, VMs, and Azure Arc status from both servers
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
PVE_USERNAME="${PVE_USERNAME:-root@pam}"
PVE_PASSWORD="${PVE_ROOT_PASS:-}"
PROXMOX_ML110_URL="${PROXMOX_ML110_URL:-}"
PROXMOX_R630_URL="${PROXMOX_R630_URL:-}"
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_section() {
echo -e "\n${BLUE}=== $1 ===${NC}"
}
get_api_token() {
local url=$1
local response=$(curl -s -k --connect-timeout 10 --max-time 15 \
-d "username=$PVE_USERNAME&password=$PVE_PASSWORD" \
"$url/api2/json/access/ticket" 2>&1)
if echo "$response" | grep -q '"data"'; then
local ticket=$(echo "$response" | grep -o '"ticket":"[^"]*' | cut -d'"' -f4)
local csrf_token=$(echo "$response" | grep -o '"CSRFPreventionToken":"[^"]*' | cut -d'"' -f4)
echo "$ticket|$csrf_token"
else
echo ""
fi
}
query_cluster_status() {
local url=$1
local name=$2
local tokens=$3
local ticket=$(echo "$tokens" | cut -d'|' -f1)
local csrf_token=$(echo "$tokens" | cut -d'|' -f2)
log_section "Cluster Status - $name"
local cluster_response=$(curl -s -k -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$url/api2/json/cluster/status" 2>&1)
if echo "$cluster_response" | grep -q '"data"'; then
echo "$cluster_response" | python3 -m json.tool 2>/dev/null || echo "$cluster_response"
else
log_warn "Not in a cluster or cluster API not accessible"
fi
# Get nodes
local nodes_response=$(curl -s -k -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$url/api2/json/nodes" 2>&1)
if echo "$nodes_response" | grep -q '"data"'; then
echo ""
log_info "Nodes:"
echo "$nodes_response" | python3 -c "import sys, json; data=json.load(sys.stdin); [print(f\" - {n['node']}: {n.get('status', 'unknown')}\") for n in data.get('data', [])]" 2>/dev/null || echo "$nodes_response"
fi
}
query_storage_status() {
local url=$1
local name=$2
local tokens=$3
local ticket=$(echo "$tokens" | cut -d'|' -f1)
local csrf_token=$(echo "$tokens" | cut -d'|' -f2)
log_section "Storage Status - $name"
local storage_response=$(curl -s -k -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$url/api2/json/storage" 2>&1)
if echo "$storage_response" | grep -q '"data"'; then
echo "$storage_response" | python3 -c "
import sys, json
data = json.load(sys.stdin)
storages = data.get('data', [])
if storages:
print('Storage Pools:')
for s in storages:
print(f\" - {s.get('storage', 'unknown')}: {s.get('type', 'unknown')} ({s.get('content', '')}) - {s.get('status', 'unknown')}\")
else:
print('No storage pools found')
" 2>/dev/null || echo "$storage_response"
else
log_warn "Could not query storage"
fi
}
query_vm_inventory() {
local url=$1
local name=$2
local tokens=$3
local ticket=$(echo "$tokens" | cut -d'|' -f1)
local csrf_token=$(echo "$tokens" | cut -d'|' -f2)
log_section "VM Inventory - $name"
# Get all nodes first
local nodes_response=$(curl -s -k -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$url/api2/json/nodes" 2>&1)
if echo "$nodes_response" | grep -q '"data"'; then
local nodes=$(echo "$nodes_response" | python3 -c "import sys, json; data=json.load(sys.stdin); print(' '.join([n['node'] for n in data.get('data', [])]))" 2>/dev/null)
for node in $nodes; do
local vms_response=$(curl -s -k -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$url/api2/json/nodes/$node/qemu" 2>&1)
if echo "$vms_response" | grep -q '"data"'; then
echo ""
log_info "VMs on node $node:"
echo "$vms_response" | python3 -c "
import sys, json
data = json.load(sys.stdin)
vms = data.get('data', [])
if vms:
for vm in vms:
vmid = vm.get('vmid', 'unknown')
name = vm.get('name', 'unknown')
status = vm.get('status', 'unknown')
        print(f\" - VM {vmid}: {name} (Status: {status})\")
else:
print(' No VMs found')
" 2>/dev/null || echo "$vms_response"
fi
done
fi
}
query_server_info() {
local url=$1
local name=$2
local tokens=$3
local ticket=$(echo "$tokens" | cut -d'|' -f1)
local csrf_token=$(echo "$tokens" | cut -d'|' -f2)
log_section "Server Information - $name"
local nodes_response=$(curl -s -k -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$url/api2/json/nodes" 2>&1)
if echo "$nodes_response" | grep -q '"data"'; then
local nodes=$(echo "$nodes_response" | python3 -c "import sys, json; data=json.load(sys.stdin); print(' '.join([n['node'] for n in data.get('data', [])]))" 2>/dev/null)
for node in $nodes; do
local node_status=$(curl -s -k -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$url/api2/json/nodes/$node/status" 2>&1)
if echo "$node_status" | grep -q '"data"'; then
echo ""
log_info "Node: $node"
echo "$node_status" | python3 -c "
import sys, json
data = json.load(sys.stdin)
info = data.get('data', {})
print(f\" Uptime: {info.get('uptime', 0) // 3600} hours\")
print(f\" CPU Usage: {info.get('cpu', 0) * 100:.1f}%\")
print(f\" Memory: {info.get('memory', {}).get('used', 0) // 1024 // 1024 // 1024}GB / {info.get('memory', {}).get('total', 0) // 1024 // 1024 // 1024}GB\")
print(f\" Root Disk: {info.get('rootfs', {}).get('used', 0) // 1024 // 1024 // 1024}GB / {info.get('rootfs', {}).get('total', 0) // 1024 // 1024 // 1024}GB\")
" 2>/dev/null || echo "$node_status"
fi
done
fi
}
main() {
echo "========================================="
echo "Proxmox VE Detailed Status Query"
echo "========================================="
echo ""
if [ -z "$PVE_PASSWORD" ]; then
log_error "PVE_ROOT_PASS not set in .env"
exit 1
fi
# Query ML110
if [ -n "$PROXMOX_ML110_URL" ]; then
log_info "Querying ML110 (HPE ML110 Gen9)..."
local ml110_tokens=$(get_api_token "$PROXMOX_ML110_URL")
if [ -n "$ml110_tokens" ]; then
query_server_info "$PROXMOX_ML110_URL" "ML110" "$ml110_tokens"
query_cluster_status "$PROXMOX_ML110_URL" "ML110" "$ml110_tokens"
query_storage_status "$PROXMOX_ML110_URL" "ML110" "$ml110_tokens"
query_vm_inventory "$PROXMOX_ML110_URL" "ML110" "$ml110_tokens"
else
log_error "Failed to authenticate with ML110"
fi
fi
echo ""
echo "----------------------------------------"
echo ""
# Query R630
if [ -n "$PROXMOX_R630_URL" ]; then
log_info "Querying R630 (Dell R630)..."
local r630_tokens=$(get_api_token "$PROXMOX_R630_URL")
if [ -n "$r630_tokens" ]; then
query_server_info "$PROXMOX_R630_URL" "R630" "$r630_tokens"
query_cluster_status "$PROXMOX_R630_URL" "R630" "$r630_tokens"
query_storage_status "$PROXMOX_R630_URL" "R630" "$r630_tokens"
query_vm_inventory "$PROXMOX_R630_URL" "R630" "$r630_tokens"
else
log_error "Failed to authenticate with R630"
fi
fi
echo ""
log_info "Status query completed"
}
main "$@"
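The status queries lean on `python3 -c` one-liners to parse API payloads. The node-listing snippet can be tried against a canned response (node names are illustrative):

```shell
# Canned /api2/json/nodes payload fed through the same one-liner
nodes_response='{"data":[{"node":"pve","status":"online"},{"node":"pve2","status":"offline"}]}'
names=$(echo "$nodes_response" | python3 -c "import sys, json; data=json.load(sys.stdin); print(' '.join([n['node'] for n in data.get('data', [])]))")
echo "$names"   # pve pve2
```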

#!/bin/bash
source ~/.bashrc
# Add SSH key to R630 (192.168.1.49) to enable key-based authentication
# This script attempts to add the SSH key via Proxmox API or provides instructions
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() { echo -e "${GREEN}[INFO]${NC} $1"; }
log_warn() { echo -e "${YELLOW}[WARN]${NC} $1"; }
log_error() { echo -e "${RED}[ERROR]${NC} $1"; }
SSH_KEY="${SSH_KEY:-$HOME/.ssh/id_ed25519_proxmox}"
SSH_KEY_PUB="${SSH_KEY}.pub"
R630_IP="192.168.1.49"
if [ ! -f "$SSH_KEY_PUB" ]; then
log_error "SSH public key not found: $SSH_KEY_PUB"
exit 1
fi
SSH_KEY_CONTENT=$(cat "$SSH_KEY_PUB")
log_info "Adding SSH key to R630 (192.168.1.49)..."
log_info "SSH Key: $SSH_KEY_PUB"
echo ""
# Try to add key via ssh-copy-id if password auth works
log_info "Attempting to add SSH key using ssh-copy-id..."
if ssh-copy-id -i "$SSH_KEY_PUB" -o StrictHostKeyChecking=no "root@${R630_IP}" 2>/dev/null; then
log_info "✓ SSH key added successfully via ssh-copy-id"
log_info "Testing SSH connection..."
if ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no -o ConnectTimeout=5 "root@${R630_IP}" "echo 'SSH key authentication working!'" 2>/dev/null; then
log_info "✓ SSH key authentication confirmed!"
exit 0
fi
fi
# If ssh-copy-id failed, provide manual instructions
log_warn "ssh-copy-id failed (password auth may be disabled)"
echo ""
log_info "Manual steps to add SSH key:"
echo ""
log_info "1. Access R630 console/web terminal:"
log_info " - Open https://192.168.1.49:8006"
log_info " - Go to: Shell (or use console)"
echo ""
log_info "2. Run this command on R630:"
echo ""
echo "mkdir -p ~/.ssh && chmod 700 ~/.ssh && echo '${SSH_KEY_CONTENT}' >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys && echo 'SSH key added'"
echo ""
log_info "3. Or copy this one-liner and paste on R630:"
echo ""
echo "echo '${SSH_KEY_CONTENT}' >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys && chmod 700 ~/.ssh"
echo ""
log_info "4. After adding the key, test connection:"
log_info " ssh -i $SSH_KEY root@${R630_IP}"

#!/bin/bash
source ~/.bashrc
# Auto-Complete Template Setup and VM Recreation
# Monitors for template creation and automatically recreates VMs
set -e
# Colors
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
CYAN='\033[0;36m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_step() {
echo -e "${BLUE}[STEP]${NC} $1"
}
log_header() {
echo -e "${CYAN}========================================${NC}"
echo -e "${CYAN}$1${NC}"
echo -e "${CYAN}========================================${NC}"
}
# Load environment variables
if [ -f .env ]; then
set -a
source <(grep -v '^#' .env | grep -v '^$' | sed 's/#.*$//' | sed 's/^[[:space:]]*//;s/[[:space:]]*$//' | grep '=')
set +a
fi
PVE_USERNAME="${PVE_USERNAME:-root@pam}"
PVE_PASSWORD="${PVE_ROOT_PASS:-}"
PROXMOX_URL="https://192.168.1.206:8006"
PROXMOX_NODE="pve"
TEMPLATE_ID=9000
check_template() {
local response=$(curl -k -s -d "username=$PVE_USERNAME&password=$PVE_PASSWORD" \
"$PROXMOX_URL/api2/json/access/ticket" 2>/dev/null)
local ticket=$(echo "$response" | grep -o '"ticket":"[^"]*' | cut -d'"' -f4)
local csrf=$(echo "$response" | grep -o '"CSRFPreventionToken":"[^"]*' | cut -d'"' -f4)
if [ -z "$ticket" ] || [ -z "$csrf" ]; then
return 1
fi
local config=$(curl -k -s \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$TEMPLATE_ID/config" 2>&1)
# Check if it exists and is a template
if echo "$config" | grep -q '"name"' && echo "$config" | grep -q '"template".*1'; then
return 0
else
return 1
fi
}
main() {
log_header "Auto-Complete Template Setup"
echo ""
log_step "Step 1: Template Creation (Manual - Required)"
echo ""
log_info "Please complete these steps in Proxmox Web UI:"
echo ""
echo "1. Upload Cloud Image:"
echo " • Proxmox → Storage → local → Upload"
echo " • File: ./downloads/ubuntu-24.04-server-cloudimg-amd64.img"
echo ""
echo "2. Create VM 9000:"
echo " • Create VM (ID: 9000, Name: ubuntu-24.04-cloudinit)"
echo " • Import disk from uploaded image"
echo " • Configure Cloud-Init (User: ubuntu, SSH key)"
echo ""
echo "3. Convert to Template:"
echo " • Right-click VM 9000 → Convert to Template"
echo ""
log_info "See: QUICK_TEMPLATE_GUIDE.md for detailed steps"
echo ""
log_step "Step 2: Monitoring for Template"
log_info "Checking every 10 seconds for template creation..."
echo ""
local check_count=0
local max_checks=180 # 30 minutes
while [ $check_count -lt $max_checks ]; do
check_count=$((check_count + 1))
if check_template; then
echo ""
log_info "✓ Template detected! Proceeding with VM recreation..."
echo ""
# Run VM recreation
export SSH_KEY="$HOME/.ssh/id_rsa"
export SSH_USER="ubuntu"
./scripts/recreate-vms-from-template.sh
exit $?
fi
if [ $((check_count % 6)) -eq 0 ]; then
echo -n "."
fi
sleep 10
done
echo ""
log_info "Template not detected after 30 minutes"
log_info "Please create template manually, then run:"
echo " ./scripts/check-and-recreate.sh"
}
main "$@"
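The polling loop's 30-minute budget follows from `max_checks` times the 10-second sleep; as a quick sanity check of that arithmetic:

```shell
max_checks=180
interval_seconds=10
minutes=$(( max_checks * interval_seconds / 60 ))
echo "$minutes"   # 30
```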

#!/bin/bash
source ~/.bashrc
# Complete Automation Script
# Handles all setup steps with prerequisite checking
set -e
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
CYAN='\033[0;36m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_step() {
echo -e "${BLUE}[STEP]${NC} $1"
}
log_header() {
echo -e "${CYAN}========================================${NC}"
echo -e "${CYAN}$1${NC}"
echo -e "${CYAN}========================================${NC}"
}
# VM configurations
declare -A VMS=(
["100"]="cloudflare-tunnel:192.168.1.60:scripts/setup-cloudflare-tunnel.sh"
["101"]="k3s-master:192.168.1.188:scripts/setup-k3s.sh"
["102"]="git-server:192.168.1.121:scripts/setup-git-server.sh"
["103"]="observability:192.168.1.82:scripts/setup-observability.sh"
)
# Check VM is ready
check_vm_ready() {
local ip=$1
local name=$2
# Ping test
if ! ping -c 1 -W 2 "$ip" >/dev/null 2>&1; then
return 1
fi
# SSH test
if ! timeout 2 bash -c "echo >/dev/tcp/$ip/22" 2>/dev/null; then
return 1
fi
return 0
}
# Setup VM service
setup_vm_service() {
local name=$1
local ip=$2
local script=$3
log_step "Setting up $name on $ip..."
# Check if VM is ready
if ! check_vm_ready "$ip" "$name"; then
log_warn "$name ($ip) is not ready yet. Skipping..."
return 1
fi
log_info "Copying setup script to $name..."
# Try to copy script (may need password or SSH key)
if scp -o ConnectTimeout=5 -o StrictHostKeyChecking=no "$script" "ubuntu@$ip:/tmp/setup.sh" 2>/dev/null; then
log_info "Running setup script on $name..."
ssh -o ConnectTimeout=5 -o StrictHostKeyChecking=no "ubuntu@$ip" "sudo bash /tmp/setup.sh" 2>&1 | while read -r line; do
log_info " $line"
done
if [ ${PIPESTATUS[0]} -eq 0 ]; then
log_info "$name setup completed"
return 0
else
log_error "Setup failed on $name"
return 1
fi
else
log_warn "Could not copy script to $name"
log_info "Manual steps for $name:"
echo " 1. SSH to $name: ssh ubuntu@$ip"
echo " 2. Copy $script to VM"
echo " 3. Run: sudo bash /path/to/script"
return 1
fi
}
main() {
log_header "Complete Setup Automation"
echo ""
log_step "Phase 1: Checking Prerequisites"
echo ""
# Check VM configurations
log_info "Verifying VM configurations..."
if ! ./scripts/check-vm-status.sh > /dev/null 2>&1; then
log_warn "Some VMs may not be fully configured"
fi
echo ""
log_step "Phase 2: Checking VM Readiness"
echo ""
local all_ready=true
for vmid in "${!VMS[@]}"; do
IFS=':' read -r name ip script <<< "${VMS[$vmid]}"
if check_vm_ready "$ip" "$name"; then
log_info "$name is ready"
else
log_warn "$name is not ready (Ubuntu may not be installed)"
all_ready=false
fi
done
echo ""
if [ "$all_ready" != true ]; then
log_error "Not all VMs are ready. Please:"
echo " 1. Complete Ubuntu installation on all VMs"
echo " 2. Ensure static IPs are configured"
echo " 3. Ensure SSH access works"
echo " 4. Run this script again"
exit 1
fi
log_step "Phase 3: Running Setup Scripts"
echo ""
local success_count=0
for vmid in "${!VMS[@]}"; do
IFS=':' read -r name ip script <<< "${VMS[$vmid]}"
if setup_vm_service "$name" "$ip" "$script"; then
success_count=$((success_count + 1))
fi
echo ""
done
log_header "Setup Complete"
echo ""
log_info "Successfully configured: $success_count/${#VMS[@]} VMs"
echo ""
if [ $success_count -eq ${#VMS[@]} ]; then
log_info "✅ All services are set up!"
echo ""
log_info "Next steps:"
echo " - Configure Cloudflare Tunnel (see docs/cloudflare-integration.md)"
echo " - Deploy services to K3s cluster"
echo " - Configure GitOps repository"
echo " - Set up monitoring dashboards"
else
log_warn "Some services need manual setup"
log_info "See VM_STATUS_REPORT.md for details"
fi
}
main "$@"
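The VM map above still carries hard-coded IPs, which is exactly what the repo's migration guide moves away from. A minimal sketch of the migrated loop, with `get_vm_ip_or_warn` (assumed to come from `scripts/lib/proxmox_vm_helpers.sh`) stubbed out so the flow is visible without a Proxmox host:

```shell
# Stub standing in for the helper library's discovery function;
# the real one queries `qm guest cmd <vmid> network-get-interfaces`
get_vm_ip_or_warn() { echo "192.168.1.60"; }

# VM specs carry only VMID and NAME; IPs are discovered at run time
VMS=(
"100 cloudflare-tunnel"
"101 k3s-master"
)
for vm_spec in "${VMS[@]}"; do
read -r vmid name <<< "$vm_spec"
ip="$(get_vm_ip_or_warn "$vmid" "$name" || true)"
[ -z "$ip" ] && continue
echo "setup target: $name (VMID $vmid) at $ip"
done
```

The stub always returns one address; with the real helper, a VM whose guest agent is not yet answering simply gets skipped by the `continue`.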


@@ -0,0 +1,107 @@
#!/bin/bash
source ~/.bashrc
# Complete R630 Cluster Join
# This script provides instructions and attempts automated join
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() { echo -e "${GREEN}[INFO]${NC} $1"; }
log_warn() { echo -e "${YELLOW}[WARN]${NC} $1"; }
log_error() { echo -e "${RED}[ERROR]${NC} $1"; }
log_step() { echo -e "\n${BLUE}=== $1 ===${NC}"; }
SSH_KEY="${SSH_KEY:-$HOME/.ssh/id_ed25519_proxmox}"
SSH_OPTS="-i $SSH_KEY -o StrictHostKeyChecking=no"
ML110_IP="192.168.1.206"
R630_IP="192.168.1.49"
ROOT_PASS="${PVE_ROOT_PASS:-L@kers2010}"
log_step "Completing R630 Cluster Join"
# Check current cluster status
log_info "Checking cluster status on ML110..."
ML110_STATUS=$(ssh $SSH_OPTS "root@${ML110_IP}" "pvecm nodes 2>&1" || echo "")
echo "$ML110_STATUS"
log_info "Checking cluster status on R630..."
R630_STATUS=$(ssh $SSH_OPTS "root@${R630_IP}" "pvecm status 2>&1" || echo "")
echo "$R630_STATUS"
if echo "$R630_STATUS" | grep -q "hc-cluster"; then
log_info "✓ R630 is already in the cluster!"
exit 0
fi
log_step "Method 1: Join via Proxmox Web UI (Recommended)"
log_info "1. Open https://${ML110_IP}:8006"
log_info "2. Login as root"
log_info "3. Go to: Datacenter → Cluster → Join Information"
log_info "4. Copy the join command"
log_info "5. Or go to: Datacenter → Cluster → Add"
log_info "6. Enter R630 IP: ${R630_IP}"
log_info "7. Enter root password: ${ROOT_PASS}"
log_info "8. Click 'Join'"
log_step "Method 2: Join via SSH (Manual)"
log_info "SSH to R630 and run:"
echo ""
echo "ssh -i $SSH_KEY root@${R630_IP}"
echo "pvecm add ${ML110_IP}"
echo "# Enter password when prompted: ${ROOT_PASS}"
echo ""
log_step "Method 3: Automated Join Attempt"
log_info "Attempting automated join..."
# Try using expect or similar approach
if command -v expect &>/dev/null; then
log_info "Using expect for password automation..."
expect <<EOF
spawn ssh $SSH_OPTS root@${R630_IP} "pvecm add ${ML110_IP}"
expect {
"password:" {
send "${ROOT_PASS}\r"
exp_continue
}
"yes/no" {
send "yes\r"
exp_continue
}
eof
}
EOF
else
log_warn "expect not installed. Install with: sudo apt-get install expect"
log_info "Or use Method 1 (Web UI) or Method 2 (Manual SSH)"
fi
# Verify join
sleep 10
log_info "Verifying cluster join..."
if ssh $SSH_OPTS "root@${R630_IP}" "pvecm status 2>&1" | grep -q "hc-cluster"; then
log_info "✓ R630 successfully joined the cluster!"
ssh $SSH_OPTS "root@${ML110_IP}" "pvecm nodes"
else
log_warn "Cluster join may still be in progress or needs manual approval"
log_info "Check cluster status:"
log_info " ssh root@${ML110_IP} 'pvecm nodes'"
log_info " ssh root@${R630_IP} 'pvecm status'"
fi


@@ -0,0 +1,88 @@
#!/bin/bash
source ~/.bashrc
# Download Ubuntu Cloud-Init Image for Proxmox Template
set -e
# Colors
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
RED='\033[0;31m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_step() {
echo -e "${BLUE}[STEP]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
# Ubuntu versions
UBUNTU_24_04_URL="https://cloud-images.ubuntu.com/releases/24.04/release/ubuntu-24.04-server-cloudimg-amd64.img"
UBUNTU_22_04_URL="https://cloud-images.ubuntu.com/releases/22.04/release/ubuntu-22.04-server-cloudimg-amd64.img"
VERSION="${1:-24.04}"
DOWNLOAD_DIR="${2:-./downloads}"
main() {
echo "========================================="
echo "Download Ubuntu Cloud-Init Image"
echo "========================================="
echo ""
case "$VERSION" in
24.04)
URL="$UBUNTU_24_04_URL"
FILENAME="ubuntu-24.04-server-cloudimg-amd64.img"
;;
22.04)
URL="$UBUNTU_22_04_URL"
FILENAME="ubuntu-22.04-server-cloudimg-amd64.img"
;;
*)
echo "Error: Unsupported version. Use 22.04 or 24.04"
exit 1
;;
esac
mkdir -p "$DOWNLOAD_DIR"
OUTPUT="$DOWNLOAD_DIR/$FILENAME"
log_step "Downloading Ubuntu $VERSION Cloud Image..."
log_info "URL: $URL"
log_info "Output: $OUTPUT"
echo ""
if [ -f "$OUTPUT" ]; then
log_info "File already exists: $OUTPUT"
read -p "Overwrite? (y/N): " -n 1 -r
echo
if [[ ! $REPLY =~ ^[Yy]$ ]]; then
log_info "Skipping download"
exit 0
fi
fi
# Download with progress
if command -v wget &> /dev/null; then
wget --progress=bar:force -O "$OUTPUT" "$URL"
elif command -v curl &> /dev/null; then
curl -L --progress-bar -o "$OUTPUT" "$URL"
else
log_error "Neither wget nor curl found"
exit 1
fi
log_info "✓ Download complete: $OUTPUT"
echo ""
log_info "Next steps:"
log_info " 1. Upload to Proxmox storage"
log_info " 2. Convert to template"
log_info " 3. Use for cloning VMs"
echo ""
log_info "See: docs/proxmox-ubuntu-images.md for details"
}
main "$@"
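The download script stops at "Next steps" without verifying the image. Ubuntu's release directories are assumed to publish a `SHA256SUMS` file alongside the image; the verification mechanics, demonstrated on a locally generated file so it runs anywhere:

```shell
# Create a sample file and a checksum manifest for it, then verify;
# against a real download you would fetch SHA256SUMS from the same
# release directory instead of generating it locally
printf 'image-bytes' > sample.img
sha256sum sample.img > SHA256SUMS.local
sha256sum -c SHA256SUMS.local
```

For the real image the equivalent would be roughly `curl -sLO https://cloud-images.ubuntu.com/releases/24.04/release/SHA256SUMS && sha256sum -c SHA256SUMS --ignore-missing` from the download directory (URL layout assumed).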


@@ -0,0 +1,109 @@
#!/bin/bash
source ~/.bashrc
# Enable SSH on R630 Proxmox Host (192.168.1.49)
# This script attempts to enable SSH via Proxmox API
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() { echo -e "${GREEN}[INFO]${NC} $1"; }
log_warn() { echo -e "${YELLOW}[WARN]${NC} $1"; }
log_error() { echo -e "${RED}[ERROR]${NC} $1"; }
PROXMOX_URL="${PROXMOX_R630_URL:-https://192.168.1.49:8006}"
PROXMOX_USER="${PVE_USERNAME:-root@pam}"
PROXMOX_PASS="${PVE_ROOT_PASS:-}"
PROXMOX_NODE="${PROXMOX_R630_NODE:-pve}"
if [ -z "$PROXMOX_PASS" ]; then
log_error "PVE_ROOT_PASS not set in .env file"
log_info "Please set PVE_ROOT_PASS in .env or provide password:"
read -sp "Password: " PROXMOX_PASS
echo ""
fi
log_info "Attempting to enable SSH on R630 (192.168.1.49) via Proxmox API..."
# Get API token
log_info "Authenticating with Proxmox API..."
RESPONSE=$(curl -s -k --connect-timeout 10 --max-time 15 \
-d "username=${PROXMOX_USER}&password=${PROXMOX_PASS}" \
"${PROXMOX_URL}/api2/json/access/ticket" 2>&1)
if ! echo "$RESPONSE" | grep -q '"data"'; then
log_error "Failed to authenticate with Proxmox API"
log_warn "Response: $RESPONSE"
log_info ""
log_info "Alternative: Enable SSH via Proxmox Web UI:"
log_info " 1. Open ${PROXMOX_URL} in browser"
log_info " 2. Login as root"
log_info " 3. Go to: System → Services → ssh"
log_info " 4. Click 'Enable' and 'Start'"
exit 1
fi
TICKET=$(echo "$RESPONSE" | grep -o '"ticket":"[^"]*' | cut -d'"' -f4)
CSRF=$(echo "$RESPONSE" | grep -o '"CSRFPreventionToken":"[^"]*' | cut -d'"' -f4)
if [ -z "$TICKET" ] || [ -z "$CSRF" ]; then
log_error "Failed to get API token"
exit 1
fi
log_info "✓ Authenticated successfully"
# Enable SSH service
log_info "Enabling SSH service..."
SSH_RESPONSE=$(curl -s -k -X POST \
-H "Cookie: PVEAuthCookie=${TICKET}" \
-H "CSRFPreventionToken: ${CSRF}" \
"${PROXMOX_URL}/api2/json/nodes/${PROXMOX_NODE}/services/ssh/start" 2>&1)
if echo "$SSH_RESPONSE" | grep -q '"data"'; then
log_info "✓ SSH service started"
else
log_warn "SSH service start response: $SSH_RESPONSE"
fi
# Enable SSH on boot
log_info "Enabling SSH on boot..."
ENABLE_RESPONSE=$(curl -s -k -X POST \
-H "Cookie: PVEAuthCookie=${TICKET}" \
-H "CSRFPreventionToken: ${CSRF}" \
-d "enable=1" \
"${PROXMOX_URL}/api2/json/nodes/${PROXMOX_NODE}/services/ssh" 2>&1)
if echo "$ENABLE_RESPONSE" | grep -q '"data"'; then
log_info "✓ SSH service enabled on boot"
else
log_warn "SSH enable response: $ENABLE_RESPONSE"
fi
# Test SSH access
log_info "Testing SSH access..."
sleep 2
if ssh -i "${SSH_KEY:-$HOME/.ssh/id_ed25519_proxmox}" -o StrictHostKeyChecking=no -o ConnectTimeout=5 "root@192.168.1.49" "echo 'SSH OK'" &>/dev/null; then
log_info "✓ SSH access confirmed!"
log_info "You can now SSH to R630:"
log_info " ssh -i ~/.ssh/id_ed25519_proxmox root@192.168.1.49"
else
log_warn "SSH test failed. SSH may need a moment to start."
log_info "Try manually: ssh root@192.168.1.49"
fi


@@ -0,0 +1,172 @@
#!/bin/bash
source ~/.bashrc
# Fix Corrupted Proxmox Cloud Image
# This script removes corrupted images and helps re-upload a fresh copy
set -e
# Colors
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
RED='\033[0;31m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_step() {
echo -e "${BLUE}[STEP]${NC} $1"
}
# Load environment variables
if [ -f .env ]; then
set -a
source <(grep -v '^#' .env | grep -v '^$' | sed 's/#.*$//' | sed 's/^[[:space:]]*//;s/[[:space:]]*$//' | grep '=')
set +a
fi
PROXMOX_HOST="${PROXMOX_ML110_URL#https://}"
PROXMOX_HOST="${PROXMOX_HOST%%:*}"
IMAGE_NAME="ubuntu-24.04-server-cloudimg-amd64.img"
LOCAL_IMAGE="${1:-./downloads/${IMAGE_NAME}}"
REMOTE_PATH="/var/lib/vz/template/iso/${IMAGE_NAME}"
REMOTE_IMPORT_PATH="/var/lib/vz/import/${IMAGE_NAME}.raw"
PVE_USERNAME="${PVE_USERNAME:-root@pam}"
PVE_PASSWORD="${PVE_ROOT_PASS:-}"
main() {
echo "========================================="
echo "Fix Corrupted Proxmox Cloud Image"
echo "========================================="
echo ""
if [ -z "$PVE_PASSWORD" ]; then
log_error "PVE_ROOT_PASS not set in .env"
exit 1
fi
if [ -z "$PROXMOX_HOST" ]; then
log_error "PROXMOX_ML110_URL not set in .env"
exit 1
fi
log_step "Target Proxmox host: $PROXMOX_HOST"
log_info "Image name: $IMAGE_NAME"
echo ""
# Check if local image exists
if [ ! -f "$LOCAL_IMAGE" ]; then
log_warn "Local image not found: $LOCAL_IMAGE"
log_info "Downloading image..."
./scripts/download-ubuntu-cloud-image.sh 24.04
LOCAL_IMAGE="./downloads/${IMAGE_NAME}"
if [ ! -f "$LOCAL_IMAGE" ]; then
log_error "Failed to download image"
exit 1
fi
fi
# Verify local image
log_step "1. Verifying local image..."
if qemu-img info "$LOCAL_IMAGE" > /dev/null 2>&1; then
IMAGE_SIZE=$(du -h "$LOCAL_IMAGE" | cut -f1)
log_info "✓ Local image is valid (Size: $IMAGE_SIZE)"
else
log_error "✗ Local image appears corrupted"
log_info "Re-downloading..."
rm -f "$LOCAL_IMAGE"
./scripts/download-ubuntu-cloud-image.sh 24.04
fi
# Check SSH access
log_step "2. Testing SSH access to Proxmox host..."
if ssh -o StrictHostKeyChecking=no -o ConnectTimeout=5 root@$PROXMOX_HOST "echo 'Connected'" > /dev/null 2>&1; then
log_info "✓ SSH access confirmed"
else
log_error "✗ Cannot connect to Proxmox host via SSH"
log_info "Make sure:"
log_info " 1. SSH is enabled on Proxmox host"
log_info " 2. Root login is allowed (or use SSH keys)"
log_info " 3. Host is reachable from this machine"
exit 1
fi
# Remove corrupted remote files
log_step "3. Removing corrupted image files on Proxmox host..."
ssh root@$PROXMOX_HOST "
if [ -f '$REMOTE_PATH' ]; then
echo 'Removing: $REMOTE_PATH'
rm -f '$REMOTE_PATH'
fi
if [ -f '$REMOTE_IMPORT_PATH' ]; then
echo 'Removing: $REMOTE_IMPORT_PATH'
rm -f '$REMOTE_IMPORT_PATH'
fi
echo 'Cleanup complete'
"
# Upload fresh image
log_step "4. Uploading fresh image to Proxmox host..."
log_info "This may take several minutes depending on your network speed..."
log_info "Uploading: $LOCAL_IMAGE"
log_info "To: root@$PROXMOX_HOST:$REMOTE_PATH"
echo ""
# Create directory if it doesn't exist
ssh root@$PROXMOX_HOST "mkdir -p /var/lib/vz/template/iso"
# Upload with progress
if command -v rsync &> /dev/null; then
log_info "Using rsync (with progress)..."
rsync -avz --progress "$LOCAL_IMAGE" root@$PROXMOX_HOST:"$REMOTE_PATH"
else
log_info "Using scp..."
scp "$LOCAL_IMAGE" root@$PROXMOX_HOST:"$REMOTE_PATH"
fi
# Verify uploaded image
log_step "5. Verifying uploaded image on Proxmox host..."
if ssh root@$PROXMOX_HOST "qemu-img info '$REMOTE_PATH' > /dev/null 2>&1"; then
REMOTE_SIZE=$(ssh root@$PROXMOX_HOST "du -h '$REMOTE_PATH' | cut -f1")
log_info "✓ Image uploaded successfully (Size: $REMOTE_SIZE)"
else
log_error "✗ Uploaded image verification failed"
log_warn "The file may still be uploading or there may be storage issues"
exit 1
fi
# Set proper permissions
log_step "6. Setting file permissions..."
ssh root@$PROXMOX_HOST "chmod 644 '$REMOTE_PATH'"
log_info "✓ Permissions set"
echo ""
log_info "========================================="
log_info "Image Fix Complete!"
log_info "========================================="
log_info ""
log_info "The image has been successfully uploaded to:"
log_info " $REMOTE_PATH"
log_info ""
log_info "Next steps:"
log_info " 1. Verify the image in Proxmox Web UI:"
log_info " Storage → local → Content"
log_info " 2. Follow CREATE_VM_9000_STEPS.md to create VM 9000"
log_info " 3. Or run: ./scripts/verify-proxmox-image.sh"
echo ""
}
main "$@"
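The pair of parameter expansions this script uses to derive a bare host from `PROXMOX_ML110_URL` is easy to misread; exercised standalone:

```shell
# Derive a bare hostname/IP from an https URL with a port
url="https://192.168.1.206:8006"
host="${url#https://}"   # drop the scheme prefix
host="${host%%:*}"       # drop everything from the first colon (the port)
echo "$host"
```

`#https://` removes the shortest matching prefix, and `%%:*` removes the longest matching suffix starting at a colon, leaving only the host.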


@@ -0,0 +1,336 @@
#!/bin/bash
source ~/.bashrc
# Improve Template VM 9000 with Recommended Enhancements
# This script applies all recommended improvements to template 9000
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_step() {
echo ""
echo -e "${BLUE}========================================${NC}"
echo -e "${BLUE}$1${NC}"
echo -e "${BLUE}========================================${NC}"
echo ""
}
PROXMOX_HOST="${PROXMOX_ML110_IP:-192.168.1.206}"
SSH_KEY="${SSH_KEY:-$HOME/.ssh/id_ed25519_proxmox}"
SSH_OPTS="-i $SSH_KEY -o StrictHostKeyChecking=no"
TEMPLATE_VMID=9000
TEMP_VMID=9999
TEMP_VM_NAME="template-update-temp"
VM_USER="${VM_USER:-ubuntu}"
# Check if running on Proxmox host or remotely
if command -v qm >/dev/null 2>&1; then
RUN_LOCAL=true
PROXMOX_CMD=""
else
RUN_LOCAL=false
PROXMOX_CMD="ssh $SSH_OPTS root@$PROXMOX_HOST"
fi
run_proxmox_cmd() {
if [ "$RUN_LOCAL" = true ]; then
eval "$1"
else
ssh $SSH_OPTS "root@$PROXMOX_HOST" "$1"
fi
}
wait_for_ssh() {
local ip="$1"
local max_attempts=30
local attempt=0
log_info "Waiting for SSH to be available on $ip..."
while [ $attempt -lt $max_attempts ]; do
if ssh $SSH_OPTS -o ConnectTimeout=5 "${VM_USER}@${ip}" "echo 'SSH ready'" &>/dev/null; then
log_info "✓ SSH is ready"
return 0
fi
attempt=$((attempt + 1))
sleep 2
done
log_error "SSH not available after $max_attempts attempts"
return 1
}
get_vm_ip() {
local vmid="$1"
local ip=""
# Try to use helper library if available (when running on Proxmox host)
if [ "$RUN_LOCAL" = true ] && [ -f "$PROJECT_ROOT/scripts/lib/proxmox_vm_helpers.sh" ]; then
source "$PROJECT_ROOT/scripts/lib/proxmox_vm_helpers.sh" 2>/dev/null || true
if command -v get_vm_ip_from_guest_agent &>/dev/null; then
ip=$(get_vm_ip_from_guest_agent "$vmid" 2>/dev/null || echo "")
if [[ -n "$ip" && "$ip" != "null" ]]; then
echo "$ip"
return 0
fi
fi
fi
# Try to get IP from guest agent using jq directly (suppress errors)
if run_proxmox_cmd "command -v jq >/dev/null 2>&1" 2>/dev/null; then
ip=$(run_proxmox_cmd "qm guest cmd $vmid network-get-interfaces 2>/dev/null | jq -r '.[]?.\"ip-addresses\"[]? | select(.[\"ip-address-type\"] == \"ipv4\" and .\"ip-address\" != \"127.0.0.1\") | .\"ip-address\"' | head -n1" 2>/dev/null || echo "")
if [[ -n "$ip" && "$ip" != "null" && "$ip" != "" ]]; then
echo "$ip"
return 0
fi
fi
# Try MAC-based discovery: get VM MAC and match with ARP table
local mac
mac=$(run_proxmox_cmd "qm config $vmid 2>/dev/null | grep -E '^net0:' | cut -d',' -f1 | cut -d'=' -f2 | tr '[:upper:]' '[:lower:]' | tr -d ':'" 2>/dev/null || echo "")
if [[ -n "$mac" ]]; then
# Format MAC for matching (with colons)
local mac_formatted="${mac:0:2}:${mac:2:2}:${mac:4:2}:${mac:6:2}:${mac:8:2}:${mac:10:2}"
# Try to find IP in ARP table
ip=$(run_proxmox_cmd "ip neigh show 2>/dev/null | grep -i '$mac_formatted' | grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' | head -n1" 2>/dev/null || echo "")
if [[ -n "$ip" ]]; then
echo "$ip"
return 0
fi
fi
# Return empty string (not a warning message)
echo ""
return 1
}
main() {
log_step "Template 9000 Improvement Script"
log_warn "This script will:"
log_warn " 1. Clone template 9000 to temporary VM 9999"
log_warn " 2. Boot the temporary VM"
log_warn " 3. Apply all recommended improvements"
log_warn " 4. Convert back to template"
log_warn " 5. Replace original template 9000"
echo ""
# Check if template exists
if ! run_proxmox_cmd "qm config $TEMPLATE_VMID &>/dev/null"; then
log_error "Template VM $TEMPLATE_VMID not found"
exit 1
fi
# Check if temp VM already exists
if run_proxmox_cmd "qm config $TEMP_VMID &>/dev/null" 2>/dev/null; then
log_warn "Temporary VM $TEMP_VMID already exists. Destroying it..."
run_proxmox_cmd "qm stop $TEMP_VMID" 2>/dev/null || true
sleep 2
run_proxmox_cmd "qm destroy $TEMP_VMID --purge" 2>/dev/null || true
sleep 2
fi
# Step 1: Clone template
log_step "Step 1: Cloning Template to Temporary VM"
log_info "Cloning template $TEMPLATE_VMID to VM $TEMP_VMID..."
run_proxmox_cmd "qm clone $TEMPLATE_VMID $TEMP_VMID --name $TEMP_VM_NAME"
log_info "✓ Template cloned"
# Step 2: Boot temporary VM
log_step "Step 2: Booting Temporary VM"
log_info "Starting VM $TEMP_VMID..."
run_proxmox_cmd "qm start $TEMP_VMID"
log_info "Waiting for VM to boot and get DHCP IP (this may take 60-90 seconds)..."
sleep 60
# Step 3: Get IP and wait for SSH
log_step "Step 3: Getting VM IP and Waiting for SSH"
local vm_ip=""
# Try multiple times to get IP (VM may still be booting)
log_info "Attempting to discover VM IP (may take a few attempts)..."
for attempt in {1..10}; do
vm_ip=$(get_vm_ip "$TEMP_VMID" 2>/dev/null || echo "")
if [[ -n "$vm_ip" ]]; then
log_info "✓ Discovered IP: $vm_ip"
break
fi
if [ $attempt -lt 10 ]; then
log_info "Attempt $attempt/10: Waiting for VM to finish booting..."
sleep 10
fi
done
# If still no IP, try to get from Proxmox API or prompt user
if [[ -z "$vm_ip" ]]; then
log_warn "Could not automatically discover IP via guest agent."
log_info "Please check Proxmox web UI or router DHCP leases for VM $TEMP_VMID IP address."
log_info "You can also check with: ssh root@$PROXMOX_HOST 'qm config $TEMP_VMID'"
echo ""
read -p "Enter the VM IP address (or press Enter to skip and try again later): " vm_ip
if [[ -z "$vm_ip" ]]; then
log_error "IP address required. Exiting."
log_info "VM $TEMP_VMID is running. You can manually:"
log_info " 1. Get the IP from Proxmox UI or router"
log_info " 2. SSH into the VM and apply improvements manually"
log_info " 3. Run this script again with the IP"
exit 1
fi
fi
wait_for_ssh "$vm_ip" || {
log_error "Failed to connect to VM. Please check:"
log_error " 1. VM is booted: qm status $TEMP_VMID"
log_error " 2. IP address is correct: $vm_ip"
log_error " 3. SSH key is correct: $SSH_KEY"
exit 1
}
# Step 4: Apply improvements
log_step "Step 4: Applying Template Improvements"
log_info "Installing essential packages and QEMU Guest Agent..."
if ! ssh $SSH_OPTS "${VM_USER}@${vm_ip}" <<'EOF'
set -e
sudo apt-get update -qq
sudo DEBIAN_FRONTEND=noninteractive apt-get upgrade -y -qq
# Install essential packages
sudo apt-get install -y \
jq \
curl \
wget \
git \
vim \
nano \
net-tools \
htop \
unattended-upgrades \
apt-transport-https \
ca-certificates \
qemu-guest-agent \
ufw
# Enable and start QEMU Guest Agent
sudo systemctl enable qemu-guest-agent
sudo systemctl start qemu-guest-agent
# Configure automatic security updates
echo 'Unattended-Upgrade::Automatic-Reboot "false";' | sudo tee -a /etc/apt/apt.conf.d/50unattended-upgrades > /dev/null
echo 'Unattended-Upgrade::Remove-Unused-Kernel-Packages "true";' | sudo tee -a /etc/apt/apt.conf.d/50unattended-upgrades > /dev/null
# Set timezone
sudo timedatectl set-timezone UTC
# Configure locale
sudo locale-gen en_US.UTF-8
sudo update-locale LANG=en_US.UTF-8
# SSH hardening (disable root login, password auth)
sudo sed -i 's/#PermitRootLogin.*/PermitRootLogin no/' /etc/ssh/sshd_config
sudo sed -i 's/#PasswordAuthentication.*/PasswordAuthentication no/' /etc/ssh/sshd_config
sudo sed -i 's/#PubkeyAuthentication.*/PubkeyAuthentication yes/' /etc/ssh/sshd_config
sudo systemctl restart sshd
# Install UFW (firewall) but don't enable it - let VMs configure as needed
# UFW is installed but not enabled, so VMs can configure firewall rules per their needs
# Clean up disk
sudo apt-get autoremove -y -qq
sudo apt-get autoclean -qq
sudo rm -rf /tmp/*
sudo rm -rf /var/tmp/*
sudo truncate -s 0 /var/log/*.log 2>/dev/null || true
sudo journalctl --vacuum-time=1d --quiet
# Create template version file
echo "template-9000-v1.1.0-$(date +%Y%m%d)" | sudo tee /etc/template-version > /dev/null
echo "✓ All improvements applied"
EOF
then
# With set -euo pipefail, a failed ssh would abort the script before a
# separate $? check could run, so test the exit status directly
log_error "Failed to apply improvements"
exit 1
fi
log_info "✓ All improvements applied successfully"
# Step 5: Stop VM and convert to template
log_step "Step 5: Converting Back to Template"
log_info "Stopping VM $TEMP_VMID..."
run_proxmox_cmd "qm stop $TEMP_VMID"
sleep 5
log_info "Converting VM $TEMP_VMID to template..."
run_proxmox_cmd "qm template $TEMP_VMID"
log_info "✓ VM converted to template"
# Step 6: Replace original template
log_step "Step 6: Replacing Original Template"
log_warn "This will destroy the original template 9000 and replace it with the improved version"
echo ""
if [ -t 0 ]; then
read -p "Continue? (yes/no): " confirm
if [ "$confirm" != "yes" ]; then
log_info "Cancelled. Improved template is available as VM $TEMP_VMID"
log_info "You can manually:"
log_info " 1. Destroy template 9000: qm destroy 9000"
log_info " 2. Rename config: mv /etc/pve/qemu-server/$TEMP_VMID.conf /etc/pve/qemu-server/9000.conf"
exit 0
fi
else
log_info "Non-interactive mode: auto-confirming"
fi
log_info "Destroying original template 9000..."
run_proxmox_cmd "qm destroy $TEMPLATE_VMID --purge" 2>/dev/null || true
sleep 2
log_info "Changing VMID from $TEMP_VMID to $TEMPLATE_VMID..."
# qm has no command to renumber a VM; rename the config file instead
# (disk volume names keep the old VMID, which is cosmetic only)
run_proxmox_cmd "mv /etc/pve/qemu-server/${TEMP_VMID}.conf /etc/pve/qemu-server/${TEMPLATE_VMID}.conf"
log_step "Template Improvement Complete!"
log_info "✓ Template 9000 has been improved with:"
log_info " - QEMU Guest Agent pre-installed and enabled"
log_info " - Essential utilities (jq, curl, wget, git, vim, nano, htop, net-tools, etc.)"
log_info " - Automatic security updates configured (unattended-upgrades)"
log_info " - Timezone set to UTC"
log_info " - Locale configured (en_US.UTF-8)"
log_info " - SSH hardened (no root login, no password auth, pubkey only)"
log_info " - UFW firewall installed (not enabled - VMs configure as needed)"
log_info " - Disk optimized and cleaned"
log_info " - Template version tracking (/etc/template-version)"
log_info ""
log_info "You can now clone VMs from template 9000 and they will have all these improvements!"
}
main "$@"
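The MAC normalization inside `get_vm_ip` above is dense; exercised standalone on a sample value of the kind `qm config` reports (e.g. `virtio=BC:24:11:AA:BB:CC,bridge=vmbr0`, lowercased and colon-stripped by the script before this step):

```shell
# Re-insert colons into a stripped 12-hex-digit MAC for ARP-table matching
mac="bc2411aabbcc"
mac_formatted="${mac:0:2}:${mac:2:2}:${mac:4:2}:${mac:6:2}:${mac:8:2}:${mac:10:2}"
echo "$mac_formatted"
```

Each `${mac:OFFSET:2}` slice takes two characters, so the six slices reassemble the standard colon-separated form that `ip neigh show` prints.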


@@ -0,0 +1,117 @@
#!/bin/bash
source ~/.bashrc
# Install QEMU Guest Agent in All VMs
# Uses guest-agent IP discovery (with fallback for bootstrap)
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
VM_USER="${VM_USER:-ubuntu}"
SSH_KEY="${SSH_KEY:-$HOME/.ssh/id_ed25519_proxmox}"
PROXMOX_HOST="${PROXMOX_ML110_IP:-192.168.1.206}"
# VMID NAME (no IP - discovered via guest agent)
VMS=(
"100 cloudflare-tunnel"
"101 k3s-master"
"102 git-server"
"103 observability"
)
# Fallback IPs for bootstrap (when guest agent not yet installed)
# Format: VMID:IP
declare -A FALLBACK_IPS=(
["100"]="192.168.1.60"
["101"]="192.168.1.188"
["102"]="192.168.1.121"
["103"]="192.168.1.82"
)
# Import helper library
if [ -f "$PROJECT_ROOT/scripts/lib/proxmox_vm_helpers.sh" ]; then
source "$PROJECT_ROOT/scripts/lib/proxmox_vm_helpers.sh"
else
log_error "Helper library not found. Run this script on Proxmox host or via SSH."
exit 1
fi
main() {
log_info "Installing QEMU Guest Agent in all VMs"
echo ""
for vm_spec in "${VMS[@]}"; do
read -r vmid name <<< "$vm_spec"
echo "=== VM $vmid: $name ==="
# Make sure agent is enabled in Proxmox VM config
ensure_guest_agent_enabled "$vmid" || true
# Get IP - try guest agent first, fallback to hardcoded for bootstrap
local ip
ip="$(get_vm_ip_or_fallback "$vmid" "$name" "${FALLBACK_IPS[$vmid]:-}" || true)"
if [[ -z "$ip" ]]; then
log_warn "Skipping: no IP available for VM $vmid ($name)"
echo
continue
fi
echo " Using IP: $ip; installing qemu-guest-agent inside guest (idempotent)..."
if ssh -i "$SSH_KEY" -o StrictHostKeyChecking=accept-new -o ConnectTimeout=5 "${VM_USER}@${ip}" <<'EOF'
set -e
sudo apt-get update -qq
sudo apt-get install -y qemu-guest-agent > /dev/null 2>&1
sudo systemctl enable --now qemu-guest-agent
systemctl is-active qemu-guest-agent && echo "✓ QEMU Guest Agent is running"
EOF
then
log_info " ✓ QEMU Guest Agent installed and started"
# Wait a moment for agent to be ready, then verify
sleep 3
local discovered_ip
discovered_ip="$(get_vm_ip_from_guest_agent "$vmid" || true)"
if [[ -n "$discovered_ip" ]]; then
log_info " ✓ Guest agent IP discovery working: $discovered_ip"
fi
else
log_error " ✗ Failed to install QEMU Guest Agent"
fi
echo
done
log_info "Done. All VMs should now support guest-agent IP discovery."
}
main "$@"
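The guest-agent discovery the script verifies at the end boils down to a jq filter over `qm guest cmd <vmid> network-get-interfaces` output. A sketch of that filter, run against a canned sample so it needs no Proxmox host (field names follow the QGA `network-get-interfaces` schema; requires `jq`):

```shell
# Canned sample of what the guest agent returns for two interfaces
sample='[{"name":"lo","ip-addresses":[{"ip-address":"127.0.0.1","ip-address-type":"ipv4"}]},{"name":"eth0","ip-addresses":[{"ip-address":"192.168.1.188","ip-address-type":"ipv4"}]}]'
# Pick the first non-loopback IPv4 address
ip=$(echo "$sample" | jq -r '.[]?."ip-addresses"[]? | select(."ip-address-type" == "ipv4" and ."ip-address" != "127.0.0.1") | ."ip-address"' | head -n1)
echo "$ip"
```

The `?` after each iterator keeps the filter quiet on interfaces with no `ip-addresses` array, which the real agent output can contain.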


@@ -0,0 +1,354 @@
#!/bin/bash
source ~/.bashrc
# Destroy Existing VMs and Recreate from Ubuntu Cloud-Init Template
# This script creates a template from Ubuntu Cloud Image and recreates all VMs
set -e
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
CYAN='\033[0;36m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_step() {
echo -e "${BLUE}[STEP]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_header() {
echo -e "${CYAN}========================================${NC}"
echo -e "${CYAN}$1${NC}"
echo -e "${CYAN}========================================${NC}"
}
# Load environment variables
if [ -f .env ]; then
set -a
source <(grep -v '^#' .env | grep -v '^$' | sed 's/#.*$//' | sed 's/^[[:space:]]*//;s/[[:space:]]*$//' | grep '=')
set +a
fi
PVE_USERNAME="${PVE_USERNAME:-root@pam}"
PVE_PASSWORD="${PVE_ROOT_PASS:-}"
PROXMOX_URL="https://192.168.1.206:8006"
PROXMOX_NODE="pve"
STORAGE="${STORAGE:-local-lvm}"
TEMPLATE_ID=9000
TEMPLATE_NAME="ubuntu-24.04-cloudinit"
# VM Configuration
declare -A VMS=(
[100]="cloudflare-tunnel:2:4096:40G:192.168.1.60:192.168.1.1"
[101]="k3s-master:4:8192:80G:192.168.1.188:192.168.1.1"
[102]="git-server:2:4096:100G:192.168.1.121:192.168.1.1"
[103]="observability:4:8192:200G:192.168.1.82:192.168.1.1"
)
# Get authentication ticket
get_ticket() {
local response=$(curl -k -s -d "username=$PVE_USERNAME&password=$PVE_PASSWORD" \
"$PROXMOX_URL/api2/json/access/ticket")
local ticket=$(echo "$response" | grep -o '"ticket":"[^"]*' | cut -d'"' -f4)
local csrf=$(echo "$response" | grep -o '"CSRFPreventionToken":"[^"]*' | cut -d'"' -f4)
if [ -z "$ticket" ] || [ -z "$csrf" ]; then
log_error "Failed to authenticate with Proxmox"
return 1
fi
echo "$ticket|$csrf"
}
# Check if template exists
template_exists() {
local auth=$1
local ticket=$(echo "$auth" | cut -d'|' -f1)
local csrf=$(echo "$auth" | cut -d'|' -f2)
local response=$(curl -k -s \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$TEMPLATE_ID/config" 2>&1)
if echo "$response" | grep -q '"name"'; then
return 0
else
return 1
fi
}
# Download Ubuntu Cloud Image
download_cloud_image() {
local image_url="https://cloud-images.ubuntu.com/releases/24.04/release/ubuntu-24.04-server-cloudimg-amd64.img"
local image_file="/tmp/ubuntu-24.04-server-cloudimg-amd64.img"
log_step "Downloading Ubuntu 24.04 Cloud Image..."
if [ -f "$image_file" ]; then
log_info "Cloud image already exists: $image_file"
echo "$image_file"
return 0
fi
log_info "Downloading from: $image_url"
log_warn "This may take several minutes (image is ~2GB)..."
if command -v wget &> /dev/null; then
wget --progress=bar:force -O "$image_file" "$image_url" || return 1
elif command -v curl &> /dev/null; then
curl -L --progress-bar -o "$image_file" "$image_url" || return 1
else
log_error "Neither wget nor curl found"
return 1
fi
log_info "✓ Cloud image downloaded"
echo "$image_file"
}
# Create template from cloud image
create_template() {
local auth=$1
local image_file=$2
local ticket=$(echo "$auth" | cut -d'|' -f1)
local csrf=$(echo "$auth" | cut -d'|' -f2)
log_step "Creating template from cloud image..."
# Check if template already exists
if template_exists "$auth"; then
log_info "Template $TEMPLATE_ID already exists, skipping creation"
return 0
fi
log_warn "Template creation requires manual steps in Proxmox Web UI:"
echo ""
log_info "1. Upload cloud image to Proxmox:"
log_info " - Go to: Datacenter → $PROXMOX_NODE → Storage → local"
log_info " - Click 'Upload' → Select: $image_file"
log_info " - Wait for upload to complete"
echo ""
log_info "2. Create VM from image:"
log_info " - Create VM (ID: $TEMPLATE_ID)"
log_info " - Import disk from uploaded image"
log_info " - Set CPU: 2, Memory: 2048MB"
log_info " - Add network device"
log_info " - Enable Cloud-Init in Options"
log_info " - Convert to template"
echo ""
log_warn "After template is created, press Enter to continue..."
read -p "Press Enter when template is ready..."
}
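If you prefer to avoid the manual Web UI steps, the template can also be built from the Proxmox host shell with `qm`. This is a sketch only: the template ID 9000, storage name `local-lvm`, and image path are assumptions to adapt to your environment. It is guarded so it is a no-op on machines without `qm`.

```shell
# CLI alternative to the manual Web UI template steps (run on the Proxmox host).
# ID 9000, storage 'local-lvm', and the image path are assumptions.
if command -v qm >/dev/null 2>&1; then
    qm create 9000 --name ubuntu-2404-template --memory 2048 --cores 2 \
        --net0 virtio,bridge=vmbr0
    qm importdisk 9000 /tmp/ubuntu-24.04-server-cloudimg-amd64.img local-lvm
    qm set 9000 --scsihw virtio-scsi-pci --scsi0 local-lvm:vm-9000-disk-0
    qm set 9000 --ide2 local-lvm:cloudinit --boot order=scsi0 --serial0 socket
    qm set 9000 --agent enabled=1
    qm template 9000
    status="template-created"
else
    status="skipped (qm not found)"
fi
echo "$status"
```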
# Destroy existing VM
destroy_vm() {
local auth=$1
local vmid=$2
local ticket=$(echo "$auth" | cut -d'|' -f1)
local csrf=$(echo "$auth" | cut -d'|' -f2)
log_step "Destroying VM $vmid..."
# Stop VM if running
curl -k -s -X POST \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/status/stop" > /dev/null 2>&1
sleep 2
# Delete VM
local response=$(curl -k -s -X DELETE \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid" 2>&1)
if echo "$response" | grep -q '"errors"'; then
log_error "Failed to destroy VM: $response"
return 1
fi
log_info "✓ VM $vmid destroyed"
return 0
}
# Create VM from template
create_vm_from_template() {
local auth=$1
local vmid=$2
local name=$3
local cores=$4
local memory=$5
local disk_size=$6
local ip_address=$7
local gateway=$8
local ticket=$(echo "$auth" | cut -d'|' -f1)
local csrf=$(echo "$auth" | cut -d'|' -f2)
log_step "Creating VM $vmid: $name from template..."
# Clone template
local clone_response=$(curl -k -s -X POST \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf" \
-d "newid=$vmid" \
-d "name=$name" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$TEMPLATE_ID/clone" 2>&1)
if echo "$clone_response" | grep -q '"errors"'; then
log_error "Failed to clone template: $clone_response"
return 1
fi
log_info "Template cloned, waiting for completion..."
sleep 5
# Configure VM
log_info "Configuring VM..."
# Set CPU, memory, disk
curl -k -s -X PUT \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf" \
-d "cores=$cores" \
-d "memory=$memory" \
-d "net0=virtio,bridge=vmbr0" \
-d "agent=1" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/config" > /dev/null 2>&1
# Resize disk if needed
local current_disk=$(curl -k -s \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/config" | \
grep -o '"scsi0":"[^"]*' | cut -d'"' -f4 | cut -d',' -f2 | cut -d'=' -f2)
if [ "$current_disk" != "$disk_size" ]; then
log_info "Resizing disk to $disk_size..."
# The resize endpoint takes 'disk' and 'size' parameters
curl -k -s -X PUT \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf" \
-d "disk=scsi0" \
-d "size=${disk_size}" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/resize" > /dev/null 2>&1
fi
# Configure Cloud-Init
log_info "Configuring Cloud-Init..."
curl -k -s -X PUT \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf" \
-d "ipconfig0=ip=${ip_address}/24,gw=${gateway}" \
-d "ciuser=ubuntu" \
-d "cipassword=" \
--data-urlencode "sshkeys=$(cat ~/.ssh/id_rsa.pub 2>/dev/null || echo '')" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/config" > /dev/null 2>&1
log_info "✓ VM $vmid created and configured"
return 0
}
main() {
log_header "Recreate VMs from Cloud-Init Template"
echo ""
if [ -z "$PVE_PASSWORD" ]; then
log_error "PVE_ROOT_PASS not set in .env"
exit 1
fi
log_warn "This will DESTROY existing VMs (100, 101, 102, 103)"
log_warn "And recreate them from a Cloud-Init template"
echo ""
read -p "Continue? (yes/no): " confirm
if [ "$confirm" != "yes" ]; then
log_info "Cancelled"
exit 0
fi
# Authenticate
auth=$(get_ticket)
if [ $? -ne 0 ]; then
exit 1
fi
# Step 1: Download cloud image
image_file=$(download_cloud_image)
if [ $? -ne 0 ]; then
log_error "Failed to download cloud image"
exit 1
fi
# Step 2: Create template (manual steps required)
create_template "$auth" "$image_file"
# Verify template exists
if ! template_exists "$auth"; then
log_error "Template does not exist. Please create it first."
exit 1
fi
# Step 3: Destroy existing VMs
log_header "Destroying Existing VMs"
for vmid in 100 101 102 103; do
destroy_vm "$auth" "$vmid" || log_warn "Failed to destroy VM $vmid"
done
sleep 3
# Step 4: Create VMs from template
log_header "Creating VMs from Template"
for vmid in 100 101 102 103; do
IFS=':' read -r name cores memory disk_size ip_address gateway <<< "${VMS[$vmid]}"
if create_vm_from_template "$auth" "$vmid" "$name" "$cores" "$memory" "$disk_size" "$ip_address" "$gateway"; then
log_info "✓ VM $vmid created"
else
log_error "✗ Failed to create VM $vmid"
fi
echo ""
done
# Step 5: Start VMs
log_header "Starting VMs"
for vmid in 100 101 102 103; do
log_info "Starting VM $vmid..."
ticket=$(echo "$auth" | cut -d'|' -f1)
csrf=$(echo "$auth" | cut -d'|' -f2)
curl -k -s -X POST \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/status/start" > /dev/null 2>&1
log_info "✓ VM $vmid started"
done
log_header "VM Recreation Complete!"
echo ""
log_info "VMs are being created from template with Cloud-Init"
log_info "They will boot automatically and configure themselves"
log_info "No manual installation required!"
echo ""
log_info "Next steps:"
echo " 1. Wait 2-3 minutes for VMs to boot"
echo " 2. Check readiness: ./scripts/check-vm-readiness.sh"
echo " 3. Run tasks: ./scripts/complete-all-vm-tasks.sh"
}
main "$@"


@@ -0,0 +1,164 @@
#!/bin/bash
source ~/.bashrc
# Complete Cloudflare Tunnel Setup Script
# Run this on the Cloudflare Tunnel VM after OS installation
set -e
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_step() {
echo -e "${BLUE}[STEP]${NC} $1"
}
# Check if running as root
if [ "$EUID" -ne 0 ]; then
log_error "Please run as root (use sudo)"
exit 1
fi
log_step "Step 1: Installing cloudflared..."
curl -L https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64 -o /usr/local/bin/cloudflared
chmod +x /usr/local/bin/cloudflared
cloudflared --version
log_info "cloudflared installed successfully"
log_step "Step 2: Creating cloudflared user..."
useradd -r -s /bin/false cloudflared || log_warn "User cloudflared may already exist"
mkdir -p /etc/cloudflared
chown cloudflared:cloudflared /etc/cloudflared
log_step "Step 3: Authenticating cloudflared..."
log_warn "You need to authenticate cloudflared manually:"
echo ""
echo "Run this command:"
echo " cloudflared tunnel login"
echo ""
echo "This will open a browser for authentication."
echo "After authentication, press Enter to continue..."
read -p "Press Enter after completing authentication..."
log_step "Step 4: Creating tunnel..."
log_warn "Creating tunnel 'azure-stack-hci'..."
log_warn "If tunnel already exists, you can skip this step."
read -p "Create new tunnel? (y/n) " -n 1 -r
echo
if [[ $REPLY =~ ^[Yy]$ ]]; then
cloudflared tunnel create azure-stack-hci || log_warn "Tunnel may already exist"
fi
# Get tunnel ID
TUNNEL_ID=$(cloudflared tunnel list | grep azure-stack-hci | awk '{print $1}' | head -1)
if [ -z "$TUNNEL_ID" ]; then
log_error "Could not find tunnel ID. Please create tunnel manually."
exit 1
fi
log_info "Tunnel ID: $TUNNEL_ID"
log_step "Step 5: Creating tunnel configuration..."
# 'cloudflared tunnel create' writes credentials to ~/.cloudflared/<ID>.json;
# copy them to the path referenced by config.yml below
cp "$HOME/.cloudflared/$TUNNEL_ID.json" /etc/cloudflared/ 2>/dev/null || log_warn "Copy $TUNNEL_ID.json to /etc/cloudflared/ manually"
chown cloudflared:cloudflared "/etc/cloudflared/$TUNNEL_ID.json" 2>/dev/null || true
cat > /etc/cloudflared/config.yml <<EOF
tunnel: $TUNNEL_ID
credentials-file: /etc/cloudflared/$TUNNEL_ID.json
ingress:
# Proxmox UI
- hostname: proxmox.yourdomain.com
service: https://192.168.1.206:8006
originRequest:
noHappyEyeballs: true
tcpKeepAlive: 30
connectTimeout: 30s
# Proxmox R630
- hostname: proxmox-r630.yourdomain.com
service: https://192.168.1.49:8006
originRequest:
noHappyEyeballs: true
tcpKeepAlive: 30
connectTimeout: 30s
# Grafana Dashboard
- hostname: grafana.yourdomain.com
service: http://192.168.1.82:3000
originRequest:
connectTimeout: 30s
# Git Server
- hostname: git.yourdomain.com
service: https://192.168.1.121:443
originRequest:
noHappyEyeballs: true
tcpKeepAlive: 30
connectTimeout: 30s
# K3s Dashboard (if exposed)
- hostname: k3s.yourdomain.com
service: https://192.168.1.188:6443
originRequest:
noHappyEyeballs: true
tcpKeepAlive: 30
connectTimeout: 30s
# Catch-all (must be last)
- service: http_status:404
EOF
chmod 600 /etc/cloudflared/config.yml
chown cloudflared:cloudflared /etc/cloudflared/config.yml
log_info "Configuration file created: /etc/cloudflared/config.yml"
log_warn "Update hostnames in config.yml to match your domain!"
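As a sketch of how the ingress list can be kept in one place instead of hand-edited, hostname/origin pairs (the `example.com` names below are placeholders) can be rendered into `config.yml` entries:

```shell
# Render ingress entries from "hostname origin" pairs; append the output
# to the ingress: section of config.yml (hostnames here are placeholders)
services=(
    "proxmox.example.com https://192.168.1.206:8006"
    "grafana.example.com http://192.168.1.82:3000"
)
for svc in "${services[@]}"; do
    read -r host origin <<< "$svc"
    printf -- '  - hostname: %s\n    service: %s\n' "$host" "$origin"
done
```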
log_step "Step 6: Creating systemd service..."
cat > /etc/systemd/system/cloudflared.service <<EOF
[Unit]
Description=Cloudflare Tunnel
After=network.target
[Service]
Type=simple
User=cloudflared
ExecStart=/usr/local/bin/cloudflared tunnel --config /etc/cloudflared/config.yml run
Restart=on-failure
RestartSec=5s
[Install]
WantedBy=multi-user.target
EOF
log_step "Step 7: Enabling and starting service..."
systemctl daemon-reload
systemctl enable cloudflared
systemctl start cloudflared
sleep 2
systemctl status cloudflared --no-pager
log_info "========================================="
log_info "Cloudflare Tunnel Setup Complete!"
log_info "========================================="
echo ""
log_warn "Next steps:"
echo " 1. Update /etc/cloudflared/config.yml with your actual domain"
echo " 2. Configure DNS records in Cloudflare Dashboard"
echo " 3. Set up Zero Trust policies in Cloudflare Dashboard"
echo " 4. Test tunnel connectivity: cloudflared tunnel info azure-stack-hci"
echo ""
log_info "Tunnel status: systemctl status cloudflared"
log_info "Tunnel logs: journalctl -u cloudflared -f"


@@ -0,0 +1,119 @@
#!/bin/bash
source ~/.bashrc
# Git Server Setup Script (Gitea)
# Run this on the Git Server VM after OS installation
set -e
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_step() {
echo -e "${BLUE}[STEP]${NC} $1"
}
# Check if running as root
if [ "$EUID" -ne 0 ]; then
log_error "Please run as root (use sudo)"
exit 1
fi
GITEA_VERSION="${GITEA_VERSION:-1.21.0}"
GITEA_USER="${GITEA_USER:-git}"
GITEA_HOME="${GITEA_HOME:-/var/lib/gitea}"
GITEA_DOMAIN="${GITEA_DOMAIN:-192.168.1.121}"
GITEA_PORT="${GITEA_PORT:-3000}"
log_step "Step 1: Installing dependencies..."
apt-get update
apt-get install -y git sqlite3
log_step "Step 2: Creating Gitea user..."
useradd -r -s /bin/bash -m -d "$GITEA_HOME" "$GITEA_USER" || log_warn "User $GITEA_USER may already exist"
log_step "Step 3: Downloading and installing Gitea..."
mkdir -p "$GITEA_HOME"
cd "$GITEA_HOME"
wget -O gitea "https://dl.gitea.io/gitea/${GITEA_VERSION}/gitea-${GITEA_VERSION}-linux-amd64"
chmod +x gitea
chown -R "$GITEA_USER:$GITEA_USER" "$GITEA_HOME"
log_step "Step 4: Creating systemd service..."
cat > /etc/systemd/system/gitea.service <<EOF
[Unit]
Description=Gitea (Git with a cup of tea)
After=network.target
[Service]
Type=simple
User=$GITEA_USER
Group=$GITEA_USER
WorkingDirectory=$GITEA_HOME
ExecStart=$GITEA_HOME/gitea web --config $GITEA_HOME/custom/conf/app.ini
Restart=always
RestartSec=5s
[Install]
WantedBy=multi-user.target
EOF
log_step "Step 5: Creating initial configuration..."
mkdir -p "$GITEA_HOME/custom/conf"
cat > "$GITEA_HOME/custom/conf/app.ini" <<EOF
[server]
DOMAIN = $GITEA_DOMAIN
HTTP_PORT = $GITEA_PORT
ROOT_URL = http://$GITEA_DOMAIN:$GITEA_PORT/
DISABLE_SSH = false
SSH_PORT = 22
[database]
DB_TYPE = sqlite3
PATH = $GITEA_HOME/data/gitea.db
[repository]
ROOT = $GITEA_HOME/gitea-repositories
[log]
MODE = file
LEVEL = Info
EOF
chown -R "$GITEA_USER:$GITEA_USER" "$GITEA_HOME"
log_step "Step 6: Enabling and starting Gitea..."
systemctl daemon-reload
systemctl enable gitea
systemctl start gitea
sleep 3
systemctl status gitea --no-pager
log_info "========================================="
log_info "Gitea Installation Complete!"
log_info "========================================="
echo ""
log_info "Access Gitea at: http://$GITEA_DOMAIN:$GITEA_PORT"
log_info "Initial setup will be required on first access"
echo ""
log_info "Next steps:"
echo " 1. Complete initial Gitea setup via web UI"
echo " 2. Create GitOps repository"
echo " 3. Configure SSH keys for Git access"
echo " 4. Set up Flux GitOps (if using)"


@@ -0,0 +1,261 @@
#!/bin/bash
source ~/.bashrc
# Install and Configure QEMU Guest Agent on All VMs
# This script installs qemu-guest-agent on Ubuntu VMs and enables it in Proxmox
set -e
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_step() {
echo -e "${BLUE}[STEP]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
# Load environment variables
if [ -f .env ]; then
set -a
source <(grep -v '^#' .env | grep -v '^$' | sed 's/#.*$//' | sed 's/^[[:space:]]*//;s/[[:space:]]*$//' | grep '=')
set +a
fi
PVE_USERNAME="${PVE_USERNAME:-root@pam}"
PVE_PASSWORD="${PVE_ROOT_PASS:-}"
PROXMOX_URL="https://192.168.1.206:8006"
PROXMOX_NODE="pve"
# VM Configuration
declare -A VMS=(
[100]="cloudflare-tunnel:192.168.1.60"
[101]="k3s-master:192.168.1.188"
[102]="git-server:192.168.1.121"
[103]="observability:192.168.1.82"
)
SSH_USER="${SSH_USER:-ubuntu}"
SSH_KEY="${SSH_KEY:-~/.ssh/id_rsa}"
# Get authentication ticket
get_ticket() {
local response=$(curl -k -s -d "username=$PVE_USERNAME&password=$PVE_PASSWORD" \
"$PROXMOX_URL/api2/json/access/ticket")
local ticket=$(echo "$response" | grep -o '"ticket":"[^"]*' | cut -d'"' -f4)
local csrf=$(echo "$response" | grep -o '"CSRFPreventionToken":"[^"]*' | cut -d'"' -f4)
if [ -z "$ticket" ] || [ -z "$csrf" ]; then
log_error "Failed to authenticate with Proxmox"
return 1
fi
echo "$ticket|$csrf"
}
# Check if VM is reachable
check_vm_reachable() {
local ip=$1
if ping -c 1 -W 2 "$ip" > /dev/null 2>&1; then
return 0
else
return 1
fi
}
# Check SSH connectivity
check_ssh() {
local ip=$1
local user=$2
if ssh -o ConnectTimeout=5 -o StrictHostKeyChecking=no -i "$SSH_KEY" "${user}@${ip}" "echo 'SSH OK'" > /dev/null 2>&1; then
return 0
else
return 1
fi
}
# Install guest agent on VM
install_guest_agent_on_vm() {
local vmid=$1
local name=$2
local ip=$3
log_step "Installing QEMU Guest Agent on VM $vmid: $name"
# Check if VM is reachable
if ! check_vm_reachable "$ip"; then
log_error "VM at $ip is not reachable, skipping..."
return 1
fi
# Check SSH
if ! check_ssh "$ip" "$SSH_USER"; then
log_error "SSH not available on $ip, skipping..."
return 1
fi
log_info "Installing qemu-guest-agent via SSH..."
# Install qemu-guest-agent (run inside 'if' so 'set -e' does not abort the whole script on failure)
if ssh -o StrictHostKeyChecking=no -i "$SSH_KEY" "${SSH_USER}@${ip}" <<EOF
sudo apt-get update
sudo apt-get install -y qemu-guest-agent
sudo systemctl enable qemu-guest-agent
sudo systemctl start qemu-guest-agent
sudo systemctl status qemu-guest-agent --no-pager | head -5
EOF
then
log_info "✓ Guest agent installed and started on VM $vmid"
return 0
else
log_error "✗ Failed to install guest agent on VM $vmid"
return 1
fi
}
# Enable guest agent in Proxmox VM configuration
enable_guest_agent_in_proxmox() {
local auth=$1
local vmid=$2
local name=$3
local ticket=$(echo "$auth" | cut -d'|' -f1)
local csrf=$(echo "$auth" | cut -d'|' -f2)
log_step "Enabling guest agent in Proxmox for VM $vmid: $name"
# Enable agent in VM config
local response=$(curl -k -s -X PUT \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf" \
-d "agent=1" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/config" 2>&1)
if echo "$response" | grep -q '"errors"'; then
log_error "Failed to enable agent: $response"
return 1
fi
log_info "✓ Guest agent enabled in Proxmox for VM $vmid"
return 0
}
# Verify guest agent is working
verify_guest_agent() {
local auth=$1
local vmid=$2
local name=$3
local ticket=$(echo "$auth" | cut -d'|' -f1)
local csrf=$(echo "$auth" | cut -d'|' -f2)
log_step "Verifying guest agent for VM $vmid: $name"
# Check agent status via Proxmox API
local response=$(curl -k -s \
-H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/agent/get-fsinfo" 2>&1)
if echo "$response" | grep -q '"result"'; then
log_info "✓ Guest agent is responding"
return 0
else
log_warn "⚠ Guest agent may not be fully ready yet"
log_info " This is normal if the VM was just configured"
log_info " The agent may take a few minutes to initialize"
return 1
fi
}
main() {
echo "========================================="
echo "Setup QEMU Guest Agent on All VMs"
echo "========================================="
echo ""
if [ -z "$PVE_PASSWORD" ]; then
log_error "PVE_ROOT_PASS not set in .env"
exit 1
fi
if [ ! -f "$SSH_KEY" ]; then
log_error "SSH key not found: $SSH_KEY"
log_info "Set SSH_KEY environment variable or create key pair"
exit 1
fi
log_info "Using SSH key: $SSH_KEY"
log_info "SSH user: $SSH_USER"
echo ""
# Authenticate with Proxmox
auth=$(get_ticket)
if [ $? -ne 0 ]; then
exit 1
fi
# Process each VM
for vmid in 100 101 102 103; do
IFS=':' read -r name ip <<< "${VMS[$vmid]}"
echo "----------------------------------------"
log_step "Processing VM $vmid: $name"
echo ""
# Step 1: Install guest agent on VM
if install_guest_agent_on_vm "$vmid" "$name" "$ip"; then
log_info "✓ Guest agent installed on VM"
else
log_error "✗ Failed to install guest agent"
echo ""
continue
fi
# Step 2: Enable agent in Proxmox
if enable_guest_agent_in_proxmox "$auth" "$vmid" "$name"; then
log_info "✓ Agent enabled in Proxmox"
else
log_error "✗ Failed to enable agent in Proxmox"
fi
# Step 3: Verify (optional, may take time)
sleep 2
verify_guest_agent "$auth" "$vmid" "$name" || true
echo ""
done
log_info "========================================="
log_info "Guest Agent Setup Complete"
log_info "========================================="
echo ""
log_info "Benefits of QEMU Guest Agent:"
echo " • Proper VM shutdown/reboot from Proxmox"
echo " • Automatic IP address detection"
echo " • Better VM status reporting"
echo " • File system information"
echo ""
log_warn "Note: Guest agent may take a few minutes to fully initialize"
log_info "You can verify in Proxmox Web UI:"
echo " VM → Monitor → QEMU Guest Agent"
}
main "$@"
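Once the agent is running, the Proxmox host can read guest IPs. The jq filter used by the helper library can be exercised against a sample payload shaped like the output of `qm guest cmd <vmid> network-get-interfaces` (addresses below are illustrative):

```shell
# Sample payload in the shape returned by the guest agent's
# network-get-interfaces command (addresses are illustrative)
payload='[{"name":"lo","ip-addresses":[{"ip-address-type":"ipv4","ip-address":"127.0.0.1"}]},{"name":"eth0","ip-addresses":[{"ip-address-type":"ipv4","ip-address":"192.168.1.188"}]}]'
# Same filter as get_vm_ip_from_guest_agent: first non-loopback IPv4
ip="$(echo "$payload" | jq -r '.[]?."ip-addresses"[]?
  | select(.["ip-address-type"] == "ipv4" and ."ip-address" != "127.0.0.1")
  | ."ip-address"' | head -n1)"
echo "$ip"
```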


@@ -0,0 +1,83 @@
#!/bin/bash
source ~/.bashrc
# K3s Installation Script
# Run this on the K3s VM after OS installation
set -e
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_step() {
echo -e "${BLUE}[STEP]${NC} $1"
}
# Check if running as root
if [ "$EUID" -ne 0 ]; then
log_error "Please run as root (use sudo)"
exit 1
fi
log_step "Step 1: Installing K3s..."
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="--write-kubeconfig-mode 644" sh -
log_step "Step 2: Verifying K3s installation..."
systemctl status k3s --no-pager || log_error "K3s service not running"
log_step "Step 3: Waiting for K3s to be ready..."
sleep 10
kubectl get nodes || log_warn "K3s may still be initializing"
log_step "Step 4: Installing kubectl (if not present)..."
if ! command -v kubectl &> /dev/null; then
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
chmod +x kubectl
mv kubectl /usr/local/bin/
fi
log_step "Step 5: Configuring kubectl..."
export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
mkdir -p ~/.kube
cp /etc/rancher/k3s/k3s.yaml ~/.kube/config
chmod 600 ~/.kube/config
log_step "Step 6: Verifying cluster..."
kubectl cluster-info
kubectl get nodes
log_info "========================================="
log_info "K3s Installation Complete!"
log_info "========================================="
echo ""
log_info "K3s is ready to use!"
echo ""
log_info "Useful commands:"
echo " kubectl get nodes"
echo " kubectl get pods --all-namespaces"
echo " kubectl cluster-info"
echo ""
log_warn "Next steps:"
echo " 1. Create namespaces: kubectl create namespace blockchain"
echo " 2. Deploy ingress controller"
echo " 3. Deploy cert-manager"
echo " 4. Deploy HC Stack services"
echo ""
log_info "Kubeconfig location: /etc/rancher/k3s/k3s.yaml"
log_info "Copy this file to access cluster remotely"
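When copying the kubeconfig to another machine, note that k3s writes the API server address as `https://127.0.0.1:6443`; it must be rewritten to the node's IP. A self-contained sketch of the rewrite (the IP is an assumption for your network):

```shell
# k3s kubeconfigs point at 127.0.0.1; rewrite the server line for remote use.
# In practice: scp ubuntu@<k3s-ip>:/etc/rancher/k3s/k3s.yaml ~/.kube/config
K3S_IP="192.168.1.188"   # assumption: your k3s node IP
line='    server: https://127.0.0.1:6443'
echo "$line" | sed "s|https://127.0.0.1:6443|https://${K3S_IP}:6443|"
```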


@@ -0,0 +1,146 @@
#!/bin/bash
source ~/.bashrc
# Observability Stack Setup Script (Prometheus + Grafana)
# Run this on the Observability VM after OS installation
set -e
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_step() {
echo -e "${BLUE}[STEP]${NC} $1"
}
# Check if running as root
if [ "$EUID" -ne 0 ]; then
log_error "Please run as root (use sudo)"
exit 1
fi
PROMETHEUS_VERSION="${PROMETHEUS_VERSION:-2.45.0}"
GRAFANA_VERSION="${GRAFANA_VERSION:-10.0.0}"
PROMETHEUS_USER="${PROMETHEUS_USER:-prometheus}"
GRAFANA_USER="${GRAFANA_USER:-grafana}"
log_step "Step 1: Installing dependencies..."
apt-get update
apt-get install -y wget curl gnupg
log_step "Step 2: Installing Prometheus..."
# Create Prometheus user
useradd -r -s /bin/false "$PROMETHEUS_USER" || log_warn "User $PROMETHEUS_USER may already exist"
# Download and install Prometheus
cd /tmp
wget "https://github.com/prometheus/prometheus/releases/download/v${PROMETHEUS_VERSION}/prometheus-${PROMETHEUS_VERSION}.linux-amd64.tar.gz"
tar xzf "prometheus-${PROMETHEUS_VERSION}.linux-amd64.tar.gz"
mv "prometheus-${PROMETHEUS_VERSION}.linux-amd64" /opt/prometheus
mkdir -p /etc/prometheus
mkdir -p /var/lib/prometheus
chown -R "$PROMETHEUS_USER:$PROMETHEUS_USER" /opt/prometheus /etc/prometheus /var/lib/prometheus
# Create Prometheus configuration
cat > /etc/prometheus/prometheus.yml <<EOF
global:
scrape_interval: 15s
evaluation_interval: 15s
scrape_configs:
- job_name: 'prometheus'
static_configs:
- targets: ['localhost:9090']
- job_name: 'node-exporter'
static_configs:
- targets: ['localhost:9100']
EOF
# Create systemd service
cat > /etc/systemd/system/prometheus.service <<EOF
[Unit]
Description=Prometheus
After=network.target
[Service]
Type=simple
User=$PROMETHEUS_USER
ExecStart=/opt/prometheus/prometheus --config.file=/etc/prometheus/prometheus.yml --storage.tsdb.path=/var/lib/prometheus
Restart=always
RestartSec=5s
[Install]
WantedBy=multi-user.target
EOF
log_step "Step 3: Installing Node Exporter..."
cd /tmp
wget "https://github.com/prometheus/node_exporter/releases/download/v1.6.1/node_exporter-1.6.1.linux-amd64.tar.gz"
tar xzf node_exporter-1.6.1.linux-amd64.tar.gz
mv node_exporter-1.6.1.linux-amd64/node_exporter /usr/local/bin/
chmod +x /usr/local/bin/node_exporter
cat > /etc/systemd/system/node-exporter.service <<EOF
[Unit]
Description=Node Exporter
After=network.target
[Service]
Type=simple
ExecStart=/usr/local/bin/node_exporter
Restart=always
RestartSec=5s
[Install]
WantedBy=multi-user.target
EOF
log_step "Step 4: Installing Grafana..."
apt-get install -y apt-transport-https software-properties-common
# apt-key is deprecated/removed on recent Ubuntu; use a dedicated keyring
wget -q -O - https://packages.grafana.com/gpg.key | gpg --dearmor > /usr/share/keyrings/grafana.gpg
echo "deb [signed-by=/usr/share/keyrings/grafana.gpg] https://packages.grafana.com/oss/deb stable main" > /etc/apt/sources.list.d/grafana.list
apt-get update
apt-get install -y grafana
log_step "Step 5: Starting services..."
systemctl daemon-reload
systemctl enable prometheus node-exporter grafana-server
systemctl start prometheus node-exporter grafana-server
sleep 3
log_info "========================================="
log_info "Observability Stack Installation Complete!"
log_info "========================================="
echo ""
log_info "Services:"
echo " - Prometheus: http://192.168.1.82:9090"
echo " - Grafana: http://192.168.1.82:3000"
echo " - Node Exporter: http://192.168.1.82:9100"
echo ""
log_info "Grafana default credentials:"
echo " Username: admin"
echo " Password: admin (change on first login)"
echo ""
log_info "Next steps:"
echo " 1. Access Grafana and change default password"
echo " 2. Add Prometheus as data source (http://localhost:9090)"
echo " 3. Import dashboards from grafana.com/dashboards"
echo " 4. Configure alerting rules"
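Step 2 (adding the data source) can also be done declaratively: Grafana loads YAML from its provisioning directory at startup. A sketch follows, written to a scratch path here; in practice the file goes to /etc/grafana/provisioning/datasources/ followed by a grafana-server restart:

```shell
# Provisioning file sketch; copy to /etc/grafana/provisioning/datasources/
# and restart grafana-server to register Prometheus automatically
out="${TMPDIR:-/tmp}/prometheus-datasource.yml"
cat > "$out" <<'EOF'
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://localhost:9090
    isDefault: true
EOF
grep -c 'url: http://localhost:9090' "$out"
```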


@@ -0,0 +1,118 @@
#!/bin/bash
source ~/.bashrc
# Verify Proxmox Cloud Image Integrity
# Usage: ./scripts/verify-proxmox-image.sh [proxmox-host] [image-path]
set -e
# Colors
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
RED='\033[0;31m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_step() {
echo -e "${BLUE}[STEP]${NC} $1"
}
# Load environment variables
if [ -f .env ]; then
set -a
source <(grep -v '^#' .env | grep -v '^$' | sed 's/#.*$//' | sed 's/^[[:space:]]*//;s/[[:space:]]*$//' | grep '=')
set +a
fi
PROXMOX_HOST="${1:-${PROXMOX_ML110_URL#https://}}"
PROXMOX_HOST="${PROXMOX_HOST%%:*}"
IMAGE_PATH="${2:-/var/lib/vz/template/iso/ubuntu-24.04-server-cloudimg-amd64.img}"
PVE_USERNAME="${PVE_USERNAME:-root@pam}"
PVE_PASSWORD="${PVE_ROOT_PASS:-}"
main() {
echo "========================================="
echo "Verify Proxmox Cloud Image Integrity"
echo "========================================="
echo ""
if [ -z "$PVE_PASSWORD" ]; then
log_error "PVE_ROOT_PASS not set in .env"
log_info "Set PVE_ROOT_PASS in .env file or pass as argument"
exit 1
fi
log_step "Connecting to Proxmox host: $PROXMOX_HOST"
log_info "Checking image: $IMAGE_PATH"
echo ""
# Check if file exists
log_step "1. Checking if file exists..."
if ssh -o StrictHostKeyChecking=no root@$PROXMOX_HOST "[ -f '$IMAGE_PATH' ]"; then
log_info "✓ File exists"
else
log_error "✗ File not found: $IMAGE_PATH"
log_info "Alternative locations to check:"
log_info " - /var/lib/vz/import/ubuntu-24.04-server-cloudimg-amd64.img.raw"
log_info " - /var/lib/vz/template/iso/ubuntu-24.04-server-cloudimg-amd64.img"
exit 1
fi
# Get file size
log_step "2. Checking file size..."
FILE_SIZE=$(ssh root@$PROXMOX_HOST "ls -lh '$IMAGE_PATH' | awk '{print \$5}'")
log_info "File size: $FILE_SIZE"
# Check file type
log_step "3. Checking file type..."
FILE_TYPE=$(ssh root@$PROXMOX_HOST "file '$IMAGE_PATH'")
log_info "$FILE_TYPE"
# Verify with qemu-img
log_step "4. Verifying image with qemu-img..."
if ssh root@$PROXMOX_HOST "qemu-img info '$IMAGE_PATH' 2>&1"; then
log_info "✓ Image appears valid"
else
log_error "✗ Image verification failed"
log_warn "Image may be corrupted. See TROUBLESHOOTING_VM_9000.md"
exit 1
fi
# Check disk space
log_step "5. Checking available disk space..."
ssh root@$PROXMOX_HOST "df -h /var/lib/vz | tail -1"
# Check for I/O errors in dmesg
log_step "6. Checking for recent I/O errors..."
# grep exits non-zero when nothing matches; '|| true' keeps 'set -e' from aborting
IO_ERRORS=$(ssh root@$PROXMOX_HOST "dmesg | grep -i 'i/o error' | tail -5" || true)
if [ -z "$IO_ERRORS" ]; then
log_info "✓ No recent I/O errors found"
else
log_warn "Recent I/O errors detected:"
echo "$IO_ERRORS"
log_warn "This may indicate storage issues"
fi
echo ""
log_info "========================================="
log_info "Verification Complete"
log_info "========================================="
log_info ""
log_info "If all checks passed, you can proceed with VM creation."
log_info "If errors were found, see TROUBLESHOOTING_VM_9000.md"
}
main "$@"
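The qemu-img check above catches structural corruption; a byte-level check compares the image's SHA-256 against Ubuntu's published SHA256SUMS for the release (the URL pattern follows the standard cloud-images layout). The hashing step is demonstrated here on a scratch file so it is runnable anywhere:

```shell
# On the Proxmox host you would run:
#   curl -sL https://cloud-images.ubuntu.com/releases/24.04/release/SHA256SUMS \
#     | grep ubuntu-24.04-server-cloudimg-amd64.img
#   sha256sum /var/lib/vz/template/iso/ubuntu-24.04-server-cloudimg-amd64.img
# Demonstration of the hashing step on a scratch file:
f="${TMPDIR:-/tmp}/hash-demo.img"
printf 'demo' > "$f"
hash="$(sha256sum "$f" | awk '{print $1}')"
echo "$hash"
```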

scripts/lib/git_helpers.sh Executable file

@@ -0,0 +1,77 @@
#!/bin/bash
# Git/Gitea Helper Functions
# Loads credentials from .env file for automated Git operations
set -euo pipefail
# Load Git credentials from .env file
load_git_credentials() {
local env_file="${1:-${PROJECT_ROOT:-.}/.env}"
if [ -f "$env_file" ]; then
# Source Gitea credentials
export GITEA_URL="${GITEA_URL:-http://192.168.1.121:3000}"
export GITEA_USERNAME="${GITEA_USERNAME:-pandoramannli}"
export GITEA_PASSWORD="${GITEA_PASSWORD:-admin123}"
export GITEA_TOKEN="${GITEA_TOKEN:-}"
export GITEA_REPO_OWNER="${GITEA_REPO_OWNER:-pandoramannli}"
export GITEA_REPO_NAME="${GITEA_REPO_NAME:-gitops}"
export GITEA_REPO_URL="${GITEA_REPO_URL:-http://192.168.1.121:3000/pandoramannli/gitops.git}"
export GITEA_SSH_URL="${GITEA_SSH_URL:-ssh://git@192.168.1.121:2222/pandoramannli/gitops.git}"
export GIT_USER_NAME="${GIT_USER_NAME:-Admin}"
export GIT_USER_EMAIL="${GIT_USER_EMAIL:-admin@hc-stack.local}"
# Override with .env values if present
set -a
source <(grep -E "^GITEA_|^GIT_USER_" "$env_file" 2>/dev/null | grep -v "^#" || true)
set +a
fi
}
# Get Git remote URL with credentials
get_git_remote_with_auth() {
load_git_credentials
if [ -n "${GITEA_TOKEN:-}" ]; then
# Use token authentication (preferred)
echo "http://oauth2:${GITEA_TOKEN}@${GITEA_URL#http://}/$(echo ${GITEA_REPO_URL} | sed 's|.*://[^/]*/||')"
else
# Use username/password authentication
echo "http://${GITEA_USERNAME}:${GITEA_PASSWORD}@${GITEA_URL#http://}/$(echo ${GITEA_REPO_URL} | sed 's|.*://[^/]*/||')"
fi
}
# Configure Git with credentials
configure_git_credentials() {
load_git_credentials
git config user.name "${GIT_USER_NAME}"
git config user.email "${GIT_USER_EMAIL}"
# Set up credential helper for this repo
local repo_url="${GITEA_REPO_URL}"
if [[ "$repo_url" == http* ]]; then
if [ -n "${GITEA_TOKEN:-}" ]; then
# Use token authentication (preferred)
git remote set-url origin "http://oauth2:${GITEA_TOKEN}@${repo_url#http://}"
else
# Use username/password authentication
git remote set-url origin "http://${GITEA_USERNAME}:${GITEA_PASSWORD}@${repo_url#http://}"
fi
fi
}
# Push to Gitea repository
push_to_gitea() {
local repo_path="${1:-.}"
local branch="${2:-main}"
load_git_credentials
configure_git_credentials
cd "$repo_path"
git add -A
git commit -m "${3:-Update GitOps manifests}" || true
git push origin "$branch" 2>&1
}
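To make the URL rewriting in get_git_remote_with_auth concrete, here is the same transformation with dummy values (the token and host are illustrations only; never commit real credentials):

```shell
# Mirrors get_git_remote_with_auth's token branch with dummy values
GITEA_URL="http://192.168.1.121:3000"
GITEA_TOKEN="dummy-token"                                  # illustration only
GITEA_REPO_URL="http://192.168.1.121:3000/pandoramannli/gitops.git"
repo_path="$(echo "${GITEA_REPO_URL}" | sed 's|.*://[^/]*/||')"
remote="http://oauth2:${GITEA_TOKEN}@${GITEA_URL#http://}/${repo_path}"
echo "$remote"
```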

scripts/lib/proxmox_vm_helpers.sh Executable file

@@ -0,0 +1,126 @@
#!/bin/bash
source ~/.bashrc
# Proxmox VM Helper Functions
# Shared library for Proxmox VM operations with guest-agent IP discovery
set -euo pipefail
# Ensure we're on a Proxmox node
if ! command -v qm >/dev/null 2>&1; then
echo "[ERROR] qm command not found. Run this on a Proxmox host." >&2
exit 1
fi
# Ensure jq is installed
if ! command -v jq >/dev/null 2>&1; then
echo "[ERROR] jq command not found. Install with: apt update && apt install -y jq" >&2
exit 1
fi
# get_vm_ip_from_guest_agent <vmid>
#
# Uses qemu-guest-agent to read network interfaces and returns the first
# non-loopback IPv4 address. Requires:
# - qemu-guest-agent installed in the guest
# - Agent enabled in VM config: qm set <vmid> --agent enabled=1
#
# Returns: IP address or empty string if not available
get_vm_ip_from_guest_agent() {
local vmid="$1"
# This will exit non-zero if guest agent is not running or not enabled
qm guest cmd "$vmid" network-get-interfaces 2>/dev/null \
| jq -r '
.[]?."ip-addresses"[]?
| select(.["ip-address-type"] == "ipv4"
and ."ip-address" != "127.0.0.1")
| ."ip-address"
' \
| head -n1 || echo ""
}
# Convenience wrapper that logs and optionally fails
# get_vm_ip_or_warn <vmid> <name>
#
# Returns: IP address or empty string
# Prints: Warning message if IP not available
get_vm_ip_or_warn() {
local vmid="$1"
local name="$2"
local ip
ip="$(get_vm_ip_from_guest_agent "$vmid" || true)"
if [[ -z "$ip" ]]; then
echo "[WARN] No IP from guest agent for VM $vmid ($name)." >&2
echo " - Ensure qemu-guest-agent is installed in the guest" >&2
echo " - Ensure 'Agent' is enabled in VM options" >&2
echo " - VM must be powered on" >&2
return 1
fi
echo "$ip"
}
# get_vm_ip_or_fallback <vmid> <name> <fallback_ip>
#
# Tries guest agent first, falls back to provided IP if agent not available
# Useful for bootstrap scenarios
get_vm_ip_or_fallback() {
local vmid="$1"
local name="$2"
local fallback_ip="${3:-}"
local ip
ip="$(get_vm_ip_from_guest_agent "$vmid" || true)"
if [[ -n "$ip" ]]; then
echo "$ip"
return 0
fi
if [[ -n "$fallback_ip" ]]; then
echo "[INFO] Guest agent not available for VM $vmid ($name), using fallback IP: $fallback_ip" >&2
echo "$fallback_ip"
return 0
fi
echo "[ERROR] No IP available for VM $vmid ($name) (no guest agent, no fallback)" >&2
return 1
}
# ensure_guest_agent_enabled <vmid>
#
# Ensures guest agent is enabled in VM config (doesn't install in guest)
ensure_guest_agent_enabled() {
local vmid="$1"
qm set "$vmid" --agent enabled=1 >/dev/null 2>&1 || true
}
# check_vm_status <vmid>
#
# Returns VM status (running, stopped, etc.)
check_vm_status() {
local vmid="$1"
qm status "$vmid" 2>/dev/null | awk '{print $2}' || echo "unknown"
}
# wait_for_guest_agent <vmid> <timeout_seconds>
#
# Waits for guest agent to become available
wait_for_guest_agent() {
local vmid="$1"
local timeout="${2:-60}"
local elapsed=0
while [ $elapsed -lt $timeout ]; do
if get_vm_ip_from_guest_agent "$vmid" >/dev/null 2>&1; then
return 0
fi
sleep 2
elapsed=$((elapsed + 2))
done
return 1
}
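
The polling helper above can be exercised without a Proxmox host by stubbing `get_vm_ip_from_guest_agent`. A sketch (the stub, the VM ID, and the IP are illustrative only):

```shell
#!/bin/bash
# Stub standing in for the real helper: fail twice, then return an IP,
# simulating a guest agent that is still coming up.
attempts=0
get_vm_ip_from_guest_agent() {
    attempts=$((attempts + 1))
    if [ "$attempts" -ge 3 ]; then
        echo "192.168.1.188"
    else
        return 1
    fi
}

# Same loop as wait_for_guest_agent above, with a 1s poll interval for brevity
wait_for_guest_agent() {
    local vmid="$1"
    local timeout="${2:-60}"
    local elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        if get_vm_ip_from_guest_agent "$vmid" >/dev/null 2>&1; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}

if wait_for_guest_agent 101 10; then
    echo "agent up after $attempts attempts"
fi
```

The same pattern works in bootstrap scripts: call `wait_for_guest_agent` right after `qm start`, then fetch the IP once it returns.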


@@ -0,0 +1,88 @@
#!/bin/bash
source ~/.bashrc
# Collect Metrics
# Collects system, application, network, and storage metrics
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
OUTPUT_DIR="${METRICS_OUTPUT_DIR:-$PROJECT_ROOT/metrics}"
# Colors
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
collect_system_metrics() {
log_info "Collecting system metrics..."
mkdir -p "$OUTPUT_DIR/system"
# CPU metrics
top -bn1 | head -20 > "$OUTPUT_DIR/system/cpu.txt" 2>/dev/null || true
# Memory metrics
free -h > "$OUTPUT_DIR/system/memory.txt" 2>/dev/null || true
# Disk metrics
df -h > "$OUTPUT_DIR/system/disk.txt" 2>/dev/null || true
# Load average
uptime > "$OUTPUT_DIR/system/uptime.txt" 2>/dev/null || true
}
collect_kubernetes_metrics() {
log_info "Collecting Kubernetes metrics..."
if ! command -v kubectl &> /dev/null; then
log_warn "kubectl not found, skipping Kubernetes metrics"
return 0
fi
mkdir -p "$OUTPUT_DIR/kubernetes"
# Node metrics
kubectl top nodes > "$OUTPUT_DIR/kubernetes/nodes.txt" 2>/dev/null || true
# Pod metrics
kubectl top pods --all-namespaces > "$OUTPUT_DIR/kubernetes/pods.txt" 2>/dev/null || true
# Resource usage
kubectl get nodes -o json > "$OUTPUT_DIR/kubernetes/nodes.json" 2>/dev/null || true
}
collect_network_metrics() {
log_info "Collecting network metrics..."
mkdir -p "$OUTPUT_DIR/network"
# Interface statistics
ip -s link > "$OUTPUT_DIR/network/interfaces.txt" 2>/dev/null || true
# Network connections
ss -tunap > "$OUTPUT_DIR/network/connections.txt" 2>/dev/null || true
}
main() {
log_info "Collecting metrics..."
mkdir -p "$OUTPUT_DIR"
collect_system_metrics
collect_kubernetes_metrics
collect_network_metrics
log_info "Metrics collected to: $OUTPUT_DIR"
}
main "$@"


@@ -0,0 +1,63 @@
#!/bin/bash
source ~/.bashrc
# Setup Alerts
# Configures alerting rules and notification channels
set -e
# Colors
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
setup_prometheus_alerts() {
log_info "Setting up Prometheus alerts..."
if ! command -v kubectl &> /dev/null; then
log_warn "kubectl not found, skipping Prometheus alert setup"
return 0
fi
log_info "Prometheus alert rules should be configured via:"
log_info " - Prometheus Operator Alertmanager"
log_info " - Custom Resource Definitions (CRDs)"
log_info " - GitOps manifests"
log_warn "Manual configuration required for alert rules"
}
setup_azure_alerts() {
log_info "Setting up Azure alerts..."
if ! command -v az &> /dev/null; then
log_warn "Azure CLI not found, skipping Azure alert setup"
return 0
fi
log_info "Azure alerts should be configured via:"
log_info " - Azure Portal: Monitor > Alerts"
log_info " - Azure CLI: az monitor metrics alert create"
log_info " - Terraform: azurerm_monitor_metric_alert"
log_warn "Manual configuration required for Azure alerts"
}
main() {
log_info "Setting up alerting..."
setup_prometheus_alerts
setup_azure_alerts
log_info "Alert setup complete (manual configuration may be required)"
}
main "$@"

scripts/ops/ssh-test-all.sh Executable file

@@ -0,0 +1,91 @@
#!/bin/bash
source ~/.bashrc
# SSH Test All VMs - Using Guest Agent IP Discovery
# Tests SSH access to all VMs using dynamically discovered IPs
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
VM_USER="${VM_USER:-ubuntu}"
SSH_KEY="${SSH_KEY:-$HOME/.ssh/id_ed25519_proxmox}"
PROXMOX_HOST="${PROXMOX_ML110_IP:-192.168.1.206}"
# VMID NAME (no IP here)
VMS=(
"100 cloudflare-tunnel"
"101 k3s-master"
"102 git-server"
"103 observability"
)
# Import helper (must be run on Proxmox host or via SSH)
if [ -f "$PROJECT_ROOT/scripts/lib/proxmox_vm_helpers.sh" ]; then
source "$PROJECT_ROOT/scripts/lib/proxmox_vm_helpers.sh"
else
log_error "Helper library not found: $PROJECT_ROOT/scripts/lib/proxmox_vm_helpers.sh"
exit 1
fi
main() {
log_info "Testing SSH access to all VMs using guest-agent IP discovery"
echo ""
for vm_spec in "${VMS[@]}"; do
read -r vmid name <<< "$vm_spec"
echo "=== VM $vmid: $name ==="
# Ensure guest agent is enabled in VM config
ensure_guest_agent_enabled "$vmid" || true
# Get IP from guest agent
ip="$(get_vm_ip_or_warn "$vmid" "$name" || true)"
if [[ -z "$ip" ]]; then
echo "  Skipping VM $vmid ($name): no IP from guest agent."
echo
continue
fi
echo " Using IP: $ip"
echo " Running 'hostname' via SSH..."
if ssh -i "$SSH_KEY" -o StrictHostKeyChecking=accept-new -o ConnectTimeout=5 "${VM_USER}@${ip}" hostname 2>/dev/null; then
log_info " ✓ SSH working"
else
log_error " ✗ SSH failed for ${VM_USER}@${ip}"
fi
echo
done
log_info "SSH test complete"
}
main "$@"


@@ -0,0 +1,185 @@
#!/bin/bash
source ~/.bashrc
# Create Service VMs on Proxmox
# Creates VMs for K3s, Cloudflare Tunnel, Git Server, and Observability
set -e
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
# Load environment variables
if [ -f .env ]; then
set -a
source <(grep -v '^#' .env | grep -v '^$' | sed 's/#.*$//' | sed 's/^[[:space:]]*//;s/[[:space:]]*$//' | grep '=')
set +a
fi
# Proxmox configuration
PVE_USERNAME="${PVE_USERNAME:-root@pam}"
PVE_PASSWORD="${PVE_ROOT_PASS:-}"
PROXMOX_HOST="${1:-192.168.1.206}" # Default to ML110
PROXMOX_URL="https://${PROXMOX_HOST}:8006"
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_test() {
echo -e "${BLUE}[TEST]${NC} $1"
}
# Get authentication ticket
get_ticket() {
local response=$(curl -k -s -d "username=$PVE_USERNAME&password=$PVE_PASSWORD" \
"$PROXMOX_URL/api2/json/access/ticket")
local ticket=$(echo "$response" | grep -o '"ticket":"[^"]*' | cut -d'"' -f4)
local csrf=$(echo "$response" | grep -o '"CSRFPreventionToken":"[^"]*' | cut -d'"' -f4)
if [ -z "$ticket" ] || [ -z "$csrf" ]; then
log_error "Failed to authenticate with Proxmox"
return 1
fi
echo "$ticket|$csrf"
}
# Get next available VM ID
get_next_vmid() {
local auth=$1
local ticket=$(echo "$auth" | cut -d'|' -f1)
local csrf=$(echo "$auth" | cut -d'|' -f2)
local response=$(curl -k -s -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf" \
"$PROXMOX_URL/api2/json/cluster/nextid")
echo "$response" | grep -o '"data":"[^"]*' | cut -d'"' -f4
}
# List existing VMs
list_vms() {
local auth=$1
local ticket=$(echo "$auth" | cut -d'|' -f1)
local csrf=$(echo "$auth" | cut -d'|' -f2)
log_info "Listing existing VMs on $PROXMOX_HOST..."
local response=$(curl -k -s -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf" \
"$PROXMOX_URL/api2/json/cluster/resources?type=vm")
echo "$response" | python3 -c "
import sys, json
data = json.load(sys.stdin)
vms = [v for v in data.get('data', []) if v.get('type') == 'qemu']
if vms:
print(f'Found {len(vms)} VMs:')
for vm in vms:
print(f\" - {vm.get('name', 'unknown')} (ID: {vm.get('vmid', 'N/A')}, Status: {vm.get('status', 'unknown')})\")
else:
print('No VMs found')
" 2>/dev/null || echo "Could not parse VM list"
}
# Create VM (simplified - requires template)
create_vm() {
local auth=$1
local vmid=$2
local name=$3
local cores=$4
local memory=$5
local disk_size=$6
local ip_address=$7
local ticket=$(echo "$auth" | cut -d'|' -f1)
local csrf=$(echo "$auth" | cut -d'|' -f2)
log_info "Creating VM: $name (ID: $vmid)"
# Note: This is a simplified example
# Full VM creation requires a template or ISO
# For now, we'll provide instructions
log_warn "VM creation via API requires:"
log_warn " 1. A VM template (e.g., ubuntu-22.04-template)"
log_warn " 2. Or use Proxmox Web UI for initial VM creation"
log_warn " 3. Or use Terraform (recommended)"
echo ""
log_info "Recommended: Use Proxmox Web UI or Terraform"
log_info " Web UI: $PROXMOX_URL"
log_info " Terraform: cd terraform/proxmox && terraform apply"
}
main() {
echo "========================================="
echo "Proxmox Service VM Creation"
echo "========================================="
echo ""
if [ -z "$PVE_PASSWORD" ]; then
log_error "PVE_ROOT_PASS not set in .env"
exit 1
fi
log_info "Connecting to Proxmox: $PROXMOX_URL"
# Authenticate
local auth
# Assign separately from 'local' so get_ticket's exit status isn't masked
if ! auth=$(get_ticket) || [ -z "$auth" ]; then
exit 1
fi
log_info "Authentication successful"
echo ""
# List existing VMs
list_vms "$auth"
echo ""
# Get next VM ID
local next_id=$(get_next_vmid "$auth")
log_info "Next available VM ID: $next_id"
echo ""
log_info "Service VMs to create:"
echo " 1. Cloudflare Tunnel VM (ID: $next_id)"
echo " - 2 vCPU, 4GB RAM, 40GB disk"
echo " - IP: 192.168.1.60"
echo ""
echo " 2. K3s Master VM (ID: $((next_id + 1)))"
echo " - 4 vCPU, 8GB RAM, 80GB disk"
echo " - IP: 192.168.1.188"
echo ""
echo " 3. Git Server VM (ID: $((next_id + 2)))"
echo " - 4 vCPU, 8GB RAM, 100GB disk"
echo " - IP: 192.168.1.121"
echo ""
echo " 4. Observability VM (ID: $((next_id + 3)))"
echo " - 4 vCPU, 8GB RAM, 200GB disk"
echo " - IP: 192.168.1.82"
echo ""
log_warn "Full VM creation via API requires templates."
log_info "Options:"
log_info " 1. Use Proxmox Web UI: $PROXMOX_URL"
log_info " 2. Use Terraform: cd terraform/proxmox"
log_info " 3. Create templates first, then use API"
}
main "$@"
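
Once a template exists (e.g. VM 9000), cloning it through the same API is one POST per VM. A dry-run sketch: `run` prints each curl command instead of executing it, and the template ID, node name, and URL are assumptions from this repo's defaults:

```shell
#!/bin/bash
# Dry run: print each curl invocation instead of executing it.
run() { echo "+ $*"; }

PROXMOX_URL="https://192.168.1.206:8006"
NODE="pve"
TEMPLATE_ID=9000

clone_vm() {
    local newid="$1" name="$2"
    # POST /nodes/{node}/qemu/{vmid}/clone is the documented clone endpoint;
    # $TICKET and $CSRF would come from get_ticket in the script above.
    run curl -k -s -X POST \
        -H "Cookie: PVEAuthCookie=\$TICKET" \
        -H "CSRFPreventionToken: \$CSRF" \
        -d "newid=$newid" -d "name=$name" -d "full=1" \
        "$PROXMOX_URL/api2/json/nodes/$NODE/qemu/$TEMPLATE_ID/clone"
}

# The four service VMs listed above
clone_vm 100 cloudflare-tunnel
clone_vm 101 k3s-master
clone_vm 102 git-server
clone_vm 103 observability
```

Dropping the `run` wrapper (and supplying a real ticket) would perform the clones; sizing and cloud-init settings can then be applied per VM with follow-up config calls.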

scripts/quality/lint-scripts.sh Executable file

@@ -0,0 +1,92 @@
#!/bin/bash
source ~/.bashrc
# Lint Scripts
# Run shellcheck on all shell scripts
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
SCRIPTS_DIR="$PROJECT_ROOT/scripts"
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
check_shellcheck() {
if ! command -v shellcheck &> /dev/null; then
log_error "shellcheck not found"
log_info "Install shellcheck:"
log_info " Ubuntu/Debian: sudo apt-get install shellcheck"
log_info " macOS: brew install shellcheck"
log_info " Or download from: https://github.com/koalaman/shellcheck"
return 1
fi
return 0
}
lint_scripts() {
log_info "Linting all shell scripts..."
local errors=0
local total=0
while IFS= read -r -d '' file; do
total=$((total + 1))
log_info "Checking: $file"
# Capture output to a file so shellcheck's own exit code (not tee's) decides pass/fail
if shellcheck -x "$file" > /tmp/shellcheck_output.$$ 2>&1; then
log_info "  ✓ No issues found"
else
errors=$((errors + 1))
log_error "  ✗ Issues found in $file"
cat /tmp/shellcheck_output.$$
fi
rm -f /tmp/shellcheck_output.$$
done < <(find "$SCRIPTS_DIR" -name "*.sh" -type f -print0)
echo ""
log_info "Linting complete:"
log_info " Total scripts: $total"
log_info " Errors: $errors"
if [ $errors -eq 0 ]; then
log_info "✓ All scripts passed linting"
return 0
else
log_error "$errors script(s) have issues"
return 1
fi
}
main() {
log_info "Script Linting"
echo ""
if ! check_shellcheck; then
exit 1
fi
lint_scripts
}
main "$@"


@@ -0,0 +1,115 @@
#!/bin/bash
source ~/.bashrc
# Validate Scripts
# Validate script syntax and check for common issues
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
SCRIPTS_DIR="$PROJECT_ROOT/scripts"
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
validate_syntax() {
log_info "Validating script syntax..."
local errors=0
local total=0
while IFS= read -r -d '' file; do
total=$((total + 1))
# Check bash syntax
if bash -n "$file" 2>&1; then
log_info "$file: Syntax OK"
else
errors=$((errors + 1))
log_error "$file: Syntax error"
fi
done < <(find "$SCRIPTS_DIR" -name "*.sh" -type f -print0)
echo ""
log_info "Syntax validation complete:"
log_info " Total scripts: $total"
log_info " Errors: $errors"
if [ $errors -eq 0 ]; then
log_info "✓ All scripts have valid syntax"
return 0
else
log_error "$errors script(s) have syntax errors"
return 1
fi
}
check_shebangs() {
log_info "Checking shebangs..."
local missing=0
while IFS= read -r -d '' file; do
if ! head -1 "$file" | grep -q "^#!/bin/bash"; then
missing=$((missing + 1))
log_warn " Missing or incorrect shebang: $file"
fi
done < <(find "$SCRIPTS_DIR" -name "*.sh" -type f -print0)
if [ $missing -eq 0 ]; then
log_info "✓ All scripts have correct shebangs"
else
log_warn "$missing script(s) missing or have incorrect shebangs"
fi
}
check_executable() {
log_info "Checking executable permissions..."
local not_executable=0
while IFS= read -r -d '' file; do
if [ ! -x "$file" ]; then
not_executable=$((not_executable + 1))
log_warn " Not executable: $file"
fi
done < <(find "$SCRIPTS_DIR" -name "*.sh" -type f -print0)
if [ $not_executable -eq 0 ]; then
log_info "✓ All scripts are executable"
else
log_warn "$not_executable script(s) are not executable"
log_info "Run: find scripts/ -name '*.sh' -exec chmod +x {} \\;"
fi
}
main() {
log_info "Script Validation"
echo ""
local failed=0
# With set -e, run the syntax check inside || so the remaining checks still execute
validate_syntax || failed=1
echo ""
check_shebangs
echo ""
check_executable
echo ""
log_info "Validation complete"
return $failed
}
main "$@"


@@ -0,0 +1,109 @@
#!/bin/bash
source ~/.bashrc
# Configure Firewall Rules for Proxmox Hosts
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
SSH_KEY="${SSH_KEY:-$HOME/.ssh/id_ed25519_proxmox}"
PROXMOX_HOSTS=("192.168.1.206" "192.168.1.49") # ML110 and R630
main() {
log_info "Configuring Firewall Rules for Proxmox Hosts"
echo ""
for host in "${PROXMOX_HOSTS[@]}"; do
log_info "Configuring firewall on $host..."
# Check if we can connect
if ! ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "root@${host}" "pveversion" &>/dev/null; then
log_warn "Cannot connect to $host. Skipping..."
continue
fi
# Enable firewall if not already enabled
log_info "Enabling firewall..."
ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "root@${host}" <<'EOF'
set -e
# The pve-firewall CLI has no subcommands for security groups or options;
# datacenter-level firewall config lives in /etc/pve/firewall/cluster.fw.
# Back up any existing config before overwriting it.
[ -f /etc/pve/firewall/cluster.fw ] && cp /etc/pve/firewall/cluster.fw /etc/pve/firewall/cluster.fw.bak
cat > /etc/pve/firewall/cluster.fw <<'FWCONF'
[OPTIONS]
enable: 1
log_level_in: info
log_level_out: info

# Security group for cluster communication
[group cluster-comm]
IN ACCEPT -p tcp -dport 8006 # Proxmox Web UI
IN ACCEPT -p tcp -dport 22 # SSH
IN ACCEPT -p udp -dport 5404:5412 # Corosync cluster
IN ACCEPT -p tcp -dport 3128 # SPICE proxy
IN ACCEPT -p tcp -dport 111 # RPC
IN ACCEPT -p tcp -dport 2049 # NFS
IN ACCEPT -p tcp -dport 5900:5999 # VNC
IN ACCEPT -p tcp -dport 60000:60050 # Migration

# Security group for VM service ports
[group vm-services]
IN ACCEPT -p tcp -dport 3000 # Gitea/Grafana
IN ACCEPT -p tcp -dport 9090 # Prometheus
IN ACCEPT -p tcp -dport 6443 # K3s API
IN ACCEPT -p tcp -dport 10250 # Kubelet

[RULES]
IN ACCEPT -source 192.168.1.0/24 # Allow cluster subnet
FWCONF
# Verify the config parses, then start the firewall service
pve-firewall compile > /dev/null
pve-firewall start || true
echo "Firewall configured successfully"
EOF
log_info "✓ Firewall configured on $host"
echo ""
done
log_info "Firewall configuration complete!"
echo ""
log_warn "Review firewall rules:"
log_info " - Check rules: pve-firewall status"
log_info " - View config: cat /etc/pve/firewall/cluster.fw"
log_info " - Test connectivity after applying rules"
echo ""
log_info "Default rules allow:"
log_info " - Cluster communication (ports 5404-5412 UDP)"
log_info " - Proxmox Web UI (port 8006)"
log_info " - SSH (port 22)"
log_info " - VM services (ports 3000, 9090, 6443, 10250)"
log_info " - Migration ports (60000-60050)"
}
main "$@"


@@ -0,0 +1,93 @@
#!/bin/bash
source ~/.bashrc
# Setup Proxmox RBAC (Role-Based Access Control)
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
SSH_KEY="${SSH_KEY:-$HOME/.ssh/id_ed25519_proxmox}"
PROXMOX_HOSTS=("192.168.1.206" "192.168.1.49") # ML110 and R630
main() {
log_info "Setting up Proxmox RBAC"
echo ""
for host in "${PROXMOX_HOSTS[@]}"; do
log_info "Configuring RBAC on $host..."
# Check if we can connect
if ! ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "root@${host}" "pveversion" &>/dev/null; then
log_warn "Cannot connect to $host. Skipping..."
continue
fi
# Create roles
log_info "Creating custom roles..."
ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no "root@${host}" <<'EOF'
set -e
# 'pveum role add' fails if the role already exists, so tolerate re-runs
# Create VM Operator role (can manage VMs but not hosts)
pveum role add VMOperator --privs "VM.Allocate VM.Audit VM.Clone VM.Config.CDROM VM.Config.CPU VM.Config.Disk VM.Config.HWType VM.Config.Memory VM.Config.Network VM.Config.Options VM.Monitor VM.PowerMgmt Datastore.Allocate Datastore.Audit" 2>/dev/null || true
# Create VM Viewer role (read-only access to VMs)
pveum role add VMViewer --privs "VM.Audit VM.Monitor Datastore.Audit" 2>/dev/null || true
# Create Storage Operator role (can manage storage)
pveum role add StorageOperator --privs "Datastore.Allocate Datastore.Audit Datastore.AllocateSpace Datastore.AllocateTemplate" 2>/dev/null || true
# Create Network Operator role (SDN management; Proxmox has no Network.* privileges)
pveum role add NetworkOperator --privs "SDN.Allocate SDN.Audit SDN.Use" 2>/dev/null || true
echo "Roles created successfully"
EOF
log_info "✓ RBAC roles created on $host"
echo ""
done
log_info "RBAC setup complete!"
echo ""
log_warn "Manual steps required:"
log_info "1. Create users via Web UI: Datacenter → Permissions → Users → Add"
log_info "2. Assign roles to users: Datacenter → Permissions → User → Edit → Roles"
log_info "3. Create API tokens for automation:"
log_info " - Datacenter → Permissions → API Tokens → Add"
log_info " - Store tokens securely in .env file"
echo ""
log_info "Available roles:"
log_info " - VMOperator: Full VM management"
log_info " - VMViewer: Read-only VM access"
log_info " - StorageOperator: Storage management"
log_info " - NetworkOperator: Network management"
}
main "$@"
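
The manual steps above can also be scripted with `pveum` once the roles exist. A dry-run sketch: `run` prints instead of executing, and the user name, token ID, and ACL path are hypothetical:

```shell
#!/bin/bash
# Dry run: echo each pveum command rather than executing on a Proxmox host.
run() { echo "+ $*"; }

USER="automation@pve"
TOKEN_ID="deploy"

# Create the user, grant the custom role at the datacenter root,
# and mint a privilege-separated API token for automation.
run pveum user add "$USER" --comment "CI automation account"
run pveum acl modify / -user "$USER" -role VMOperator
run pveum user token add "$USER" "$TOKEN_ID" -privsep 1
```

On a real host, `pveum user token add` prints the token secret exactly once; that value is what belongs in the `.env` file mentioned above.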

scripts/test/run-all-tests.sh Executable file

@@ -0,0 +1,116 @@
#!/bin/bash
source ~/.bashrc
# Run All Tests
# Orchestrates all test suites
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
TESTS_DIR="$PROJECT_ROOT/tests"
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_test() {
echo -e "${BLUE}[TEST]${NC} $1"
}
run_test_suite() {
local suite_dir=$1
local suite_name=$2
if [ ! -d "$suite_dir" ]; then
log_warn "Test suite directory not found: $suite_dir"
return 0
fi
log_test "Running $suite_name tests..."
local tests_passed=0
local tests_failed=0
while IFS= read -r -d '' test_file; do
if [ -x "$test_file" ]; then
log_info " Running: $(basename "$test_file")"
if "$test_file"; then
tests_passed=$((tests_passed + 1))
log_info " ✓ Passed"
else
tests_failed=$((tests_failed + 1))
log_error " ✗ Failed"
fi
fi
done < <(find "$suite_dir" -name "test-*.sh" -type f -print0)
log_info "$suite_name: $tests_passed passed, $tests_failed failed"
return $tests_failed
}
main() {
echo "========================================="
echo "Running All Test Suites"
echo "========================================="
echo ""
local total_failed=0
# With set -e, call each suite inside || so a failing suite doesn't abort the runner
# Run E2E tests
if [ -d "$TESTS_DIR/e2e" ]; then
run_test_suite "$TESTS_DIR/e2e" "E2E" || total_failed=$((total_failed + $?))
echo ""
fi
# Run unit tests
if [ -d "$TESTS_DIR/unit" ]; then
run_test_suite "$TESTS_DIR/unit" "Unit" || total_failed=$((total_failed + $?))
echo ""
fi
# Run integration tests
if [ -d "$TESTS_DIR/integration" ]; then
run_test_suite "$TESTS_DIR/integration" "Integration" || total_failed=$((total_failed + $?))
echo ""
fi
# Run performance tests (optional)
if [ -d "$TESTS_DIR/performance" ] && [ "${RUN_PERF_TESTS:-false}" = "true" ]; then
run_test_suite "$TESTS_DIR/performance" "Performance" || total_failed=$((total_failed + $?))
echo ""
fi
echo "========================================="
echo "Test Summary"
echo "========================================="
if [ $total_failed -eq 0 ]; then
log_info "✓ All test suites passed"
exit 0
else
log_error "$total_failed test suite(s) failed"
exit 1
fi
}
main "$@"


@@ -0,0 +1,158 @@
#!/bin/bash
source ~/.bashrc
# Diagnose VM Issues
# Comprehensive diagnosis of VM problems
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
PVE_USERNAME="${PVE_USERNAME:-root@pam}"
PVE_PASSWORD="${PVE_ROOT_PASS:-}"
PROXMOX_URL="${PROXMOX_ML110_URL:-https://192.168.1.206:8006}"
PROXMOX_NODE="${PROXMOX_NODE:-pve}"
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_issue() {
echo -e "${RED}[ISSUE]${NC} $1"
}
get_api_token() {
local response=$(curl -s -k --connect-timeout 10 --max-time 15 \
-d "username=$PVE_USERNAME&password=$PVE_PASSWORD" \
"$PROXMOX_URL/api2/json/access/ticket" 2>&1)
if echo "$response" | grep -q '"data"'; then
local ticket=$(echo "$response" | grep -o '"ticket":"[^"]*' | cut -d'"' -f4)
local csrf_token=$(echo "$response" | grep -o '"CSRFPreventionToken":"[^"]*' | cut -d'"' -f4)
echo "$ticket|$csrf_token"
else
echo ""
fi
}
diagnose_template() {
log_info "Diagnosing template VM 9000..."
local tokens=$(get_api_token)
local ticket=$(echo "$tokens" | cut -d'|' -f1)
local csrf_token=$(echo "$tokens" | cut -d'|' -f2)
local config=$(curl -s -k -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/9000/config")
local disk=$(echo "$config" | python3 -c "import sys, json; d=json.load(sys.stdin).get('data', {}); print(d.get('scsi0', ''))" 2>/dev/null)
local size=$(echo "$disk" | grep -o 'size=[^,]*' | cut -d'=' -f2)
if [ "$size" = "600M" ]; then
log_issue "Template has only 600M disk - likely no OS installed"
log_warn "Template may need OS installation before cloning"
return 1
fi
return 0
}
diagnose_vm() {
local vmid=$1
local name=$2
local ip=$3
log_info "Diagnosing VM $vmid ($name)..."
local tokens=$(get_api_token)
local ticket=$(echo "$tokens" | cut -d'|' -f1)
local csrf_token=$(echo "$tokens" | cut -d'|' -f2)
# Check VM status
local status=$(curl -s -k -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/status/current" | \
python3 -c "import sys, json; print(json.load(sys.stdin).get('data', {}).get('status', 'unknown'))" 2>/dev/null)
echo " Status: $status"
# Check QEMU Guest Agent
local agent_check=$(curl -s -k -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/agent/network-get-interfaces" 2>&1)
if echo "$agent_check" | grep -q "not running"; then
log_issue "QEMU Guest Agent not running - OS may not be installed or agent not installed"
fi
# Check network connectivity
if ping -c 1 -W 2 "$ip" &>/dev/null; then
log_info " Network: ✓ Reachable"
else
log_issue " Network: ✗ Not reachable"
log_warn " Possible causes:"
log_warn " - OS not installed"
log_warn " - Cloud-init not installed"
log_warn " - Network configuration failed"
log_warn " - VM stuck in boot"
fi
# Check SSH
if timeout 3 bash -c "cat < /dev/null > /dev/tcp/$ip/22" 2>/dev/null; then
log_info " SSH: ✓ Port 22 open"
else
log_issue " SSH: ✗ Port 22 closed"
fi
}
main() {
log_info "VM Issue Diagnosis"
echo ""
# Diagnose template
diagnose_template
echo ""
# Diagnose VMs
local vms=(
"100 cloudflare-tunnel 192.168.1.60"
"101 k3s-master 192.168.1.188"
"102 git-server 192.168.1.121"
"103 observability 192.168.1.82"
)
for vm_spec in "${vms[@]}"; do
read -r vmid name ip <<< "$vm_spec"
diagnose_vm "$vmid" "$name" "$ip"
echo ""
done
log_info "Diagnosis complete"
log_warn "If template has no OS, VMs need manual OS installation via Proxmox console"
}
main "$@"


@@ -0,0 +1,127 @@
#!/bin/bash
source ~/.bashrc
# Fix Template from Cloud Image
# Recreates template VM 9000 from Ubuntu cloud image
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_step() {
echo -e "\n${BLUE}=== $1 ===${NC}"
}
PVE_USERNAME="${PVE_USERNAME:-root@pam}"
PVE_PASSWORD="${PVE_ROOT_PASS:-}"
PROXMOX_URL="${PROXMOX_ML110_URL:-https://192.168.1.206:8006}"
PROXMOX_NODE="${PROXMOX_NODE:-pve}"
CLOUD_IMAGE="local:iso/ubuntu-24.04-server-cloudimg-amd64.img"
get_api_token() {
local response=$(curl -s -k --connect-timeout 10 --max-time 15 \
-d "username=$PVE_USERNAME&password=$PVE_PASSWORD" \
"$PROXMOX_URL/api2/json/access/ticket" 2>&1)
if echo "$response" | grep -q '"data"'; then
local ticket=$(echo "$response" | grep -o '"ticket":"[^"]*' | cut -d'"' -f4)
local csrf_token=$(echo "$response" | grep -o '"CSRFPreventionToken":"[^"]*' | cut -d'"' -f4)
echo "$ticket|$csrf_token"
else
echo ""
fi
}
recreate_template_from_cloud_image() {
log_step "Recreating Template from Cloud Image"
log_warn "This will DELETE template VM 9000 and recreate it from cloud image"
log_warn "All VMs cloned from this template will need to be recreated"
echo ""
read -p "Continue? (yes/no): " confirm
if [ "$confirm" != "yes" ]; then
log_info "Cancelled"
return 1
fi
local tokens=$(get_api_token)
local ticket=$(echo "$tokens" | cut -d'|' -f1)
local csrf_token=$(echo "$tokens" | cut -d'|' -f2)
# Stop and delete template
log_info "Stopping template VM 9000..."
curl -s -k -X POST -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/9000/status/stop" > /dev/null 2>&1
sleep 5
log_info "Deleting template VM 9000..."
curl -s -k -X DELETE -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/9000" > /dev/null 2>&1
sleep 3
# Create new VM from cloud image
log_info "Creating new VM 9000 from cloud image..."
# Step 1: Create VM shell
curl -s -k -X POST -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
-d "vmid=9000" \
-d "name=ubuntu-24.04-cloudinit" \
-d "memory=2048" \
-d "cores=2" \
-d "net0=virtio,bridge=vmbr0" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu" > /dev/null 2>&1
sleep 2
# Step 2: Import cloud image disk
log_info "Importing cloud image disk..."
log_warn "This requires SSH access to Proxmox host"
log_info "To complete via SSH:"
echo " ssh root@192.168.1.206"
echo " qm importdisk 9000 /var/lib/vz/template/iso/ubuntu-24.04-server-cloudimg-amd64.img local-lvm"
echo " qm set 9000 --scsihw virtio-scsi-pci --scsi0 local-lvm:vm-9000-disk-0"
echo " qm set 9000 --boot order=scsi0"
echo " qm set 9000 --bios ovmf --efidisk0 local-lvm:1"
echo " qm set 9000 --agent 1"
echo " qm set 9000 --template 1"
log_info "Or use Proxmox Web UI to import disk"
}
main() {
log_info "Fix Template from Cloud Image"
recreate_template_from_cloud_image
}
main "$@"


@@ -0,0 +1,82 @@
#!/bin/bash
source ~/.bashrc
# Fix Template OS Installation
# Guides through installing Ubuntu on template VM 9000
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_step() {
echo -e "\n${BLUE}=== $1 ===${NC}"
}
main() {
log_step "Template OS Installation Guide"
log_warn "Template VM 9000 has only 600M disk - likely no OS installed"
log_info "VMs cloned from this template won't boot properly"
echo ""
log_step "Solution Options"
echo "Option 1: Install Ubuntu via ISO (Recommended)"
echo " 1. Access Proxmox Web UI: https://192.168.1.206:8006"
echo " 2. Go to VM 9000 → Hardware → Add → CD/DVD Drive"
echo " 3. Select Ubuntu 24.04 ISO (upload if needed)"
echo " 4. Set boot order: CD/DVD first"
echo " 5. Start VM 9000 and open console"
echo " 6. Install Ubuntu 24.04"
echo " 7. Install cloud-init: sudo apt install cloud-init"
echo " 8. Install QEMU Guest Agent: sudo apt install qemu-guest-agent"
echo " 9. Enable services: sudo systemctl enable cloud-init qemu-guest-agent"
echo " 10. Convert to template: Right-click VM → Convert to Template"
echo ""
echo "Option 2: Use Ubuntu Cloud Image (Faster)"
echo " 1. Download Ubuntu 24.04 cloud image"
echo " 2. Upload to Proxmox storage"
echo " 3. Create VM from cloud image (see CREATE_VM_9000_STEPS.md)"
echo " 4. Convert to template"
echo ""
log_step "Quick Fix: Expand Template Disk First"
log_info "Template disk is too small. Expanding to 8GB..."
# Resizing requires SSH to the Proxmox host, so print the commands instead of running them
log_warn "To expand template disk (requires SSH to Proxmox host):"
echo " ssh root@192.168.1.206"
echo " qm resize 9000 scsi0 +8G"
echo ""
log_step "After OS Installation"
log_info "Once template has OS installed:"
echo " 1. Recreate VMs from updated template"
echo " 2. VMs will boot with Ubuntu and cloud-init will configure network"
echo ""
log_info "See docs/temporary/CREATE_VM_9000_STEPS.md for detailed instructions"
}
main "$@"


@@ -0,0 +1,97 @@
#!/bin/bash
source ~/.bashrc
# Fix VM Network Issues
# Attempts to fix network configuration issues on VMs
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
PVE_USERNAME="${PVE_USERNAME:-root@pam}"
PVE_PASSWORD="${PVE_ROOT_PASS:-}"
PROXMOX_URL="${PROXMOX_ML110_URL:-https://192.168.1.206:8006}"
PROXMOX_NODE="${PROXMOX_NODE:-pve}"
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
get_api_token() {
local response=$(curl -s -k --connect-timeout 10 --max-time 15 \
-d "username=$PVE_USERNAME&password=$PVE_PASSWORD" \
"$PROXMOX_URL/api2/json/access/ticket" 2>&1)
if echo "$response" | grep -q '"data"'; then
local ticket=$(echo "$response" | grep -o '"ticket":"[^"]*' | cut -d'"' -f4)
local csrf_token=$(echo "$response" | grep -o '"CSRFPreventionToken":"[^"]*' | cut -d'"' -f4)
echo "$ticket|$csrf_token"
else
echo ""
fi
}
restart_vm() {
local vmid=$1
local name=$2
log_info "Restarting VM $vmid ($name) to apply network changes..."
local tokens=$(get_api_token)
local ticket=$(echo "$tokens" | cut -d'|' -f1)
local csrf_token=$(echo "$tokens" | cut -d'|' -f2)
# Reboot VM
curl -s -k -X POST -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/qemu/$vmid/status/reboot" > /dev/null 2>&1
log_info "VM $vmid rebooted"
}
main() {
log_info "Fixing VM Network Issues"
log_warn "This will restart all VMs to apply network configuration"
echo ""
local vms=(
"100 cloudflare-tunnel"
"101 k3s-master"
"102 git-server"
"103 observability"
)
for vm_spec in "${vms[@]}"; do
read -r vmid name <<< "$vm_spec"
restart_vm "$vmid" "$name"
sleep 2
done
log_info "All VMs restarted"
log_warn "Wait 5-10 minutes for VMs to boot and apply cloud-init"
}
main "$@"
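The "wait 5-10 minutes" step above can be automated with a small polling helper. A minimal sketch — the `wait_for` name and the ping example are illustrative, not part of the script above:

```bash
#!/bin/bash
# wait_for CMD TIMEOUT_S [INTERVAL_S] - poll CMD until it succeeds or TIMEOUT_S elapses.
wait_for() {
    local cmd=$1 timeout=$2 interval=${3:-2} elapsed=0
    until eval "$cmd" &>/dev/null; do
        (( elapsed >= timeout )) && return 1
        sleep "$interval"
        elapsed=$(( elapsed + interval ))
    done
}

# Example: wait up to 600s for a rebooted VM to answer ping
#   wait_for "ping -c 1 -W 2 192.168.1.60" 600 10 && echo "VM is back"
wait_for "true" 5 1 && echo "ready"
```

The check command is pluggable, so the same helper works for ping, an SSH probe, or a guest-agent query.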


@@ -0,0 +1,152 @@
#!/bin/bash
source ~/.bashrc
# Recreate Template from Cloud Image
# Recreates template VM 9000 from Ubuntu cloud image via SSH
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_step() {
echo -e "\n${BLUE}=== $1 ===${NC}"
}
PROXMOX_HOST="${PROXMOX_ML110_IP:-192.168.1.206}"
CLOUD_IMAGE="/var/lib/vz/template/iso/ubuntu-24.04-server-cloudimg-amd64.img"
VMID=9000
main() {
log_step "Recreating Template from Cloud Image"
log_info "This will recreate template VM 9000 from Ubuntu cloud image"
log_warn "All VMs cloned from this template will need to be recreated"
echo ""
# Check SSH access
log_info "Checking SSH access to Proxmox host ($PROXMOX_HOST)..."
# Try with SSH key first
SSH_KEY="$HOME/.ssh/id_ed25519_proxmox"
if [ -f "$SSH_KEY" ]; then
SSH_OPTS="-i $SSH_KEY"
else
SSH_OPTS=""
fi
if ! ssh $SSH_OPTS -o StrictHostKeyChecking=no -o ConnectTimeout=5 "root@$PROXMOX_HOST" "echo 'SSH OK'" &> /dev/null; then
log_error "SSH access to $PROXMOX_HOST failed"
log_info "Please ensure:"
log_info " 1. SSH is enabled on Proxmox host"
log_info " 2. Root login is allowed"
log_info " 3. SSH key is set up or password authentication is enabled"
exit 1
fi
log_info "✓ SSH access confirmed"
# Check if cloud image exists
log_info "Checking if cloud image exists..."
if ssh $SSH_OPTS "root@$PROXMOX_HOST" "[ -f $CLOUD_IMAGE ]"; then
log_info "✓ Cloud image found: $CLOUD_IMAGE"
else
log_error "Cloud image not found: $CLOUD_IMAGE"
log_info "Please upload Ubuntu 24.04 cloud image to Proxmox storage first"
exit 1
fi
# Stop and delete existing template
log_step "Step 1: Removing Existing Template"
log_info "Stopping VM $VMID (if running)..."
ssh $SSH_OPTS "root@$PROXMOX_HOST" "qm stop $VMID" 2>/dev/null || true
sleep 3
log_info "Deleting VM $VMID..."
ssh $SSH_OPTS "root@$PROXMOX_HOST" "qm destroy $VMID --purge" 2>/dev/null || true
sleep 3
# Create new VM shell
log_step "Step 2: Creating New VM Shell"
log_info "Creating VM $VMID..."
ssh $SSH_OPTS "root@$PROXMOX_HOST" "qm create $VMID \
--name ubuntu-24.04-cloudinit \
--memory 2048 \
--cores 2 \
--net0 virtio,bridge=vmbr0"
# Import cloud image
log_step "Step 3: Importing Cloud Image"
log_info "Importing cloud image (this may take a few minutes)..."
ssh $SSH_OPTS "root@$PROXMOX_HOST" "qm importdisk $VMID $CLOUD_IMAGE local-lvm"
# Attach disk
log_step "Step 4: Attaching Disk"
log_info "Attaching imported disk..."
ssh $SSH_OPTS "root@$PROXMOX_HOST" "qm set $VMID \
--scsihw virtio-scsi-pci \
--scsi0 local-lvm:vm-${VMID}-disk-0"
# Configure boot
log_step "Step 5: Configuring Boot"
log_info "Setting boot order..."
ssh $SSH_OPTS "root@$PROXMOX_HOST" "qm set $VMID --boot order=scsi0"
# Configure UEFI
log_step "Step 6: Configuring UEFI"
log_info "Enabling UEFI/OVMF..."
ssh $SSH_OPTS "root@$PROXMOX_HOST" "qm set $VMID --bios ovmf --efidisk0 local-lvm:1"
# Enable QEMU Guest Agent
log_step "Step 7: Enabling QEMU Guest Agent"
log_info "Enabling agent..."
ssh $SSH_OPTS "root@$PROXMOX_HOST" "qm set $VMID --agent 1"
# Configure cloud-init
log_step "Step 8: Configuring Cloud-Init"
log_info "Setting up cloud-init..."
ssh $SSH_OPTS "root@$PROXMOX_HOST" "qm set $VMID --ide2 local-lvm:cloudinit"
ssh $SSH_OPTS "root@$PROXMOX_HOST" "qm set $VMID --serial0 socket --vga serial0"
# Convert to template
log_step "Step 9: Converting to Template"
log_info "Converting VM to template..."
ssh $SSH_OPTS "root@$PROXMOX_HOST" "qm template $VMID"
log_step "Template Recreation Complete!"
log_info "✓ Template VM 9000 recreated from Ubuntu cloud image"
log_info "✓ Cloud-init is pre-installed in the image"
log_info "✓ QEMU Guest Agent enabled"
log_info ""
log_info "Next steps:"
log_info " 1. Recreate VMs: ./scripts/deploy/recreate-vms-smaller-disks.sh --yes"
log_info " 2. Verify VM boot and network connectivity"
}
main "$@"
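To sanity-check the result, the `qm config` output can be parsed key by key. A hedged sketch — `config_get` and the sample output are illustrative; the real check would be `ssh root@$PROXMOX_HOST "qm config 9000"`:

```bash
#!/bin/bash
# config_get KEY - read 'key: value' lines on stdin and print the value for KEY.
config_get() {
    awk -F': ' -v k="$1" '$1 == k { print $2 }'
}

# Illustrative sample of what 'qm config 9000' prints after the steps above
sample='agent: 1
boot: order=scsi0
template: 1'

[ "$(echo "$sample" | config_get agent)" = "1" ]    && echo "guest agent enabled"
[ "$(echo "$sample" | config_get template)" = "1" ] && echo "converted to template"
```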


@@ -0,0 +1,276 @@
#!/bin/bash
source ~/.bashrc
# Test All Access Paths
# Comprehensive test of all access methods to infrastructure
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_test() {
echo -e "${BLUE}[TEST]${NC} $1"
}
ML110_IP="192.168.1.206"
R630_IP="192.168.1.49"
SSH_KEY="$HOME/.ssh/id_ed25519_proxmox"
VM_IPS=("192.168.1.60" "192.168.1.188" "192.168.1.121" "192.168.1.82")
VM_NAMES=("cloudflare-tunnel" "k3s-master" "git-server" "observability")
test_proxmox_web_ui() {
local host=$1
local name=$2
log_test "Testing $name Web UI (https://$host:8006)..."
local status=$(curl -k -s -o /dev/null -w "%{http_code}" --connect-timeout 5 "https://$host:8006" 2>/dev/null)
if [ "$status" = "200" ] || [ "$status" = "401" ] || [ "$status" = "403" ]; then
echo -e " ${GREEN}${NC} Web UI accessible (HTTP $status)"
return 0
else
echo -e " ${RED}${NC} Web UI not accessible (HTTP $status)"
return 1
fi
}
test_proxmox_ssh() {
local host=$1
local name=$2
log_test "Testing $name SSH access..."
if [ ! -f "$SSH_KEY" ]; then
echo -e " ${YELLOW}${NC} SSH key not found: $SSH_KEY"
return 1
fi
if ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no -o ConnectTimeout=5 "root@$host" "echo 'SSH OK'" &>/dev/null; then
echo -e " ${GREEN}${NC} SSH access working"
return 0
else
echo -e " ${RED}${NC} SSH access failed"
return 1
fi
}
test_proxmox_api() {
local host=$1
local name=$2
log_test "Testing $name API access..."
if [ -z "${PVE_ROOT_PASS:-}" ]; then
echo -e " ${YELLOW}${NC} PVE_ROOT_PASS not set"
return 1
fi
local response=$(curl -s -k --connect-timeout 5 --max-time 10 \
-d "username=root@pam&password=$PVE_ROOT_PASS" \
"https://$host:8006/api2/json/access/ticket" 2>&1)
if echo "$response" | grep -q '"data"'; then
echo -e " ${GREEN}${NC} API access working"
return 0
else
echo -e " ${RED}${NC} API access failed"
return 1
fi
}
test_vm_network() {
local ip=$1
local name=$2
log_test "Testing $name network access ($ip)..."
if ping -c 1 -W 2 "$ip" &>/dev/null; then
echo -e " ${GREEN}${NC} Ping successful"
else
echo -e " ${RED}${NC} Ping failed"
return 1
fi
if timeout 2 bash -c "cat < /dev/null > /dev/tcp/$ip/22" 2>/dev/null; then
echo -e " ${GREEN}${NC} SSH port 22 open"
else
echo -e " ${YELLOW}${NC} SSH port 22 closed or filtered"
fi
return 0
}
test_vm_ssh() {
local ip=$1
local name=$2
log_test "Testing $name SSH access..."
if [ ! -f "$SSH_KEY" ]; then
echo -e " ${YELLOW}${NC} SSH key not found"
return 1
fi
if ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no -o ConnectTimeout=5 "ubuntu@$ip" "hostname" &>/dev/null; then
echo -e " ${GREEN}${NC} SSH access working"
return 0
else
echo -e " ${RED}${NC} SSH access failed (authentication)"
return 1
fi
}
test_qemu_guest_agent() {
local vmid=$1
local name=$2
log_test "Testing $name QEMU Guest Agent (VM $vmid)..."
if [ ! -f "$SSH_KEY" ]; then
echo -e " ${YELLOW}${NC} Cannot test (SSH key needed)"
return 1
fi
local result=$(ssh -i "$SSH_KEY" -o ConnectTimeout=5 "root@$ML110_IP" \
"qm guest exec $vmid -- echo 'test' 2>&1" 2>/dev/null)
if echo "$result" | grep -q "test"; then
echo -e " ${GREEN}${NC} Guest Agent working"
return 0
elif echo "$result" | grep -q "not running"; then
echo -e " ${YELLOW}${NC} Guest Agent not running (needs installation)"
return 1
else
echo -e " ${RED}${NC} Guest Agent not accessible"
return 1
fi
}
test_service_ports() {
local ip=$1
local name=$2
local ports=()
case "$name" in
cloudflare-tunnel)
ports=(22)
;;
k3s-master)
ports=(22 6443 10250)
;;
git-server)
ports=(22 3000 2222)
;;
observability)
ports=(22 3000 9090)
;;
esac
log_test "Testing $name service ports..."
for port in "${ports[@]}"; do
if timeout 2 bash -c "cat < /dev/null > /dev/tcp/$ip/$port" 2>/dev/null; then
echo -e " ${GREEN}${NC} Port $port open"
else
echo -e " ${YELLOW}${NC} Port $port closed (service may not be running)"
fi
done
}
main() {
echo "========================================="
echo "Access Paths Test - Complete Map"
echo "========================================="
echo ""
# Test Proxmox Hosts
log_info "Testing Proxmox Hosts"
echo ""
echo "ML110 (192.168.1.206):"
test_proxmox_web_ui "$ML110_IP" "ML110"
test_proxmox_ssh "$ML110_IP" "ML110"
test_proxmox_api "$ML110_IP" "ML110"
echo ""
echo "R630 (192.168.1.49):"
test_proxmox_web_ui "$R630_IP" "R630"
test_proxmox_ssh "$R630_IP" "R630"
test_proxmox_api "$R630_IP" "R630"
echo ""
echo "----------------------------------------"
echo ""
# Test VMs
log_info "Testing Virtual Machines"
echo ""
for i in "${!VM_IPS[@]}"; do
local ip="${VM_IPS[$i]}"
local name="${VM_NAMES[$i]}"
local vmid=$((100 + i))
echo "$name ($ip):"
test_vm_network "$ip" "$name"
test_vm_ssh "$ip" "$name"
test_qemu_guest_agent "$vmid" "$name"
test_service_ports "$ip" "$name"
echo ""
done
echo "========================================="
echo "Access Paths Summary"
echo "========================================="
echo ""
log_warn "Summary below reflects the last manual audit - compare with this run's results above:"
log_info "Working Access Methods:"
echo " ✅ Proxmox ML110: Web UI, SSH, API"
echo " ✅ Proxmox R630: Web UI, API (SSH pending)"
echo " ✅ All VMs: Network reachable, Port 22 open"
echo " ✅ All VMs: Console access via Proxmox Web UI"
echo ""
log_warn "Not Working:"
echo " ❌ SSH to VMs (authentication failing)"
echo " ❌ QEMU Guest Agent (not installed in VMs)"
echo " ❌ SSH to R630 (authentication failing)"
echo ""
log_info "Alternative Access Methods:"
echo " 🔧 Use Proxmox Console for VM access"
echo " 🔧 Use Proxmox API for automation"
echo " 🔧 Install QEMU Guest Agent in VMs"
echo " 🔧 Fix SSH keys via console"
echo ""
log_info "See: docs/troubleshooting/ACCESS_PATHS_MAP.md"
}
main "$@"
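The hard-coded `VM_IPS` array above is exactly what the guest-agent migration removes. A sketch of extracting the first non-loopback IPv4 from `qm guest cmd <vmid> network-get-interfaces` output — the sample JSON is illustrative, but follows the guest agent's interface/address schema:

```bash
#!/bin/bash
# extract_first_ipv4 - read QGA network-get-interfaces JSON on stdin,
# print the first non-loopback IPv4 address, exit 1 if none found.
extract_first_ipv4() {
    python3 -c '
import sys, json
for iface in json.load(sys.stdin):
    if iface.get("name") == "lo":
        continue
    for addr in iface.get("ip-addresses", []):
        if addr.get("ip-address-type") == "ipv4":
            print(addr["ip-address"])
            sys.exit(0)
sys.exit(1)'
}

# Real usage would be:
#   ssh root@$ML110_IP "qm guest cmd 100 network-get-interfaces" | extract_first_ipv4
sample='[{"name":"lo","ip-addresses":[{"ip-address-type":"ipv4","ip-address":"127.0.0.1"}]},
        {"name":"eth0","ip-addresses":[{"ip-address-type":"ipv4","ip-address":"192.168.1.60"}]}]'
echo "$sample" | extract_first_ipv4
# prints 192.168.1.60
```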


@@ -0,0 +1,70 @@
#!/bin/bash
source ~/.bashrc
# Upload Ubuntu ISO to Proxmox Storage
# Downloads and uploads Ubuntu 24.04 ISO to Proxmox
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
ML110_IP="192.168.1.206"
UBUNTU_ISO_URL="https://releases.ubuntu.com/24.04/ubuntu-24.04-live-server-amd64.iso"
ISO_NAME="ubuntu-24.04-server-amd64.iso"
main() {
log_info "Ubuntu ISO Upload Guide"
log_warn "This requires SSH access to Proxmox host"
echo ""
log_info "Option 1: Download and Upload via SSH"
echo " # Download ISO locally"
echo " wget $UBUNTU_ISO_URL -O $ISO_NAME"
echo ""
echo " # Upload to Proxmox"
echo " scp $ISO_NAME root@$ML110_IP:/var/lib/vz/template/iso/"
echo ""
echo " # Or use Proxmox Web UI:"
echo " # Datacenter → local → Content → Upload"
echo ""
log_info "Option 2: Download Directly on Proxmox Host"
echo " ssh root@$ML110_IP"
echo " cd /var/lib/vz/template/iso"
echo " wget $UBUNTU_ISO_URL -O $ISO_NAME"
echo ""
log_info "After Upload:"
echo " - ISO will appear in Proxmox storage"
echo " - Can attach to VM 9000 via Web UI or API"
echo " - Then install Ubuntu"
}
main "$@"
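After downloading, verify the ISO against Ubuntu's published SHA256SUMS before uploading. A minimal sketch — the SHA256SUMS location on the release page is stated here as an assumption, and the demo hashes a throwaway file:

```bash
#!/bin/bash
# verify_sha256 FILE EXPECTED_HASH - succeed only if FILE matches the hash.
verify_sha256() {
    echo "$2  $1" | sha256sum -c --quiet -
}

# Real usage (hash taken from https://releases.ubuntu.com/24.04/SHA256SUMS):
#   verify_sha256 ubuntu-24.04-server-amd64.iso "<hash from SHA256SUMS>"
tmp=$(mktemp)
printf 'hello\n' > "$tmp"
verify_sha256 "$tmp" "5891b5b522d5df086d0ff0b110fbd9d21bb4fc7163af34d08286a2e846f6be03" \
    && echo "checksum OK"
rm -f "$tmp"
```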


@@ -0,0 +1,125 @@
#!/bin/bash
source ~/.bashrc
# Verify and Fix VM IP Addresses
# Checks if VM IPs are in correct subnet and updates if needed
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
PVE_USERNAME="${PVE_USERNAME:-root@pam}"
PVE_PASSWORD="${PVE_ROOT_PASS:-}"
PROXMOX_URL="${PROXMOX_ML110_URL:-https://192.168.1.206:8006}"
PROXMOX_NODE="${PROXMOX_NODE:-pve}"
get_api_token() {
local response=$(curl -s -k --connect-timeout 10 --max-time 15 \
-d "username=$PVE_USERNAME&password=$PVE_PASSWORD" \
"$PROXMOX_URL/api2/json/access/ticket" 2>&1)
if echo "$response" | grep -q '"data"'; then
local ticket=$(echo "$response" | grep -o '"ticket":"[^"]*' | cut -d'"' -f4)
local csrf_token=$(echo "$response" | grep -o '"CSRFPreventionToken":"[^"]*' | cut -d'"' -f4)
echo "$ticket|$csrf_token"
else
echo ""
fi
}
verify_network() {
log_info "Verifying Network Configuration"
# Get Proxmox host IP from URL
local proxmox_ip=$(echo "$PROXMOX_URL" | sed -E 's|https?://([^:]+).*|\1|')
if [ -z "$proxmox_ip" ]; then
log_error "Could not determine Proxmox host IP"
return 1
fi
log_info "Proxmox host IP: $proxmox_ip"
# Extract subnet (assume /24)
local subnet=$(echo "$proxmox_ip" | cut -d'.' -f1-3)
log_info "Network subnet: $subnet.0/24"
# VM IPs
local vms=(
"100 192.168.1.60 cloudflare-tunnel"
"101 192.168.1.188 k3s-master"
"102 192.168.1.121 git-server"
"103 192.168.1.82 observability"
)
log_info "Checking VM IP addresses..."
local all_valid=true
for vm_spec in "${vms[@]}"; do
read -r vmid vm_ip name <<< "$vm_spec"
local vm_subnet=$(echo "$vm_ip" | cut -d'.' -f1-3)
if [ "$vm_subnet" = "$subnet" ]; then
log_info "✓ VM $vmid ($name): $vm_ip - in correct subnet"
else
log_warn "✗ VM $vmid ($name): $vm_ip - subnet mismatch!"
log_warn " Expected subnet: $subnet.0/24"
log_warn " VM subnet: $vm_subnet.0/24"
all_valid=false
fi
done
if [ "$all_valid" = true ]; then
log_info "✓ All VM IPs are in the correct subnet"
log_warn "Note: Ensure these IPs are outside DHCP range"
log_warn "Note: Gateway 192.168.1.254 must be correct for your network"
return 0
else
log_warn "Some VM IPs need adjustment"
return 1
fi
}
main() {
verify_network
log_info ""
log_info "Network Configuration Summary:"
log_info " - Proxmox host: Uses DHCP (currently $PROXMOX_URL)"
log_info "  - VM IPs: Static (192.168.1.60/188/121/82)"
log_info " - Gateway: 192.168.1.254"
log_info ""
log_warn "Important:"
log_warn " 1. Ensure VM IPs are outside DHCP range"
log_warn " 2. Verify gateway 192.168.1.254 is correct"
log_warn " 3. If Proxmox host IP changes, update .env file"
}
main "$@"
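The /24 assumption above can be generalized to any prefix length with plain bash arithmetic. A sketch — `ip_to_int` and `same_subnet` are illustrative helpers, not part of the script:

```bash
#!/bin/bash
# ip_to_int IP - convert dotted-quad to a 32-bit integer.
ip_to_int() {
    local IFS=. a b c d
    read -r a b c d <<< "$1"
    echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# same_subnet IP1 IP2 PREFIX - succeed if both IPs share the PREFIX-bit network.
same_subnet() {
    local mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
    (( ($(ip_to_int "$1") & mask) == ($(ip_to_int "$2") & mask) ))
}

same_subnet 192.168.1.206 192.168.1.60 24 && echo "same /24"
same_subnet 192.168.1.206 192.168.2.60 23 || echo "different /23"
```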


@@ -0,0 +1,28 @@
#!/bin/bash
source ~/.bashrc
# Automatically add 'source ~/.bashrc' after shebang to all .sh scripts in subdirs
# Usage: ./auto-prep-new-scripts.sh [--watch]
SCRIPTS_ROOT="/home/intlc/projects/loc_az_hci/scripts"
add_bashrc_source() {
local file="$1"
# Only patch bash scripts (shebang on line 1) that aren't already patched
if head -1 "$file" | grep -q '^#!/bin/bash' && ! grep -q '^source ~/.bashrc' "$file"; then
awk 'NR==1{print; print "source ~/.bashrc"; next}1' "$file" > "$file.tmp" && mv "$file.tmp" "$file"
echo "Patched: $file"
fi
}
find "$SCRIPTS_ROOT" -type f -name '*.sh' | while read -r script; do
add_bashrc_source "$script"
done
if [[ "$1" == "--watch" ]]; then
command -v inotifywait >/dev/null || { echo "inotifywait not found (install inotify-tools)"; exit 1; }
echo "Watching for changes to .sh scripts..."
while true; do
# inotifywait exits after one event; re-scan only when a .sh file changed
inotifywait -e create -e modify -e move --format '%w%f' -r "$SCRIPTS_ROOT" | grep -qE '\.sh$' || continue
find "$SCRIPTS_ROOT" -type f -name '*.sh' | while read -r script; do
add_bashrc_source "$script"
done
done
fi
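The patch is idempotent, which a throwaway file demonstrates. The function is duplicated here so the snippet is self-contained:

```bash
#!/bin/bash
# Same patch logic as the script above, duplicated for a self-contained demo.
add_bashrc_source() {
    local file="$1"
    if grep -q "^#!/bin/bash" "$file" && ! grep -q "^source ~/.bashrc" "$file"; then
        awk 'NR==1{print; print "source ~/.bashrc"; next}1' "$file" > "$file.tmp" && mv "$file.tmp" "$file"
        echo "Patched: $file"
    fi
}

tmp=$(mktemp)
printf '#!/bin/bash\necho hi\n' > "$tmp"
add_bashrc_source "$tmp"   # patches once
add_bashrc_source "$tmp"   # no-op: marker already present
grep -c '^source ~/.bashrc' "$tmp"   # prints 1
rm -f "$tmp"
```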


@@ -0,0 +1,167 @@
#!/bin/bash
source ~/.bashrc
# Enable SSH via Proxmox API
# Attempts to enable SSH service and configure root login via API
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
PVE_USERNAME="${PVE_USERNAME:-root@pam}"
PVE_PASSWORD="${PVE_ROOT_PASS:-}"
PROXMOX_URL="${PROXMOX_ML110_URL:-https://192.168.1.206:8006}"
PROXMOX_NODE="${PROXMOX_NODE:-pve}"
get_api_token() {
local response=$(curl -s -k --connect-timeout 10 --max-time 15 \
-d "username=$PVE_USERNAME&password=$PVE_PASSWORD" \
"$PROXMOX_URL/api2/json/access/ticket" 2>&1)
if echo "$response" | grep -q '"data"'; then
local ticket=$(echo "$response" | grep -o '"ticket":"[^"]*' | cut -d'"' -f4)
local csrf_token=$(echo "$response" | grep -o '"CSRFPreventionToken":"[^"]*' | cut -d'"' -f4)
echo "$ticket|$csrf_token"
else
echo ""
fi
}
check_ssh_service() {
local tokens=$(get_api_token)
local ticket=$(echo "$tokens" | cut -d'|' -f1)
local csrf_token=$(echo "$tokens" | cut -d'|' -f2)
log_info "Checking SSH service status..."
local services=$(curl -s -k -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/services" 2>&1)
if echo "$services" | grep -q '"data"'; then
local ssh_status=$(echo "$services" | python3 -c "
import sys, json
r = json.load(sys.stdin)
services = r.get('data', [])
ssh = [s for s in services if 'ssh' in s.get('name', '').lower()]
if ssh:
s = ssh[0]
print(f\"{s.get('name', 'N/A')}|{s.get('state', 'N/A')}|{s.get('enabled', 'N/A')}\")
" 2>/dev/null)
if [ -n "$ssh_status" ]; then
local name=$(echo "$ssh_status" | cut -d'|' -f1)
local state=$(echo "$ssh_status" | cut -d'|' -f2)
local enabled=$(echo "$ssh_status" | cut -d'|' -f3)
echo " Service: $name"
echo " State: $state"
echo " Enabled: $enabled"
if [ "$state" = "running" ] && [ "$enabled" = "1" ]; then
log_info "✓ SSH service is running and enabled"
return 0
else
log_warn "SSH service needs to be started/enabled"
return 1
fi
else
log_warn "SSH service not found in services list"
return 1
fi
else
log_error "Could not query services via API"
return 1
fi
}
enable_ssh_service() {
local tokens=$(get_api_token)
local ticket=$(echo "$tokens" | cut -d'|' -f1)
local csrf_token=$(echo "$tokens" | cut -d'|' -f2)
log_info "Attempting to enable SSH service via API..."
# Try to start SSH service
local start_result=$(curl -s -k -X POST -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$PROXMOX_URL/api2/json/nodes/$PROXMOX_NODE/services/ssh/start" 2>&1)
if echo "$start_result" | grep -q '"data"'; then
log_info "✓ SSH service started"
else
log_warn "Could not start SSH via API: $start_result"
fi
# Note: the PVE service API exposes start/stop/restart but no persistent
# "enable" action - enabling SSH at boot requires systemctl on the host
log_warn "Enabling SSH at boot is not exposed via the API; run 'systemctl enable ssh' on the host"
}
main() {
echo "========================================="
echo "Enable SSH via Proxmox API"
echo "========================================="
echo ""
log_warn "Note: SSH configuration changes typically require shell access"
log_warn "This script will attempt to enable SSH service, but root login"
log_warn "configuration may need to be done via Web UI or console"
echo ""
# Check current status
check_ssh_service
echo ""
# Try to enable
enable_ssh_service
echo ""
log_info "Summary:"
log_warn "SSH service management via API is limited"
log_info "Recommended: Enable SSH via Proxmox Web UI:"
log_info " 1. Node → System → Services → ssh → Start & Enable"
log_info " 2. Node → System → Shell → Enable root login"
log_info ""
log_info "Or use console/physical access to run:"
log_info " systemctl enable ssh && systemctl start ssh"
log_info " sed -i 's/#PermitRootLogin.*/PermitRootLogin yes/' /etc/ssh/sshd_config"
log_info " systemctl restart sshd"
}
main "$@"
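The grep/cut ticket extraction used throughout these scripts breaks if the JSON field order changes; python3 (already a dependency here) parses it robustly. A sketch with an illustrative response body:

```bash
#!/bin/bash
# parse_ticket - read a Proxmox /access/ticket JSON response on stdin,
# print "ticket|csrf" (same shape the get_api_token helpers return).
parse_ticket() {
    python3 -c 'import sys, json; d = json.load(sys.stdin)["data"]; print(d["ticket"] + "|" + d["CSRFPreventionToken"])'
}

# Illustrative response (real tickets are much longer)
echo '{"data":{"ticket":"PVE:root@pam:SAMPLE","CSRFPreventionToken":"4F:tok"}}' | parse_ticket
# prints PVE:root@pam:SAMPLE|4F:tok
```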


@@ -0,0 +1,188 @@
#!/bin/bash
source ~/.bashrc
# Prerequisites Check Script
# Validates system requirements before deployment
set -e
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'
CHECK_TYPE="${1:-all}"
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
check_pass() {
echo -e "${GREEN}${NC} $1"
}
check_fail() {
echo -e "${RED}✗${NC} $1"
return 1
}
check_warn() {
echo -e "${YELLOW}⚠${NC} $1"
}
check_proxmox() {
log_info "Checking Proxmox VE installation..."
if command -v pvecm &> /dev/null && command -v pvesm &> /dev/null; then
check_pass "Proxmox VE tools installed"
pveversion | head -1
else
check_fail "Proxmox VE tools not found"
return 1
fi
}
check_network() {
log_info "Checking network configuration..."
if ip link show vmbr0 &>/dev/null; then
check_pass "Bridge vmbr0 exists"
ip addr show vmbr0 | grep "inet " || check_warn "vmbr0 has no IP address"
else
check_warn "Bridge vmbr0 not found (may need network configuration)"
fi
}
check_azure_cli() {
log_info "Checking Azure CLI installation..."
if command -v az &> /dev/null; then
check_pass "Azure CLI installed"
az --version | head -1
# Check if logged in
if az account show &>/dev/null; then
check_pass "Azure CLI authenticated"
az account show --query "{subscriptionId:id, tenantId:tenantId}" -o table
else
check_warn "Azure CLI not authenticated (run 'az login')"
fi
else
check_warn "Azure CLI not installed (required for Azure Arc onboarding)"
fi
}
check_kubectl() {
log_info "Checking kubectl installation..."
if command -v kubectl &> /dev/null; then
check_pass "kubectl installed"
# --short was removed in newer kubectl; plain --client output is fine here
kubectl version --client 2>/dev/null | head -1
else
check_warn "kubectl not installed (required for Kubernetes management)"
fi
}
check_helm() {
log_info "Checking Helm installation..."
if command -v helm &> /dev/null; then
check_pass "Helm installed"
helm version --short
else
check_warn "Helm not installed (required for GitOps deployments)"
fi
}
check_docker() {
log_info "Checking Docker installation..."
if command -v docker &> /dev/null; then
check_pass "Docker installed"
docker --version
if docker ps &>/dev/null; then
check_pass "Docker daemon running"
else
check_warn "Docker daemon not running"
fi
else
check_warn "Docker not installed (required for Git/GitLab deployment)"
fi
}
check_terraform() {
log_info "Checking Terraform installation..."
if command -v terraform &> /dev/null; then
check_pass "Terraform installed"
terraform version | head -1
else
check_warn "Terraform not installed (optional, for IaC)"
fi
}
check_system_resources() {
log_info "Checking system resources..."
# Check memory
TOTAL_MEM=$(free -g | awk '/^Mem:/{print $2}')
if [ "$TOTAL_MEM" -ge 8 ]; then
check_pass "Memory: ${TOTAL_MEM}GB (minimum 8GB recommended)"
else
check_warn "Memory: ${TOTAL_MEM}GB (8GB+ recommended)"
fi
# Check disk space
DISK_SPACE=$(df -h / | awk 'NR==2 {print $4}')
check_info "Available disk space: $DISK_SPACE"
}
check_info() {
echo -e "${GREEN}${NC} $1"
}
main() {
log_info "Running prerequisites check: $CHECK_TYPE"
case "$CHECK_TYPE" in
proxmox)
check_proxmox
check_network
;;
azure)
check_azure_cli
;;
kubernetes)
check_kubectl
check_helm
;;
git)
check_docker
;;
all)
check_proxmox
check_network
check_azure_cli
check_kubectl
check_helm
check_docker
check_terraform
check_system_resources
;;
*)
log_error "Unknown check type: $CHECK_TYPE"
log_info "Available types: proxmox, azure, kubernetes, git, all"
exit 1
;;
esac
log_info "Prerequisites check completed"
}
main "$@"


@@ -0,0 +1 @@
inotify-tools

scripts/utils/setup-ssh-keys.sh Executable file

@@ -0,0 +1,96 @@
#!/bin/bash
source ~/.bashrc
# Setup SSH Keys for Proxmox Access
# Generates SSH key and provides instructions for adding to Proxmox hosts
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
SSH_KEY_NAME="id_ed25519_proxmox"
SSH_KEY_PATH="$HOME/.ssh/$SSH_KEY_NAME"
PUBLIC_KEY_PATH="$SSH_KEY_PATH.pub"
generate_key() {
if [ -f "$SSH_KEY_PATH" ]; then
log_info "SSH key already exists: $SSH_KEY_PATH"
return 0
fi
log_info "Generating SSH key..."
ssh-keygen -t ed25519 -f "$SSH_KEY_PATH" -N "" -C "proxmox-access"
log_info "✓ SSH key generated: $SSH_KEY_PATH"
}
display_public_key() {
if [ -f "$PUBLIC_KEY_PATH" ]; then
log_info "Your public SSH key:"
echo ""
cat "$PUBLIC_KEY_PATH"
echo ""
log_info "Copy this key and add it to Proxmox hosts"
else
log_error "Public key not found: $PUBLIC_KEY_PATH"
return 1
fi
}
show_instructions() {
log_info "To add SSH key to Proxmox hosts:"
echo ""
echo "Option 1: Via Proxmox Web UI Shell"
echo " 1. Access Proxmox Web UI"
echo " 2. Node → System → Shell"
echo " 3. Run:"
echo " mkdir -p ~/.ssh"
echo " chmod 700 ~/.ssh"
echo " echo '$(cat $PUBLIC_KEY_PATH)' >> ~/.ssh/authorized_keys"
echo " chmod 600 ~/.ssh/authorized_keys"
echo ""
echo "Option 2: Copy public key to clipboard"
echo " Run: cat $PUBLIC_KEY_PATH | xclip -selection clipboard"
echo " Then paste into Proxmox shell"
echo ""
echo "Option 3: Use ssh-copy-id (if password auth works)"
echo " ssh-copy-id -i $PUBLIC_KEY_PATH root@192.168.1.206"
echo " ssh-copy-id -i $PUBLIC_KEY_PATH root@192.168.1.49"
}
main() {
echo "========================================="
echo "SSH Key Setup for Proxmox Access"
echo "========================================="
echo ""
generate_key
echo ""
display_public_key
echo ""
show_instructions
echo ""
log_info "After adding the key to Proxmox hosts, test with:"
log_info " ssh -i $SSH_KEY_PATH root@192.168.1.206 'hostname'"
}
main "$@"
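Before distributing the key, its fingerprint can be confirmed with `ssh-keygen -lf` and read back to whoever adds it on the Proxmox side. A sketch on a throwaway key:

```bash
#!/bin/bash
# Generate a throwaway ed25519 key and show how to read its fingerprint.
tmpdir=$(mktemp -d)
ssh-keygen -t ed25519 -f "$tmpdir/key" -N "" -C "demo" -q

# Fingerprint line looks like: 256 SHA256:... demo (ED25519)
ssh-keygen -lf "$tmpdir/key.pub"

rm -rf "$tmpdir"
```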


@@ -0,0 +1,235 @@
#!/bin/bash
source ~/.bashrc
# Test Cloudflare API Connection Script
# Tests connectivity and authentication to Cloudflare using .env credentials
set -e
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
# Load environment variables from .env if it exists
if [ -f .env ]; then
set -a
source <(grep -v '^#' .env | grep -v '^$' | sed 's/#.*$//' | sed 's/^[[:space:]]*//;s/[[:space:]]*$//' | grep '=')
set +a
fi
# Cloudflare configuration (support multiple variable names)
CLOUDFLARE_API_TOKEN="${CLOUDFLARE_API_TOKEN:-${CLOUDFLARE_API_KEY:-}}"
CLOUDFLARE_TUNNEL_TOKEN="${CLOUDFLARE_TUNNEL_TOKEN:-}"
CLOUDFLARE_ACCOUNT_EMAIL="${CLOUDFLARE_ACCOUNT_EMAIL:-}"
CLOUDFLARE_ACCOUNT_ID="${CLOUDFLARE_ACCOUNT_ID:-}"
CLOUDFLARE_ZONE_ID="${CLOUDFLARE_ZONE_ID:-}"
CLOUDFLARE_DOMAIN="${CLOUDFLARE_DOMAIN:-}"
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_test() {
echo -e "${BLUE}[TEST]${NC} $1"
}
test_cloudflare_api() {
log_test "Testing Cloudflare API connection..."
if [ -z "$CLOUDFLARE_API_TOKEN" ]; then
log_error "CLOUDFLARE_API_TOKEN not set (check .env file)"
return 1
fi
# Test API token authentication
log_test " Testing API token authentication..."
local api_response=$(curl -s -X GET "https://api.cloudflare.com/client/v4/user/tokens/verify" \
-H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
-H "Content-Type: application/json" 2>&1)
if echo "$api_response" | grep -q '"success":true'; then
echo -e " ${GREEN}${NC} API token authentication successful"
# /user/tokens/verify returns the token's own metadata; the "id" field is the
# token ID, used below only as a fallback when CLOUDFLARE_ACCOUNT_ID is unset
local account_id=$(echo "$api_response" | grep -o '"id":"[^"]*' | head -1 | cut -d'"' -f4)
local status=$(echo "$api_response" | grep -o '"status":"[^"]*' | cut -d'"' -f4)
echo "  Token ID: $account_id"
echo "  Status: $status"
# Test account information retrieval
log_test " Testing account information retrieval..."
local account_response=$(curl -s -X GET "https://api.cloudflare.com/client/v4/accounts" \
-H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
-H "Content-Type: application/json" 2>&1)
if echo "$account_response" | grep -q '"success":true'; then
echo -e " ${GREEN}${NC} Account information retrieved"
local account_count=$(echo "$account_response" | grep -o '"id":"[^"]*' | wc -l)
echo " Accounts found: $account_count"
else
echo -e " ${YELLOW}${NC} Could not retrieve account information"
fi
# Test Zero Trust API (needs a real account ID; prefer the one from .env)
log_test "  Testing Zero Trust API access..."
local zt_account="${CLOUDFLARE_ACCOUNT_ID:-$account_id}"
local zero_trust_response=$(curl -s -X GET "https://api.cloudflare.com/client/v4/accounts/$zt_account/gateway/locations" \
-H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
-H "Content-Type: application/json" 2>&1)
if echo "$zero_trust_response" | grep -q '"success":true'; then
echo -e " ${GREEN}${NC} Zero Trust API accessible"
elif echo "$zero_trust_response" | grep -q '"errors"'; then
local error_code=$(echo "$zero_trust_response" | grep -o '"code":[0-9]*' | head -1 | cut -d':' -f2)
if [ "$error_code" = "10004" ]; then
echo -e " ${YELLOW}${NC} Zero Trust not enabled (error 10004)"
log_info " Enable Zero Trust in Cloudflare Dashboard to use Tunnel features"
else
echo -e " ${YELLOW}${NC} Zero Trust API error (code: $error_code)"
fi
else
echo -e " ${YELLOW}${NC} Zero Trust API test inconclusive"
fi
# Test Tunnel API (if Zero Trust enabled)
if [ -n "$CLOUDFLARE_ACCOUNT_ID" ]; then
local account_id_for_tunnel="$CLOUDFLARE_ACCOUNT_ID"
else
local account_id_for_tunnel="$account_id"
fi
log_test " Testing Tunnel API access..."
local tunnel_response=$(curl -s -X GET "https://api.cloudflare.com/client/v4/accounts/$account_id_for_tunnel/cfd_tunnel" \
-H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
-H "Content-Type: application/json" 2>&1)
if echo "$tunnel_response" | grep -q '"success":true'; then
echo -e " ${GREEN}✓${NC} Tunnel API accessible"
local tunnel_count=$(echo "$tunnel_response" | grep -o '"id":"[^"]*' | wc -l)
echo " Existing tunnels: $tunnel_count"
elif echo "$tunnel_response" | grep -q '"errors"'; then
local error_code=$(echo "$tunnel_response" | grep -o '"code":[0-9]*' | head -1 | cut -d':' -f2)
if [ "$error_code" = "10004" ]; then
echo -e " ${YELLOW}⚠${NC} Zero Trust required for Tunnel API"
else
echo -e " ${YELLOW}⚠${NC} Tunnel API error (code: $error_code)"
fi
else
echo -e " ${YELLOW}⚠${NC} Tunnel API test inconclusive"
fi
# Test DNS API (if zone ID provided)
if [ -n "$CLOUDFLARE_ZONE_ID" ]; then
log_test " Testing DNS API with Zone ID..."
local dns_response=$(curl -s -X GET "https://api.cloudflare.com/client/v4/zones/$CLOUDFLARE_ZONE_ID" \
-H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
-H "Content-Type: application/json" 2>&1)
if echo "$dns_response" | grep -q '"success":true'; then
echo -e " ${GREEN}✓${NC} Zone access successful"
local zone_name=$(echo "$dns_response" | grep -o '"name":"[^"]*' | cut -d'"' -f4)
local zone_status=$(echo "$dns_response" | grep -o '"status":"[^"]*' | cut -d'"' -f4)
echo " Zone: $zone_name"
echo " Status: $zone_status"
else
echo -e " ${RED}✗${NC} Zone access failed"
echo " Response: $dns_response"
fi
else
log_warn " CLOUDFLARE_ZONE_ID not set, skipping DNS zone test"
fi
return 0
else
echo -e " ${RED}✗${NC} API token authentication failed"
if echo "$api_response" | grep -q '"errors"'; then
local error_msg=$(echo "$api_response" | grep -o '"message":"[^"]*' | head -1 | cut -d'"' -f4)
echo " Error: $error_msg"
else
echo " Response: $api_response"
fi
return 1
fi
}
main() {
echo "========================================="
echo "Cloudflare API Connection Test"
echo "========================================="
echo ""
# Check if .env file exists
if [ ! -f .env ]; then
log_warn ".env file not found. Using environment variables or defaults."
log_warn "Create .env from .env.example and configure credentials."
echo ""
fi
# Validate required variables
if [ -z "$CLOUDFLARE_API_TOKEN" ] && [ -z "$CLOUDFLARE_API_KEY" ]; then
log_error "CLOUDFLARE_API_TOKEN or CLOUDFLARE_API_KEY not set"
log_info "Set it in .env file or as environment variable:"
log_info " export CLOUDFLARE_API_TOKEN=your-api-token"
log_info " or export CLOUDFLARE_API_KEY=your-api-key"
log_info "Get token from: https://dash.cloudflare.com/profile/api-tokens"
exit 1
fi
echo "Configuration:"
if [ -n "$CLOUDFLARE_API_TOKEN" ]; then
echo " API Token: ${CLOUDFLARE_API_TOKEN:0:10}*** (hidden)"
elif [ -n "$CLOUDFLARE_API_KEY" ]; then
echo " API Key: ${CLOUDFLARE_API_KEY:0:10}*** (hidden)"
fi
if [ -n "$CLOUDFLARE_TUNNEL_TOKEN" ]; then
echo " Tunnel Token: ${CLOUDFLARE_TUNNEL_TOKEN:0:10}*** (hidden)"
fi
if [ -n "$CLOUDFLARE_ACCOUNT_ID" ]; then
echo " Account ID: $CLOUDFLARE_ACCOUNT_ID"
fi
if [ -n "$CLOUDFLARE_ACCOUNT_EMAIL" ]; then
echo " Account Email: $CLOUDFLARE_ACCOUNT_EMAIL"
fi
if [ -n "$CLOUDFLARE_ZONE_ID" ]; then
echo " Zone ID: $CLOUDFLARE_ZONE_ID"
fi
if [ -n "$CLOUDFLARE_DOMAIN" ]; then
echo " Domain: $CLOUDFLARE_DOMAIN"
fi
echo ""
# Test connection
test_cloudflare_api
local result=$?
echo ""
echo "========================================="
echo "Test Summary"
echo "========================================="
if [ $result -eq 0 ]; then
echo -e "${GREEN}✓${NC} Cloudflare API: Connection successful"
log_info "Cloudflare API is ready for use!"
exit 0
else
echo -e "${RED}✗${NC} Cloudflare API: Connection failed"
log_error "Check your API token and permissions."
exit 1
fi
}
main "$@"

scripts/utils/test-proxmox-connection.sh
#!/bin/bash
source ~/.bashrc
# Test Proxmox VE Connection Script
# Tests connectivity and authentication to Proxmox hosts using .env credentials
set -e
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
# Load environment variables from .env if it exists
if [ -f .env ]; then
# Source .env file, handling comments and inline comments
set -a
source <(grep -v '^#' .env | grep -v '^$' | sed 's/#.*$//' | sed 's/^[[:space:]]*//;s/[[:space:]]*$//' | grep '=')
set +a
fi
# Proxmox configuration
PVE_USERNAME="${PVE_USERNAME:-root@pam}"
PVE_PASSWORD="${PVE_ROOT_PASS:-}"
PROXMOX_ML110_URL="${PROXMOX_ML110_URL:-}"
PROXMOX_R630_URL="${PROXMOX_R630_URL:-}"
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_test() {
echo -e "${BLUE}[TEST]${NC} $1"
}
test_connection() {
local host_name=$1
local host_url=$2
if [ -z "$host_url" ]; then
log_error "$host_name: URL not set (check .env file)"
return 1
fi
if [ -z "$PVE_PASSWORD" ]; then
log_error "$host_name: PVE_ROOT_PASS not set (check .env file)"
return 1
fi
log_test "Testing connection to $host_name..."
echo " URL: $host_url"
# Extract hostname/IP from URL
local host_ip=$(echo "$host_url" | sed -E 's|https?://([^:]+).*|\1|')
# Test basic connectivity (ping) - optional, as ping may be blocked
log_test " Testing network connectivity..."
if ping -c 1 -W 2 "$host_ip" &> /dev/null; then
echo -e " ${GREEN}✓${NC} Network reachable (ping)"
else
echo -e " ${YELLOW}⚠${NC} Ping failed (may be blocked by firewall, continuing with API test...)"
fi
# Test HTTPS port connectivity
log_test " Testing HTTPS port (8006)..."
if timeout 3 bash -c "cat < /dev/null > /dev/tcp/$host_ip/8006" 2>/dev/null; then
echo -e " ${GREEN}✓${NC} Port 8006 is open"
else
echo -e " ${YELLOW}⚠${NC} Port test inconclusive (may require root), continuing with API test..."
fi
# Test Proxmox API authentication
log_test " Testing Proxmox API authentication..."
# Get CSRF token and ticket with timeout
local api_response=$(curl -s -k --connect-timeout 10 --max-time 15 \
-d "username=$PVE_USERNAME&password=$PVE_PASSWORD" \
"$host_url/api2/json/access/ticket" 2>&1)
if echo "$api_response" | grep -q '"data"'; then
local ticket=$(echo "$api_response" | grep -o '"ticket":"[^"]*' | cut -d'"' -f4)
local csrf_token=$(echo "$api_response" | grep -o '"CSRFPreventionToken":"[^"]*' | cut -d'"' -f4)
if [ -n "$ticket" ] && [ -n "$csrf_token" ]; then
echo -e " ${GREEN}✓${NC} Authentication successful"
# Test API access with ticket
log_test " Testing API access..."
local version_response=$(curl -s -k -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$host_url/api2/json/version" 2>&1)
if echo "$version_response" | grep -q '"data"'; then
local pve_version=$(echo "$version_response" | grep -o '"version":"[^"]*' | cut -d'"' -f4)
local release=$(echo "$version_response" | grep -o '"release":"[^"]*' | cut -d'"' -f4)
echo -e " ${GREEN}✓${NC} API access successful"
echo " Proxmox Version: $pve_version"
echo " Release: $release"
# Get cluster status if available
log_test " Testing cluster status..."
local cluster_response=$(curl -s -k -H "Cookie: PVEAuthCookie=$ticket" \
-H "CSRFPreventionToken: $csrf_token" \
"$host_url/api2/json/cluster/status" 2>&1)
if echo "$cluster_response" | grep -q '"data"'; then
echo -e " ${GREEN}✓${NC} Cluster API accessible"
local node_count=$(echo "$cluster_response" | grep -o '"name":"[^"]*' | wc -l)
echo " Cluster nodes found: $node_count"
else
echo -e " ${YELLOW}⚠${NC} Not in a cluster (standalone node)"
fi
return 0
else
echo -e " ${RED}✗${NC} API access failed"
echo " Response: $version_response"
return 1
fi
else
echo -e " ${RED}✗${NC} Failed to extract authentication tokens"
return 1
fi
else
echo -e " ${RED}✗${NC} Authentication failed"
if echo "$api_response" | grep -q "401"; then
echo " Error: Invalid credentials (check PVE_ROOT_PASS in .env)"
elif echo "$api_response" | grep -q "Connection refused"; then
echo " Error: Connection refused (check if Proxmox is running)"
elif echo "$api_response" | grep -q "Connection timed out\|timed out\|Operation timed out"; then
echo " Error: Connection timed out"
echo " Possible causes:"
echo " - Host is behind a firewall or VPN"
echo " - Host is not accessible from this network"
echo " - Host may be down or unreachable"
echo " Try accessing the web UI directly: $host_url"
elif [ -z "$api_response" ]; then
echo " Error: No response from server (connection timeout or network issue)"
echo " Try accessing the web UI directly: $host_url"
else
echo " Response: $api_response"
fi
return 1
fi
}
main() {
echo "========================================="
echo "Proxmox VE Connection Test"
echo "========================================="
echo ""
log_info "Note: Proxmox uses self-signed SSL certificates by default."
log_info "Browser warnings are normal. The script uses -k flag to bypass certificate validation."
echo ""
# Check if .env file exists
if [ ! -f .env ]; then
log_warn ".env file not found. Using environment variables or defaults."
log_warn "Create .env from .env.example and configure credentials."
echo ""
fi
# Validate required variables
if [ -z "$PVE_PASSWORD" ]; then
log_error "PVE_ROOT_PASS not set"
log_info "Set it in .env file or as environment variable:"
log_info " export PVE_ROOT_PASS=your-password"
exit 1
fi
echo "Configuration:"
echo " Username: $PVE_USERNAME"
echo " Password: ${PVE_PASSWORD:0:3}*** (hidden)"
echo ""
local ml110_result=0
local r630_result=0
# Test ML110
if [ -n "$PROXMOX_ML110_URL" ]; then
echo "----------------------------------------"
test_connection "HPE ML110 Gen9" "$PROXMOX_ML110_URL"
ml110_result=$?
echo ""
else
log_warn "PROXMOX_ML110_URL not set, skipping ML110 test"
ml110_result=1
fi
# Test R630 (continue even if ML110 failed)
if [ -n "$PROXMOX_R630_URL" ]; then
echo "----------------------------------------"
test_connection "Dell R630" "$PROXMOX_R630_URL"
r630_result=$?
echo ""
else
log_warn "PROXMOX_R630_URL not set, skipping R630 test"
r630_result=1
fi
# Summary
echo "========================================="
echo "Test Summary"
echo "========================================="
if [ -n "$PROXMOX_ML110_URL" ]; then
if [ $ml110_result -eq 0 ]; then
echo -e "${GREEN}✓${NC} HPE ML110 Gen9: Connection successful"
else
echo -e "${RED}✗${NC} HPE ML110 Gen9: Connection failed"
fi
fi
if [ -n "$PROXMOX_R630_URL" ]; then
if [ $r630_result -eq 0 ]; then
echo -e "${GREEN}✓${NC} Dell R630: Connection successful"
else
echo -e "${RED}✗${NC} Dell R630: Connection failed"
fi
fi
echo ""
if [ $ml110_result -eq 0 ] && [ $r630_result -eq 0 ]; then
log_info "All connections successful!"
exit 0
else
log_error "Some connections failed. Check your .env configuration."
exit 1
fi
}
main "$@"

scripts/utils/test-ssh-access.sh
#!/bin/bash
source ~/.bashrc
# Test SSH Access to Proxmox Servers
# Tests SSH connectivity to both ML110 and R630
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Load environment variables
if [ -f "$PROJECT_ROOT/.env" ]; then
set -a
source <(grep -v '^#' "$PROJECT_ROOT/.env" | grep -v '^$' | sed 's/#.*$//' | grep '=')
set +a
fi
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_test() {
echo -e "${BLUE}[TEST]${NC} $1"
}
ML110_IP="${PROXMOX_ML110_IP:-192.168.1.206}"
R630_IP="${PROXMOX_R630_IP:-192.168.1.49}"
test_ssh() {
local host=$1
local name=$2
log_test "Testing SSH to $name ($host)..."
# Test network connectivity first
if ping -c 1 -W 2 "$host" &>/dev/null; then
echo -e " ${GREEN}✓${NC} Network reachable (ping)"
else
echo -e " ${YELLOW}⚠${NC} Ping failed (may be blocked by firewall)"
fi
# Test SSH port
if timeout 3 bash -c "cat < /dev/null > /dev/tcp/$host/22" 2>/dev/null; then
echo -e " ${GREEN}✓${NC} SSH port 22 is open"
else
echo -e " ${RED}✗${NC} SSH port 22 is closed or filtered"
return 1
fi
# Test SSH connection
log_test " Attempting SSH connection..."
if ssh -o StrictHostKeyChecking=no -o ConnectTimeout=5 -o BatchMode=yes "root@$host" "echo 'SSH connection successful'" 2>&1 | grep -q "SSH connection successful"; then
echo -e " ${GREEN}✓${NC} SSH connection successful"
# Test command execution (avoid naming the variable "hostname", which shadows the command)
log_test " Testing command execution..."
local remote_hostname=$(ssh -o StrictHostKeyChecking=no -o ConnectTimeout=5 "root@$host" "hostname" 2>/dev/null)
if [ -n "$remote_hostname" ]; then
echo -e " ${GREEN}✓${NC} Command execution works"
echo -e " ${GREEN}✓${NC} Hostname: $remote_hostname"
# Get system info
local uptime=$(ssh -o StrictHostKeyChecking=no -o ConnectTimeout=5 "root@$host" "uptime -p" 2>/dev/null || echo "unknown")
local os=$(ssh -o StrictHostKeyChecking=no -o ConnectTimeout=5 "root@$host" "cat /etc/os-release | grep PRETTY_NAME | cut -d'=' -f2 | tr -d '\"'" 2>/dev/null || echo "unknown")
echo -e " ${GREEN}✓${NC} Uptime: $uptime"
echo -e " ${GREEN}✓${NC} OS: $os"
return 0
else
echo -e " ${YELLOW}⚠${NC} SSH works but command execution failed"
return 1
fi
else
echo -e " ${RED}✗${NC} SSH connection failed"
echo -e " ${YELLOW}Possible reasons:${NC}"
echo -e " - SSH service not running"
echo -e " - Root login disabled"
echo -e " - Authentication failed (need SSH key or password)"
echo -e " - Firewall blocking connection"
return 1
fi
}
test_ssh_with_password() {
local host=$1
local name=$2
local password=$3
log_test "Testing SSH with password authentication to $name ($host)..."
# Check if sshpass is available
if ! command -v sshpass &> /dev/null; then
log_warn "sshpass not installed - cannot test password authentication"
log_info "Install with: sudo apt install sshpass"
return 1
fi
if sshpass -p "$password" ssh -o StrictHostKeyChecking=no -o ConnectTimeout=5 "root@$host" "echo 'SSH with password successful'" 2>&1 | grep -q "SSH with password successful"; then
echo -e " ${GREEN}✓${NC} SSH with password authentication works"
return 0
else
echo -e " ${RED}✗${NC} SSH with password authentication failed"
return 1
fi
fi
}
main() {
echo "========================================="
echo "SSH Access Test - Proxmox Servers"
echo "========================================="
echo ""
local ml110_ok=false
local r630_ok=false
# Test ML110
log_info "Testing ML110 (HPE ML110 Gen9)..."
if test_ssh "$ML110_IP" "ML110"; then
ml110_ok=true
log_info "✓ ML110 SSH access: WORKING"
else
log_error "✗ ML110 SSH access: FAILED"
# Try with password if available
if [ -n "${PVE_ROOT_PASS:-}" ]; then
log_info "Attempting password authentication..."
if test_ssh_with_password "$ML110_IP" "ML110" "$PVE_ROOT_PASS"; then
ml110_ok=true
log_info "✓ ML110 SSH with password: WORKING"
fi
fi
fi
echo ""
echo "----------------------------------------"
echo ""
# Test R630
log_info "Testing R630 (Dell R630)..."
if test_ssh "$R630_IP" "R630"; then
r630_ok=true
log_info "✓ R630 SSH access: WORKING"
else
log_error "✗ R630 SSH access: FAILED"
# Try with password if available
if [ -n "${PVE_ROOT_PASS:-}" ]; then
log_info "Attempting password authentication..."
if test_ssh_with_password "$R630_IP" "R630" "$PVE_ROOT_PASS"; then
r630_ok=true
log_info "✓ R630 SSH with password: WORKING"
fi
fi
fi
echo ""
echo "========================================="
echo "Summary"
echo "========================================="
echo ""
if [ "$ml110_ok" = true ]; then
log_info "ML110 ($ML110_IP): ✓ SSH ACCESSIBLE"
else
log_error "ML110 ($ML110_IP): ✗ SSH NOT ACCESSIBLE"
log_warn " - Enable SSH: systemctl enable ssh && systemctl start ssh"
log_warn " - Allow root login: Edit /etc/ssh/sshd_config (PermitRootLogin yes)"
log_warn " - Check firewall: iptables -L"
fi
if [ "$r630_ok" = true ]; then
log_info "R630 ($R630_IP): ✓ SSH ACCESSIBLE"
else
log_error "R630 ($R630_IP): ✗ SSH NOT ACCESSIBLE"
log_warn " - Enable SSH: systemctl enable ssh && systemctl start ssh"
log_warn " - Allow root login: Edit /etc/ssh/sshd_config (PermitRootLogin yes)"
log_warn " - Check firewall: iptables -L"
fi
echo ""
if [ "$ml110_ok" = true ] && [ "$r630_ok" = true ]; then
log_info "✓ Both servers have SSH access - ready for template recreation!"
return 0
elif [ "$ml110_ok" = true ]; then
log_warn "Only ML110 has SSH access - can proceed with template recreation"
return 0
else
log_error "No SSH access available - need to enable SSH first"
return 1
fi
}
main "$@"

#!/bin/bash
source ~/.bashrc
# Validate Deployment
# Post-deployment validation and configuration drift detection
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_check() {
echo -e "${BLUE}[CHECK]${NC} $1"
}
validate_prerequisites() {
log_check "Validating prerequisites..."
if [ -f "$PROJECT_ROOT/scripts/utils/prerequisites-check.sh" ]; then
"$PROJECT_ROOT/scripts/utils/prerequisites-check.sh"
else
log_warn "Prerequisites check script not found"
fi
}
validate_connections() {
log_check "Validating connections..."
local all_valid=true
# Check Proxmox
if [ -f "$PROJECT_ROOT/scripts/utils/test-proxmox-connection.sh" ]; then
if "$PROJECT_ROOT/scripts/utils/test-proxmox-connection.sh" > /dev/null 2>&1; then
log_info "✓ Proxmox connections valid"
else
log_error "✗ Proxmox connections invalid"
all_valid=false
fi
fi
# Check Cloudflare
if [ -f "$PROJECT_ROOT/scripts/utils/test-cloudflare-connection.sh" ]; then
if "$PROJECT_ROOT/scripts/utils/test-cloudflare-connection.sh" > /dev/null 2>&1; then
log_info "✓ Cloudflare connection valid"
else
log_warn "⚠ Cloudflare connection invalid (may not be configured)"
fi
fi
if [ "$all_valid" = false ]; then
return 1
fi
return 0
}
validate_health() {
log_check "Validating component health..."
if [ -f "$PROJECT_ROOT/scripts/health/health-check-all.sh" ]; then
if "$PROJECT_ROOT/scripts/health/health-check-all.sh" > /dev/null 2>&1; then
log_info "✓ All components healthy"
return 0
else
log_error "✗ Some components unhealthy"
return 1
fi
else
log_warn "Health check script not found"
return 0
fi
}
validate_services() {
log_check "Validating services..."
if ! command -v kubectl &> /dev/null; then
log_warn "kubectl not found, skipping service validation"
return 0
fi
if kubectl get nodes &> /dev/null; then
log_info "✓ Kubernetes cluster accessible"
# Check for expected namespaces
local namespaces=("blockchain" "monitoring" "hc-stack")
for ns in "${namespaces[@]}"; do
if kubectl get namespace "$ns" &> /dev/null; then
log_info "✓ Namespace $ns exists"
else
log_warn "⚠ Namespace $ns not found"
fi
done
else
log_warn "⚠ Kubernetes cluster not accessible"
fi
return 0
}
main() {
echo "========================================="
echo "Deployment Validation"
echo "========================================="
echo ""
local validation_passed=true
validate_prerequisites
echo ""
if ! validate_connections; then
validation_passed=false
fi
echo ""
if ! validate_health; then
validation_passed=false
fi
echo ""
validate_services
echo ""
echo "========================================="
echo "Validation Summary"
echo "========================================="
if [ "$validation_passed" = true ]; then
log_info "✓ Deployment validation passed"
exit 0
else
log_error "✗ Deployment validation failed"
exit 1
fi
}
main "$@"

#!/bin/bash
source ~/.bashrc
# Apply Install Scripts to VMs via SSH
# This script connects to each VM and runs the appropriate install script
set -e
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
log_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
log_step() {
echo -e "${BLUE}[STEP]${NC} $1"
}
log_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
# VM Configuration
declare -A VMS=(
[100]="cloudflare-tunnel:192.168.1.60:setup-cloudflare-tunnel.sh"
[101]="k3s-master:192.168.1.188:setup-k3s.sh"
[102]="git-server:192.168.1.121:setup-git-server.sh"
[103]="observability:192.168.1.82:setup-observability.sh"
)
SSH_USER="${SSH_USER:-ubuntu}"
SSH_KEY="${SSH_KEY:-$HOME/.ssh/id_rsa}"
# Check if VM is reachable
check_vm_reachable() {
local ip=$1
local timeout=5
if ping -c 1 -W $timeout "$ip" > /dev/null 2>&1; then
return 0
else
return 1
fi
}
# Wait for VM to be ready
wait_for_vm() {
local ip=$1
local max_attempts=30
local attempt=0
log_info "Waiting for VM at $ip to be reachable..."
while [ $attempt -lt $max_attempts ]; do
if check_vm_reachable "$ip"; then
log_info "✓ VM is reachable"
return 0
fi
attempt=$((attempt + 1))
echo -n "."
sleep 2
done
echo ""
log_error "VM at $ip is not reachable after $max_attempts attempts"
return 1
}
# Check SSH connectivity
check_ssh() {
local ip=$1
local user=$2
if ssh -o ConnectTimeout=5 -o StrictHostKeyChecking=no -i "$SSH_KEY" "${user}@${ip}" "echo 'SSH OK'" > /dev/null 2>&1; then
return 0
else
return 1
fi
}
# Wait for SSH
wait_for_ssh() {
local ip=$1
local user=$2
local max_attempts=60
local attempt=0
log_info "Waiting for SSH on $ip..."
while [ $attempt -lt $max_attempts ]; do
if check_ssh "$ip" "$user"; then
log_info "✓ SSH is ready"
return 0
fi
attempt=$((attempt + 1))
echo -n "."
sleep 5
done
echo ""
log_error "SSH not available after $max_attempts attempts"
return 1
}
# Apply install script to VM
apply_install_script() {
local vmid=$1
local name=$2
local ip=$3
local script=$4
log_step "Applying install script to VM $vmid: $name"
# Wait for VM to be ready
if ! wait_for_vm "$ip"; then
log_error "VM not reachable, skipping..."
return 1
fi
# Wait for SSH
if ! wait_for_ssh "$ip" "$SSH_USER"; then
log_error "SSH not available, skipping..."
return 1
fi
# Copy install script to VM
log_info "Copying install script to VM..."
if ! scp -o StrictHostKeyChecking=no -i "$SSH_KEY" "scripts/${script}" "${SSH_USER}@${ip}:/tmp/install-service.sh"; then
log_error "Failed to copy script"
return 1
fi
# Make script executable and run it
log_info "Running install script on VM..."
if ssh -o StrictHostKeyChecking=no -i "$SSH_KEY" "${SSH_USER}@${ip}" <<EOF
sudo chmod +x /tmp/install-service.sh
sudo /tmp/install-service.sh
EOF
then
log_info "✓ Install script completed successfully"
return 0
else
log_error "✗ Install script failed"
return 1
fi
}
main() {
echo "========================================="
echo "Apply Install Scripts to VMs"
echo "========================================="
echo ""
if [ ! -f "$SSH_KEY" ]; then
log_error "SSH key not found: $SSH_KEY"
log_info "Set SSH_KEY environment variable or create key pair"
exit 1
fi
log_info "Using SSH key: $SSH_KEY"
log_info "SSH user: $SSH_USER"
echo ""
# Apply scripts to each VM
for vmid in 100 101 102 103; do
IFS=':' read -r name ip script <<< "${VMS[$vmid]}"
if apply_install_script "$vmid" "$name" "$ip" "$script"; then
log_info "✓ VM $vmid ($name) setup complete"
else
log_error "✗ Failed to setup VM $vmid"
fi
echo ""
done
log_info "========================================="
log_info "Install Script Application Complete"
log_info "========================================="
echo ""
log_info "All VMs should now have their services installed"
log_info "Check each VM to verify services are running"
}
main "$@"
