Initial Phoenix Sankofa Cloud setup

- Complete project structure with Next.js frontend
- GraphQL API backend with Apollo Server
- Portal application with NextAuth
- Crossplane Proxmox provider
- GitOps configurations
- CI/CD pipelines
- Testing infrastructure (Vitest, Jest, Go tests)
- Error handling and monitoring
- Security hardening
- UI component library
- Documentation
defiQUG
2025-11-28 12:54:33 -08:00
commit 6f28146ac3
229 changed files with 43136 additions and 0 deletions

scripts/README.md Normal file

@@ -0,0 +1,100 @@
# Installation Scripts
Automated installation scripts for deploying the hybrid cloud control plane.
## Structure
```
scripts/
├── bootstrap-cluster.sh # Kubernetes cluster bootstrap
├── install-components.sh # Control plane components installation
├── setup-proxmox-agents.sh # Proxmox site agent setup
├── configure-cloudflare.sh # Cloudflare tunnel configuration
├── validate.sh # Post-install validation
└── ansible/ # Ansible playbooks
    ├── site-playbook.yml # Multi-site deployment
    ├── inventory.example # Inventory template
    └── roles/ # Ansible roles
```
## Usage
### Quick Start
```bash
# 1. Bootstrap Kubernetes cluster
./bootstrap-cluster.sh
# 2. Install control plane components
./install-components.sh
# 3. Set up Proxmox agents (run on each Proxmox node; the script reads SITE and NODE from the environment)
SITE=us-east-1 NODE=pve1 ./setup-proxmox-agents.sh
# 4. Configure Cloudflare tunnels
./configure-cloudflare.sh
# 5. Validate installation
./validate.sh
```
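The scripts take their settings from environment variables (for example `SITE` and `NODE` for the Proxmox agent step, or `CLOUDFLARE_API_TOKEN`, `ZONE_ID` and `ACCOUNT_ID` for the Cloudflare step). The following is a minimal sketch of a pre-flight check; `require_env` is a hypothetical helper and is not part of the repository scripts.

```bash
# Hypothetical helper (an assumption, not shipped with these scripts).
# Reports every named variable that is unset or empty; returns non-zero if any is missing.
require_env() {
    local name value missing=0
    for name in "$@"; do
        eval "value=\${${name}:-}"   # indirect lookup of the variable called $name
        if [ -z "${value}" ]; then
            echo "missing environment variable: ${name}" >&2
            missing=1
        fi
    done
    return "${missing}"
}

# Example: the Proxmox agent step needs SITE and NODE
SITE="us-east-1"
NODE="pve1"
if require_env SITE NODE; then
    echo "ready to run ./setup-proxmox-agents.sh"
fi
```

Checking all required variables in one pass lists every missing name at once instead of stopping at the first one.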
### Ansible Deployment
For multi-site deployments, use Ansible:
```bash
cd ansible
cp inventory.example inventory
# Edit inventory with your hosts
ansible-playbook -i inventory site-playbook.yml
```
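Before running the playbook it can help to confirm which hosts carry a given `site=` value. The sketch below assumes the host-line layout used in `inventory.example` (`NAME key=value ...`); `list_site_hosts` is a hypothetical helper name.

```bash
# Hypothetical helper (an assumption, not part of the repository).
# list_site_hosts INVENTORY SITE: print the host names whose site= value equals SITE.
list_site_hosts() {
    local inventory=$1 site=$2
    awk -v want="site=${site}" '
        /^[ \t]*#/ { next }    # skip comment lines
        {
            # host lines look like: NAME key=value key=value ...
            for (i = 2; i <= NF; i++)
                if ($i == want) { print $1; break }
        }
    ' "${inventory}"
}

# Usage (run from the ansible/ directory after copying the template):
#   list_site_hosts inventory us-east-1
```

To restrict a run to one inventory group, `ansible-playbook` also accepts `--limit`, for example `--limit proxmox_site_1`.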
## Prerequisites
- Linux-based systems (Ubuntu 22.04+, RHEL 8+, Debian 11+)
- Root or sudo access
- Internet connectivity
- Kubernetes cluster (for component installation)
- Proxmox VE 8+ (for agent setup)
- Cloudflare account (for tunnel configuration)
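The scripts also rely on a few command-line tools (`bootstrap-cluster.sh` installs `curl`, `wget`, `git` and `jq`; the later steps call `kubectl` and `cloudflared`). As a rough sketch, a check like the one below can report what is missing on a host; `missing_commands` is a hypothetical helper name.

```bash
# Hypothetical helper (an assumption, not part of the repository).
# missing_commands CMD...: print each command that is not on PATH; non-zero if any is missing.
missing_commands() {
    local cmd rc=0
    for cmd in "$@"; do
        if ! command -v "${cmd}" >/dev/null 2>&1; then
            echo "${cmd}"
            rc=1
        fi
    done
    return "${rc}"
}

# Usage:
#   missing_commands curl wget git jq kubectl || echo "install the tools listed above first" >&2
```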
## Script Details
### bootstrap-cluster.sh
Installs and configures a Kubernetes cluster (RKE2 or k3s):
- System preparation
- Container runtime installation
- Kubernetes installation
- Network plugin configuration
- Storage class setup
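`bootstrap-cluster.sh` in this commit waits a fixed 30 seconds after starting the Kubernetes service. One alternative, sketched here under the assumption that a polling loop is acceptable, is to retry a readiness command until it succeeds or a timeout expires; `wait_until` is a hypothetical helper and is not in the script.

```bash
# Hypothetical helper (an assumption, not part of bootstrap-cluster.sh).
# wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND...: retry COMMAND until it
# succeeds; give up with status 1 once TIMEOUT_SECONDS have elapsed.
wait_until() {
    local timeout=$1 interval=$2 elapsed=0
    shift 2
    until "$@"; do
        if [ "${elapsed}" -ge "${timeout}" ]; then
            return 1
        fi
        sleep "${interval}"
        elapsed=$((elapsed + interval))
    done
}

# Usage: poll the API server for up to five minutes instead of sleeping blindly
#   wait_until 300 5 kubectl get nodes
```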
### install-components.sh
Installs all control plane components:
- ArgoCD
- Rancher
- Crossplane
- Vault
- Monitoring stack
- Portal
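The script tracks components as `namespace/deployment` pairs and waits for each deployment to become available. The sketch below only prints the corresponding `kubectl wait` commands, which makes it easy to review them before running anything; `print_wait_commands` is a hypothetical helper name.

```bash
# Hypothetical helper (an assumption, not part of install-components.sh).
# Emits one `kubectl wait` command per "namespace/deployment" argument.
print_wait_commands() {
    local entry
    for entry in "$@"; do
        # ${entry%%/*} is the text before the first slash, ${entry#*/} the text after it
        echo "kubectl wait --for=condition=available deployment/${entry#*/} -n ${entry%%/*} --timeout=600s"
    done
}

print_wait_commands "argocd/argocd-server" "vault/vault"
```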
### setup-proxmox-agents.sh
Configures Proxmox nodes:
- cloudflared installation
- Prometheus exporter installation
- Custom agent installation
- Service configuration
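The cloudflared download is chosen by CPU architecture: the script maps the `uname -m` value to the suffix used in the release file name. A minimal sketch of that mapping, with the hypothetical function name `cloudflared_arch`:

```bash
# Sketch of the architecture mapping; the function name is an assumption.
# Maps `uname -m` output to the suffix in cloudflared-linux-<suffix>.
cloudflared_arch() {
    case "$1" in
        x86_64)  echo "amd64" ;;
        aarch64) echo "arm64" ;;
        *)
            echo "unsupported architecture: $1" >&2
            return 1
            ;;
    esac
}

# Usage:
#   arch=$(cloudflared_arch "$(uname -m)")
```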
### configure-cloudflare.sh
Sets up Cloudflare tunnels:
- Tunnel creation
- Configuration deployment
- Service startup
- Health checks
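Each hostname is published as a proxied CNAME record pointing at `<tunnel-id>.cfargotunnel.com`. The sketch below builds the JSON body for that record; `dns_record_payload` is a hypothetical helper, and it assumes the hostname and tunnel ID contain no characters that need JSON escaping.

```bash
# Hypothetical helper (an assumption, not part of configure-cloudflare.sh).
# dns_record_payload HOSTNAME TUNNEL_ID: print the JSON body for the CNAME record.
# Assumes both arguments are free of quotes and backslashes.
dns_record_payload() {
    printf '{"type":"CNAME","name":"%s","content":"%s.cfargotunnel.com","ttl":1,"proxied":true}\n' "$1" "$2"
}

dns_record_payload "portal" "abc123"
```

Keeping the payload in one function makes it easy to inspect exactly what would be sent before calling the API.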
### validate.sh
Validates installation:
- Component health checks
- API connectivity tests
- Resource availability
- Network connectivity
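`validate.sh` counts errors and warnings as the checks run and prints both totals in a summary. Counters need some care under `set -e`: in bash, `((n++))` returns a non-zero status when `n` is 0, which ends the script. A minimal sketch of a safe pattern follows; the function names are illustrative.

```bash
set -e

ERRORS=0
WARNINGS=0

record_error() {
    echo "ERROR: $*" >&2
    ERRORS=$((ERRORS + 1))      # an assignment succeeds even when the old value is 0
}

record_warning() {
    echo "WARNING: $*" >&2
    WARNINGS=$((WARNINGS + 1))
}

record_error "namespace vault does not exist"
record_warning "deployment portal is not ready"
record_warning "no default storage class found"

echo "Errors: ${ERRORS}, Warnings: ${WARNINGS}"
```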


@@ -0,0 +1,26 @@
# Ansible Inventory Example
# Copy to inventory and customize with your hosts
[proxmox_site_1]
pve1 ansible_host=10.1.0.10 site=us-east-1
pve2 ansible_host=10.1.0.11 site=us-east-1
pve3 ansible_host=10.1.0.12 site=us-east-1
[proxmox_site_2]
pve4 ansible_host=10.2.0.10 site=eu-west-1
pve5 ansible_host=10.2.0.11 site=eu-west-1
pve6 ansible_host=10.2.0.12 site=eu-west-1
[proxmox_site_3]
pve7 ansible_host=10.3.0.10 site=apac-1
pve8 ansible_host=10.3.0.11 site=apac-1
[proxmox:children]
proxmox_site_1
proxmox_site_2
proxmox_site_3
[proxmox:vars]
ansible_user=root
ansible_ssh_common_args='-o StrictHostKeyChecking=no'


@@ -0,0 +1,126 @@
---
# Ansible Playbook for Multi-Site Deployment
# Deploys agents and configures Proxmox sites
- name: Deploy Hybrid Cloud Control Plane to Multiple Sites
  hosts: all
  become: yes
  vars:
    cloudflare_tunnel_token: "{{ vault_cloudflare_tunnel_token }}"
    # Prefer the per-host `site` value from the inventory; fall back to a hostname-derived name
    site_name: "{{ site | default(inventory_hostname | regex_replace('^pve[0-9]+', 'site')) }}"
    prometheus_enabled: true
  tasks:
    - name: Install required packages
      package:
        name:
          - curl
          - wget
          - git
          - jq
        state: present
      when: ansible_os_family == "Debian"
    - name: Install cloudflared
      block:
        - name: Check if cloudflared is installed
          command: which cloudflared
          register: cloudflared_check
          changed_when: false
          failed_when: false
        - name: Download cloudflared
          get_url:
            url: "https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-{{ ansible_architecture | replace('x86_64', 'amd64') | replace('aarch64', 'arm64') }}"
            dest: /usr/local/bin/cloudflared
            mode: '0755'
          when: cloudflared_check.rc != 0
    - name: Create cloudflared directories
      file:
        path: "{{ item }}"
        state: directory
        mode: '0755'
      loop:
        - /etc/cloudflared
        - /etc/cloudflared/tunnel-configs
        - /var/log/cloudflared
    - name: Copy tunnel configuration
      template:
        src: tunnel-config.j2
        dest: /etc/cloudflared/tunnel-configs/{{ site_name }}.yaml
        mode: '0644'
      vars:
        node_name: "{{ inventory_hostname }}"
    - name: Create tunnel credentials file
      copy:
        content: '{"AccountTag":"","TunnelSecret":"","TunnelID":"","TunnelName":"{{ site_name }}-tunnel"}'
        dest: /etc/cloudflared/{{ site_name }}-tunnel.json
        mode: '0600'
    # site_name comes from the play-level vars; redefining it here in terms of itself would recurse
    - name: Create cloudflared systemd service
      template:
        src: cloudflared.service.j2
        dest: /etc/systemd/system/cloudflared-tunnel.service
        mode: '0644'
      notify: restart cloudflared
    - name: Install Prometheus exporter
      block:
        - name: Install Python pip
          package:
            name: python3-pip
            state: present
          when: ansible_os_family == "Debian"
        - name: Install pve_exporter
          pip:
            name: pve_exporter
            state: present
      when: prometheus_enabled | bool
    - name: Create pve_exporter systemd service
      template:
        src: pve-exporter.service.j2
        dest: /etc/systemd/system/pve-exporter.service
        mode: '0644'
      when: prometheus_enabled | bool
      notify: restart pve-exporter
    - name: Enable and start services
      systemd:
        name: "{{ item }}"
        enabled: yes
        state: started
        daemon_reload: yes
      loop:
        - cloudflared-tunnel
        - pve-exporter
      when: item != "pve-exporter" or prometheus_enabled | bool
    - name: Verify cloudflared is running
      systemd:
        name: cloudflared-tunnel
      register: cloudflared_status
    - name: Display tunnel status
      debug:
        msg: "Cloudflare tunnel is {{ cloudflared_status.status.ActiveState }}"
  handlers:
    - name: restart cloudflared
      systemd:
        name: cloudflared-tunnel
        state: restarted
        daemon_reload: yes
    - name: restart pve-exporter
      systemd:
        name: pve-exporter
        state: restarted
        daemon_reload: yes


@@ -0,0 +1,16 @@
[Unit]
Description=Cloudflare Tunnel for {{ site_name }}
After=network.target
[Service]
Type=simple
User=root
ExecStart=/usr/local/bin/cloudflared tunnel --config /etc/cloudflared/tunnel-configs/{{ site_name }}.yaml run
Restart=on-failure
RestartSec=5s
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=multi-user.target


@@ -0,0 +1,16 @@
[Unit]
Description=Proxmox VE Prometheus Exporter
After=network.target
[Service]
Type=simple
User=root
ExecStart=/usr/local/bin/pve_exporter --web.listen-address=0.0.0.0:9221
Restart=on-failure
RestartSec=5s
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=multi-user.target


@@ -0,0 +1,50 @@
# Cloudflare Tunnel Configuration for {{ site_name }}
# Generated by Ansible
tunnel: {{ site_name }}-tunnel
credentials-file: /etc/cloudflared/{{ site_name }}-tunnel.json
ingress:
  # Proxmox Web UI
  - hostname: {{ node_name }}.yourdomain.com
    service: https://{{ node_name }}.local:8006
    originRequest:
      noHappyEyeballs: true
      connectTimeout: 30s
      # Proxmox serves a self-signed certificate by default, so skip origin verification
      noTLSVerify: true
      httpHostHeader: {{ node_name }}.local:8006
  # Proxmox API
  - hostname: {{ node_name }}-api.yourdomain.com
    service: https://{{ node_name }}.local:8006
    originRequest:
      noHappyEyeballs: true
      connectTimeout: 30s
      noTLSVerify: true
      httpHostHeader: {{ node_name }}.local:8006
  # Prometheus Exporter
  - hostname: {{ node_name }}-metrics.yourdomain.com
    service: http://localhost:9221
    originRequest:
      noHappyEyeballs: true
      connectTimeout: 30s
  # Catch-all rule (must be last)
  - service: http_status:404
# Logging
loglevel: info
logfile: /var/log/cloudflared/{{ site_name }}-tunnel.log
# Metrics
metrics: 0.0.0.0:9090
# Health check
health-probe:
  enabled: true
  path: /health
  port: 8080

scripts/bootstrap-cluster.sh Executable file

@@ -0,0 +1,167 @@
#!/bin/bash
set -euo pipefail
# Kubernetes Cluster Bootstrap Script
# Supports RKE2 and k3s
K8S_DISTRO="${K8S_DISTRO:-rke2}"
# Leave empty to let the installer pick its default channel; "latest" is not a valid release tag
K8S_VERSION="${K8S_VERSION:-}"
NODE_TYPE="${NODE_TYPE:-server}"
MASTER_NODES="${MASTER_NODES:-}"
TOKEN="${TOKEN:-}"
log() {
echo "[$(date +'%Y-%m-%d %H:%M:%S')] $*" >&2
}
error() {
log "ERROR: $*"
exit 1
}
install_rke2() {
log "Installing RKE2 ${K8S_VERSION}..."
# Install RKE2
curl -sfL https://get.rke2.io | INSTALL_RKE2_VERSION="${K8S_VERSION}" sh -
# Configure RKE2
mkdir -p /etc/rancher/rke2
if [ "${NODE_TYPE}" = "server" ]; then
cat > /etc/rancher/rke2/config.yaml <<EOF
token: ${TOKEN:-$(openssl rand -hex 32)}
cluster-cidr: "10.42.0.0/16"
service-cidr: "10.43.0.0/16"
cluster-dns: "10.43.0.10"
EOF
# Enable required features
systemctl enable rke2-server.service
systemctl start rke2-server.service
else
cat > /etc/rancher/rke2/config.yaml <<EOF
server: https://${MASTER_NODES}:9345
token: ${TOKEN}
EOF
systemctl enable rke2-agent.service
systemctl start rke2-agent.service
fi
# Wait for service to be ready
log "Waiting for RKE2 to be ready..."
sleep 30
# Install kubectl
if [ "${NODE_TYPE}" = "server" ]; then
mkdir -p /usr/local/bin
cp /var/lib/rancher/rke2/bin/kubectl /usr/local/bin/kubectl
chmod +x /usr/local/bin/kubectl
# Configure kubeconfig
mkdir -p ~/.kube
cp /etc/rancher/rke2/rke2.yaml ~/.kube/config
chmod 600 ~/.kube/config
fi
}
install_k3s() {
log "Installing k3s ${K8S_VERSION}..."
if [ "${NODE_TYPE}" = "server" ]; then
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION="${K8S_VERSION}" sh -s - server \
--cluster-init \
--cluster-cidr 10.42.0.0/16 \
--service-cidr 10.43.0.0/16
# Wait for k3s to be ready
log "Waiting for k3s to be ready..."
sleep 30
# Configure kubeconfig
mkdir -p ~/.kube
cp /etc/rancher/k3s/k3s.yaml ~/.kube/config
chmod 600 ~/.kube/config
else
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION="${K8S_VERSION}" K3S_URL=https://${MASTER_NODES}:6443 K3S_TOKEN=${TOKEN} sh -
fi
}
setup_system() {
log "Setting up system prerequisites..."
# Disable swap
swapoff -a
sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
# Load required kernel modules
modprobe overlay
modprobe br_netfilter
# Configure sysctl
cat > /etc/sysctl.d/99-kubernetes-cri.conf <<EOF
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF
sysctl --system
# Install required packages
if command -v apt-get &> /dev/null; then
apt-get update
apt-get install -y curl wget git jq
elif command -v yum &> /dev/null; then
yum install -y curl wget git jq
fi
}
install_network_plugin() {
log "Installing network plugin (Cilium)..."
kubectl apply -f https://raw.githubusercontent.com/cilium/cilium/1.14.0/install/kubernetes/quick-install.yaml
log "Waiting for Cilium to be ready..."
kubectl wait --for=condition=ready pod -l k8s-app=cilium -n kube-system --timeout=300s
}
install_storage_class() {
log "Installing local-path storage class..."
kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/v0.0.24/deploy/local-path-storage.yaml
# Set as default
kubectl patch storageclass local-path -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
}
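main() {
log "Starting Kubernetes cluster bootstrap..."
# Agent nodes cannot join without a server address and a join token
if [ "${NODE_TYPE}" != "server" ] && { [ -z "${MASTER_NODES}" ] || [ -z "${TOKEN}" ]; }; then
error "MASTER_NODES and TOKEN are required when NODE_TYPE is not 'server'"
fi
setup_system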
case "${K8S_DISTRO}" in
rke2)
install_rke2
;;
k3s)
install_k3s
;;
*)
error "Unsupported Kubernetes distribution: ${K8S_DISTRO}"
;;
esac
if [ "${NODE_TYPE}" = "server" ]; then
install_network_plugin
install_storage_class
log "Kubernetes cluster bootstrap completed!"
log "Kubeconfig location: ~/.kube/config"
kubectl get nodes
else
log "Agent node setup completed!"
fi
}
main "$@"

scripts/configure-cloudflare.sh Executable file

@@ -0,0 +1,147 @@
#!/bin/bash
set -euo pipefail
# Cloudflare Tunnel Configuration Script
CLOUDFLARE_API_TOKEN="${CLOUDFLARE_API_TOKEN:-}"
ZONE_ID="${ZONE_ID:-}"
ACCOUNT_ID="${ACCOUNT_ID:-}"
log() {
echo "[$(date +'%Y-%m-%d %H:%M:%S')] $*" >&2
}
error() {
log "ERROR: $*"
exit 1
}
check_prerequisites() {
if [ -z "${CLOUDFLARE_API_TOKEN}" ]; then
error "CLOUDFLARE_API_TOKEN environment variable is required"
fi
if [ -z "${ZONE_ID}" ]; then
error "ZONE_ID environment variable is required"
fi
if [ -z "${ACCOUNT_ID}" ]; then
error "ACCOUNT_ID environment variable is required"
fi
if ! command -v cloudflared &> /dev/null; then
error "cloudflared is not installed. Install it first."
fi
}
create_tunnel() {
local tunnel_name=$1
log "Creating Cloudflare tunnel: ${tunnel_name}"
# Create tunnel via API
TUNNEL_ID=$(curl -s -X POST \
-H "Authorization: Bearer ${CLOUDFLARE_API_TOKEN}" \
-H "Content-Type: application/json" \
"https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/cfd_tunnel" \
-d "{\"name\":\"${tunnel_name}\",\"config_src\":\"local\"}" \
| jq -r '.result.id')
if [ -z "${TUNNEL_ID}" ] || [ "${TUNNEL_ID}" = "null" ]; then
error "Failed to create tunnel ${tunnel_name}"
fi
log "Tunnel created with ID: ${TUNNEL_ID}"
echo "${TUNNEL_ID}"
}
get_tunnel_token() {
local tunnel_id=$1
log "Getting tunnel token for ${tunnel_id}..."
TOKEN=$(curl -s -X GET \
-H "Authorization: Bearer ${CLOUDFLARE_API_TOKEN}" \
"https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/cfd_tunnel/${tunnel_id}/token" \
| jq -r '.result.token')
if [ -z "${TOKEN}" ] || [ "${TOKEN}" = "null" ]; then
error "Failed to get tunnel token"
fi
echo "${TOKEN}"
}
configure_dns() {
    local hostname=$1
    local tunnel_id=$2
    local response
    log "Configuring DNS for ${hostname}..."
    # Create CNAME record
    response=$(curl -s -X POST \
        -H "Authorization: Bearer ${CLOUDFLARE_API_TOKEN}" \
        -H "Content-Type: application/json" \
        "https://api.cloudflare.com/client/v4/zones/${ZONE_ID}/dns_records" \
        -d "{
            \"type\": \"CNAME\",
            \"name\": \"${hostname}\",
            \"content\": \"${tunnel_id}.cfargotunnel.com\",
            \"ttl\": 1,
            \"proxied\": true
        }")
    # The API reports failures in the JSON body, so check the success flag
    if [ "$(echo "${response}" | jq -r '.success')" != "true" ]; then
        error "Failed to create DNS record for ${hostname}"
    fi
    log "DNS record created for ${hostname}"
}
setup_control_plane_tunnel() {
log "Setting up control plane tunnel..."
TUNNEL_ID=$(create_tunnel "control-plane-tunnel")
TUNNEL_TOKEN=$(get_tunnel_token "${TUNNEL_ID}")
log "Control plane tunnel token: ${TUNNEL_TOKEN}"
log "Save this token securely and use it when running cloudflared on control plane nodes"
# Configure DNS for control plane services
configure_dns "portal" "${TUNNEL_ID}"
configure_dns "rancher" "${TUNNEL_ID}"
configure_dns "argocd" "${TUNNEL_ID}"
configure_dns "grafana" "${TUNNEL_ID}"
configure_dns "vault" "${TUNNEL_ID}"
configure_dns "keycloak" "${TUNNEL_ID}"
}
setup_proxmox_tunnels() {
local sites=("site-1" "site-2" "site-3")
for site in "${sites[@]}"; do
log "Setting up tunnel for ${site}..."
TUNNEL_ID=$(create_tunnel "proxmox-${site}-tunnel")
TUNNEL_TOKEN=$(get_tunnel_token "${TUNNEL_ID}")
log "${site} tunnel token: ${TUNNEL_TOKEN}"
log "Save this token and use it when running setup-proxmox-agents.sh on ${site} nodes"
done
}
main() {
log "Starting Cloudflare tunnel configuration..."
check_prerequisites
setup_control_plane_tunnel
setup_proxmox_tunnels
log ""
log "Cloudflare tunnel configuration completed!"
log ""
log "Next steps:"
log "1. Save all tunnel tokens securely"
log "2. Use tokens when running cloudflared on respective nodes"
log "3. Verify DNS records are created correctly"
log "4. Test tunnel connectivity"
}
main "$@"

scripts/install-components.sh Executable file

@@ -0,0 +1,160 @@
#!/bin/bash
set -euo pipefail
# Control Plane Components Installation Script
GITOPS_REPO="${GITOPS_REPO:-https://github.com/yourorg/hybrid-cloud-gitops}"
GITOPS_BRANCH="${GITOPS_BRANCH:-main}"
ARGOCD_NAMESPACE="${ARGOCD_NAMESPACE:-argocd}"
log() {
echo "[$(date +'%Y-%m-%d %H:%M:%S')] $*" >&2
}
error() {
log "ERROR: $*"
exit 1
}
check_prerequisites() {
log "Checking prerequisites..."
if ! command -v kubectl &> /dev/null; then
error "kubectl is not installed"
fi
if ! kubectl cluster-info &> /dev/null; then
error "Cannot connect to Kubernetes cluster"
fi
}
install_argocd() {
log "Installing ArgoCD..."
kubectl create namespace ${ARGOCD_NAMESPACE} --dry-run=client -o yaml | kubectl apply -f -
kubectl apply -n ${ARGOCD_NAMESPACE} -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
log "Waiting for ArgoCD to be ready..."
kubectl wait --for=condition=available deployment/argocd-server -n ${ARGOCD_NAMESPACE} --timeout=600s
# Get initial admin password
ARGOCD_PASSWORD=$(kubectl -n ${ARGOCD_NAMESPACE} get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d)
log "ArgoCD admin password: ${ARGOCD_PASSWORD}"
log "Save this password securely!"
}
install_argocd_applications() {
log "Installing ArgoCD applications from GitOps repo..."
# Apply root application
kubectl apply -f - <<EOF
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: root-apps
  namespace: ${ARGOCD_NAMESPACE}
spec:
  project: default
  source:
    repoURL: ${GITOPS_REPO}
    targetRevision: ${GITOPS_BRANCH}
    path: gitops/apps
  destination:
    server: https://kubernetes.default.svc
    namespace: ${ARGOCD_NAMESPACE}
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
EOF
log "ArgoCD applications will sync automatically from GitOps repo"
}
install_crossplane_provider() {
    log "Installing Crossplane Proxmox provider..."
    # Resolve the provider directory relative to this script, not the caller's working directory
    local provider_dir
    provider_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)/../crossplane-provider-proxmox"
    # Install CRDs
    if [ -d "${provider_dir}/config/crd/bases" ]; then
        kubectl apply -f "${provider_dir}/config/crd/bases/"
    else
        log "Warning: Crossplane provider CRDs not found, skipping..."
    fi
    # Install provider
    if [ -f "${provider_dir}/config/provider.yaml" ]; then
        kubectl apply -f "${provider_dir}/config/provider.yaml"
    else
        log "Warning: Crossplane provider manifest not found, skipping..."
    fi
}
wait_for_components() {
log "Waiting for all components to be ready..."
local components=(
"argocd/argocd-server"
"rancher-system/rancher"
"crossplane-system/crossplane"
"vault/vault"
"monitoring/kube-prometheus-stack"
"portal/portal"
)
for component in "${components[@]}"; do
IFS='/' read -r namespace deployment <<< "${component}"
if kubectl get deployment "${deployment}" -n "${namespace}" &> /dev/null; then
log "Waiting for ${deployment} in ${namespace}..."
kubectl wait --for=condition=available "deployment/${deployment}" -n "${namespace}" --timeout=600s || true
fi
done
}
print_access_info() {
log "=== Access Information ==="
# ArgoCD
ARGOCD_PASSWORD=$(kubectl -n ${ARGOCD_NAMESPACE} get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" 2>/dev/null | base64 -d || echo "N/A")
log "ArgoCD:"
log " URL: https://argocd.yourdomain.com"
log " Username: admin"
log " Password: ${ARGOCD_PASSWORD}"
# Rancher
log "Rancher:"
log " URL: https://rancher.yourdomain.com"
# Portal
log "Portal:"
log " URL: https://portal.yourdomain.com"
# Grafana
GRAFANA_PASSWORD=$(kubectl -n monitoring get secret kube-prometheus-stack-grafana -o jsonpath="{.data.admin-password}" 2>/dev/null | base64 -d || echo "admin")
log "Grafana:"
log " URL: https://grafana.yourdomain.com"
log " Username: admin"
log " Password: ${GRAFANA_PASSWORD}"
}
main() {
log "Starting control plane components installation..."
check_prerequisites
install_argocd
install_argocd_applications
install_crossplane_provider
log "Waiting for components to be ready (this may take several minutes)..."
wait_for_components
print_access_info
log "Installation completed!"
log "Note: Some components may take additional time to fully sync from GitOps"
}
main "$@"

scripts/setup-proxmox-agents.sh Executable file

@@ -0,0 +1,199 @@
#!/bin/bash
set -euo pipefail
# Proxmox Agent Setup Script
SITE="${SITE:-}"
NODE="${NODE:-}"
CLOUDFLARE_TUNNEL_TOKEN="${CLOUDFLARE_TUNNEL_TOKEN:-}"
PROMETHEUS_ENABLED="${PROMETHEUS_ENABLED:-true}"
log() {
echo "[$(date +'%Y-%m-%d %H:%M:%S')] $*" >&2
}
error() {
log "ERROR: $*"
exit 1
}
check_prerequisites() {
if [ -z "${SITE}" ]; then
error "SITE environment variable is required"
fi
if [ -z "${NODE}" ]; then
error "NODE environment variable is required"
fi
if ! command -v pvesh &> /dev/null; then
error "This script must be run on a Proxmox node"
fi
}
install_cloudflared() {
log "Installing cloudflared..."
if command -v cloudflared &> /dev/null; then
log "cloudflared is already installed"
return
fi
# Download and install cloudflared
ARCH=$(uname -m)
case "${ARCH}" in
x86_64)
ARCH="amd64"
;;
aarch64)
ARCH="arm64"
;;
*)
error "Unsupported architecture: ${ARCH}"
;;
esac
CLOUDFLARED_VERSION="2023.10.0"
wget -q "https://github.com/cloudflare/cloudflared/releases/download/${CLOUDFLARED_VERSION}/cloudflared-linux-${ARCH}" -O /usr/local/bin/cloudflared
chmod +x /usr/local/bin/cloudflared
log "cloudflared installed successfully"
}
configure_cloudflared_tunnel() {
log "Configuring Cloudflare tunnel..."
if [ -z "${CLOUDFLARE_TUNNEL_TOKEN}" ]; then
log "Warning: CLOUDFLARE_TUNNEL_TOKEN not set, skipping tunnel configuration"
return
fi
# Create tunnel config directory
mkdir -p /etc/cloudflared
# Create tunnel credentials
cat > /etc/cloudflared/${SITE}-tunnel.json <<EOF
{"AccountTag":"","TunnelSecret":"","TunnelID":"","TunnelName":"${SITE}-tunnel"}
EOF
# Create systemd service
cat > /etc/systemd/system/cloudflared-tunnel.service <<EOF
[Unit]
Description=Cloudflare Tunnel
After=network.target
[Service]
Type=simple
User=root
ExecStart=/usr/local/bin/cloudflared tunnel --config /etc/cloudflared/tunnel-configs/${SITE}.yaml run
Restart=on-failure
RestartSec=5s
[Install]
WantedBy=multi-user.target
EOF
# Copy tunnel config (should be provided separately)
mkdir -p /etc/cloudflared/tunnel-configs
if [ -f "../cloudflare/tunnel-configs/proxmox-${SITE}.yaml" ]; then
cp "../cloudflare/tunnel-configs/proxmox-${SITE}.yaml" "/etc/cloudflared/tunnel-configs/${SITE}.yaml"
else
log "Warning: Tunnel config file not found, creating basic config..."
cat > "/etc/cloudflared/tunnel-configs/${SITE}.yaml" <<EOF
tunnel: ${SITE}-tunnel
credentials-file: /etc/cloudflared/${SITE}-tunnel.json
ingress:
  - hostname: ${NODE}.yourdomain.com
    service: https://localhost:8006
    originRequest:
      noTLSVerify: true
  - service: http_status:404
EOF
fi
systemctl daemon-reload
systemctl enable cloudflared-tunnel.service
systemctl start cloudflared-tunnel.service
log "Cloudflare tunnel configured and started"
}
install_prometheus_exporter() {
if [ "${PROMETHEUS_ENABLED}" != "true" ]; then
log "Prometheus exporter disabled, skipping..."
return
fi
log "Installing Prometheus exporter (pve_exporter)..."
# Check if pve_exporter is already installed
if command -v pve_exporter &> /dev/null; then
log "pve_exporter is already installed"
return
fi
# Install pve_exporter via pip or download binary
if command -v pip3 &> /dev/null; then
pip3 install pve_exporter
else
log "Warning: pip3 not found, please install pve_exporter manually"
return
fi
# Create systemd service
cat > /etc/systemd/system/pve-exporter.service <<EOF
[Unit]
Description=Proxmox VE Prometheus Exporter
After=network.target
[Service]
Type=simple
User=root
ExecStart=/usr/local/bin/pve_exporter --web.listen-address=0.0.0.0:9221
Restart=on-failure
RestartSec=5s
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable pve-exporter.service
systemctl start pve-exporter.service
log "Prometheus exporter installed and started"
}
configure_proxmox_api() {
log "Configuring Proxmox API access..."
# Create API token for Crossplane provider
# This should be done manually or via Proxmox API
log "Note: Create an API token in Proxmox web UI:"
log " Datacenter -> Permissions -> API Tokens"
log " Token ID: crossplane-${SITE}"
log " User: root@pam or dedicated service account"
log " Permissions: Administrator or specific VM permissions"
}
main() {
log "Starting Proxmox agent setup for site ${SITE}, node ${NODE}..."
check_prerequisites
install_cloudflared
configure_cloudflared_tunnel
install_prometheus_exporter
configure_proxmox_api
log "Proxmox agent setup completed!"
log ""
log "Next steps:"
log "1. Verify Cloudflare tunnel: systemctl status cloudflared-tunnel"
log "2. Verify Prometheus exporter: curl http://localhost:9221/metrics"
log "3. Create API token in Proxmox web UI for Crossplane provider"
}
main "$@"

scripts/validate.sh Executable file

@@ -0,0 +1,199 @@
#!/bin/bash
set -euo pipefail
# Post-Install Validation Script
ERRORS=0
WARNINGS=0
log() {
echo "[$(date +'%Y-%m-%d %H:%M:%S')] $*" >&2
}
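error() {
    log "ERROR: $*"
    # Plain assignment: ((ERRORS++)) returns non-zero when ERRORS is 0 and would abort under set -e
    ERRORS=$((ERRORS + 1))
}
warning() {
    log "WARNING: $*"
    WARNINGS=$((WARNINGS + 1))
}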
success() {
log "$*"
}
check_kubectl() {
log "Checking kubectl connectivity..."
if ! kubectl cluster-info &> /dev/null; then
error "Cannot connect to Kubernetes cluster"
return 1
fi
success "Kubernetes cluster is accessible"
return 0
}
check_namespaces() {
log "Checking required namespaces..."
local namespaces=(
"argocd"
"rancher-system"
"crossplane-system"
"vault"
"monitoring"
"portal"
"keycloak"
)
for ns in "${namespaces[@]}"; do
if kubectl get namespace "${ns}" &> /dev/null; then
success "Namespace ${ns} exists"
else
error "Namespace ${ns} does not exist"
fi
done
}
check_deployments() {
log "Checking deployments..."
local deployments=(
"argocd/argocd-server"
"rancher-system/rancher"
"crossplane-system/crossplane"
"vault/vault"
"monitoring/kube-prometheus-stack"
"portal/portal"
)
for deployment in "${deployments[@]}"; do
IFS='/' read -r namespace name <<< "${deployment}"
if kubectl get deployment "${name}" -n "${namespace}" &> /dev/null; then
READY=$(kubectl get deployment "${name}" -n "${namespace}" -o jsonpath='{.status.readyReplicas}')
DESIRED=$(kubectl get deployment "${name}" -n "${namespace}" -o jsonpath='{.status.replicas}')
# Kubernetes omits these status fields when the count is zero, so default to 0
READY="${READY:-0}"
DESIRED="${DESIRED:-0}"
if [ "${READY}" = "${DESIRED}" ] && [ "${READY}" != "0" ]; then
success "Deployment ${name} in ${namespace} is ready (${READY}/${DESIRED})"
else
warning "Deployment ${name} in ${namespace} is not ready (${READY}/${DESIRED})"
fi
else
error "Deployment ${name} in ${namespace} does not exist"
fi
done
}
check_argocd() {
    log "Checking ArgoCD..."
    if kubectl get deployment argocd-server -n argocd &> /dev/null; then
        # Forward the ArgoCD service locally, probe it, then stop this forwarder by PID
        kubectl port-forward -n argocd svc/argocd-server 8080:443 &> /dev/null &
        local pf_pid=$!
        sleep 2
        if curl -k -s https://localhost:8080 &> /dev/null; then
            success "ArgoCD server is accessible"
        else
            warning "ArgoCD server may not be fully ready"
        fi
        kill "${pf_pid}" 2> /dev/null || true
        wait "${pf_pid}" 2> /dev/null || true
    else
        error "ArgoCD server not found"
    fi
}
check_crossplane() {
log "Checking Crossplane..."
if kubectl get deployment crossplane -n crossplane-system &> /dev/null; then
# Check if provider is installed
if kubectl get providerconfig -n crossplane-system &> /dev/null; then
success "Crossplane provider configs found"
else
warning "No Crossplane provider configs found"
fi
else
error "Crossplane not found"
fi
}
check_proxmox_connectivity() {
log "Checking Proxmox connectivity (if configured)..."
# This would check if Proxmox sites are reachable via tunnels
# Implementation depends on your setup
warning "Proxmox connectivity check not implemented"
}
check_cloudflare_tunnels() {
log "Checking Cloudflare tunnels..."
# Check if cloudflared processes are running on nodes
# This is a simplified check
if command -v cloudflared &> /dev/null; then
if systemctl is-active --quiet cloudflared-tunnel 2>/dev/null; then
success "Cloudflare tunnel service is active"
else
warning "Cloudflare tunnel service may not be running"
fi
else
warning "cloudflared not found (may not be installed on this node)"
fi
}
check_storage() {
log "Checking storage classes..."
if kubectl get storageclass &> /dev/null; then
DEFAULT_SC=$(kubectl get storageclass -o jsonpath='{.items[?(@.metadata.annotations.storageclass\.kubernetes\.io/is-default-class=="true")].metadata.name}')
if [ -n "${DEFAULT_SC}" ]; then
success "Default storage class: ${DEFAULT_SC}"
else
warning "No default storage class found"
fi
else
error "Cannot list storage classes"
fi
}
print_summary() {
log ""
log "=== Validation Summary ==="
log "Errors: ${ERRORS}"
log "Warnings: ${WARNINGS}"
log ""
if [ "${ERRORS}" -eq 0 ]; then
log "✓ All critical checks passed!"
if [ "${WARNINGS}" -gt 0 ]; then
log "⚠ Some warnings were found, but installation appears functional"
fi
return 0
else
log "✗ Some errors were found. Please review and fix them."
return 1
fi
}
main() {
log "Starting post-install validation..."
log ""
check_kubectl
check_namespaces
check_deployments
check_argocd
check_crossplane
check_proxmox_connectivity
check_cloudflare_tunnels
check_storage
print_summary
}
main "$@"