Apply Composer changes: comprehensive API updates, migrations, middleware, and infrastructure improvements

- Add comprehensive database migrations (001-024) for schema evolution
- Enhance API schema with expanded type definitions and resolvers
- Add new middleware: audit logging, rate limiting, MFA enforcement, security, tenant auth
- Implement new services: AI optimization, billing, blockchain, compliance, marketplace
- Add adapter layer for cloud integrations (Cloudflare, Kubernetes, Proxmox, storage)
- Update Crossplane provider with enhanced VM management capabilities
- Add comprehensive test suite for API endpoints and services
- Update frontend components with improved GraphQL subscriptions and real-time updates
- Enhance security configurations and headers (CSP, CORS, etc.)
- Update documentation and configuration files
- Add new CI/CD workflows and validation scripts
- Implement design system improvements and UI enhancements
This commit is contained in:
defiQUG
2025-12-12 18:01:35 -08:00
parent e01131efaf
commit 9daf1fd378
968 changed files with 160890 additions and 1092 deletions


@@ -0,0 +1,240 @@
# Infrastructure Monitoring
Monitoring for the infrastructure components of Sankofa Phoenix.
## Overview
This directory contains the monitoring components: custom Prometheus exporters, Grafana dashboards, and alerting rules.
## Components
### Exporters (`exporters/`)
Custom Prometheus exporters for:
- Proxmox VE metrics
- TP-Link Omada metrics
- Network switch/router metrics
- Infrastructure health checks
### Dashboards (`dashboards/`)
Grafana dashboards for:
- Infrastructure overview
- Proxmox cluster health
- Network performance
- Omada controller status
- Site-level monitoring
## Exporters
### Proxmox Exporter
The Proxmox exporter (`pve_exporter`) provides metrics for:
- VM status and resource usage
- Node health and performance
- Storage pool utilization
- Network interface statistics
- Cluster status

**Installation:**
```bash
# Published on PyPI as prometheus-pve-exporter; it installs the pve_exporter command
pip install prometheus-pve-exporter
```
**Configuration:**
```yaml
exporter:
  listen_address: 0.0.0.0:9221
proxmox:
  endpoint: https://pve1.sankofa.nexus:8006
  username: monitoring@pam
  password: ${PROXMOX_PASSWORD}
```
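The `${PROXMOX_PASSWORD}` placeholder is not expanded by YAML itself, so something has to substitute it before the exporter reads the file. Whether the exporter does this on its own is not stated here. The sketch below shows one way to render the file from the environment using only the Python standard library; the function name `expand_placeholders` is illustrative and not part of the repository.

```python
import os
from string import Template


def expand_placeholders(config_text: str, env: dict) -> str:
    """Replace ${NAME} placeholders with values from env.

    Raises KeyError if a placeholder has no value, so a missing secret
    fails loudly instead of producing a config with an empty password.
    Note: string.Template treats "$" specially; a literal dollar sign in
    the config must be written as "$$".
    """
    return Template(config_text).substitute(env)


if __name__ == "__main__":
    # Hypothetical usage: render the config shown above from the process environment.
    example = "proxmox:\n  password: ${PROXMOX_PASSWORD}\n"
    print(expand_placeholders(example, dict(os.environ, PROXMOX_PASSWORD="example")))
```

Failing on a missing variable is a deliberate choice: an exporter started with an unexpanded placeholder would otherwise only show up later as authentication errors.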
### Omada Exporter
Custom exporter for TP-Link Omada Controller metrics:
- Access point status
- Client device counts
- Network throughput
- Controller health

**See**: `exporters/omada_exporter/` for implementation
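
A custom exporter ultimately serves plain text in the Prometheus exposition format. The following is a minimal sketch of that rendering step, not the implementation in `exporters/omada_exporter/`. The metric names come from the Metrics section of this README; the `site` and `ap` labels and the choice to expose every metric as a gauge are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Sample:
    """One sample: metric name, value, and an optional label set."""
    name: str
    value: float
    labels: dict[str, str] = field(default_factory=dict)


def escape_label_value(value: str) -> str:
    # The exposition format requires escaping backslash, double quote, and newline.
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render(samples: list[Sample], help_text: dict[str, str]) -> str:
    """Render samples as exposition text, keeping each metric's lines together."""
    grouped: dict[str, list[Sample]] = {}
    for sample in samples:
        grouped.setdefault(sample.name, []).append(sample)

    lines = []
    for name, group in grouped.items():
        lines.append(f"# HELP {name} {help_text.get(name, name)}")
        lines.append(f"# TYPE {name} gauge")  # assumption: all metrics are gauges
        for sample in group:
            value = repr(float(sample.value))
            if sample.labels:
                pairs = ",".join(
                    f'{key}="{escape_label_value(val)}"'
                    for key, val in sorted(sample.labels.items())
                )
                lines.append(f"{name}{{{pairs}}} {value}")
            else:
                lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render(
        [Sample("omada_ap_status", 1, {"site": "hq", "ap": "ap-1"}),
         Sample("omada_clients_total", 42)],
        {"omada_ap_status": "Access point status"},
    ))
```

In practice the `prometheus_client` library handles this formatting and the HTTP endpoint; the hand-rolled version is shown only to make the wire format visible.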
### Network Exporter
SNMP-based exporter for network devices:
- Switch port statistics
- Router interface metrics
- VLAN utilization
- Network topology changes

**See**: `exporters/network_exporter/` for implementation
## Dashboards
### Infrastructure Overview
Comprehensive dashboard showing:
- All sites status
- Resource utilization
- Health scores
- Alert summary

**Location**: `dashboards/infrastructure-overview.json`
### Proxmox Cluster
Dashboard for Proxmox clusters:
- Cluster health
- Node performance
- VM resource usage
- Storage utilization

**Location**: `dashboards/proxmox-cluster.json`
### Network Performance
Network performance dashboard:
- Bandwidth utilization
- Latency metrics
- Error rates
- Top talkers

**Location**: `dashboards/network-performance.json`
### Omada Controller
Omada-specific dashboard:
- Controller status
- Access point health
- Client statistics
- Network policies

**Location**: `dashboards/omada-controller.json`
## Installation
### Deploy Exporters
```bash
# Deploy all exporters
kubectl apply -f exporters/manifests/
# Or deploy individually
kubectl apply -f exporters/manifests/proxmox-exporter.yaml
kubectl apply -f exporters/manifests/omada-exporter.yaml
```
### Import Dashboards
```bash
# Import all dashboards to Grafana
./scripts/import-dashboards.sh
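# Or import individually through the Grafana HTTP API
# (GRAFANA_URL and GRAFANA_API_TOKEN are placeholders for your instance and token)
curl -X POST "$GRAFANA_URL/api/dashboards/db" \
  -H "Authorization: Bearer $GRAFANA_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d @dashboards/infrastructure-overview.json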
```
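Grafana accepts dashboards over its HTTP API at `POST /api/dashboards/db`, with the dashboard nested under a `dashboard` key. The sketch below shows how a bulk import could build and send that payload. It is not necessarily what `scripts/import-dashboards.sh` does, and `build_import_payload`, `import_dashboard`, the Grafana URL, and the API token are all hypothetical names and inputs.

```python
import json
import urllib.request


def build_import_payload(document: dict, overwrite: bool = True) -> dict:
    """Wrap a dashboard document in the body expected by /api/dashboards/db.

    Accepts either a bare dashboard or one already wrapped in {"dashboard": ...}.
    """
    dashboard = dict(document.get("dashboard", document))
    dashboard["id"] = None  # a null id asks Grafana to create or match by uid
    return {"dashboard": dashboard, "overwrite": overwrite}


def import_dashboard(path: str, grafana_url: str, token: str) -> dict:
    """Send one dashboard file to Grafana and return the decoded JSON response."""
    with open(path, encoding="utf-8") as handle:
        payload = build_import_payload(json.load(handle))
    request = urllib.request.Request(
        f"{grafana_url.rstrip('/')}/api/dashboards/db",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Setting `overwrite` to true makes repeated imports idempotent, at the cost of replacing edits made in the Grafana UI.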
## Configuration
### Prometheus Scrape Configuration
```yaml
scrape_configs:
  # Job name matches the job="pve_exporter" selector used in the Proxmox dashboards
  - job_name: 'pve_exporter'
    static_configs:
      - targets:
          # Service name as used in the troubleshooting commands
          - 'proxmox-exporter.monitoring.svc.cluster.local:9221'
  - job_name: 'omada'
    static_configs:
      - targets:
          - 'omada-exporter.monitoring.svc.cluster.local:9222'
  - job_name: 'network'
    static_configs:
      - targets:
          - 'network-exporter.monitoring.svc.cluster.local:9223'
```
### Alerting Rules
Alert rules are defined in `exporters/alert-rules/`:
- `proxmox-alerts.yaml`: Proxmox cluster alerts
- `omada-alerts.yaml`: Omada controller alerts
- `network-alerts.yaml`: Network infrastructure alerts
## Metrics
### Proxmox Metrics
- `pve_node_status`: Node status (0=offline, 1=online)
- `pve_vm_status`: VM status
- `pve_storage_used_bytes`: Storage usage
- `pve_network_rx_bytes`: Network receive bytes
- `pve_network_tx_bytes`: Network transmit bytes
### Omada Metrics
- `omada_ap_status`: Access point status
- `omada_clients_total`: Total client count
- `omada_throughput_bytes`: Network throughput
- `omada_controller_status`: Controller health
### Network Metrics
- `network_port_status`: Switch port status
- `network_port_rx_bytes`: Port receive bytes
- `network_port_tx_bytes`: Port transmit bytes
- `network_vlan_utilization`: VLAN utilization
## Alerts
### Critical Alerts
- Proxmox cluster node down
- Omada controller unreachable
- Network switch offline
- High resource utilization (>90%)
### Warning Alerts
- High resource utilization (>80%)
- Network latency spikes
- Access point offline
- Storage pool >80% full
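
The two utilization thresholds overlap, since anything above 90% is also above 80%, so a value should map to the most severe level it matches. The following is a minimal sketch of that precedence, assuming utilization is expressed as a percentage. The function name is illustrative and does not come from the alert rules in `exporters/alert-rules/`.

```python
from typing import Optional

CRITICAL_THRESHOLD = 90.0  # percent, from "Critical Alerts"
WARNING_THRESHOLD = 80.0   # percent, from "Warning Alerts"


def utilization_severity(percent: float) -> Optional[str]:
    """Return "critical", "warning", or None for a utilization percentage."""
    # Check the stricter threshold first so it takes precedence.
    if percent > CRITICAL_THRESHOLD:
        return "critical"
    if percent > WARNING_THRESHOLD:
        return "warning"
    return None
```

Both comparisons are strict, matching the `>` in the lists above, so exactly 90% is still a warning and exactly 80% raises nothing.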
## Troubleshooting
### Exporter Issues
```bash
# Check exporter status
kubectl get pods -n monitoring -l app=proxmox-exporter
# View exporter logs
kubectl logs -n monitoring -l app=proxmox-exporter
# Test exporter endpoint
curl http://proxmox-exporter.monitoring.svc.cluster.local:9221/metrics
```
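A responding endpoint does not guarantee that the expected metrics are present. The sketch below parses the text returned by `/metrics` and reports which expected metric names are missing. The helper names are hypothetical, and the expected names would come from the Metrics section of this README.

```python
import re
import urllib.request

# A metric name at the start of a sample line, per the exposition format.
_METRIC_NAME = re.compile(r"^([a-zA-Z_:][a-zA-Z0-9_:]*)")


def metric_names(exposition: str) -> set:
    """Collect the metric names that have at least one sample line."""
    names = set()
    for line in exposition.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = _METRIC_NAME.match(line)
        if match:
            names.add(match.group(1))
    return names


def missing_metrics(exposition: str, expected: list) -> list:
    """Return the expected metric names that do not appear in the output."""
    return sorted(set(expected) - metric_names(exposition))


def fetch_metrics(url: str) -> str:
    """Download the exposition text from an exporter endpoint."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8")
```

For example, `missing_metrics(fetch_metrics(url), ["pve_node_status", "pve_vm_status"])` returns an empty list when both metrics are exported.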
### Dashboard Issues
```bash
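# Verify dashboard import through the Grafana search API
# (GRAFANA_URL and GRAFANA_API_TOKEN are placeholders for your instance and token)
curl -H "Authorization: Bearer $GRAFANA_API_TOKEN" "$GRAFANA_URL/api/search?type=dash-db"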
# Check dashboard data sources
# In Grafana UI: Configuration > Data Sources
```
## Related Documentation
- [Proxmox Management](../proxmox/README.md)
- [Omada Management](../omada/README.md)
- [Network Management](../network/README.md)
- [Infrastructure Management](../README.md)


@@ -0,0 +1,85 @@
{
"dashboard": {
"title": "Proxmox Cluster Overview",
"tags": ["proxmox", "infrastructure"],
"timezone": "browser",
"schemaVersion": 16,
"version": 1,
"refresh": "30s",
"panels": [
{
"id": 1,
"title": "Cluster Nodes Status",
"type": "stat",
"targets": [
{
"expr": "up{job=\"pve_exporter\"}",
"legendFormat": "{{instance}}"
}
],
"gridPos": {"h": 4, "w": 6, "x": 0, "y": 0}
},
{
"id": 2,
"title": "Total VMs",
"type": "stat",
"targets": [
{
"expr": "count(pve_vm_info)",
"legendFormat": "VMs"
}
],
"gridPos": {"h": 4, "w": 6, "x": 6, "y": 0}
},
{
"id": 3,
"title": "Running VMs",
"type": "stat",
"targets": [
{
"expr": "count(pve_vm_info{status=\"running\"})",
"legendFormat": "Running"
}
],
"gridPos": {"h": 4, "w": 6, "x": 12, "y": 0}
},
{
"id": 4,
"title": "CPU Usage by Node",
"type": "graph",
"targets": [
{
"expr": "pve_node_cpu_usage",
"legendFormat": "{{node}}"
}
],
"gridPos": {"h": 8, "w": 12, "x": 0, "y": 4}
},
{
"id": 5,
"title": "Memory Usage by Node",
"type": "graph",
"targets": [
{
"expr": "pve_node_memory_usage",
"legendFormat": "{{node}}"
}
],
"gridPos": {"h": 8, "w": 12, "x": 12, "y": 4}
},
{
"id": 6,
"title": "Storage Usage",
"type": "graph",
"targets": [
{
"expr": "pve_storage_usage",
"legendFormat": "{{storage}}"
}
],
"gridPos": {"h": 8, "w": 24, "x": 0, "y": 12}
}
]
}
}


@@ -0,0 +1,131 @@
{
"dashboard": {
"title": "Proxmox Node Details",
"tags": ["proxmox", "node", "infrastructure"],
"timezone": "browser",
"schemaVersion": 16,
"version": 1,
"refresh": "30s",
"panels": [
{
"id": 1,
"title": "Node Status",
"type": "stat",
"targets": [
{
"expr": "up{job=\"pve_exporter\",instance=~\"$node\"}",
"legendFormat": "{{instance}}"
}
],
"gridPos": {"h": 4, "w": 6, "x": 0, "y": 0}
},
{
"id": 2,
"title": "CPU Usage",
"type": "gauge",
"targets": [
{
"expr": "pve_node_cpu_usage{node=~\"$node\"}",
"legendFormat": "{{node}}"
}
],
"gridPos": {"h": 4, "w": 6, "x": 6, "y": 0}
},
{
"id": 3,
"title": "Memory Usage",
"type": "gauge",
"targets": [
{
"expr": "pve_node_memory_usage{node=~\"$node\"}",
"legendFormat": "{{node}}"
}
],
"gridPos": {"h": 4, "w": 6, "x": 12, "y": 0}
},
{
"id": 4,
"title": "CPU Usage Over Time",
"type": "graph",
"targets": [
{
"expr": "pve_node_cpu_usage{node=~\"$node\"}",
"legendFormat": "{{node}}"
}
],
"gridPos": {"h": 8, "w": 12, "x": 0, "y": 4}
},
{
"id": 5,
"title": "Memory Usage Over Time",
"type": "graph",
"targets": [
{
"expr": "pve_node_memory_usage{node=~\"$node\"}",
"legendFormat": "{{node}}"
}
],
"gridPos": {"h": 8, "w": 12, "x": 12, "y": 4}
},
{
"id": 6,
"title": "Storage Usage by Pool",
"type": "graph",
"targets": [
{
"expr": "pve_storage_usage{node=~\"$node\"}",
"legendFormat": "{{storage}}"
}
],
"gridPos": {"h": 8, "w": 12, "x": 0, "y": 12}
},
{
"id": 7,
"title": "Network I/O",
"type": "graph",
"targets": [
{
"expr": "pve_node_net_in{node=~\"$node\"}",
"legendFormat": "{{node}} - In"
},
{
"expr": "pve_node_net_out{node=~\"$node\"}",
"legendFormat": "{{node}} - Out"
}
],
"gridPos": {"h": 8, "w": 12, "x": 12, "y": 12}
},
{
"id": 8,
"title": "Disk I/O",
"type": "graph",
"targets": [
{
"expr": "pve_node_disk_read{node=~\"$node\"}",
"legendFormat": "{{node}} - Read"
},
{
"expr": "pve_node_disk_write{node=~\"$node\"}",
"legendFormat": "{{node}} - Write"
}
],
"gridPos": {"h": 8, "w": 24, "x": 0, "y": 20}
}
],
"templating": {
"list": [
{
"name": "node",
"type": "query",
"query": "label_values(pve_node_info, node)",
"includeAll": true,
"current": {
"text": "All",
"value": "$__all"
},
"options": []
}
]
}
}
}


@@ -0,0 +1,82 @@
{
"dashboard": {
"title": "Proxmox VMs",
"tags": ["proxmox", "vms"],
"timezone": "browser",
"schemaVersion": 16,
"version": 1,
"refresh": "30s",
"panels": [
{
"id": 1,
"title": "VM CPU Usage",
"type": "graph",
"targets": [
{
"expr": "pve_vm_cpu_usage",
"legendFormat": "{{name}} ({{vmid}})"
}
],
"gridPos": {"h": 8, "w": 12, "x": 0, "y": 0}
},
{
"id": 2,
"title": "VM Memory Usage",
"type": "graph",
"targets": [
{
"expr": "pve_vm_memory_usage",
"legendFormat": "{{name}} ({{vmid}})"
}
],
"gridPos": {"h": 8, "w": 12, "x": 12, "y": 0}
},
{
"id": 3,
"title": "VM Network I/O",
"type": "graph",
"targets": [
{
"expr": "pve_vm_net_in",
"legendFormat": "{{name}} - In"
},
{
"expr": "pve_vm_net_out",
"legendFormat": "{{name}} - Out"
}
],
"gridPos": {"h": 8, "w": 12, "x": 0, "y": 8}
},
{
"id": 4,
"title": "VM Disk I/O",
"type": "graph",
"targets": [
{
"expr": "pve_vm_disk_read",
"legendFormat": "{{name}} - Read"
},
{
"expr": "pve_vm_disk_write",
"legendFormat": "{{name}} - Write"
}
],
"gridPos": {"h": 8, "w": 12, "x": 12, "y": 8}
},
{
"id": 5,
"title": "VM Status",
"type": "table",
"targets": [
{
"expr": "pve_vm_info",
"format": "table",
"instant": true
}
],
"gridPos": {"h": 8, "w": 24, "x": 0, "y": 16}
}
]
}
}