# Proxmox VE → Azure Arc → Hybrid Cloud Stack
Complete end-to-end implementation package for transforming two Proxmox VE hosts into a fully Azure-integrated Hybrid Cloud stack with high availability, Kubernetes orchestration, GitOps workflows, and blockchain infrastructure services.
## 🎯 Overview
This project provides a comprehensive blueprint and automation scripts to deploy:
- **Proxmox VE Cluster**: 2-node high-availability cluster with shared storage
- **Azure Arc Integration**: Full visibility and management from Azure Portal
- **Kubernetes (K3s)**: Lightweight Kubernetes cluster for container orchestration
- **GitOps Workflow**: Declarative infrastructure and application management
- **Private Git/DevOps**: Self-hosted Git repository (Gitea/GitLab)
- **Hybrid Cloud Stack**: Complete blockchain and monitoring services
## 🏗️ Architecture
```
Azure Portal
Azure Arc (Servers, Kubernetes, GitOps)
Proxmox VE Cluster (2 Nodes)
Kubernetes (K3s) + Applications
HC Stack Services (Besu, Firefly, Chainlink, Blockscout, Cacti, NGINX)
```
See [Architecture Documentation](docs/architecture.md) for detailed architecture overview.
## 🖥️ Azure Stack HCI Architecture
This project now includes a complete **Azure Stack HCI integration** with Cloudflare Zero Trust, comprehensive network segmentation, and centralized storage management.
### Key Components
- **Router/Switch/Storage Controller Server**: New server acting as router, switch, and storage controller
  - 4× Spectrum WAN connections (multi-WAN load balancing)
  - OpenWrt VM for network routing and firewall
  - Storage Spaces Direct for 4× external storage shelves
  - Intel QAT 8970 for crypto acceleration
- **Proxmox VE Hosts**: Existing HPE ML110 Gen9 and Dell R630
  - VLAN bridges mapped to network schema
  - Storage mounts from Router server
  - Azure Arc Connected Machine agents
- **Ubuntu Service VMs**: Cloudflare Tunnel, reverse proxy, observability, CI/CD
  - All VMs with Azure Arc agents
  - VLAN-segmented network access
- **Cloudflare Zero Trust**: Secure external access without inbound ports
  - Tunnel for WAC, Proxmox UI, dashboards, Git, CI
  - SSO/MFA policies
  - WAF protection
- **Azure Arc Governance**: Complete Azure integration
  - Policy enforcement
  - Monitoring and Defender
  - Update Management
### Network Topology
- **VLAN 10**: Storage (10.10.10.0/24)
- **VLAN 20**: Compute (10.10.20.0/24)
- **VLAN 30**: App Tier (10.10.30.0/24)
- **VLAN 40**: Observability (10.10.40.0/24)
- **VLAN 50**: Dev/Test (10.10.50.0/24)
- **VLAN 60**: Management (10.10.60.0/24)
- **VLAN 99**: DMZ (10.10.99.0/24)
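
The schema above follows a regular pattern: the third octet of each subnet equals the VLAN ID. A small helper, sketched under that assumption, can derive the subnet when scripting bridge or firewall configuration. The function name `vlan_subnet` is hypothetical and not part of the repository.

```bash
# Derive the /24 subnet for a VLAN ID, assuming the 10.10.<vlan-id>.0/24
# pattern from the list above. vlan_subnet is a hypothetical helper name.
vlan_subnet() {
  case "$1" in
    10|20|30|40|50|60|99) printf '10.10.%s.0/24\n' "$1" ;;
    *) printf 'unknown VLAN: %s\n' "$1" >&2; return 1 ;;
  esac
}

# Print the subnet for every VLAN in the schema.
for id in 10 20 30 40 50 60 99; do
  printf 'VLAN %s -> %s\n' "$id" "$(vlan_subnet "$id")"
done
```

Rejecting unknown IDs rather than computing a subnet for any number keeps a typo from silently producing an address outside the documented schema.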
### Documentation
- **[Complete Architecture](docs/complete-architecture.md)**: Full Azure Stack HCI architecture
- **[Hardware BOM](docs/hardware-bom.md)**: Complete bill of materials
- **[PCIe Allocation](docs/pcie-allocation.md)**: Slot allocation map
- **[Network Topology](docs/network-topology.md)**: VLAN/IP schema and routing
- **[Bring-Up Checklist](docs/bring-up-checklist.md)**: Day-one installation guide
- **[Cloudflare Integration](docs/cloudflare-integration.md)**: Tunnel and Zero Trust setup
- **[Azure Arc Onboarding](docs/azure-arc-onboarding.md)**: Agent installation and governance
### Quick Start (Azure Stack HCI)
1. **Hardware Setup**: Install Router server with all PCIe cards
2. **OS Installation**: Windows Server Core or Proxmox VE
3. **Driver Installation**: Run driver installation scripts
4. **Network Configuration**: Configure OpenWrt and VLANs
5. **Storage Configuration**: Flash HBAs to IT mode, configure S2D
6. **Azure Arc Onboarding**: Install agents on all hosts/VMs
7. **Cloudflare Setup**: Configure Tunnel and Zero Trust
8. **Service Deployment**: Deploy Ubuntu VMs and services
See [Bring-Up Checklist](docs/bring-up-checklist.md) for detailed steps.
## 📋 Prerequisites
### Hardware Requirements
- **2 Proxmox VE hosts** with:
  - Proxmox VE 7.0+ installed
  - Minimum 8GB RAM per node (16GB+ recommended)
  - Static IP addresses
  - Network connectivity between nodes
  - Internet access for Azure Arc connectivity
### Software Requirements
- Azure subscription with Contributor role
- Azure CLI installed and authenticated
- kubectl (for Kubernetes management)
- SSH access to all nodes
- NFS server (optional, for shared storage)
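
The command-line tools listed above can be checked quickly before starting. This is a minimal sketch, separate from the repository's own `prerequisites-check.sh` (whose contents are not shown here); `missing_tools` is a hypothetical helper name.

```bash
# Print the name of every required command that is not on PATH.
# missing_tools is a hypothetical helper, not part of the repository's scripts.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Tool names mirror the software requirements above.
missing="$(missing_tools az kubectl ssh)"
if [ -n "$missing" ]; then
  printf 'Missing tools:\n%s\n' "$missing" >&2
fi
```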
### Network Requirements
- Static IP addresses for all nodes
- DNS resolution (or hosts file configuration)
- Outbound HTTPS (443) for Azure Arc connectivity
- Cluster communication ports (5404-5412 UDP)
## 🚀 Quick Start
### 1. Clone Repository
```bash
git clone <repository-url>
cd loc_az_hci
```
### 2. Configure Environment Variables
Create a `.env` file from the template:
```bash
cp .env.example .env
```
Edit `.env` and fill in your credentials:
- **Azure**: Subscription ID, Tenant ID, and optionally Service Principal credentials
- **Cloudflare**: API Token and Account Email
- **Proxmox**: `PVE_ROOT_PASS` (shared root password) and URLs for each host
  - ML110: `PROXMOX_ML110_URL`
  - R630: `PROXMOX_R630_URL`
**Note**: Proxmox uses self-signed SSL certificates by default. Browser security warnings are normal. For production, use Cloudflare Tunnel (handles SSL termination) or configure proper certificates.
**Important**: Never commit `.env` to version control. It's already in `.gitignore`.
Load environment variables in your shell:
```bash
# Export every variable defined in .env (also handles quoted values with spaces)
set -a; . ./.env; set +a
```
Or use a tool like `direnv` or `dotenv` to automatically load `.env` files.
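After loading `.env`, it can help to confirm that the values the scripts depend on are actually set. The sketch below assumes the variable names listed in the Configuration section of this README; `require_env` is a hypothetical helper, and the exact set of variables each script needs may differ.

```bash
# Report which required variables are unset or empty after loading .env.
# require_env is a hypothetical helper name.
require_env() {
  rc=0
  for name in "$@"; do
    # Indirect lookup; names come from the fixed list below, not user input.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      printf 'missing: %s\n' "$name" >&2
      rc=1
    fi
  done
  return "$rc"
}

require_env AZURE_SUBSCRIPTION_ID AZURE_TENANT_ID PVE_ROOT_PASS \
  || echo "Fill in the missing values in .env before running the scripts." >&2
```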
### 3. Configure Proxmox Cluster
**On Node 1**:
```bash
export NODE_IP=192.168.1.10
export NODE_GATEWAY=192.168.1.1
export NODE_HOSTNAME=pve-node-1
./infrastructure/proxmox/network-config.sh
./infrastructure/proxmox/cluster-setup.sh
```
**On Node 2**:
```bash
export NODE_IP=192.168.1.11
export NODE_GATEWAY=192.168.1.1
export NODE_HOSTNAME=pve-node-2
export CLUSTER_NODE_IP=192.168.1.10
./infrastructure/proxmox/network-config.sh
export NODE_ROLE=join
./infrastructure/proxmox/cluster-setup.sh
```
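Once both nodes have run the scripts, it is worth confirming that the cluster actually formed. A minimal sketch, assuming `pvecm status` prints a `Quorate:` line as it does in current Proxmox VE releases; `cluster_is_quorate` is a hypothetical helper that only inspects that line.

```bash
# Succeed when `pvecm status` output read from stdin reports a quorate cluster.
# cluster_is_quorate is a hypothetical helper name.
cluster_is_quorate() {
  grep -Eq '^Quorate:[[:space:]]+Yes'
}

# Usage on a Proxmox node (pvecm ships with Proxmox VE):
#   pvecm status | cluster_is_quorate && echo "cluster is quorate"
#   pvecm nodes    # both pve-node-1 and pve-node-2 should be listed
```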
### 4. Onboard to Azure Arc
**On each Proxmox node**:
```bash
export RESOURCE_GROUP=HC-Stack
export TENANT_ID=$(az account show --query tenantId -o tsv)
export SUBSCRIPTION_ID=$(az account show --query id -o tsv)
export LOCATION=eastus
./scripts/azure-arc/onboard-proxmox-hosts.sh
```
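The onboarding script itself is not reproduced in this README. As a rough illustration only, the Connected Machine agent is typically connected with `azcmagent connect`; the hypothetical helper `arc_connect_cmd` below prints such a command from the variables exported above without running it. Authentication options (interactive login or a service principal) are omitted, and the repository's script may work differently.

```bash
# Print (do not run) an azcmagent connect command built from the exported
# variables. arc_connect_cmd is a hypothetical helper name.
arc_connect_cmd() {
  printf 'azcmagent connect --resource-group %s --tenant-id %s --subscription-id %s --location %s\n' \
    "$RESOURCE_GROUP" "$TENANT_ID" "$SUBSCRIPTION_ID" "$LOCATION"
}

# Usage:
#   arc_connect_cmd     # review the command that would be run
#   azcmagent show      # after onboarding, inspect the agent state on the node
```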
### 5. Deploy Kubernetes
**On K3s VM**:
```bash
./infrastructure/kubernetes/k3s-install.sh
export RESOURCE_GROUP=HC-Stack
export CLUSTER_NAME=proxmox-k3s-cluster
./infrastructure/kubernetes/arc-onboard-k8s.sh
```
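Before onboarding the cluster to Azure Arc, it is reasonable to check that every K3s node reports `Ready`. The sketch below filters the output of `kubectl get nodes --no-headers`; `not_ready_nodes` is a hypothetical helper. Note that it also reports cordoned nodes (status `Ready,SchedulingDisabled`), since it compares the status column exactly.

```bash
# List nodes whose STATUS column is not exactly "Ready", reading
# `kubectl get nodes --no-headers` output from stdin.
# not_ready_nodes is a hypothetical helper name.
not_ready_nodes() {
  awk '$2 != "Ready" { print $1 }'
}

# Usage on the K3s VM:
#   kubectl get nodes --no-headers | not_ready_nodes
```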
### 6. Deploy Git Server
**Option A: Gitea (Recommended)**:
```bash
./infrastructure/gitops/gitea-deploy.sh
```
**Option B: GitLab CE**:
```bash
./infrastructure/gitops/gitlab-deploy.sh
```
### 7. Configure GitOps
1. Create Git repository in your Git server
2. Copy `gitops/` directory to repository
3. Configure GitOps in Azure Portal or using Flux CLI
### 8. Deploy HC Stack Services
Deploy via GitOps (recommended) or manually:
```bash
# Manual deployment
helm install besu ./gitops/apps/besu -n blockchain
helm install firefly ./gitops/apps/firefly -n blockchain
helm install chainlink-ccip ./gitops/apps/chainlink-ccip -n blockchain
helm install blockscout ./gitops/apps/blockscout -n blockchain
helm install cacti ./gitops/apps/cacti -n monitoring
helm install nginx-proxy ./gitops/apps/nginx-proxy -n hc-stack
```
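The manual commands above assume the `blockchain`, `monitoring`, and `hc-stack` namespaces already exist. One way to avoid that dependency, sketched here with a hypothetical helper `hc_stack_helm_cmds`, is to generate `helm upgrade --install` commands with `--create-namespace`, which also makes repeated runs safe. The release names, chart paths, and namespaces are taken from the commands above.

```bash
# Print one `helm upgrade --install` command per chart.
# --create-namespace lets Helm create the namespace if it does not exist yet.
# hc_stack_helm_cmds is a hypothetical helper name.
hc_stack_helm_cmds() {
  while read -r release namespace; do
    printf 'helm upgrade --install %s ./gitops/apps/%s -n %s --create-namespace\n' \
      "$release" "$release" "$namespace"
  done <<'EOF'
besu blockchain
firefly blockchain
chainlink-ccip blockchain
blockscout blockchain
cacti monitoring
nginx-proxy hc-stack
EOF
}

hc_stack_helm_cmds          # review the commands first
# hc_stack_helm_cmds | sh   # then run them once they look right
```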
## 📁 Project Structure
```
loc_az_hci/
├── infrastructure/
│   ├── proxmox/            # Proxmox cluster setup scripts
│   ├── kubernetes/         # K3s installation scripts
│   └── gitops/             # Git server deployment scripts
├── scripts/
│   ├── azure-arc/          # Azure Arc onboarding scripts
│   └── utils/              # Utility scripts
├── terraform/
│   ├── proxmox/            # Proxmox Terraform modules
│   ├── azure-arc/          # Azure Arc Terraform modules
│   └── kubernetes/         # Kubernetes Terraform modules
├── gitops/
│   ├── infrastructure/     # Base infrastructure manifests
│   └── apps/               # Application Helm charts
│       ├── besu/
│       ├── firefly/
│       ├── chainlink-ccip/
│       ├── blockscout/
│       ├── cacti/
│       └── nginx-proxy/
├── docker-compose/
│   ├── gitea.yml           # Gitea Docker Compose
│   └── gitlab.yml          # GitLab Docker Compose
├── docs/
│   ├── architecture.md     # Architecture documentation
│   ├── network-topology.md
│   ├── deployment-guide.md
│   └── runbooks/           # Operational runbooks
├── diagrams/
│   ├── architecture.mmd
│   ├── network-topology.mmd
│   └── deployment-flow.mmd
├── config/
│   ├── azure-arc-config.yaml
│   └── gitops-config.yaml
├── .env.example            # Environment variables template
└── .gitignore              # Git ignore rules (includes .env)
```
## 📚 Documentation
- **[Architecture Overview](docs/architecture.md)**: Complete system architecture
- **[Network Topology](docs/network-topology.md)**: Network design and configuration
- **[Deployment Guide](docs/deployment-guide.md)**: Step-by-step deployment instructions
- **[Runbooks](docs/runbooks/)**: Operational procedures
  - [Proxmox Operations](docs/runbooks/proxmox-operations.md)
  - [Azure Arc Troubleshooting](docs/runbooks/azure-arc-troubleshooting.md)
  - [GitOps Workflow](docs/runbooks/gitops-workflow.md)
## 🔧 Configuration
### Environment Variables (.env)
This project uses a `.env` file to manage credentials securely. **Never commit `.env` to version control.**
1. **Copy the template:**
```bash
cp .env.example .env
```
2. **Edit `.env` with your credentials:**
   - Azure: `AZURE_SUBSCRIPTION_ID`, `AZURE_TENANT_ID`, `AZURE_CLIENT_ID`, `AZURE_CLIENT_SECRET`
   - Cloudflare: `CLOUDFLARE_API_KEY` (or `CLOUDFLARE_API_TOKEN`), `CLOUDFLARE_ACCOUNT_ID`, `CLOUDFLARE_ZONE_ID`, `CLOUDFLARE_DOMAIN`, `CLOUDFLARE_TUNNEL_TOKEN`
     **Note**: Cloudflare API Key and Tunnel Token are configured. Zero Trust features may require additional subscription/permissions.
   - Proxmox: `PVE_ROOT_PASS` (shared root password for all instances)
   - Proxmox ML110: `PROXMOX_ML110_URL` (use internal IP: `192.168.1.206:8006` for local network)
   - Proxmox R630: `PROXMOX_R630_URL` (use internal IP: `192.168.1.49:8006` for local network)

   **Note**:
   - The username `root@pam` is implied and should not be stored. For production, use RBAC accounts and API tokens instead of root credentials.
   - Use internal IPs (192.168.x.x) for local network access. External IPs are available for VPN/public access.
3. **Load environment variables:**
```bash
# In bash scripts, source the .env file and export its variables
if [ -f .env ]; then
  set -a; . ./.env; set +a
fi
```
See `.env.example` for all available configuration options.
### Azure Arc Configuration
Edit `config/azure-arc-config.yaml` with your Azure credentials (or use environment variables from `.env`):
```yaml
azure:
subscription_id: "your-subscription-id"
tenant_id: "your-tenant-id"
resource_group: "HC-Stack"
location: "eastus"
```
**Note**: Scripts use environment variables from `.env` when they are set; these take precedence over values in the YAML config files.
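The precedence rule can be sketched as a small lookup: use the environment variable when it is set and non-empty, otherwise fall back to the YAML value. `setting_or_default` is a hypothetical helper; how the repository's scripts actually read the YAML file is not shown in this README, so the YAML value is simply passed in as an argument here.

```bash
# Return the value of an environment variable if it is set and non-empty,
# otherwise fall back to the value read from the YAML file.
# setting_or_default is a hypothetical helper name.
setting_or_default() {
  env_name="$1"
  yaml_value="$2"
  # Indirect lookup of the variable whose name is in env_name.
  eval "env_value=\${$env_name:-}"
  printf '%s\n' "${env_value:-$yaml_value}"
}

# AZURE_TENANT_ID from .env wins over tenant_id from azure-arc-config.yaml.
setting_or_default AZURE_TENANT_ID "your-tenant-id"
```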
### GitOps Configuration
Edit `config/gitops-config.yaml` with your Git repository details:
```yaml
git:
repository: "http://git.local:3000/user/gitops-repo.git"
branch: "main"
path: "gitops/"
```
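For the Flux route, Azure Arc-enabled Kubernetes exposes GitOps through `az k8s-configuration flux create` (provided by the `k8s-configuration` CLI extension). The sketch below prints such a command for the repository settings above without running it. The configuration name `hc-stack-gitops`, the namespace `cluster-config`, the kustomization name `apps`, and the `GIT_*` variables are placeholders chosen for this example; the Git URL must also be reachable from inside the cluster, which a name like `git.local` may not be without extra DNS configuration.

```bash
# Print (do not run) an `az k8s-configuration flux create` command.
# flux_config_cmd and the GIT_* variables are hypothetical names.
flux_config_cmd() {
  echo "az k8s-configuration flux create" \
    "--resource-group $RESOURCE_GROUP" \
    "--cluster-name $CLUSTER_NAME" \
    "--cluster-type connectedClusters" \
    "--name hc-stack-gitops --namespace cluster-config --scope cluster" \
    "--url $GIT_URL --branch $GIT_BRANCH" \
    "--kustomization name=apps path=$GIT_PATH prune=true"
}

# Values mirror config/gitops-config.yaml and the earlier quick-start steps.
RESOURCE_GROUP="${RESOURCE_GROUP:-HC-Stack}"
CLUSTER_NAME="${CLUSTER_NAME:-proxmox-k3s-cluster}"
GIT_URL="http://git.local:3000/user/gitops-repo.git"
GIT_BRANCH="main"
GIT_PATH="gitops/"
flux_config_cmd
```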
## 🛠️ Tools and Scripts
### Prerequisites Check
```bash
./scripts/utils/prerequisites-check.sh
```
### Proxmox Operations
- `infrastructure/proxmox/network-config.sh`: Configure network
- `infrastructure/proxmox/cluster-setup.sh`: Create/join cluster
- `infrastructure/proxmox/nfs-storage.sh`: Configure NFS storage
### Azure Arc Operations
- `scripts/azure-arc/onboard-proxmox-hosts.sh`: Onboard Proxmox hosts
- `scripts/azure-arc/onboard-vms.sh`: Onboard VMs
- `scripts/azure-arc/resource-bridge-setup.sh`: Setup Resource Bridge
### Kubernetes Operations
- `infrastructure/kubernetes/k3s-install.sh`: Install K3s
- `infrastructure/kubernetes/arc-onboard-k8s.sh`: Onboard to Azure Arc
### Git/DevOps Operations
- `infrastructure/gitops/gitea-deploy.sh`: Deploy Gitea
- `infrastructure/gitops/gitlab-deploy.sh`: Deploy GitLab
- `infrastructure/gitops/azure-devops-agent.sh`: Setup Azure DevOps agent
## 🎨 Diagrams
View architecture diagrams:
- [Architecture Diagram](diagrams/architecture.mmd)
- [Network Topology](diagrams/network-topology.mmd)
- [Deployment Flow](diagrams/deployment-flow.mmd)
## 🔒 Security
- Network isolation and firewall rules
- Azure Arc managed identities and RBAC
- Kubernetes RBAC and network policies
- TLS/SSL with Cert-Manager
- Secrets management via `.env` file (excluded from version control)
- Proxmox VE RBAC best practices (see [Proxmox RBAC Guide](docs/security/proxmox-rbac.md))
- Consider Azure Key Vault integration for production deployments
## 📊 Monitoring
- **Cacti**: Network and system monitoring
- **Azure Monitor**: Metrics and logs via Azure Arc
- **Kubernetes Metrics**: Pod and service metrics
- **Microsoft Defender for Cloud** (formerly Azure Defender): Security monitoring
## 🔄 High Availability
- Proxmox 2-node cluster with shared storage
- VM high availability with automatic failover
- Kubernetes multiple replicas for stateless services
- Load balancing via NGINX Ingress
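
One caveat for automatic failover on two nodes: Proxmox clustering relies on majority voting, so the number of votes needed is `floor(total / 2) + 1`. The arithmetic below is a small illustration (`quorum_needed` is a hypothetical helper). With only two votes, both are required, so a single surviving node is not quorate on its own; adding a third vote, for example from an external corosync QDevice, lets one node plus the tie-breaker keep quorum.

```bash
# Votes needed for a majority given the total number of expected votes:
# floor(total / 2) + 1. quorum_needed is a hypothetical helper name.
quorum_needed() {
  echo $(( $1 / 2 + 1 ))
}

quorum_needed 2   # prints 2: both nodes must be up
quorum_needed 3   # prints 2: one node plus a tie-breaker vote is enough
```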
## 🚨 Troubleshooting
See runbooks for common issues:
- [Azure Arc Troubleshooting](docs/runbooks/azure-arc-troubleshooting.md)
- [Proxmox Operations](docs/runbooks/proxmox-operations.md)
- [GitOps Workflow](docs/runbooks/gitops-workflow.md)
## 🤝 Contributing
Contributions are welcome! Please:
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Submit a pull request
## 📝 License
This project is provided as-is for educational and deployment purposes.
## 🙏 Acknowledgments
- Proxmox VE team for excellent virtualization platform
- Microsoft Azure Arc team for hybrid cloud capabilities
- Kubernetes and K3s communities
- All open-source projects used in this stack
## 📞 Support
For issues and questions:
1. Check the [Documentation](docs/)
2. Review [Runbooks](docs/runbooks/)
3. Open an issue in the repository
## 🎯 Next Steps
After deployment:
1. Review and customize configurations
2. Set up monitoring and alerting
3. Configure backup and disaster recovery
4. Implement security policies
5. Plan for scaling and expansion
---
**Happy Deploying! 🚀**
---
## Archived Projects
This project contains archived content from related projects:
### PanTel (6G/GPU Archive)
- **Archive Location**: Archive beginning with `6g_gpu*` in this repository
- **Project**: PanTel telecommunications and connectivity infrastructure project
- **Joint Venture**: PanTel is a joint venture between Sankofa and PANDA (Pan-African Network for Digital Advancement)
- **Status**: Archived content - see [pan-tel](../pan-tel/) project directory for project information
- **Note**: This content is archived here and will be unpacked to the `pan-tel` project directory when ready for integration into the panda_monorepo
---