# Kubernetes Cluster & Pod Density Mapping - 36-Region Blueprint
## Overview

This document maps Kubernetes cluster configurations and pod densities to the 36-region global deployment blueprint.

---
## 🖥️ VM Configuration per Region

### Primary Regions (12 regions) - 8 vCPUs each

**Cluster Resources:**

- **System Node Pool:** 2 × Standard_D2s_v3
  - 4 vCPUs total
  - 16 GB RAM total
- **Validator Node Pool:** 2 × Standard_B2s
  - 4 vCPUs total
  - 8 GB RAM total
- **Total Capacity:** 8 vCPUs, 24 GB RAM per region

### Remaining Regions (24 regions) - 6 vCPUs each

**Cluster Resources:**

- **System Node Pool:** 2 × Standard_D2s_v3
  - 4 vCPUs total
  - 16 GB RAM total
- **Validator Node Pool:** 1 × Standard_B2s
  - 2 vCPUs total
  - 4 GB RAM total
- **Total Capacity:** 6 vCPUs, 20 GB RAM per region

---
## 📦 Pod Density Mapping

### Pods per Region (All 36 regions)

#### 1. Hyperledger Besu Network

**Validators:** 4 pods per region

- **CPU request:** 2 cores per pod
- **Memory request:** 4 Gi per pod
- **Total CPU:** 8 cores per region
- **Total Memory:** 16 Gi per region
- **Placement:** On validator node pool

**Sentries:** 3 pods per region

- **CPU request:** 2 cores per pod
- **Memory request:** 4 Gi per pod
- **Total CPU:** 6 cores per region
- **Total Memory:** 12 Gi per region
- **Placement:** On system node pool

**RPC Nodes:** 3 pods per region

- **CPU request:** 4 cores per pod
- **Memory request:** 8 Gi per pod
- **Total CPU:** 12 cores per region
- **Total Memory:** 24 Gi per region
- **Placement:** On system node pool

**Besu Subtotal:** 10 pods per region
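
The validator placement above can be sketched as a pod template. This is a minimal illustration, not the blueprint's actual manifest: the StatefulSet name, labels, and image tag are assumptions, while the toleration matches the `role=validator:NoSchedule` taint defined in the AKS cluster configuration later in this document, and AKS labels each node with its pool name via the `agentpool` label.

```yaml
# Sketch: pin Besu validators to the tainted validator node pool
# (names and image are illustrative)
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: besu-validator
spec:
  serviceName: besu-validator
  replicas: 4
  selector:
    matchLabels:
      app: besu-validator
  template:
    metadata:
      labels:
        app: besu-validator
    spec:
      nodeSelector:
        agentpool: validators          # AKS node-pool name label
      tolerations:
        - key: "role"
          operator: "Equal"
          value: "validator"
          effect: "NoSchedule"         # matches the validator pool taint
      containers:
        - name: besu
          image: hyperledger/besu:latest
          resources:
            requests:
              cpu: "2"                 # per-pod request from the table above
              memory: 4Gi
```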

#### 2. Application Stack

**Blockscout:**

- Explorer: 1 pod (1 CPU, 2 Gi mem)
- Database: 1 pod (0.5 CPU, 1 Gi mem)

**FireFly:**

- Core: 1 pod (1 CPU, 2 Gi mem)
- PostgreSQL: 1 pod (0.5 CPU, 1 Gi mem)
- IPFS: 1 pod (0.5 CPU, 1 Gi mem)

**Cacti:**

- API Server: 1 pod (1 CPU, 2 Gi mem)
- Besu Connector: 1 pod (0.5 CPU, 1 Gi mem)

**Monitoring Stack:**

- Prometheus: 1 pod (0.5 CPU, 2 Gi mem)
- Grafana: 1 pod (0.5 CPU, 1 Gi mem)
- Loki: 1 pod (0.5 CPU, 1 Gi mem)
- Alertmanager: 1 pod (0.1 CPU, 0.25 Gi mem)

**API Gateway:**

- Nginx Gateway: 2 pods (0.1 CPU, 128 Mi mem each)

**Application Subtotal:** 13 pods per region
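
One way to keep the application stack inside its per-region budget is a namespace `ResourceQuota`. A minimal sketch, assuming the stack runs in a dedicated namespace (the namespace name and exact caps are illustrative headroom over the per-pod requests listed above):

```yaml
# Sketch: cap the application namespace at roughly its budgeted footprint
apiVersion: v1
kind: ResourceQuota
metadata:
  name: app-stack-quota
  namespace: apps          # assumed namespace for the application stack
spec:
  hard:
    pods: "13"             # the application subtotal above
    requests.cpu: "7"
    requests.memory: 15Gi
```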

---

## 📊 Total Pod Density

### Per Region Totals

- **Besu Network:** 10 pods
- **Application Stack:** 13 pods
- **Total Pods per Region:** 23 pods

### Global Totals (36 regions)

- **Total Pods:** 828 pods (36 regions × 23 pods)
- **Besu Pods:** 360 pods (36 × 10)
- **Application Pods:** 468 pods (36 × 13)

---

## 🎯 Pod Placement Strategy

### Primary Regions (8 vCPUs capacity)

**Validator Node Pool (4 vCPUs, 8 GB RAM):**

- **Besu Validators:** 4 pods
  - CPU request: 8 cores total
  - Memory request: 16 Gi total
- **Note:** Requires resource overcommit (4 vCPUs available, 8 cores requested)

**System Node Pool (4 vCPUs, 16 GB RAM):**

- **Besu Sentries:** 3 pods (6 cores, 12 Gi)
- **Besu RPC:** 3 pods (12 cores, 24 Gi)
- **Application Stack:** 13 pods (~6.8 cores, ~14.5 Gi)
- **Total Requests:** ~24.8 cores, ~50.5 Gi
- **Note:** Requires resource overcommit (4 vCPUs available, ~25 cores requested)

### Remaining Regions (6 vCPUs capacity)

**Validator Node Pool (2 vCPUs, 4 GB RAM):**

- **Besu Validators:** 4 pods
  - CPU request: 8 cores total
  - Memory request: 16 Gi total
- **Note:** Requires resource overcommit (2 vCPUs available, 8 cores requested)

**System Node Pool (4 vCPUs, 16 GB RAM):**

- **Besu Sentries:** 3 pods (6 cores, 12 Gi)
- **Besu RPC:** 3 pods (12 cores, 24 Gi)
- **Application Stack:** 13 pods (~6.8 cores, ~14.5 Gi)
- **Total Requests:** ~24.8 cores, ~50.5 Gi
- **Note:** Requires resource overcommit (4 vCPUs available, ~25 cores requested)
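
With this much contention, eviction order matters: under memory pressure the kubelet should reclaim application pods before consensus-critical ones. One common mitigation (not prescribed by the blueprint, so the name and value are illustrative) is a `PriorityClass` so validator pods are scheduled first and preempted last:

```yaml
# Sketch: protect Besu validators under resource pressure
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: besu-validator-priority
value: 1000000
globalDefault: false
description: "Schedule Besu validators ahead of other workloads; evict them last"
---
# Referenced from the validator pod spec:
# spec:
#   priorityClassName: besu-validator-priority
```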

---

## ⚠️ Resource Overcommit Considerations

### Kubernetes Resource Overcommit

**CPU Overcommit:**

- Validator nodes: 2-4x overcommit (requested vs. available)
- System nodes: ~6x overcommit (requested vs. available)

**Memory Overcommit:**

- Validator nodes: 2-4x overcommit (requested vs. available)
- System nodes: ~3x overcommit (requested vs. available)

**Important:** CPU and memory *requests* are a hard scheduling constraint in Kubernetes — pods whose combined requests exceed a node's allocatable capacity will stay `Pending` rather than overcommit. Overcommit happens through *limits* set above requests, so the requests above must be reduced (see Recommended Adjustments) for all pods to schedule.

**Best Practices:**

- Use resource limits to prevent pod starvation
- Configure QoS classes (Guaranteed, Burstable, BestEffort)
- Monitor actual resource usage and adjust requests/limits
- Consider horizontal pod autoscaling (HPA) for non-critical workloads
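
The HPA practice above can be sketched with the `autoscaling/v2` API. The target Deployment name is an assumption (the blueprint does not name its workloads), and the replica bounds mirror the 3-pod RPC tier:

```yaml
# Sketch: scale a non-critical workload on CPU utilization
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: besu-rpc-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: besu-rpc          # assumed workload name
  minReplicas: 1
  maxReplicas: 3
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out above 70% of requested CPU
```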

---

## 🔧 Recommended Adjustments

### Option 1: Reduce Pod Resource Requests

**Reduce Besu RPC CPU requests:**

- From 4 cores to 2 cores per pod
- Total: 6 cores per region (instead of 12)
- Helps fit within system node capacity
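
As a container-spec fragment, the reduction looks like this (a sketch: keeping the limit above the request yields Burstable QoS, so the pod can still burst to 4 cores when the node has slack without inflating its scheduling footprint):

```yaml
# Sketch: trimmed Besu RPC container resources
containers:
  - name: besu-rpc
    resources:
      requests:
        cpu: "2"        # down from 4 — what the scheduler reserves
        memory: 8Gi
      limits:
        cpu: "4"        # burst ceiling; Burstable QoS class
        memory: 8Gi
```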

**Reduce Application Stack requests:**

- Use smaller resource requests where possible
- Some pods can run with lower CPU allocation

### Option 2: Increase System Node Count

**Primary regions:**

- Increase system nodes to 3 (6 vCPUs, 24 GB RAM)
- Total: 10 vCPUs per region (at quota limit)

**Remaining regions:**

- Increase system nodes to 3 (6 vCPUs, 24 GB RAM)
- Total: 8 vCPUs per region (within quota)

### Option 3: Hybrid Approach

- Use current VM counts (2 system nodes)
- Reduce resource requests for non-critical pods
- Enable HPA for dynamic scaling
- Accept moderate overcommit with proper limits

---

## 📋 AKS Cluster Configuration

### Cluster Settings

```yaml
cluster_config:
  name: "${region}-aks-main"
  kubernetes_version: "1.32"
  network_plugin: "azure"
  network_policy: "azure"

  default_node_pool:
    name: "system"
    node_count: 2
    vm_size: "Standard_D2s_v3"
    os_disk_size_gb: 128

  validator_node_pool:
    name: "validators"
    node_count: 1  # or 2 for primary regions
    vm_size: "Standard_B2s"
    os_disk_size_gb: 512
    node_taints:
      - "role=validator:NoSchedule"
```

---

## 🚀 Deployment Sequence

### Phase 1: Infrastructure Deployment

1. Deploy AKS clusters in all 36 regions
2. Configure system node pools (2 nodes each)
3. Configure validator node pools (1-2 nodes based on region type)
4. Verify cluster connectivity and health

### Phase 2: Besu Network Deployment

1. Deploy Besu validators (4 pods per region)
2. Deploy Besu sentries (3 pods per region)
3. Deploy Besu RPC nodes (3 pods per region)
4. Configure geo-aware validator selection
5. Initialize IBFT 2.0 consensus

### Phase 3: Application Stack Deployment

1. Deploy Blockscout (explorer + database)
2. Deploy FireFly (core + PostgreSQL + IPFS)
3. Deploy Cacti (API server + connector)
4. Deploy monitoring stack (Prometheus, Grafana, Loki, Alertmanager)
5. Deploy API gateway

### Phase 4: Verification & Optimization

1. Verify all pods are running
2. Monitor resource utilization
3. Adjust resource requests/limits as needed
4. Enable autoscaling where appropriate
5. Test geo-aware consensus

---

## 📊 Resource Summary

### Global Totals

| Resource | Total (36 regions) |
|----------|--------------------|
| **VMs** | 120 |
| **vCPUs** | 240 |
| **RAM** | 768 GB |
| **Pods** | 828 |
| **Besu Pods** | 360 |
| **Application Pods** | 468 |

### Per Region Averages

| Resource | Primary (12) | Remaining (24) |
|----------|--------------|----------------|
| **VMs** | 4 | 3 |
| **vCPUs** | 8 | 6 |
| **RAM** | 24 GB | 20 GB |
| **Pods** | 23 | 23 |

---

## 📚 References

- [36-Region Blueprint](./36-REGION-BLUEPRINT.md)
- [Geo-Aware Committee Configuration](./GEO-AWARE-COMMITTEE-CONFIG.md)
- [Deployment Checklist](./DEPLOYMENT_CHECKLIST.md)