# Kubernetes Cluster & Pod Density Mapping - 36-Region Blueprint
## Overview

This document maps Kubernetes cluster configurations and pod densities to the 36-region global deployment blueprint.

---
## 🖥️ VM Configuration per Region

### Primary Regions (12 regions) - 8 vCPUs each

**Cluster Resources:**

- **System Node Pool:** 2 × Standard_D2s_v3
  - 4 vCPUs total
  - 16 GB RAM total
- **Validator Node Pool:** 2 × Standard_B2s
  - 4 vCPUs total
  - 8 GB RAM total
- **Total Capacity:** 8 vCPUs, 24 GB RAM per region

### Remaining Regions (24 regions) - 6 vCPUs each

**Cluster Resources:**

- **System Node Pool:** 2 × Standard_D2s_v3
  - 4 vCPUs total
  - 16 GB RAM total
- **Validator Node Pool:** 1 × Standard_B2s
  - 2 vCPUs total
  - 4 GB RAM total
- **Total Capacity:** 6 vCPUs, 20 GB RAM per region

---
## 📦 Pod Density Mapping

### Pods per Region (All 36 regions)

#### 1. Hyperledger Besu Network

**Validators:** 4 pods per region

- **CPU request:** 2 cores per pod
- **Memory request:** 4 Gi per pod
- **Total CPU:** 8 cores per region
- **Total Memory:** 16 Gi per region
- **Placement:** On validator node pool

**Sentries:** 3 pods per region

- **CPU request:** 2 cores per pod
- **Memory request:** 4 Gi per pod
- **Total CPU:** 6 cores per region
- **Total Memory:** 12 Gi per region
- **Placement:** On system node pool

**RPC Nodes:** 3 pods per region

- **CPU request:** 4 cores per pod
- **Memory request:** 8 Gi per pod
- **Total CPU:** 12 cores per region
- **Total Memory:** 24 Gi per region
- **Placement:** On system node pool

**Besu Subtotal:** 10 pods per region
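
The validator placement above can be sketched as a pod template. This is a minimal illustration, not the blueprint's actual manifest: the StatefulSet name, labels, and image tag are assumptions, while the toleration matches the `role=validator:NoSchedule` taint defined in the AKS cluster configuration later in this document, and AKS labels each node with its pool name via the `agentpool` label.

```yaml
# Sketch: pin Besu validators to the tainted validator node pool
# (names and image are illustrative)
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: besu-validator
spec:
  serviceName: besu-validator
  replicas: 4
  selector:
    matchLabels:
      app: besu-validator
  template:
    metadata:
      labels:
        app: besu-validator
    spec:
      nodeSelector:
        agentpool: validators          # AKS node-pool name label
      tolerations:
        - key: "role"
          operator: "Equal"
          value: "validator"
          effect: "NoSchedule"         # matches the validator pool taint
      containers:
        - name: besu
          image: hyperledger/besu:latest
          resources:
            requests:
              cpu: "2"                 # per-pod request from the table above
              memory: 4Gi
```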

#### 2. Application Stack

**Blockscout:**

- Explorer: 1 pod (1 CPU, 2 Gi mem)
- Database: 1 pod (0.5 CPU, 1 Gi mem)

**FireFly:**

- Core: 1 pod (1 CPU, 2 Gi mem)
- PostgreSQL: 1 pod (0.5 CPU, 1 Gi mem)
- IPFS: 1 pod (0.5 CPU, 1 Gi mem)

**Cacti:**

- API Server: 1 pod (1 CPU, 2 Gi mem)
- Besu Connector: 1 pod (0.5 CPU, 1 Gi mem)

**Monitoring Stack:**

- Prometheus: 1 pod (0.5 CPU, 2 Gi mem)
- Grafana: 1 pod (0.5 CPU, 1 Gi mem)
- Loki: 1 pod (0.5 CPU, 1 Gi mem)
- Alertmanager: 1 pod (0.1 CPU, 0.25 Gi mem)

**API Gateway:**

- Nginx Gateway: 2 pods (0.1 CPU, 128 Mi mem each)

**Application Subtotal:** 13 pods per region
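
One way to keep the application stack inside its per-region budget is a namespace `ResourceQuota`. A minimal sketch, assuming the stack runs in a dedicated namespace (the namespace name and exact caps are illustrative headroom over the per-pod requests listed above):

```yaml
# Sketch: cap the application namespace at roughly its budgeted footprint
apiVersion: v1
kind: ResourceQuota
metadata:
  name: app-stack-quota
  namespace: apps          # assumed namespace for the application stack
spec:
  hard:
    pods: "13"             # the application subtotal above
    requests.cpu: "7"
    requests.memory: 15Gi
```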

---

## 📊 Total Pod Density

### Per Region Totals

- **Besu Network:** 10 pods
- **Application Stack:** 13 pods
- **Total Pods per Region:** 23 pods

### Global Totals (36 regions)

- **Total Pods:** 828 pods (36 regions × 23 pods)
- **Besu Pods:** 360 pods (36 × 10)
- **Application Pods:** 468 pods (36 × 13)

---

## 🎯 Pod Placement Strategy

### Primary Regions (8 vCPUs capacity)

**Validator Node Pool (4 vCPUs, 8 GB RAM):**

- **Besu Validators:** 4 pods
  - CPU request: 8 cores total
  - Memory request: 16 Gi total
- **Note:** Requires resource overcommit (4 vCPUs available, 8 cores requested)

**System Node Pool (4 vCPUs, 16 GB RAM):**

- **Besu Sentries:** 3 pods (6 cores, 12 Gi)
- **Besu RPC:** 3 pods (12 cores, 24 Gi)
- **Application Stack:** 13 pods (~6.8 cores, ~14.5 Gi)
- **Total Requests:** ~24.8 cores, ~50.5 Gi
- **Note:** Requires resource overcommit (4 vCPUs available, ~25 cores requested)

### Remaining Regions (6 vCPUs capacity)

**Validator Node Pool (2 vCPUs, 4 GB RAM):**

- **Besu Validators:** 4 pods
  - CPU request: 8 cores total
  - Memory request: 16 Gi total
- **Note:** Requires resource overcommit (2 vCPUs available, 8 cores requested)

**System Node Pool (4 vCPUs, 16 GB RAM):**

- **Besu Sentries:** 3 pods (6 cores, 12 Gi)
- **Besu RPC:** 3 pods (12 cores, 24 Gi)
- **Application Stack:** 13 pods (~6.8 cores, ~14.5 Gi)
- **Total Requests:** ~24.8 cores, ~50.5 Gi
- **Note:** Requires resource overcommit (4 vCPUs available, ~25 cores requested)
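
With this much contention, eviction order matters: under memory pressure the kubelet should reclaim application pods before consensus-critical ones. One common mitigation (not prescribed by the blueprint, so the name and value are illustrative) is a `PriorityClass` so validator pods are scheduled first and preempted last:

```yaml
# Sketch: protect Besu validators under resource pressure
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: besu-validator-priority
value: 1000000
globalDefault: false
description: "Schedule Besu validators ahead of other workloads; evict them last"
---
# Referenced from the validator pod spec:
# spec:
#   priorityClassName: besu-validator-priority
```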

---

## ⚠️ Resource Overcommit Considerations

### Kubernetes Resource Overcommit

**CPU Overcommit:**

- Validator nodes: 2-4x overcommit (requested vs. available)
- System nodes: ~6x overcommit (requested vs. available)

**Memory Overcommit:**

- Validator nodes: 2-4x overcommit (requested vs. available)
- System nodes: ~3x overcommit (requested vs. available)

**Important:** CPU and memory *requests* are a hard scheduling constraint in Kubernetes — pods whose combined requests exceed a node's allocatable capacity will stay `Pending` rather than overcommit. Overcommit happens through *limits* set above requests, so the requests above must be reduced (see Recommended Adjustments) for all pods to schedule.

**Best Practices:**

- Use resource limits to prevent pod starvation
- Configure QoS classes (Guaranteed, Burstable, BestEffort)
- Monitor actual resource usage and adjust requests/limits
- Consider horizontal pod autoscaling (HPA) for non-critical workloads
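
The HPA practice above can be sketched with the `autoscaling/v2` API. The target Deployment name is an assumption (the blueprint does not name its workloads), and the replica bounds mirror the 3-pod RPC tier:

```yaml
# Sketch: scale a non-critical workload on CPU utilization
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: besu-rpc-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: besu-rpc          # assumed workload name
  minReplicas: 1
  maxReplicas: 3
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out above 70% of requested CPU
```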

---

## 🔧 Recommended Adjustments

### Option 1: Reduce Pod Resource Requests

**Reduce Besu RPC CPU requests:**

- From 4 cores to 2 cores per pod
- Total: 6 cores per region (instead of 12)
- Helps fit within system node capacity
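
As a container-spec fragment, the reduction looks like this (a sketch: keeping the limit above the request yields Burstable QoS, so the pod can still burst to 4 cores when the node has slack without inflating its scheduling footprint):

```yaml
# Sketch: trimmed Besu RPC container resources
containers:
  - name: besu-rpc
    resources:
      requests:
        cpu: "2"        # down from 4 — what the scheduler reserves
        memory: 8Gi
      limits:
        cpu: "4"        # burst ceiling; Burstable QoS class
        memory: 8Gi
```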

**Reduce Application Stack requests:**

- Use smaller resource requests where possible
- Some pods can run with lower CPU allocation

### Option 2: Increase System Node Count

**Primary regions:**

- Increase system nodes to 3 (6 vCPUs, 24 GB RAM)
- Total: 10 vCPUs per region (at quota limit)

**Remaining regions:**

- Increase system nodes to 3 (6 vCPUs, 24 GB RAM)
- Total: 8 vCPUs per region (within quota)

### Option 3: Hybrid Approach

- Use current VM counts (2 system nodes)
- Reduce resource requests for non-critical pods
- Enable HPA for dynamic scaling
- Accept moderate overcommit with proper limits

---

## 📋 AKS Cluster Configuration

### Cluster Settings

```yaml
cluster_config:
  name: "${region}-aks-main"
  kubernetes_version: "1.32"
  network_plugin: "azure"
  network_policy: "azure"

  default_node_pool:
    name: "system"
    node_count: 2
    vm_size: "Standard_D2s_v3"
    os_disk_size_gb: 128

  validator_node_pool:
    name: "validators"
    node_count: 1  # or 2 for primary regions
    vm_size: "Standard_B2s"
    os_disk_size_gb: 512
    node_taints:
      - "role=validator:NoSchedule"
```

---

## 🚀 Deployment Sequence

### Phase 1: Infrastructure Deployment

1. Deploy AKS clusters in all 36 regions
2. Configure system node pools (2 nodes each)
3. Configure validator node pools (1-2 nodes based on region type)
4. Verify cluster connectivity and health

### Phase 2: Besu Network Deployment

1. Deploy Besu validators (4 pods per region)
2. Deploy Besu sentries (3 pods per region)
3. Deploy Besu RPC nodes (3 pods per region)
4. Configure geo-aware validator selection
5. Initialize IBFT 2.0 consensus

### Phase 3: Application Stack Deployment

1. Deploy Blockscout (explorer + database)
2. Deploy FireFly (core + PostgreSQL + IPFS)
3. Deploy Cacti (API server + connector)
4. Deploy monitoring stack (Prometheus, Grafana, Loki, Alertmanager)
5. Deploy API gateway

### Phase 4: Verification & Optimization

1. Verify all pods are running
2. Monitor resource utilization
3. Adjust resource requests/limits as needed
4. Enable autoscaling where appropriate
5. Test geo-aware consensus

---

## 📊 Resource Summary

### Global Totals

| Resource | Total (36 regions) |
|----------|--------------------|
| **VMs** | 120 |
| **vCPUs** | 240 |
| **RAM** | 768 GB |
| **Pods** | 828 |
| **Besu Pods** | 360 |
| **Application Pods** | 468 |

### Per Region Averages

| Resource | Primary (12) | Remaining (24) |
|----------|--------------|----------------|
| **VMs** | 4 | 3 |
| **vCPUs** | 8 | 6 |
| **RAM** | 24 GB | 20 GB |
| **Pods** | 23 | 23 |

---

## 📚 References

- [36-Region Blueprint](./36-REGION-BLUEPRINT.md)
- [Geo-Aware Committee Configuration](./GEO-AWARE-COMMITTEE-CONFIG.md)
- [Deployment Checklist](./DEPLOYMENT_CHECKLIST.md)