# Deployment Monitoring Guide

## Overview

Full deployment monitoring system for Chain-138 multi-region deployment with real-time status tracking.

## Monitoring Tools

### 1. Deployment Dashboard
```bash
./scripts/deployment/deployment-dashboard.sh
```
- **Purpose**: Comprehensive one-time status view
- **Updates**: Static (run manually)
- **Shows**: Infrastructure, clusters, resource groups, progress

### 2. Continuous Monitoring
```bash
./scripts/deployment/monitor-continuous.sh
```
- **Purpose**: Continuous real-time monitoring
- **Updates**: Every 15 seconds
- **Shows**: Full dashboard + Terraform log tail

### 3. Live Monitoring
```bash
./scripts/deployment/monitor-deployment-live.sh
```
- **Purpose**: Live updates with full details
- **Updates**: Every 15 seconds
- **Shows**: Complete status with log tail

### 4. Detailed Monitoring
```bash
./scripts/deployment/monitor-deployment.sh
```
- **Purpose**: Detailed per-region monitoring
- **Updates**: Every 30 seconds
- **Shows**: Individual cluster status per region

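The refresh intervals listed for these tools (15 or 30 seconds) amount to rerunning a status command on a timer. A minimal sketch of that pattern follows. It assumes nothing about how the monitoring scripts are implemented internally, and `run_every` is a hypothetical helper, not part of the repository:

```bash
# Hypothetical helper: run a command COUNT times, sleeping INTERVAL seconds
# between runs. This illustrates timed polling only; it is not the code
# used by the monitoring scripts.
run_every() {
  local interval="$1" count="$2" i=1
  shift 2
  while [ "$i" -le "$count" ]; do
    "$@"
    # Skip the sleep after the final run.
    if [ "$i" -lt "$count" ]; then
      sleep "$interval"
    fi
    i=$((i + 1))
  done
}
```

For example, `run_every 15 4 ./scripts/deployment/deployment-dashboard.sh` would print the dashboard four times, 15 seconds apart.
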
## Current Deployment Status

### Infrastructure
- **Terraform**: Running (PID varies)
- **Resource Groups**: 175 created
- **Expected**: 144 (6 per region × 24 regions)
- **Status**: Over-provisioned by 31 (the count includes managed resource groups)

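The expected count and the surplus follow directly from the resource group figures in this section. As a quick check in shell arithmetic (the variable names are illustrative):

```bash
# Figures taken from the Infrastructure status list.
regions=24
groups_per_region=6
created=175

expected=$((regions * groups_per_region))   # 6 per region across 24 regions
surplus=$((created - expected))             # groups beyond the expected count

echo "expected=${expected} created=${created} surplus=${surplus}"
```

This prints `expected=144 created=175 surplus=31`.
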
### AKS Clusters
- **Total Regions**: 24
- **Ready**: 0-1 (varies)
- **Failed**: 8
- **Canceled**: 16
- **Creating**: 0
- **Not Found**: Varies

### Issues
1. **State Lock**: Terraform state locked (another process running)
2. **Failed Clusters**: 8 clusters in Failed state
3. **Canceled Clusters**: 16 clusters in Canceled state
4. **Deletion Issues**: Clusters can't be deleted easily (Azure limitation)

## Monitoring Commands

### Quick Status
```bash
./scripts/deployment/deployment-dashboard.sh
```

### Continuous Monitoring
```bash
./scripts/deployment/monitor-continuous.sh
```

### Terraform Log
```bash
tail -f /tmp/terraform-apply-retry.log
# OR
tail -f /tmp/terraform-apply-final-clean.log
```

### Cluster Status
```bash
az aks list --subscription fc08d829-4f14-413d-ab27-ce024425db0b --query "[?contains(name, 'az-p-')].{name:name, state:provisioningState, power:powerState.code}" -o table
```

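To compare live numbers against the Failed, Canceled, and Ready counts, an `az aks list` call can emit one provisioning state per line, which is then easy to tally. A small sketch: `count_states` is a hypothetical helper, and the `az` invocation appears only in a comment because it needs an authenticated Azure CLI session:

```bash
# Hypothetical helper: read one provisioning state per line on stdin and
# print "State=count" pairs, sorted by state name.
count_states() {
  LC_ALL=C sort | uniq -c | awk '{ print $2 "=" $1 }'
}

# Example input source (assumes SUBSCRIPTION_ID holds the target subscription):
#   az aks list --subscription "$SUBSCRIPTION_ID" \
#     --query "[?contains(name, 'az-p-')].provisioningState" -o tsv | count_states
```

Keeping the tally in a function that reads stdin means it can be tried on sample text before pointing it at a real subscription.
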
## Troubleshooting

### Issue: State Lock
**Symptom**: `Error acquiring the state lock`
**Solution**: Wait for the current Terraform process to complete. If no Terraform process is still running and the lock is stale, force-unlock it:
```bash
cd terraform/well-architected/cloud-sovereignty
terraform force-unlock <LOCK_ID>
```

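The `<LOCK_ID>` value is printed in the `Lock Info` block of the lock error itself. Here is a sketch for pulling it out of a saved log, assuming the usual `ID: <value>` line in that block; `extract_lock_id` is a hypothetical helper:

```bash
# Hypothetical helper: print the first "ID:" value found in a Terraform log.
# Assumes the lock error contains a line such as "  ID:        <value>",
# optionally preceded by box-drawing or other non-letter characters.
extract_lock_id() {
  sed -n 's/^[^A-Za-z]*ID:[[:space:]]*//p' "$1" | head -n 1
}
```

For example, `extract_lock_id /tmp/terraform-apply-retry.log` would print the ID to pass to `terraform force-unlock`, once it is confirmed that no Terraform process is still running.
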
### Issue: Failed/Canceled Clusters
**Symptom**: Clusters in Failed or Canceled state
**Solution**:
1. Wait for clusters to be deleted automatically
2. Or manually delete via Azure Portal
3. Re-run Terraform deployment

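As an alternative to the Azure Portal, the Azure CLI's `az aks delete` can remove a cluster. The sketch below only prints the commands for clusters in Failed or Canceled state, so they can be reviewed before anything is run. `print_delete_commands` is a hypothetical helper, and the tab-separated input format is an assumption:

```bash
# Hypothetical helper: read "name<TAB>resourceGroup<TAB>state" lines on stdin
# and print (not execute) an `az aks delete` command for each cluster whose
# state is Failed or Canceled.
print_delete_commands() {
  awk -F '\t' '$3 == "Failed" || $3 == "Canceled" {
    printf "az aks delete --name %s --resource-group %s --yes --no-wait\n", $1, $2
  }'
}

# Example input source (assumes SUBSCRIPTION_ID holds the target subscription):
#   az aks list --subscription "$SUBSCRIPTION_ID" \
#     --query "[].[name, resourceGroup, provisioningState]" -o tsv | print_delete_commands
```

Printing rather than executing keeps the deletion step deliberate: the output can be inspected, trimmed, and then run by hand.
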
### Issue: Clusters Not Deleting
**Symptom**: Clusters stuck in deletion
**Solution**: Check for dependent resources, allow more time, or delete the cluster via the Azure Portal

## Next Steps

1. **Monitor Deployment**: Use continuous monitoring
2. **Wait for Completion**: Let Terraform finish
3. **Verify Clusters**: Check cluster status
4. **Run Next Steps**: Proceed with the remaining deployment steps once clusters are ready

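Step 3 can be turned into a simple gate before step 4. This sketch assumes `Succeeded` is the provisioning state that counts as ready and that the caller supplies how many clusters must reach it; `clusters_ready` is a hypothetical helper:

```bash
# Hypothetical helper: read one provisioning state per line on stdin and
# succeed only if exactly EXPECTED of them are "Succeeded".
clusters_ready() {
  local expected="$1" ready
  # grep -c exits non-zero when the count is 0, so ignore its exit status.
  ready=$(grep -c -x 'Succeeded' || true)
  [ "$ready" -eq "$expected" ]
}
```

With the 24 regions described in this guide, piping one state per line into `clusters_ready 24` would return success only when every cluster reports `Succeeded`.
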
## Files

- **Dashboard**: `scripts/deployment/deployment-dashboard.sh`
- **Continuous**: `scripts/deployment/monitor-continuous.sh`
- **Live**: `scripts/deployment/monitor-deployment-live.sh`
- **Detailed**: `scripts/deployment/monitor-deployment.sh`
- **Terraform Log**: `/tmp/terraform-apply-retry.log`
- **Final Log**: `/tmp/terraform-apply-final-clean.log`