# Next Steps: Script-Based Deployment & Boot Node for Validated Set
This document outlines the next steps for building a working deployment system using **EITHER**:
- **Script-based approach**: Automated scripts to deploy and configure the validated set
- **Boot node approach**: A dedicated boot node to bootstrap the network discovery
Both approaches can be used together or separately, depending on network requirements.
## Overview
The goal is to create a comprehensive, production-ready deployment system that:
1. ✅ Deploys containers (already done)
2. 🔄 Properly bootstraps the network using **scripts** OR **boot node** (or both)
3. ✅ Validates and verifies the entire deployment
4. ✅ Ensures all validators are properly configured and connected
5. ✅ Provides end-to-end orchestration scripts
## Deployment Approaches: Script, Boot Node, or Hybrid
### Approach 1: Script-Based Deployment
**Use when:** Private/permissioned network with known static nodes, no external discovery needed
**Characteristics:**
- Uses `static-nodes.json` for peer discovery
- Scripts orchestrate deployment and configuration
- No dedicated boot node required
- All nodes listed statically
- Faster initial setup
- **Recommended for your current setup** (validators ↔ sentries topology)
### Approach 2: Boot Node Deployment
**Use when:** Network needs dynamic peer discovery, external nodes will join later
**Characteristics:**
- Dedicated boot node for initial discovery
- Other nodes connect to boot node first
- Boot node helps discover additional peers
- More flexible for network expansion
- Required for public/open networks
- Can be combined with static nodes
### Approach 3: Hybrid (Script + Boot Node)
**Use when:** You want script orchestration for deployment plus a boot node for discovery
**Characteristics:**
- Scripts handle deployment and configuration
- Boot node provides discovery service
- Static nodes for critical connections
- Boot node for dynamic discovery
- Most flexible approach
---
## Phase 1: Script-Based Deployment (Primary Approach)
### 1.1 Create Validated Set Deployment Script
**File:** `scripts/deployment/deploy-validated-set.sh`
**Purpose:** Script-based deployment that orchestrates the entire validated set without requiring a boot node
**Functionality:**
- Deploy all containers (validators, sentries, RPC)
- Copy configuration files (genesis, static-nodes, permissions)
- Copy validator keys
- Start services in correct order (sentries → validators → RPC)
- Validate deployment
- Generate deployment report
**Key Features:**
- Uses static-nodes.json (no boot node needed)
- Sequential startup orchestration
- Comprehensive validation
- Error handling and rollback
- Detailed logging
**Status:** **NOT CREATED** (This is the PRIMARY script-based approach)
**Alternative:** If boot node is desired, see Phase 1A below
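
The startup order described in 1.1 can be sketched as follows. This is a minimal dry-run sketch under stated assumptions, not the project's script: the container IDs other than 106 and the systemd unit names are hypothetical, and `RUN` is an illustrative switch. While `RUN` is unset the `pct exec` commands are only printed; setting `RUN=""` on the Proxmox host would execute them.

```bash
#!/usr/bin/env bash
# Sketch only: VMIDs (except 106) and unit names are assumptions.
# RUN is unset by default, so commands are echoed (dry run).
start_group() {
  local service="$1"; shift
  local vmid
  for vmid in "$@"; do
    ${RUN-echo} pct exec "$vmid" -- systemctl start "$service"
  done
}

# Order from the plan: sentries, then validators, then RPC nodes.
start_validated_set() {
  start_group besu-sentry.service 110 111      # hypothetical sentry VMIDs
  start_group besu-validator.service 106 107   # 106 is the first validator
  start_group besu-rpc.service 115             # hypothetical RPC VMID
}
```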
---
## Phase 1A: Boot Node Deployment (Optional)
### 1A.1 Create Boot Node Deployment Script
**File:** `scripts/deployment/deploy-boot-node.sh`
**Purpose:** Deploy and configure a dedicated boot node (optional - only if using boot node approach)
**Functionality:**
- Deploy container with boot node configuration
- Configure as discovery/bootstrap node
- Expose only P2P port (30303) - no RPC
- Generate and export enode for use by other nodes
- Ensure boot node starts first before other nodes
**Key Features:**
- Special configuration for boot node (if separate)
- OR configure first validator (106) as boot node
- Generate boot node enode for inclusion in genesis or static-nodes
- Health checks to ensure boot node is ready before proceeding
**Status:** **NOT CREATED** (Optional - only if boot node approach is chosen)
**Decision Point:** Do you need a boot node, or can you use script-based static-nodes approach?
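
Exporting the boot node enode comes down to combining the node's public key with its IP and P2P port. The sketch below assumes the key was obtained with Besu's `public-key export` subcommand; the data path and IP address in the usage comment are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: build an enode URL from a node public key, IP, and P2P port.
# A leading 0x on the key is stripped because enode URLs use bare hex.
make_enode() {
  local pubkey="${1#0x}" ip="$2" port="${3:-30303}"
  printf 'enode://%s@%s:%s\n' "$pubkey" "$ip" "$port"
}

# Usage (data path and IP are assumptions):
#   besu --data-path=/data/besu public-key export --to=/tmp/boot.pub
#   make_enode "$(cat /tmp/boot.pub)" 192.168.11.10
```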
---
### 1.2 Create Network Bootstrap Script
**File:** `scripts/network/bootstrap-network.sh`
**Purpose:** Orchestrate the initial network bootstrap sequence (works with EITHER script-based or boot node approach)
**Functionality (Script-Based Approach):**
1. Extract enodes from all deployed containers
2. Generate static-nodes.json with all validator enodes
3. Deploy static-nodes.json to all nodes
4. Start nodes in sequence (sentries → validators → RPC)
5. Verify peer connections
6. Validate network is operational
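
Step 2 above amounts to rendering a JSON array of enode URLs. A minimal sketch, assuming the enodes have already been collected into shell arguments (`build_static_nodes` is an illustrative name, not an existing project function):

```bash
#!/usr/bin/env bash
# Sketch: render static-nodes.json content from enode URLs passed as arguments.
build_static_nodes() {
  local first=1 enode
  printf '[\n'
  for enode in "$@"; do
    if [ "$first" -eq 1 ]; then first=0; else printf ',\n'; fi
    printf '  "%s"' "$enode"
  done
  printf '\n]\n'
}

# Usage: build_static_nodes "${ENODES[@]}" > static-nodes.json
```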
**Functionality (Boot Node Approach):**
1. Start boot node first
2. Wait for boot node to be ready (P2P listening, enode available)
3. Extract boot node enode
4. Update static-nodes.json for all other nodes with boot node enode
5. Deploy static-nodes.json to all nodes
6. Start remaining nodes in sequence (sentries, then validators, then RPC)
7. Verify peer connections
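
Step 2 of the boot node sequence needs a readiness wait. One possible shape is a generic retry loop plus a TCP probe of the P2P port; the function names and the IP address in the usage comment are hypothetical, and the probe relies on bash's `/dev/tcp` redirection and the coreutils `timeout` command.

```bash
#!/usr/bin/env bash
# Sketch: retry a readiness probe until it succeeds or attempts run out.
wait_until() {
  local attempts="$1" delay="$2" i
  shift 2
  for ((i = 1; i <= attempts; i++)); do
    "$@" && return 0
    sleep "$delay"
  done
  return 1
}

# Probe a TCP port; succeeds when something is listening on host:port.
port_open() {
  timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Usage (boot node IP is an assumption):
#   wait_until 30 2 port_open 192.168.11.10 30303 || echo "boot node not ready"
```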
**Key Features:**
- Supports both script-based and boot node approaches
- Sequential startup with dependencies
- Health checks between steps
- Automatic enode extraction
- Updates static-nodes.json dynamically
- Validates peer connections after startup
**Status:** **NOT CREATED**
**Recommendation:** Start with script-based approach (simpler for permissioned networks)
---
## Phase 2: Validated Set Deployment
### 2.1 Create Validator Set Validation Script
**File:** `scripts/validation/validate-validator-set.sh`
**Purpose:** Validate that all validators are properly configured and can participate in consensus
**Functionality:**
- Check validator keys exist and are accessible
- Verify validator addresses match configuration
- Validate validator keys are loaded by Besu
- Check validators are in genesis (if static) or validator contract
- Verify validator services are running
- Check validators can connect to each other
- Validate consensus is active (blocks being produced)
**Key Features:**
- Comprehensive validator health checks
- Validator key validation
- Consensus participation verification
- Network connectivity checks
- QBFT-specific validation
**Status:** **NOT CREATED**
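
The "consensus is active" check can be approximated by sampling the block height twice through the standard `eth_blockNumber` JSON-RPC method and confirming that it grew. This is a sketch under assumptions: the RPC endpoint in the usage comment is hypothetical, and the response is parsed with `sed` so that no extra JSON tooling is needed.

```bash
#!/usr/bin/env bash
# Sketch: detect block production by sampling eth_blockNumber twice.

# Pull the hex "result" field out of a JSON-RPC response on stdin.
parse_hex_result() {
  sed -n 's/.*"result"[[:space:]]*:[[:space:]]*"\(0x[0-9a-fA-F][0-9a-fA-F]*\)".*/\1/p'
}

# Convert a hex quantity such as 0x1a to decimal.
hex_to_dec() { printf '%d\n' "$1"; }

# Ask a node for its current block height; $1 is the RPC URL.
block_number() {
  local hex
  hex="$(curl -s -X POST -H 'Content-Type: application/json' \
    --data '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' \
    "$1" | parse_hex_result)"
  [ -n "$hex" ] && hex_to_dec "$hex"
}

# Succeeds when the chain height grew between two samples.
chain_advanced() { [ "$2" -gt "$1" ]; }

# Usage (endpoint is an assumption):
#   before="$(block_number http://192.168.11.250:8545)"; sleep 10
#   after="$(block_number http://192.168.11.250:8545)"
#   chain_advanced "$before" "$after" && echo "consensus active"
```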
---
### 2.2 Create Validator Registration Script
**File:** `scripts/validation/register-validators.sh`
**Purpose:** Register validators in the validator contract (for dynamic validator management)
**Functionality:**
- Read validator addresses from key files
- Submit validator registration transactions
- Verify validator registration on-chain
- Wait for epoch change if needed
- Validate validators are active in consensus
**Key Features:**
- Smart contract interaction
- Transaction submission and verification
- Epoch management
- Validator set verification
**Status:** **NOT CREATED** (Note: Only needed if using dynamic validator management via contract)
---
### 2.3 Create Deployment Validation Orchestrator
**File:** `scripts/validation/validate-deployment.sh`
**Purpose:** Comprehensive end-to-end validation of the entire deployment
**Functionality:**
1. Validate container deployment (all containers running)
2. Validate network connectivity (P2P, RPC)
3. Validate configuration files (genesis, static-nodes, permissions)
4. Validate validator set (keys, addresses, consensus participation)
5. Validate sentry connectivity (can connect to validators)
6. Validate RPC endpoints (can query blockchain state)
7. Validate allowlist configuration (permissions-nodes.toml)
8. Generate validation report
**Key Features:**
- Multi-phase validation
- Comprehensive checks
- Detailed reporting
- Error collection and reporting
- Exit codes for CI/CD integration
**Status:** ⚠️ **PARTIAL** (Some validation exists, but not comprehensive)
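
Error collection with a CI-friendly exit code could look like the sketch below. `run_check` and `validation_passed` are illustrative names, and the file path and container ID in the usage comment are assumptions.

```bash
#!/usr/bin/env bash
# Sketch: run every check, collect failures, and report a single exit status.
FAILED=0

run_check() {
  local name="$1"; shift
  if "$@" >/dev/null 2>&1; then
    printf 'PASS  %s\n' "$name"
  else
    printf 'FAIL  %s\n' "$name"
    FAILED=$((FAILED + 1))
  fi
}

# Returns 0 only when no check failed, which suits CI/CD gating.
validation_passed() { [ "$FAILED" -eq 0 ]; }

# Usage (path and VMID are assumptions):
#   run_check "genesis present" test -s /etc/besu/genesis.json
#   run_check "container 106 running" sh -c 'pct status 106 | grep -q running'
#   validation_passed || exit 1
```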
---
## Phase 3: End-to-End Deployment Orchestration
### 3.1 Create Complete Deployment Orchestrator
**File:** `scripts/deployment/deploy-validated-set.sh`
**Purpose:** Single script that orchestrates the entire validated set deployment
**Functionality:**
1. **Pre-deployment Validation**
   - Check prerequisites
   - Validate configuration
   - Check resources
   - Verify no conflicts
2. **Deploy Containers**
   - Deploy boot node (or first validator)
   - Deploy remaining validators
   - Deploy sentries
   - Deploy RPC nodes
3. **Bootstrap Network**
   - Start boot node
   - Extract boot node enode
   - Update static-nodes.json
   - Deploy configuration files
   - Start remaining nodes in correct order
4. **Configure Validators**
   - Copy validator keys
   - Register validators (if dynamic)
   - Verify validator set
5. **Post-Deployment Validation**
   - Run comprehensive validation
   - Verify consensus is active
   - Check all services
   - Generate deployment report
6. **Rollback on Failure**
   - Clean up partial deployments
   - Restore previous state if needed
**Key Features:**
- Single command deployment
- Error handling and rollback
- Progress reporting
- Detailed logging
- Validation at each step
**Status:** **NOT CREATED**
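
One way to approach "Rollback on Failure" is an undo stack: each completed step registers its cleanup command, and a failure replays the commands in reverse order. This is a sketch of that pattern, not the planned implementation. The step functions in the usage comment are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: register an undo action after each successful step,
# then replay the actions in reverse order if a later step fails.
ROLLBACK_ACTIONS=()

on_rollback() { ROLLBACK_ACTIONS+=("$*"); }

rollback() {
  local i
  for ((i = ${#ROLLBACK_ACTIONS[@]} - 1; i >= 0; i--)); do
    eval "${ROLLBACK_ACTIONS[i]}" || true   # keep unwinding even if one undo fails
  done
  ROLLBACK_ACTIONS=()
}

# Usage (deploy_container and start_services are hypothetical step functions):
#   deploy_container 106 || { rollback; exit 1; }
#   on_rollback "pct stop 106"
#   start_services 106   || { rollback; exit 1; }
```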
---
### 3.2 Create Quick Bootstrap Script
**File:** `scripts/deployment/bootstrap-quick.sh`
**Purpose:** Quick bootstrap for existing deployed containers
**Functionality:**
- Assume containers already deployed
- Extract boot node enode
- Update static-nodes.json
- Deploy updated configs
- Restart services in correct order
- Verify connectivity
**Use Case:** When containers are deployed but network needs to be bootstrapped/rebootstrapped
**Status:** **NOT CREATED**
---
## Phase 4: Health Checks & Monitoring
### 4.1 Create Node Health Check Script
**File:** `scripts/health/check-node-health.sh`
**Purpose:** Check health of individual nodes
**Functionality:**
- Container status
- Service status (systemd)
- Process status
- P2P connectivity
- RPC availability (if enabled)
- Block sync status
- Peer count
- Consensus participation (for validators)
**Key Features:**
- Per-node health checks
- Detailed status output
- JSON output option (for monitoring)
- Exit codes for alerts
**Status:** ⚠️ **PARTIAL** (Some checks exist in other scripts)
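
The JSON output option and exit-code behaviour might be sketched as below. The field names and the health rule (container running, service active, a minimum peer count) are assumptions for illustration, as is the unit name in the comment.

```bash
#!/usr/bin/env bash
# Sketch: summarize one node's health as a JSON line for monitoring tools.
node_status_json() {
  local vmid="$1" container="$2" service="$3" peers="$4"
  printf '{"vmid":%s,"container":"%s","service":"%s","peers":%s}\n' \
    "$vmid" "$container" "$service" "$peers"
}

# Healthy means: container running, service active, and enough peers.
node_is_healthy() {
  local container="$1" service="$2" peers="$3" min_peers="${4:-1}"
  [ "$container" = "running" ] && [ "$service" = "active" ] && [ "$peers" -ge "$min_peers" ]
}

# Gathering the inputs on a Proxmox host (unit name is an assumption):
#   container="$(pct status 106 | awk '{print $2}')"
#   service="$(pct exec 106 -- systemctl is-active besu-validator.service)"
```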
---
### 4.2 Create Network Health Dashboard Script
**File:** `scripts/health/network-health-dashboard.sh`
**Purpose:** Display comprehensive network health overview
**Functionality:**
- All nodes status table
- Peer connectivity matrix
- Block height comparison
- Consensus status
- Validator participation
- Error summary
**Key Features:**
- Human-readable dashboard
- Color-coded status
- Quick problem identification
- Summary statistics
**Status:** **NOT CREATED**
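
For the block height comparison, a single number often suffices: the gap between the highest and lowest node. A minimal sketch, assuming the heights have already been collected as decimal integers (`max_height_lag` is an illustrative name):

```bash
#!/usr/bin/env bash
# Sketch: report the spread between the highest and lowest block height.
max_height_lag() {
  local min="$1" max="$1" h
  for h in "$@"; do
    if (( h < min )); then min="$h"; fi
    if (( h > max )); then max="$h"; fi
  done
  echo $(( max - min ))
}

# Usage: max_height_lag "${HEIGHTS[@]}"   # a large value suggests a lagging node
```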
---
## Phase 5: Configuration Management
### 5.1 Create Configuration Generator
**File:** `scripts/config/generate-configs.sh`
**Purpose:** Generate all configuration files from templates
**Functionality:**
- Generate genesis.json (if needed)
- Generate static-nodes.json from live nodes
- Generate permissions-nodes.toml
- Generate node-specific config files (config-validator.toml, etc.)
- Validate generated configs
**Key Features:**
- Template-based generation
- Dynamic enode extraction
- Validation of generated files
- Backup of existing configs
**Status:** ⚠️ **PARTIAL** (Some config generation exists for allowlist)
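
Generating `permissions-nodes.toml` from extracted enodes can be sketched as a small renderer for Besu's `nodes-allowlist` key. The function name is illustrative, and the backup step in the usage comment is one possible convention rather than the project's.

```bash
#!/usr/bin/env bash
# Sketch: render the nodes-allowlist line of permissions-nodes.toml
# from enode URLs passed as arguments.
build_nodes_allowlist() {
  local out="" enode
  for enode in "$@"; do
    out+="${out:+,}\"$enode\""   # comma-separate every entry after the first
  done
  printf 'nodes-allowlist=[%s]\n' "$out"
}

# Usage (back up the existing file before overwriting it):
#   cp -p permissions-nodes.toml "permissions-nodes.toml.bak.$(date +%s)"
#   build_nodes_allowlist "${ENODES[@]}" > permissions-nodes.toml
```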
---
### 5.2 Create Configuration Validator
**File:** `scripts/config/validate-configs.sh`
**Purpose:** Validate all configuration files before deployment
**Functionality:**
- Validate JSON/TOML syntax
- Validate genesis.json structure
- Validate static-nodes.json (enode format, node IDs)
- Validate permissions-nodes.toml
- Check for missing files
- Verify file permissions
**Key Features:**
- Pre-deployment validation
- Detailed error messages
- Report generation
**Status:** **NOT CREATED**
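
The enode format check can be expressed as a regular expression: a 128-character hex node ID, then `@`, an address, and a port. The sketch below is deliberately narrow and assumes IPv4 addresses only. It does not range-check octets or ports.

```bash
#!/usr/bin/env bash
# Sketch: validate the shape of an enode URL (IPv4 only, no range checks).
is_valid_enode() {
  local re='^enode://[0-9a-fA-F]{128}@([0-9]{1,3}\.){3}[0-9]{1,3}:[0-9]{1,5}$'
  [[ "$1" =~ $re ]]
}

# Usage: flag malformed entries before deploying static-nodes.json
#   is_valid_enode "$enode" || echo "invalid enode: $enode" >&2
```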
---
## Phase 6: Documentation & Runbooks
### 6.1 Create Boot Node Runbook
**File:** `docs/BOOT_NODE_RUNBOOK.md`
**Purpose:** Detailed runbook for boot node setup and troubleshooting
**Contents:**
- Boot node concept explanation
- Setup instructions
- Configuration details
- Troubleshooting guide
- Best practices
**Status:** **NOT CREATED**
---
### 6.2 Create Validated Set Deployment Guide
**File:** `docs/VALIDATED_SET_DEPLOYMENT_GUIDE.md`
**Purpose:** Step-by-step guide for deploying a validated set
**Contents:**
- Prerequisites
- Deployment steps
- Validation procedures
- Troubleshooting
- Rollback procedures
**Status:** **NOT CREATED**
---
### 6.3 Create Network Bootstrap Guide
**File:** `docs/NETWORK_BOOTSTRAP_GUIDE.md`
**Purpose:** Guide for bootstrapping the network from scratch
**Contents:**
- Bootstrap sequence
- Node startup order
- Configuration updates
- Verification steps
- Common issues
**Status:** **NOT CREATED**
---
## Phase 7: Testing & Validation
### 7.1 Create Integration Test Suite
**File:** `scripts/test/test-deployment.sh`
**Purpose:** Automated integration tests for deployment
**Functionality:**
- Test container deployment
- Test network bootstrap
- Test validator connectivity
- Test consensus functionality
- Test RPC endpoints
- Test rollback procedures
**Key Features:**
- Automated testing
- Test reports
- CI/CD integration
**Status:** **NOT CREATED**
---
### 7.2 Create Smoke Tests
**File:** `scripts/test/smoke-tests.sh`
**Purpose:** Quick smoke tests after deployment
**Functionality:**
- Basic connectivity checks
- Service status checks
- RPC endpoint checks
- Quick consensus check
**Key Features:**
- Fast execution
- Critical path validation
- Exit codes for automation
**Status:** **NOT CREATED**
---
## Implementation Priority
### High Priority (Critical Path) - Script-Based Approach
1. **Validated Set Deployment Script** (`deploy-validated-set.sh`) - **PRIMARY**
2. **Network Bootstrap Script** (`bootstrap-network.sh`) - Script-based mode
3. **Deployment Validation Orchestrator** (`validate-deployment.sh`)
4. **Validator Set Validation** (`validate-validator-set.sh`)
### Optional (Boot Node Approach)
5. ⚠️ **Boot Node Deployment Script** (`deploy-boot-node.sh`) - Only if boot node needed
6. ⚠️ **Network Bootstrap Script** - Boot node mode (enhance existing script)
### Medium Priority (Important Features)
7. ⚠️ **Node Health Checks** (`check-node-health.sh`)
8. ⚠️ **Configuration Generator** (enhance existing)
9. ⚠️ **Quick Bootstrap Script** (`bootstrap-quick.sh`)
### Low Priority (Nice to Have)
10. 📝 **Network Health Dashboard** (`network-health-dashboard.sh`)
11. 📝 **Validator Registration** (`register-validators.sh`) - only if using dynamic validators
12. 📝 **Configuration Validator** (`validate-configs.sh`)
13. 📝 **Documentation** (runbooks and guides)
14. 📝 **Test Suites** (integration and smoke tests)
---
## Recommended Implementation Order
### Week 1: Core Infrastructure (Script-Based)
1. Create `deploy-validated-set.sh` - **Primary script-based deployment**
2. Create `bootstrap-network.sh` - Script-based mode (uses static-nodes)
3. Enhance existing `validate-deployment.sh`
4. Create `validate-validator-set.sh`
### Optional: Boot Node Support (If Needed)
5. Create `deploy-boot-node.sh` - Only if boot node approach is chosen
6. Enhance `bootstrap-network.sh` - Add boot node mode support
### Week 2: Orchestration
7. Extend `deploy-validated-set.sh` into the end-to-end orchestrator (Phase 3.1)
8. Create `bootstrap-quick.sh`
### Week 3: Health & Monitoring
9. Create `check-node-health.sh`
10. Create `network-health-dashboard.sh`
11. Enhance configuration generation scripts
### Week 4: Documentation & Testing
12. Create documentation (runbooks, guides)
13. Create test suites
14. Final validation and testing
---
## Existing Assets to Leverage
### Already Implemented
- ✅ Container deployment scripts (`deploy-besu-nodes.sh`, etc.)
- ✅ Configuration copying (`copy-besu-config.sh`)
- ✅ Allowlist management (`besu-*.sh` scripts)
- ✅ Network utilities (`update-static-nodes.sh`)
- ✅ Basic validation scripts (`validate-ml110-deployment.sh`)
- ✅ Deployment status checks (`check-deployments.sh`)
### Can Be Enhanced
- ⚠️ `validate-deployment.sh` - needs comprehensive validator set validation
- ⚠️ `deploy-all.sh` - needs boot node support and sequential startup
- ⚠️ Configuration generation - needs boot node enode integration
---
## Success Criteria
A successful implementation should provide:
1. **Single Command Deployment**
   ```bash
   ./scripts/deployment/deploy-validated-set.sh
   ```
   - Deploys all containers
   - Bootstraps network correctly
   - Validates entire deployment
   - Reports success/failure
2. **Network Bootstrap**
   - Boot node (if used) starts first
   - Other nodes connect successfully
   - All validators participate in consensus
   - Network is fully operational
3. **Validation**
   - All validators are validated and active
   - Network connectivity verified
   - Consensus is functional
   - RPC endpoints are working
4. **Documentation**
   - Complete runbooks for all procedures
   - Troubleshooting guides
   - Best practices documented
---
## Quick Start Checklist
### Script-Based Approach (Recommended for Your Setup)
- [ ] Review existing deployment scripts
- [ ] Create `deploy-validated-set.sh` - Main deployment orchestrator
- [ ] Create `bootstrap-network.sh` - Script-based mode (static-nodes)
- [ ] Create `validate-validator-set.sh` - Validator validation
- [ ] Enhance existing `validate-deployment.sh`
- [ ] Test deployment sequence on test environment
- [ ] Document procedures
- [ ] Test end-to-end on production-like environment
### Boot Node Approach (Optional - Only If Needed)
- [ ] Decide if boot node is needed (probably not for permissioned network)
- [ ] Design boot node strategy (separate node vs first validator)
- [ ] Create `deploy-boot-node.sh` (if using dedicated boot node)
- [ ] Enhance `bootstrap-network.sh` with boot node mode
- [ ] Test boot node bootstrap sequence
- [ ] Document boot node procedures
---
## Notes
### Script-Based vs Boot Node Decision
**For Your Current Setup (Permissioned Network with Validators ↔ Sentries):**
- ✅ **Recommend: Script-Based Approach**
  - You already use `static-nodes.json` for peer discovery
  - All nodes are known and static
  - No external discovery needed
  - Simpler and faster deployment
  - Script orchestrates everything using static configuration
- **Boot Node Not Required**
  - Boot nodes serve dynamic peer discovery, mainly on public/open networks
  - Your network is permissioned with known validators
  - `static-nodes.json` already serves the bootstrap purpose
**When to Use Boot Node:**
- Network will expand with external nodes
- Dynamic peer discovery needed
- Public network deployment
- Combining static + dynamic discovery
### Other Notes
- **Validator Registration**: Only needed if using dynamic validator management via smart contract. If validators are statically defined in genesis, skip this step.
- **Sequential Startup**: Critical for network bootstrap. Nodes must start in correct order: sentries → validators → RPC nodes (for script-based) OR boot node → sentries → validators → RPC nodes (for boot node approach).
- **Validation**: Comprehensive validation should happen at multiple stages: pre-deployment, post-deployment, and ongoing health checks.