# Besu Archive Node Configuration Guide
**Last Updated:** 2026-01-31
**Document Version:** 1.0
**Status:** Active Documentation
---
**Date**: 2026-01-17
**Purpose**: Guide for configuring and managing Besu archive nodes (sentry nodes)
---
## Overview
Sentry nodes are configured as **full archive nodes** so that they retain the complete blockchain history and can answer historical queries. This guide documents their configuration, storage requirements, and management.
---
## Archive Node Configuration
### Current Sentry Configuration
**Node Type**: Sentry (Full Archive)
**Key Configuration**:
```toml
# Archive node configuration
sync-mode="FULL" # Full blockchain sync
logging="INFO" # Detailed logs for archival
# RPC Configuration (internal only)
rpc-http-enabled=true
rpc-http-api=["ETH","NET","WEB3","ADMIN"]
# Network
discovery-enabled=true # Open P2P discovery
max-peers=25
# Permissioning
permissions-nodes-config-file-enabled=true
```
**File**: `smom-dbis-138-proxmox/templates/besu-configs/config-sentry.toml`
---
## Archive Node Requirements
### 1. Sync Mode: FULL
```toml
sync-mode="FULL"
```
**Verification**: ✅ All sentry configs use `sync-mode="FULL"`
**Purpose**:
- Maintains complete blockchain history
- Enables historical state queries
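- Required for full archive functionality
**Note**: Sync mode alone may not be sufficient. On Besu versions where `data-storage-format` defaults to `BONSAI`, only recent state is retained, and historical state queries require `data-storage-format="FOREST"`. Confirm which storage format the deployed Besu version uses.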
---
### 2. Logging: INFO
```toml
logging="INFO"
```
**Verification**: ✅ All sentry configs use `logging="INFO"`
**Rationale**:
- Detailed logs for archival purposes
- Better debugging for archive queries
- Necessary for historical analysis
**Trade-off**: Higher I/O overhead (~10-20%) compared to WARN logging
---
### 3. No Pruning
**Current Configuration**: ✅ Pruning not enabled (default: full archive)
**Verification**: No `pruning-enabled` or `pruning-blocks-retained` options in sentry configs
**Purpose**:
- Keep all historical data
- Enable unlimited historical queries
- Maintain complete blockchain archive
**Note**: If storage becomes an issue, consider enabling pruning with high retention, but this reduces archive completeness.
---
### 4. RPC APIs for Archive Queries
**Current APIs**: `["ETH","NET","WEB3","ADMIN"]`
**Archive-Relevant APIs**:
- `ETH`: Standard Ethereum APIs (including historical queries)
- `ADMIN`: Administrative operations
**Verification**: ✅ Appropriate APIs enabled for archive access
---
## Storage Requirements
### Archive Database Growth
**Estimation** (per Besu documentation):
- **Block data**: ~2-5 KB per block
- **State data**: Variable (grows with contract storage)
- **Transaction receipts**: ~500 bytes per transaction
**Growth Rate**:
- **Current network**: ~30 blocks/minute (2-second block time) = ~1,800 blocks/hour
- **Block data growth**: ~3.6-9 MB/hour = ~86-216 MB/day
- **With state data**: Significantly higher (contract storage)
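The block-data arithmetic can be reproduced with a small helper. This is a minimal sketch, not part of the deployed tooling: the function name `block_data_mb_per_day` is hypothetical, it covers block data only (state data is excluded), and it uses decimal units (1 MB = 1000 KB) to match the approximate figures in this guide.

```bash
# Rough block-data growth estimate (block data only; state data is excluded).
# Usage: block_data_mb_per_day BLOCKS_PER_MINUTE KB_PER_BLOCK
# Prints MB/day rounded to the nearest integer (1 MB = 1000 KB).
block_data_mb_per_day() {
    awk -v b="$1" -v k="$2" 'BEGIN { printf "%.0f\n", b * 60 * 24 * k / 1000 }'
}
```

For example, `block_data_mb_per_day 30 2` and `block_data_mb_per_day 30 5` give the lower and upper bounds for a 2-second block time (30 blocks/minute).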
**Storage Requirements**:
| Time Period | Estimated Storage | Notes |
|-------------|-------------------|-------|
| **1 month** | ~10-50 GB | Depends on transaction volume |
| **3 months** | ~30-150 GB | Linear growth expected |
| **1 year** | ~100-500 GB | State data may be higher |
| **5 years** | ~500 GB - 2.5 TB | Long-term archival |
**Current Assessment**: Monitor storage usage and plan for growth
---
### Storage Planning
**Recommendations**:
1. **Initial Allocation**:
   - Minimum: 500 GB per archive node
   - Recommended: 1-2 TB per archive node
2. **Growth Planning**:
   - Monitor storage usage monthly
   - Plan expansion before reaching 80% capacity
   - Consider separate volumes for archive data
3. **Backup Strategy**:
   - Regular backups of archive database
   - Offsite backup for disaster recovery
   - Retention policy for backups
---
## Archive Node Verification
### Configuration Verification
```bash
# Verify sync mode is FULL
grep "sync-mode" /etc/besu/config-sentry.toml
# Expected: sync-mode="FULL"
# Verify logging is INFO
grep "logging" /etc/besu/config-sentry.toml
# Expected: logging="INFO"
# Verify no pruning options
grep -i "pruning" /etc/besu/config-sentry.toml
# Expected: No output (pruning not enabled = full archive)
```
**Current Status**: ✅ All sentry configs verified as archive nodes
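The three checks above can be combined into a single pass/fail function, which is convenient when verifying several sentry configs in a loop. This is a sketch under stated assumptions: `check_archive_config` is a hypothetical helper name, and it only inspects uncommented lines of the TOML file passed to it.

```bash
# check_archive_config FILE
# Returns 0 only if FILE sets sync-mode="FULL" and logging="INFO"
# and contains no uncommented option starting with "pruning".
check_archive_config() {
    cfg=$1
    [ -r "$cfg" ] || { echo "cannot read $cfg" >&2; return 2; }
    status=0
    grep -Eq '^[[:space:]]*sync-mode[[:space:]]*=[[:space:]]*"FULL"' "$cfg" \
        || { echo "FAIL: sync-mode is not FULL" >&2; status=1; }
    grep -Eq '^[[:space:]]*logging[[:space:]]*=[[:space:]]*"INFO"' "$cfg" \
        || { echo "FAIL: logging is not INFO" >&2; status=1; }
    if grep -Eiq '^[[:space:]]*pruning' "$cfg"; then
        echo "FAIL: pruning option present" >&2
        status=1
    fi
    return $status
}
```

Usage: `check_archive_config /etc/besu/config-sentry.toml && echo "archive settings OK"`.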
---
### Functional Verification
**Check Archive Status**:
```bash
# Check sync status
curl -X POST http://localhost:8545 \
-H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","method":"eth_syncing","params":[],"id":1}'
# Expected: false (fully synced)
# Check latest block
curl -X POST http://localhost:8545 \
-H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}'
# Test historical query (verify archive capability)
curl -X POST http://localhost:8545 \
-H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","method":"eth_getBalance","params":["0x...","0x100"],"id":1}'
# Should return historical balance (archive nodes only)
```
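JSON-RPC quantities such as block numbers are hex-encoded, so scripting these checks usually needs a few small helpers. The sketch below wraps the `curl` calls shown above. All function names (`rpc_payload`, `besu_rpc`, `rpc_hex_result`, `hex_to_dec`, `dec_to_hex`) and the `BESU_RPC_URL` override variable are hypothetical and are not part of Besu.

```bash
# Build a JSON-RPC 2.0 request body. PARAMS must already be a JSON array.
rpc_payload() {
    printf '{"jsonrpc":"2.0","method":"%s","params":%s,"id":1}\n' "$1" "$2"
}

# Send a request to the node (defaults to the endpoint used above).
# BESU_RPC_URL is a hypothetical override variable, not a Besu setting.
besu_rpc() {
    curl -s -X POST "${BESU_RPC_URL:-http://localhost:8545}" \
        -H "Content-Type: application/json" \
        -d "$(rpc_payload "$1" "$2")"
}

# Pull a hex "result" value out of a JSON-RPC response read from stdin.
rpc_hex_result() {
    sed -n 's/.*"result"[[:space:]]*:[[:space:]]*"\(0x[0-9a-fA-F]*\)".*/\1/p'
}

# Convert between hex quantities and decimal block numbers.
hex_to_dec() { printf '%d\n' "$1"; }
dec_to_hex() { printf '0x%x\n' "$1"; }
```

For example, `hex_to_dec "$(besu_rpc eth_blockNumber '[]' | rpc_hex_result)"` prints the latest block number in decimal, and `dec_to_hex 256` produces the `0x100` block tag used in the historical query above.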
---
## Archive Node Management
### Storage Management
**Monitor Storage Usage**:
```bash
# Check database size
du -sh /data/besu/database/
# Check disk usage
df -h /data/besu/
# Monitor growth over time
# (set up monitoring alerts at 80% capacity)
```
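The 80% alert can be approximated with a short script run from cron until proper monitoring is in place. This is a minimal sketch: the function names are hypothetical, and it assumes `df -P` output in the standard POSIX column layout.

```bash
# Print the used-space percentage (integer, no % sign) of the filesystem holding PATH.
disk_used_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Return 0 if PCT has reached THRESHOLD (default 80).
at_or_over_threshold() {
    [ "$1" -ge "${2:-80}" ]
}

# Print a warning when PATH's filesystem is at or above the threshold.
check_disk() {
    pct=$(disk_used_pct "$1")
    if at_or_over_threshold "$pct" "${2:-80}"; then
        echo "WARNING: $1 is at ${pct}% (threshold ${2:-80}%)"
    else
        echo "OK: $1 is at ${pct}%"
    fi
}
```

Usage: `check_disk /data/besu/` uses the default 80% threshold; pass a second argument to use a different one.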
**Storage Expansion**:
1. Plan expansion when approaching 80% capacity
2. Backup archive data before expansion
3. Expand volume or add storage
4. Verify Besu continues operating
---
### Backup and Recovery
**Backup Strategy**:
1. **Database Backup**:
   - Full database backup weekly
   - Incremental backups daily
   - Offsite backup monthly
2. **Configuration Backup**:
   - Backup config files
   - Backup permission files
   - Backup node keys
3. **Recovery Procedures**:
   - Document recovery steps
   - Test recovery procedures
   - Maintain recovery runbook
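The configuration backup step can be sketched as a timestamped tarball. This is an illustration rather than the project's backup tooling: `backup_config` is a hypothetical helper, it assumes config files, permission files, and node keys all live under one directory, and it does not cover the database backup.

```bash
# backup_config SRC_DIR DEST_DIR
# Archives SRC_DIR (config, permission files, node keys) into a timestamped
# tarball under DEST_DIR and prints the archive path.
backup_config() {
    src=$1
    dest=$2
    [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
    mkdir -p "$dest" || return 1
    archive="$dest/besu-config-$(date +%Y%m%d-%H%M%S).tar.gz"
    # -C keeps paths inside the archive relative to the parent of SRC_DIR
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
    chmod 600 "$archive"   # the archive may contain node keys
    echo "$archive"
}
```

Usage: `backup_config /etc/besu /var/backups/besu`, where the destination path is only an example.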
---
### Performance Optimization
**Archive Node Performance**:
1. **Storage Performance**:
   - Use SSD for archive database (high read I/O)
   - Consider NVMe for high-performance requirements
   - Monitor I/O performance
2. **Memory Optimization**:
   - Higher heap size (8-12 GB) for archive nodes
   - Cache frequently accessed historical data
   - Monitor memory usage for historical queries
3. **Query Optimization**:
   - Index historical data appropriately
   - Monitor query performance
   - Optimize frequently used historical queries
---
## Archive vs. Pruned Nodes
### Full Archive (Current Configuration)
**Characteristics**:
- ✅ Complete blockchain history
- ✅ All historical state queries supported
- ✅ Unlimited historical access
- ⚠️ Higher storage requirements
- ⚠️ Higher memory requirements
**Use Case**: ✅ Sentry nodes (archival purposes)
---
### Pruned Nodes (Not Recommended for Sentries)
**Configuration**:
```toml
pruning-enabled=true
pruning-blocks-retained=1024 # Keep last 1024 blocks
```
**Characteristics**:
- ❌ Limited historical data
- ❌ Historical queries may fail
- ✅ Lower storage requirements
- ✅ Lower memory requirements
**Use Case**: Non-archive RPC nodes (if storage is a concern)
**Note**: **Do NOT enable pruning on sentry nodes** - they are archive nodes.
---
## Alternative: Pruning Configuration (If Storage Becomes an Issue)
**Only consider if storage is a critical constraint**:
```toml
# Enable pruning with high retention (NOT RECOMMENDED for full archive)
pruning-enabled=true
pruning-blocks-retained=100000 # Keep last 100,000 blocks (~2.3 days at 2s/block)
```
**Warning**: This reduces archive completeness. Prefer expanding storage instead.
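When choosing a `pruning-blocks-retained` value, it helps to convert between a block count and the time window it covers. The helpers below are a sketch of that arithmetic; the function names are hypothetical, and a constant block time is assumed.

```bash
# Days of history kept when retaining BLOCKS blocks at BLOCK_TIME seconds per block.
retention_days() {
    awk -v n="$1" -v t="$2" 'BEGIN { printf "%.1f\n", n * t / 86400 }'
}

# Blocks to retain to cover DAYS days at BLOCK_TIME seconds per block.
blocks_for_days() {
    echo $(( $1 * 86400 / $2 ))
}
```

With a 2-second block time, `retention_days 43200 2` prints `1.0`, and `blocks_for_days 30 2` prints `1296000`, the number of blocks needed to cover 30 days.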
---
## Monitoring Archive Nodes
### Key Metrics
1. **Sync Status**:
   - Fully synced (archive complete)
   - Syncing (catching up)
   - Lag (blocks behind)
2. **Storage Usage**:
   - Database size
   - Disk usage
   - Growth rate
3. **Query Performance**:
   - Historical query latency
   - Query success rate
   - Archive query volume
4. **Resource Usage**:
   - Memory usage (historical queries)
   - Disk I/O (read-heavy)
   - CPU usage (query processing)
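The lag metric can be derived from the `currentBlock` and `highestBlock` fields that `eth_syncing` returns while a node is still syncing (it returns `false` once synced, in which case the lag is zero). A minimal sketch, with `sync_lag` as a hypothetical helper name:

```bash
# Blocks behind the chain head.
# Usage: sync_lag CURRENT_BLOCK_HEX HIGHEST_BLOCK_HEX
# Both arguments are hex quantities as returned by eth_syncing (e.g. 0x64).
sync_lag() {
    echo $(( $2 - $1 ))
}
```

For example, `sync_lag 0x64 0x6e` prints `10`.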
---
## Archive Node Strategy
### Current Implementation
**Sentry nodes = Full archive nodes**
- Complete blockchain history
- Detailed logs (INFO)
- Full sync mode
- No pruning
**Validators = Non-archive**
- Minimal logs (WARN)
- Full sync (consensus requirement)
- Not archive nodes (no historical queries)
**RPC nodes = Non-archive (most)**
- Minimal logs (WARN)
- Full sync (currently)
- Not archive nodes (API serving)
---
### Archive Node Distribution
**Current**:
- **Archive Nodes**: 4 sentries (VMIDs 1500-1503)
- **Non-Archive Nodes**: Validators + RPC nodes
**Recommendation**: ✅ Appropriate distribution
- Sentries handle archival
- Other nodes run lean
- Centralized archive management
---
## Storage Planning Example
### Example: 1 Year Archive Growth
**Assumptions**:
- Block time: 2 seconds
- Blocks per day: 43,200
- Blocks per year: ~15.8 million (43,200 × 365)
- Block data: ~3 KB per block (average)
- State data: Variable (depends on contracts)
**Estimation**:
- Block data: 15.8M × 3 KB ≈ 47 GB/year
- State data: 50-200 GB/year (varies widely)
- **Total**: ~100-250 GB/year per archive node
**Planning**:
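- Initial: 1 TB allocation
- After year 1: ~750 GB remaining (assuming the upper estimate of ~250 GB/year)
- After year 2: ~500 GB remaining
- After year 3: ~250 GB remaining (75% used)
- **Action**: Plan expansion by year 3, before usage crosses the 80% threshold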
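The planning horizon can be recomputed for other allocations or growth rates. The helper below is a sketch of that calculation; `years_to_threshold` is a hypothetical name, and it assumes constant linear growth starting from an empty volume.

```bash
# Years until used space reaches THRESHOLD_PCT of CAPACITY_GB.
# Usage: years_to_threshold CAPACITY_GB GROWTH_GB_PER_YEAR [THRESHOLD_PCT]
# THRESHOLD_PCT defaults to 80.
years_to_threshold() {
    awk -v cap="$1" -v growth="$2" -v pct="${3:-80}" \
        'BEGIN { printf "%.1f\n", cap * pct / 100 / growth }'
}
```

With a 1 TB allocation, `years_to_threshold 1000 250` prints `3.2` and `years_to_threshold 1000 100` prints `8.0`, which brackets the 100-250 GB/year estimate.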
---
## Best Practices
### 1. Storage Monitoring
- Monitor disk usage weekly
- Set alerts at 80% capacity
- Plan expansion proactively
### 2. Archive Verification
- Verify archive queries work
- Test historical state access
- Confirm sync status regularly
### 3. Backup Strategy
- Regular database backups
- Test recovery procedures
- Offsite backup for disaster recovery
### 4. Performance Monitoring
- Monitor query performance
- Track storage growth
- Optimize if performance degrades
---
## Related Documentation
- `docs/04-configuration/BESU_CONFIGURATION_GUIDE.md` - Configuration reference
- `docs/04-configuration/BESU_PERFORMANCE_TUNING.md` - Performance tuning
- `docs/04-configuration/BESU_PATH_REFERENCE.md` - Path structure
---
## Summary
### Archive Node Status
**Configuration Verified**:
- All sentry nodes configured as full archive
- `sync-mode="FULL"`
- `logging="INFO"`
- No pruning enabled ✅
**Storage Planning**:
- Monitor growth regularly
- Plan expansion proactively
- Maintain backup strategy
**Performance**:
- Appropriate memory allocation
- SSD recommended for archive database
- Monitor query performance
---
**Last Updated**: 2026-01-17
**Status**: Archive Configuration Verified