# Besu Performance Tuning Guide

**Last Updated:** 2026-01-31
**Document Version:** 1.0
**Status:** Active Documentation

---

**Date**: 2026-01-17
**Purpose**: Performance optimization recommendations for Besu nodes

---

## Overview

This guide provides performance tuning recommendations for Besu nodes based on network size, node type, and operational requirements.

---

## Network Size Analysis

### Current Network Topology

- **Validators**: 5 nodes (VMIDs 1000-1004)
- **Sentries**: 4 nodes (VMIDs 1500-1503)
- **RPC Nodes**: 10+ nodes (VMIDs 2500+)
- **Total Nodes**: ~19-20 active nodes

### Expected Growth

- **Near-term**: 20-30 nodes
- **Medium-term**: 30-50 nodes
- **Long-term**: 50-100 nodes

---

## Performance Configuration Options

### max-peers

**Current Settings**:
- Validators: `25` peers
- Sentries: `25` peers
- RPC (Standard): `25` peers
- RPC (ThirdWeb): `50` peers

**Recommended Settings by Network Size**:

| Network Size | Validators | Sentries | RPC (Standard) | RPC (High Traffic) |
|--------------|------------|----------|----------------|-------------------|
| **10-20 nodes** | 15-20 | 20-25 | 20-25 | 30-40 |
| **20-50 nodes** | 20-25 | 25-30 | 25-30 | 40-50 |
| **50-100 nodes** | 25-30 | 30-40 | 30-40 | 50-75 |
| **100+ nodes** | 30-40 | 40-50 | 40-50 | 75-100 |

**Rationale**:
- **Validators**: Fewer peers needed (only sentries and other validators)
- **Sentries**: Moderate peers (handle P2P traffic for validators)
- **RPC Standard**: Moderate peers (serve API requests)
- **RPC High Traffic**: Higher peers (ThirdWeb, high-volume applications)

**Current Assessment**: ✅ Appropriate for current network size (20 nodes)
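
The table above can be encoded as a small lookup, which makes it easy to check a node's `max-peers` against the recommended range as the network grows. The sketch below is illustrative only: the role keys (`validator`, `sentry`, `rpc`, `rpc_high`) and function names are hypothetical, and it assumes that a node count sitting on a band boundary (for example 20) belongs to the larger band, which matches the assessment above that the current settings suit a 20-node network.

```python
from typing import Dict, List, Tuple

Range = Tuple[int, int]

# Rows copied from the table above as (minimum node count, per-role ranges).
# Ordered from the largest band to the smallest so the first match wins.
# Assumption: networks below 10 nodes reuse the smallest band.
BANDS: List[Tuple[int, Dict[str, Range]]] = [
    (100, {"validator": (30, 40), "sentry": (40, 50), "rpc": (40, 50), "rpc_high": (75, 100)}),
    (50, {"validator": (25, 30), "sentry": (30, 40), "rpc": (30, 40), "rpc_high": (50, 75)}),
    (20, {"validator": (20, 25), "sentry": (25, 30), "rpc": (25, 30), "rpc_high": (40, 50)}),
    (0, {"validator": (15, 20), "sentry": (20, 25), "rpc": (20, 25), "rpc_high": (30, 40)}),
]


def recommended_max_peers(node_count: int, role: str) -> Range:
    """Return the (low, high) max-peers range for a network size and role."""
    if node_count < 0:
        raise ValueError("node_count must be non-negative")
    for minimum, ranges in BANDS:
        if node_count >= minimum:
            if role not in ranges:
                raise ValueError(f"unknown role: {role}")
            return ranges[role]
    raise AssertionError("unreachable: the last band starts at 0")


def within_recommendation(max_peers: int, node_count: int, role: str) -> bool:
    """True when a configured max-peers value lies inside the recommended range."""
    low, high = recommended_max_peers(node_count, role)
    return low <= max_peers <= high
```

For example, `within_recommendation(25, 20, "sentry")` is true for the current sentry setting, while the same value at 60 nodes falls below the 30-40 range.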

---

### P2P Configuration

```toml
# P2P host binding
p2p-host="0.0.0.0"
p2p-port=30303

# Maximum peer connections
max-peers=25

# Discovery
discovery-enabled=true  # or false for isolated nodes
```

**Tuning Guidelines**:
- **Discovery enabled**: For public-facing nodes (sentries, public RPC)
- **Discovery disabled**: For internal-only nodes (validators, core RPC)
- **Max peers**: Balance between connectivity and resource usage

---

### Sync Mode Configuration

```toml
sync-mode="FULL"
```

**Options**:
- `FULL`: Full blockchain sync (validators, archive nodes)
- `FAST`: Fast sync (non-archive RPC nodes)
- `SNAP`: Snapshot sync (if available, fastest bootstrap)

**Recommendations**:
- ✅ **Validators**: `FULL` (required for consensus)
- ✅ **Sentries (Archive)**: `FULL` (archive nodes)
- ⚠️ **RPC Nodes**: Consider `FAST` for non-archive nodes (better performance)

**Note**: Current configs all use `FULL`. Consider `FAST` for non-archive RPC nodes if storage is a concern.
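
As a rough illustration of the recommendations above, the following sketch picks a `sync-mode` value from a node's role and whether it serves archive data, then renders the matching config line. The role names and function names are hypothetical; the only rule encoded is the one stated above (validators and archive nodes use `FULL`, non-archive RPC nodes may use `FAST`), and any other combination conservatively falls back to `FULL`.

```python
def choose_sync_mode(role: str, archive: bool) -> str:
    """Pick a sync mode following the recommendations in this section."""
    # Only non-archive RPC nodes are candidates for FAST; everything else
    # (validators, archive sentries, archive RPC) stays on FULL.
    if role == "rpc" and not archive:
        return "FAST"
    return "FULL"


def sync_mode_line(role: str, archive: bool) -> str:
    """Render the TOML line for the chosen sync mode."""
    return f'sync-mode="{choose_sync_mode(role, archive)}"'
```

Calling `sync_mode_line("rpc", archive=False)` yields a `FAST` line, while validators and archive sentries keep `FULL`.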

---

### Logging Configuration

```toml
logging="WARN"  # Validators and RPC
logging="INFO"  # Sentry archive nodes
```

**Performance Impact**:
- **INFO logging**: ~10-20% I/O overhead
- **WARN logging**: Minimal I/O overhead (<5%)
- **DEBUG logging**: High I/O overhead (30-50%)

**Recommendation**: ✅ Current settings are optimal
- Validators/RPC: `WARN` (minimal overhead)
- Sentry archive: `INFO` (detailed logs for archival)

---

### RPC Configuration

#### HTTP-RPC Timeout

```toml
# ThirdWeb RPC uses extended timeout
rpc-http-timeout=60
```

**Default**: 60 seconds (Besu default)

**Tuning**:
- **Standard RPC**: Default (60s) is appropriate
- **High-volume RPC**: May need longer timeout for complex queries
- **Public RPC**: Default is sufficient

**Recommendation**: ✅ Current settings appropriate

---

#### WebSocket Configuration

```toml
rpc-ws-enabled=true
rpc-ws-port=8546
```

**Performance Considerations**:
- WebSocket connections consume memory
- Recommended for real-time applications (ThirdWeb, dApps)
- Not needed for simple read-only public RPC

**Current Usage**: ✅ Appropriate (enabled where needed, disabled for public RPC)

---

### Metrics Configuration

```toml
metrics-enabled=true
metrics-port=9545
metrics-host="0.0.0.0"
```

**Performance Impact**: Minimal (<2% overhead)

**Recommendation**: ✅ Keep enabled on all nodes for monitoring

---

## Resource Recommendations

### Memory (JVM Heap)

**Current Settings** (from deployment scripts):
- Validators: `-Xmx4g -Xms4g`
- Sentries: `-Xmx6g -Xms6g` (archive nodes need more)
- RPC: `-Xmx6g -Xms6g`

**Recommended by Node Type**:

| Node Type | Heap Size | Rationale |
|-----------|-----------|-----------|
| **Validator** | 4-8GB | Consensus operations, transaction pool |
| **Sentry (Archive)** | 8-12GB | Full archive database, historical queries |
| **RPC (Standard)** | 4-8GB | API serving, standard sync |
| **RPC (High Traffic)** | 8-12GB | High request volume, complex queries |

**Current Assessment**: ✅ Appropriate for current workload. Sentries currently run with a 6GB heap, below the 8-12GB recommended for archive nodes, so revisit their heap size if archive query load grows.
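
To compare the deployed JVM flags against the table above, a small check like the following can help. It is a sketch under stated assumptions: the role keys and function names are hypothetical, the ranges are copied from the table, and only `-Xmx` values with a `g` or `m` suffix are parsed.

```python
import re
from typing import Dict, Tuple

# (low, high) heap ranges in GB, copied from the table above.
# The role keys are illustrative names, not Besu identifiers.
RECOMMENDED_HEAP_GB: Dict[str, Tuple[int, int]] = {
    "validator": (4, 8),
    "sentry_archive": (8, 12),
    "rpc": (4, 8),
    "rpc_high": (8, 12),
}


def parse_xmx_gb(jvm_opts: str) -> float:
    """Extract the -Xmx value from a JVM options string, in GB."""
    match = re.search(r"-Xmx(\d+)([gGmM])", jvm_opts)
    if match is None:
        raise ValueError("no -Xmx<size>[g|m] flag found")
    value = int(match.group(1))
    return value if match.group(2).lower() == "g" else value / 1024


def heap_status(role: str, jvm_opts: str) -> str:
    """Classify the configured heap as 'below', 'within', or 'above' the range."""
    low, high = RECOMMENDED_HEAP_GB[role]
    heap_gb = parse_xmx_gb(jvm_opts)
    if heap_gb < low:
        return "below"
    if heap_gb > high:
        return "above"
    return "within"
```

With the current settings listed above, `heap_status("validator", "-Xmx4g -Xms4g")` reports `within`, while the 6GB sentry heap reports `below` against the 8-12GB archive row.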

---

### CPU

**Recommendations**:
- **Validators**: 4+ vCPUs (consensus is CPU-intensive)
- **Sentries**: 4-8 vCPUs (P2P relay, archive queries)
- **RPC**: 4-8 vCPUs (API serving, request handling)

**Current VM Sizes**:
- Validators: `Standard_D4_v2` (4 vCPUs) ✅
- Sentries: `Standard_D4_v2` (4 vCPUs) ✅
- RPC: `Standard_D8s_v6` (8 vCPUs) ✅

**Assessment**: ✅ Current sizing is appropriate

---

### Disk I/O

**Archive Nodes (Sentries)**:
- High read I/O (historical queries)
- SSD recommended for archive database
- Consider high IOPS for archive nodes

**Validators/RPC**:
- Moderate I/O (recent block data)
- Standard storage sufficient

---

## Performance Monitoring

### Key Metrics to Monitor

1. **Peer Connections**:
   - Active peer count vs. `max-peers`
   - Peer connection churn
   - Peer latency

2. **Block Sync**:
   - Sync status (in-sync vs. syncing)
   - Block import rate
   - Sync lag (blocks behind)

3. **RPC Performance**:
   - Request rate (requests/second)
   - Response latency (p50, p95, p99)
   - Error rate

4. **Resource Usage**:
   - Memory usage (heap utilization)
   - CPU usage
   - Disk I/O (read/write rates)

5. **Transaction Pool**:
   - Transaction pool size
   - Transaction processing rate
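
Two of the metrics above, peer count relative to `max-peers` and sync lag, can be derived from the standard Ethereum JSON-RPC methods `net_peerCount` and `eth_syncing`. The sketch below only interprets the `result` field of those responses; how the request is sent (HTTP client, endpoint, authentication) is left out, the function names are hypothetical, and it assumes the `NET` and `ETH` API groups are enabled on the node being queried.

```python
from typing import Union


def peer_utilization(net_peer_count_result: str, max_peers: int) -> float:
    """Fraction of max-peers in use, from a net_peerCount result such as '0x14'."""
    if max_peers <= 0:
        raise ValueError("max_peers must be positive")
    # JSON-RPC quantities are hex strings with a 0x prefix.
    return int(net_peer_count_result, 16) / max_peers


def sync_lag(eth_syncing_result: Union[bool, dict]) -> int:
    """Blocks behind the highest known block, from an eth_syncing result.

    eth_syncing returns False when the node is in sync, otherwise an object
    with hex-encoded currentBlock and highestBlock fields.
    """
    if eth_syncing_result is False:
        return 0
    current = int(eth_syncing_result["currentBlock"], 16)
    highest = int(eth_syncing_result["highestBlock"], 16)
    return max(highest - current, 0)
```

A node reporting `"0x14"` peers with `max-peers=25` is at 80% utilization; sustained values near 100% suggest the limit is the constraint rather than the network.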

---

## Tuning Recommendations by Network Growth

### Phase 1: Current (20 nodes)

**Current Settings**: ✅ Appropriate
- `max-peers=25` for most nodes
- `max-peers=50` for ThirdWeb RPC
- `sync-mode="FULL"` for all nodes

**No changes needed** at current scale.

---

### Phase 2: Medium Growth (30-50 nodes)

**Recommended Adjustments**:
1. Increase `max-peers` to 30-35 for sentries
2. Increase `max-peers` to 30-35 for standard RPC nodes (the high-traffic ThirdWeb RPC already runs at 50)
3. Monitor peer connection health
4. Consider `FAST` sync for non-archive RPC nodes

---

### Phase 3: Large Growth (50-100 nodes)

**Recommended Adjustments**:
1. Increase `max-peers` to 40-50 for sentries
2. Increase `max-peers` to 50-75 for high-traffic RPC
3. Review JVM heap sizes (may need increase)
4. Monitor and optimize database performance
5. Consider horizontal scaling for RPC nodes

---

## Network-Specific Tuning

### Validator Network

**Characteristics**: Consensus-critical, low latency needed

**Tuning**:
- Lower `max-peers` (only sentries + validators)
- Prioritize stable peer connections
- Monitor consensus performance (block time, round time)

**Current**: ✅ Optimized for consensus performance

---

### Sentry Network

**Characteristics**: P2P relay, full archive

**Tuning**:
- Moderate `max-peers` (handle P2P traffic)
- Archive database optimization
- Higher memory for historical queries

**Current**: ✅ Configured for archive + P2P relay

---

### RPC Network

**Characteristics**: API serving, variable traffic

**Tuning**:
- Variable `max-peers` by traffic level
- WebSocket configuration based on use case
- RPC timeout based on query complexity

**Current**: ✅ Varied appropriately by use case

---

## Performance Optimization Checklist

### Initial Setup
- ✅ JVM heap size appropriate for node type
- ✅ `max-peers` configured for network size
- ✅ Logging level optimized (WARN for most, INFO for archive)
- ✅ Sync mode appropriate (FULL for archive, consider FAST for non-archive)

### Ongoing Monitoring
- ⏳ Monitor peer connection health
- ⏳ Track RPC request latency
- ⏳ Monitor memory/CPU usage
- ⏳ Check block sync status
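
For the latency item, one simple way to summarize collected request timings is a nearest-rank percentile over a sample window. This is a minimal sketch, not how any particular dashboard computes percentiles: the function names are hypothetical, the samples are assumed to be per-request latencies in milliseconds gathered elsewhere, and production setups would more likely rely on histogram metrics in Prometheus/Grafana.

```python
from typing import Dict, Sequence


def percentile(samples: Sequence[float], p: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range 1..100")
    ordered = sorted(samples)
    # Integer form of ceil(p / 100 * n), avoiding floating-point rounding.
    rank = (p * len(ordered) + 99) // 100
    return ordered[rank - 1]


def latency_summary(samples_ms: Sequence[float]) -> Dict[str, float]:
    """Return the p50, p95 and p99 latencies for a window of samples."""
    return {f"p{p}": percentile(samples_ms, p) for p in (50, 95, 99)}
```

Tracking p95 and p99 alongside p50 matters because a healthy median can hide a slow tail caused by a few expensive queries.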

### Optimization
- ⏳ Adjust `max-peers` based on network growth
- ⏳ Tune JVM GC settings if needed
- ⏳ Optimize database performance for archive nodes
- ⏳ Scale resources if performance degrades

---

## Best Practices

### 1. Start Conservative
- Begin with recommended settings
- Monitor performance
- Adjust based on actual workload

### 2. Scale Gradually
- Increase `max-peers` incrementally
- Monitor impact of changes
- Revert if issues occur
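
The incremental approach can be sketched as a list of intermediate `max-peers` values to apply one at a time, observing the node between each change. The step size of 5 and the function names are assumptions made for illustration; the document does not prescribe a specific increment.

```python
from typing import List


def next_max_peers(current: int, target: int, step: int = 5) -> int:
    """Move one step from current toward target without overshooting."""
    if step <= 0:
        raise ValueError("step must be positive")
    if current < target:
        return min(current + step, target)
    if current > target:
        return max(current - step, target)
    return current


def rollout_plan(current: int, target: int, step: int = 5) -> List[int]:
    """List the successive max-peers values to apply, ending at target."""
    plan: List[int] = []
    while current != target:
        current = next_max_peers(current, target, step)
        plan.append(current)
    return plan
```

Moving a sentry from 25 to 40 peers would then be applied as 30, 35, 40, with monitoring between steps; reverting is the same plan run toward the previous value.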

### 3. Monitor First, Tune Second
- Collect performance metrics
- Identify bottlenecks
- Tune specific issues

### 4. Document Changes
- Track configuration changes
- Document performance impact
- Maintain configuration history

---

## Related Documentation

- `docs/04-configuration/BESU_CONFIGURATION_GUIDE.md` - Configuration reference
- `docs/04-configuration/RPC_CONFIG_ANALYSIS.md` - RPC configuration analysis
- Monitoring dashboards (Grafana/Prometheus)

---

**Last Updated**: 2026-01-17
**Status**: Performance Tuning Guide