# Besu Configuration Deployment Monitoring Guide

**Last Updated:** 2026-01-31
**Document Version:** 1.0
**Status:** Active Documentation

---

**Date**: 2026-01-17
**Purpose**: Guide for monitoring Besu configuration deployments and verifying correct operation

---

## Overview

After deploying cleaned Besu configurations to running nodes, monitor the deployment to ensure services start correctly, configuration changes are applied, and no issues arise.

---

## Post-Deployment Monitoring Period

**Recommended**: 24-48 hours after deployment

**Intensive Monitoring**: First 4-6 hours
**Standard Monitoring**: 24-48 hours
**Ongoing Monitoring**: Regular health checks

---

## Monitoring Checklist

### Immediate (0-1 hour after deployment)

- [ ] Verify all services started successfully
- [ ] Check for configuration errors in logs
- [ ] Verify no restart loops
- [ ] Check logging levels are correct
- [ ] Test RPC endpoints (if applicable)

### Short-term (1-6 hours after deployment)

- [ ] Monitor service status
- [ ] Check for configuration-related errors
- [ ] Verify network connectivity
- [ ] Test consensus participation (validators)
- [ ] Test archive queries (sentries)

### Medium-term (6-48 hours after deployment)

- [ ] Monitor resource usage (memory, CPU, disk)
- [ ] Check peer connections
- [ ] Verify sync status
- [ ] Monitor for performance issues
- [ ] Check metrics endpoints

---

## Service Status Verification

### Check Systemd Service Status

```bash
# For each node (example for validator 1000)
pct exec 1000 -- systemctl status besu-validator.service

# Check if service is active
pct exec 1000 -- systemctl is-active besu-validator.service
# Expected: "active"

# Check service logs
pct exec 1000 -- journalctl -u besu-validator.service -n 50 --no-pager
```

### Verify No Restart Loops

```bash
# Check restart count (should be 0 or low after deployment)
pct exec 1000 -- systemctl show besu-validator.service -p NRestarts
# Expected: NRestarts=0 or a low number

# Check for frequent restarts
pct exec 1000 -- journalctl -u besu-validator.service --since "1 hour ago" | grep "Started\|Stopped" | tail -10
```

---

## Configuration Verification

### Verify Logging Levels

**Validators and RPC**: Should log at `WARN` level
**Sentry nodes**: Should log at `INFO` level

```bash
# Check Besu logs for logging level (should show WARN or INFO)
pct exec 1000 -- journalctl -u besu-validator.service -n 20 | grep -i "log\|WARN\|INFO"

# Validators/RPC: Should see WARN-level messages (minimal logs)
# Sentries: Should see INFO-level messages (detailed logs)
```

### Check for Configuration Errors

```bash
# Look for configuration errors
pct exec 1000 -- journalctl -u besu-validator.service | grep -i "error\|unknown option\|configuration"

# Should NOT see:
# - "Unknown options in TOML configuration file"
# - "Configuration error"
# - Deprecated option warnings
```

---

## Functional Verification

### Validator Nodes

**Check Consensus Participation**:

```bash
# Verify validator is synced
# Note: Validators have RPC disabled, so this request only succeeds where RPC
# has been enabled; otherwise use internal tools or metrics
curl -X POST http://192.168.11.100:8545 \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"eth_syncing","params":[],"id":1}'

# Expected: "result": false (fully synced)
```

**Check Metrics** (validators enable metrics):

```bash
curl http://192.168.11.100:9545/metrics | grep besu_blocks_total
```

### Sentry Nodes (Archive)

**Check Archive Functionality**:

```bash
# Test historical query (verify archive mode)
curl -X POST http://192.168.11.150:8545 \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"eth_getBalance","params":["0x0000000000000000000000000000000000000000","0x100"],"id":1}'

# Should return the balance at block 0x100 (archive nodes only)
```

**Check Sync Status**:

```bash
curl -X POST http://192.168.11.150:8545 \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"eth_syncing","params":[],"id":1}'

# Expected: "result": false (fully synced)
```

### RPC Nodes

**Test RPC Endpoints**:
```bash
# Test HTTP-RPC
curl -X POST http://192.168.11.250:8545 \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}'

# Test chain ID
curl -X POST http://192.168.11.250:8545 \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"eth_chainId","params":[],"id":1}'
# Expected: "0x8a" (138 in hex)
```

**Verify Logging Level** (should be WARN, minimal logs):

```bash
# Check logs show minimal output (WARN level)
pct exec 2500 -- journalctl -u besu-rpc.service -n 20 --no-pager

# Should see mostly warnings/errors, not info messages
```

---

## Network Connectivity

### Peer Connections

**Check Peer Count**:

```bash
# Via metrics (if available)
curl http://192.168.11.150:9545/metrics | grep besu_peers

# Via logs (look for peer connection messages)
pct exec 1500 -- journalctl -u besu-sentry.service | grep -i "peer\|connected"
```

**Expected**:

- Validators: Connected to sentries (and other validators)
- Sentries: Connected to validators and external peers
- RPC: Connected to internal peers (sentries/validators)

---

## Performance Monitoring

### Resource Usage

**Memory Usage**:

```bash
# Check Besu process memory (%MEM and command columns)
pct exec 1000 -- ps aux | grep besu | awk '{print $4,$11}'

# Check systemd memory limit
pct exec 1000 -- systemctl show besu-validator.service | grep MemoryMax
```

**CPU Usage**:

```bash
# Monitor CPU usage
pct exec 1000 -- top -bn1 | grep besu
```

**Disk Usage**:

```bash
# Check disk usage
pct exec 1500 -- df -h /data/besu

# Check database size
pct exec 1500 -- du -sh /data/besu/database/
```

---

## Configuration Drift Detection

### Compare Running Configs to Templates

```bash
# Use audit script
./scripts/audit-besu-configs.sh

# Manual comparison
# 1. Copy running config from node
pct exec 1000 -- cat /etc/besu/config-validator.toml > /tmp/running-config.toml

# 2. Compare to template
diff /tmp/running-config.toml smom-dbis-138-proxmox/templates/besu-configs/config-validator.toml
```

**Expected**: Running configs should match templates (after deployment), so `diff` prints nothing.

---

## Troubleshooting

### Issue: Service Fails to Start

**Symptoms**:

- Service status: `failed` or `inactive`
- Frequent restarts
- Configuration errors in logs

**Diagnosis**:

```bash
# Check service status
pct exec 1000 -- systemctl status besu-validator.service

# Check logs for errors
pct exec 1000 -- journalctl -u besu-validator.service -n 100 --no-pager
```

**Common Causes**:

1. Configuration syntax error
2. Deprecated options still present
3. Invalid option values
4. Missing required files (genesis.json, etc.)

**Resolution**:

1. Validate config with `validate-besu-config.sh`
2. Check for deprecated options
3. Review Besu logs for specific errors
4. Restore from backup if needed

---

### Issue: Configuration Not Applied

**Symptoms**:

- Logging level unchanged
- Service running but with old settings

**Diagnosis**:

```bash
# Check if config file was updated (compare the modification time to the deployment time)
pct exec 1000 -- stat /etc/besu/config-validator.toml

# Check actual logging level in Besu logs
pct exec 1000 -- journalctl -u besu-validator.service | grep -i "logging\|WARN\|INFO"
```

**Resolution**:

1. Verify config file was copied correctly
2. Ensure service was restarted after config update
3. Check for file permission issues
4. Verify Besu is reading the correct config file

---

### Issue: Logging Level Incorrect

**Symptoms**:

- Validators showing INFO logs (should be WARN)
- RPC nodes showing INFO logs (should be WARN)
- Sentries showing WARN logs (should be INFO)

**Diagnosis**:

```bash
# Check config file logging setting
pct exec 1000 -- grep "^logging" /etc/besu/config-validator.toml
# Expected: logging="WARN" for validators

# Check actual log output
pct exec 1000 -- journalctl -u besu-validator.service -n 20
# Should see minimal logs (WARN level)
```

**Resolution**:

1. Verify config file has correct `logging="WARN"` or `logging="INFO"`
2. Ensure service was restarted
3. If entries from before the restart obscure the current output, clear old journal data with `journalctl --vacuum-time=1s` (this permanently deletes archived journal files, so use it only when the old logs are no longer needed)

---

## Monitoring Scripts

### Automated Monitoring

Create a monitoring script to check all nodes:

```bash
#!/bin/bash
# monitor-besu-deployment.sh

NODES=(1000 1001 1002 1003 1004 1500 1501 1502 1503 2500 2501 2502)

for vmid in "${NODES[@]}"; do
    echo "Checking VMID $vmid..."

    # Check service status. The glob is quoted so systemctl inside the container
    # matches it, not the host shell. is-active exits non-zero for inactive units,
    # so fall back to "unknown" only when there is no output at all.
    status=$(pct exec "$vmid" -- systemctl is-active 'besu-*.service' 2>/dev/null)
    echo "  Service status: ${status:-unknown}"

    # Count log lines containing "error" in the last hour
    errors=$(pct exec "$vmid" -- journalctl -u 'besu-*.service' --since "1 hour ago" --no-pager | grep -ci "error")
    echo "  Errors in last hour: $errors"

    # Check restart count
    restarts=$(pct exec "$vmid" -- systemctl show 'besu-*.service' -p NRestarts --value | head -1)
    echo "  Restart count: $restarts"
done
```

---

## Success Criteria

### Deployment Successful If:

✅ **All services running**:

- Systemd status: `active`
- No restart loops
- Services stable for 24+ hours

✅ **Configuration applied**:

- Logging levels correct (WARN for validators/RPC, INFO for sentries)
- No deprecated options in use
- All configs match templates

✅ **Functionality verified**:

- Validators participating in consensus
- Sentries providing archive queries
- RPC nodes serving API requests
- Network connectivity normal

✅ **No errors**:

- No configuration errors in logs
- No "Unknown options" errors
- Services starting cleanly

---

## Monitoring Timeline

### Hour 0-1: Immediate Verification

- Service status
- Configuration errors
- Basic functionality

### Hour 1-6: Intensive Monitoring

- Service stability
- Performance metrics
- Network connectivity
- Detailed verification

### Hour 6-24: Standard Monitoring

- Ongoing health checks
- Resource usage
- Performance trends

### Day 2+: Ongoing Monitoring

- Regular health checks
- Performance monitoring
- Configuration drift detection

---

## Post-Deployment Checklist

- [ ] All services running (validators, sentries, RPC)
- [ ] No configuration errors in logs
- [ ] Logging levels correct (WARN/INFO as appropriate)
- [ ] No restart loops
- [ ] Validators participating in consensus
- [ ] Sentries providing archive queries
- [ ] RPC nodes serving API requests
- [ ] Network connectivity normal
- [ ] Peer connections healthy
- [ ] Resource usage within expected ranges
- [ ] Configuration drift: None detected

---

## Related Documentation

- `scripts/deploy-besu-configs.sh` - Deployment script
- `scripts/audit-besu-configs.sh` - Configuration audit
- `scripts/validate-besu-config.sh` - Configuration validation
- `docs/04-configuration/BESU_CONFIGURATION_GUIDE.md` - Configuration reference

---

**Last Updated**: 2026-01-17
**Status**: Monitoring Guide
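
---

The per-node values gathered by the monitoring script (service status, error count, restart count) can be reduced to a single verdict that mirrors the success criteria. The following is a minimal sketch, not one of the repository's scripts: the function name `node_verdict`, its argument order, and the `OK`/`WARN`/`FAIL` labels are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical helper (not part of the existing scripts): classify one node
# from values already collected by the monitoring loop.
# Usage: node_verdict <is-active output> <error count> <restart count> [max restarts]
node_verdict() {
    local status="$1" errors="${2:-0}" restarts="${3:-0}" max_restarts="${4:-0}"

    # Anything other than "active" fails the "all services running" criterion.
    if [ "$status" != "active" ]; then
        echo "FAIL: service is $status"
        return 1
    fi

    # More restarts than allowed suggests a restart loop.
    if [ "$restarts" -gt "$max_restarts" ]; then
        echo "FAIL: $restarts restarts (allowed: $max_restarts)"
        return 1
    fi

    # Error lines do not prove a failure, so report them as a warning to review.
    if [ "$errors" -gt 0 ]; then
        echo "WARN: $errors error lines in logs"
        return 2
    fi

    echo "OK"
    return 0
}
```

Inside the monitoring loop this could be called as `node_verdict "${status:-unknown}" "$errors" "$restarts"`, passing a non-zero fourth argument if a restart during deployment is expected.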