# Deployment Runbook
**Last Updated:** 2026-01-31
**Document Version:** 1.0
**Status:** Active Documentation
---
## SolaceScanScout Explorer - Production Deployment Guide
**Last Updated**: 2026-01-31
**Version**: 1.0.0
---
## Table of Contents
1. [Pre-Deployment Checklist](#pre-deployment-checklist)
2. [Environment Setup](#environment-setup)
3. [Database Migration](#database-migration)
4. [Service Deployment](#service-deployment)
5. [Health Checks](#health-checks)
6. [Rollback Procedures](#rollback-procedures)
7. [Post-Deployment Verification](#post-deployment-verification)
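8. [Troubleshooting](#troubleshooting)
9. [Emergency Procedures](#emergency-procedures)
10. [Maintenance Windows](#maintenance-windows)
11. [Contact Information](#contact-information)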
---
## Pre-Deployment Checklist
### Infrastructure Requirements
- [ ] Kubernetes cluster (AKS) or VM infrastructure ready
- [ ] PostgreSQL 16+ with TimescaleDB extension
- [ ] Redis cluster (for production cache/rate limiting)
- [ ] Elasticsearch/OpenSearch cluster
- [ ] Load balancer configured
- [ ] SSL certificates provisioned
- [ ] DNS records configured
- [ ] Monitoring stack deployed (Prometheus, Grafana)
### Configuration
- [ ] Environment variables configured
- [ ] Secrets stored in Key Vault
- [ ] Database credentials verified
- [ ] Redis connection string verified
- [ ] RPC endpoint URLs verified
- [ ] JWT secret configured (strong random value)
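
One way to produce the "strong random value" for the JWT secret is to read bytes from the kernel's random source and base64-encode them. This is a minimal sketch; the function name `generate_secret` and the 48-byte default are assumptions, not part of the project.

```bash
# Print a random secret: N bytes from /dev/urandom, base64-encoded on one line.
# 48 bytes encode to exactly 64 base64 characters with no padding.
generate_secret() {
  head -c "${1:-48}" /dev/urandom | base64 | tr -d '\n'
}

# Usage (store the value in Key Vault rather than in shell history):
#   export JWT_SECRET="$(generate_secret)"
```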
### Code & Artifacts
- [ ] All tests passing
- [ ] Docker images built and tagged
- [ ] Images pushed to container registry
- [ ] Database migrations reviewed
- [ ] Rollback plan documented
---
## Environment Setup
### 1. Set Environment Variables
```bash
# Database
export DB_HOST=postgres.example.com
export DB_PORT=5432
export DB_USER=explorer
export DB_PASSWORD=<from-key-vault>
export DB_NAME=explorer
# Redis (for production)
export REDIS_URL=redis://redis.example.com:6379
# RPC
export RPC_URL=https://rpc.d-bis.org
export WS_URL=wss://rpc.d-bis.org
# Application
export CHAIN_ID=138
export PORT=8080
export JWT_SECRET=<strong-random-secret>
# Optional
export LOG_LEVEL=info
export ENABLE_METRICS=true
```
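Before continuing, it can help to confirm that none of the variables above were missed. The sketch below prints the names of any that are unset or empty. The helper `missing_env` and the `REQUIRED_VARS` list are illustrative; `REDIS_URL` is left out on the assumption that the service can run without it.

```bash
# Print the name of every listed variable that is unset or empty.
# The parenthesised body runs in a subshell so loop variables do not leak.
missing_env() (
  for name in "$@"; do
    eval "value=\${$name:-}"
    [ -n "$value" ] || printf '%s\n' "$name"
  done
)

# Names taken from the exports above (optional ones omitted).
REQUIRED_VARS="DB_HOST DB_PORT DB_USER DB_PASSWORD DB_NAME RPC_URL WS_URL CHAIN_ID PORT JWT_SECRET"

# Usage: stop a deploy script early if anything is missing.
#   missing=$(missing_env $REQUIRED_VARS)
#   [ -z "$missing" ] || { echo "missing: $missing" >&2; exit 1; }
```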
### 2. Verify Secrets
```bash
# Test database connection (psql reads the password from PGPASSWORD)
PGPASSWORD="$DB_PASSWORD" psql -h "$DB_HOST" -p "$DB_PORT" -U "$DB_USER" -d "$DB_NAME" -c "SELECT 1;"
# Test Redis connection
redis-cli -u "$REDIS_URL" ping
# Test RPC endpoint
curl -X POST "$RPC_URL" \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}'
```
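The RPC call above returns its result as a hex quantity (for example `"0x10"`), which is awkward to compare by eye. The sketch below converts such a value to decimal; the helper name `hex_to_dec` is an assumption. The same helper can confirm the endpoint serves the expected chain, since the standard `eth_chainId` method should return `0x8a` for chain 138.

```bash
# Convert a JSON-RPC hex quantity such as "0x8a" to decimal.
# Returns non-zero if the input is empty or contains non-hex characters.
hex_to_dec() {
  digits=${1#0x}
  case $digits in
    ''|*[!0-9a-fA-F]*) return 1 ;;
  esac
  printf '%d\n' "0x$digits"
}

# Usage: compare the endpoint's chain ID with CHAIN_ID.
#   result=$(curl -s -X POST "$RPC_URL" -H "Content-Type: application/json" \
#     -d '{"jsonrpc":"2.0","method":"eth_chainId","params":[],"id":1}' | jq -r .result)
#   [ "$(hex_to_dec "$result")" = "$CHAIN_ID" ] || echo "chain ID mismatch" >&2
```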
---
## Database Migration
### 1. Backup Existing Database
```bash
# Create backup
pg_dump -h $DB_HOST -U $DB_USER -d $DB_NAME > backup_$(date +%Y%m%d_%H%M%S).sql
# Verify backup
ls -lh backup_*.sql
```
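`ls -lh` only shows that a file exists. A slightly stronger check, sketched below, looks for the header and footer comments that `pg_dump` writes in plain-text format, which catches dumps that were cut short. The function name `backup_looks_complete` is an assumption, and the check does not validate the SQL itself.

```bash
# Succeed only if the file is non-empty and carries both the pg_dump header
# comment near the top and the "dump complete" comment near the end.
backup_looks_complete() {
  [ -s "$1" ] || return 1
  head -n 5 "$1" | grep -q 'PostgreSQL database dump' || return 1
  tail -n 10 "$1" | grep -q 'PostgreSQL database dump complete'
}

# Usage:
#   backup_looks_complete backup_YYYYMMDD_HHMMSS.sql || echo "backup incomplete" >&2
```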
### 2. Run Migrations
```bash
cd explorer-monorepo/backend/database/migrations
# Review pending migrations
go run migrate.go --status
# Run migrations
go run migrate.go --up
# Verify migration
go run migrate.go --status
```
### 3. Verify Schema
```bash
psql -h $DB_HOST -U $DB_USER -d $DB_NAME -c "\dt"
psql -h $DB_HOST -U $DB_USER -d $DB_NAME -c "\d blocks"
psql -h $DB_HOST -U $DB_USER -d $DB_NAME -c "\d transactions"
```
---
## Service Deployment
### Option 1: Kubernetes Deployment
#### 1. Deploy API Server
```bash
kubectl apply -f k8s/api-server-deployment.yaml
kubectl apply -f k8s/api-server-service.yaml
kubectl apply -f k8s/api-server-ingress.yaml
# Verify deployment
kubectl get pods -l app=api-server
kubectl logs -f deployment/api-server
```
#### 2. Deploy Indexer
```bash
kubectl apply -f k8s/indexer-deployment.yaml
# Verify deployment
kubectl get pods -l app=indexer
kubectl logs -f deployment/indexer
```
#### 3. Rolling Update
```bash
# Update image
kubectl set image deployment/api-server api-server=registry.example.com/explorer-api:v1.1.0
# Monitor rollout
kubectl rollout status deployment/api-server
# Rollback if needed
kubectl rollout undo deployment/api-server
```
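The three commands above can be chained so that a failed rollout is undone automatically. This is a sketch under stated assumptions: the function name `deploy_or_undo` and the `KUBECTL` override variable are illustrative, and the 120-second default timeout is arbitrary.

```bash
# Allow the kubectl binary to be overridden (for example in a dry run).
KUBECTL=${KUBECTL:-kubectl}

# Update one deployment's image, wait for the rollout, and undo it on failure.
# Arguments: deployment name, container name, image reference, timeout (default 120s).
deploy_or_undo() {
  "$KUBECTL" set image "deployment/$1" "$2=$3" || return 1
  if "$KUBECTL" rollout status "deployment/$1" --timeout="${4:-120s}"; then
    return 0
  fi
  echo "rollout of $1 failed; rolling back" >&2
  "$KUBECTL" rollout undo "deployment/$1"
  return 1
}

# Usage:
#   deploy_or_undo api-server api-server registry.example.com/explorer-api:v1.1.0
```

The function returns non-zero whenever the new image did not become ready, even if the undo succeeds, so a calling script can stop the rest of the deployment.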
### Option 2: Docker Compose Deployment
```bash
cd explorer-monorepo/deployment
# Start services
docker-compose up -d
# Verify services
docker-compose ps
docker-compose logs -f api-server
```
---
## Health Checks
### 1. API Health Endpoint
```bash
# Check health
curl https://api.d-bis.org/health
# Expected response (timestamp will differ):
# {
#   "status": "ok",
#   "timestamp": "2024-01-01T00:00:00Z",
#   "database": "connected"
# }
```
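Right after a deployment the health endpoint may fail for a short time while pods start, so a single `curl` can give a false alarm. A small polling helper, sketched below, retries a command a fixed number of times. The name `retry` and its argument order are assumptions.

```bash
# Re-run a command until it succeeds, up to a maximum number of attempts.
# Arguments: attempts, delay in seconds, then the command and its arguments.
# The parenthesised body runs in a subshell so its counters do not leak.
retry() (
  attempts=$1
  delay=$2
  shift 2
  n=1
  while ! "$@"; do
    [ "$n" -ge "$attempts" ] && return 1
    n=$((n + 1))
    sleep "$delay"
  done
)

# Usage: poll the health endpoint 10 times, 3 seconds apart.
# curl -f makes HTTP error responses count as failures.
#   retry 10 3 curl -fsS https://api.d-bis.org/health
```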
### 2. Service Health
```bash
# Kubernetes
kubectl get pods
kubectl describe pod <pod-name>
# Docker
docker ps
docker inspect <container-id>
```
### 3. Database Connectivity
```bash
# From API server
curl -s https://api.d-bis.org/health | jq .database
# Direct check
psql -h $DB_HOST -U $DB_USER -d $DB_NAME -c "SELECT COUNT(*) FROM blocks;"
```
### 4. Redis Connectivity
```bash
# Test Redis
redis-cli -u $REDIS_URL ping
# Check cache stats
redis-cli -u $REDIS_URL INFO stats
```
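The `INFO stats` output includes the `keyspace_hits` and `keyspace_misses` counters, from which a cache hit rate can be derived. The sketch below computes it as a percentage; the function name `redis_hit_rate` is an assumption, and the counters are cumulative since the Redis server started.

```bash
# Read `INFO stats` output on stdin and print the hit rate as a percentage,
# or "n/a" when there have been no lookups yet.
redis_hit_rate() {
  tr -d '\r' | awk -F: '
    $1 == "keyspace_hits"   { hits = $2 }
    $1 == "keyspace_misses" { misses = $2 }
    END {
      total = hits + misses
      if (total == 0) { print "n/a"; exit }
      printf "%.1f\n", 100 * hits / total
    }'
}

# Usage:
#   redis-cli -u "$REDIS_URL" INFO stats | redis_hit_rate
```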
---
## Rollback Procedures
### Quick Rollback (Kubernetes)
```bash
# Rollback to previous version
kubectl rollout undo deployment/api-server
kubectl rollout undo deployment/indexer
# Verify rollback
kubectl rollout status deployment/api-server
```
### Database Rollback
```bash
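# Restore from backup. A plain pg_dump contains no DROP statements, so restore
# into an empty database; existing objects cause "already exists" errors.
psql -h $DB_HOST -U $DB_USER -d $DB_NAME < backup_YYYYMMDD_HHMMSS.sql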
# Or rollback migrations
cd explorer-monorepo/backend/database/migrations
go run migrate.go --down 1
```
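Because the backup file names embed a `YYYYMMDD_HHMMSS` timestamp, the newest backup is also the last one in sorted order. The sketch below picks it from a directory; the function name `latest_backup` is an assumption.

```bash
# Print the newest backup_*.sql file in a directory (default: current directory).
# Glob results are sorted, and the timestamp format sorts chronologically,
# so the last match is the most recent. Fails if no backup exists.
latest_backup() (
  newest=
  for f in "${1:-.}"/backup_*.sql; do
    [ -e "$f" ] && newest=$f
  done
  [ -n "$newest" ] && printf '%s\n' "$newest"
)

# Usage:
#   psql -h "$DB_HOST" -U "$DB_USER" -d "$DB_NAME" < "$(latest_backup)"
```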
### Full Rollback
```bash
# 1. Stop new services
kubectl scale deployment/api-server --replicas=0
kubectl scale deployment/indexer --replicas=0
# 2. Restore database
psql -h $DB_HOST -U $DB_USER -d $DB_NAME < backup_YYYYMMDD_HHMMSS.sql
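# 3. Start previous version
kubectl set image deployment/api-server api-server=registry.example.com/explorer-api:v1.0.0
kubectl scale deployment/api-server --replicas=3
# 4. Bring the indexer back (it was scaled to 0 in step 1; use your normal replica count)
kubectl scale deployment/indexer --replicas=1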
```
---
## Post-Deployment Verification
### 1. Functional Tests
```bash
# Test Track 1 endpoints (public)
curl https://api.d-bis.org/api/v1/track1/blocks/latest
# Test search
curl "https://api.d-bis.org/api/v1/search?q=1000"
# Test health
curl https://api.d-bis.org/health
```
### 2. Performance Tests
```bash
# Load test
ab -n 1000 -c 10 https://api.d-bis.org/api/v1/track1/blocks/latest
# Check response times
curl -w "@curl-format.txt" -o /dev/null -s https://api.d-bis.org/api/v1/track1/blocks/latest
```
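The response-time command above reads its output template from `curl-format.txt`, which must exist in the working directory. A possible version of that file is sketched below using standard curl `--write-out` variables; the choice of timings and the layout are assumptions.

```bash
# Create the template read by `curl -w "@curl-format.txt"`.
# Each %{...} is a curl write-out variable, reported in seconds.
cat > curl-format.txt <<'EOF'
    time_namelookup:  %{time_namelookup}s\n
       time_connect:  %{time_connect}s\n
    time_appconnect:  %{time_appconnect}s\n
 time_starttransfer:  %{time_starttransfer}s\n
         time_total:  %{time_total}s\n
EOF
```

`time_starttransfer` (time to first byte) is usually the most useful figure for spotting a slow backend, while `time_total` also includes the download itself.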
### 3. Monitoring
- [ ] Check Grafana dashboards
- [ ] Verify Prometheus metrics
- [ ] Check error rates
- [ ] Monitor response times
- [ ] Check database connection pool
- [ ] Verify Redis cache hit rate
---
## Troubleshooting
### Common Issues
#### 1. Database Connection Errors
**Symptoms**: 500 errors, "database connection failed"
**Resolution**:
```bash
# Check database status
psql -h $DB_HOST -U $DB_USER -d $DB_NAME -c "SELECT 1;"
# Check connection pool
# Review database/migrations for connection pool settings
# Restart service
kubectl rollout restart deployment/api-server
```
#### 2. Redis Connection Errors
**Symptoms**: Cache misses, rate limiting not working
**Resolution**:
```bash
# Test Redis connection
redis-cli -u $REDIS_URL ping
# Check Redis logs
kubectl logs -l app=redis
# Fallback to in-memory (temporary): remove REDIS_URL from the environment
kubectl set env deployment/api-server REDIS_URL-
```
#### 3. High Memory Usage
**Symptoms**: OOM kills, slow responses
**Resolution**:
```bash
# Check memory usage
kubectl top pods
# Increase memory limits
kubectl set resources deployment/api-server --limits=memory=2Gi
# Review cache TTL settings
```
#### 4. Slow Response Times
**Symptoms**: High latency, timeout errors
**Resolution**:
```bash
# Check database query performance
psql -h $DB_HOST -U $DB_USER -d $DB_NAME -c "EXPLAIN ANALYZE SELECT * FROM blocks LIMIT 10;"
# Check indexer lag
curl https://api.d-bis.org/api/v1/track2/stats
# Review connection pool settings
```
---
## Emergency Procedures
### Service Outage
1. **Immediate Actions**:
   - Check service status: `kubectl get pods`
   - Check logs: `kubectl logs -f deployment/api-server`
   - Check database: `psql -h $DB_HOST -U $DB_USER -d $DB_NAME -c "SELECT 1;"`
   - Check Redis: `redis-cli -u $REDIS_URL ping`
2. **Quick Recovery**:
   - Restart services: `kubectl rollout restart deployment/api-server`
   - Scale up: `kubectl scale deployment/api-server --replicas=5`
   - Rollback if needed: `kubectl rollout undo deployment/api-server`
3. **Communication**:
   - Update status page
   - Notify team via Slack/email
   - Document incident
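
During an outage it is easy to skip one of the immediate checks. A small wrapper, sketched below, runs each check and prints a one-line PASS or FAIL so the results can be pasted into the incident channel. The helper name `check` and the output format are assumptions.

```bash
# Run a command quietly and report PASS or FAIL with a label.
# Returns non-zero when the command fails.
check() {
  label=$1
  shift
  if "$@" >/dev/null 2>&1; then
    printf 'PASS  %s\n' "$label"
  else
    printf 'FAIL  %s\n' "$label"
    return 1
  fi
}

# Usage, mirroring the immediate actions above. Note that `kubectl get pods`
# only proves the cluster API answers, not that the pods are healthy.
#   check "cluster"  kubectl get pods
#   check "database" psql -h "$DB_HOST" -U "$DB_USER" -d "$DB_NAME" -c "SELECT 1;"
#   check "redis"    redis-cli -u "$REDIS_URL" ping
```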
### Data Corruption
1. **Immediate Actions**:
   - Stop the API server: `kubectl scale deployment/api-server --replicas=0`
   - Stop the indexer so it does not keep writing: `kubectl scale deployment/indexer --replicas=0`
   - Backup current state: `pg_dump -h $DB_HOST -U $DB_USER -d $DB_NAME > emergency_backup.sql`
2. **Recovery**:
   - Restore from last known good backup
   - Verify data integrity
   - Resume services
---
## Maintenance Windows
### Scheduled Maintenance
1. **Pre-Maintenance**:
   - Notify users 24 hours in advance
   - Create maintenance mode flag
   - Prepare rollback plan
2. **During Maintenance**:
   - Enable maintenance mode
   - Perform updates
   - Run health checks
3. **Post-Maintenance**:
   - Disable maintenance mode
   - Verify all services
   - Monitor for issues
---
## Contact Information
- **On-Call Engineer**: Check PagerDuty
- **Slack Channel**: #explorer-deployments
- **Emergency**: [Emergency Contact]
---
**Document Version**: 1.0.0
**Last Reviewed**: 2026-01-31
**Next Review**: 3 months after the last review