# Deployment Runbook

**Last Updated:** 2026-01-31
**Document Version:** 1.0
**Status:** Active Documentation

---

## SolaceScanScout Explorer - Production Deployment Guide

**Version**: 1.0.0

---

## Table of Contents

1. [Pre-Deployment Checklist](#pre-deployment-checklist)
2. [Environment Setup](#environment-setup)
3. [Database Migration](#database-migration)
4. [Service Deployment](#service-deployment)
5. [Health Checks](#health-checks)
6. [Rollback Procedures](#rollback-procedures)
7. [Post-Deployment Verification](#post-deployment-verification)
8. [Troubleshooting](#troubleshooting)
9. [Emergency Procedures](#emergency-procedures)
10. [Maintenance Windows](#maintenance-windows)
11. [Contact Information](#contact-information)

---

## Pre-Deployment Checklist

### Infrastructure Requirements

- [ ] Kubernetes cluster (AKS) or VM infrastructure ready
- [ ] PostgreSQL 16+ with TimescaleDB extension
- [ ] Redis cluster (for production cache/rate limiting)
- [ ] Elasticsearch/OpenSearch cluster
- [ ] Load balancer configured
- [ ] SSL certificates provisioned
- [ ] DNS records configured
- [ ] Monitoring stack deployed (Prometheus, Grafana)

### Configuration

- [ ] Environment variables configured
- [ ] Secrets stored in Key Vault
- [ ] Database credentials verified
- [ ] Redis connection string verified
- [ ] RPC endpoint URLs verified
- [ ] JWT secret configured (strong random value)

### Code & Artifacts

- [ ] All tests passing
- [ ] Docker images built and tagged
- [ ] Images pushed to container registry
- [ ] Database migrations reviewed
- [ ] Rollback plan documented

---

## Environment Setup

### 1. Set Environment Variables

```bash
# Database
export DB_HOST=postgres.example.com
export DB_PORT=5432
export DB_USER=explorer
export DB_PASSWORD="<db-password>"   # placeholder; load from Key Vault, never commit
export DB_NAME=explorer

# Redis (for production)
export REDIS_URL=redis://redis.example.com:6379

# RPC
export RPC_URL=https://rpc.d-bis.org
export WS_URL=wss://rpc.d-bis.org

# Application
export CHAIN_ID=138
export PORT=8080
export JWT_SECRET="<jwt-secret>"     # placeholder; strong random value from Key Vault

# Optional
export LOG_LEVEL=info
export ENABLE_METRICS=true
```

### 2. Verify Secrets

```bash
# Test database connection
# (psql prompts for the password unless PGPASSWORD or ~/.pgpass is set)
psql -h "$DB_HOST" -U "$DB_USER" -d "$DB_NAME" -c "SELECT 1;"

# Test Redis connection
redis-cli -u "$REDIS_URL" ping

# Test RPC endpoint
curl -X POST "$RPC_URL" \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}'
```

---

## Database Migration

### 1. Backup Existing Database

```bash
# Create backup
pg_dump -h "$DB_HOST" -U "$DB_USER" -d "$DB_NAME" > "backup_$(date +%Y%m%d_%H%M%S).sql"

# Verify backup
ls -lh backup_*.sql
```

### 2. Run Migrations

```bash
cd explorer-monorepo/backend/database/migrations

# Review pending migrations
go run migrate.go --status

# Run migrations
go run migrate.go --up

# Verify migration
go run migrate.go --status
```

### 3. Verify Schema

```bash
psql -h "$DB_HOST" -U "$DB_USER" -d "$DB_NAME" -c "\dt"
psql -h "$DB_HOST" -U "$DB_USER" -d "$DB_NAME" -c "\d blocks"
psql -h "$DB_HOST" -U "$DB_USER" -d "$DB_NAME" -c "\d transactions"
```

---

## Service Deployment

### Option 1: Kubernetes Deployment

#### 1. Deploy API Server

```bash
kubectl apply -f k8s/api-server-deployment.yaml
kubectl apply -f k8s/api-server-service.yaml
kubectl apply -f k8s/api-server-ingress.yaml

# Verify deployment
kubectl get pods -l app=api-server
kubectl logs -f deployment/api-server
```

#### 2. Deploy Indexer

```bash
kubectl apply -f k8s/indexer-deployment.yaml

# Verify deployment
kubectl get pods -l app=indexer
kubectl logs -f deployment/indexer
```

#### 3. Rolling Update

```bash
# Update image
kubectl set image deployment/api-server api-server=registry.example.com/explorer-api:v1.1.0

# Monitor rollout
kubectl rollout status deployment/api-server

# Rollback if needed
kubectl rollout undo deployment/api-server
```

### Option 2: Docker Compose Deployment

```bash
cd explorer-monorepo/deployment

# Start services
docker-compose up -d

# Verify services
docker-compose ps
docker-compose logs -f api-server
```

---

## Health Checks

### 1. API Health Endpoint

```bash
# Check health
curl https://api.d-bis.org/health

# Expected response:
# {
#   "status": "ok",
#   "timestamp": "2024-01-01T00:00:00Z",
#   "database": "connected"
# }
```

### 2. Service Health

```bash
# Kubernetes
kubectl get pods
kubectl describe pod <pod-name>

# Docker
docker ps
docker inspect <container-name>
```

### 3. Database Connectivity

```bash
# From API server
curl -s https://api.d-bis.org/health | jq .database

# Direct check
psql -h "$DB_HOST" -U "$DB_USER" -d "$DB_NAME" -c "SELECT COUNT(*) FROM blocks;"
```

### 4. Redis Connectivity

```bash
# Test Redis
redis-cli -u "$REDIS_URL" ping

# Check cache stats
redis-cli -u "$REDIS_URL" INFO stats
```

---

## Rollback Procedures

### Quick Rollback (Kubernetes)

```bash
# Rollback to previous version
kubectl rollout undo deployment/api-server
kubectl rollout undo deployment/indexer

# Verify rollback
kubectl rollout status deployment/api-server
kubectl rollout status deployment/indexer
```

### Database Rollback

```bash
# Restore from backup
# (a plain-format dump restores cleanly only into an empty database)
psql -h "$DB_HOST" -U "$DB_USER" -d "$DB_NAME" < backup_YYYYMMDD_HHMMSS.sql

# Or roll back the most recent migration
cd explorer-monorepo/backend/database/migrations
go run migrate.go --down 1
```

### Full Rollback

```bash
# 1. Stop new services
kubectl scale deployment/api-server --replicas=0
kubectl scale deployment/indexer --replicas=0

# 2. Restore database
psql -h "$DB_HOST" -U "$DB_USER" -d "$DB_NAME" < backup_YYYYMMDD_HHMMSS.sql

# 3. Start previous version
kubectl set image deployment/api-server api-server=registry.example.com/explorer-api:v1.0.0
kubectl scale deployment/api-server --replicas=3
# The indexer was also scaled to 0 in step 1; restore its previous image
# and replica count the same way (values depend on your deployment).
```

---

## Post-Deployment Verification

### 1. Functional Tests

```bash
# Test Track 1 endpoints (public)
curl https://api.d-bis.org/api/v1/track1/blocks/latest

# Test search (URL quoted so the shell does not treat "?" as a glob)
curl "https://api.d-bis.org/api/v1/search?q=1000"

# Test health
curl https://api.d-bis.org/health
```

### 2. Performance Tests

```bash
# Load test
ab -n 1000 -c 10 https://api.d-bis.org/api/v1/track1/blocks/latest

# Check response times (requires a curl-format.txt file in the current directory)
curl -w "@curl-format.txt" -o /dev/null -s https://api.d-bis.org/api/v1/track1/blocks/latest
```

### 3. Monitoring

- [ ] Check Grafana dashboards
- [ ] Verify Prometheus metrics
- [ ] Check error rates
- [ ] Monitor response times
- [ ] Check database connection pool
- [ ] Verify Redis cache hit rate

---

## Troubleshooting

### Common Issues

#### 1. Database Connection Errors

**Symptoms**: 500 errors, "database connection failed"

**Resolution**:

```bash
# Check database status
psql -h "$DB_HOST" -U "$DB_USER" -d "$DB_NAME" -c "SELECT 1;"

# Check connection pool
# Review database/migrations for connection pool settings

# Restart service
kubectl rollout restart deployment/api-server
```

#### 2. Redis Connection Errors

**Symptoms**: Cache misses, rate limiting not working

**Resolution**:

```bash
# Test Redis connection
redis-cli -u "$REDIS_URL" ping

# Check Redis logs
kubectl logs -l app=redis

# Fallback to in-memory (temporary)
# Remove REDIS_URL from environment
```

#### 3. High Memory Usage

**Symptoms**: OOM kills, slow responses

**Resolution**:

```bash
# Check memory usage
kubectl top pods

# Increase memory limits
kubectl set resources deployment/api-server --limits=memory=2Gi

# Review cache TTL settings
```

#### 4. Slow Response Times

**Symptoms**: High latency, timeout errors

**Resolution**:

```bash
# Check database query performance
psql -h "$DB_HOST" -U "$DB_USER" -d "$DB_NAME" -c "EXPLAIN ANALYZE SELECT * FROM blocks LIMIT 10;"

# Check indexer lag
curl https://api.d-bis.org/api/v1/track2/stats

# Review connection pool settings
```

---

## Emergency Procedures

### Service Outage

1. **Immediate Actions**:
   - Check service status: `kubectl get pods`
   - Check logs: `kubectl logs -f deployment/api-server`
   - Check database: `psql -h "$DB_HOST" -U "$DB_USER" -d "$DB_NAME" -c "SELECT 1;"`
   - Check Redis: `redis-cli -u "$REDIS_URL" ping`

2. **Quick Recovery**:
   - Restart services: `kubectl rollout restart deployment/api-server`
   - Scale up: `kubectl scale deployment/api-server --replicas=5`
   - Rollback if needed: `kubectl rollout undo deployment/api-server`

3. **Communication**:
   - Update status page
   - Notify team via Slack/email
   - Document incident

### Data Corruption

1. **Immediate Actions**:
   - Stop writes: `kubectl scale deployment/api-server --replicas=0` and `kubectl scale deployment/indexer --replicas=0`
   - Backup current state: `pg_dump -h "$DB_HOST" -U "$DB_USER" -d "$DB_NAME" > emergency_backup.sql`

2. **Recovery**:
   - Restore from last known good backup
   - Verify data integrity
   - Resume services

---

## Maintenance Windows

### Scheduled Maintenance

1. **Pre-Maintenance**:
   - Notify users 24 hours in advance
   - Create maintenance mode flag
   - Prepare rollback plan

2. **During Maintenance**:
   - Enable maintenance mode
   - Perform updates
   - Run health checks

3. **Post-Maintenance**:
   - Disable maintenance mode
   - Verify all services
   - Monitor for issues

---

## Contact Information

- **On-Call Engineer**: Check PagerDuty
- **Slack Channel**: #explorer-deployments
- **Emergency**: [Emergency Contact]

---

**Document Version**: 1.0.0
**Review Cadence**: every 3 months
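
The health checks in this runbook are run by hand, which makes it easy to declare a deployment healthy before the API has finished starting. The following is a minimal sketch of how the check against `https://api.d-bis.org/health` could be polled until it passes. The helper names `retry`, `health_is_ok`, and `check_health` are hypothetical and not part of the project; the sketch assumes `curl` and `jq` are installed and that the response has the `status` and `database` fields shown in the Health Checks section.

```shell
#!/bin/sh
# Sketch only: these helpers are illustrative, not part of the explorer codebase.

# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS runs have failed.
retry() {
  retry_attempts=$1
  retry_delay=$2
  shift 2
  retry_i=1
  while [ "$retry_i" -le "$retry_attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep only between attempts, not after the last one.
    if [ "$retry_i" -lt "$retry_attempts" ]; then
      sleep "$retry_delay"
    fi
    retry_i=$((retry_i + 1))
  done
  return 1
}

# health_is_ok JSON
# Succeeds when the payload reports status "ok" and database "connected".
health_is_ok() {
  printf '%s' "$1" | jq -e '.status == "ok" and .database == "connected"' >/dev/null
}

# check_health URL
# Fetches the health endpoint once; fails on HTTP errors, timeouts, or a bad payload.
check_health() {
  health_body=$(curl -fsS --max-time 5 "$1") || return 1
  health_is_ok "$health_body"
}
```

For example, `retry 30 10 check_health https://api.d-bis.org/health` would poll up to 30 times, 10 seconds apart, and return non-zero if the service never reports healthy, which makes it usable as a gate before the functional tests or as the trigger for `kubectl rollout undo`.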