# Entra VerifiedID Operational Runbook

This runbook provides operational procedures for managing the Entra VerifiedID integration.

## Table of Contents

1. [Daily Operations](#daily-operations)
2. [Monitoring](#monitoring)
3. [Troubleshooting](#troubleshooting)
4. [Common Operations](#common-operations)
5. [Emergency Procedures](#emergency-procedures)
6. [Diagnostic Commands](#diagnostic-commands)
7. [Support Escalation](#support-escalation)
8. [Contact Information](#contact-information)

## Daily Operations

### Health Checks

**Check Service Health**
```bash
curl https://api.theorder.org/health
```

**Check Entra Client Status**
```bash
# Check logs for Entra client initialization
kubectl logs -n the-order-prod deployment/identity-service | grep -i entra
```

**Verify Metrics Collection**
```bash
curl https://api.theorder.org/metrics | grep entra
```

### Key Metrics to Monitor

1. **Issuance Success Rate**: Should be >95%
   ```promql
   # sum() drops the status label so the two sides match; without it the ratio is always 1
   sum(rate(entra_credentials_issued_total{status="success"}[5m])) /
   sum(rate(entra_credentials_issued_total[5m]))
   ```

2. **API Latency**: p95 should be <5 seconds
   ```promql
   histogram_quantile(0.95, sum by (le) (rate(entra_api_request_duration_seconds_bucket{operation="issueCredential"}[5m])))
   ```

3. **Error Rate**: Should be <5%
   ```promql
   sum(rate(entra_api_errors_total[5m])) / sum(rate(entra_api_requests_total[5m]))
   ```

4. **Webhook Processing**: Every callback should be received and processed, so the rate should stay above zero while credentials are being issued
   ```promql
   rate(entra_webhooks_received_total[5m])
   ```

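
For a quick spot check without Prometheus, the success ratio can be approximated straight from the `/metrics` output. The following is a minimal sketch, assuming the service exposes `entra_credentials_issued_total` in the Prometheus text format with a `status` label, as the query above implies. The function name `issuance_success_ratio` is hypothetical. It reports the ratio of the cumulative counters since process start, not the 5-minute rate.

```bash
# Reads Prometheus text-format metrics on stdin and prints
# success / total for entra_credentials_issued_total.
# Assumption: the sample value is the last field on each line (no timestamps).
issuance_success_ratio() {
  awk '
    /^entra_credentials_issued_total[{ ]/ {
      total += $NF
      if ($0 ~ /status="success"/) ok += $NF
    }
    END {
      if (total == 0) { print "no-data"; exit 1 }
      printf "%.4f\n", ok / total
    }
  '
}
```

Usage: `curl -s https://api.theorder.org/metrics | issuance_success_ratio`. A value below `0.9500` on a long-running pod suggests the 95% target has been missed at some point, but the PromQL query remains the source of truth for the current rate.
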
## Monitoring

### Grafana Dashboard

Access the Entra VerifiedID dashboard at: `https://grafana.theorder.org/d/entra-verifiedid`

**Key Panels:**
- Issuance Success Rate (gauge)
- API Request Rate (graph)
- Error Rate by Operation (graph)
- Issuance Duration (histogram)
- Webhook Events (graph)
- Active Requests (gauge)

### Alerts

**Critical Alerts:**
- `EntraIssuanceErrorRateHigh`: Error rate >10%
- `EntraIssuanceLatencyHigh`: p95 latency >10 seconds
- `EntraWebhookProcessingFailed`: Webhook processing failures
- `EntraAPIDown`: No successful API requests in 5 minutes

**Warning Alerts:**
- `EntraIssuanceErrorRateWarning`: Error rate >5%
- `EntraIssuanceLatencyWarning`: p95 latency >5 seconds
- `EntraRateLimitApproaching`: Rate limit usage >80%

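
When triaging by hand, it can help to map an observed error rate onto the same severities the alerts use. The sketch below only mirrors the thresholds listed above (warning above 5%, critical above 10%); the alert rule definitions themselves are not part of this runbook, and `classify_error_rate` is a hypothetical helper name.

```bash
# Maps an error rate (a fraction between 0 and 1) to an alert severity
# using the thresholds documented in the alert list.
classify_error_rate() {
  awk -v r="$1" 'BEGIN {
    if (r > 0.10)      print "critical"
    else if (r > 0.05) print "warning"
    else               print "ok"
  }'
}
```

For example, `classify_error_rate 0.07` prints `warning`, matching `EntraIssuanceErrorRateWarning`.
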
## Troubleshooting

### Issue: Credential Issuance Failing

**Symptoms:**
- High error rate in metrics
- 500 errors in logs
- No credentials being issued

**Diagnosis:**
```bash
# Check recent errors
kubectl logs -n the-order-prod deployment/identity-service --tail=100 | grep -i error

# Check Entra API connectivity
curl -X POST https://verifiedid.did.msidentity.com/v1.0/<tenant-id>/verifiableCredentials/createIssuanceRequest \
  -H "Authorization: Bearer <token>"

# Verify credentials
kubectl get secret -n the-order-prod entra-credentials -o yaml
```

**Solutions:**
1. Verify Entra credentials are correct
2. Check API permissions are granted
3. Verify credential manifest exists
4. Check network connectivity to Entra API
5. Review Entra service status in Azure Portal

### Issue: Webhooks Not Received

**Symptoms:**
- No webhook events in metrics
- Credentials stuck in "pending" status
- Database not updated

**Diagnosis:**
```bash
# Check webhook endpoint
curl -X POST https://api.theorder.org/vc/entra/webhook \
  -H "Content-Type: application/json" \
  -d '{"requestId":"test","requestStatus":"issuance_successful"}'

# Check webhook logs
kubectl logs -n the-order-prod deployment/identity-service | grep webhook

# Verify webhook URL in Entra
# Go to Azure Portal → Verified ID → Settings → Webhooks
```

**Solutions:**
1. Verify webhook URL is configured in Entra VerifiedID
2. Check webhook endpoint is accessible (firewall, ingress rules)
3. Verify webhook payload format matches expected schema
4. Check database connectivity
5. Review webhook processing logs

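
To support the payload check in solution 3, a rough presence test for the two fields used in the test payload above (`requestId` and `requestStatus`) can be done in plain shell. This is a sketch only: the full expected schema is not documented in this runbook, the helper name `webhook_payload_has_required_fields` is hypothetical, and a pattern match on key names is not a substitute for real JSON validation.

```bash
# Reads a JSON payload on stdin and succeeds only if both keys
# "requestId" and "requestStatus" appear as object keys.
webhook_payload_has_required_fields() {
  local payload field
  payload="$(cat)"
  for field in requestId requestStatus; do
    printf '%s' "$payload" | grep -Eq "\"$field\"[[:space:]]*:" || return 1
  done
}
```

Usage: pipe a captured payload into the function, for example `cat payload.json | webhook_payload_has_required_fields && echo "fields present"`, where `payload.json` is a file you saved from the webhook logs.
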
### Issue: High Latency

**Symptoms:**
- Slow credential issuance (>10 seconds)
- High p95/p99 latency metrics
- Timeout errors

**Diagnosis:**
```bash
# Check API request duration
kubectl logs -n the-order-prod deployment/identity-service | grep "duration"

# Check network latency to Entra
# (ICMP may be blocked along the path, so a ping timeout alone is not conclusive)
ping -c 4 verifiedid.did.msidentity.com

# Check retry attempts
kubectl logs -n the-order-prod deployment/identity-service | grep retry
```

**Solutions:**
1. Check network connectivity and latency
2. Verify Entra API is not experiencing issues
3. Review retry configuration (may be retrying too many times)
4. Check if rate limiting is causing delays
5. Consider increasing timeout values

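
Where ICMP is filtered, an HTTPS round trip measured with curl's `%{time_total}` write-out variable gives a usable latency figure. The sketch below assumes the 5-second and 10-second thresholds from the alert definitions; `latency_exceeds` is a hypothetical helper. Note that it times a single HTTPS request to the API host, which is only a proxy for issuance latency.

```bash
# Succeeds (exit 0) when the measured time in seconds is above the limit.
latency_exceeds() {
  awk -v t="$1" -v limit="$2" 'BEGIN { exit !(t > limit) }'
}

# Example (not run automatically):
#   t="$(curl -s -o /dev/null -w '%{time_total}' https://verifiedid.did.msidentity.com/v1.0/)"
#   latency_exceeds "$t" 5 && echo "round trip slower than 5s: ${t}s"
```
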
### Issue: Rate Limit Errors

**Symptoms:**
- 429 errors in logs
- Rate limit metrics showing violations
- Requests being rejected

**Diagnosis:**
```bash
# Check rate limit violations
kubectl logs -n the-order-prod deployment/identity-service | grep "429"

# Check current rate limit settings
kubectl get configmap -n the-order-prod identity-service-config -o yaml | grep ENTRA_RATE_LIMIT
```

**Solutions:**
1. Review current rate limit configuration
2. Check Entra API quota limits
3. Adjust rate limits if needed
4. Implement request queuing if necessary
5. Contact Entra support if quota needs increase

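
When replaying requests by hand during a rate-limit incident, spacing the retries out avoids making the 429s worse. The following is a minimal sketch of capped exponential backoff for ad hoc operator scripts. It does not describe how the identity service itself retries, which this runbook does not document, and both function names are hypothetical.

```bash
# Prints the delay in seconds for a given attempt (1-based):
# base * 2^(attempt-1), capped at cap. Defaults: base=1, cap=30.
backoff_delay() {
  local attempt="$1"
  local base="${2:-1}"
  local cap="${3:-30}"
  local delay=$(( base * (1 << (attempt - 1)) ))
  if [ "$delay" -gt "$cap" ]; then delay="$cap"; fi
  echo "$delay"
}

# Runs a command up to max times, sleeping between failed attempts.
# Usage: retry_with_backoff <max-attempts> <command> [args...]
retry_with_backoff() {
  local max="$1"
  shift
  local attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max" ]; then return 1; fi
    sleep "$(backoff_delay "$attempt")"
    attempt=$(( attempt + 1 ))
  done
}
```

For example, `retry_with_backoff 5 curl -fsS https://api.theorder.org/health` waits 1, 2, 4, and 8 seconds between attempts before giving up.
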
### Issue: Token Refresh Failures

**Symptoms:**
- "Failed to get access token" errors
- Authentication failures
- 401 errors

**Diagnosis:**
```bash
# Check token refresh logs
kubectl logs -n the-order-prod deployment/identity-service | grep "token"

# Verify credentials
# (this prints the decoded client secret; run it only in a trusted, unrecorded session)
kubectl get secret -n the-order-prod entra-credentials -o jsonpath='{.data.ENTRA_CLIENT_SECRET}' | base64 -d
```

**Solutions:**
1. Verify client secret is correct and not expired
2. Check API permissions are granted
3. Verify tenant ID and client ID are correct
4. Check if client secret needs rotation
5. Review Azure AD app registration status

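
To reproduce a token failure outside the service, the client-credentials request can be issued by hand. The sketch below is written under several assumptions: the variable name `ENTRA_CLIENT_ID` is hypothetical (only `ENTRA_TENANT_ID` and `ENTRA_CLIENT_SECRET` appear in this runbook), the service authenticates with a client secret rather than a certificate, and the scope shown is the one Microsoft documents for the Verified ID Request Service. Confirm all three against your app registration before relying on the result.

```bash
# Builds the Microsoft identity platform v2.0 token endpoint for a tenant.
entra_token_endpoint() {
  printf 'https://login.microsoftonline.com/%s/oauth2/v2.0/token\n' "$1"
}

# Requests an access token with the client-credentials grant.
# Fails early if any of the three variables is unset or empty.
get_entra_token() {
  curl -sS -X POST "$(entra_token_endpoint "${ENTRA_TENANT_ID:?not set}")" \
    --data-urlencode "grant_type=client_credentials" \
    --data-urlencode "client_id=${ENTRA_CLIENT_ID:?not set}" \
    --data-urlencode "client_secret=${ENTRA_CLIENT_SECRET:?not set}" \
    --data-urlencode "scope=3db474b9-6a0c-4840-96ac-1fceb342124f/.default"
}
```

A JSON response containing `access_token` suggests the credentials are valid; an `error` field such as `invalid_client` usually points at a wrong or expired secret.
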
## Common Operations

### Issue a Credential Manually

```bash
curl -X POST https://api.theorder.org/vc/issue/entra \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <token>" \
  -d '{
    "claims": {
      "email": "user@example.com",
      "name": "John Doe",
      "role": "member"
    },
    "manifestName": "default"
  }'
```

### Check Credential Status

```bash
curl https://api.theorder.org/vc/entra/status/<requestId> \
  -H "Authorization: Bearer <token>"
```

### Verify a Credential

```bash
curl -X POST https://api.theorder.org/vc/verify/entra \
  -H "Content-Type: application/json" \
  -d '{
    "credential": {
      "id": "vc:123",
      "type": ["VerifiableCredential"],
      "issuer": "did:web:...",
      "credentialSubject": {...},
      "proof": {...}
    }
  }'
```

### View Recent Issuances

```bash
# Query database
# (sh -c makes $DATABASE_URL expand inside the pod rather than in your local shell)
kubectl exec -n the-order-prod deployment/identity-service -- \
  sh -c 'psql "$DATABASE_URL" -c "SELECT * FROM verifiable_credentials ORDER BY created_at DESC LIMIT 10;"'
```

### Check Metrics

```bash
# Get all Entra metrics
curl https://api.theorder.org/metrics | grep entra_

# Get specific metric
curl https://api.theorder.org/metrics | grep entra_credentials_issued_total
```

### Rotate Client Secret

1. Create new client secret in Azure Portal
2. Update secret in Key Vault:
   ```bash
   az keyvault secret set --vault-name <keyvault> --name "entra-client-secret" --value "<new-secret>"
   ```
3. Restart identity service to pick up new secret
4. Verify service starts correctly
5. Test credential issuance
6. Delete old secret after verification

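
Steps 3 and 4 above can be sketched with `kubectl rollout`, assuming the service runs as the `identity-service` deployment in `the-order-prod` (as in the other commands in this runbook) and reads the secret at startup. The `KUBECTL` override and the function name are hypothetical conveniences that allow a dry run with `KUBECTL=echo`.

```bash
# Use the real kubectl unless the caller overrides it (e.g. KUBECTL=echo for a dry run).
KUBECTL="${KUBECTL:-kubectl}"
NAMESPACE="the-order-prod"
DEPLOYMENT="deployment/identity-service"

# Triggers a rolling restart, then waits until the new pods are ready
# or the timeout expires. Returns non-zero if either step fails.
restart_identity_service() {
  $KUBECTL rollout restart "$DEPLOYMENT" -n "$NAMESPACE" &&
    $KUBECTL rollout status "$DEPLOYMENT" -n "$NAMESPACE" --timeout=120s
}
```

A rolling restart keeps old pods serving until the new ones pass readiness, so issuance should continue during the rotation as long as the old secret has not yet been deleted.
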
### Add New Credential Manifest

1. Create manifest in Azure Portal → Verified ID
2. Note the Manifest ID
3. Update `ENTRA_MANIFESTS` environment variable:
   ```bash
   ENTRA_MANIFESTS='{"default":"id1","new-manifest":"new-id"}'
   ```
4. Restart identity service
5. Test issuance with new manifest:
   ```bash
   curl -X POST .../vc/issue/entra -d '{"claims": {...}, "manifestName": "new-manifest"}'
   ```

## Emergency Procedures

### Disable Entra Integration

If critical issues occur:

1. **Scale down identity service** (if using separate deployment):
   ```bash
   kubectl scale deployment identity-service -n the-order-prod --replicas=0
   ```

2. **Or disable Entra routes** by setting:
   ```bash
   ENTRA_TENANT_ID=""
   ```

3. **Verify routes are disabled**:
   ```bash
   # -i prints the response status line; the issuance route expects POST
   curl -i -X POST https://api.theorder.org/vc/issue/entra
   # Should return 503 or route not found
   ```

4. **Monitor for stability**

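
The verification in step 3 can be scripted by checking the HTTP status code alone. The sketch below treats 503 and 404 as "disabled", following the expectation stated above; whether the service actually answers 404 for "route not found" is an assumption, and `route_disabled` is a hypothetical helper name.

```bash
# Succeeds when the status code indicates the Entra route is disabled
# (503 Service Unavailable or 404 Not Found).
route_disabled() {
  case "$1" in
    404|503) return 0 ;;
    *)       return 1 ;;
  esac
}

# Example (not run automatically):
#   code="$(curl -s -o /dev/null -w '%{http_code}' -X POST https://api.theorder.org/vc/issue/entra)"
#   route_disabled "$code" && echo "Entra routes are disabled (HTTP $code)"
```
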
### Rollback Deployment

1. Identify previous working version
2. Rollback deployment:
   ```bash
   kubectl rollout undo deployment/identity-service -n the-order-prod
   ```
3. Verify rollback:
   ```bash
   kubectl rollout status deployment/identity-service -n the-order-prod
   ```
4. Test critical functionality
5. Monitor metrics

### Emergency Credential Issuance

If automated issuance fails, use the manual process:

1. Access Entra VerifiedID portal directly
2. Issue credential manually
3. Export credential data
4. Import into database if needed
5. Notify affected users

## Diagnostic Commands

### Check Service Status
```bash
kubectl get pods -n the-order-prod -l app=identity-service
kubectl describe pod <pod-name> -n the-order-prod
```

### View Logs
```bash
# Recent logs
kubectl logs -n the-order-prod deployment/identity-service --tail=100

# Follow logs
kubectl logs -n the-order-prod deployment/identity-service -f

# Logs with grep
kubectl logs -n the-order-prod deployment/identity-service | grep -i entra
```

### Check Configuration
```bash
# Environment variables
kubectl exec -n the-order-prod deployment/identity-service -- env | grep ENTRA

# ConfigMap
kubectl get configmap -n the-order-prod identity-service-config -o yaml

# Secrets (base64 encoded)
kubectl get secret -n the-order-prod entra-credentials -o yaml
```

### Test Connectivity
```bash
# Test Entra API
curl -v https://verifiedid.did.msidentity.com/v1.0/

# Test webhook endpoint
curl -X POST https://api.theorder.org/vc/entra/webhook \
  -H "Content-Type: application/json" \
  -d '{"requestId":"test","requestStatus":"issuance_successful"}'
```

## Support Escalation

1. **Level 1**: Check logs, metrics, and run diagnostic commands
2. **Level 2**: Review configuration and test connectivity
3. **Level 3**: Contact Azure support for Entra VerifiedID issues
4. **Level 4**: Escalate to engineering team for code issues

## Contact Information

- **On-Call Engineer**: [Contact Info]
- **Azure Support**: [Azure Portal](https://portal.azure.com/#blade/Microsoft_Azure_Support/HelpAndSupportBlade)
- **Entra Documentation**: [Microsoft Learn](https://learn.microsoft.com/en-us/azure/active-directory/verifiable-credentials/)

---

**Last Updated**: [Current Date]
**Version**: 1.0