the_order/docs/deployment/DEPLOYMENT_STEPS_SUMMARY.md
defiQUG 8649ad4124 feat: implement naming convention, deployment automation, and infrastructure updates
- Add comprehensive naming convention (provider-region-resource-env-purpose)
- Implement Terraform locals for centralized naming
- Update all Terraform resources to use new naming convention
- Create deployment automation framework (18 phase scripts)
- Add Azure setup scripts (provider registration, quota checks)
- Update deployment scripts config with naming functions
- Create complete deployment documentation (guide, steps, quick reference)
- Add frontend portal implementations (public and internal)
- Add UI component library (18 components)
- Enhance Entra VerifiedID integration with file utilities
- Add API client package for all services
- Create comprehensive documentation (naming, deployment, next steps)

Infrastructure:
- Resource groups, storage accounts with new naming
- Terraform configuration updates
- Outputs with naming convention examples

Deployment:
- Automated deployment scripts for all 15 phases
- State management and logging
- Error handling and validation

Documentation:
- Naming convention guide and implementation summary
- Complete deployment guide (296 steps)
- Next steps and quick start guides
- Azure prerequisites and setup completion docs

Note: ESLint warnings present - will be addressed in follow-up commit
2025-11-12 08:22:51 -08:00

# Deployment Steps Summary - Ordered by Execution Sequence
**Last Updated**: 2025-01-27
**Purpose**: Complete list of all deployment steps grouped by execution order

---
## Overview
This document lists every deployment step in the order it must be executed. Steps are grouped into phases that run sequentially, except where a phase is marked as able to run in parallel with another.

**Total Phases**: 15
**Estimated Total Time**: 8-12 weeks (with parallelization)

---
## Phase 1: Prerequisites ⚙️
**Duration**: 1-2 days
**Can Run In Parallel**: No
**Dependencies**: None
### 1.1 Development Environment
1. Install Node.js >= 18.0.0
2. Install pnpm >= 8.0.0
3. Install Azure CLI
4. Install Terraform >= 1.5.0
5. Install kubectl
6. Install Docker (for local dev)
7. Clone repository
8. Initialize git submodules
9. Install dependencies (`pnpm install`)
10. Build all packages (`pnpm build`)
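
The minimum versions above can be checked before cloning and building. The following is a minimal sketch, not part of the repository: the helper names `version_ge` and `check_tool` are hypothetical, and it assumes a `sort` that supports `-V` (version sort), as GNU coreutils does.

```shell
# Hypothetical helpers (not from the repository).
# Succeed if version $1 is greater than or equal to version $2.
# A leading "v" (as printed by `node --version`) is stripped first.
version_ge() {
  have=${1#v}
  want=${2#v}
  # With version sort, the minimum must sort first (or be equal).
  lowest=$(printf '%s\n%s\n' "$have" "$want" | sort -V | head -n 1)
  [ "$lowest" = "$want" ]
}

# Print one status line for a tool; return non-zero when it is too old.
check_tool() {
  # $1 = tool name, $2 = installed version, $3 = minimum version
  if version_ge "$2" "$3"; then
    echo "OK: $1 $2 >= $3"
  else
    echo "TOO OLD: $1 $2 < $3"
    return 1
  fi
}
```

Typical calls would be `check_tool node "$(node --version)" 18.0.0` and `check_tool pnpm "$(pnpm --version)" 8.0.0`.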
### 1.2 Azure Account
11. Create Azure subscription (if needed)
12. Login to Azure CLI (`az login`)
13. Set active subscription
14. Verify permissions (Contributor/Owner role)
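
Steps 13 and 14 might look like the sketch below. The function names are hypothetical. By default `az role assignment list` reports assignments at the subscription scope only, so a role inherited from a management group would need `--include-inherited`.

```shell
# Hypothetical helpers (not from the repository).
# Select the subscription to deploy into and show which one is active.
use_subscription() {
  # $1 = subscription name or ID
  az account set --subscription "$1" &&
    az account show --query '{name:name, id:id}' --output table
}

# Print the role names assigned to a principal on the active subscription.
list_roles() {
  # $1 = user principal name or object ID
  az role assignment list --assignee "$1" \
    --query '[].roleDefinitionName' --output tsv
}

# Succeed only if the role list on stdin contains Contributor or Owner.
has_deploy_role() {
  grep -E -q '^(Contributor|Owner)$'
}
```

After `az login`, `list_roles "$USER_ID" | has_deploy_role` returns success when the account holds one of the two roles; `USER_ID` is a placeholder for your own principal.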
### 1.3 Local Services (Optional)
15. Start Docker Compose services (PostgreSQL, Redis, OpenSearch)
---
## Phase 2: Azure Infrastructure Setup 🏗️
**Duration**: 4-6 weeks
**Can Run In Parallel**: Yes (with Phase 3)
**Dependencies**: Phase 1
### 2.1 Azure Subscription Preparation
16. Run `./infra/scripts/azure-setup.sh`
17. Run `./infra/scripts/azure-register-providers.sh`
18. Run `./infra/scripts/azure-check-quotas.sh`
19. Review quota reports
20. Verify all 13 resource providers registered
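
Step 20 can be verified from the command line. This sketch assumes nothing about which 13 providers `azure-register-providers.sh` registers; the namespaces are passed in by the caller and `check_providers` is a hypothetical name.

```shell
# Hypothetical helper (not from the repository).
# Print the registration state of each resource provider namespace given as
# an argument; return non-zero if any of them is not "Registered".
check_providers() {
  missing=0
  for ns in "$@"; do
    state=$(az provider show --namespace "$ns" \
      --query registrationState --output tsv)
    echo "$ns: $state"
    [ "$state" = "Registered" ] || missing=$((missing + 1))
  done
  [ "$missing" -eq 0 ]
}
```

For example, `check_providers Microsoft.ContainerService Microsoft.KeyVault Microsoft.Storage` checks three namespaces that this deployment plausibly needs; substitute the full list from the setup script.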
### 2.2 Terraform Infrastructure
21. Navigate to `infra/terraform`
22. Run `terraform init`
23. Create Terraform state storage (resource group, storage account, container)
24. Configure remote state backend in `versions.tf`
25. Re-initialize Terraform with `terraform init -migrate-state`
26. Run `terraform plan`
27. Deploy resource groups
28. Deploy storage accounts
29. Deploy AKS cluster (configuration to be added)
30. Deploy Azure Database for PostgreSQL (configuration to be added)
31. Deploy Azure Key Vault (configuration to be added)
32. Deploy Azure Container Registry (configuration to be added)
33. Deploy Virtual Network (configuration to be added)
34. Deploy Application Gateway/Load Balancer (configuration to be added)
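
Steps 23 to 26 could be scripted roughly as follows. All resource names are caller-supplied placeholders and the function names are hypothetical. `--auth-mode login` assumes the signed-in identity has a blob data role on the new storage account.

```shell
# Hypothetical helpers (not from the repository).
# Create the resource group, storage account, and blob container that will
# hold Terraform remote state.
create_state_storage() {
  # $1 = resource group, $2 = location, $3 = storage account, $4 = container
  az group create --name "$1" --location "$2" &&
    az storage account create --name "$3" --resource-group "$1" \
      --location "$2" --sku Standard_LRS --min-tls-version TLS1_2 &&
    az storage container create --name "$4" --account-name "$3" \
      --auth-mode login
}

# After the backend block has been added to versions.tf, move the local
# state to it and save a plan for review.
migrate_and_plan() {
  terraform init -migrate-state &&
    terraform plan -out=tfplan
}
```

Run `migrate_and_plan` from `infra/terraform`; the saved `tfplan` file can then be passed to `terraform apply` so that exactly the reviewed plan is deployed.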
### 2.3 Kubernetes Configuration
35. Get AKS credentials (`az aks get-credentials`)
36. Verify cluster access (`kubectl get nodes`)
37. Verify Azure CNI networking (the network plugin is normally selected when the cluster is created)
38. Install External Secrets Operator
39. Configure Azure Key Vault Provider for Secrets Store CSI Driver
40. Attach ACR to AKS (`az aks update --attach-acr`)
41. Enable Azure Monitor for Containers
42. Configure Log Analytics workspace
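
A sketch of steps 35, 36, 38, 39, and 41 follows. The function names are hypothetical, and the Helm release name and namespace `external-secrets` are a common convention rather than anything the repository prescribes.

```shell
# Hypothetical helpers (not from the repository).
# Fetch kubeconfig credentials for the cluster and confirm the nodes respond.
connect_cluster() {
  # $1 = resource group, $2 = AKS cluster name
  az aks get-credentials --resource-group "$1" --name "$2" &&
    kubectl get nodes
}

# Install External Secrets Operator from its public Helm chart.
install_external_secrets() {
  helm repo add external-secrets https://charts.external-secrets.io &&
    helm repo update &&
    helm install external-secrets external-secrets/external-secrets \
      --namespace external-secrets --create-namespace
}

# Enable the Key Vault Secrets Store CSI add-on and Container insights.
enable_addons() {
  # $1 = resource group, $2 = AKS cluster name,
  # $3 = resource ID of the Log Analytics workspace
  az aks enable-addons --resource-group "$1" --name "$2" \
    --addons azure-keyvault-secrets-provider &&
    az aks enable-addons --resource-group "$1" --name "$2" \
      --addons monitoring --workspace-resource-id "$3"
}
```

The functions are independent, so any add-on already enabled through Terraform can simply be skipped.
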
---
## Phase 3: Entra ID Configuration 🔐
**Duration**: 1-2 days
**Can Run In Parallel**: Yes (with Phase 2)
**Dependencies**: Phase 1
### 3.1 Microsoft Entra ID (formerly Azure AD) App Registration
43. Create App Registration in Azure Portal
44. Note Application (client) ID
45. Note Directory (tenant) ID
46. Configure API permissions (Verifiable Credentials Service)
47. Grant admin consent for permissions
48. Create client secret
49. Save client secret securely (only shown once)
50. Configure redirect URIs for portals
51. Configure logout URLs
### 3.2 Microsoft Entra VerifiedID
52. Enable Verified ID service in Azure Portal
53. Wait for service activation
54. Create credential manifest
55. Define credential type
56. Define claims schema
57. Note Manifest ID
58. Verify Issuer DID format
59. Test DID resolution
### 3.3 Azure Logic Apps (Optional)
60. Create Logic App workflows (eIDAS, VC issuance, document processing)
61. Note workflow URLs
62. Generate access keys OR configure managed identity
63. Grant necessary permissions
64. Test workflow triggers
---
## Phase 4: Database & Storage Setup 💾
**Duration**: 1-2 days
**Dependencies**: Phase 2
### 4.1 PostgreSQL
65. Create databases (dev, stage, prod)
66. Create database users
67. Grant privileges
68. Configure firewall rules for AKS
69. Test database connection
### 4.2 Storage Accounts
70. Verify storage accounts created
71. Create container: `intake-documents`
72. Create container: `dataroom-deals`
73. Create container: `credentials`
74. Configure managed identity access
75. Configure CORS (if needed)
76. Enable versioning and soft delete
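
Steps 71 to 73 and 76 might be scripted as below. The function names are hypothetical and the default retention of 7 days is an assumption; choose a value that matches your retention policy.

```shell
# Hypothetical helpers (not from the repository).
# Create the three blob containers named in steps 71-73.
create_containers() {
  # $1 = storage account name
  for c in intake-documents dataroom-deals credentials; do
    az storage container create --name "$c" --account-name "$1" \
      --auth-mode login || return 1
  done
}

# Turn on blob versioning and blob soft delete.
enable_blob_protection() {
  # $1 = resource group, $2 = storage account name,
  # $3 = soft-delete retention in days (assumed default: 7)
  az storage account blob-service-properties update \
    --resource-group "$1" --account-name "$2" \
    --enable-versioning true \
    --enable-delete-retention true --delete-retention-days "${3:-7}"
}
```

Creating a container that already exists is not an error for `az storage container create`, so `create_containers` can be re-run safely.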
### 4.3 Redis Cache (If using Azure Cache)
77. Create Azure Cache for Redis (Terraform to be added)
78. Configure firewall rules
79. Set up access keys
80. Test connection
### 4.4 OpenSearch (If using managed service)
81. Create managed OpenSearch cluster (Terraform to be added)
82. Configure access
83. Set up indices
84. Test connection
---
## Phase 5: Container Registry Setup 📦
**Duration**: 1 day
**Dependencies**: Phase 2
### 5.1 Azure Container Registry
85. Verify ACR created
86. Configure managed identity access (preferred), or enable the admin user
87. Get ACR credentials
88. Attach ACR to AKS (`az aks update --attach-acr`), if not already done in step 40
89. Test ACR access from AKS
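
Steps 88 and 89 can be combined in one helper. This is a sketch with a hypothetical function name; it assumes the registry uses the default `<name>.azurecr.io` login server.

```shell
# Hypothetical helper (not from the repository).
# Grant the cluster pull access to the registry, then validate that the
# nodes can actually reach and authenticate against it.
attach_and_check_acr() {
  # $1 = resource group, $2 = AKS cluster name, $3 = ACR name
  az aks update --resource-group "$1" --name "$2" --attach-acr "$3" &&
    az aks check-acr --resource-group "$1" --name "$2" \
      --acr "$3.azurecr.io"
}
```

With the registry attached this way the kubelet identity pulls images directly, so the admin user and its credentials are not needed for the cluster.
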
---
## Phase 6: Application Build & Package 🔨
**Duration**: 2-4 hours
**Dependencies**: Phase 1, Phase 5
### 6.1 Build Packages
90. Build shared packages (`pnpm build`)
91. Build `@the-order/ui`
92. Build `@the-order/auth`
93. Build `@the-order/api-client`
94. Build `@the-order/database`
95. Build `@the-order/storage`
96. Build `@the-order/crypto`
97. Build `@the-order/schemas`
### 6.2 Build Frontend Apps
98. Build `portal-public`
99. Build `portal-internal`
### 6.3 Build Backend Services
100. Build `@the-order/identity`
101. Build `@the-order/intake`
102. Build `@the-order/finance`
103. Build `@the-order/dataroom`
### 6.4 Create Docker Images
104. Create `services/identity/Dockerfile` (to be created)
105. Create `services/intake/Dockerfile` (to be created)
106. Create `services/finance/Dockerfile` (to be created)
107. Create `services/dataroom/Dockerfile` (to be created)
108. Create `apps/portal-public/Dockerfile` (to be created)
109. Create `apps/portal-internal/Dockerfile` (to be created)
110. Login to ACR (`az acr login`)
111. Build and push `identity` image
112. Build and push `intake` image
113. Build and push `finance` image
114. Build and push `dataroom` image
115. Build and push `portal-public` image
116. Build and push `portal-internal` image
117. Sign all images with Cosign (security best practice)
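
Steps 110 to 116 repeat the same build and push for six images, which suggests a loop. The sketch below assumes the Dockerfiles exist at the paths listed in steps 104 to 109 and that each image is built with the repository root as its context (common in pnpm monorepos, but an assumption here). The function names are hypothetical.

```shell
# Hypothetical helpers (not from the repository).
# Compose a full image reference: <registry>.azurecr.io/<name>:<tag>.
image_ref() {
  # $1 = ACR name, $2 = image name, $3 = tag
  echo "$1.azurecr.io/$2:$3"
}

# Map an image name to the directory that holds its Dockerfile.
build_context() {
  case "$1" in
    portal-*) echo "apps/$1" ;;
    *)        echo "services/$1" ;;
  esac
}

# Build and push all six images; run from the repository root.
build_and_push_all() {
  # $1 = ACR name, $2 = tag (for example a git commit SHA)
  az acr login --name "$1" || return 1
  for name in identity intake finance dataroom portal-public portal-internal; do
    ref=$(image_ref "$1" "$name" "$2")
    docker build -f "$(build_context "$name")/Dockerfile" -t "$ref" . &&
      docker push "$ref" || return 1
  done
}
```

Tagging with an immutable value such as the commit SHA, rather than `latest`, makes it clear which build each deployment is running and gives signing a stable target.
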
---
## Phase 7: Database Migrations 🗄️
**Duration**: 1-2 hours
**Dependencies**: Phase 4, Phase 6
### 7.1 Run Migrations
118. Set `DATABASE_URL` for dev environment
119. Run migrations for dev (`pnpm --filter @the-order/database migrate up`)
120. Verify schema created (check tables)
121. Set `DATABASE_URL` for staging environment
122. Run migrations for staging
123. Verify schema created
124. Set `DATABASE_URL` for production environment
125. Run migrations for production
126. Verify schema created
127. Run seed scripts (if needed)
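
The dev, staging, production sequence in steps 118 to 126 can be enforced so that a failed migration stops promotion to the next environment. This is a minimal sketch with hypothetical function names; the connection strings are supplied by the caller and are never written to disk here.

```shell
# Hypothetical helpers (not from the repository).
# Run migrations for one environment; DATABASE_URL is set for this one
# command only.
migrate_env() {
  # $1 = environment label, $2 = connection string for that environment
  echo "Migrating $1"
  DATABASE_URL="$2" pnpm --filter @the-order/database migrate up
}

# Promote in order and stop at the first environment that fails.
migrate_all() {
  # $1, $2, $3 = connection strings for dev, staging, production
  migrate_env dev "$1" &&
    migrate_env staging "$2" &&
    migrate_env production "$3"
}
```

Verifying the schema between environments (steps 120, 123, 126) still needs a manual or scripted check; this sketch only guarantees ordering.
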
---
## Phase 8: Secrets Configuration 🔒
**Duration**: 2-4 hours
**Dependencies**: Phase 2, Phase 3
### 8.1 Store Secrets in Key Vault
128. Store `database-url-dev` in Key Vault
129. Store `database-url-stage` in Key Vault
130. Store `database-url-prod` in Key Vault
131. Store `entra-tenant-id` in Key Vault
132. Store `entra-client-id` in Key Vault
133. Store `entra-client-secret` in Key Vault
134. Store `entra-credential-manifest-id` in Key Vault
135. Store `storage-account-name` in Key Vault
136. Store `jwt-secret` in Key Vault
137. Store `kms-key-id` in Key Vault
138. Store `payment-gateway-api-key` in Key Vault
139. Store `ocr-service-api-key` in Key Vault
140. Store `eidas-api-key` in Key Vault
141. Store other service-specific secrets
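
Storing the secrets above and checking that none was forgotten might look like this sketch. The function names are hypothetical. Passing a secret with `--value` exposes it in the process list and shell history, so for production values consider the `--file` option of `az keyvault secret set` instead.

```shell
# Hypothetical helpers (not from the repository).
# Store one secret in the vault.
put_secret() {
  # $1 = Key Vault name, $2 = secret name, $3 = secret value
  az keyvault secret set --vault-name "$1" --name "$2" \
    --value "$3" --output none
}

# Print the expected secret names that are not yet present in the vault.
missing_secrets() {
  # $1 = Key Vault name; remaining arguments = expected secret names
  vault=$1
  shift
  present=$(az keyvault secret list --vault-name "$vault" \
    --query '[].name' --output tsv)
  for name in "$@"; do
    echo "$present" | grep -qxF -- "$name" || echo "$name"
  done
}
```

For example, `missing_secrets "$VAULT" database-url-dev entra-tenant-id jwt-secret` prints nothing once all three are stored; `VAULT` is a placeholder for your vault name.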
### 8.2 Configure External Secrets Operator
142. Create SecretStore for Azure Key Vault (YAML to be created)
143. Create ExternalSecret resources (YAML to be created)
144. Apply SecretStore configuration
145. Apply ExternalSecret configuration
146. Verify secrets synced to Kubernetes
---
## Phase 9: Infrastructure Services Deployment 🛠️
**Duration**: 1-2 days
**Dependencies**: Phase 2, Phase 8
### 9.1 External Secrets Operator
147. Install External Secrets Operator (if not already installed in step 38)
148. Wait for operator to be ready
149. Verify SecretStore working
### 9.2 Monitoring Stack
150. Add Prometheus Helm repository
151. Install Prometheus stack
152. Configure Grafana
153. Deploy OpenTelemetry Collector
154. Configure exporters
155. Set up trace collection
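
Steps 150 and 151 correspond to the community `kube-prometheus-stack` chart, which bundles Prometheus, Alertmanager, and Grafana. The sketch assumes that chart is the intended "Prometheus stack"; the function name, release name, and namespace are hypothetical.

```shell
# Hypothetical helper (not from the repository).
# Add the prometheus-community repository and install kube-prometheus-stack.
install_monitoring() {
  # $1 = Helm release name, $2 = namespace
  helm repo add prometheus-community \
    https://prometheus-community.github.io/helm-charts &&
    helm repo update &&
    helm install "$1" prometheus-community/kube-prometheus-stack \
      --namespace "$2" --create-namespace
}
```

A call such as `install_monitoring monitoring monitoring` installs the stack with chart defaults; Grafana configuration (step 152) is then done through chart values.
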
### 9.3 Logging Stack
156. Deploy OpenSearch (if not using managed service)
157. Configure Fluent Bit/Fluentd
158. Configure log forwarding
159. Set up log retention policies
---
## Phase 10: Backend Services Deployment 🚀
**Duration**: 2-4 days
**Dependencies**: Phase 6, Phase 7, Phase 8, Phase 9
### 10.1 Create Kubernetes Manifests
160. Create `infra/k8s/base/identity/deployment.yaml` (to be created)
161. Create `infra/k8s/base/identity/service.yaml` (to be created)
162. Create `infra/k8s/base/intake/deployment.yaml` (to be created)
163. Create `infra/k8s/base/intake/service.yaml` (to be created)
164. Create `infra/k8s/base/finance/deployment.yaml` (to be created)
165. Create `infra/k8s/base/finance/service.yaml` (to be created)
166. Create `infra/k8s/base/dataroom/deployment.yaml` (to be created)
167. Create `infra/k8s/base/dataroom/service.yaml` (to be created)
### 10.2 Deploy Identity Service
168. Apply Identity Service manifests
169. Verify pods running
170. Check logs
171. Test health endpoint
172. Verify service accessible
### 10.3 Deploy Intake Service
173. Apply Intake Service manifests
174. Verify pods running
175. Check logs
176. Test health endpoint
### 10.4 Deploy Finance Service
177. Apply Finance Service manifests
178. Verify pods running
179. Check logs
180. Test health endpoint
### 10.5 Deploy Dataroom Service
181. Apply Dataroom Service manifests
182. Verify pods running
183. Check logs
184. Test health endpoint
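
The apply, verify, and check-logs pattern repeated in sections 10.2 to 10.5 can be sketched as one loop. It assumes the manifests from section 10.1 exist and that each Deployment is named after its service; both are assumptions, as are the function names.

```shell
# Hypothetical helpers (not from the repository).
# Apply one service's manifests and wait for its rollout to complete.
deploy_service() {
  # $1 = service name, $2 = namespace
  kubectl apply -n "$2" -f "infra/k8s/base/$1/" &&
    kubectl rollout status -n "$2" "deployment/$1" --timeout=180s
}

# Deploy the four backend services in order, stopping at the first failure.
deploy_backend() {
  # $1 = namespace
  for svc in identity intake finance dataroom; do
    deploy_service "$svc" "$1" || {
      # Show recent logs from the failed deployment before stopping.
      kubectl logs -n "$1" "deployment/$svc" --tail=50
      return 1
    }
  done
}
```

`kubectl rollout status` only returns success once the new pods pass their readiness checks, so it covers "verify pods running"; the health endpoint test still depends on each service's own route.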
### 10.6 Verify Service Communication
185. Test internal service-to-service communication
186. Verify service discovery working
---
## Phase 11: Frontend Applications Deployment 🎨
**Duration**: 1-2 days
**Dependencies**: Phase 6, Phase 10
### 11.1 Portal Public
187. Create `infra/k8s/base/portal-public/deployment.yaml` (to be created)
188. Create `infra/k8s/base/portal-public/service.yaml` (to be created)
189. Create `infra/k8s/base/portal-public/ingress.yaml` (to be created)
190. Apply Portal Public manifests
191. Verify pods running
192. Check logs
193. Test application in browser
### 11.2 Portal Internal
194. Create `infra/k8s/base/portal-internal/deployment.yaml` (to be created)
195. Create `infra/k8s/base/portal-internal/service.yaml` (to be created)
196. Create `infra/k8s/base/portal-internal/ingress.yaml` (to be created)
197. Apply Portal Internal manifests
198. Verify pods running
199. Check logs
200. Test application in browser
---
## Phase 12: Networking & Gateways 🌐
**Duration**: 2-3 days
**Dependencies**: Phase 10, Phase 11
### 12.1 Configure Ingress
201. Deploy NGINX Ingress Controller (if not using Application Gateway)
202. Create Ingress resources (YAML to be created)
203. Apply Ingress configuration
204. Verify ingress rules
### 12.2 Configure Application Gateway (If using)
205. Create backend pools
206. Configure routing rules
207. Configure SSL termination
208. Set up health probes
### 12.3 Configure DNS
209. Create DNS record for `api.theorder.org`
210. Create DNS record for `portal.theorder.org`
211. Create DNS record for `admin.theorder.org`
212. Verify DNS resolution
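
Step 212 can be checked with `dig`. A minimal sketch, assuming `dig` is installed; `check_dns` is a hypothetical name.

```shell
# Hypothetical helper (not from the repository).
# Resolve each host name and report the final address; return non-zero if
# any name does not resolve.
check_dns() {
  failed=0
  for host in "$@"; do
    # With a CNAME chain, dig prints the aliases first and the address last.
    addr=$(dig +short "$host" | tail -n 1)
    if [ -n "$addr" ]; then
      echo "$host -> $addr"
    else
      echo "$host -> NOT RESOLVED"
      failed=1
    fi
  done
  [ "$failed" -eq 0 ]
}
```

Run it as `check_dns api.theorder.org portal.theorder.org admin.theorder.org`. New records can take time to propagate, so a failure shortly after creation is not necessarily a misconfiguration.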
### 12.4 Configure SSL/TLS
213. Install cert-manager (if using Let's Encrypt)
214. Create ClusterIssuer
215. Configure certificate requests
216. Verify certificates issued
217. Test HTTPS access
### 12.5 Configure WAF
218. Set up OWASP rules
219. Configure custom rules
220. Set up rate limiting
221. Configure IP allow/deny lists
---
## Phase 13: Monitoring & Observability 📊
**Duration**: 2-3 days
**Dependencies**: Phase 9, Phase 10, Phase 11
### 13.1 Application Insights
222. Create Application Insights resource
223. Add the Application Insights connection string to services (instrumentation keys are the legacy mechanism)
224. Configure custom metrics
225. Set up alerts
### 13.2 Log Analytics
226. Create Log Analytics workspace
227. Set up container insights
228. Configure log forwarding
229. Set up log queries
### 13.3 Set Up Alerts
230. Create alert rule for high error rate
231. Create alert rule for high latency
232. Create alert rule for resource usage
233. Configure email notifications
234. Configure webhook actions
235. Set up PagerDuty integration (if needed)
### 13.4 Configure Dashboards
236. Create Grafana dashboard for service health
237. Create Grafana dashboard for performance metrics
238. Create Grafana dashboard for business metrics
239. Create Grafana dashboard for error tracking
240. Create Azure custom dashboards
241. Configure shared dashboards
242. Set up access permissions
---
## Phase 14: Testing & Validation ✅
**Duration**: 3-5 days
**Dependencies**: All previous phases
### 14.1 Health Checks
243. Verify all pods running
244. Check all service endpoints
245. Verify all health endpoints responding
246. Check service logs for errors
### 14.2 Integration Testing
247. Test Identity Service API endpoints
248. Test Intake Service API endpoints
249. Test Finance Service API endpoints
250. Test Dataroom Service API endpoints
251. Test Portal Public application
252. Test Portal Internal application
253. Test authentication flow
254. Test API integration from frontend
### 14.3 End-to-End Testing
255. Test user registration flow
256. Test application submission flow
257. Test credential issuance flow
258. Test payment processing flow
259. Test document upload flow
260. Test complete user journeys
### 14.4 Performance Testing
261. Run load tests (k6, Apache Bench, or JMeter)
262. Verify response times acceptable
263. Verify throughput meets requirements
264. Verify resource usage within limits
265. Optimize based on results
### 14.5 Security Testing
266. Run Trivy security scan
267. Check for exposed secrets
268. Verify network policies configured
269. Verify RBAC properly set up
270. Verify TLS/SSL working
271. Verify authentication required
272. Test authorization controls
---
## Phase 15: Production Hardening 🔒
**Duration**: 2-3 days
**Dependencies**: Phase 14
### 15.1 Production Configuration
273. Update replica counts for production
274. Configure resource limits and requests
275. Configure liveness probes
276. Configure readiness probes
277. Set up horizontal pod autoscaling
278. Configure pod disruption budgets
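
Steps 277 and 278 can be applied imperatively while the manifests are still being written. The sketch below assumes each Deployment's pods carry an `app=<name>` label and that CPU requests are set (the autoscaler needs them); the function name, the 70% CPU target, and the `-pdb` suffix are illustrative choices.

```shell
# Hypothetical helper (not from the repository).
# Add a CPU-based HorizontalPodAutoscaler and a PodDisruptionBudget that
# keeps at least one pod available during voluntary disruptions.
harden_deployment() {
  # $1 = namespace, $2 = deployment name, $3 = min replicas, $4 = max replicas
  kubectl autoscale deployment "$2" -n "$1" \
    --min="$3" --max="$4" --cpu-percent=70 &&
    kubectl create poddisruptionbudget "$2-pdb" -n "$1" \
      --selector="app=$2" --min-available=1
}
```

For long-term use the equivalent `HorizontalPodAutoscaler` and `PodDisruptionBudget` manifests belong in version control next to the Deployment, so the imperative form is best treated as a stopgap.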
### 15.2 Backup Configuration
279. Configure database backups
280. Configure storage backups
281. Verify blob versioning is enabled (see step 76)
282. Configure retention policies
283. Set up geo-replication (if needed)
284. Test backup restore procedures
### 15.3 Disaster Recovery
285. Document backup procedures
286. Test restore procedures
287. Set up automated backups
288. Configure multi-region deployment (if needed)
289. Configure DNS failover
290. Test disaster recovery procedures
### 15.4 Documentation
291. Update deployment documentation
292. Document all configuration
293. Create operational runbooks
294. Document troubleshooting steps
295. Create incident response procedures
296. Document escalation procedures
---
## Summary Statistics
- **Total Steps**: 296
- **Phases**: 15
- **Estimated Duration**: 8-12 weeks
- **Critical Path**: Phases 1 → 2 → 4 → 6 → 7 → 8 → 10 → 11 → 12 → 14 → 15
- **Can Run in Parallel**: Phases 2 & 3
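
A valid execution order can also be derived mechanically from the **Dependencies** line of each phase. The sketch below encodes those lines as "prerequisite dependent" pairs and feeds them to `tsort`, the standard topological-sort utility. The function names are hypothetical. Phase 14 depends on all previous phases, which is represented here by its edges from Phases 12 and 13, since every earlier phase is already a prerequisite of one of them.

```shell
# Hypothetical helpers (not from the repository).
# Print one "prerequisite dependent" pair per line.
phase_edges() {
  cat <<'EOF'
1 2
1 3
2 4
2 5
1 6
5 6
4 7
6 7
2 8
3 8
2 9
8 9
6 10
7 10
8 10
9 10
6 11
10 11
10 12
11 12
9 13
10 13
11 13
12 14
13 14
14 15
EOF
}

# Emit one order that respects every dependency; tsort reports an error if
# the dependencies contain a cycle.
phase_order() {
  phase_edges | tsort
}
```

The order `tsort` prints is one of several valid orders; phases with no path between them, such as 2 and 3, may appear either way round, which is exactly where parallel execution is possible.
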
---
## Quick Status Tracking
### ✅ Completed Phases
- [ ] Phase 1: Prerequisites
- [ ] Phase 2: Azure Infrastructure Setup
- [ ] Phase 3: Entra ID Configuration
- [ ] Phase 4: Database & Storage Setup
- [ ] Phase 5: Container Registry Setup
- [ ] Phase 6: Application Build & Package
- [ ] Phase 7: Database Migrations
- [ ] Phase 8: Secrets Configuration
- [ ] Phase 9: Infrastructure Services Deployment
- [ ] Phase 10: Backend Services Deployment
- [ ] Phase 11: Frontend Applications Deployment
- [ ] Phase 12: Networking & Gateways
- [ ] Phase 13: Monitoring & Observability
- [ ] Phase 14: Testing & Validation
- [ ] Phase 15: Production Hardening
---
## Next Steps After Deployment
1. **Monitor**: Watch logs and metrics for first 24-48 hours
2. **Optimize**: Adjust resource allocations based on actual usage
3. **Document**: Update runbooks with lessons learned
4. **Train**: Train operations team on new infrastructure
5. **Iterate**: Plan next deployment cycle improvements
---
**See `DEPLOYMENT_GUIDE.md` for detailed instructions for each step.**
**See `DEPLOYMENT_QUICK_REFERENCE.md` for quick command reference.**