Apply Composer changes: comprehensive API updates, migrations, middleware, and infrastructure improvements

- Add comprehensive database migrations (001-024) for schema evolution
- Enhance API schema with expanded type definitions and resolvers
- Add new middleware: audit logging, rate limiting, MFA enforcement, security, tenant auth
- Implement new services: AI optimization, billing, blockchain, compliance, marketplace
- Add adapter layer for cloud integrations (Cloudflare, Kubernetes, Proxmox, storage)
- Update Crossplane provider with enhanced VM management capabilities
- Add comprehensive test suite for API endpoints and services
- Update frontend components with improved GraphQL subscriptions and real-time updates
- Enhance security configurations and headers (CSP, CORS, etc.)
- Update documentation and configuration files
- Add new CI/CD workflows and validation scripts
- Implement design system improvements and UI enhancements
This commit is contained in:
defiQUG
2025-12-12 18:01:35 -08:00
parent e01131efaf
commit 9daf1fd378
968 changed files with 160890 additions and 1092 deletions


@@ -0,0 +1,189 @@
# Fairness Audit Orchestration - Design Document
## Design Philosophy
> **We design from the end result backwards.**
First we list every output you want (reports, files, metrics), then we calculate how much input validation and enrichment is needed — roughly **2× your input effort** — and size the total job at about **3.2× the input**.
You only choose:
1. **What goes in** (Input)
2. **What comes out** (Output)
3. **When it needs to be ready** (Timeline)
The orchestration engine does everything in between.
## Mathematical Model
### Core Formula
```
Total Process Load = O + 2I ≈ 3.2I
```
Where:
- **I** = Input size/effort (units)
- **O** = Total output effort (sum of output weights)
- **2I** = Two full passes over the input (ingestion and enrichment, then fairness evaluation)
### Design Target
```
O ≈ 1.2 × I
```
This means for a typical input of 100 units:
- Output should be around 120 units
- Total load ≈ 320 units
- Input passes = 200 units (2 × 100)
### Why 2× Input?
The input requires two full passes:
1. **Ingestion & Enrichment**: Load data, validate, enrich with metadata
2. **Fairness Evaluation**: Run fairness algorithms, calculate metrics
Each pass processes the full input, hence 2×.
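The arithmetic above can be sketched as a pair of small functions. This is a minimal illustration only: `totalProcessLoad` and `targetOutputLoad` are hypothetical names, not functions confirmed to exist in the engine.

```typescript
// Multipliers taken from the model above.
const INPUT_PASS_MULTIPLIER = 2;      // two full passes over the input
const OUTPUT_TARGET_MULTIPLIER = 1.2; // design target: O ≈ 1.2 × I

// Hypothetical helper: total process load, O + 2I.
function totalProcessLoad(inputLoad: number, outputLoad: number): number {
  return outputLoad + INPUT_PASS_MULTIPLIER * inputLoad;
}

// Hypothetical helper: output load recommended by the design target.
function targetOutputLoad(inputLoad: number): number {
  return OUTPUT_TARGET_MULTIPLIER * inputLoad;
}

// Typical input of 100 units with an on-target output load.
const typicalInput = 100;
const typicalOutput = targetOutputLoad(typicalInput);
console.log(typicalOutput, totalProcessLoad(typicalInput, typicalOutput));
```

When the output load sits exactly on the design target, the total comes out at about 3.2 × I, which is where the 3.2 figure originates.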
## Output Weight Guidelines
### Weight Calculation Factors
1. **Complexity**: How complex is the output to generate?
2. **Data Volume**: How much data does it contain?
3. **Processing Time**: How long does generation take?
4. **Dependencies**: Does it depend on other outputs?
### Recommended Weights
| Complexity | Typical Weight Range | Examples |
|-----------|---------------------|----------|
| Simple | 0.5 - 1.0 | Metrics export, Alert config |
| Medium | 1.0 - 2.0 | CSV exports, JSON reports, Dashboards |
| Complex | 2.0 - 3.0 | PDF reports, Compliance docs |
### Weight Examples
- **Metrics Export (1.0)**: Simple calculation, small output
- **Flagged Cases CSV (1.5)**: Medium complexity, moderate data
- **Fairness Audit PDF (2.5)**: Complex formatting, large output
- **Compliance Report (2.2)**: Complex structure, regulatory requirements
## Input Load Estimation
### Base Calculation
```
Input Load = Base (100 units)
           + 20 units × number of sensitive attributes
           + 5 units × days in the date range
           + 10 units × number of filters
```
If a pre-calculated estimated size is provided, use it instead of the computed value.
### Example Calculations
**Small Dataset**:
- 2 sensitive attributes
- 7-day range
- 1 filter
- Load = 100 + (2×20) + (7×5) + (1×10) = 185 units
**Large Dataset**:
- 5 sensitive attributes
- 90-day range
- 5 filters
- Load = 100 + (5×20) + (90×5) + (5×10) = 700 units
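The base calculation can be written as a short function. This is a sketch under assumptions: the `InputSpec` shape and its field names are hypothetical, and only the constants (100, 20, 5, 10) come from this document.

```typescript
// Hypothetical input specification; field names are illustrative only.
interface InputSpec {
  sensitiveAttributes: number; // number of sensitive attributes
  dateRangeDays: number;       // length of the date range in days
  filters: number;             // number of filters applied
  estimatedSize?: number;      // pre-calculated load, used as-is when provided
}

function estimateInputLoad(spec: InputSpec): number {
  // A provided estimate takes precedence over the computed value.
  if (spec.estimatedSize !== undefined) {
    return spec.estimatedSize;
  }
  return (
    100 +                           // base
    20 * spec.sensitiveAttributes + // 20 units per sensitive attribute
    5 * spec.dateRangeDays +        // 5 units per day
    10 * spec.filters               // 10 units per filter
  );
}

// The two example datasets from this section.
console.log(estimateInputLoad({ sensitiveAttributes: 2, dateRangeDays: 7, filters: 1 }));
console.log(estimateInputLoad({ sensitiveAttributes: 5, dateRangeDays: 90, filters: 5 }));
```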
## Timeline Validation
### SLA Parsing
Supports formats:
- "2 hours"
- "1 day"
- "30 minutes"
- "45 seconds"
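One way to parse these formats is a single regular expression plus a unit table. The function name `parseSlaSeconds` and the choice to return `null` for unrecognized input are assumptions made for illustration, not the engine's confirmed behavior.

```typescript
// Seconds per supported unit (singular form).
const SECONDS_PER_UNIT: Record<string, number> = {
  second: 1,
  minute: 60,
  hour: 3600,
  day: 86400,
};

// Hypothetical parser: returns the SLA in seconds, or null if the format is not recognized.
function parseSlaSeconds(sla: string): number | null {
  // Accepts "<number> <unit>" with an optional plural "s", e.g. "2 hours".
  const match = /^\s*(\d+(?:\.\d+)?)\s*(second|minute|hour|day)s?\s*$/i.exec(sla);
  if (match === null) {
    return null;
  }
  const amount = Number(match[1]);
  const perUnit = SECONDS_PER_UNIT[(match[2] ?? "").toLowerCase()];
  return perUnit === undefined ? null : amount * perUnit;
}

console.log(parseSlaSeconds("2 hours"), parseSlaSeconds("soon"));
```

Returning `null` rather than throwing lets the UI show a format hint without wrapping the call in a `try` block.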
### Feasibility Checks
1. **Time Check**: `estimatedTime ≤ maxTimeSeconds`
2. **Output Check**: `outputLoad ≤ 1.5 × (inputLoad × 1.2)`
3. **Total Load Check**: `totalLoad ≤ 1.3 × (inputLoad × 3.2)`
### Warning Thresholds
- **Critical**: Estimated time exceeds timeline
- **Warning**: Estimated time > 80% of timeline
- **Info**: Output load > 1.5× target
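The checks and thresholds above could be combined as follows. This is a sketch only: the `Finding` type, the function name, and the message strings are hypothetical, and the severity of the total load check is an assumption since this section does not assign one.

```typescript
type Severity = "critical" | "warning" | "info";

interface Finding {
  severity: Severity;
  message: string;
}

// Hypothetical check combining the three feasibility rules with the warning thresholds.
function checkFeasibility(
  inputLoad: number,
  outputLoad: number,
  estimatedSeconds: number,
  maxSeconds: number,
): Finding[] {
  const findings: Finding[] = [];
  const totalLoad = outputLoad + 2 * inputLoad;

  // Time check: critical when over the limit, warning above 80% of it.
  if (estimatedSeconds > maxSeconds) {
    findings.push({ severity: "critical", message: "Estimated time exceeds timeline" });
  } else if (estimatedSeconds > 0.8 * maxSeconds) {
    findings.push({ severity: "warning", message: "Estimated time is above 80% of timeline" });
  }

  // Output check: outputLoad should stay within 1.5 × (inputLoad × 1.2).
  if (outputLoad > 1.5 * (inputLoad * 1.2)) {
    findings.push({ severity: "info", message: "Output load is above 1.5× target" });
  }

  // Total load check: totalLoad should stay within 1.3 × (inputLoad × 3.2).
  // Severity "warning" is an assumption made for this sketch.
  if (totalLoad > 1.3 * (inputLoad * 3.2)) {
    findings.push({ severity: "warning", message: "Total load is above 1.3× expected" });
  }

  return findings;
}

console.log(checkFeasibility(100, 5, 18, 7200));
```

An empty result means the configuration passed every check. A caller might treat any `critical` finding as infeasible and everything else as feasible with warnings.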
## User Experience Flow
### Step 1: Select Outputs
- User checks desired outputs
- Engine calculates O in real-time
- Shows total output load
### Step 2: Specify Input
- User enters dataset, attributes, range
- Engine calculates I in real-time
- Shows estimated input load
### Step 3: Set Timeline
- User selects mode and SLA
- Engine validates feasibility
- Shows estimated time and warnings
### Step 4: Review & Run
- Engine shows complete analysis
- User reviews warnings/suggestions
- User confirms and runs
## Error Handling
### Invalid Configurations
1. **No Outputs Selected**: Disable run button
2. **No Dataset**: Disable run button
3. **Invalid SLA Format**: Show format hint
4. **Infeasible Timeline**: Show suggestions
### Suggestions
Engine provides actionable suggestions:
- "Consider reducing outputs"
- "Consider extending timeline"
- "Consider simplifying input filters"
## Performance Considerations
### Processing Rates
Rates are configurable and can be tuned based on:
- Hardware capabilities
- Network bandwidth
- Concurrent job limits
- Historical performance data
### Optimization Strategies
1. **Parallel Processing**: Process outputs in parallel when possible
2. **Caching**: Cache intermediate results
3. **Batch Processing**: Batch similar operations
4. **Resource Allocation**: Allocate resources based on load
## Future Enhancements
1. **Machine Learning**: Learn from historical runs to improve estimates
2. **Dynamic Weights**: Adjust weights based on actual performance
3. **Resource Scaling**: Automatically scale resources based on load
4. **Cost Estimation**: Add cost estimates alongside time estimates
5. **Multi-Tenant**: Support multiple concurrent orchestrations
## Related Documentation
- [Orchestration Engine](./ORCHESTRATION_ENGINE.md)
- [Output Weight Guidelines](./OUTPUT_WEIGHTS.md)
- [API Reference](./API_REFERENCE.md)


@@ -0,0 +1,181 @@
# Fairness Audit Orchestration Engine
## Overview
The Fairness Audit Orchestration Engine uses a **3-variable model** to size and schedule fairness audit processes. The engine designs from outputs backwards, calculating the total process load and validating feasibility against requested timelines.
## The 3-Variable Model
### Variables
1. **I (Input)**: Input size/effort
- Dataset size
- Number of sensitive attributes
- Date range complexity
- Filter complexity
2. **O (Output)**: Total output effort
- Sum of all selected outputs (reports, dashboards, exports, alerts)
- Each output type has a weight
3. **T (Timeline)**: Runtime allocation
- Execution mode (now, scheduled, continuous)
- SLA/time limit
- Deadline
## Backend Logic
### Formula
```
Total Process Load = O + 2I ≈ 3.2I
```
Where:
- **O** = Sum of all output weights
- **2I** = Two input passes (ingestion and enrichment, then fairness evaluation)
- **3.2I** = Target total load (design target: O ≈ 1.2 × I)
### Calculation Flow
1. **Start with Outputs**
- User selects desired outputs
- Engine sums output weights → **O**
2. **Calculate Input Load**
- Engine analyzes input specification
- Calculates input complexity → **I**
3. **Calculate Total Load**
- Total = O + 2I
- Validates against target: ≈ 3.2I
4. **Estimate Time**
- Uses processing rates to estimate runtime
- Validates against timeline constraints
5. **Feasibility Check**
- Compares estimated time vs. requested timeline
- Checks output load vs. recommended (1.2 × I)
- Provides warnings and suggestions
## Output Types and Weights
| Output Type | Weight | Description |
|------------|--------|-------------|
| Fairness Audit PDF | 2.5 | Comprehensive fairness audit report |
| Metrics Export (SPD, TPR, FPR) | 1.0 | Statistical parity difference, rates |
| Flagged Cases CSV | 1.5 | Cases flagged for potential bias |
| Executive Summary Slides | 2.0 | Executive presentation slides |
| Detailed Report (JSON) | 1.2 | Machine-readable detailed analysis |
| Alert Configuration | 0.8 | Automated alert rules |
| Dashboard Export | 1.8 | Interactive dashboard |
| Compliance Report | 2.2 | Regulatory compliance documentation |
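Summing the table into **O** might look like the sketch below. The weights are taken from the table; the type identifiers follow the `Type` values listed in the Output Weight Guidelines, while the constant and function names are hypothetical.

```typescript
// Weights from the table above, keyed by output type identifier.
const OUTPUT_WEIGHTS = {
  "fairness-audit-pdf": 2.5,
  "metrics-export": 1.0,
  "flagged-cases-csv": 1.5,
  "exec-summary-slides": 2.0,
  "detailed-report-json": 1.2,
  "alerts-config": 0.8,
  "dashboard-export": 1.8,
  "compliance-report": 2.2,
} as const;

type OutputType = keyof typeof OUTPUT_WEIGHTS;

// Hypothetical helper: O is the sum of the weights of the selected outputs.
function totalOutputLoad(selected: readonly OutputType[]): number {
  return selected.reduce<number>((sum, type) => sum + OUTPUT_WEIGHTS[type], 0);
}

// Selecting every output type.
const allOutputs = Object.keys(OUTPUT_WEIGHTS) as OutputType[];
console.log(totalOutputLoad(allOutputs).toFixed(1));
```

Because the weights are floating-point values, formatting with `toFixed(1)` before display avoids showing rounding noise.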
## Input Load Calculation
```
Input Load = Base (100)
           + Sensitive Attributes (20 each)
           + Date Range (5 per day)
           + Filters (10 each)
```
If a pre-calculated `estimatedSize` is available, use it instead of the computed value.
## Processing Rates
- **Input Processing**: 15 units/second
- **Output Processing**: 8 units/second
- **Average Rate**: ~11.5 units/second
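A runtime estimate can be derived from these rates. The sketch below assumes the estimate is the total load divided by the average rate; that assumption reproduces the times quoted in the example scenarios, but the engine's actual formula is not shown in this document.

```typescript
const INPUT_PROCESSING_RATE = 15; // units/second
const OUTPUT_PROCESSING_RATE = 8; // units/second
// Arithmetic mean of the two rates: 11.5 units/second.
const AVERAGE_RATE = (INPUT_PROCESSING_RATE + OUTPUT_PROCESSING_RATE) / 2;

// Hypothetical estimator: (O + 2I) divided by the average rate (assumed formula).
function estimateSeconds(inputLoad: number, outputLoad: number): number {
  const totalLoad = outputLoad + 2 * inputLoad;
  return totalLoad / AVERAGE_RATE;
}

console.log(Math.round(estimateSeconds(100, 2.5)));
console.log(Math.round(estimateSeconds(500, 13.0)));
```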
## User-Facing Messages
### Feasible Configuration
> "This fairness audit will process approximately X input units and generate Y output units, taking approximately Z to complete."
### Feasible with Warnings
> "This audit is feasible but has some considerations: [warnings]. Estimated time: Z."
### Not Feasible
> "This audit configuration may not be feasible within the requested timeline. [warnings]. Estimated time: Z."
## Example Scenarios
### Scenario 1: Small Dataset, Few Outputs
- **Input**: 100 units (small dataset, 2 attributes)
- **Outputs**: Metrics Export (1.0) + Flagged Cases CSV (1.5) = 2.5 units
- **Total Load**: 2.5 + (2 × 100) = 202.5 units
- **Estimated Time**: ~18 seconds
- **Result**: ✅ Feasible
### Scenario 2: Large Dataset, Many Outputs
- **Input**: 500 units (large dataset, 5 attributes, 30-day range)
- **Outputs**: All 8 outputs = 13.0 units
- **Total Load**: 13.0 + (2 × 500) = 1013.0 units
- **Estimated Time**: ~88 seconds
- **Result**: ⚠️ May need timeline adjustment
### Scenario 3: All Outputs, Medium Dataset
- **Input**: 200 units
- **Outputs**: All 8 outputs = 13.0 units
- **Target Output**: 200 × 1.2 = 240 units
- **Actual Output**: 13.0 units
- **Result**: ✅ Within target (O < 1.2 × I), so no output-complexity warning is raised
## Implementation
### Backend Engine
- Location: `api/src/services/fairness-orchestration/engine.ts`
- Provides: `orchestrate()`, calculation functions, feasibility checks
### Frontend Component
- Location: `portal/src/components/fairness/FairnessOrchestrationWizard.tsx`
- 3-column layout: Output | Input | Timeline
- Real-time orchestration calculation
- Visual feedback on feasibility
### Client Library
- Location: `portal/src/lib/fairness-orchestration.ts`
- Shared types and calculation functions
- Can be used client-side or called via API
## API Endpoints (To Be Implemented)
```
POST /api/fairness/orchestrate
Body: OrchestrationRequest
Response: OrchestrationResult
GET /api/fairness/outputs
Response: OutputType[]
POST /api/fairness/run
Body: OrchestrationRequest
Response: Job ID and status
```
## Configuration
### Adjustable Constants
```typescript
const INPUT_PASS_MULTIPLIER = 2.0;    // 2 × I for input passes
const TOTAL_LOAD_MULTIPLIER = 3.2;    // Target: O + 2I ≈ 3.2I
const OUTPUT_TARGET_MULTIPLIER = 1.2; // Design target: O ≈ 1.2 × I
const INPUT_PROCESSING_RATE = 15;     // units/second
const OUTPUT_PROCESSING_RATE = 8;     // units/second
```
### Tuning Recommendations
- **High-volume scenarios**: Increase processing rates
- **Complex outputs**: Adjust output weights
- **Strict SLAs**: Add buffer time (20% recommended)
## Related Documentation
- [Orchestration Engine Design](./ORCHESTRATION_DESIGN.md)
- [Output Weight Guidelines](./OUTPUT_WEIGHTS.md)
- [User Guide](../fairness-audit/USER_GUIDE.md)


@@ -0,0 +1,193 @@
# Output Weight Guidelines
## Overview
Each output type in the fairness audit orchestration has a **weight** that represents the relative effort required to generate it. These weights are used to calculate the total output load (O) in the orchestration formula.
## Weight Calculation
Weights are determined by:
1. **Processing Complexity**: How much computation is required?
2. **Data Volume**: How much data needs to be processed/generated?
3. **Format Complexity**: How complex is the output format?
4. **Dependencies**: Does it depend on other outputs?
## Output Types and Weights
### Lightweight Outputs (below 1.5 units)
#### Metrics Export (1.0 units)
- **Type**: `metrics-export`
- **Weight**: 1.0
- **Description**: Statistical parity difference, true positive rate, false positive rate metrics
- **Complexity**: Low - Simple calculations, small output
- **Dependencies**: None
- **Format**: JSON/CSV
#### Alert Configuration (0.8 units)
- **Type**: `alerts-config`
- **Weight**: 0.8
- **Description**: Automated alert rules for ongoing monitoring
- **Complexity**: Low - Rule generation, small output
- **Dependencies**: Metrics export
- **Format**: YAML/JSON
#### Detailed Report JSON (1.2 units)
- **Type**: `detailed-report-json`
- **Weight**: 1.2
- **Description**: Machine-readable detailed fairness analysis
- **Complexity**: Medium - Structured data, moderate size
- **Dependencies**: All metrics
- **Format**: JSON
### Medium Outputs (1.5 - 2.0 units)
#### Flagged Cases CSV (1.5 units)
- **Type**: `flagged-cases-csv`
- **Weight**: 1.5
- **Description**: Export of cases flagged for potential bias issues
- **Complexity**: Medium - Data filtering, CSV generation
- **Dependencies**: Fairness evaluation
- **Format**: CSV
#### Executive Summary Slides (2.0 units)
- **Type**: `exec-summary-slides`
- **Weight**: 2.0
- **Description**: Executive presentation slides with key findings
- **Complexity**: Medium-High - Data aggregation, slide generation
- **Dependencies**: All metrics, summary analysis
- **Format**: PowerPoint/PDF
#### Dashboard Export (1.8 units)
- **Type**: `dashboard-export`
- **Weight**: 1.8
- **Description**: Interactive dashboard with fairness metrics
- **Complexity**: Medium - Dashboard generation, visualization
- **Dependencies**: All metrics
- **Format**: HTML/Interactive
### Heavy Outputs (above 2.0 units)
#### Fairness Audit PDF (2.5 units)
- **Type**: `fairness-audit-pdf`
- **Weight**: 2.5
- **Description**: Comprehensive fairness audit report in PDF format
- **Complexity**: High - Full report generation, PDF formatting
- **Dependencies**: All analyses, metrics, findings
- **Format**: PDF
#### Compliance Report (2.2 units)
- **Type**: `compliance-report`
- **Weight**: 2.2
- **Description**: Regulatory compliance documentation
- **Complexity**: High - Regulatory formatting, documentation
- **Dependencies**: All analyses, audit trail
- **Format**: PDF/DOCX
## Weight Rationale
### Why These Weights?
1. **Metrics Export (1.0)**: Baseline weight
- Simple calculations
- Small output size
- Fast generation
2. **Alert Configuration (0.8)**: Lighter than baseline
- Minimal processing
- Small output
- Can reuse metrics
3. **Flagged Cases CSV (1.5)**: 50% more than baseline
- Requires filtering logic
- Moderate data volume
- CSV generation overhead
4. **Detailed Report JSON (1.2)**: Slightly above baseline
- Structured data compilation
- Moderate complexity
- JSON serialization
5. **Executive Summary Slides (2.0)**: 2× baseline
- Data aggregation required
- Slide generation complexity
- Visual formatting
6. **Dashboard Export (1.8)**: Between medium and heavy
- Dashboard framework overhead
- Visualization generation
- Interactive components
7. **Fairness Audit PDF (2.5)**: 2.5× baseline
- Comprehensive report
- PDF formatting complexity
- Large output size
8. **Compliance Report (2.2)**: Slightly less than PDF
- Regulatory formatting
- Documentation requirements
- Structured output
## Weight Adjustment Guidelines
### When to Increase Weight
- Output requires significant computation
- Large data volumes
- Complex formatting requirements
- Multiple dependencies
- Real-time processing needed
### When to Decrease Weight
- Simple calculations
- Small output size
- Reusable components
- Cached results available
- Parallel processing possible
## Example Scenarios
### Scenario 1: Minimal Outputs
- Metrics Export (1.0)
- Alert Configuration (0.8)
- **Total**: 1.8 units
### Scenario 2: Standard Audit
- Fairness Audit PDF (2.5)
- Metrics Export (1.0)
- Flagged Cases CSV (1.5)
- **Total**: 5.0 units
### Scenario 3: Comprehensive Audit
- All 8 outputs
- **Total**: 13.0 units
## Weight Validation
### Design Target Check
For input load I:
- **Target Output**: O ≈ 1.2 × I
- **Warning Threshold**: O > 1.5 × (1.2 × I)
- **Example**: If I = 100, target O = 120, warn if O > 180
### Total Load Check
- **Expected**: Total ≈ 3.2 × I
- **Warning**: Total > 1.3 × (3.2 × I)
- **Example**: If I = 100, expected total = 320, warn if total > 416
## Future Considerations
1. **Dynamic Weights**: Adjust based on actual performance
2. **Context-Aware**: Weights vary by dataset size
3. **Machine Learning**: Learn optimal weights from history
4. **Parallel Processing**: Reduce effective weights for parallel outputs
## Related Documentation
- [Orchestration Engine](./ORCHESTRATION_ENGINE.md)
- [Orchestration Design](./ORCHESTRATION_DESIGN.md)


@@ -0,0 +1,285 @@
# User-Facing Messages for Fairness Orchestration
## Overview
This document defines all user-facing messages shown in the fairness orchestration UI. Messages are designed to be clear and actionable, and to avoid exposing internal math.
## Success Messages
### Feasible Configuration (No Warnings)
```
This fairness audit will process approximately {inputLoad} input units and generate {outputLoad} output units, taking approximately {estimatedTime} to complete.
```
**Example**:
> "This fairness audit will process approximately 100 input units and generate 5.0 output units, taking approximately 18 seconds to complete."
### Feasible with Warnings
```
This audit is feasible but has some considerations: {warnings}. Estimated time: {estimatedTime}.
```
**Example**:
> "This audit is feasible but has some considerations: Estimated processing time (2.5 hours) is close to timeline limit (3 hours). Estimated time: 2.5 hours."
## Warning Messages
### Output Complexity Warning
```
Output complexity ({outputLoad} units) is significantly higher than recommended ({targetOutputLoad} units)
```
**Suggestion**:
> "Consider reducing the number of outputs or simplifying output requirements"
### Timeline Exceeded Warning
```
Estimated processing time ({estimatedTime}) exceeds requested timeline ({requestedTime})
```
**Suggestion**:
> "Consider extending timeline to {suggestedTime} or reducing outputs"
### Timeline Close Warning
```
Estimated processing time ({estimatedTime}) is close to timeline limit ({requestedTime})
```
**Suggestion**:
> "Consider adding buffer time or reducing outputs for safety"
### Total Load Warning
```
Total process load ({totalLoad} units) is higher than expected ({expectedLoad} units)
```
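The success and warning templates above use `{placeholder}` tokens. A minimal way to fill them is sketched below. `renderMessage` is a hypothetical helper, and leaving unknown placeholders untouched is a design assumption rather than documented behavior.

```typescript
// Hypothetical helper: replaces each {key} with the matching value.
// Placeholders without a matching value are left as-is so gaps stay visible.
function renderMessage(
  template: string,
  values: Record<string, string | number>,
): string {
  return template.replace(/\{(\w+)\}/g, (placeholder: string, key: string) => {
    const value = values[key];
    return value === undefined ? placeholder : String(value);
  });
}

const totalLoadWarning =
  "Total process load ({totalLoad} units) is higher than expected ({expectedLoad} units)";

console.log(renderMessage(totalLoadWarning, { totalLoad: 450, expectedLoad: 320 }));
```

Values such as output loads are best passed in pre-formatted (for example with `toFixed(1)`), since `String(5.0)` yields `"5"` rather than `"5.0"`.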
## Info Messages
### How It Works (Info Box)
```
We design from the end result backwards. First we list every output you want (reports, files, metrics), then we calculate how much input validation and enrichment is needed — roughly 2× your input effort — and size the total job at about 3.2× the input.
You only choose: (1) What goes in, (2) What comes out, (3) When it needs to be ready.
The orchestration engine does everything in between.
```
### Output Selection Help
```
Select the outputs you want to generate. Each output has a weight that represents the processing effort required. The total output load is the sum of all selected output weights.
```
### Input Specification Help
```
Specify your input data. The engine calculates input complexity based on dataset size, number of sensitive attributes, date range, and filters. Input processing requires two passes: ingestion/enrichment and fairness evaluation.
```
### Timeline Help
```
Set when you need results. The engine validates that your requested timeline is feasible given the input and output complexity. You can run now, schedule for later, or set up continuous monitoring.
```
## Error Messages
### No Outputs Selected
```
Please select at least one output to generate.
```
### No Dataset Specified
```
Please specify a dataset to analyze.
```
### Invalid SLA Format
```
Invalid timeline format. Please use formats like "2 hours", "1 day", or "30 minutes".
```
### Infeasible Configuration
```
This audit configuration may not be feasible within the requested timeline. {warnings}. Estimated time: {estimatedTime}.
```
## Button States
### Run Button (Enabled)
```
Run Fairness Audit
```
### Run Button (Disabled - No Outputs)
```
Select outputs to continue
```
### Run Button (Disabled - No Dataset)
```
Specify dataset to continue
```
### Run Button (Disabled - Infeasible)
```
Adjust Configuration
```
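The four button states can be selected with a simple ordered check. The `WizardState` shape and the order in which conditions are tested are assumptions for this sketch; only the labels come from this document.

```typescript
// Hypothetical view of the wizard state relevant to the run button.
interface WizardState {
  selectedOutputCount: number; // how many outputs the user has checked
  hasDataset: boolean;         // whether a dataset has been specified
  feasible: boolean;           // result of the feasibility check
}

interface RunButton {
  label: string;
  enabled: boolean;
}

// Checks run in an assumed order: outputs first, then dataset, then feasibility.
function runButtonState(state: WizardState): RunButton {
  if (state.selectedOutputCount === 0) {
    return { label: "Select outputs to continue", enabled: false };
  }
  if (!state.hasDataset) {
    return { label: "Specify dataset to continue", enabled: false };
  }
  if (!state.feasible) {
    return { label: "Adjust Configuration", enabled: false };
  }
  return { label: "Run Fairness Audit", enabled: true };
}

console.log(runButtonState({ selectedOutputCount: 2, hasDataset: true, feasible: true }));
```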
## Real-Time Feedback
### Output Load Display
```
Total Output Load: {outputLoad} units
```
### Input Load Display
```
Estimated Input Load: {inputLoad} units
```
### Timeline Display
```
Estimated Time: {estimatedTime}
Total Process Load: {totalLoad} units
```
## Orchestration Analysis Display
### Load Breakdown
```
Input Load: {inputLoad}
× 2 = {inputPasses} (passes)
Output Load: {outputLoad}
Total Load: {totalLoad}
≈ {inputLoad} × 3.2 = {expectedTotal}
```
### Feasibility Status
**Feasible**:
```
✅ This configuration is feasible and ready to run.
```
**Feasible with Warnings**:
```
⚠️ This configuration is feasible but has some considerations.
```
**Not Feasible**:
```
❌ This configuration may not be feasible within the requested timeline.
```
## Suggestions Format
### Reducing Outputs
```
Consider reducing the number of outputs or simplifying output requirements
```
### Extending Timeline
```
Consider extending timeline to {suggestedTime} or reducing outputs
```
### Adding Buffer
```
Consider adding buffer time or reducing outputs for safety
```
## Time Formatting
### Seconds
```
{seconds} seconds
```
### Minutes
```
{minutes} minutes
```
### Hours
```
{hours} hours
```
### Days
```
{days} days
```
## Examples
### Example 1: Simple Request
**Input**: Small dataset (100 units)
**Outputs**: Metrics Export (1.0)
**Timeline**: 2 hours
**Message**:
> "This fairness audit will process approximately 100 input units and generate 1.0 output units, taking approximately 18 seconds to complete."
### Example 2: Complex Request
**Input**: Large dataset (500 units)
**Outputs**: All 8 outputs (13.0 units)
**Timeline**: 1 minute
**Message**:
> "This audit configuration may not be feasible within the requested timeline. Estimated processing time (88 seconds) exceeds requested timeline (1 minute). Estimated time: 88 seconds."
**Suggestion**:
> "Consider extending timeline to 2 minutes or reducing outputs"
### Example 3: Warning Case
**Input**: Medium dataset (200 units)
**Outputs**: All outputs (13.0 units)
**Timeline**: 40 seconds
**Message**:
> "This audit is feasible but has some considerations: Estimated processing time (36 seconds) is close to timeline limit (40 seconds). Estimated time: 36 seconds."
**Suggestion**:
> "Consider adding buffer time or reducing outputs for safety"
## Message Guidelines
1. **No Math Exposure**: Never show formulas like "O + 2I" to users
2. **Plain Language**: Use simple, clear language
3. **Actionable**: Always provide suggestions when warnings occur
4. **Contextual**: Messages adapt based on user selections
5. **Positive**: Frame positively when possible ("consider" vs "don't")
## Related Documentation
- [Orchestration Engine](./ORCHESTRATION_ENGINE.md)
- [Orchestration Design](./ORCHESTRATION_DESIGN.md)
- [Output Weights](./OUTPUT_WEIGHTS.md)