Add full monorepo: virtual-banker, backend, frontend, docs, scripts, deployment
Co-authored-by: Cursor <cursoragent@cursor.com>
166
docs/ARCHITECTURE.md
Normal file
@@ -0,0 +1,166 @@
# Virtual Banker Architecture

## Overview

The Virtual Banker is a multi-layered system that delivers a digital-human banking experience: a video-realistic avatar, real-time voice interaction, and an embeddable web widget.

## System Architecture

```
┌──────────────────────────────────────────────────────────┐
│                       Client Layer                       │
│  ┌────────────────────────────────────────────────────┐  │
│  │  Embeddable Widget (React/TypeScript)              │  │
│  │  - Chat UI                                         │  │
│  │  - Voice Controls                                  │  │
│  │  - Avatar View                                     │  │
│  │  - WebRTC Client                                   │  │
│  └────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────┘
                             │
                             ▼
┌──────────────────────────────────────────────────────────┐
│                        Edge Layer                        │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐  │
│  │     CDN      │  │ API Gateway  │  │    WebRTC    │  │
│  │   (Widget)   │  │ (Auth/Rate)  │  │   Gateway    │  │
│  └──────────────┘  └──────────────┘  └──────────────┘  │
└──────────────────────────────────────────────────────────┘
                             │
                             ▼
┌──────────────────────────────────────────────────────────┐
│                      Core Services                       │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐  │
│  │   Session    │  │ Orchestrator │  │ LLM Gateway  │  │
│  │   Service    │  │              │  │              │  │
│  └──────────────┘  └──────────────┘  └──────────────┘  │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐  │
│  │ RAG Service  │  │ Tool/Action  │  │   Safety/    │  │
│  │              │  │   Service    │  │  Compliance  │  │
│  └──────────────┘  └──────────────┘  └──────────────┘  │
└──────────────────────────────────────────────────────────┘
                             │
                             ▼
┌──────────────────────────────────────────────────────────┐
│                      Media Services                      │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐  │
│  │ ASR Service  │  │ TTS Service  │  │    Avatar    │  │
│  │ (Streaming)  │  │ (Streaming)  │  │   Renderer   │  │
│  └──────────────┘  └──────────────┘  └──────────────┘  │
└──────────────────────────────────────────────────────────┘
                             │
                             ▼
┌──────────────────────────────────────────────────────────┐
│                        Data Layer                        │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐  │
│  │  PostgreSQL  │  │    Redis     │  │  Vector DB   │  │
│  │   (State)    │  │   (Cache)    │  │  (pgvector)  │  │
│  └──────────────┘  └──────────────┘  └──────────────┘  │
└──────────────────────────────────────────────────────────┘
```

## Data Flow

### Voice Turn Flow

1. **User speaks** → Widget captures audio via microphone
2. **Audio stream** → WebRTC gateway → ASR service
3. **ASR** → Transcribes to text (partial + final)
4. **Orchestrator** → Sends transcript to LLM with context
5. **LLM** → Generates response + tool calls + emotion tags
6. **TTS** → Converts text to audio stream
7. **Avatar** → Generates visemes, expressions, gestures
8. **Widget** → Plays audio, displays captions, animates avatar
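
The voice turn can be sketched as a single function that chains the stages in order. This is a minimal illustration rather than the project's implementation: `TurnResult`, `fake_asr`, `fake_llm`, `fake_tts`, `fake_visemes`, and `run_voice_turn` are hypothetical names, and every stage is a trivial stand-in for the real streaming service.

```python
from dataclasses import dataclass, field


@dataclass
class TurnResult:
    """Everything the widget needs to render one voice turn."""
    transcript: str
    reply_text: str
    emotion: str
    audio: bytes
    visemes: list[str] = field(default_factory=list)


def fake_asr(audio_chunks: list[bytes]) -> str:
    """Stand-in for the ASR service: treats the audio bytes as UTF-8 text."""
    return b"".join(audio_chunks).decode("utf-8")


def fake_llm(transcript: str, context: list[str]) -> dict:
    """Stand-in for the LLM: echoes the transcript with a fixed emotion tag."""
    return {"text": f"You said: {transcript}", "emotion": "neutral", "tool_calls": []}


def fake_tts(text: str) -> bytes:
    """Stand-in for the TTS service: returns the text as bytes."""
    return text.encode("utf-8")


def fake_visemes(text: str) -> list[str]:
    """Stand-in for the avatar stage: one placeholder viseme per word."""
    return [word[0].upper() for word in text.split()]


def run_voice_turn(audio_chunks: list[bytes], context: list[str]) -> TurnResult:
    transcript = fake_asr(audio_chunks)        # steps 2-3: audio -> text
    response = fake_llm(transcript, context)   # steps 4-5: text -> reply + emotion
    audio = fake_tts(response["text"])         # step 6: reply -> audio
    visemes = fake_visemes(response["text"])   # step 7: reply -> avatar cues
    context.append(transcript)                 # remember the turn for later context
    return TurnResult(transcript, response["text"], response["emotion"], audio, visemes)
```

A real pipeline would stream partial results between stages instead of waiting for each one to finish; the sequential version only shows the order of the hand-offs.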

### Text Turn Flow

1. **User types** → Widget sends text message
2. **Orchestrator** → Processes the message, then continues from step 4 of the voice turn flow (the ASR stages are skipped)

## Components

### Backend Services

#### Session Service

- Creates and manages sessions
- Issues ephemeral tokens
- Loads tenant configurations
- Tracks session state
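
One way to issue ephemeral tokens is to sign a short-lived payload with an HMAC. The sketch below assumes that approach; the token layout, `SECRET_KEY`, `issue_token`, and `verify_token` are all hypothetical and are not taken from the codebase.

```python
import base64
import hashlib
import hmac
import json
import time

# Hypothetical signing key; a real service would load this from a secret store.
SECRET_KEY = b"demo-secret-change-me"


def issue_token(session_id: str, tenant_id: str, ttl_seconds: int = 300, now=None) -> str:
    """Return '<payload>.<signature>' where the payload carries an expiry time."""
    issued_at = time.time() if now is None else now
    payload = json.dumps(
        {"sid": session_id, "tenant": tenant_id, "exp": issued_at + ttl_seconds},
        sort_keys=True,
    ).encode("utf-8")
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    return (
        base64.urlsafe_b64encode(payload).decode("ascii")
        + "."
        + base64.urlsafe_b64encode(signature).decode("ascii")
    )


def verify_token(token: str, now=None):
    """Return the claims if the token is authentic and unexpired, else None."""
    current = time.time() if now is None else now
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:
        return None  # malformed token
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None  # payload was altered or signed with another key
    claims = json.loads(payload)
    if claims["exp"] <= current:
        return None  # token has expired
    return claims
```

The `now` parameter exists only so the expiry logic can be exercised without waiting for real time to pass.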

#### Conversation Orchestrator

- Maintains conversation state machine
- Routes messages to appropriate services
- Handles barge-in (interruptions)
- Synchronizes audio/video
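
A conversation state machine with barge-in might look like the following table-driven sketch. The state names, event names, and the `Orchestrator` class are assumptions made for illustration; the document does not specify them.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"
    LISTENING = "listening"
    THINKING = "thinking"
    SPEAKING = "speaking"


# (current state, event) -> next state. Event names are hypothetical.
TRANSITIONS = {
    (State.IDLE, "user_speech_start"): State.LISTENING,
    (State.LISTENING, "final_transcript"): State.THINKING,
    (State.THINKING, "response_ready"): State.SPEAKING,
    (State.SPEAKING, "playback_done"): State.IDLE,
    # Barge-in: the user starts talking while a reply is being prepared or played.
    (State.THINKING, "user_speech_start"): State.LISTENING,
    (State.SPEAKING, "user_speech_start"): State.LISTENING,
}


class Orchestrator:
    def __init__(self) -> None:
        self.state = State.IDLE
        self.cancelled_outputs = 0  # counts replies cut short by barge-in

    def handle(self, event: str) -> State:
        next_state = TRANSITIONS.get((self.state, event))
        if next_state is None:
            return self.state  # event is not valid in this state; ignore it
        if event == "user_speech_start" and self.state in (State.THINKING, State.SPEAKING):
            # A real orchestrator would stop TTS and avatar output here.
            self.cancelled_outputs += 1
        self.state = next_state
        return self.state
```

Keeping the transitions in one table makes it easy to see that barge-in is just two extra edges back to `LISTENING`.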

#### LLM Gateway

- Multi-tenant prompt templates
- Function/tool calling
- Output schema enforcement
- Model routing
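
Output schema enforcement can be as simple as rejecting any model reply that is not a JSON object with the expected fields. The sketch below assumes a reply shape of text, emotion tag, and tool calls, loosely following the voice turn flow; `RESPONSE_SCHEMA`, `ALLOWED_EMOTIONS`, and `parse_llm_output` are hypothetical.

```python
import json

# Assumed reply shape: field name -> required Python type.
RESPONSE_SCHEMA = {"text": str, "emotion": str, "tool_calls": list}
# Assumed set of emotion tags the avatar can render.
ALLOWED_EMOTIONS = {"neutral", "happy", "concerned"}


def parse_llm_output(raw: str) -> dict:
    """Parse a raw model reply and raise ValueError if it violates the schema."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")
    for key, expected_type in RESPONSE_SCHEMA.items():
        if key not in data:
            raise ValueError(f"missing field: {key}")
        if not isinstance(data[key], expected_type):
            raise ValueError(f"field {key!r} must be of type {expected_type.__name__}")
    if data["emotion"] not in ALLOWED_EMOTIONS:
        raise ValueError(f"unknown emotion tag: {data['emotion']!r}")
    return data
```

On a `ValueError` the gateway could retry the model call or fall back to a safe default reply; which of these the project does is not stated here.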

#### RAG Service

- Document ingestion and embedding
- Vector similarity search
- Reranking
- Citation formatting
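
To make the retrieval step concrete, here is an in-memory sketch of cosine-similarity search followed by citation formatting. It stands in for the vector database shown in the architecture diagram and does not use it; the three-dimensional embeddings, document titles, and function names are invented for the example.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vec: list[float], index: list[dict], top_k: int = 2) -> list[dict]:
    """Return the top_k documents ranked by similarity to the query vector."""
    scored = [(cosine_similarity(query_vec, doc["embedding"]), doc) for doc in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def format_citations(docs: list[dict]) -> list[str]:
    """Number the retrieved documents so a reply can cite them as [1], [2], ..."""
    return [f"[{i}] {doc['title']}" for i, doc in enumerate(docs, start=1)]


# Toy index with made-up embeddings.
INDEX = [
    {"title": "Fee schedule", "embedding": [0.9, 0.1, 0.0]},
    {"title": "Branch hours", "embedding": [0.0, 1.0, 0.0]},
    {"title": "Card limits", "embedding": [0.5, 0.5, 0.0]},
]
```

A reranking stage would sit between `search` and `format_citations`, reordering the candidates with a more expensive scoring model.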

#### Tool/Action Service

- Tool registry and execution
- Banking service integrations
- Human-in-the-loop confirmations
- Audit logging
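
The registry, confirmation gate, and audit log can be combined in one small class. The sketch below is an assumption about how these pieces might fit together; `Tool`, `ToolService`, and the status strings are hypothetical, and the audit log is a plain list instead of durable storage.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Tool:
    name: str
    handler: Callable[..., object]
    requires_confirmation: bool = False  # True for actions that move money, etc.


class ToolService:
    def __init__(self) -> None:
        self._tools: dict[str, Tool] = {}
        self.audit_log: list[dict] = []

    def register(self, tool: Tool) -> None:
        self._tools[tool.name] = tool

    def execute(self, name: str, args: dict, confirmed: bool = False) -> dict:
        tool = self._tools.get(name)
        if tool is None:
            self.audit_log.append({"tool": name, "status": "unknown_tool"})
            raise KeyError(name)
        if tool.requires_confirmation and not confirmed:
            # Do not run the handler until the user has explicitly confirmed.
            self.audit_log.append({"tool": name, "args": args, "status": "pending_confirmation"})
            return {"status": "pending_confirmation"}
        result = tool.handler(**args)
        self.audit_log.append({"tool": name, "args": args, "status": "executed"})
        return {"status": "executed", "result": result}
```

Every call path appends to the audit log before returning or raising, so refused and pending actions are recorded alongside executed ones.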

### Frontend Widget

#### Components

- **ChatPanel**: Main chat interface
- **VoiceControls**: Push-to-talk, hands-free, volume
- **AvatarView**: Video stream display
- **Captions**: Real-time captions overlay
- **Settings**: User preferences

#### Hooks

- **useSession**: Session management
- **useConversation**: Message handling
- **useWebRTC**: WebRTC connection

### Avatar System

#### Unreal Engine

- Digital human character
- Blendshapes for visemes/expressions
- Animation blueprints
- Pixel Streaming for video output

#### Render Service

- Controls Unreal instances
- Manages GPU resources
- Streams video via WebRTC

## Security

- JWT/SSO authentication
- Ephemeral session tokens
- PII redaction
- Content filtering
- Rate limiting
- Audit trails
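
As one example of the controls listed above, rate limiting is commonly implemented with a token bucket. The document does not say which algorithm the API gateway uses, so the sketch below is only an illustration of the idea; `TokenBucket` and its parameters are hypothetical.

```python
import time


class TokenBucket:
    """Allows bursts of up to `capacity` requests, refilled at a steady rate."""

    def __init__(self, capacity: int, refill_per_second: float, clock=time.monotonic) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock  # injectable so the limiter can be tested without sleeping
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self) -> bool:
        """Consume one token if available; return False when the caller is over the limit."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Add the tokens earned since the last call, capped at the bucket size.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_per_second)
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False
```

In a multi-tenant deployment there would typically be one bucket per tenant or per session, keyed by whatever identity the gateway authenticates.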

## Accessibility

- WCAG 2.1 AA compliance
- Keyboard navigation
- Screen reader support
- Captions (always available)
- Reduced motion support
- ARIA labels

## Scalability

- Stateless services (behind load balancer)
- Redis for session caching
- PostgreSQL for persistent state
- GPU cluster for avatar rendering
- CDN for widget assets
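
The combination of stateless services, a session cache, and a persistent store is often wired together with the cache-aside pattern. The sketch below assumes that pattern and uses plain dictionaries as stand-ins for Redis and PostgreSQL; `SessionStore` is a hypothetical name.

```python
class SessionStore:
    """Cache-aside lookup: read the cache first, fall back to the database."""

    def __init__(self, cache: dict, database: dict) -> None:
        self.cache = cache          # stand-in for Redis
        self.database = database    # stand-in for PostgreSQL
        self.database_reads = 0     # counts cache misses, for illustration only

    def get(self, session_id: str):
        if session_id in self.cache:
            return self.cache[session_id]
        self.database_reads += 1
        session = self.database.get(session_id)
        if session is not None:
            self.cache[session_id] = session  # populate the cache for later reads
        return session

    def save(self, session_id: str, session: dict) -> None:
        self.database[session_id] = session
        self.cache.pop(session_id, None)  # invalidate so the next read is fresh
```

Because the instance itself holds no session data, any service replica behind the load balancer can serve any request as long as it points at the same cache and database.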