# FusionAGI Interface Layer
Complete multi-modal interface system for admin control and user interaction.
## Overview
FusionAGI now provides two comprehensive interface layers:
1. **Admin Control Panel** - System management and configuration
2. **Multi-Modal User Interface** - Full sensory user experience
```mermaid
flowchart TB
    subgraph admin [Admin Control Panel]
        Voice[Voice Library]
        Conv[Conversation Tuning]
        Agent[Agent Config]
        Monitor[System Monitoring]
        Gov[Governance / MAA]
    end
    subgraph core [FusionAGI Core]
        Orch[Orchestrator]
        Mem[Memory]
        Tools[Tools]
    end
    subgraph ui [Multi-Modal User Interface]
        Text[Text]
        VoiceUI[Voice]
        Visual[Visual]
        Haptic[Haptic]
        Session[Session Mgmt]
        Task[Task Integration]
    end
    admin --> Orch
    Orch --> Mem
    Orch --> Tools
    ui --> Orch
    Session --> Task
```
## Admin Control Panel
Administrative interface for managing all aspects of FusionAGI.
### Features
- **Voice Library Management**: Add, configure, and organize TTS voice profiles
- **Conversation Tuning**: Configure natural language styles and personalities
- **Agent Configuration**: Manage agent settings, permissions, and behavior
- **System Monitoring**: Real-time health metrics and performance tracking
- **Governance**: Policy management and audit log access
- **Manufacturing Authority**: MAA configuration and oversight
### Usage
`AdminControlPanel` accepts optional `voice_library` and `conversation_tuner` (default `None`); when omitted, internal defaults are created.
```python
from fusionagi import Orchestrator, EventBus, StateManager
from fusionagi.interfaces import AdminControlPanel, VoiceLibrary, ConversationTuner
from fusionagi.interfaces.voice import VoiceProfile
from fusionagi.interfaces.conversation import ConversationStyle

# Initialize core components
bus = EventBus()
state = StateManager()
orch = Orchestrator(event_bus=bus, state_manager=state)

# Create admin panel
admin = AdminControlPanel(
    orchestrator=orch,
    event_bus=bus,
    state_manager=state,
    voice_library=VoiceLibrary(),
    conversation_tuner=ConversationTuner(),
)

# Add voice profiles
voice = VoiceProfile(
    name="Professional Assistant",
    language="en-US",
    gender="neutral",
    style="professional",
    pitch=1.0,
    speed=1.0,
)
admin.add_voice_profile(voice)

# Configure conversation styles
style = ConversationStyle(
    formality="neutral",
    verbosity="balanced",
    empathy_level=0.8,
    technical_depth=0.6,
)
admin.register_conversation_style("technical_support", style)

# Monitor system
status = admin.get_system_status()
print(f"Status: {status.status}, Active tasks: {status.active_tasks}")

# Export configuration
config = admin.export_configuration()
```
## Multi-Modal User Interface
Unified interface supporting multiple sensory modalities simultaneously.
### Supported Modalities
- **Text**: Chat, commands, structured input
- **Voice**: Speech-to-text, text-to-speech
- **Visual**: Images, video, AR/VR (extensible)
- **Haptic**: Touch feedback, vibration patterns (extensible)
- **Gesture**: Motion control, hand tracking (extensible)
- **Biometric**: Emotion detection, physiological signals (extensible)
### Features
- Seamless modality switching
- Simultaneous multi-modal I/O
- Accessibility support
- Context-aware modality selection
- Real-time feedback across all active modalities
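Context-aware modality selection can be sketched as follows. This is a self-contained illustration, not the actual FusionAGI implementation: the `select_modalities` function, its rules, and the trimmed-down `ModalityType` enum are all assumptions about how such a policy might work.

```python
from enum import Enum

class ModalityType(Enum):
    # Mirrors a subset of fusionagi.interfaces.base.ModalityType (illustrative)
    TEXT = "text"
    VOICE = "voice"
    VISUAL = "visual"
    HAPTIC = "haptic"

def select_modalities(preferred, accessibility, available):
    """Pick output modalities: honor user preferences, then accessibility rules."""
    active = [m for m in preferred if m in available]
    # A screen-reader user should always receive a text representation
    if accessibility.get("screen_reader") and ModalityType.TEXT in available:
        if ModalityType.TEXT not in active:
            active.append(ModalityType.TEXT)
    # Fall back to text if no preferred modality is available
    return active or [ModalityType.TEXT]

chosen = select_modalities(
    preferred=[ModalityType.VOICE],
    accessibility={"screen_reader": True},
    available={ModalityType.TEXT, ModalityType.VOICE},
)
# Voice is preferred and available; text is added for the screen reader
```

The same shape of logic could back `create_session`'s `preferred_modalities` and `accessibility_settings` parameters shown in the usage example below.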
### Usage
```python
from fusionagi.interfaces import MultiModalUI, VoiceInterface, ConversationManager
from fusionagi.interfaces.base import ModalityType

# Initialize components (reusing `orch` from the admin example above)
voice = VoiceInterface(stt_provider="whisper", tts_provider="elevenlabs")
conv_manager = ConversationManager()

# Create multi-modal UI
ui = MultiModalUI(
    orchestrator=orch,
    conversation_manager=conv_manager,
    voice_interface=voice,
)

# Create user session with preferred modalities
session_id = ui.create_session(
    user_id="user123",
    preferred_modalities=[ModalityType.TEXT, ModalityType.VOICE],
    accessibility_settings={"screen_reader": True},
)

# Send multi-modal output (the `await` calls below run inside an async context)
await ui.send_to_user(
    session_id,
    "Hello! How can I help you today?",
    modalities=[ModalityType.TEXT, ModalityType.VOICE],
)

# Receive user input (any active modality)
message = await ui.receive_from_user(session_id, timeout_seconds=30.0)

# Submit task with interactive feedback
task_id = await ui.submit_task_interactive(
    session_id,
    goal="Analyze sales data and create report",
)

# Conversational interaction
response = await ui.converse(session_id, "What's the status of my task?")
```
## Voice Interface
Speech interaction with configurable voice profiles.
### Voice Library
```python
from fusionagi.interfaces import VoiceLibrary, VoiceProfile

library = VoiceLibrary()

# Add multiple voices
voices = [
    VoiceProfile(
        name="Friendly Assistant",
        language="en-US",
        gender="female",
        style="friendly",
        pitch=1.1,
        speed=1.0,
    ),
    VoiceProfile(
        name="Technical Expert",
        language="en-US",
        gender="male",
        style="professional",
        pitch=0.9,
        speed=0.95,
    ),
    VoiceProfile(
        name="Multilingual Guide",
        language="es-ES",
        gender="neutral",
        style="calm",
    ),
]
for voice in voices:
    library.add_voice(voice)

# Set default
library.set_default_voice(voices[0].id)

# Filter voices
spanish_voices = library.list_voices(language="es-ES")
female_voices = library.list_voices(gender="female")
```
### Speech-to-Text Providers
Supported STT providers (extensible):
- **Whisper**: OpenAI Whisper (local or API)
- **Azure**: Azure Cognitive Services
- **Google**: Google Cloud Speech-to-Text
- **Deepgram**: Deepgram API
### Text-to-Speech Providers
Supported TTS providers (extensible):
- **System**: OS-native TTS (pyttsx3)
- **ElevenLabs**: ElevenLabs API
- **Azure**: Azure Cognitive Services
- **Google**: Google Cloud TTS
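The provider lists above suggest a plug-in pattern behind `stt_provider=` / `tts_provider=`. A minimal, self-contained sketch of how such a registry could work follows; the names (`register_tts`, `make_tts`, `SystemTTS`) and signatures are illustrative assumptions, not the actual `VoiceInterface` wiring.

```python
from typing import Callable, Dict, Protocol

class TTSProvider(Protocol):
    def synthesize(self, text: str, voice: str) -> bytes: ...

# Hypothetical registry: maps provider names to factory callables
_TTS_REGISTRY: Dict[str, Callable[[], TTSProvider]] = {}

def register_tts(name: str, factory: Callable[[], TTSProvider]) -> None:
    _TTS_REGISTRY[name.lower()] = factory

def make_tts(name: str) -> TTSProvider:
    try:
        return _TTS_REGISTRY[name.lower()]()
    except KeyError:
        raise ValueError(f"Unknown TTS provider: {name!r}") from None

# A trivial stand-in for the OS-native backend
class SystemTTS:
    def synthesize(self, text: str, voice: str) -> bytes:
        return f"[{voice}] {text}".encode()

register_tts("system", SystemTTS)
audio = make_tts("system").synthesize("Hello", voice="default")
# audio == b"[default] Hello"
```

New STT/TTS backends then only need to implement the provider protocol and register a factory, which is what "extensible" means in the lists above.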
## Conversation Management
Natural language conversation with tunable styles.
### Conversation Styles
```python
from fusionagi.interfaces import ConversationTuner, ConversationStyle

tuner = ConversationTuner()

# Define conversation styles
styles = {
    "customer_support": ConversationStyle(
        formality="neutral",
        verbosity="balanced",
        empathy_level=0.9,
        proactivity=0.8,
        technical_depth=0.4,
    ),
    "technical_expert": ConversationStyle(
        formality="formal",
        verbosity="detailed",
        empathy_level=0.5,
        technical_depth=0.9,
        humor_level=0.1,
    ),
    "casual_friend": ConversationStyle(
        formality="casual",
        verbosity="balanced",
        empathy_level=0.8,
        humor_level=0.7,
        technical_depth=0.3,
    ),
}
for name, style in styles.items():
    tuner.register_style(name, style)

# Tune for specific context
tuned_style = tuner.tune_for_context(
    domain="technical",
    user_preferences={"verbosity": "concise"},
)
```
### Conversation Sessions
```python
from fusionagi.interfaces import ConversationManager, ConversationTurn

# Reuse the tuner from the previous example
manager = ConversationManager(tuner=tuner)

# Create session
session_id = manager.create_session(
    user_id="user123",
    style_name="customer_support",
    language="en",
    domain="technical_support",
)

# Add conversation turns
manager.add_turn(ConversationTurn(
    session_id=session_id,
    speaker="user",
    content="My system is not responding",
    sentiment=-0.3,
))
manager.add_turn(ConversationTurn(
    session_id=session_id,
    speaker="agent",
    content="I understand that's frustrating. Let me help you troubleshoot.",
    sentiment=0.5,
))

# Get conversation history
history = manager.get_history(session_id, limit=10)

# Get context for LLM
context = manager.get_context_summary(session_id)
```
## Extending with New Modalities
To add a new sensory modality:
1. **Create Interface Adapter**:
```python
import uuid

from fusionagi.interfaces.base import (
    InterfaceAdapter,
    InterfaceCapabilities,
    InterfaceMessage,
    ModalityType,
)

class HapticInterface(InterfaceAdapter):
    def __init__(self):
        super().__init__("haptic")

    def capabilities(self) -> InterfaceCapabilities:
        return InterfaceCapabilities(
            supported_modalities=[ModalityType.HAPTIC],
            supports_streaming=True,
            supports_interruption=True,
        )

    async def send(self, message: InterfaceMessage) -> None:
        # Send haptic feedback (vibration pattern, etc.)
        pattern = message.content
        await self._send_haptic_pattern(pattern)

    async def receive(self, timeout_seconds: float | None = None) -> InterfaceMessage | None:
        # Receive haptic input (touch, pressure, etc.)
        data = await self._read_haptic_sensor(timeout_seconds)
        return InterfaceMessage(
            id=f"haptic_{uuid.uuid4().hex[:8]}",
            modality=ModalityType.HAPTIC,
            content=data,
        )
```
2. **Register with UI**:
```python
haptic = HapticInterface()
ui.register_interface(ModalityType.HAPTIC, haptic)
```
3. **Enable for Session**:
```python
ui.enable_modality(session_id, ModalityType.HAPTIC)
```
## Architecture
```
┌─────────────────────────────────────────────────────────────┐
│                     Admin Control Panel                     │
│  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐     │
│  │ Voice Library│   │ Conversation │   │ Agent Config │     │
│  │  Management  │   │    Tuning    │   │              │     │
│  └──────────────┘   └──────────────┘   └──────────────┘     │
│  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐     │
│  │    System    │   │  Governance  │   │     MAA      │     │
│  │  Monitoring  │   │   & Audit    │   │   Control    │     │
│  └──────────────┘   └──────────────┘   └──────────────┘     │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│                    FusionAGI Core System                    │
│            (Orchestrator, Agents, Memory, Tools)            │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│                 Multi-Modal User Interface                  │
│  ┌──────────┐   ┌──────────┐   ┌──────────┐   ┌──────────┐  │
│  │   Text   │   │  Voice   │   │  Visual  │   │  Haptic  │  │
│  │Interface │   │Interface │   │Interface │   │Interface │  │
│  └──────────┘   └──────────┘   └──────────┘   └──────────┘  │
│  ┌──────────┐   ┌──────────┐                                │
│  │ Gesture  │   │Biometric │                                │
│  │Interface │   │Interface │                                │
│  └──────────┘   └──────────┘                                │
└─────────────────────────────────────────────────────────────┘
```
## Integration with FusionAGI Core
The interface layer integrates seamlessly with FusionAGI's core components:
- **Orchestrator**: Task submission and monitoring
- **Event Bus**: Real-time updates and notifications
- **Agents**: Direct agent interaction and configuration
- **Memory**: Conversation history and user preferences
- **Governance**: Policy enforcement and audit logging
- **MAA**: Manufacturing authority oversight
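As an illustration of the Event Bus role in this list, here is a minimal self-contained pub/sub sketch. The real `fusionagi.EventBus` API is likely richer (async delivery, topic patterns), so the `subscribe`/`publish` signatures and the `"task.completed"` topic below are assumptions.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List

class EventBus:
    """Minimal stand-in for fusionagi's EventBus (illustrative only)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[Any], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[Any], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: Any) -> None:
        # Deliver the event to every handler registered for this topic
        for handler in self._subscribers[topic]:
            handler(payload)

# The interface layer would subscribe session handlers to task topics,
# so completed tasks surface as real-time updates in the user's session
received = []
bus = EventBus()
bus.subscribe("task.completed", lambda event: received.append(event))
bus.publish("task.completed", {"task_id": "t-1", "status": "done"})
```

This is the mechanism behind "real-time updates and notifications" above: the admin panel and multi-modal UI both observe core events rather than polling the Orchestrator.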
## Next Steps
1. **Implement STT/TTS Providers**: Integrate with actual speech services
2. **Build Web UI**: Create web-based admin panel and user interface
3. **Add Visual Modality**: Support images, video, AR/VR
4. **Implement Haptic**: Add haptic feedback support
5. **Gesture Recognition**: Integrate motion tracking
6. **Biometric Sensors**: Add emotion and physiological monitoring
7. **Mobile Apps**: Native iOS/Android interfaces
8. **Accessibility**: Enhanced screen reader and assistive technology support
## License
MIT