# ADR-003: Consequence Engine for Ethical Learning
## Status
Accepted
## Context
Traditional AI ethics systems fix their rules or reward signal ahead of deployment (for example, constitutional AI or RLHF reward models). FusionAGI needed a system that could learn ethical behavior from experience: every choice carries consequences, and risk/reward assessment should improve as outcome data accumulates.
## Decision
Implemented a **ConsequenceEngine** that:
1. Records every choice the system makes (action + alternatives considered)
2. Estimates risk and reward before acting
3. Records actual outcomes after execution
4. Computes a "surprise factor" (the prediction error between estimated and actual outcomes)
5. Feeds the results into AdaptiveEthics for lesson generation
6. Uses an adaptive risk-memory window that grows with experience

Ethical lesson weights are **unclamped**: extreme outcomes can push a weight below 0 (a strong negative signal) or above 1.
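
The loop described above can be illustrated with a minimal sketch. This is not the FusionAGI implementation: the class and method names (`ConsequenceEngineSketch`, `ChoiceRecord`, `record_choice`, `record_outcome`, `window_size`), the parameters `base_window` and `learning_rate`, the additive weight update, and the "one extra slot per ten outcomes" growth rule are all assumptions chosen for illustration. A plain dictionary stands in for the hand-off to AdaptiveEthics.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ChoiceRecord:
    """One decision: the action taken, the alternatives, and the estimates."""
    action: str
    alternatives: list[str]
    est_risk: float                    # estimated before acting
    est_reward: float                  # estimated before acting
    actual: Optional[float] = None     # observed net outcome, filled in later
    surprise: Optional[float] = None   # actual minus predicted net value


@dataclass
class ConsequenceEngineSketch:
    base_window: int = 10        # assumed starting size of the risk memory
    learning_rate: float = 0.5   # assumed step size for lesson weights
    records: list[ChoiceRecord] = field(default_factory=list)
    # Hypothetical stand-in for lessons handed to AdaptiveEthics.
    lesson_weights: dict[str, float] = field(default_factory=dict)

    def record_choice(self, action: str, alternatives: list[str],
                      est_risk: float, est_reward: float) -> ChoiceRecord:
        """Steps 1 and 2: log the choice together with its prior estimates."""
        rec = ChoiceRecord(action, list(alternatives), est_risk, est_reward)
        self.records.append(rec)
        return rec

    def record_outcome(self, rec: ChoiceRecord, actual: float) -> float:
        """Steps 3 to 5: store the outcome, compute surprise, update the lesson."""
        predicted = rec.est_reward - rec.est_risk
        rec.actual = actual
        rec.surprise = actual - predicted
        old = self.lesson_weights.get(rec.action, 0.5)
        # Deliberately no clamping: the weight may leave the [0, 1] range.
        self.lesson_weights[rec.action] = old + self.learning_rate * rec.surprise
        return rec.surprise

    def window_size(self) -> int:
        """Step 6: the memory window grows by one slot per ten recorded outcomes."""
        done = sum(1 for r in self.records if r.actual is not None)
        return self.base_window + done // 10

    def recent_risk(self) -> float:
        """Mean estimated risk over the current memory window."""
        recent = self.records[-self.window_size():]
        return sum(r.est_risk for r in recent) / len(recent) if recent else 0.0


engine = ConsequenceEngineSketch()
rec = engine.record_choice("share_data", ["withhold", "ask_user"],
                           est_risk=0.25, est_reward=0.75)
engine.record_outcome(rec, actual=-2.0)       # much worse than predicted
print(engine.lesson_weights["share_data"])    # -0.75, below 0
```

With these assumed numbers, a single badly mispredicted outcome drives the weight negative, which is the behavior a clamped update would suppress.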
## Consequences
- The system develops its ethics from recorded experience rather than by following fixed rules
- Early-stage behavior may be more exploratory (and therefore higher risk) while few outcomes have been recorded
- All consequence records are persisted via PersistentLearningStore
- Cross-head learning via InsightBus amplifies ethical insights
- The SelfModel's values evolve based on consequence feedback
## Alternatives Considered
1. **RLHF-style reward model**: rejected because it requires a human feedback loop and does not scale
2. **Constitutional AI**: rejected because its rules are static and it does not learn from outcomes
3. **No ethics system**: rejected because the system needs accountability and a learning signal