# Mempool Service Architecture Specification

## Overview

This document specifies the mempool service, which tracks pending transactions, monitors transaction propagation, and provides real-time mempool insights.

## Architecture

```mermaid
flowchart TB
    subgraph RPC[RPC Node]
        WS[WebSocket<br/>New Pending Tx]
        Poll[Poll<br/>Pending Tx]
    end

    subgraph Ingest[Ingestion]
        Listener[Transaction Listener]
        Validator[Transaction Validator]
        Queue[Message Queue]
    end

    subgraph Process[Processing]
        Tracker[Propagation Tracker]
        RBF[RBF Detector]
        Fee[Fee Calculator]
        Storage[Storage]
    end

    subgraph Output[Output]
        API[API Endpoints]
        WS_Out[WebSocket<br/>Subscriptions]
        Metrics[Metrics]
    end

    WS --> Listener
    Poll --> Listener
    Listener --> Validator
    Validator --> Queue
    Queue --> Tracker
    Queue --> RBF
    Queue --> Fee
    Tracker --> Storage
    RBF --> Storage
    Fee --> Storage
    Storage --> API
    Storage --> WS_Out
    Storage --> Metrics
```

## Transaction Ingestion

### Sources

**1. WebSocket Subscription**:
- Subscribe to `newPendingTransactions` via `eth_subscribe`
- Real-time updates as transactions enter the mempool

**2. Polling**:
- Poll `eth_pendingTransactions` every 1-2 seconds (where the node exposes this method)
- Fallback if WebSocket is unavailable

**3. Transaction Submission**:
- Track transactions submitted via our API
- Link user submissions to mempool status

### Transaction Validation

**Validation Checks**:
- Valid transaction format
- Valid signature
- Valid nonce (checked against account state)
- Sufficient balance (for value transfers)
- Valid gas price (above minimum)

**Invalid Transactions**: Log but do not track; the network will reject them.

## Transaction Tracking

### Transaction State

**States**:
- `pending`: In mempool
- `confirmed`: Included in block
- `dropped`: Removed from mempool without being confirmed (e.g. expired); replacements are recorded as `replaced`
- `replaced`: Replaced by a higher gas price transaction (RBF)

### Propagation Tracking

**Metrics Tracked**:
- First seen timestamp
- Last seen timestamp
- Propagation time (first seen → confirmed)
- Propagation path (which nodes saw it first)

**Method**:
- Track when a transaction first appears
- Monitor the mempool across multiple nodes
- Calculate propagation statistics

### RBF (Replace-by-Fee) Detection

**Detection**:
- Monitor transactions from the same sender with the same nonce
- Detect higher gas price replacements
- Link the old transaction to the new transaction

**Data Stored**:
- Original transaction hash
- Replacement transaction hash
- Gas price increase
- Replacement timestamp

### Bundle/MEV Visibility

**Detection** (where supported):
- Identify transaction bundles
- Detect MEV patterns
- Track front-running/back-running

**Limitations**: Depends on chain/node capabilities

### Private Transaction Markers

**Detection**:
- Identify private transaction services (Flashbots, etc.)
- Mark transactions as private
- Track private vs public mempool

## Storage Schema

See `../database/timeseries-schema.md` for detailed schema.

**Key Fields**:
- Transaction hash
- From/to addresses
- Value, gas price, gas limit
- Nonce
- First seen timestamp
- Status
- Confirmed block number (when confirmed)
- Confirmed timestamp

## Fee Estimation

### Fee Calculation

**Methods**:
1. **Historical Analysis**: Analyze recent block fees
2. **Percentile Method**: Calculate percentiles of recent transactions
3. **Market-Based**: Track current mempool competition

**Fee Estimates**:
- Slow: 25th percentile
- Standard: 50th percentile (median)
- Fast: 75th percentile
- Urgent: 95th percentile

**Update Frequency**: Every block (real-time)

## API Endpoints

### Get Pending Transactions

`GET /api/v1/mempool/{chain_id}/transactions`

**Query Parameters**:
- `from_address`: Filter by sender
- `to_address`: Filter by recipient
- `min_value`: Minimum value
- `min_gas_price`: Minimum gas price
- `limit`: Max results (default: 100)

**Response**: Array of pending transactions

### Get Transaction Status

`GET /api/v1/mempool/{chain_id}/transactions/{hash}`

**Response**: Transaction status and propagation info

### Get Fee Estimates

`GET /api/v1/mempool/{chain_id}/fees`

**Response**:
```json
{
  "slow": "20000000000",
  "standard": "30000000000",
  "fast": "50000000000",
  "urgent": "100000000000"
}
```

## WebSocket Subscriptions

See `../api/websocket-api.md` for WebSocket API details.


**Channels**:
- `pending_transactions`: New pending transactions
- `transaction_status`: Status updates for specific transactions
- `fee_updates`: Fee estimate updates

## Data Retention

- **Raw Data**: 7 days (detailed transaction data)
- **Aggregated Data**: 30 days (fee statistics, propagation metrics)
- **Archived Data**: Move to data lake after retention period

## Performance Considerations

### Throughput

**Target**: Process 1,000 transactions/second

**Scalability**: Horizontal scaling with message queue

### Latency

**Target**:
- Ingestion latency: < 1 second
- API response time: < 100ms (p95)

### Storage Optimization

**Strategy**:
- Time-series database for efficient queries
- Partition by time (daily partitions)
- Automatic cleanup of old data

## Monitoring

### Metrics

- Pending transaction count
- Transaction ingestion rate
- Confirmation rate
- Average propagation time
- Fee estimate accuracy

### Alerts

- High pending transaction count (> 10,000)
- Low confirmation rate (< 50% within 5 minutes)
- Fee estimate errors

## References

- Time-Series Schema: See `../database/timeseries-schema.md`
- WebSocket API: See `../api/websocket-api.md`
- Fee Oracle: See `fee-oracle.md`
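
As an illustration of the percentile method described under Fee Estimation, the sketch below derives the four tiers (slow, standard, fast, urgent) from a list of recent gas prices. The function names `nearest_rank` and `estimate_fees`, the nearest-rank percentile rule, and the sample prices are assumptions made for this example; the specification does not prescribe an interpolation method. Values are returned as decimal strings in wei to match the shape of the `/fees` response.

```python
# Percentile tiers as listed under "Fee Estimates".
TIERS = {"slow": 25, "standard": 50, "fast": 75, "urgent": 95}


def nearest_rank(sorted_values, percentile):
    """Return the nearest-rank percentile of an ascending, non-empty list."""
    # Integer ceiling of percentile * n / 100, clamped to at least rank 1.
    rank = max(1, (percentile * len(sorted_values) + 99) // 100)
    return sorted_values[rank - 1]


def estimate_fees(gas_prices_wei):
    """Map each tier name to a gas price, encoded as a decimal string in wei."""
    if not gas_prices_wei:
        raise ValueError("need at least one recent gas price")
    ordered = sorted(gas_prices_wei)
    return {tier: str(nearest_rank(ordered, p)) for tier, p in TIERS.items()}


# Hypothetical sample: gas prices (in gwei) of recently confirmed transactions.
sample = [gwei * 10**9 for gwei in (18, 20, 22, 25, 30, 30, 35, 50, 60, 100)]
fees = estimate_fees(sample)
```

Nearest-rank always returns a price that was actually observed, which keeps the estimates easy to audit. An interpolating percentile would give smoother values on small samples.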
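
The RBF detection rule (same nonce, higher gas price) can be sketched in a similar way. `PendingTx`, `Replacement`, `RbfDetector`, and the in-memory dictionary are hypothetical names for this example. The sketch also assumes that transactions are keyed by sender as well as nonce, and it ignores any minimum price bump a node may require. A real deployment would read from and write to the storage layer.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class PendingTx:
    tx_hash: str
    sender: str
    nonce: int
    gas_price: int   # wei
    seen_at: float   # Unix timestamp of first sighting


@dataclass(frozen=True)
class Replacement:
    # Mirrors the "Data Stored" list for RBF detection.
    original_hash: str
    replacement_hash: str
    gas_price_increase: int
    replaced_at: float


class RbfDetector:
    def __init__(self) -> None:
        # Latest known transaction for each (sender, nonce) slot.
        self._by_slot: Dict[Tuple[str, int], PendingTx] = {}

    def observe(self, tx: PendingTx) -> Optional[Replacement]:
        """Record tx; return a Replacement if it outbids the slot's current tx."""
        slot = (tx.sender.lower(), tx.nonce)
        current = self._by_slot.get(slot)
        if current is None:
            self._by_slot[slot] = tx
            return None
        if tx.tx_hash == current.tx_hash or tx.gas_price <= current.gas_price:
            return None  # duplicate sighting or not a higher bid
        self._by_slot[slot] = tx
        return Replacement(
            original_hash=current.tx_hash,
            replacement_hash=tx.tx_hash,
            gas_price_increase=tx.gas_price - current.gas_price,
            replaced_at=tx.seen_at,
        )
```

When `observe` returns a `Replacement`, the caller would mark the original transaction as `replaced` and keep the new one as `pending`.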