# Mempool Service Architecture Specification
## Overview
This document specifies the mempool service that tracks pending transactions, monitors transaction propagation, and provides real-time mempool insights.
## Architecture
```mermaid
flowchart TB
subgraph RPC[RPC Node]
WS[WebSocket<br/>New Pending Tx]
Poll[Poll<br/>Pending Tx]
end
subgraph Ingest[Ingestion]
Listener[Transaction Listener]
Validator[Transaction Validator]
Queue[Message Queue]
end
subgraph Process[Processing]
Tracker[Propagation Tracker]
RBF[RBF Detector]
Fee[Fee Calculator]
Storage[Storage]
end
subgraph Output[Output]
API[API Endpoints]
WS_Out[WebSocket<br/>Subscriptions]
Metrics[Metrics]
end
WS --> Listener
Poll --> Listener
Listener --> Validator
Validator --> Queue
Queue --> Tracker
Queue --> RBF
Queue --> Fee
Tracker --> Storage
RBF --> Storage
Fee --> Storage
Storage --> API
Storage --> WS_Out
Storage --> Metrics
```
## Transaction Ingestion
### Sources
**1. WebSocket Subscription**:
- Subscribe to `newPendingTransactions` via `eth_subscribe` (web3.js exposes this subscription as `pendingTransactions`)
- Real-time updates as transactions enter mempool
**2. Polling**:
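- Poll `eth_pendingTransactions` every 1-2 seconds (some clients return only locally submitted transactions from this call; `txpool_content` is an alternative where available)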
- Fallback if WebSocket unavailable
**3. Transaction Submission**:
- Track transactions submitted via our API
- Link user submissions to mempool status
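As an illustration of the WebSocket source, the sketch below builds the JSON-RPC subscription request and pulls the transaction hash out of each notification. It assumes the `newPendingTransactions` subscription name used by Geth-style nodes and the standard `eth_subscription` notification shape; the helper names `subscribe_request` and `extract_tx_hash` are hypothetical, and the WebSocket transport itself is left out.

```python
import json
from itertools import count
from typing import Optional

_request_ids = count(1)  # monotonically increasing JSON-RPC request ids


def subscribe_request(topic: str = "newPendingTransactions") -> str:
    """Build the JSON-RPC payload that opens a pending-transaction subscription."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "eth_subscribe",
        "params": [topic],
    })


def extract_tx_hash(message: str) -> Optional[str]:
    """Return the transaction hash from a subscription notification.

    Non-notification messages (such as the initial response that carries
    the subscription id) yield None.
    """
    payload = json.loads(message)
    if payload.get("method") != "eth_subscription":
        return None
    return payload["params"]["result"]
```

The listener would send `subscribe_request()` once after connecting and pass every incoming frame through `extract_tx_hash`, falling back to polling if the connection drops.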
### Transaction Validation
**Validation Checks**:
- Valid transaction format
- Valid signature
- Valid nonce (not lower than the sender's current account nonce)
- Sufficient balance (covers the transferred value plus the maximum gas cost)
- Valid gas price (at or above the configured minimum)
**Invalid Transactions**: Log but do not track; the network will reject them.
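The checks above could be expressed roughly as follows. This is a minimal sketch, not the service's actual validator: the `PendingTx` and `AccountState` types and the `min_gas_price` parameter are assumptions made for illustration, the format check only looks at the hash, and signature verification is omitted because it needs a cryptography library.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PendingTx:
    hash: str
    sender: str
    nonce: int
    value: int       # wei
    gas_price: int   # wei per unit of gas
    gas_limit: int


@dataclass
class AccountState:
    nonce: int       # next nonce the account is expected to use
    balance: int     # wei


def validate(tx: PendingTx, account: AccountState, min_gas_price: int) -> List[str]:
    """Return the reasons a transaction fails validation (empty list = valid)."""
    errors = []
    # Format check: a transaction hash is 32 bytes, hex-encoded with a 0x prefix.
    if not (tx.hash.startswith("0x") and len(tx.hash) == 66):
        errors.append("malformed hash")
    # Nonce check: a nonce below the account nonce has already been used.
    if tx.nonce < account.nonce:
        errors.append("nonce too low")
    # Balance check: value plus the worst-case gas cost must be affordable.
    if tx.value + tx.gas_price * tx.gas_limit > account.balance:
        errors.append("insufficient balance")
    if tx.gas_price < min_gas_price:
        errors.append("gas price below minimum")
    return errors
```

Returning the list of reasons rather than a boolean makes it straightforward to log why a transaction was not tracked.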
## Transaction Tracking
### Transaction State
**States**:
- `pending`: In mempool
- `confirmed`: Included in block
- `dropped`: Removed from mempool without being confirmed (expired or evicted)
- `replaced`: Superseded by a transaction from the same sender with the same nonce and a higher gas price (RBF)
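One way to model these states is an enum with an explicit transition table. The sketch below assumes that `pending` is the only state with outgoing transitions and that the other three are terminal; the spec does not say whether a dropped transaction can reappear or how reorgs are handled, so treat the table as a placeholder.

```python
from enum import Enum


class TxState(Enum):
    PENDING = "pending"
    CONFIRMED = "confirmed"
    DROPPED = "dropped"
    REPLACED = "replaced"


# Assumption: only pending transactions change state; everything else is terminal.
ALLOWED_TRANSITIONS = {
    TxState.PENDING: {TxState.CONFIRMED, TxState.DROPPED, TxState.REPLACED},
    TxState.CONFIRMED: set(),
    TxState.DROPPED: set(),
    TxState.REPLACED: set(),
}


def can_transition(current: TxState, target: TxState) -> bool:
    """Return True if moving from `current` to `target` is permitted."""
    return target in ALLOWED_TRANSITIONS[current]
```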
### Propagation Tracking
**Metrics Tracked**:
- First seen timestamp
- Last seen timestamp
- Propagation time (measured here as first seen → confirmed)
- Propagation path (which nodes saw it first)
**Method**:
- Track when transaction first appears
- Monitor mempool across multiple nodes
- Calculate propagation statistics
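A rough sketch of the per-transaction bookkeeping this implies is shown below. The `PropagationTracker` class, its method names, and the use of string node identifiers with Unix timestamps are all assumptions for illustration; a real tracker would persist this state rather than hold it in memory.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Sighting:
    first_seen: float                 # earliest timestamp across all nodes
    last_seen: float                  # most recent timestamp across all nodes
    first_node: str                   # node that reported the transaction first
    seen_by: Dict[str, float] = field(default_factory=dict)  # node -> first sighting


class PropagationTracker:
    def __init__(self) -> None:
        self._sightings: Dict[str, Sighting] = {}

    def observe(self, tx_hash: str, node_id: str, ts: float) -> None:
        """Record that `node_id` had `tx_hash` in its mempool at time `ts`."""
        s = self._sightings.get(tx_hash)
        if s is None:
            self._sightings[tx_hash] = Sighting(ts, ts, node_id, {node_id: ts})
            return
        prev = s.seen_by.get(node_id)
        if prev is None or ts < prev:
            s.seen_by[node_id] = ts
        if ts < s.first_seen:
            s.first_seen, s.first_node = ts, node_id
        s.last_seen = max(s.last_seen, ts)

    def sighting(self, tx_hash: str) -> Optional[Sighting]:
        return self._sightings.get(tx_hash)

    def spread(self, tx_hash: str) -> Optional[float]:
        """Seconds between the first and last node to first see the transaction."""
        s = self._sightings.get(tx_hash)
        if s is None:
            return None
        return max(s.seen_by.values()) - min(s.seen_by.values())
```

Ordering the `seen_by` entries by timestamp gives an approximation of the propagation path across the monitored nodes.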
### RBF (Replace-by-Fee) Detection
**Detection**:
- Monitor transactions with same nonce
- Detect higher gas price replacements
- Link old transaction to new transaction
**Data Stored**:
- Original transaction hash
- Replacement transaction hash
- Gas price increase
- Replacement timestamp
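The detection rule can be sketched by keying pending transactions on `(sender, nonce)` and comparing gas prices when a second transaction arrives for the same key. The `RbfDetector` and `Replacement` names are hypothetical, and this sketch treats any price increase as a replacement, whereas nodes commonly require a minimum percentage bump before accepting one.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Replacement:
    original_hash: str
    replacement_hash: str
    gas_price_increase: int   # wei per unit of gas
    replaced_at: float        # timestamp of the replacement sighting


class RbfDetector:
    def __init__(self) -> None:
        # (sender, nonce) -> (hash, gas_price) of the transaction currently tracked
        self._by_slot: Dict[Tuple[str, int], Tuple[str, int]] = {}

    def observe(self, tx_hash: str, sender: str, nonce: int,
                gas_price: int, ts: float) -> Optional[Replacement]:
        """Track a pending transaction; return a Replacement if it supersedes another."""
        slot = (sender.lower(), nonce)
        current = self._by_slot.get(slot)
        if current is None:
            self._by_slot[slot] = (tx_hash, gas_price)
            return None
        old_hash, old_price = current
        # Same transaction seen again, or a cheaper competitor: not a replacement.
        if tx_hash == old_hash or gas_price <= old_price:
            return None
        self._by_slot[slot] = (tx_hash, gas_price)
        return Replacement(old_hash, tx_hash, gas_price - old_price, ts)
```

The returned record carries the four stored fields listed above, so it can be written to storage as-is and used to move the original transaction to `replaced`.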
### Bundle/MEV Visibility
**Detection** (where supported):
- Identify transaction bundles
- Detect MEV patterns
- Track front-running/back-running
**Limitations**: Depends on chain/node capabilities
### Private Transaction Markers
**Detection**:
- Identify private transaction services (Flashbots, etc.)
- Mark transactions as private
- Track private vs public mempool
## Storage Schema
See `../database/timeseries-schema.md` for detailed schema.
**Key Fields**:
- Transaction hash
- From/to addresses
- Value, gas price, gas limit
- Nonce
- First seen timestamp
- Status
- Confirmed block number (when confirmed)
- Confirmed timestamp
## Fee Estimation
### Fee Calculation
**Methods**:
1. **Historical Analysis**: Analyze recent block fees
2. **Percentile Method**: Calculate percentiles of recent transactions
3. **Market-Based**: Track current mempool competition
**Fee Estimates**:
- Slow: 25th percentile
- Standard: 50th percentile (median)
- Fast: 75th percentile
- Urgent: 95th percentile
**Update Frequency**: Every block (real-time)
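The percentile tiers can be computed along the following lines. This sketch uses the nearest-rank percentile definition, which is an assumption (the spec does not name one), and takes a plain list of recent gas prices in wei; the function names are hypothetical. Results are returned as decimal strings to match the fee response format.

```python
from typing import Dict, List, Sequence

# Tier name -> percentile, as listed in the fee estimates above.
TIERS = {"slow": 25, "standard": 50, "fast": 75, "urgent": 95}


def nearest_rank(sorted_values: List[int], pct: int) -> int:
    """Nearest-rank percentile: the value at rank ceil(pct/100 * n), 1-indexed."""
    n = len(sorted_values)
    rank = max(1, (pct * n + 99) // 100)  # integer ceiling avoids float rounding
    return sorted_values[rank - 1]


def estimate_fees(gas_prices: Sequence[int]) -> Dict[str, str]:
    """Map each tier to a gas price (wei, as a decimal string)."""
    if not gas_prices:
        raise ValueError("need at least one gas price sample")
    ordered = sorted(gas_prices)
    return {tier: str(nearest_rank(ordered, pct)) for tier, pct in TIERS.items()}
```

For example, with samples of 1 to 20 gwei in 1 gwei steps, the tiers come out at 5, 10, 15, and 19 gwei respectively.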
## API Endpoints
### Get Pending Transactions
`GET /api/v1/mempool/{chain_id}/transactions`
**Query Parameters**:
- `from_address`: Filter by sender
- `to_address`: Filter by recipient
- `min_value`: Minimum value
- `min_gas_price`: Minimum gas price
- `limit`: Max results (default: 100)
**Response**: Array of pending transactions
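To make the filter semantics concrete, here is a sketch of how the query parameters might be applied to an in-memory list. The `MempoolTx` type and `filter_pending` function are hypothetical, address matching is assumed to be case-insensitive, and all filters are assumed to combine with AND; the spec does not state either point.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class MempoolTx:
    hash: str
    from_address: str
    to_address: Optional[str]   # None for contract creation
    value: int                  # wei
    gas_price: int              # wei per unit of gas


def filter_pending(txs: Iterable[MempoolTx],
                   from_address: Optional[str] = None,
                   to_address: Optional[str] = None,
                   min_value: int = 0,
                   min_gas_price: int = 0,
                   limit: int = 100) -> List[MempoolTx]:
    """Apply the endpoint's query parameters; all filters must match."""
    results: List[MempoolTx] = []
    for tx in txs:
        if from_address and tx.from_address.lower() != from_address.lower():
            continue
        if to_address and (tx.to_address or "").lower() != to_address.lower():
            continue
        if tx.value < min_value or tx.gas_price < min_gas_price:
            continue
        results.append(tx)
        if len(results) >= limit:
            break
    return results
```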
### Get Transaction Status
`GET /api/v1/mempool/{chain_id}/transactions/{hash}`
**Response**: Transaction status and propagation info
### Get Fee Estimates
`GET /api/v1/mempool/{chain_id}/fees`
**Response**:
```json
{
"slow": "20000000000",
"standard": "30000000000",
"fast": "50000000000",
"urgent": "100000000000"
}
```
## WebSocket Subscriptions
See `../api/websocket-api.md` for WebSocket API details.
**Channels**:
- `pending_transactions`: New pending transactions
- `transaction_status`: Status updates for specific transactions
- `fee_updates`: Fee estimate updates
## Data Retention
**Raw Data**: 7 days (detailed transaction data)
**Aggregated Data**: 30 days (fee statistics, propagation metrics)
**Archived Data**: Move to data lake after retention period
## Performance Considerations
### Throughput
**Target**: Process 1000 transactions/second
**Scalability**: Horizontal scaling with message queue
### Latency
**Target**:
- Ingestion latency: < 1 second
- API response time: < 100ms (p95)
### Storage Optimization
**Strategy**:
- Time-series database for efficient queries
- Partition by time (daily partitions)
- Automatic cleanup of old data
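The daily partitioning and cleanup strategy might look like the sketch below, which picks the partitions that have aged past the 7-day raw-data retention window. The `mempool_tx_YYYYMMDD` naming scheme and both function names are assumptions; the actual schema lives in the time-series schema document.

```python
from datetime import date, timedelta
from typing import Iterable, List


def partition_name(day: date, prefix: str = "mempool_tx") -> str:
    """Name of the daily partition holding data for `day` (hypothetical scheme)."""
    return f"{prefix}_{day:%Y%m%d}"


def expired_partitions(existing_days: Iterable[date], today: date,
                       retention_days: int = 7) -> List[str]:
    """Return names of partitions strictly older than the retention window."""
    cutoff = today - timedelta(days=retention_days)
    return [partition_name(d) for d in sorted(existing_days) if d < cutoff]
```

A scheduled job could archive and then drop whatever `expired_partitions` returns, which keeps cleanup cost independent of row count.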
## Monitoring
### Metrics
- Pending transaction count
- Transaction ingestion rate
- Confirmation rate
- Average propagation time
- Fee estimate accuracy
### Alerts
- High pending transaction count (> 10,000)
- Low confirmation rate (< 50% within 5 minutes)
- Fee estimate errors
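The two threshold-based alerts can be evaluated with a small function such as the one below. The alert identifiers and the input counters are hypothetical: the sketch assumes the service can count how many transactions first seen at least five minutes ago have since been confirmed. Fee estimate errors are left out because the spec does not define a threshold for them.

```python
from typing import List

MAX_PENDING = 10_000          # alert when the pending count exceeds this
MIN_CONFIRMATION_RATE = 0.5   # alert when fewer than 50% confirm within 5 minutes


def evaluate_alerts(pending_count: int, seen_5m_ago: int,
                    confirmed_of_those: int) -> List[str]:
    """Return the alerts that should fire for the current snapshot.

    seen_5m_ago: transactions first seen at least 5 minutes ago.
    confirmed_of_those: how many of them were confirmed within 5 minutes.
    """
    alerts = []
    if pending_count > MAX_PENDING:
        alerts.append("high_pending_count")
    # Skip the rate check when there is no sample to compute it from.
    if seen_5m_ago > 0 and confirmed_of_those / seen_5m_ago < MIN_CONFIRMATION_RATE:
        alerts.append("low_confirmation_rate")
    return alerts
```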
## References
- Time-Series Schema: See `../database/timeseries-schema.md`
- WebSocket API: See `../api/websocket-api.md`
- Fee Oracle: See `fee-oracle.md`