# Unified Search Architecture
## Overview
This document specifies the unified search architecture that enables searching across multiple chains and entity types with relevance ranking.
## Architecture
```mermaid
flowchart TB
    Query[Search Query]
    Router[Query Router]

    subgraph Search[Search Services]
        ES1[Elasticsearch<br/>Chain 138]
        ES2[Elasticsearch<br/>Chain 1]
        ES3[Elasticsearch<br/>Chain 137]
    end

    Agg[Aggregator]
    Rank[Relevance Ranking]
    Results[Unified Results]

    Query --> Router
    Router --> ES1
    Router --> ES2
    Router --> ES3
    ES1 --> Agg
    ES2 --> Agg
    ES3 --> Agg
    Agg --> Rank
    Rank --> Results
```
## Search Algorithm
### Query Processing
**Steps**:
1. Parse query (extract terms, filters)
2. Determine chain scope (all chains or specific chain)
3. Route to appropriate search indices
4. Execute searches in parallel
5. Aggregate results
6. Rank by relevance
7. Return unified results
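
The seven steps above can be sketched roughly as follows. This is a minimal illustration, not the actual service: `search_chain` is a hypothetical stand-in for a per-chain Elasticsearch query, `ALL_CHAINS` simply mirrors the chain IDs in the architecture diagram, and the placeholder score is invented.

```python
import asyncio
from typing import List, Optional

# Assumption: chain IDs taken from the architecture diagram.
ALL_CHAINS = [138, 1, 137]


async def search_chain(chain_id: int, terms: str) -> List[dict]:
    """Hypothetical stand-in for one per-chain Elasticsearch query."""
    await asyncio.sleep(0)  # placeholder for network I/O
    return [{"chain_id": chain_id, "label": terms, "score": 0.5}]


async def unified_search(query: str, chain_id: Optional[int] = None) -> dict:
    terms = query.strip()                                          # 1. parse
    chains = [chain_id] if chain_id is not None else ALL_CHAINS    # 2. scope
    tasks = [search_chain(c, terms) for c in chains]               # 3. route
    per_chain = await asyncio.gather(*tasks)                       # 4. run in parallel
    merged = [hit for hits in per_chain for hit in hits]           # 5. aggregate
    merged.sort(key=lambda h: h["score"], reverse=True)            # 6. rank
    return {                                                       # 7. unified result
        "query": query,
        "total_results": len(merged),
        "results": merged,
        "chains_searched": chains,
    }
```

Calling `asyncio.run(unified_search("usdc", chain_id=1))` restricts the search to a single chain; omitting `chain_id` fans out to every chain in `ALL_CHAINS`.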
### Query Types
**Exact Match** (Hash, Address):
- Direct lookup in specific chain
- Return single result if found
**Full-Text Search** (Name, Symbol, Label):
- Search across all chains
- Rank by relevance
- Return top N results per chain
**Fuzzy Search** (Typos, Partial matches):
- Use fuzzy matching
- Rank by similarity
- Include suggestions
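
One way to choose between these query types is to inspect the shape of the input. The sketch below assumes EVM-style identifiers (a `0x` prefix followed by 64 hex characters for a hash, 40 for an address), which this spec does not state explicitly. `classify_query` and `suggest` are hypothetical helper names, and the suggestion step uses Python's standard `difflib` rather than any search-engine feature.

```python
import difflib
import re
from typing import List

# Assumption: EVM-style hex identifiers.
HASH_RE = re.compile(r"^0x[0-9a-fA-F]{64}$")     # 32-byte transaction/block hash
ADDRESS_RE = re.compile(r"^0x[0-9a-fA-F]{40}$")  # 20-byte address


def classify_query(q: str) -> str:
    """Return 'hash', 'address', or 'text' for a raw query string."""
    q = q.strip()
    if HASH_RE.match(q):
        return "hash"
    if ADDRESS_RE.match(q):
        return "address"
    return "text"  # falls through to full-text / fuzzy search


def suggest(term: str, known_labels: List[str], limit: int = 3) -> List[str]:
    """Return up to `limit` labels that are similar to a possibly misspelled term."""
    return difflib.get_close_matches(term.lower(), known_labels, n=limit, cutoff=0.6)
```

Exact-match queries (`hash`, `address`) can then skip ranking entirely, while `text` queries go through full-text search and, if few results come back, the fuzzy path.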
## Ranking and Relevance Scoring
### Relevance Factors
**1. Exact Match Score**:
- Exact match: 100%
- Prefix match: 80%
- Fuzzy match: 60%
**2. Chain Relevance**:
- User's preferred chain: +20%
- Popular chains: +10%
**3. Entity Type Relevance**:
- Addresses: Highest (most specific)
- Transactions: High
- Blocks: Medium
- Tokens: Medium
- Contracts: Lower (unless verified)
**4. Popularity Score**:
- Transaction count
- Token holder count
- Contract usage
### Scoring Formula
```
score = (exact_match_score * 0.5) +
        (chain_relevance * 0.2) +
        (entity_type_relevance * 0.2) +
        (popularity_score * 0.1)
```
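
As a worked example, the formula can be written as a small function. This sketch assumes every factor has already been normalized to the range 0 to 1 (for instance, a prefix match of 80% becomes `0.8`). The spec does not say how the chain bonuses or popularity counts are normalized, so that part is left to the caller.

```python
# Weights copied from the scoring formula; they sum to 1.0.
WEIGHTS = {
    "exact_match": 0.5,
    "chain": 0.2,
    "entity_type": 0.2,
    "popularity": 0.1,
}


def relevance_score(exact_match: float, chain: float,
                    entity_type: float, popularity: float) -> float:
    """Weighted sum of the four factors; inputs are assumed to lie in [0, 1]."""
    factors = {
        "exact_match": exact_match,
        "chain": chain,
        "entity_type": entity_type,
        "popularity": popularity,
    }
    for name, value in factors.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return sum(WEIGHTS[name] * value for name, value in factors.items())
```

Under these assumptions, an exact match on the user's preferred chain for the highest-priority entity type with a mid-range popularity of `0.5` scores approximately `0.95`, while a fuzzy match (`0.6`) with all other factors at zero scores `0.3`.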
## Result Aggregation
### Aggregation Strategy
**Per-Chain Results**:
- Limit results per chain (e.g., top 10)
- Combine across chains
- Deduplicate entities that appear on more than one chain (e.g., the same address)
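
A minimal sketch of this strategy follows. The deduplication key (entity type plus address or hash) and the choice to keep only the highest-scoring copy are assumptions; the spec does not say which copy of a duplicate should survive. `aggregate` is a hypothetical function name.

```python
from typing import Dict, List


def aggregate(per_chain: Dict[int, List[dict]],
              per_chain_limit: int = 10,
              total_limit: int = 50) -> List[dict]:
    """Trim each chain's hits, merge them, and drop duplicate entities."""
    candidates: List[dict] = []
    for hits in per_chain.values():
        ranked = sorted(hits, key=lambda h: h["score"], reverse=True)
        candidates.extend(ranked[:per_chain_limit])  # per-chain limit

    # Sort globally so the first copy seen of any duplicate is the best-scoring one.
    candidates.sort(key=lambda h: h["score"], reverse=True)

    seen = set()
    unique: List[dict] = []
    for hit in candidates:
        # Assumed identity: same type and same address/hash, regardless of chain.
        key = (hit["type"], hit.get("address") or hit.get("hash"))
        if key in seen:
            continue
        seen.add(key)
        unique.append(hit)
    return unique[:total_limit]
```

If the chain-specific copies need to stay visible to the user, the same loop could instead collect the chain IDs of dropped duplicates onto the surviving hit.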
### Result Format
```json
{
  "query": "0x123...",
  "total_results": 5,
  "results": [
    {
      "type": "address",
      "chain_id": 138,
      "address": "0x123...",
      "label": "My Wallet",
      "score": 0.95
    },
    {
      "type": "transaction",
      "chain_id": 138,
      "hash": "0x123...",
      "score": 0.80
    }
  ],
  "chains_searched": [138, 1, 137]
}
```
## Performance Optimization
### Caching
**Cache Strategy**:
- Cache popular queries (top 1000)
- Cache duration: 1 minute
- Invalidate on data updates
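
The cache strategy could look roughly like the in-memory sketch below. It is illustrative only: `QueryCache` is a hypothetical class, a real deployment would more likely use a shared cache, and the "top 1000" rule is approximated by capping the entry count and evicting the entry closest to expiry rather than by tracking query popularity.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class QueryCache:
    """Bounded cache with a fixed time-to-live per entry."""

    def __init__(self, ttl_seconds: float = 60.0, max_entries: int = 1000,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._max = max_entries
        self._clock = clock  # injectable so tests can control time
        self._entries: Dict[str, Tuple[float, Any]] = {}  # key -> (expires_at, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:  # expired: drop and report a miss
            del self._entries[key]
            return None
        return value

    def put(self, key: str, value: Any) -> None:
        if key not in self._entries and len(self._entries) >= self._max:
            # Evict the entry that will expire soonest (the oldest insert).
            oldest = min(self._entries, key=lambda k: self._entries[k][0])
            del self._entries[oldest]
        self._entries[key] = (self._clock() + self._ttl, value)

    def invalidate(self, key: Optional[str] = None) -> None:
        """Drop one key, or everything when called without arguments."""
        if key is None:
            self._entries.clear()
        else:
            self._entries.pop(key, None)
```

An indexer that writes new data would call `invalidate()` (or `invalidate(key)` for a known query) so stale results are not served for the remainder of the one-minute window.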
### Parallel Search
**Strategy**: Execute searches across chains in parallel
**Benefits**:
- Faster response time: total latency is bounded by the slowest chain rather than the sum of all chains
- Better resource utilization
### Result Limiting
**Per-Chain Limit**: Top 10-20 results per chain
**Total Limit**: Top 50-100 results total
## Search Indexes
### Per-Chain Indices
**Index Names**: `{entity_type}-{chain_id}` (e.g., `addresses-138`)
**Benefits**:
- Independent scaling per chain
- Chain-specific optimizations
- Easy chain addition/removal
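
The naming convention makes it straightforward to derive the set of indices a query should hit. In this sketch only `addresses` comes from the example above; the other plural entity names in `ENTITY_TYPES` are assumed, and both helper functions are hypothetical.

```python
from typing import Iterable, List

# Assumption: only "addresses" appears in the spec; the other names are guesses.
ENTITY_TYPES = ("addresses", "transactions", "blocks", "tokens", "contracts")


def index_name(entity_type: str, chain_id: int) -> str:
    """Build a per-chain index name following {entity_type}-{chain_id}."""
    return f"{entity_type}-{chain_id}"


def indices_for(chain_ids: Iterable[int],
                entity_types: Iterable[str] = ENTITY_TYPES) -> List[str]:
    """List every index to query for the given chains and entity types."""
    types = list(entity_types)
    return [index_name(t, c) for c in chain_ids for t in types]
```

Adding or removing a chain then only changes the list of chain IDs passed in; no existing index needs to be touched.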
### Global Index (Optional)
**Use Case**: Quick lookup across all chains
**Implementation**:
- A single separate index whose documents carry a `chain_id` field
- Less detailed than per-chain indices
- Faster for simple queries
## API Endpoint
### Unified Search
`GET /api/v1/search`
**Query Parameters**:
- `q` (string, required): Search query
- `chain_id` (integer, optional): Filter by chain
- `type` (string, optional): Filter by type (address, transaction, block, token, contract)
- `limit` (integer, default: 50): Max results
**Response**: Unified search results (see Result Format above)
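
For illustration, a client might assemble the request URL as shown below. The base URL `https://explorer.example.com` is a placeholder and `build_search_url` is a hypothetical helper; only the path and the parameter names come from this spec.

```python
from typing import Optional
from urllib.parse import urlencode

# Entity types accepted by the `type` filter, as listed in the parameters above.
ALLOWED_TYPES = {"address", "transaction", "block", "token", "contract"}


def build_search_url(base_url: str, q: str, chain_id: Optional[int] = None,
                     type_: Optional[str] = None, limit: int = 50) -> str:
    """Return the full GET URL for the unified search endpoint."""
    if not q:
        raise ValueError("q is required")
    if type_ is not None and type_ not in ALLOWED_TYPES:
        raise ValueError(f"unsupported type: {type_}")

    params = {"q": q, "limit": limit}
    if chain_id is not None:
        params["chain_id"] = chain_id
    if type_ is not None:
        params["type"] = type_
    return f"{base_url}/api/v1/search?{urlencode(params)}"
```

For example, `build_search_url("https://explorer.example.com", "usdc", chain_id=138)` yields a URL ending in `/api/v1/search?q=usdc&limit=50&chain_id=138`.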
## References
- Search Index Schema: See `../database/search-index-schema.md`
- Multi-chain Indexing: See `multichain-indexing.md`