Add full monorepo: virtual-banker, backend, frontend, docs, scripts, deployment
Co-authored-by: Cursor <cursoragent@cursor.com>
240
docs/CCIP_MONITOR_METRICS.md
Normal file
@@ -0,0 +1,240 @@
# CCIP Monitor Metrics Documentation

**Date**: 2025-01-12
**Network**: ChainID 138

---

## Overview

This document describes the metrics exposed by the CCIP Monitor service, how to query them, and how to use them for monitoring and alerting.

---

## CCIP Monitor Service

### Service Details

- **Container**: VMID 3501
- **Service**: `ccip-monitor`
- **Metrics Port**: 8000
- **Metrics Endpoint**: `http://localhost:8000/metrics`

---

## Available Metrics

### System Metrics

#### `ccip_monitor_up`

- **Type**: Gauge
- **Description**: Service availability (1 = up, 0 = down)
- **Labels**: None

#### `ccip_monitor_rpc_connected`

- **Type**: Gauge
- **Description**: RPC connection status (1 = connected, 0 = disconnected)
- **Labels**: None

---

### CCIP Message Metrics

#### `ccip_messages_sent_total`

- **Type**: Counter
- **Description**: Total number of CCIP messages sent
- **Labels**:
  - `source_chain`: Source chain identifier
  - `destination_chain`: Destination chain identifier
  - `status`: Message status (`success` or `failed`)

#### `ccip_messages_received_total`

- **Type**: Counter
- **Description**: Total number of CCIP messages received
- **Labels**:
  - `source_chain`: Source chain identifier
  - `destination_chain`: Destination chain identifier
  - `status`: Message status (`success` or `failed`)

#### `ccip_messages_pending`

- **Type**: Gauge
- **Description**: Number of pending CCIP messages
- **Labels**:
  - `source_chain`: Source chain identifier
  - `destination_chain`: Destination chain identifier
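
Because `ccip_messages_sent_total` and `ccip_messages_received_total` are counters, their raw values only grow; what is usually of interest is the increase over a time window. The Python sketch below shows one way to derive a per-second rate from two scraped values, treating a drop in value as a counter reset (for example after a service restart). The function name and the sample numbers are illustrative assumptions and are not part of the CCIP Monitor service.

```python
def counter_rate(previous: float, current: float, interval_seconds: float) -> float:
    """Per-second increase of a counter between two scrapes."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    # A counter never decreases on its own, so a lower value means it was
    # reset (typically by a restart) and has counted up from zero since.
    increase = current if current < previous else current - previous
    return increase / interval_seconds


# Hypothetical values: 10 messages at the first scrape, 16 one minute later.
print(counter_rate(10, 16, 60))  # 0.1
```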

---

### Bridge Metrics

#### `bridge_transactions_total`

- **Type**: Counter
- **Description**: Total number of bridge transactions
- **Labels**:
  - `bridge_type`: Bridge type (`WETH9` or `WETH10`)
  - `destination_chain`: Destination chain identifier
  - `status`: Transaction status (`success` or `failed`)

#### `bridge_token_amount_total`

- **Type**: Counter
- **Description**: Total amount of tokens bridged
- **Labels**:
  - `bridge_type`: Bridge type (`WETH9` or `WETH10`)
  - `destination_chain`: Destination chain identifier
  - `token_type`: Token type

---

### Fee Metrics

#### `ccip_fees_paid_total`

- **Type**: Counter
- **Description**: Total CCIP fees paid
- **Labels**:
  - `fee_token`: Fee token address
  - `destination_chain`: Destination chain identifier

#### `ccip_fee_calculation_errors_total`

- **Type**: Counter
- **Description**: Total fee calculation errors
- **Labels**: None

---

### Error Metrics

#### `ccip_errors_total`

- **Type**: Counter
- **Description**: Total number of errors
- **Labels**:
  - `error_type`: Error type
  - `component`: Component where the error occurred

---

## Querying Metrics

### Using curl

```bash
curl http://localhost:8000/metrics
```

### Using Prometheus

If Prometheus is configured to scrape the metrics endpoint:

```promql
# Service availability
ccip_monitor_up

# Total messages sent
sum(ccip_messages_sent_total)

# Pending messages
sum(ccip_messages_pending)

# Bridge transactions
sum(bridge_transactions_total)
```
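
The same queries can also be issued programmatically through the Prometheus HTTP API (`GET /api/v1/query`). The following Python sketch assumes a Prometheus server reachable at `http://localhost:9090`; that address and both helper function names are assumptions for illustration and should be adapted to the actual deployment.

```python
import json
import urllib.parse
import urllib.request

# Assumption: address of the Prometheus server, not of the CCIP Monitor itself.
PROMETHEUS_URL = "http://localhost:9090"


def build_query_url(expr: str, base_url: str = PROMETHEUS_URL) -> str:
    """Return the instant-query URL for a PromQL expression."""
    return f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": expr})


def instant_query(expr: str, base_url: str = PROMETHEUS_URL) -> list:
    """Run an instant query and return the list of result samples."""
    with urllib.request.urlopen(build_query_url(expr, base_url), timeout=10) as resp:
        payload = json.load(resp)
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload}")
    # Each sample holds its labels under "metric" and a [timestamp, value]
    # pair under "value".
    return payload["data"]["result"]


# Example call (needs a reachable Prometheus server):
#   instant_query("sum(ccip_messages_pending)")
print(build_query_url("sum(ccip_messages_pending)"))
```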

---

## Metric Examples

### Example Metrics Output

```
# HELP ccip_monitor_up Service availability
# TYPE ccip_monitor_up gauge
ccip_monitor_up 1

# HELP ccip_messages_sent_total Total CCIP messages sent
# TYPE ccip_messages_sent_total counter
ccip_messages_sent_total{source_chain="138",destination_chain="1",status="success"} 10
ccip_messages_sent_total{source_chain="138",destination_chain="1",status="failed"} 1

# HELP bridge_transactions_total Total bridge transactions
# TYPE bridge_transactions_total counter
bridge_transactions_total{bridge_type="WETH9",destination_chain="1",status="success"} 5
```
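
Output in this format is plain text and can be processed without Prometheus. As a minimal sketch, the Python snippet below parses sample lines like the ones above and computes the share of sent messages with `status="failed"`. The parser is deliberately simplified (it assumes no escaped quotes inside label values), and the names `parse_metrics` and `failure_ratio` are hypothetical helpers, not part of the service.

```python
import re

# name, optional {labels}, value, optional trailing timestamp
SAMPLE_LINE = re.compile(
    r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+-?\d+)?$'
)
LABEL_PAIR = re.compile(r'(\w+)="([^"]*)"')


def parse_metrics(text: str) -> list:
    """Parse exposition text into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_LINE.match(line)
        if match:
            name, raw_labels, value = match.groups()
            labels = dict(LABEL_PAIR.findall(raw_labels or ""))
            samples.append((name, labels, float(value)))
    return samples


def failure_ratio(samples: list, metric: str) -> float:
    """Share of a metric's total carried by samples with status="failed"."""
    total = sum(value for name, _, value in samples if name == metric)
    failed = sum(
        value
        for name, labels, value in samples
        if name == metric and labels.get("status") == "failed"
    )
    return failed / total if total else 0.0


EXAMPLE = '''
ccip_monitor_up 1
ccip_messages_sent_total{source_chain="138",destination_chain="1",status="success"} 10
ccip_messages_sent_total{source_chain="138",destination_chain="1",status="failed"} 1
'''

samples = parse_metrics(EXAMPLE)
print(round(failure_ratio(samples, "ccip_messages_sent_total"), 3))  # 0.091
```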

---

## Monitoring Setup

### Prometheus Configuration

```yaml
scrape_configs:
  - job_name: 'ccip-monitor'
    static_configs:
      - targets: ['localhost:8000']
```

### Grafana Dashboard

Create a dashboard with panels for:
- Service availability (`ccip_monitor_up`)
- Message throughput (`ccip_messages_sent_total`, `ccip_messages_received_total`)
- Bridge transaction volume (`bridge_transactions_total`)
- Error rates (`ccip_errors_total`)
- Fee usage (`ccip_fees_paid_total`)

---

## Alerting

### Recommended Alerts

1. **Service Down**
   - Alert when `ccip_monitor_up == 0`
   - Severity: Critical

2. **High Error Rate**
   - Alert when the rate of increase of `ccip_errors_total` exceeds a chosen threshold
   - Severity: Warning

3. **Pending Messages**
   - Alert when `ccip_messages_pending` exceeds a chosen threshold
   - Severity: Warning

4. **RPC Disconnected**
   - Alert when `ccip_monitor_rpc_connected == 0`
   - Severity: Critical
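
The four conditions above can be sketched as one check over a set of scraped values. In a Prometheus deployment they would normally be expressed as alerting rules; the Python snippet below only makes the logic explicit. The function name and both default thresholds (`pending_threshold=50`, `error_threshold=5.0` errors per minute) are placeholder assumptions, since this document does not specify threshold values.

```python
def evaluate_alerts(up, rpc_connected, pending, errors_per_minute,
                    pending_threshold=50, error_threshold=5.0):
    """Return (severity, alert name) pairs for every condition that fires."""
    alerts = []
    if up == 0:
        alerts.append(("critical", "Service Down"))
    if rpc_connected == 0:
        alerts.append(("critical", "RPC Disconnected"))
    # Thresholds are hypothetical placeholders; tune them per deployment.
    if errors_per_minute > error_threshold:
        alerts.append(("warning", "High Error Rate"))
    if pending > pending_threshold:
        alerts.append(("warning", "Pending Messages"))
    return alerts


# Hypothetical snapshot: service up, RPC down, 120 messages pending.
print(evaluate_alerts(up=1, rpc_connected=0, pending=120, errors_per_minute=0.5))
# [('critical', 'RPC Disconnected'), ('warning', 'Pending Messages')]
```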

---

## Health Check

### Using Health Check Script

```bash
./scripts/check-ccip-monitor-health.sh
```

### Manual Check

```bash
# Check service status
pct exec 3501 -- systemctl status ccip-monitor

# Check metrics endpoint
curl http://localhost:8000/metrics

# Check logs
pct exec 3501 -- journalctl -u ccip-monitor -n 50
```

---

## Related Documentation

- [CCIP Operations Runbook](./CCIP_OPERATIONS_RUNBOOK.md) (Task 135)
- [CCIP Configuration Status](./CCIP_CONFIGURATION_STATUS.md)
- [Complete Task Catalog](./CCIP_COMPLETE_TASK_CATALOG.md)

---

**Last Updated**: 2025-01-12