Complete markdown files cleanup and organization
- Organized 252 files across project
- Root directory: 187 → 2 files (98.9% reduction)
- Moved configuration guides to docs/04-configuration/
- Moved troubleshooting guides to docs/09-troubleshooting/
- Moved quick start guides to docs/01-getting-started/
- Moved reports to reports/ directory
- Archived temporary files
- Generated comprehensive reports and documentation
- Created maintenance scripts and guides

All files organized according to established standards.
165
docs/09-troubleshooting/FIX_TUNNEL_ALTERNATIVES.md
Normal file
@@ -0,0 +1,165 @@
|
||||
# Fix Tunnel - Alternative Methods
|
||||
|
||||
## Problem
|
||||
|
||||
The `fix-shared-tunnel.sh` script cannot connect because your machine is on `192.168.1.0/24` and cannot directly reach `192.168.11.0/24`.
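
Whether the target is on the client's own subnet can be checked before trying any of the methods. The following is a minimal sketch in plain Bash, assuming IPv4 dotted-quad addresses. The helpers `ip_to_int` and `in_cidr` and the client address `192.168.1.50` are illustrative assumptions, not part of `fix-shared-tunnel.sh`. A host outside the local subnet can still be reachable if a route exists, so treat a negative result as a hint only.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not part of fix-shared-tunnel.sh).

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Succeed if address $1 lies inside CIDR block $2 (e.g. 192.168.11.0/24).
in_cidr() {
  local ip=$1 net=${2%/*} bits=${2#*/}
  local mask=$(( bits == 0 ? 0 : (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  (( ($(ip_to_int "$ip") & mask) == ($(ip_to_int "$net") & mask) ))
}

in_cidr 192.168.11.12 192.168.11.0/24 && echo "Proxmox host is inside 192.168.11.0/24"
# 192.168.1.50 is an assumed example address for the local machine.
in_cidr 192.168.1.50 192.168.11.0/24 || echo "a client on 192.168.1.0/24 is not"
```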
|
||||
|
||||
## Solution Methods
|
||||
|
||||
### Method 1: Use SSH Tunnel ⭐ Recommended
|
||||
|
||||
```bash
|
||||
# Terminal 1: Start SSH tunnel
|
||||
./setup_ssh_tunnel.sh
|
||||
|
||||
# Terminal 2: Run fix with localhost
|
||||
PROXMOX_HOST=localhost ./fix-shared-tunnel.sh
|
||||
```
|
||||
|
||||
### Method 2: Manual File Deployment
|
||||
|
||||
The script automatically generates configuration files when connection fails:
|
||||
|
||||
**Location**: `/tmp/tunnel-fix-10ab22da-8ea3-4e2e-a896-27ece2211a05/`
|
||||
|
||||
**Files**:
|
||||
- `tunnel-services.yml` - Tunnel configuration
|
||||
- `cloudflared-services.service` - Systemd service
|
||||
- `DEPLOY_INSTRUCTIONS.md` - Deployment guide
|
||||
|
||||
**Deploy from Proxmox host**:
|
||||
```bash
|
||||
# Copy files to Proxmox host
|
||||
scp -r /tmp/tunnel-fix-* root@192.168.11.12:/tmp/
|
||||
|
||||
# SSH to Proxmox host
|
||||
ssh root@192.168.11.12
|
||||
|
||||
# Deploy to container
|
||||
pct push 102 /tmp/tunnel-fix-*/tunnel-services.yml /etc/cloudflared/tunnel-services.yml
|
||||
pct push 102 /tmp/tunnel-fix-*/cloudflared-services.service /etc/systemd/system/cloudflared-services.service
|
||||
pct exec 102 -- chmod 600 /etc/cloudflared/tunnel-services.yml
|
||||
pct exec 102 -- systemctl daemon-reload
|
||||
pct exec 102 -- systemctl enable cloudflared-services.service
|
||||
pct exec 102 -- systemctl start cloudflared-services.service
|
||||
```
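
The `pct push` commands above pass a `/tmp/tunnel-fix-*` glob, which only behaves as intended when exactly one directory matches. A hedged sketch of a guard follows; `resolve_single_dir` is a hypothetical helper, not something the original script provides, and it assumes the pattern contains no spaces.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the single directory matching a glob pattern,
# or fail if zero or several directories match.
resolve_single_dir() {
  local pattern=$1 d
  local matches=()
  for d in $pattern; do            # unquoted on purpose so the shell expands the glob
    [ -d "$d" ] && matches+=("$d")
  done
  if [ "${#matches[@]}" -ne 1 ]; then
    echo "expected exactly one match for $pattern, found ${#matches[@]}" >&2
    return 1
  fi
  printf '%s\n' "${matches[0]}"
}

# Usage on the Proxmox host (assumes the scp step above has already run):
# dir=$(resolve_single_dir '/tmp/tunnel-fix-*') || exit 1
# pct push 102 "$dir/tunnel-services.yml" /etc/cloudflared/tunnel-services.yml
```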
|
||||
|
||||
### Method 3: Cloudflare Dashboard ⭐ Easiest
|
||||
|
||||
1. Go to: https://one.dash.cloudflare.com/
|
||||
2. Navigate to: **Zero Trust** → **Networks** → **Tunnels**
|
||||
3. Find tunnel: `10ab22da-8ea3-4e2e-a896-27ece2211a05`
|
||||
4. Click **Configure**
|
||||
5. Add all hostnames:
|
||||
|
||||
| Hostname | Service | URL |
|----------|---------|-----|
| dbis-admin.d-bis.org | HTTP | 192.168.11.21:80 |
| dbis-api.d-bis.org | HTTP | 192.168.11.21:80 |
| dbis-api-2.d-bis.org | HTTP | 192.168.11.21:80 |
| mim4u.org.d-bis.org | HTTP | 192.168.11.21:80 |
| www.mim4u.org.d-bis.org | HTTP | 192.168.11.21:80 |
| rpc-http-prv.d-bis.org | HTTP | 192.168.11.21:80 |
| rpc-http-pub.d-bis.org | HTTP | 192.168.11.21:80 |
| rpc-ws-prv.d-bis.org | HTTP | 192.168.11.21:80 |
| rpc-ws-pub.d-bis.org | HTTP | 192.168.11.21:80 |
|
||||
|
||||
6. Add catch-all rule: **HTTP 404: Not Found** (must be last)
|
||||
7. Save configuration
|
||||
8. Wait 1-2 minutes for tunnel to reload
|
||||
|
||||
### Method 4: Run from Proxmox Network
|
||||
|
||||
If you have access to a machine on `192.168.11.0/24`:
|
||||
|
||||
```bash
|
||||
# Copy script to that machine
|
||||
scp fix-shared-tunnel.sh user@192.168.11.x:/tmp/
|
||||
|
||||
# SSH to that machine and run
|
||||
ssh user@192.168.11.x
|
||||
cd /tmp
|
||||
chmod +x fix-shared-tunnel.sh
|
||||
./fix-shared-tunnel.sh
|
||||
```
|
||||
|
||||
### Method 5: Direct Container Access
|
||||
|
||||
If you can access the container directly:
|
||||
|
||||
```bash
# Create config file inside container
pct exec 102 -- bash << 'EOF'
cat > /etc/cloudflared/tunnel-services.yml << 'CONFIG'
tunnel: 10ab22da-8ea3-4e2e-a896-27ece2211a05
credentials-file: /etc/cloudflared/credentials-services.json

ingress:
  - hostname: dbis-admin.d-bis.org
    service: http://192.168.11.21:80
    originRequest:
      httpHostHeader: dbis-admin.d-bis.org
  - hostname: dbis-api.d-bis.org
    service: http://192.168.11.21:80
    originRequest:
      httpHostHeader: dbis-api.d-bis.org
  - hostname: dbis-api-2.d-bis.org
    service: http://192.168.11.21:80
    originRequest:
      httpHostHeader: dbis-api-2.d-bis.org
  - hostname: mim4u.org.d-bis.org
    service: http://192.168.11.21:80
    originRequest:
      httpHostHeader: mim4u.org.d-bis.org
  - hostname: www.mim4u.org.d-bis.org
    service: http://192.168.11.21:80
    originRequest:
      httpHostHeader: www.mim4u.org.d-bis.org
  - hostname: rpc-http-prv.d-bis.org
    service: http://192.168.11.21:80
    originRequest:
      httpHostHeader: rpc-http-prv.d-bis.org
  - hostname: rpc-http-pub.d-bis.org
    service: http://192.168.11.21:80
    originRequest:
      httpHostHeader: rpc-http-pub.d-bis.org
  - hostname: rpc-ws-prv.d-bis.org
    service: http://192.168.11.21:80
    originRequest:
      httpHostHeader: rpc-ws-prv.d-bis.org
  - hostname: rpc-ws-pub.d-bis.org
    service: http://192.168.11.21:80
    originRequest:
      httpHostHeader: rpc-ws-pub.d-bis.org
  - service: http_status:404

metrics: 127.0.0.1:9090
loglevel: info
gracePeriod: 30s
CONFIG

chmod 600 /etc/cloudflared/tunnel-services.yml
EOF
```
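
The nine ingress stanzas differ only in the hostname, so they can be generated rather than typed by hand. This is a sketch under stated assumptions: the hostnames and origin URL are copied from the configuration in this method, while the `HOSTS` array, `ORIGIN` variable and `gen_ingress` function are hypothetical names introduced for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical generator; hostnames and origin are copied from the config in this method.
ORIGIN="http://192.168.11.21:80"
HOSTS=(
  dbis-admin.d-bis.org
  dbis-api.d-bis.org
  dbis-api-2.d-bis.org
  mim4u.org.d-bis.org
  www.mim4u.org.d-bis.org
  rpc-http-prv.d-bis.org
  rpc-http-pub.d-bis.org
  rpc-ws-prv.d-bis.org
  rpc-ws-pub.d-bis.org
)

# Print the ingress section of the tunnel config as YAML.
gen_ingress() {
  local host
  echo "ingress:"
  for host in "${HOSTS[@]}"; do
    printf '  - hostname: %s\n    service: %s\n    originRequest:\n      httpHostHeader: %s\n' \
      "$host" "$ORIGIN" "$host"
  done
  # The catch-all rule must stay last.
  echo "  - service: http_status:404"
}

gen_ingress
```

The output can be pasted between the `credentials-file` line and the `metrics` line of the config. If `cloudflared` is available, `cloudflared tunnel ingress validate` should be able to check the resulting file before the service is restarted.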
|
||||
|
||||
## Verification
|
||||
|
||||
After applying any method:
|
||||
|
||||
```bash
|
||||
# Check tunnel status in Cloudflare Dashboard
|
||||
# Should change from DOWN to HEALTHY
|
||||
|
||||
# Test endpoints
|
||||
curl -I https://dbis-admin.d-bis.org
|
||||
curl -I https://rpc-http-pub.d-bis.org
|
||||
curl -I https://dbis-api.d-bis.org
|
||||
```
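
Because the tunnel can take a minute or two to come back, a single `curl` may fail even when the fix worked. A minimal polling sketch follows; `retry` is a hypothetical helper, and the attempt count and delay are assumptions chosen to cover roughly two minutes.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run a command until it succeeds or the attempts run out.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
retry() {
  local attempts=$1 delay=$2 i
  shift 2
  for (( i = 1; i <= attempts; i++ )); do
    "$@" && return 0
    # Sleep between attempts, but not after the last one.
    (( i < attempts )) && sleep "$delay"
  done
  return 1
}

# Example: poll one endpoint every 15 s, 8 attempts in total.
# --fail makes curl exit non-zero on HTTP 4xx/5xx responses.
# retry 8 15 curl --silent --fail --head --output /dev/null https://dbis-admin.d-bis.org
```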
|
||||
|
||||
## Recommended Approach
|
||||
|
||||
**For Quick Fix**: Use **Method 3 (Cloudflare Dashboard)** - No SSH needed, immediate effect
|
||||
|
||||
**For Automation**: Use **Method 1 (SSH Tunnel)** - Scriptable, repeatable
|
||||
|
||||
**For Production**: Use **Method 2 (Manual Deployment)** - Most control, can review files first
|
||||
460
docs/09-troubleshooting/METAMASK_TROUBLESHOOTING_GUIDE.md
Normal file
@@ -0,0 +1,460 @@
|
||||
# MetaMask Troubleshooting Guide - ChainID 138
|
||||
|
||||
**Date**: $(date)
|
||||
**Network**: SMOM-DBIS-138 (ChainID 138)
|
||||
|
||||
---
|
||||
|
||||
## 🔍 Common Issues & Solutions
|
||||
|
||||
### 1. Network Connection Issues
|
||||
|
||||
#### Issue: "Could not fetch chain ID. Is your RPC URL correct?"
|
||||
|
||||
**Symptoms**:
|
||||
- MetaMask shows error: "Could not fetch chain ID. Is your RPC URL correct?"
|
||||
- Network won't connect
|
||||
- Can't fetch balance
|
||||
|
||||
**Root Cause**: The RPC endpoint is requiring JWT authentication, which MetaMask doesn't support.
|
||||
|
||||
**Solutions**:
|
||||
|
||||
1. **Remove and Re-add Network with Correct RPC URL**
|
||||
- MetaMask → Settings → Networks
|
||||
- Find "Defi Oracle Meta Mainnet" or "SMOM-DBIS-138"
|
||||
- Click "Delete" or "Remove"
|
||||
- Click "Add Network" → "Add a network manually"
|
||||
- Enter these exact values:
|
||||
- **Network Name**: `Defi Oracle Meta Mainnet`
|
||||
- **RPC URL**: `https://rpc-http-pub.d-bis.org`
|
||||
- **Chain ID**: `138` (must be decimal, not hex)
|
||||
- **Currency Symbol**: `ETH`
|
||||
- **Block Explorer URL**: `https://explorer.d-bis.org` (optional)
|
||||
- Click "Save"
|
||||
|
||||
2. **If RPC URL Still Requires Authentication (Server Issue)**
|
||||
- The public RPC endpoint should NOT require JWT authentication
|
||||
- Contact network administrators to fix server configuration
|
||||
- VMID 2502 should serve `rpc-http-pub.d-bis.org` WITHOUT authentication
|
||||
- Check Nginx configuration on VMID 2502
|
||||
|
||||
3. **Verify RPC Endpoint is Working**
|
||||
```bash
# Test if endpoint responds (should return chain ID 0x8a = 138)
curl -X POST https://rpc-http-pub.d-bis.org \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"eth_chainId","params":[],"id":1}'
```
|
||||
- **Expected**: `{"jsonrpc":"2.0","id":1,"result":"0x8a"}`
|
||||
- **If you get JWT error**: Server needs to be reconfigured
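
The RPC reply carries the chain ID in hex, while MetaMask's manual entry expects decimal. A small sketch of the conversion, assuming a single-line JSON reply and no `jq` on the machine; `extract_result` and `hex_to_dec` are hypothetical helper names.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for reading a JSON-RPC reply without jq.

# Pull the "result" string out of a single-line JSON-RPC response on stdin.
extract_result() {
  sed -n 's/.*"result"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Convert a 0x-prefixed hex quantity to decimal.
hex_to_dec() {
  printf '%d\n' "$1"
}

# Sample reply copied from the expected output above.
reply='{"jsonrpc":"2.0","id":1,"result":"0x8a"}'
chain_id=$(hex_to_dec "$(extract_result <<< "$reply")")
echo "$chain_id"   # prints 138
```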
|
||||
|
||||
#### Issue: "Network Error" or "Failed to Connect"
|
||||
|
||||
**Symptoms**:
|
||||
- MetaMask shows "Network Error"
|
||||
- Can't fetch balance
|
||||
- Transactions fail immediately
|
||||
|
||||
**Solutions**:
|
||||
|
||||
1. **Verify RPC URL**
|
||||
```
|
||||
Correct: https://rpc-http-pub.d-bis.org
|
||||
Incorrect: http://rpc-http-pub.d-bis.org (missing 's')
|
||||
Incorrect: https://rpc-core.d-bis.org (deprecated/internal)
|
||||
```
|
||||
|
||||
2. **Check Chain ID**
|
||||
- Must be exactly `138` (decimal)
|
||||
- Not `0x8a` (that's hex, but MetaMask expects decimal in manual entry)
|
||||
- Verify in network settings
|
||||
|
||||
3. **Remove and Re-add Network**
|
||||
- Settings → Networks → Remove the network
|
||||
- Add network again with correct settings
|
||||
- See [Quick Start Guide](./METAMASK_QUICK_START_GUIDE.md)
|
||||
|
||||
4. **Clear MetaMask Cache**
|
||||
- Settings → Advanced → Reset Account (if needed)
|
||||
- Or clear browser cache and reload MetaMask
|
||||
|
||||
5. **Check RPC Endpoint Status**
|
||||
```bash
curl -X POST https://rpc-http-pub.d-bis.org \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}'
```
|
||||
|
||||
---
|
||||
|
||||
### 2. Token Display Issues
|
||||
|
||||
#### Issue: "6,000,000,000.0T WETH" Instead of "6 WETH"
|
||||
|
||||
**Root Cause**: WETH9 contract's `decimals()` returns 0 instead of 18
|
||||
|
||||
**Solution**:
|
||||
|
||||
1. **Remove Token**
|
||||
- Find WETH9 in token list
|
||||
- Click token → "Hide token" or remove
|
||||
|
||||
2. **Re-import with Correct Decimals**
|
||||
- Import tokens → Custom token
|
||||
- Address: `0xC02aaA39b223FE8D0A0e5C4F27eAD9083C756Cc2`
|
||||
- Symbol: `WETH`
|
||||
- **Decimals: `18`** ⚠️ **Critical: Must be 18**
|
||||
|
||||
3. **Verify Display**
|
||||
- Should now show: "6 WETH" or "6.0 WETH"
|
||||
- Not: "6,000,000,000.0T WETH"
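
Token balances are stored on-chain as integers, and the `decimals` value only decides where the wallet places the decimal point. The sketch below illustrates this with an assumed raw balance of 6 × 10^18 units for 6 WETH; `format_units` is a hypothetical helper that works on strings so large values do not overflow shell arithmetic.

```bash
#!/usr/bin/env bash
# Hypothetical helper: render an integer token balance with the given decimals.
format_units() {
  local raw=$1 decimals=$2
  if (( decimals == 0 )); then
    echo "$raw"
    return
  fi
  # Left-pad with zeros so at least one digit remains before the decimal point.
  while (( ${#raw} <= decimals )); do raw="0$raw"; done
  local int=${raw:0:${#raw}-decimals}
  local frac=${raw:${#raw}-decimals}
  # Drop trailing zeros from the fractional part.
  while [[ $frac == *0 ]]; do frac=${frac%0}; done
  if [[ -n $frac ]]; then echo "$int.$frac"; else echo "$int"; fi
}

format_units 6000000000000000000 18   # prints 6
format_units 6000000000000000000 0    # prints 6000000000000000000
```

With `decimals` set to 18 the balance reads as 6, and with 0 the wallet shows the full 19-digit integer, which is why the import step above insists on 18.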
|
||||
|
||||
**See**:
|
||||
- [WETH9 Display Fix Instructions](./METAMASK_WETH9_FIX_INSTRUCTIONS.md)
|
||||
- [MetaMask RPC Chain ID Error Fix](./METAMASK_RPC_CHAIN_ID_ERROR_FIX.md) - For "Could not fetch chain ID" errors
|
||||
- [RPC Public Endpoint Routing](./RPC_PUBLIC_ENDPOINT_ROUTING.md) - Architecture and routing details
|
||||
|
||||
---
|
||||
|
||||
#### Issue: Token Not Showing Balance
|
||||
|
||||
**Symptoms**:
|
||||
- Token imported but shows 0 balance
|
||||
- Token doesn't appear in list
|
||||
|
||||
**Solutions**:
|
||||
|
||||
1. **Check Token Address**
|
||||
- WETH9: `0xC02aaA39b223FE8D0A0e5C4F27eAD9083C756Cc2`
|
||||
- WETH10: `0xf4BB2e28688e89fCcE3c0580D37d36A7672E8A9f`
|
||||
- Verify the address is correct (addresses are case-insensitive; mixed case only encodes the EIP-55 checksum)
|
||||
|
||||
2. **Verify You Have Tokens**
|
||||
```bash
cast call 0xC02aaA39b223FE8D0A0e5C4F27eAD9083C756Cc2 \
  "balanceOf(address)" <YOUR_ADDRESS> \
  --rpc-url https://rpc-http-pub.d-bis.org
```
|
||||
|
||||
3. **Refresh Token List**
|
||||
- Click "Import tokens" → Refresh
|
||||
- Or remove and re-add token
|
||||
|
||||
4. **Check Network**
|
||||
- Ensure you're on ChainID 138
|
||||
- Tokens are chain-specific
|
||||
|
||||
---
|
||||
|
||||
### 3. Transaction Issues
|
||||
|
||||
#### Issue: Transaction Stuck or Pending Forever
|
||||
|
||||
**Symptoms**:
|
||||
- Transaction shows "Pending" for extended time
|
||||
- No confirmation after hours
|
||||
|
||||
**Solutions**:
|
||||
|
||||
1. **Check Network Status**
|
||||
- Verify RPC endpoint is responding
|
||||
- Check block explorer for recent blocks
|
||||
|
||||
2. **Check Gas Price**
|
||||
- May need to increase gas price
|
||||
- Network may be congested
|
||||
|
||||
3. **Replace Transaction** (Same Nonce)
|
||||
- Create new transaction with same nonce
|
||||
- Higher gas price
|
||||
- This replaces the old transaction (only one transaction per nonce can be mined)
|
||||
|
||||
4. **Reset Nonce** (Last Resort)
|
||||
- Settings → Advanced → Reset Account
|
||||
- ⚠️ This clears transaction history
|
||||
|
||||
---
|
||||
|
||||
#### Issue: "Insufficient Funds for Gas"
|
||||
|
||||
**Symptoms**:
|
||||
- Transaction fails immediately
|
||||
- Error: "insufficient funds"
|
||||
|
||||
**Solutions**:
|
||||
|
||||
1. **Check ETH Balance**
|
||||
- Need ETH for gas fees
|
||||
- Gas costs vary (typically 0.001-0.01 ETH)
|
||||
|
||||
2. **Reduce Gas Limit** (If too high)
|
||||
- MetaMask may estimate too high
|
||||
- Try manual gas limit
|
||||
|
||||
3. **Get More ETH**
|
||||
- Request from network administrators
|
||||
- Bridge from another chain
|
||||
- Use faucet (if available)
|
||||
|
||||
---
|
||||
|
||||
#### Issue: Transaction Reverted
|
||||
|
||||
**Symptoms**:
|
||||
- Transaction confirmed but reverted
|
||||
- Error in transaction details
|
||||
|
||||
**Solutions**:
|
||||
|
||||
1. **Check Transaction Details**
|
||||
- View on block explorer
|
||||
- Look for revert reason
|
||||
|
||||
2. **Common Revert Reasons**:
|
||||
- Insufficient allowance (for token transfers)
|
||||
- Contract logic error
|
||||
- Invalid parameters
|
||||
- Out of gas (rare, usually fails before)
|
||||
|
||||
3. **Verify Contract State**
|
||||
- Check if contract is paused
|
||||
- Verify you have permissions
|
||||
- Check contract requirements
|
||||
|
||||
---
|
||||
|
||||
### 4. Price Feed Issues
|
||||
|
||||
#### Issue: Price Not Updating
|
||||
|
||||
**Symptoms**:
|
||||
- Oracle price seems stale
|
||||
- Price doesn't change
|
||||
|
||||
**Solutions**:
|
||||
|
||||
1. **Check Oracle Contract**
|
||||
```bash
cast call 0x3304b747e565a97ec8ac220b0b6a1f6ffdb837e6 \
  "latestRoundData()" \
  --rpc-url https://rpc-http-pub.d-bis.org
```
|
||||
|
||||
2. **Verify `updatedAt` Timestamp**
|
||||
- Should update every 60 seconds
|
||||
- If > 5 minutes old, Oracle Publisher may be down
|
||||
|
||||
3. **Check Oracle Publisher Service**
|
||||
- Service should be running (VMID 3500)
|
||||
- Check service logs for errors
|
||||
|
||||
4. **Manual Price Query**
|
||||
- Use Web3.js or Ethers.js to query directly
|
||||
- See [Oracle Integration Guide](./METAMASK_ORACLE_INTEGRATION.md)
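
The staleness rule from step 2 can be expressed as a small check. This is a sketch, not the project's own monitoring: `is_stale` is a hypothetical helper, the 300-second default mirrors the 5-minute threshold above, and it assumes `updatedAt` is a Unix timestamp as in the Chainlink AggregatorV3 return layout, which this guide does not confirm for the oracle.

```bash
#!/usr/bin/env bash
# Hypothetical staleness check.
# Usage: is_stale <updated_at> [now] [max_age_seconds]
is_stale() {
  local updated_at=$1 now=${2:-$(date +%s)} max_age=${3:-300}
  (( now - updated_at > max_age ))
}

# Fixed timestamps keep the example reproducible.
is_stale 1700000000 1700000060 && echo stale || echo fresh   # prints fresh
is_stale 1700000000 1700000400 && echo stale || echo fresh   # prints stale
```

In practice the first argument would be the `updatedAt` field decoded from the `latestRoundData()` call in step 1, with `now` left to default to the current time.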
|
||||
|
||||
---
|
||||
|
||||
#### Issue: Price Returns Zero or Error
|
||||
|
||||
**Symptoms**:
|
||||
- `latestRoundData()` returns 0
|
||||
- Contract call fails
|
||||
|
||||
**Solutions**:
|
||||
|
||||
1. **Verify Contract Address**
|
||||
- Oracle Proxy: `0x3304b747e565a97ec8ac220b0b6a1f6ffdb837e6`
|
||||
- Ensure correct address
|
||||
|
||||
2. **Check Contract Deployment**
|
||||
- Verify contract exists on ChainID 138
|
||||
- Check block explorer
|
||||
|
||||
3. **Verify Network**
|
||||
- Must be on ChainID 138
|
||||
- Price feeds are chain-specific
|
||||
|
||||
---
|
||||
|
||||
### 5. Network Switching Issues
|
||||
|
||||
#### Issue: Can't Switch to ChainID 138
|
||||
|
||||
**Symptoms**:
|
||||
- Network doesn't appear in list
|
||||
- Switch fails
|
||||
|
||||
**Solutions**:
|
||||
|
||||
1. **Add Network Manually**
|
||||
- See [Quick Start Guide](./METAMASK_QUICK_START_GUIDE.md)
|
||||
- Ensure all fields are correct
|
||||
|
||||
2. **Programmatic Addition** (For dApps)
|
||||
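```javascript
// Values match the manual network settings given earlier in this guide.
const networkConfig = {
  chainId: '0x8a', // 138 in hex
  chainName: 'Defi Oracle Meta Mainnet',
  rpcUrls: ['https://rpc-http-pub.d-bis.org'],
  nativeCurrency: { name: 'Ether', symbol: 'ETH', decimals: 18 }, // name is an assumption
  blockExplorerUrls: ['https://explorer.d-bis.org'],
};

try {
  await window.ethereum.request({
    method: 'wallet_switchEthereumChain',
    params: [{ chainId: networkConfig.chainId }],
  });
} catch (switchError) {
  // 4902: the chain has not been added to MetaMask yet, so add it
  if (switchError.code === 4902) {
    await window.ethereum.request({
      method: 'wallet_addEthereumChain',
      params: [networkConfig],
    });
  } else {
    // Surface other failures (e.g. user rejection) instead of swallowing them
    throw switchError;
  }
}
```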
|
||||
|
||||
3. **Clear Network Cache**
|
||||
- Remove network
|
||||
- Re-add with correct settings
|
||||
|
||||
---
|
||||
|
||||
### 6. Account Issues
|
||||
|
||||
#### Issue: Wrong Account Connected
|
||||
|
||||
**Symptoms**:
|
||||
- Different address than expected
|
||||
- Can't see expected balance
|
||||
|
||||
**Solutions**:
|
||||
|
||||
1. **Switch Account in MetaMask**
|
||||
- Click account icon
|
||||
- Select correct account
|
||||
|
||||
2. **Import Account** (If needed)
|
||||
- Settings → Import Account
|
||||
- Use a private key or JSON keystore file (a seed phrase restores a whole wallet rather than importing a single account)
|
||||
|
||||
3. **Verify Address**
|
||||
- Check address matches expected
|
||||
- Addresses are case-insensitive (mixed case is only an EIP-55 checksum), but verify the format: `0x` followed by 40 hex characters
|
||||
|
||||
---
|
||||
|
||||
#### Issue: Account Not Showing Balance
|
||||
|
||||
**Symptoms**:
|
||||
- Account connected but balance is 0
|
||||
- Expected to have ETH/tokens
|
||||
|
||||
**Solutions**:
|
||||
|
||||
1. **Verify Network**
|
||||
- Must be on ChainID 138
|
||||
- Balances are chain-specific
|
||||
|
||||
2. **Check Address**
|
||||
- Verify correct address
|
||||
- Check on block explorer
|
||||
|
||||
3. **Refresh Balance**
|
||||
- Click refresh icon in MetaMask
|
||||
- Or switch networks and switch back
|
||||
|
||||
---
|
||||
|
||||
## 🔧 Advanced Troubleshooting
|
||||
|
||||
### Enable Debug Mode
|
||||
|
||||
**MetaMask Settings**:
|
||||
1. Settings → Advanced
|
||||
2. Enable "Show Hex Data"
|
||||
3. Enable "Enhanced Gas Fee UI"
|
||||
4. Check browser console for errors
|
||||
|
||||
### Check Browser Console
|
||||
|
||||
**Open Console**:
|
||||
- Chrome/Edge: F12 → Console
|
||||
- Firefox: F12 → Console
|
||||
- Safari: Cmd+Option+I → Console
|
||||
|
||||
**Look For**:
|
||||
- RPC errors
|
||||
- Network errors
|
||||
- JavaScript errors
|
||||
- MetaMask-specific errors
|
||||
|
||||
### Verify RPC Response
|
||||
|
||||
**Test RPC Endpoint**:
|
||||
```bash
curl -X POST https://rpc-http-pub.d-bis.org \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "method": "eth_blockNumber",
    "params": [],
    "id": 1
  }'
```
|
||||
|
||||
**Expected Response**:
|
||||
```json
|
||||
{
|
||||
"jsonrpc": "2.0",
|
||||
"id": 1,
|
||||
"result": "0x..."
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 📞 Getting Help
|
||||
|
||||
### Resources
|
||||
|
||||
1. **Documentation**:
|
||||
- [Quick Start Guide](./METAMASK_QUICK_START_GUIDE.md)
|
||||
- [Full Integration Requirements](./METAMASK_FULL_INTEGRATION_REQUIREMENTS.md)
|
||||
- [Oracle Integration](./METAMASK_ORACLE_INTEGRATION.md)
|
||||
|
||||
2. **Block Explorer**:
|
||||
- `https://explorer.d-bis.org`
|
||||
- Check transactions, contracts, addresses
|
||||
|
||||
3. **Network Status**:
|
||||
- RPC: `https://rpc-http-pub.d-bis.org` (public, no auth required)
|
||||
- Permissioned RPC: `https://rpc-http-prv.d-bis.org` (requires JWT auth)
|
||||
- Verify endpoint is responding
|
||||
|
||||
### Information to Provide When Reporting Issues
|
||||
|
||||
1. **MetaMask Version**: Settings → About
|
||||
2. **Browser**: Chrome/Firefox/Safari + version
|
||||
3. **Network**: ChainID 138
|
||||
4. **Error Message**: Exact error text
|
||||
5. **Steps to Reproduce**: What you did before error
|
||||
6. **Console Errors**: Any JavaScript errors
|
||||
7. **Transaction Hash**: If transaction-related
|
||||
|
||||
---
|
||||
|
||||
## ✅ Quick Diagnostic Checklist
|
||||
|
||||
Run through this checklist when troubleshooting:
|
||||
|
||||
- [ ] Network is "Defi Oracle Meta Mainnet" or "SMOM-DBIS-138" (ChainID 138)
|
||||
- [ ] RPC URL is `https://rpc-http-pub.d-bis.org` (public endpoint, no auth)
|
||||
- [ ] Chain ID is `138` (decimal, not hex)
|
||||
- [ ] RPC endpoint does NOT require JWT authentication
|
||||
- [ ] Account is connected and correct
|
||||
- [ ] Sufficient ETH for gas fees
|
||||
- [ ] Token decimals are correct (18 for WETH)
|
||||
- [ ] Browser console shows no errors
|
||||
- [ ] RPC endpoint is responding
|
||||
- [ ] Block explorer shows recent blocks
|
||||
|
||||
---
|
||||
|
||||
**Last Updated**: $(date)
|
||||
|
||||
115
docs/09-troubleshooting/NO_SSH_ACCESS_SOLUTION.md
Normal file
@@ -0,0 +1,115 @@
|
||||
# Solution: Fix Tunnels Without SSH Access
|
||||
|
||||
## Problem
|
||||
|
||||
- All 6 Cloudflare tunnels are DOWN
|
||||
- Cannot access Proxmox network via SSH (network segmentation)
|
||||
- SSH tunnel setup fails (can't connect to establish tunnel)
|
||||
|
||||
## Solution: Cloudflare Dashboard ⭐ EASIEST
|
||||
|
||||
**No SSH needed!** Configure tunnels directly in Cloudflare Dashboard.
|
||||
|
||||
### Step-by-Step
|
||||
|
||||
1. **Access Dashboard**
|
||||
- Go to: https://one.dash.cloudflare.com/
|
||||
- Sign in
|
||||
- Navigate to: **Zero Trust** → **Networks** → **Tunnels**
|
||||
|
||||
2. **For Each Tunnel** (6 total):
|
||||
- Click on tunnel name
|
||||
- Click **Configure** button
|
||||
- Go to **Public Hostnames** tab
|
||||
- Add/Edit hostname configurations
|
||||
- Save
|
||||
|
||||
3. **Wait 1-2 Minutes**
|
||||
- Tunnels should reconnect automatically
|
||||
- Status should change from **DOWN** to **HEALTHY**
|
||||
|
||||
### Tunnel Configuration Details
|
||||
|
||||
#### Shared Tunnel (Most Important)
|
||||
**Tunnel**: `rpc-http-pub.d-bis.org` (ID: `10ab22da-8ea3-4e2e-a896-27ece2211a05`)
|
||||
|
||||
**Add these 9 hostnames** (all pointing to `http://192.168.11.21:80`):
|
||||
- `dbis-admin.d-bis.org`
|
||||
- `dbis-api.d-bis.org`
|
||||
- `dbis-api-2.d-bis.org`
|
||||
- `mim4u.org.d-bis.org`
|
||||
- `www.mim4u.org.d-bis.org`
|
||||
- `rpc-http-prv.d-bis.org`
|
||||
- `rpc-http-pub.d-bis.org`
|
||||
- `rpc-ws-prv.d-bis.org`
|
||||
- `rpc-ws-pub.d-bis.org`
|
||||
|
||||
**Important**: Add catch-all rule (HTTP 404) as the LAST entry.
|
||||
|
||||
#### Proxmox Tunnels
|
||||
Each needs one hostname pointing to HTTPS:
|
||||
|
||||
| Tunnel | Hostname | Target |
|--------|----------|--------|
| tunnel-ml110 | ml110-01.d-bis.org | https://192.168.11.10:8006 |
| tunnel-r630-01 | r630-01.d-bis.org | https://192.168.11.11:8006 |
| tunnel-r630-02 | r630-02.d-bis.org | https://192.168.11.12:8006 |
|
||||
|
||||
**Options**: Enable "No TLS Verify" (Proxmox uses self-signed certs)
|
||||
|
||||
#### Other Tunnels
|
||||
- `explorer.d-bis.org` → `http://192.168.11.21:80`
|
||||
- `mim4u-tunnel` → `http://192.168.11.21:80`
|
||||
|
||||
## Why This Works
|
||||
|
||||
Cloudflare tunnels use **outbound connections** from your infrastructure to Cloudflare. The configuration in the dashboard tells Cloudflare how to route traffic. Even if the tunnel connector (cloudflared) is down, once it reconnects, it will use the dashboard configuration.
|
||||
|
||||
## If Dashboard Method Doesn't Work
|
||||
|
||||
If tunnels remain DOWN after dashboard configuration, the tunnel connector (cloudflared in VMID 102) is likely not running. You need physical/network access to:
|
||||
|
||||
### Option 1: Physical Access to Proxmox Host
|
||||
|
||||
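```bash
# Direct console access to 192.168.11.12
pct start 102
# Quote the glob so the host shell does not expand it. systemctl patterns only
# match units already loaded in memory, so name the unit explicitly
# (e.g. cloudflared-services.service) if nothing starts.
pct exec 102 -- systemctl start 'cloudflared-*'
pct exec 102 -- systemctl status 'cloudflared-*'
```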
|
||||
|
||||
### Option 2: VPN Access
|
||||
|
||||
If you have VPN access to `192.168.11.0/24` network:
|
||||
|
||||
```bash
# Connect via VPN first, then:
ssh root@192.168.11.12 "pct start 102"
# Single quotes keep the glob away from the remote shell
ssh root@192.168.11.12 "pct exec 102 -- systemctl start 'cloudflared-*'"
```
|
||||
|
||||
### Option 3: Cloudflare Tunnel Token Method
|
||||
|
||||
If you can get new tunnel tokens from Cloudflare Dashboard:
|
||||
|
||||
1. Go to tunnel → Configure
|
||||
2. Download new token/credentials
|
||||
3. Deploy to container (requires access)
|
||||
|
||||
## Verification
|
||||
|
||||
After configuring in dashboard:
|
||||
|
||||
```bash
|
||||
# Wait 1-2 minutes, then test:
|
||||
curl -I https://ml110-01.d-bis.org
|
||||
curl -I https://r630-01.d-bis.org
|
||||
curl -I https://explorer.d-bis.org
|
||||
curl -I https://rpc-http-pub.d-bis.org
|
||||
```
|
||||
|
||||
## Summary
|
||||
|
||||
✅ **Best Method**: Cloudflare Dashboard (no SSH needed)
|
||||
⚠️ **If that fails**: Need physical/network access to start container
|
||||
📋 **All tunnel IDs and configs**: See generated files in `/tmp/tunnel-fix-manual-*/`
|
||||
165
docs/09-troubleshooting/R630-04-AUTHENTICATION-ISSUE.md
Normal file
@@ -0,0 +1,165 @@
|
||||
# R630-04 Authentication Issue
|
||||
|
||||
**IP:** 192.168.11.14
|
||||
**User:** root
|
||||
**Status:** ❌ Permission denied with password authentication
|
||||
|
||||
---
|
||||
|
||||
## Current Situation
|
||||
|
||||
- **SSH Port:** ✅ Open and accepting connections (port 22)
|
||||
- **Authentication Methods Offered:** `publickey,password`
|
||||
- **Password Auth:** ❌ Failing (permission denied)
|
||||
- **Public Key Auth:** ⚠️ Not configured
|
||||
|
||||
---
|
||||
|
||||
## Debug Information
|
||||
|
||||
From SSH verbose output:
|
||||
```
|
||||
debug1: Authentications that can continue: publickey,password
|
||||
debug1: Next authentication method: publickey
|
||||
debug1: Authentications that can continue: publickey,password
|
||||
debug1: Next authentication method: password
|
||||
Permission denied, please try again.
|
||||
```
|
||||
|
||||
This shows:
|
||||
- Server accepts both authentication methods
|
||||
- Public key auth tried first (no keys configured)
|
||||
- Password auth attempted but rejected
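
The methods a server offers can be pulled out of `ssh -v` output mechanically, which helps when comparing R630-04 with a host that works. A minimal sketch; `offered_methods` is a hypothetical helper, and the sample lines are taken from the debug output above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: list the auth methods a server offered,
# reading `ssh -v` output on stdin.
offered_methods() {
  sed -n 's/^debug1: Authentications that can continue: //p' | tr ',' '\n' | sort -u
}

# Sample taken from the debug output above.
printf '%s\n' \
  'debug1: Authentications that can continue: publickey,password' \
  'debug1: Next authentication method: publickey' | offered_methods

# Against the real host (ssh writes debug lines to stderr, hence 2>&1;
# BatchMode=yes avoids an interactive password prompt):
# ssh -v -o BatchMode=yes root@192.168.11.14 true 2>&1 | offered_methods
```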
|
||||
|
||||
---
|
||||
|
||||
## Possible Solutions
|
||||
|
||||
### Option 1: Verify Password
|
||||
|
||||
Double-check the password. Common issues:
|
||||
- Typos (especially with special characters like `@`)
|
||||
- Caps Lock
|
||||
- Wrong password entirely
|
||||
- Password changed since last successful login
|
||||
|
||||
### Option 2: Connect from R630-03
|
||||
|
||||
Since R630-03 works, try:
|
||||
|
||||
```bash
|
||||
# Connect to R630-03 first
|
||||
ssh root@192.168.11.13
|
||||
# Password: L@kers2010
|
||||
|
||||
# Then from R630-03, connect to R630-04
|
||||
ssh root@192.168.11.14
|
||||
# Try password: L@kers2010
|
||||
```
|
||||
|
||||
Sometimes connecting from within the same network helps.
|
||||
|
||||
### Option 3: Use Console Access
|
||||
|
||||
If you have physical/console access to R630-04:
|
||||
|
||||
1. **Physical Console** - Connect KVM/keyboard directly
|
||||
2. **iDRAC/iLO** - Use Dell's remote management (if available)
|
||||
3. **Serial Console** - If configured
|
||||
|
||||
From console:
|
||||
```bash
|
||||
# Check SSH configuration
|
||||
grep -E "PasswordAuthentication|PermitRootLogin" /etc/ssh/sshd_config
|
||||
|
||||
# Reset root password
|
||||
passwd root
|
||||
|
||||
# Check account status
|
||||
passwd -S root
|
||||
lastb | grep root | tail -10 # Check failed login attempts
|
||||
```
|
||||
|
||||
### Option 4: Set Up SSH Key Authentication
|
||||
|
||||
If you can access R630-04 through another method (console, Proxmox host, etc.):
|
||||
|
||||
**Generate SSH key:**
|
||||
```bash
|
||||
# On your local machine
|
||||
ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519_r630-04 -N ""
|
||||
```
|
||||
|
||||
**Copy public key to R630-04:**
|
||||
```bash
|
||||
# If you have console access to R630-04
|
||||
cat ~/.ssh/id_ed25519_r630-04.pub
|
||||
# Then on R630-04:
|
||||
mkdir -p /root/.ssh
|
||||
chmod 700 /root/.ssh
|
||||
echo "PASTE_PUBLIC_KEY_HERE" >> /root/.ssh/authorized_keys
|
||||
chmod 600 /root/.ssh/authorized_keys
|
||||
```
|
||||
|
||||
**Connect with key:**
|
||||
```bash
|
||||
ssh -i ~/.ssh/id_ed25519_r630-04 root@192.168.11.14
|
||||
```
|
||||
|
||||
### Option 5: Check if Password Was Changed
|
||||
|
||||
If you have access to another Proxmox host that manages R630-04, or have documentation, verify:
|
||||
- When was the password last changed?
|
||||
- Is there a password management system?
|
||||
- Are there multiple root accounts or users?
|
||||
|
||||
---
|
||||
|
||||
## Quick Checklist
|
||||
|
||||
- [ ] Try password again carefully (check for typos)
|
||||
- [ ] Try connecting from R630-03
|
||||
- [ ] Check if password was changed
|
||||
- [ ] Try console/iDRAC access
|
||||
- [ ] Check if SSH keys are set up
|
||||
- [ ] Verify you're using the correct username (root)
|
||||
|
||||
---
|
||||
|
||||
## If You Have Console Access
|
||||
|
||||
Once you can access the console, run:
|
||||
|
||||
```bash
|
||||
# Reset root password
|
||||
passwd root
|
||||
|
||||
# Verify SSH configuration allows password auth
|
||||
grep -E "^PasswordAuthentication|^#PasswordAuthentication" /etc/ssh/sshd_config
|
||||
|
||||
# Should show:
|
||||
# PasswordAuthentication yes
|
||||
# OR (commented out means yes by default)
|
||||
# #PasswordAuthentication yes
|
||||
|
||||
# If it shows "PasswordAuthentication no", change it:
|
||||
sed -i 's/^PasswordAuthentication no/PasswordAuthentication yes/' /etc/ssh/sshd_config
|
||||
systemctl restart sshd
|
||||
|
||||
# Check root account status
|
||||
passwd -S root
|
||||
|
||||
# Check for locked account
|
||||
usermod -U root # Unlock if locked
|
||||
```
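
Reading `sshd_config` directly can miss drop-in files and built-in defaults, and root password login also depends on `PermitRootLogin`, not only on `PasswordAuthentication`. `sshd -T` prints the effective settings in lowercase. The sketch below checks both; `root_password_login_allowed` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read `sshd -T` output on stdin and succeed only if
# root is allowed to log in with a password.
root_password_login_allowed() {
  local cfg
  cfg=$(cat)
  grep -qix 'passwordauthentication yes' <<< "$cfg" &&
    grep -qix 'permitrootlogin yes' <<< "$cfg"
}

# On the console of R630-04 (run as root):
# sshd -T | root_password_login_allowed && echo "allowed" || echo "blocked by sshd settings"
```

A value such as `permitrootlogin prohibit-password` makes the check fail, which would match the "Permission denied" symptom even when the password is correct.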
|
||||
|
||||
---
|
||||
|
||||
## Next Steps
|
||||
|
||||
1. **Try password one more time** - Make sure Caps Lock is off, type carefully
|
||||
2. **Try from R630-03** - Network path might matter
|
||||
3. **Get console access** - Physical KVM or iDRAC
|
||||
4. **Check password documentation** - Verify if password was changed
|
||||
5. **Set up SSH keys** - More secure and reliable long-term solution
|
||||
|
||||
256
docs/09-troubleshooting/R630-04-CONSOLE-ACCESS-GUIDE.md
Normal file
@@ -0,0 +1,256 @@
|
||||
# R630-04 Console Access Guide
|
||||
|
||||
**IP:** 192.168.11.14
|
||||
**Status:** Console access available
|
||||
**Tasks:** Reset password, fix pveproxy, verify web interface
|
||||
|
||||
---
|
||||
|
||||
## Step 1: Login via Console
|
||||
|
||||
Log in to R630-04 using your console access (physical keyboard, iDRAC KVM, etc.)
|
||||
|
||||
---
|
||||
|
||||
## Step 2: Check Current Status
|
||||
|
||||
Once logged in, run these commands to understand the current state:
|
||||
|
||||
```bash
|
||||
# Check hostname
|
||||
hostname
|
||||
cat /etc/hostname
|
||||
|
||||
# Check Proxmox version
|
||||
pveversion
|
||||
|
||||
# Check pveproxy service status
|
||||
systemctl status pveproxy --no-pager -l
|
||||
|
||||
# Check recent pveproxy logs
|
||||
journalctl -u pveproxy --no-pager -n 50
|
||||
|
||||
# Check if port 8006 is listening
|
||||
ss -tlnp | grep 8006
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Step 3: Reset Root Password
|
||||
|
||||
Set a password for root (you can use `L@kers2010` to match other hosts, or choose a different one):
|
||||
|
||||
```bash
|
||||
passwd root
|
||||
# Enter new password twice when prompted
|
||||
```
|
||||
|
||||
**Recommended:** Use `L@kers2010` to match R630-03 and ml110 for consistency.
|
||||
|
||||
---
|
||||
|
||||
## Step 4: Fix pveproxy Service
|
||||
|
||||
### 4.1 Check Service Status
|
||||
|
||||
```bash
|
||||
systemctl status pveproxy --no-pager -l | head -40
|
||||
```
|
||||
|
||||
### 4.2 Check Logs for Errors
|
||||
|
||||
```bash
|
||||
journalctl -u pveproxy --no-pager -n 100 | grep -i error
|
||||
journalctl -u pveproxy --no-pager -n 100 | tail -50
|
||||
```
|
||||
|
||||
### 4.3 Restart pveproxy
|
||||
|
||||
```bash
|
||||
systemctl restart pveproxy
|
||||
sleep 3
|
||||
systemctl status pveproxy --no-pager | head -20
|
||||
```
|
||||
|
||||
### 4.4 Check if Port 8006 is Now Listening
|
||||
|
||||
```bash
|
||||
ss -tlnp | grep 8006
|
||||
```
|
||||
|
||||
Should show something like:
|
||||
```
|
||||
LISTEN 0 128 0.0.0.0:8006 0.0.0.0:* users:(("pveproxy",pid=1234,fd=6))
|
||||
```
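Note that `grep 8006` matches any line containing those digits, including a PID or port 18006. A stricter check can be sketched as follows, assuming `ss -tln`-style columns where the fourth field is the local address; `port_is_listening` is a hypothetical helper name, not a Proxmox tool:

```bash
# Hypothetical helper (not a Proxmox tool): reads `ss -tln` output on stdin and
# succeeds only if a LISTEN socket is bound to exactly the given port.
port_is_listening() {
  local port="$1"
  awk -v suffix=":${port}" '
    # Field 1 is the socket state, field 4 is "local-address:port".
    $1 == "LISTEN" && $4 ~ (suffix "$") { found = 1 }
    END { exit !found }
  '
}

# Usage on the host:
#   ss -tln | port_is_listening 8006 && echo "8006 is listening"
```

The helper only parses text, so it can be tried against saved output before running it on the host.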
|
||||
|
||||
---
|
||||
|
||||
## Step 5: If pveproxy Still Fails
|
||||
|
||||
### 5.1 Check All Proxmox Services
|
||||
|
||||
```bash
|
||||
systemctl list-units --type=service --all | grep -E 'pveproxy|pvedaemon|pve-cluster|pvestatd'
|
||||
systemctl status pvedaemon --no-pager | head -20
|
||||
systemctl status pve-cluster --no-pager | head -20
|
||||
```
|
||||
|
||||
### 5.2 Restart All Proxmox Services
|
||||
|
||||
```bash
|
||||
systemctl restart pveproxy pvedaemon pvestatd pve-cluster
|
||||
sleep 5
|
||||
systemctl status pveproxy --no-pager | head -20
|
||||
```
|
||||
|
||||
### 5.3 Check for Port Conflicts
|
||||
|
||||
```bash
|
||||
# Check if something else is using port 8006
|
||||
lsof -i :8006
|
||||
ss -tlnp | grep 8006
|
||||
```
|
||||
|
||||
### 5.4 Check Disk Space
|
||||
|
||||
```bash
|
||||
df -h
|
||||
# Low disk space can cause service issues
|
||||
```
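To avoid reading the `df` table by eye, the check can be sketched as a filter. This assumes POSIX `df -P` output (capacity in column 5, mount point in column 6) and mount points without spaces; `filesystems_over` and the 90% default are illustrative choices, not values from this guide:

```bash
# Hypothetical helper: reads `df -P` output on stdin and prints every mount
# whose Use% is at or above the threshold (default 90).
filesystems_over() {
  local threshold="${1:-90}"
  awk -v limit="$threshold" '
    NR > 1 {
      use = $5
      sub(/%/, "", use)          # "93%" -> "93"
      if (use + 0 >= limit + 0) print $6, $5
    }
  '
}

# Usage on the host:
#   df -P | filesystems_over 90
```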
|
||||
|
||||
### 5.5 Check Log Directory Permissions
|
||||
|
||||
```bash
|
||||
ls -la /var/log/pveproxy/
|
||||
# Should be owned by www-data:www-data (pveproxy runs as the www-data user)
|
||||
```
|
||||
|
||||
### 5.6 Check Proxmox Cluster Status (if in cluster)
|
||||
|
||||
```bash
|
||||
pvecm status
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Step 6: Verify Web Interface Works
|
||||
|
||||
### 6.1 Test Locally
|
||||
|
||||
```bash
|
||||
# Test HTTPS connection locally
|
||||
curl -k https://localhost:8006 | head -20
|
||||
|
||||
# Should return HTML (Proxmox login page)
|
||||
```
|
||||
|
||||
### 6.2 Test from Another Host
|
||||
|
||||
From another machine on the network:
|
||||
|
||||
```bash
|
||||
# Test from R630-03 or your local machine
|
||||
curl -k https://192.168.11.14:8006 | head -20
|
||||
```
|
||||
|
||||
### 6.3 Open in Browser
|
||||
|
||||
Open in web browser:
|
||||
```
|
||||
https://192.168.11.14:8006
|
||||
```
|
||||
|
||||
You should see the Proxmox login page.
|
||||
|
||||
---
|
||||
|
||||
## Step 7: Document Password
|
||||
|
||||
Once password is set and everything works, document it:
|
||||
|
||||
1. Update `docs/PROXMOX_HOST_PASSWORDS.md` with R630-04 password
|
||||
2. Update `INFRASTRUCTURE_OVERVIEW_COMPLETE.md` with correct status
|
||||
|
||||
---
|
||||
|
||||
## Quick Command Reference
|
||||
|
||||
Copy-paste these commands in order:
|
||||
|
||||
```bash
|
||||
# 1. Check status
|
||||
hostname
|
||||
pveversion
|
||||
systemctl status pveproxy --no-pager -l | head -30
|
||||
|
||||
# 2. Reset password
|
||||
passwd root
|
||||
# Enter: L@kers2010 (or your chosen password)
|
||||
|
||||
# 3. Fix pveproxy
|
||||
systemctl restart pveproxy
|
||||
sleep 3
|
||||
systemctl status pveproxy --no-pager | head -20
|
||||
ss -tlnp | grep 8006
|
||||
|
||||
# 4. If still failing, restart all services
|
||||
systemctl restart pveproxy pvedaemon pvestatd pve-cluster
|
||||
systemctl status pveproxy --no-pager | head -20
|
||||
|
||||
# 5. Test web interface
|
||||
curl -k https://localhost:8006 | head -10
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Expected Results
|
||||
|
||||
After completing these steps:
|
||||
|
||||
✅ Root password set and documented
|
||||
✅ pveproxy service running
|
||||
✅ Port 8006 listening
|
||||
✅ Web interface accessible at https://192.168.11.14:8006
|
||||
✅ SSH access working with new password
|
||||
|
||||
---
|
||||
|
||||
## If Issues Persist
|
||||
|
||||
If pveproxy still fails after restart:
|
||||
|
||||
1. **Check for specific error messages:**
|
||||
```bash
|
||||
journalctl -u pveproxy --no-pager -n 200 | grep -i "error\|fail\|exit"
|
||||
```
|
||||
|
||||
2. **Check Proxmox installation:**
|
||||
```bash
|
||||
dpkg -l | grep proxmox
|
||||
pveversion -v
|
||||
```
|
||||
|
||||
3. **Reinstall pveproxy (if needed):**
|
||||
```bash
|
||||
apt update
|
||||
apt install --reinstall pve-manager  # pveproxy ships in the pve-manager package
|
||||
systemctl restart pveproxy
|
||||
```
|
||||
|
||||
4. **Check system resources:**
|
||||
```bash
|
||||
free -h
|
||||
df -h
|
||||
top -bn1 | head -20
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
**Once you're done, let me know:**
|
||||
1. What password you set
|
||||
2. Whether pveproxy is working
|
||||
3. If the web interface is accessible
|
||||
4. Any error messages you encountered
|
||||
|
||||
I'll update the documentation accordingly!
|
||||
|
||||
185
docs/09-troubleshooting/R630-04-PROXMOX-TROUBLESHOOTING.md
Normal file
@@ -0,0 +1,185 @@
|
||||
# R630-04 Proxmox Troubleshooting Guide
|
||||
|
||||
**IP Address:** 192.168.11.14
|
||||
**Proxmox Version:** 6.17.2-1-PVE
|
||||
**Issue:** pveproxy worker exit (web interface not accessible on port 8006)
|
||||
|
||||
---
|
||||
|
||||
## Problem Summary
|
||||
|
||||
- Proxmox VE is installed (version 6.17.2-1-PVE)
|
||||
- SSH access works (port 22)
|
||||
- Web interface not accessible (port 8006)
|
||||
- pveproxy workers are crashing/exiting
|
||||
|
||||
---
|
||||
|
||||
## Diagnostic Steps
|
||||
|
||||
### 1. Check pveproxy Service Status
|
||||
|
||||
```bash
|
||||
systemctl status pveproxy --no-pager -l
|
||||
```
|
||||
|
||||
Look for:
|
||||
- Service state (should be "active (running)")
|
||||
- Worker process exits
|
||||
- Error messages
|
||||
|
||||
### 2. Check Recent Logs
|
||||
|
||||
```bash
|
||||
journalctl -u pveproxy --no-pager -n 100
|
||||
```
|
||||
|
||||
Look for:
|
||||
- Worker exit messages
|
||||
- Error patterns
|
||||
- Stack traces
|
||||
|
||||
### 3. Check Port 8006
|
||||
|
||||
```bash
|
||||
ss -tlnp | grep 8006
|
||||
# or
|
||||
netstat -tlnp | grep 8006
|
||||
```
|
||||
|
||||
Should show pveproxy listening on port 8006.
|
||||
|
||||
### 4. Check Proxmox Cluster Status
|
||||
|
||||
```bash
|
||||
pvecm status
|
||||
```
|
||||
|
||||
If in a cluster, verify cluster connectivity.
|
||||
|
||||
---
|
||||
|
||||
## Common Fixes
|
||||
|
||||
### Fix 1: Restart pveproxy Service
|
||||
|
||||
```bash
|
||||
systemctl restart pveproxy
|
||||
systemctl status pveproxy
|
||||
```
|
||||
|
||||
### Fix 2: Check and Fix Configuration
|
||||
|
||||
```bash
|
||||
# Check configuration files
|
||||
ls -la /etc/pveproxy/
|
||||
cat /etc/default/pveproxy 2>/dev/null
|
||||
|
||||
# Check for syntax errors
|
||||
pveproxy --help
|
||||
```
|
||||
|
||||
### Fix 3: Reinstall pveproxy Package
|
||||
|
||||
```bash
|
||||
apt update
|
||||
apt install --reinstall pve-manager  # pveproxy ships in the pve-manager package
|
||||
systemctl restart pveproxy
|
||||
```
|
||||
|
||||
### Fix 4: Check for Port Conflicts
|
||||
|
||||
```bash
|
||||
# Find what's using port 8006
|
||||
ss -tlnp | grep 8006
|
||||
lsof -i :8006
|
||||
|
||||
# If something else is using it, stop that service
|
||||
```
|
||||
|
||||
### Fix 5: Check Disk Space and Permissions
|
||||
|
||||
```bash
|
||||
# Check disk space
|
||||
df -h
|
||||
|
||||
# Check log directory permissions
|
||||
ls -la /var/log/pveproxy/
|
||||
# Should be owned by www-data:www-data (pveproxy runs as the www-data user)
|
||||
```
|
||||
|
||||
### Fix 6: Check for Corrupted Database
|
||||
|
||||
```bash
|
||||
# Check Proxmox database
|
||||
pveversion -v
|
||||
|
||||
# Check cluster database (if in cluster)
|
||||
systemctl status pve-cluster
|
||||
```
|
||||
|
||||
### Fix 7: Full Service Restart
|
||||
|
||||
```bash
|
||||
# Restart all Proxmox services
|
||||
systemctl restart pveproxy pvedaemon pvestatd pve-cluster
|
||||
systemctl status pveproxy pvedaemon pvestatd pve-cluster
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Advanced Troubleshooting
|
||||
|
||||
### View Real-time Logs
|
||||
|
||||
```bash
|
||||
journalctl -u pveproxy -f
|
||||
```
|
||||
|
||||
### Check Worker Process Details
|
||||
|
||||
```bash
|
||||
# See running pveproxy processes
|
||||
ps aux | grep pveproxy
|
||||
|
||||
# Check process limits
|
||||
cat /proc/$(pgrep -f pveproxy | head -1)/limits
|
||||
```
|
||||
|
||||
### Test pveproxy Manually
|
||||
|
||||
```bash
|
||||
# Stop service
|
||||
systemctl stop pveproxy
|
||||
|
||||
# Try running manually to see errors
|
||||
/usr/bin/pveproxy start
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Scripts Available
|
||||
|
||||
1. **check-r630-04-commands.sh** - Diagnostic commands
|
||||
2. **fix-r630-04-pveproxy.sh** - Automated fix script
|
||||
|
||||
---
|
||||
|
||||
## Expected Resolution
|
||||
|
||||
After fixing:
|
||||
- `systemctl status pveproxy` should show "active (running)"
|
||||
- `ss -tlnp | grep 8006` should show pveproxy listening
|
||||
- Web interface should be accessible at `https://192.168.11.14:8006`
|
||||
|
||||
---
|
||||
|
||||
## Additional Resources
|
||||
|
||||
- Proxmox VE Documentation: https://pve.proxmox.com/pve-docs/
|
||||
- Proxmox Forum: https://forum.proxmox.com/
|
||||
- Log locations:
|
||||
- `/var/log/pveproxy/access.log`
|
||||
- `/var/log/pveproxy/error.log`
|
||||
- `journalctl -u pveproxy`
|
||||
|
||||
329
docs/09-troubleshooting/SECURITY_INCIDENT_RESPONSE.md
Normal file
@@ -0,0 +1,329 @@
|
||||
# Security Incident Response Procedures
|
||||
|
||||
**Last Updated:** 2025-01-20
|
||||
**Document Version:** 1.0
|
||||
**Status:** Active Documentation
|
||||
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
This document outlines procedures for responding to security incidents, including detection, containment, eradication, recovery, and post-incident activities.
|
||||
|
||||
---
|
||||
|
||||
## Incident Response Phases
|
||||
|
||||
### Phase 1: Preparation
|
||||
|
||||
**Pre-Incident Activities:**
|
||||
|
||||
1. **Incident Response Team:**
|
||||
- Define roles and responsibilities
|
||||
- Establish communication channels
|
||||
- Create contact list
|
||||
|
||||
2. **Tools and Resources:**
|
||||
- Log collection and analysis tools
|
||||
- Forensic tools
|
||||
- Backup systems
|
||||
- Documentation
|
||||
|
||||
3. **Procedures:**
|
||||
- Incident classification
|
||||
- Escalation procedures
|
||||
- Communication templates
|
||||
|
||||
---
|
||||
|
||||
### Phase 2: Detection and Analysis
|
||||
|
||||
#### Detection Methods
|
||||
|
||||
1. **Automated Detection:**
|
||||
- Intrusion detection systems (IDS)
|
||||
- Security information and event management (SIEM)
|
||||
- Log analysis
|
||||
- Anomaly detection
|
||||
|
||||
2. **Manual Detection:**
|
||||
- User reports
|
||||
- System administrator observations
|
||||
- Security audits
|
||||
|
||||
#### Incident Classification
|
||||
|
||||
**Severity Levels:**
|
||||
|
||||
- **Critical:** Active breach, data exfiltration, system compromise
|
||||
- **High:** Unauthorized access, potential data exposure
|
||||
- **Medium:** Suspicious activity, policy violations
|
||||
- **Low:** Minor security events, false positives
|
||||
|
||||
#### Initial Analysis
|
||||
|
||||
**Information Gathering:**
|
||||
|
||||
1. **What Happened:**
|
||||
- Timeline of events
|
||||
- Affected systems
|
||||
- Indicators of compromise (IOCs)
|
||||
|
||||
2. **Who/What:**
|
||||
- Source of attack
|
||||
- Attack vector
|
||||
- Tools used
|
||||
|
||||
3. **Impact Assessment:**
|
||||
- Data accessed/modified
|
||||
- Systems compromised
|
||||
- Business impact
|
||||
|
||||
---
|
||||
|
||||
### Phase 3: Containment
|
||||
|
||||
#### Short-Term Containment
|
||||
|
||||
**Immediate Actions:**
|
||||
|
||||
1. **Isolate Affected Systems:**
|
||||
```bash
|
||||
# Disable network interface
|
||||
ip link set <interface> down
|
||||
|
||||
# Block IP addresses
|
||||
iptables -I INPUT 1 -s <attacker-ip> -j DROP  # -I puts the rule ahead of any earlier ACCEPT
|
||||
```
|
||||
|
||||
2. **Preserve Evidence:**
|
||||
- Take snapshots of affected systems
|
||||
- Copy logs
|
||||
- Document current state
|
||||
|
||||
3. **Disable Compromised Accounts:**
|
||||
```bash
|
||||
# Disable user account
|
||||
usermod -L -e 1 <username>  # -L locks the password; -e 1 expires the account so non-password logins are refused too
|
||||
|
||||
# Revoke API tokens
|
||||
# Via Proxmox UI: Datacenter → Permissions → API Tokens
|
||||
```
|
||||
|
||||
#### Long-Term Containment
|
||||
|
||||
**System Hardening:**
|
||||
|
||||
1. **Update Security Controls:**
|
||||
- Patch vulnerabilities
|
||||
- Update firewall rules
|
||||
- Enhance monitoring
|
||||
|
||||
2. **Access Control:**
|
||||
- Review user accounts
|
||||
- Rotate credentials
|
||||
- Implement MFA where possible
|
||||
|
||||
---
|
||||
|
||||
### Phase 4: Eradication
|
||||
|
||||
#### Remove Threat
|
||||
|
||||
**Actions:**
|
||||
|
||||
1. **Remove Malware:**
|
||||
```bash
|
||||
# Scan for malware
|
||||
clamscan -r /path/to/scan
|
||||
|
||||
# Remove infected files
|
||||
# (after verification)
|
||||
```
|
||||
|
||||
2. **Close Attack Vectors:**
|
||||
- Patch vulnerabilities
|
||||
- Fix misconfigurations
|
||||
- Update security policies
|
||||
|
||||
3. **Clean Compromised Systems:**
|
||||
- Rebuild from known-good backups
|
||||
- Verify system integrity
|
||||
- Reinstall if necessary
|
||||
|
||||
---
|
||||
|
||||
### Phase 5: Recovery
|
||||
|
||||
#### System Restoration
|
||||
|
||||
**Steps:**
|
||||
|
||||
1. **Restore from Backups:**
|
||||
- Use pre-incident backups
|
||||
- Verify backup integrity
|
||||
- Restore systems
|
||||
|
||||
2. **Verify System Integrity:**
|
||||
- Check system logs
|
||||
- Verify configurations
|
||||
- Test functionality
|
||||
|
||||
3. **Monitor Systems:**
|
||||
- Enhanced monitoring
|
||||
- Watch for re-infection
|
||||
- Track system behavior
|
||||
|
||||
#### Service Restoration
|
||||
|
||||
**Gradual Restoration:**
|
||||
|
||||
1. **Priority Systems First:**
|
||||
- Critical services
|
||||
- Business-critical applications
|
||||
- User-facing services
|
||||
|
||||
2. **Verification:**
|
||||
- Test each service
|
||||
- Verify data integrity
|
||||
- Confirm functionality
|
||||
|
||||
---
|
||||
|
||||
### Phase 6: Post-Incident Activity
|
||||
|
||||
#### Lessons Learned
|
||||
|
||||
**Post-Incident Review:**
|
||||
|
||||
1. **Timeline Review:**
|
||||
- Document complete timeline
|
||||
- Identify gaps in response
|
||||
- Note what worked well
|
||||
|
||||
2. **Root Cause Analysis:**
|
||||
- Identify root cause
|
||||
- Determine contributing factors
|
||||
- Document findings
|
||||
|
||||
3. **Improvements:**
|
||||
- Update procedures
|
||||
- Enhance security controls
|
||||
- Improve monitoring
|
||||
|
||||
#### Documentation
|
||||
|
||||
**Incident Report:**
|
||||
|
||||
1. **Executive Summary:**
|
||||
- Incident overview
|
||||
- Impact assessment
|
||||
- Response timeline
|
||||
|
||||
2. **Technical Details:**
|
||||
- Attack vector
|
||||
- IOCs
|
||||
- Remediation steps
|
||||
|
||||
3. **Recommendations:**
|
||||
- Security improvements
|
||||
- Process improvements
|
||||
- Training needs
|
||||
|
||||
---
|
||||
|
||||
## Incident Response Contacts
|
||||
|
||||
### Primary Contacts
|
||||
|
||||
- **Security Team Lead:** [Contact Information]
|
||||
- **Infrastructure Lead:** [Contact Information]
|
||||
- **Management:** [Contact Information]
|
||||
|
||||
### Escalation
|
||||
|
||||
- **Level 1:** Security team (immediate)
|
||||
- **Level 2:** Management (1 hour)
|
||||
- **Level 3:** External security firm (4 hours)
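One way to read the escalation levels above is as time thresholds on an unresolved incident. The sketch below encodes that reading; it is an assumption rather than a stated policy, and `escalation_level` is a hypothetical helper name:

```bash
# Hypothetical helper: maps minutes since detection to the escalation level
# listed above (Level 1 immediately, Level 2 at 1 hour, Level 3 at 4 hours).
escalation_level() {
  local minutes="$1"
  if [ "$minutes" -ge 240 ]; then
    echo 3
  elif [ "$minutes" -ge 60 ]; then
    echo 2
  else
    echo 1
  fi
}

# Usage:
#   escalation_level 90
```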
|
||||
|
||||
---
|
||||
|
||||
## Common Incident Scenarios
|
||||
|
||||
### Unauthorized Access
|
||||
|
||||
**Symptoms:**
|
||||
- Unknown logins
|
||||
- Unusual account activity
|
||||
- Failed login attempts
|
||||
|
||||
**Response:**
|
||||
1. Disable compromised accounts
|
||||
2. Review access logs
|
||||
3. Change all passwords
|
||||
4. Investigate source
|
||||
|
||||
### Malware Infection
|
||||
|
||||
**Symptoms:**
|
||||
- Unusual system behavior
|
||||
- High CPU/memory usage
|
||||
- Network anomalies
|
||||
|
||||
**Response:**
|
||||
1. Isolate affected systems
|
||||
2. Identify malware
|
||||
3. Remove malware
|
||||
4. Restore from backup if needed
|
||||
|
||||
### Data Breach
|
||||
|
||||
**Symptoms:**
|
||||
- Unauthorized data access
|
||||
- Data exfiltration
|
||||
- Database anomalies
|
||||
|
||||
**Response:**
|
||||
1. Contain breach
|
||||
2. Assess data exposure
|
||||
3. Notify affected parties (if required)
|
||||
4. Enhance security controls
|
||||
|
||||
---
|
||||
|
||||
## Prevention
|
||||
|
||||
### Security Best Practices
|
||||
|
||||
1. **Regular Updates:**
|
||||
- Keep systems patched
|
||||
- Update security tools
|
||||
- Review configurations
|
||||
|
||||
2. **Monitoring:**
|
||||
- Log analysis
|
||||
- Anomaly detection
|
||||
- Regular audits
|
||||
|
||||
3. **Access Control:**
|
||||
- Least privilege principle
|
||||
- MFA where possible
|
||||
- Regular access reviews
|
||||
|
||||
4. **Backups:**
|
||||
- Regular backups
|
||||
- Test restores
|
||||
- Offsite backups
|
||||
|
||||
---
|
||||
|
||||
## Related Documentation
|
||||
|
||||
- **[DISASTER_RECOVERY.md](../03-deployment/DISASTER_RECOVERY.md)** - Disaster recovery procedures
|
||||
- **[BACKUP_AND_RESTORE.md](../03-deployment/BACKUP_AND_RESTORE.md)** - Backup procedures
|
||||
- **[TROUBLESHOOTING_FAQ.md](TROUBLESHOOTING_FAQ.md)** - General troubleshooting
|
||||
|
||||
---
|
||||
|
||||
**Last Updated:** 2025-01-20
|
||||
**Review Cycle:** Quarterly
|
||||
113
docs/09-troubleshooting/STORAGE_MIGRATION_ISSUE.md
Normal file
@@ -0,0 +1,113 @@
|
||||
# Storage Migration Issue - pve2 Configuration
|
||||
|
||||
**Date**: $(date)
|
||||
**Issue**: Container migrations failing due to storage configuration mismatch
|
||||
|
||||
## Problem
|
||||
|
||||
Container migrations from ml110 to pve2 are failing with the error:
|
||||
```
|
||||
Volume group "pve" not found
|
||||
ERROR: storage migration for 'local-lvm:vm-XXXX-disk-0' to storage 'local-lvm' failed
|
||||
```
|
||||
|
||||
## Root Cause
|
||||
|
||||
**ml110** (source):
|
||||
- Has `local-lvm` storage **active**
|
||||
- Uses volume group named **"pve"** (standard Proxmox setup)
|
||||
- Containers stored on `local-lvm:vm-XXXX-disk-0`
|
||||
|
||||
**pve2** (target):
|
||||
- Has `local-lvm` storage but it's **INACTIVE**
|
||||
- Has volume groups named **lvm1, lvm2, lvm3, lvm4, lvm5, lvm6** instead of "pve"
|
||||
- Storage is not properly configured for Proxmox
|
||||
|
||||
## Storage Status
|
||||
|
||||
### ml110 Storage
|
||||
```
|
||||
local-lvm: lvmthin, active, 832GB total, 108GB used
|
||||
Volume Group: pve (standard)
|
||||
```
|
||||
|
||||
### pve2 Storage
|
||||
```
|
||||
local-lvm: lvmthin, INACTIVE, 0GB available
|
||||
Volume Groups: lvm1, lvm2, lvm3, lvm4, lvm5, lvm6 (non-standard)
|
||||
```
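The mismatch can be confirmed mechanically. The sketch below assumes volume group names arrive one per line, as printed by `vgs --noheadings -o vg_name`; `has_volume_group` is a hypothetical helper name:

```bash
# Hypothetical helper: reads volume group names (one per line, as printed by
# `vgs --noheadings -o vg_name`) on stdin and succeeds if the named VG exists.
has_volume_group() {
  local wanted="$1"
  awk -v vg="$wanted" '$1 == vg { found = 1 } END { exit !found }'
}

# Usage on pve2:
#   vgs --noheadings -o vg_name | has_volume_group pve || echo 'no "pve" volume group'
```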
|
||||
|
||||
## Solutions
|
||||
|
||||
### Option 1: Configure pve2's local-lvm Storage (Recommended)
|
||||
|
||||
1. **Rename/create "pve" volume group on pve2**:
|
||||
```bash
|
||||
# On pve2, check current LVM setup
|
||||
ssh root@192.168.11.12 "vgs; lvs"
|
||||
|
||||
# Rename one of the volume groups to "pve" (if possible)
|
||||
# OR create a new "pve" volume group from available space
|
||||
```
|
||||
|
||||
2. **Activate local-lvm storage on pve2**:
|
||||
```bash
|
||||
# Check storage configuration
|
||||
ssh root@192.168.11.12 "cat /etc/pve/storage.cfg"
|
||||
|
||||
# May need to reconfigure local-lvm to use correct volume group
|
||||
```
|
||||
|
||||
### Option 2: Migrate to Different Storage on pve2
|
||||
|
||||
Use `local` (directory storage) instead of `local-lvm`:
|
||||
|
||||
```bash
|
||||
# Migrate with storage specification
|
||||
pct migrate <VMID> pve2 --target-storage local --restart
|
||||
```
|
||||
|
||||
**Pros**: Works immediately, no storage reconfiguration needed
|
||||
**Cons**: Directory storage is slower than LVM thin provisioning
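When several containers are involved, it may help to print the commands for review before running any of them. This is a dry-run sketch only: `print_migrate_commands` is a hypothetical helper, the VMIDs in the usage line are placeholders, and it assumes `--target-storage` is the `pct migrate` option that selects the destination storage:

```bash
# Hypothetical helper: prints (does not run) the migrate command for each VMID,
# so the list can be reviewed before anything is moved.
# Assumption: --target-storage is the pct option for choosing the destination storage.
print_migrate_commands() {
  local target="$1" storage="$2"
  shift 2
  local vmid
  for vmid in "$@"; do
    echo "pct migrate ${vmid} ${target} --target-storage ${storage} --restart"
  done
}

# Usage (VMIDs here are placeholders):
#   print_migrate_commands pve2 local 1000 1001
```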
|
||||
|
||||
### Option 3: Use Shared Storage
|
||||
|
||||
Configure shared storage (NFS, Ceph, etc.) accessible from both nodes:
|
||||
|
||||
```bash
|
||||
# Add shared storage to cluster
|
||||
# Then migrate containers to shared storage
|
||||
```
|
||||
|
||||
## Immediate Workaround
|
||||
|
||||
Until pve2's local-lvm is properly configured, we can:
|
||||
|
||||
1. **Skip migrations** for now
|
||||
2. **Configure pve2 storage** first
|
||||
3. **Then proceed with migrations**
|
||||
|
||||
## Next Steps
|
||||
|
||||
1. ⏳ Investigate pve2's LVM configuration
|
||||
2. ⏳ Configure local-lvm storage on pve2 with "pve" volume group
|
||||
3. ⏳ Verify storage is active and working
|
||||
4. ⏳ Retry container migrations
|
||||
|
||||
## Verification Commands
|
||||
|
||||
```bash
|
||||
# Check pve2 storage status
|
||||
ssh root@192.168.11.12 "pvesm status"
|
||||
|
||||
# Check volume groups
|
||||
ssh root@192.168.11.12 "vgs"
|
||||
|
||||
# Check local-lvm configuration
|
||||
ssh root@192.168.11.12 "cat /etc/pve/storage.cfg | grep -A 5 local-lvm"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
**Status**: ⚠️ Migrations paused pending storage configuration fix
|
||||
|
||||
@@ -4,12 +4,16 @@ Common issues and solutions for Besu validated set deployment.
|
||||
|
||||
## Table of Contents
|
||||
|
||||
1. [Container Issues](#container-issues)
|
||||
2. [Service Issues](#service-issues)
|
||||
3. [Network Issues](#network-issues)
|
||||
4. [Consensus Issues](#consensus-issues)
|
||||
5. [Configuration Issues](#configuration-issues)
|
||||
6. [Performance Issues](#performance-issues)
|
||||
**Estimated Reading Time:** 30 minutes
|
||||
**Progress:** Check off sections as you read
|
||||
|
||||
1. ✅ [Container Issues](#container-issues) - *Container troubleshooting*
|
||||
2. ✅ [Service Issues](#service-issues) - *Service troubleshooting*
|
||||
3. ✅ [Network Issues](#network-issues) - *Network troubleshooting*
|
||||
4. ✅ [Consensus Issues](#consensus-issues) - *Consensus troubleshooting*
|
||||
5. ✅ [Configuration Issues](#configuration-issues) - *Configuration troubleshooting*
|
||||
6. ✅ [Performance Issues](#performance-issues) - *Performance troubleshooting*
|
||||
7. ✅ [Additional Common Questions](#additional-common-questions) - *More FAQs*
|
||||
|
||||
---
|
||||
|
||||
@@ -43,6 +47,27 @@ pct start <vmid>
|
||||
- Invalid container configuration
|
||||
- OS template issues
|
||||
|
||||
<details>
|
||||
<summary>Click to expand advanced troubleshooting steps</summary>
|
||||
|
||||
**Advanced Diagnostics:**
|
||||
```bash
|
||||
# Check container resources
|
||||
pct config <vmid>
|
||||
|
||||
# Check Proxmox host resources
|
||||
free -h
|
||||
df -h
|
||||
|
||||
# Check container logs in detail
|
||||
journalctl -u pve-container@<vmid> -n 100 --no-pager
|
||||
|
||||
# Verify container template
|
||||
pveam list <storage> | grep <template-name>
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
---
|
||||
|
||||
### Q: Container runs out of disk space
|
||||
@@ -483,6 +508,187 @@ If issues persist:
|
||||
|
||||
---
|
||||
|
||||
## Additional Common Questions
|
||||
|
||||
### Q: How do I add a new VMID?
|
||||
|
||||
**Answer:**
|
||||
1. Check available VMID ranges in [VMID_ALLOCATION_FINAL.md](../02-architecture/VMID_ALLOCATION_FINAL.md)
|
||||
2. Select an appropriate VMID from the designated range for your service
|
||||
3. Verify the VMID is not already in use: `pct list | grep <vmid>` or `qm list | grep <vmid>`
|
||||
4. Document the assignment in VMID_ALLOCATION_FINAL.md
|
||||
5. Use the VMID when creating containers/VMs
|
||||
|
||||
**Example:**
|
||||
```bash
|
||||
# Check if VMID 2503 is available
|
||||
pct list | grep 2503
|
||||
qm list | grep 2503
|
||||
|
||||
# If available, create container with VMID 2503
|
||||
pct create 2503 ...
|
||||
```
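A plain `grep 2503` also matches VMID 12503 or a container whose name contains those digits. An exact match on the first column can be sketched as follows, assuming the VMID is the first column of `pct list` and `qm list` output after a header row; `vmid_in_use` is a hypothetical helper name:

```bash
# Hypothetical helper: reads `pct list` or `qm list` output on stdin and succeeds
# only if the first column of some row equals the VMID exactly.
vmid_in_use() {
  local vmid="$1"
  awk -v id="$vmid" 'NR > 1 && $1 == id { found = 1 } END { exit !found }'
}

# Usage on the Proxmox host:
#   pct list | vmid_in_use 2503 && echo "2503 is taken by a container"
#   qm list  | vmid_in_use 2503 && echo "2503 is taken by a VM"
```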
|
||||
|
||||
**Related Documentation:**
|
||||
- [VMID Allocation Registry](../02-architecture/VMID_ALLOCATION_FINAL.md) ⭐⭐⭐
|
||||
- [VMID Quick Reference](../12-quick-reference/VMID_QUICK_REFERENCE.md) ⭐⭐⭐
|
||||
|
||||
---
|
||||
|
||||
### Q: What's the difference between public and private RPC?
|
||||
|
||||
**Answer:**
|
||||
|
||||
| Feature | Public RPC | Private RPC |
|
||||
|---------|-----------|-------------|
|
||||
| **Discovery** | Enabled | Disabled |
|
||||
| **Permissioning** | Disabled | Enabled |
|
||||
| **Access** | Public (CORS: *) | Restricted (internal only) |
|
||||
| **APIs** | ETH, NET, WEB3 (read-only) | ETH, NET, WEB3, ADMIN, DEBUG (full) |
|
||||
| **Use Case** | dApps, external users | Internal services, admin |
|
||||
| **ChainID** | 0x8a (138) or 0x1 (wallet compatibility) | 0x8a (138) |
|
||||
| **Domain** | rpc-http-pub.d-bis.org | rpc-http-prv.d-bis.org |
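The ChainID row mixes hex and decimal notation. A small sketch for converting a hex value of the kind an `eth_chainId` call returns; `chain_id_to_decimal` is a hypothetical helper name:

```bash
# Convert a hex chain ID, as returned by the eth_chainId JSON-RPC method, to decimal.
# `chain_id_to_decimal` is a hypothetical helper name.
chain_id_to_decimal() {
  printf '%d\n' "$1"
}

chain_id_to_decimal 0x8a   # prints 138
chain_id_to_decimal 0x1    # prints 1
```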
|
||||
|
||||
**Public RPC:**
|
||||
- Accessible from the internet
|
||||
- Used by dApps and external tools
|
||||
- Read-only APIs for security
|
||||
- May report chainID 0x1 for MetaMask compatibility
|
||||
|
||||
**Private RPC:**
|
||||
- Internal network only
|
||||
- Used by internal services and administration
|
||||
- Full API access including ADMIN and DEBUG
|
||||
- Strict permissioning and access control
|
||||
|
||||
**Related Documentation:**
|
||||
- [RPC Node Types Architecture](../05-network/RPC_NODE_TYPES_ARCHITECTURE.md) ⭐⭐
|
||||
- [RPC Template Types](../05-network/RPC_TEMPLATE_TYPES.md) ⭐
|
||||
|
||||
---
|
||||
|
||||
### Q: How do I troubleshoot Cloudflare tunnel issues?
|
||||
|
||||
**Answer:**
|
||||
|
||||
**Step 1: Check Tunnel Status**
|
||||
```bash
|
||||
# Check cloudflared container status
|
||||
pct status 102
|
||||
|
||||
# Check tunnel logs
|
||||
pct exec 102 -- journalctl -u 'cloudflared*' -n 50 --no-pager
|
||||
|
||||
# Verify tunnel is running
|
||||
pct exec 102 -- ps aux | grep cloudflared
|
||||
```
|
||||
|
||||
**Step 2: Verify Configuration**
|
||||
```bash
|
||||
# Check tunnel configuration
|
||||
pct exec 102 -- cat /etc/cloudflared/config.yaml
|
||||
|
||||
# Verify credentials file exists
|
||||
pct exec 102 -- ls -la /etc/cloudflared/*.json
|
||||
```
|
||||
|
||||
**Step 3: Test Connectivity**
|
||||
```bash
|
||||
# Test from internal network
|
||||
curl -I http://192.168.11.21:80
|
||||
|
||||
# Test from external (through Cloudflare)
|
||||
curl -I https://explorer.d-bis.org
|
||||
```
|
||||
|
||||
**Step 4: Check Cloudflare Dashboard**
|
||||
- Verify tunnel is healthy in Cloudflare Zero Trust dashboard
|
||||
- Check ingress rules are configured correctly
|
||||
- Verify DNS records point to tunnel
|
||||
|
||||
**Common Issues:**
|
||||
- Tunnel not running → Restart: `pct reboot 102`
|
||||
- Configuration error → Check YAML syntax
|
||||
- Credentials invalid → Regenerate tunnel token
|
||||
- DNS not resolving → Check Cloudflare DNS settings
|
||||
|
||||
**Related Documentation:**
|
||||
- [Cloudflare Tunnel Routing Architecture](../05-network/CLOUDFLARE_TUNNEL_ROUTING_ARCHITECTURE.md) ⭐⭐⭐
|
||||
- [Cloudflare Routing Master Reference](../05-network/CLOUDFLARE_ROUTING_MASTER.md) ⭐⭐⭐
|
||||
- [Troubleshooting Quick Reference](../12-quick-reference/TROUBLESHOOTING_QUICK_REFERENCE.md) ⭐⭐⭐
|
||||
|
||||
---
|
||||
|
||||
### Q: What's the recommended storage configuration?
|
||||
|
||||
**Answer:**
|
||||
|
||||
**For R630 Compute Nodes:**
|
||||
- **Boot drives (2×600GB):** ZFS mirror (recommended) or hardware RAID1
|
||||
- **Data SSDs (6×250GB):** ZFS pool with one of:
|
||||
- Striped mirrors (if pairs available)
|
||||
- RAIDZ1 (single parity, 5 drives usable)
|
||||
- RAIDZ2 (double parity, 4 drives usable)
|
||||
- **High-write workloads:** Dedicated dataset with quotas
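The "drives usable" figures for the six data SSDs follow from simple arithmetic. A rough sketch, assuming equal-size drives and ignoring ZFS metadata, padding, and reserved space; `usable_gb` is a hypothetical helper name:

```bash
# Rough usable capacity in GB for N equal drives; ignores ZFS metadata,
# padding, and reserved space. `usable_gb` is a hypothetical helper.
usable_gb() {
  local layout="$1" drives="$2" size_gb="$3"
  case "$layout" in
    mirrors) echo $(( drives / 2 * size_gb )) ;;   # striped 2-way mirrors
    raidz1)  echo $(( (drives - 1) * size_gb )) ;; # one drive of parity
    raidz2)  echo $(( (drives - 2) * size_gb )) ;; # two drives of parity
    *)       echo "unknown layout: $layout" >&2; return 1 ;;
  esac
}

usable_gb mirrors 6 250   # prints 750
usable_gb raidz1  6 250   # prints 1250
usable_gb raidz2  6 250   # prints 1000
```

Striped mirrors give the least space of the three but generally rebuild faster; the numbers above only compare raw capacity.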
|
||||
|
||||
**For ML110 Management Node:**
|
||||
- Standard Proxmox storage configuration
|
||||
- Sufficient space for templates and backups
|
||||
|
||||
**Storage Best Practices:**
|
||||
- Use ZFS for data integrity and snapshots
|
||||
- Enable compression for space efficiency
|
||||
- Set quotas for containers to prevent disk exhaustion
|
||||
- Regular backups to external storage
|
||||
|
||||
**Related Documentation:**
|
||||
- [Network Architecture - Storage Orchestration](../02-architecture/NETWORK_ARCHITECTURE.md#53-storage-orchestration-r630) ⭐⭐⭐
|
||||
- [Backup and Restore](../03-deployment/BACKUP_AND_RESTORE.md) ⭐⭐
|
||||
|
||||
---
|
||||
|
||||
### Q: How do I migrate from flat LAN to VLANs?
|
||||
|
||||
**Answer:**
|
||||
|
||||
**Phase 1: Preparation**
|
||||
1. Review VLAN plan in [NETWORK_ARCHITECTURE.md](../02-architecture/NETWORK_ARCHITECTURE.md)
|
||||
2. Document current IP assignments
|
||||
3. Plan IP address migration for each service
|
||||
4. Create rollback plan
|
||||
|
||||
**Phase 2: Network Configuration**
|
||||
1. Configure ES216G switches with VLAN trunks
|
||||
2. Enable VLAN-aware bridge on Proxmox hosts
|
||||
3. Create VLAN interfaces on ER605 router
|
||||
4. Test VLAN connectivity
|
||||
|
||||
**Phase 3: Service Migration**
|
||||
1. Migrate services one VLAN at a time
|
||||
2. Start with non-critical services
|
||||
3. Update container/VM network configuration
|
||||
4. Verify connectivity after each migration
|
||||
|
||||
**Phase 4: Validation**
|
||||
1. Test all services on new VLANs
|
||||
2. Verify routing between VLANs
|
||||
3. Test egress NAT pools
|
||||
4. Document final configuration
|
||||
|
||||
**Migration Order (Recommended):**
|
||||
1. Management services (VLAN 11) - Already active
|
||||
2. Monitoring/observability (VLAN 120, 121)
|
||||
3. Besu network (VLANs 110, 111, 112)
|
||||
4. CCIP network (VLANs 130, 132, 133, 134)
|
||||
5. Service layer (VLAN 160)
|
||||
6. Sovereign tenants (VLANs 200-203)
|
||||
|
||||
**Related Documentation:**
|
||||
- [Network Architecture - VLAN Orchestration](../02-architecture/NETWORK_ARCHITECTURE.md#3-layer-2--vlan-orchestration-plan) ⭐⭐⭐
|
||||
- [Orchestration Deployment Guide - VLAN Enablement](../02-architecture/ORCHESTRATION_DEPLOYMENT_GUIDE.md#phase-1--vlan-enablement) ⭐⭐⭐
|
||||
|
||||
---
|
||||
|
||||
## Related Documentation
|
||||
|
||||
### Operational Procedures
|
||||
|
||||
158
docs/09-troubleshooting/TROUBLESHOOTING_GUIDE.md
Normal file
@@ -0,0 +1,158 @@
|
||||
# Comprehensive Troubleshooting Guide
|
||||
|
||||
**Purpose**: Common issues and solutions for bridge operations
|
||||
|
||||
---
|
||||
|
||||
## ❌ Common Errors
|
||||
|
||||
### "Execution reverted"
|
||||
|
||||
**Cause**: Transaction reverted by contract logic
|
||||
|
||||
**Solutions**:
|
||||
1. Check contract state
|
||||
2. Verify parameters
|
||||
3. Check allowances
|
||||
4. Verify balances
|
||||
|
||||
**Debug**:
|
||||
```bash
|
||||
cast call <CONTRACT> "<function>" <args> --rpc-url $RPC_URL
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### "Insufficient funds"
|
||||
|
||||
**Cause**: Not enough ETH for gas or LINK for fees
|
||||
|
||||
**Solutions**:
|
||||
1. Check ETH balance
|
||||
```bash
|
||||
cast balance <address> --rpc-url $RPC_URL
|
||||
```
|
||||
|
||||
2. Check LINK balance
|
||||
```bash
|
||||
cast call <LINK_TOKEN> "balanceOf(address)(uint256)" <address> --rpc-url $RPC_URL
|
||||
```
|
||||
|
||||
3. Add funds if needed
|
||||
|
||||
---
|
||||
|
||||
### "Nonce too low"
|
||||
|
||||
**Cause**: Transaction nonce is lower than current nonce
|
||||
|
||||
**Solutions**:
|
||||
1. Check current nonce
|
||||
```bash
|
||||
cast nonce <address> --rpc-url $RPC_URL
|
||||
```
|
||||
|
||||
2. Wait for pending transactions
|
||||
3. Use correct nonce
|
||||
|
||||
---
|
||||
|
||||
### "Replacement transaction underpriced"
|
||||
|
||||
**Cause**: Pending transaction with lower gas price
|
||||
|
||||
**Solutions**:
|
||||
1. Wait for pending transaction
|
||||
2. Use higher gas price
|
||||
3. Cancel pending transaction (if possible)

---

### "Destination not enabled"

**Cause**: Destination chain not configured on bridge

**Solutions**:
1. Verify destination configuration
   ```bash
   cast call <BRIDGE> "destinations(uint64)" <SELECTOR> --rpc-url $RPC_URL
   ```

2. Configure destination if missing
   ```bash
   bash scripts/configure-bridge-destinations.sh
   ```

---

### "Gas price below minimum"

**Cause**: Gas price too low for network

**Solutions**:
1. Get current gas price
   ```bash
   cast gas-price --rpc-url $RPC_URL
   ```

2. Use higher gas price (1.2x-1.5x current)
   ```bash
   bash scripts/bridge-with-dynamic-gas.sh
   ```
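
The 1.2x-1.5x adjustment can be computed with plain integer arithmetic. This is a sketch only: `scale_gas_price` is a hypothetical helper, and whether `bridge-with-dynamic-gas.sh` works the same way is not shown in this guide.

```bash
# $1 = current gas price in wei (e.g. output of `cast gas-price`)
# $2 = multiplier in percent (120 means 1.2x, 150 means 1.5x)
# Prints the scaled price, rounded up to the next whole wei.
scale_gas_price() {
  echo $(( ($1 * $2 + 99) / 100 ))
}
```

For example, `scale_gas_price "$(cast gas-price --rpc-url $RPC_URL)" 120` yields a price 20% above the current one. Shell arithmetic is 64-bit, which is ample for realistic gas prices.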

---

## 🔍 Debugging Steps

### 1. Check System Status
```bash
bash scripts/health-check.sh
```

### 2. Check Transaction Status
```bash
cast tx <tx_hash> --rpc-url $RPC_URL
```

### 3. Check Logs
```bash
tail -100 logs/alerts-$(date +%Y%m%d).log
```

### 4. Run Test Suite
```bash
bash scripts/test-suite.sh all
```

### 5. Check Recent Events
```bash
bash scripts/monitor-bridge-transfers.sh
```

---

## 🛠️ Advanced Troubleshooting

### Transaction Stuck

1. Check transaction status
2. Check nonce
3. Retry with higher gas
4. Consider canceling if possible

### Contract Not Found

1. Verify contract address
2. Check network
3. Verify contract deployment

### RPC Issues

1. Test RPC connectivity
2. Check RPC logs
3. Try backup RPC endpoint

---

**Last Updated**: $(date)
121
docs/09-troubleshooting/TROUBLESHOOT_CONNECTION.md
Normal file
@@ -0,0 +1,121 @@
# Troubleshooting Proxmox Connection

## Current Issue

The Proxmox host `192.168.11.10` is not reachable from this machine.

## Diagnosis Results

- ❌ **Ping Test**: 100% packet loss (host unreachable)
- ❌ **Port 8006**: Not accessible
- ✅ **Configuration**: Loaded correctly from `~/.env`

## Possible Causes

1. **Network Connectivity**
   - Host is on a different network segment
   - VPN not connected
   - Network routing issue
   - Host is powered off

2. **Firewall**
   - Firewall blocking port 8006
   - Network firewall rules

3. **Wrong Host Address**
   - Host IP may have changed
   - Host may be on different network

## Troubleshooting Steps

### 1. Check Network Connectivity

```bash
# Test basic connectivity
ping -c 3 192.168.11.10

# Check if host is on same network
ip route | grep 192.168.11.0
```

### 2. Check Alternative Hosts

If you have access to other Proxmox hosts, try:

```bash
# Test connectivity to alternative hosts
ping -c 3 <alternative-proxmox-host>
```

### 3. Use Shell Script (SSH Alternative)

If you have SSH access to the Proxmox node, use the shell script instead:

```bash
export PROXMOX_HOST=192.168.11.10
export PROXMOX_USER=root
./list_vms.sh
```

The shell script uses SSH which may work even if the API port is blocked.

### 4. Check VPN/Network Access

If the Proxmox host is on a remote network:
- Ensure VPN is connected
- Verify network routing
- Check if you're on the correct network segment

### 5. Verify Host is Running

- Check if Proxmox host is powered on
- Verify Proxmox services are running
- Check Proxmox web interface accessibility

### 6. Test from Proxmox Host Itself

If you can access the Proxmox host directly:

```bash
# SSH to Proxmox host
ssh root@192.168.11.10

# Test API locally
curl -k https://localhost:8006/api2/json/version
```

## Alternative: Use Shell Script

The shell script (`list_vms.sh`) uses SSH instead of the API, which may work even if:
- API port is blocked
- You're on a different network
- VPN provides SSH access but not API access

```bash
export PROXMOX_HOST=192.168.11.10
export PROXMOX_USER=root
./list_vms.sh
```

## Next Steps

1. **If host is accessible via SSH**: Use `list_vms.sh`
2. **If host is on different network**: Connect VPN or update network routing
3. **If host IP changed**: Update `PROXMOX_HOST` in `~/.env`
4. **If host is down**: Wait for it to come back online

## Quick Test Commands

```bash
# Test ping
ping -c 3 192.168.11.10

# Test port
timeout 5 bash -c "echo > /dev/tcp/192.168.11.10/8006" && echo "Port open" || echo "Port closed"

# Test SSH (if available)
ssh -o ConnectTimeout=5 root@192.168.11.10 "pvesh get /nodes" && echo "SSH works" || echo "SSH failed"

# Check current network
ip addr show | grep "inet "
```
57
docs/09-troubleshooting/TUNNEL_SOLUTIONS.md
Normal file
@@ -0,0 +1,57 @@
# Tunnel-Based Solutions for Proxmox Access

## Quick Reference

### Your Current Situation
- **Your Network**: `192.168.1.0/24` (IP: 192.168.1.36)
- **Proxmox Network**: `192.168.11.0/24` (Hosts: 192.168.11.10, 192.168.11.11, 192.168.11.12)
- **Problem**: Different network segments, so your machine cannot reach the Proxmox hosts directly
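
Whether two addresses share a `/24` segment can be checked by comparing their first three octets. A minimal sketch, with hypothetical helper names `net24` and `same_net24`:

```bash
# Print the /24 network an IPv4 address belongs to.
net24() {
  echo "${1%.*}.0/24"   # drop the last octet, append .0/24
}

# Return 0 when both addresses are in the same /24 network.
same_net24() {
  [ "$(net24 "$1")" = "$(net24 "$2")" ]
}
```

With the addresses above, `same_net24 192.168.1.36 192.168.11.10` fails, which matches the situation described. The sketch covers `/24` networks only; other prefix lengths need real mask arithmetic.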

### Available Tunnels

| Host | Internal IP | Tunnel URL | Status |
|------|-------------|------------|--------|
| ml110-01 | 192.168.11.10 | https://ml110-01.d-bis.org | ✅ Active |
| r630-01 | 192.168.11.11 | https://r630-01.d-bis.org | ✅ Active |
| r630-02 | 192.168.11.12 | https://r630-02.d-bis.org | ✅ Healthy |

## Solution 1: Use SSH Tunnel (Recommended for API)

```bash
# Start SSH tunnel
./setup_ssh_tunnel.sh

# In another terminal, use localhost
PROXMOX_HOST=localhost python3 list_vms.py

# Stop tunnel when done
./stop_ssh_tunnel.sh
```
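
The contents of `setup_ssh_tunnel.sh` are not shown here. As an illustration of the general technique only, a local port forward for the Proxmox API port (8006) could be built as below. The function name `tunnel_cmd` and the gateway `admin@gateway.example` are hypothetical, and the sketch assumes you have an SSH host that can reach `192.168.11.0/24`.

```bash
# Print an ssh command that forwards a local port to the Proxmox API.
# $1 = SSH gateway (user@host) that can reach the Proxmox network
# $2 = Proxmox host as seen from the gateway
# $3 = port to forward (defaults to 8006)
tunnel_cmd() {
  _port=${3:-8006}
  # -N: no remote command, forwarding only; -L: local port forward
  echo "ssh -N -L ${_port}:${2}:${_port} ${1}"
}
```

Running the printed command keeps `localhost:8006` connected to the Proxmox host for as long as the SSH session stays open, which is what lets `PROXMOX_HOST=localhost` work.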

## Solution 2: Access Web UI via Cloudflare Tunnel

Simply open in browser:
- https://ml110-01.d-bis.org (for ml110-01)
- https://r630-01.d-bis.org (for r630-01)
- https://r630-02.d-bis.org (for r630-02)

## Solution 3: Run Script from Proxmox Network

Copy scripts to a machine on `192.168.11.0/24` and run there.

## Solution 4: Use Shell Script via SSH

```bash
export PROXMOX_HOST=192.168.11.10
export PROXMOX_USER=root
./list_vms.sh
```

## Files Created

- `TUNNEL_ANALYSIS.md` - Complete tunnel analysis
- `list_vms_with_tunnels.py` - Enhanced script with tunnel awareness
- `setup_ssh_tunnel.sh` - SSH tunnel setup script
- `stop_ssh_tunnel.sh` - Stop SSH tunnel script
- `TUNNEL_SOLUTIONS.md` - This file
133
docs/09-troubleshooting/fix-ssh-key-issue.md
Normal file
@@ -0,0 +1,133 @@
# Fix SSH "Failed to Load Local Private Key" Error

**Issue:** "failed to load local private key" error when trying to connect

---

## Common Causes

1. **SSH config references a key that doesn't exist**
2. **Private key has wrong permissions**
3. **Corrupted or missing private key**
4. **SSH trying to use wrong key file**

---

## Quick Fixes

### Option 1: Use Password Authentication Only (Temporary)

Force SSH to use password authentication and skip keys:

```bash
ssh -o PreferredAuthentications=password -o PubkeyAuthentication=no root@192.168.11.14
```

Or with sshpass:

```bash
sshpass -p 'YOUR_PASSWORD' ssh -o PreferredAuthentications=password -o PubkeyAuthentication=no root@192.168.11.14
```

### Option 2: Check and Fix SSH Config

Check if there's a problematic SSH config entry:

```bash
cat ~/.ssh/config
```

If you see an entry for R630-04 or 192.168.11.14 with `IdentityFile` pointing to a missing key, either:
- Remove that entry
- Comment it out
- Create the missing key file
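
Finding such entries by eye is error-prone in a long config. A minimal sketch that lists every `IdentityFile` whose target does not exist; the helper name `missing_identity_files` is hypothetical, and the sketch assumes the plain `IdentityFile <path>` form without quotes, spaces in paths, or `%` tokens.

```bash
# Print each IdentityFile path in an ssh client config that is missing on disk.
# $1 = config file to inspect (defaults to ~/.ssh/config)
missing_identity_files() {
  awk 'tolower($1) == "identityfile" { print $2 }' "${1:-$HOME/.ssh/config}" |
  while IFS= read -r _key; do
    # Expand a leading "~/" the way ssh does.
    case $_key in
      "~/"*) _key="$HOME/${_key#"~/"}" ;;
    esac
    [ -e "$_key" ] || echo "$_key"
  done
}
```

Any path it prints is a candidate for the "failed to load local private key" error.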

### Option 3: Fix Key Permissions

If keys exist but have wrong permissions:

```bash
chmod 600 ~/.ssh/id_*
chmod 644 ~/.ssh/id_*.pub
chmod 700 ~/.ssh
```
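
To see which key actually needs fixing before changing anything, the mode can be inspected first. A minimal sketch, assuming GNU `stat` (as found on Linux); `check_key_perms` is a hypothetical helper name.

```bash
# Return 0 when a private key file has mode 600 or 400; otherwise report it.
# $1 = path to the private key
check_key_perms() {
  _mode=$(stat -c '%a' "$1") || return 1   # GNU stat: print octal mode
  case $_mode in
    600|400) return 0 ;;
    *) echo "$1 has mode $_mode; expected 600"; return 1 ;;
  esac
}
```

For example, `check_key_perms ~/.ssh/id_ed25519` prints a message only when the key is readable by group or others.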

### Option 4: Remove Problematic Key References

If a specific key is causing issues, you can:

```bash
# Check which keys SSH is trying to use
ssh -v root@192.168.11.14 2>&1 | grep -i "identity\|key"

# If a specific key is problematic, temporarily rename it
mv ~/.ssh/id_rsa ~/.ssh/id_rsa.backup 2>/dev/null
mv ~/.ssh/id_ed25519 ~/.ssh/id_ed25519.backup 2>/dev/null
```

### Option 5: Clear SSH Agent (if using)

```bash
ssh-add -D  # Remove all keys from agent
eval $(ssh-agent -k)  # Kill agent
```

---

## Recommended Solution

Since you have console access and just want to reset the password, use password-only authentication:

```bash
# From your local machine
sshpass -p 'YOUR_PASSWORD' ssh \
  -o PreferredAuthentications=password \
  -o PubkeyAuthentication=no \
  -o StrictHostKeyChecking=no \
  root@192.168.11.14
```

Or if you're already on console, just run commands directly without SSH.

---

## For Console Access

If you're already logged in via console, you don't need SSH at all. Just run the commands directly on R630-04:

```bash
# Reset password
passwd root

# Fix pveproxy
systemctl restart pveproxy

# Check status
systemctl status pveproxy
ss -tlnp | grep 8006
```

---

## After Fixing

Once password is reset and you can SSH in, you can:

1. **Set up SSH keys properly** (optional):
   ```bash
   ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519_r630-04 -N ""
   ssh-copy-id -i ~/.ssh/id_ed25519_r630-04.pub root@192.168.11.14
   ```

2. **Update SSH config** (optional):
   ```bash
   cat >> ~/.ssh/config << 'EOF'
   Host r630-04
       HostName 192.168.11.14
       User root
       IdentityFile ~/.ssh/id_ed25519_r630-04
   EOF
   ```

But for now, just use password authentication or console access.
179
docs/09-troubleshooting/ssh-r630-04-options.md
Normal file
@@ -0,0 +1,179 @@
# SSH Connection Options for R630-04

**IP:** 192.168.11.14
**User:** root
**Issue:** Permission denied with password authentication

---

## Possible Causes

1. **Password incorrect** - Double-check the password
2. **Password authentication disabled** - Server may require key-based auth
3. **Account locked** - Too many failed attempts
4. **SSH configuration** - Server may have restrictive settings
5. **Wrong user** - May need different username

---

## Troubleshooting Steps

### 1. Check SSH Authentication Methods

From another host that can connect to R630-04, check:

```bash
ssh -v root@192.168.11.14 2>&1 | grep -i "auth"
```

Look for:
- `publickey` - Key-based authentication enabled
- `password` - Password authentication enabled
- `keyboard-interactive` - Interactive password prompt
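
The verbose output is noisy, and the line that matters is the one where the server lists the methods it will accept. A minimal sketch that extracts it; `offered_auth_methods` is a hypothetical helper name, and it relies on the `debug1: Authentications that can continue:` line that OpenSSH clients print with `-v`.

```bash
# Read `ssh -v` debug output on stdin and print the first list of
# authentication methods advertised by the server.
offered_auth_methods() {
  sed -n 's/^debug1: Authentications that can continue: //p' | head -n 1
}
```

For example, `ssh -v -o BatchMode=yes root@192.168.11.14 true 2>&1 | offered_auth_methods` prints something like `publickey,password`. If `password` is absent from the list, the server will not accept a password no matter how it is typed.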

### 2. Try Different Authentication Methods

**Option A: Use SSH Key (if available)**

```bash
# Check for existing SSH keys
ls -la ~/.ssh/id_*

# Copy public key to R630-04 (if you have access from another host)
ssh-copy-id root@192.168.11.14
```

**Option B: Check if password has special characters**

The password contains `@`, which should work, but try:
- Typing it carefully
- Using copy-paste
- Checking for hidden characters

### 3. Connect from R630-03 (which works)

Since R630-03 works, you can:

```bash
# SSH to R630-03 first (enter the root password when prompted)
ssh root@192.168.11.13

# Then from R630-03, SSH to R630-04
ssh root@192.168.11.14
```

### 4. Check SSH Configuration on R630-04

If you have console access or another way to access R630-04:

```bash
# Check SSH configuration
grep -E "PasswordAuthentication|PubkeyAuthentication|PermitRootLogin" /etc/ssh/sshd_config

# For root password login to work, this should show:
# PasswordAuthentication yes (or the line commented out)
# PubkeyAuthentication yes
# PermitRootLogin yes (prohibit-password allows root to log in with keys only)
```
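
A plain `grep` also matches commented-out lines, which makes the output easy to misread. The sketch below prints only the value that takes effect within a single file, using the rule that the first occurrence of a keyword wins. The helper name `sshd_setting` is hypothetical, and `Include` files and `Match` blocks are deliberately ignored.

```bash
# Print the first uncommented value of a keyword in an sshd config file.
# $1 = keyword (case-insensitive), $2 = file (defaults to /etc/ssh/sshd_config)
sshd_setting() {
  awk -v key="$1" '
    /^[[:space:]]*#/ { next }                                   # skip comments
    tolower($1) == tolower(key) { print $2; found = 1; exit }   # first value wins
    END { if (!found) print "(not set; built-in default applies)" }
  ' "${2:-/etc/ssh/sshd_config}"
}
```

For example, `sshd_setting PermitRootLogin` prints `prohibit-password` if that is the active line. For the fully resolved configuration, `sshd -T` run as root on the server is the more reliable source.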

### 5. Reset Root Password (if you have console access)

If you have physical/console access:

```bash
# Boot into single user mode or recovery
# Then reset password:
passwd root
```

### 6. Check Account Status

```bash
# Check if root account is locked
passwd -S root

# Check failed login attempts
lastb | grep root | tail -20
```
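
The second field of the `passwd -S` output carries the account state. A minimal sketch that translates it, assuming the Debian-style status codes `L`, `NP`, and `P` (other distributions may print different codes); `account_state` is a hypothetical helper name.

```bash
# Interpret one line of `passwd -S` output.
# $1 = the status line, e.g. "root P 2024-01-01 0 99999 7 -1"
account_state() {
  set -- $1            # split the line into fields; $2 is the status code
  case $2 in
    L)  echo "locked" ;;
    NP) echo "no password" ;;
    P)  echo "usable password" ;;
    *)  echo "unknown" ;;
  esac
}
```

For example, `account_state "$(passwd -S root)"` prints `locked` when password login for root has been disabled.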

---

## Alternative Access Methods

### 1. Use Proxmox Console

If R630-04 is managed by another Proxmox host:

```bash
# From Proxmox host managing R630-04
pct enter <container-id>  # if it's a container
# or
qm terminal <vm-id>       # if it's a VM (requires a serial device configured on the VM)
```

### 2. Use iDRAC (Dell R630)

If it's a physical Dell R630 server:

- Access iDRAC interface (usually https://<idrac-ip>)
- Use remote console
- Reset password from console

### 3. Network Boot/KVM Access

If you have KVM over IP or network boot access, you can:
- Access console directly
- Reset password
- Check SSH configuration

---

## Quick Verification

Try these commands from R630-03 (which works):

```bash
# Log in to R630-03 first
ssh root@192.168.11.13
# After logging in, try:
ssh -v root@192.168.11.14 2>&1 | grep -E "auth|password|key"
```

---

## Recommended Next Steps

1. **Try connecting from R630-03** - Sometimes network path matters
2. **Verify password** - Try typing it again carefully
3. **Check if password was changed** - May have been changed since last login
4. **Use console access** - If available (iDRAC, KVM, etc.)
5. **Check SSH logs on R630-04** - `/var/log/auth.log` or `journalctl -u ssh`

---

## If Password Authentication is Disabled

If the server only accepts SSH keys:

1. **Generate SSH key pair** (on your local machine):
   ```bash
   ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519_r630-04
   ```

2. **Copy public key** (if you have another way to access):
   ```bash
   # Method 1: If you have access from R630-03
   # Copy the public key to R630-03 first; it was generated on your local machine
   scp ~/.ssh/id_ed25519_r630-04.pub root@192.168.11.13:/tmp/
   ssh root@192.168.11.13
   # -f installs the key without needing the matching private key on R630-03
   ssh-copy-id -f -i /tmp/id_ed25519_r630-04.pub root@192.168.11.14

   # Method 2: Manual copy (if you have console access)
   # Copy the public key content to:
   # /root/.ssh/authorized_keys on R630-04
   ```

3. **Connect with key**:
   ```bash
   ssh -i ~/.ssh/id_ed25519_r630-04 root@192.168.11.14
   ```