Files

354 lines
8.0 KiB
Markdown
Raw Permalink Normal View History

# Troubleshooting Guide
Common issues and solutions for Cloudflare Tunnel setup.
## Quick Diagnostics
Run the health check script first:
```bash
./scripts/check-tunnel-health.sh
```
## Common Issues
### 1. Tunnel Service Not Running
**Symptoms:**
- `systemctl status cloudflared-ml110` shows inactive
- DNS resolves but connection times out
**Solutions:**
```bash
# Check service status
systemctl status cloudflared-ml110
# Check logs for errors
journalctl -u cloudflared-ml110 -n 50
# Restart service
systemctl restart cloudflared-ml110
# Check if credentials file exists
ls -la /etc/cloudflared/tunnel-ml110.json
# Verify configuration
cat /etc/cloudflared/tunnel-ml110.yml
```
**Common Causes:**
- Missing credentials file
- Invalid tunnel ID in config
- Network connectivity issues
- Incorrect configuration syntax
### 2. DNS Not Resolving
**Symptoms:**
- `dig ml110-01.d-bis.org` returns no results
- Browser shows "DNS_PROBE_FINISHED_NXDOMAIN"
**Solutions:**
1. **Verify DNS record exists:**
- Go to Cloudflare Dashboard → DNS → Records
- Check CNAME record exists for `ml110-01`
- Verify proxy is enabled (orange cloud)
2. **Verify DNS record target:**
- Target should be: `<tunnel-id>.cfargotunnel.com`
- Not an IP address
3. **Wait for DNS propagation:**
- DNS changes can take up to 5 minutes
- Clear DNS cache: `sudo systemd-resolve --flush-caches`
4. **Test DNS resolution:**
```bash
dig ml110-01.d-bis.org
nslookup ml110-01.d-bis.org
```
### 3. Connection Timeout
**Symptoms:**
- DNS resolves correctly
- Connection times out when accessing URL
- Browser shows "ERR_CONNECTION_TIMED_OUT"
**Solutions:**
1. **Check tunnel is running:**
```bash
systemctl status cloudflared-ml110
```
2. **Check tunnel logs:**
```bash
journalctl -u cloudflared-ml110 -f
```
3. **Verify internal connectivity:**
```bash
# From VMID 102, test connection to Proxmox host
curl -k https://192.168.11.10:8006
```
4. **Check tunnel configuration:**
- Verify service URL is correct: `https://192.168.11.10:8006`
- Check tunnel ID matches Cloudflare dashboard
5. **Verify Cloudflare tunnel status:**
- Go to Cloudflare Zero Trust → Networks → Tunnels
- Check tunnel shows "Healthy" status
### 4. SSL Certificate Errors
**Symptoms:**
- Browser shows SSL certificate warnings
- "NET::ERR_CERT_AUTHORITY_INVALID"
**Solutions:**
1. **This is expected** - Proxmox uses self-signed certificates
2. **Cloudflare handles SSL termination** at the edge
3. **Verify tunnel config has:**
```yaml
originRequest:
noTLSVerify: true
```
4. **If issues persist, try HTTP instead:**
```yaml
service: http://192.168.11.10:8006
```
(Note: Less secure, but may work if HTTPS issues persist)
### 5. Cloudflare Access Not Showing
**Symptoms:**
- Direct access to Proxmox UI
- No Cloudflare Access login page
**Solutions:**
1. **Verify DNS proxy is enabled:**
- Cloudflare Dashboard → DNS → Records
- Ensure orange cloud is ON (proxied)
2. **Verify Access application exists:**
- Zero Trust → Access → Applications
- Check application for `ml110-01.d-bis.org` exists
3. **Verify Access policy:**
- Check policy is configured correctly
- Ensure policy is active
4. **Check SSL/TLS settings:**
- Cloudflare Dashboard → SSL/TLS
- Set to "Full" or "Full (strict)"
5. **Clear browser cache:**
- Hard refresh: Ctrl+Shift+R (or Cmd+Shift+R on Mac)
### 6. MFA Not Working
**Symptoms:**
- Login works but MFA prompt doesn't appear
- Can't complete authentication
**Solutions:**
1. **Verify MFA is enabled in policy:**
- Zero Trust → Access → Applications
- Edit application → Edit policy
- Check "Require" includes MFA
2. **Check identity provider settings:**
- Verify MFA is configured in your identity provider
- Test MFA with another application
3. **Check user MFA status:**
- Zero Trust → My Team → Users
- Verify user has MFA enabled
### 7. Service Fails to Start
**Symptoms:**
- `systemctl start cloudflared-ml110` fails
- Service immediately stops after starting
**Solutions:**
1. **Check service logs:**
```bash
journalctl -u cloudflared-ml110 -n 100
```
2. **Verify credentials file:**
```bash
ls -la /etc/cloudflared/tunnel-ml110.json
cat /etc/cloudflared/tunnel-ml110.json
```
3. **Verify configuration syntax:**
```bash
cloudflared tunnel --config /etc/cloudflared/tunnel-ml110.yml validate
```
4. **Check file permissions:**
```bash
chmod 600 /etc/cloudflared/tunnel-ml110.json
chmod 644 /etc/cloudflared/tunnel-ml110.yml
```
5. **Test tunnel manually:**
```bash
cloudflared tunnel --config /etc/cloudflared/tunnel-ml110.yml run
```
### 8. Multiple Tunnels Conflict
**Symptoms:**
- Only one tunnel works
- Other tunnels fail to start
**Solutions:**
1. **Check for port conflicts:**
- Each tunnel uses different metrics port
- Verify ports 9091, 9092, 9093 are not in use
2. **Verify separate credentials:**
- Each tunnel needs its own credentials file
- Verify files are separate: `tunnel-ml110.json`, `tunnel-r630-01.json`, etc.
3. **Check systemd services:**
```bash
systemctl status cloudflared-*
```
4. **Verify separate config files:**
- Each tunnel has its own config file
- Tunnel IDs are different in each config
## Advanced Troubleshooting
### Check Tunnel Connectivity
```bash
# From VMID 102, test each Proxmox host
curl -k -I https://192.168.11.10:8006
curl -k -I https://192.168.11.11:8006
curl -k -I https://192.168.11.12:8006
```
### Verify Tunnel Configuration
```bash
# Validate config file
cloudflared tunnel --config /etc/cloudflared/tunnel-ml110.yml validate
# List tunnels
cloudflared tunnel list
# Get tunnel info
cloudflared tunnel info <tunnel-id>
```
### Check Network Connectivity
```bash
# Test DNS resolution
dig ml110-01.d-bis.org
# Test HTTPS connectivity
curl -v https://ml110-01.d-bis.org
# Check Cloudflare IPs
nslookup ml110-01.d-bis.org
```
### View Real-time Logs
```bash
# Follow logs for all tunnels
journalctl -u cloudflared-ml110 -f
journalctl -u cloudflared-r630-01 -f
journalctl -u cloudflared-r630-02 -f
```
## Diagnostic Commands
### Complete Health Check
```bash
# Run comprehensive health check
./scripts/check-tunnel-health.sh
# Check all services
systemctl status cloudflared-ml110 cloudflared-r630-01 cloudflared-r630-02
# Check tunnel connectivity
for tunnel in ml110 r630-01 r630-02; do
echo "Checking $tunnel..."
systemctl is-active cloudflared-$tunnel && echo "✓ Running" || echo "✗ Not running"
done
```
### Reset Tunnel
If a tunnel is completely broken:
```bash
# Stop service
systemctl stop cloudflared-ml110
# Remove old credentials (backup first!)
cp /etc/cloudflared/tunnel-ml110.json /etc/cloudflared/tunnel-ml110.json.backup
# Re-authenticate (if needed)
cloudflared tunnel login
# Re-create tunnel (if needed)
cloudflared tunnel create tunnel-ml110
# Start service
systemctl start cloudflared-ml110
```
## Getting Help
If issues persist:
1. **Collect diagnostic information:**
```bash
./scripts/check-tunnel-health.sh > /tmp/tunnel-health.txt
journalctl -u cloudflared-* > /tmp/tunnel-logs.txt
```
2. **Check Cloudflare dashboard:**
- Zero Trust → Networks → Tunnels → Status
- Access → Logs → Recent authentication attempts
3. **Review documentation:**
- [Cloudflare Tunnel Docs](https://developers.cloudflare.com/cloudflare-one/connections/connect-apps/)
- [Cloudflare Access Docs](https://developers.cloudflare.com/cloudflare-one/policies/access/)
4. **Common issues:**
- Invalid tunnel token
- Network connectivity issues
- Incorrect configuration
- DNS propagation delays
## Prevention
To avoid issues:
1.**Regular health checks** - Run `check-tunnel-health.sh` daily
2.**Monitor logs** - Set up log monitoring
3.**Backup configs** - Keep backups of config files
4.**Test after changes** - Test after any configuration changes
5.**Document changes** - Keep track of what was changed