# E2E RPC Failures — Edge (UDM Pro) Limitation
**Last Updated:** 2026-02-05
**Status:** Active
**See also:** [E2E_CLOUDFLARE_DOMAINS_RUNBOOK.md](E2E_CLOUDFLARE_DOMAINS_RUNBOOK.md), [CLOUDFLARE_ROUTING_MASTER.md](CLOUDFLARE_ROUTING_MASTER.md)
---
## What you see
- **E2E verification:** 25 DNS pass, 14 HTTPS pass, **6 failed** (all RPC HTTP: `rpc-http-pub.d-bis.org`, `rpc.d-bis.org`, `rpc2.d-bis.org`, `rpc-http-prv.d-bis.org`, `rpc.public-0138.defi-oracle.io`, `rpc.defi-oracle.io`).
- **RPC response:** `405 Method Not Allowed` when calling any of those hostnames with POST from the internet. Body: `{"error":{"code":405,"message":"Method Not Allowed"}}`.
---
## Troubleshooting the six failures
1. **Confirm the failure**
Run the dedicated RPC troubleshooting script (same path as E2E — public FQDN):
```bash
bash scripts/verify/troubleshoot-rpc-failures.sh
```
You should see each of the 6 domains return **HTTP 405** and a short error body. The script does not change any config.
2. **Capture HTTP status in E2E**
After the latest E2E script update, RPC failures are reported with the actual HTTP code, e.g. `RPC: rpc.d-bis.org failed (HTTP 405)`. The evidence directory also contains `*_rpc_response.txt` files with the full response body (e.g. the 405 JSON).
3. **Verify backend (optional, from LAN)**
From a host on the same LAN as NPMplus (192.168.11.167), run:
```bash
bash scripts/verify/troubleshoot-rpc-failures.sh --lan
```
If NPMplus and the RPC backends are correct, the `--lan` test should show **HTTP 200** and a JSON-RPC `result` (chainId). That confirms the failure is at the edge (public path), not NPMplus or the nodes.
4. **Fix options**
See [How to get full E2E pass (including RPC)](#how-to-get-full-e2e-pass-including-rpc) below: **Option A** (allow POST on the UDM Pro) or **Option B** (Cloudflare Tunnel for RPC).
5. **UniFi API and POST filtering**
The official UniFi Network API (firewall zones, ACL rules, DPI) operates at L3/L4 only and does **not** expose any setting for HTTP method (GET vs POST), so whatever produces the 405 cannot be inspected or changed through the API. To inspect what the API does expose, run:
```bash
./scripts/unifi/query-firewall-and-dpi-api.sh
```
Report and JSON are written to `docs/04-configuration/verification-evidence/unifi-api-firewall-query/`.
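
To reproduce the check in step 1 by hand, the probe can be sketched with plain `curl`. This is a minimal sketch and not the contents of `troubleshoot-rpc-failures.sh`: the function names `probe_public` and `classify_status` are hypothetical, and the `eth_chainId` request body is an assumption based on the chainId result mentioned in step 3.

```bash
#!/usr/bin/env bash
# Hypothetical manual probe; not the repo's troubleshoot-rpc-failures.sh.

# Assumed JSON-RPC request body (eth_chainId).
RPC_BODY='{"jsonrpc":"2.0","method":"eth_chainId","params":[],"id":1}'

# Print the HTTP status code returned for a POST to https://<host>/.
probe_public() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 \
    -X POST -H 'Content-Type: application/json' \
    --data "$RPC_BODY" "https://$1/"
}

# Map a status code to a short diagnosis.
classify_status() {
  case "$1" in
    200) echo "ok" ;;
    405) echo "POST rejected (matches the edge limitation)" ;;
    000) echo "no HTTP response (DNS, TLS, or timeout)" ;;
    *)   echo "unexpected status" ;;
  esac
}

# Probe every hostname given on the command line; does nothing without arguments.
for host in "$@"; do
  code="$(probe_public "$host")"
  printf '%s: HTTP %s, %s\n' "$host" "$code" "$(classify_status "$code")"
done
```

Saved under a name of your choice, it could be run as `bash probe.sh rpc.d-bis.org rpc2.d-bis.org`. Like the repo script, it only reads and changes no config.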
---
## Cause
- **NPMplus** is correctly configured (Wave 0 run; `block_exploits: false` for RPC hosts). From a host on the LAN, `curl -X POST https://192.168.11.167/ -H "Host: rpc.d-bis.org" ...` returns **200** and valid JSON-RPC.
- Traffic that goes **via the public IP** (76.53.10.36) hits **UDM Pro** first. The edge returns **405** for POST to those hostnames, so the 6 E2E RPC checks fail when using the direct/Fastly path.
So the limitation is at the **edge** (UDM Pro or port-forward), not NPMplus or the RPC backends.
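
The reasoning above amounts to comparing two status codes: one from the LAN test and one from the public test. A minimal sketch of that decision follows; `locate_fault` is a hypothetical helper for illustration and is not part of the repo scripts.

```bash
# Hypothetical helper: decide where the fault is from two HTTP status codes.
#   $1 = status from the LAN test (direct to NPMplus)
#   $2 = status from the public test (via the public IP)
locate_fault() {
  local lan="$1" public="$2"
  if [ "$lan" != "200" ]; then
    # The backend path itself is broken, so the edge is not yet implicated.
    echo "NPMplus or RPC backend (LAN test returned $lan)"
  elif [ "$public" = "200" ]; then
    echo "no fault (both paths return 200)"
  else
    # LAN works but the public path does not: the edge is the difference.
    echo "edge (LAN 200, public $public)"
  fi
}
```

With the codes described in this document, `locate_fault 200 405` prints `edge (LAN 200, public 405)`.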
---
## How to get full E2E pass (including RPC)
Choose one:
### Option A: UDM Pro allows POST
- In UDM Pro firewall/port-forward rules for 76.53.10.36:443 → 192.168.11.167:443, ensure there is **no** rule that restricts or blocks POST (e.g. “allow only GET”).
- If the device does not expose per-method settings, you may need a firmware update or to use Option B.
### Option B: Use Cloudflare Tunnel for RPC (bypass edge)
Follow the **Option B runbook** for step-by-step instructions and the DNS script:
- **[OPTION_B_RPC_VIA_TUNNEL_RUNBOOK.md](OPTION_B_RPC_VIA_TUNNEL_RUNBOOK.md)** — Tunnel ingress checklist, DNS switch (script or manual), and verification.
**Short version:**
1. **Fix Cloudflare Tunnel 502s** so the tunnel reaches NPMplus:
- Follow [CLOUDFLARE_TUNNEL_502_FIX_RUNBOOK.md](../04-configuration/cloudflare/CLOUDFLARE_TUNNEL_502_FIX_RUNBOOK.md): point all Public Hostnames (including the 6 RPC hostnames) to `http://192.168.11.167:80`, verify from VMID 102, and restart cloudflared.
2. **Point RPC hostnames to the tunnel** in Cloudflare DNS:
- Run: `./scripts/set-rpc-dns-to-tunnel.sh` (uses `CLOUDFLARE_TUNNEL_ID` and zone IDs from `.env`), or set CNAME manually per the runbook.
3. **Re-run E2E:** After DNS propagates, run `bash scripts/verify/troubleshoot-rpc-failures.sh` and `./scripts/verify/verify-end-to-end-routing.sh --profile=public`. With RPC traffic going through the tunnel instead of the UDM Pro, POST should succeed and the 6 RPC checks should pass.
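
For the manual variant of step 2, Cloudflare Tunnel hostnames are normally routed with a CNAME to `<tunnel-id>.cfargotunnel.com`. The sketch below only prints the records to create and makes no API calls. `tunnel_cname_record` is a hypothetical helper, not what `set-rpc-dns-to-tunnel.sh` does internally, and the Option B runbook remains the authority on the exact record settings.

```bash
# Hypothetical helper: print the CNAME record that routes a hostname
# through a Cloudflare Tunnel with the given tunnel ID.
tunnel_cname_record() {
  local host="$1" tunnel_id="$2"
  printf '%s CNAME %s.cfargotunnel.com\n' "$host" "$tunnel_id"
}

# Print one record per hostname given on the command line.
# Assumes CLOUDFLARE_TUNNEL_ID is exported (e.g. loaded from .env).
for host in "$@"; do
  tunnel_cname_record "$host" "${CLOUDFLARE_TUNNEL_ID:?set CLOUDFLARE_TUNNEL_ID first}"
done
```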
---
## Treating current E2E as “success” (DNS + HTTPS only)
When the only failures are the 6 RPC checks (edge blocking POST), you can still treat E2E as successful for DNS and HTTPS:
```bash
E2E_SUCCESS_IF_ONLY_RPC_BLOCKED=1 ./scripts/verify/verify-end-to-end-routing.sh --profile=public
```
- Exit code is **0** when DNS and HTTPS all pass and all failures are RPC.
- Use this in CI or scripts when you accept “RPC blocked by edge” until Option A or B is done.
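
In a CI job, the override could be wrapped so the log states which mode was used. This is a minimal sketch: `e2e_gate` is a hypothetical wrapper, and it relies only on the exit-code behaviour described above.

```bash
# Hypothetical CI wrapper: run the given E2E command with the override set
# and report the outcome. The command to run is passed as arguments.
e2e_gate() {
  local rc=0
  # The variable is set only for this one command invocation.
  E2E_SUCCESS_IF_ONLY_RPC_BLOCKED=1 "$@" || rc=$?
  if [ "$rc" -eq 0 ]; then
    echo "E2E gate passed: DNS and HTTPS OK (RPC failures, if any, tolerated)"
  else
    echo "E2E gate failed with exit code $rc"
  fi
  return "$rc"
}
```

Usage: `e2e_gate ./scripts/verify/verify-end-to-end-routing.sh --profile=public`. The wrapper returns the script's own exit code, so the CI step still fails on any DNS or HTTPS failure.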
---
## Summary
| Goal | Action |
|------|--------|
| **Full E2E pass (including RPC)** | Fix the edge: allow POST on the UDM Pro (Option A) or use the Tunnel for RPC (Option B). |
| **Success for DNS + HTTPS only** | Run with `E2E_SUCCESS_IF_ONLY_RPC_BLOCKED=1`. |