# Proxmox Cluster Setup Guide
**Last Updated**: 2024-12-19
## Overview
This guide explains how to create a Proxmox cluster between ML110-01 and R630-01.
## Prerequisites
- ✅ Both instances on same network (192.168.11.0/24) - **Met**
- ✅ Network connectivity between instances - **Confirmed**
- ✅ API access to both instances - **Working**
- ⚠️ SSH access to nodes (for corosync configuration)
- ⚠️ Firewall rules for clustering ports (5404-5405)
## Cluster Configuration
- **Cluster Name**: sankofa-cluster
- **Node 1**: ML110-01 (192.168.11.10)
- **Node 2**: R630-01 (192.168.11.11)
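`pvecm` and corosync expect each node's hostname to resolve to its cluster address. If DNS does not already cover this, a minimal `/etc/hosts` sketch for both nodes (hostnames and IPs taken from this guide; adjust to your environment):

```
# /etc/pve/... is managed by the cluster; these go in /etc/hosts on each node
192.168.11.10  ml110-01.sankofa.nexus  ml110-01
192.168.11.11  r630-01.sankofa.nexus   r630-01
```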
## Method 1: Using Proxmox Web UI (Recommended)
### Step 1: Create Cluster on First Node
1. Log in to ML110-01 web UI: https://ml110-01.sankofa.nexus:8006
2. Go to: **Datacenter** → **Cluster**
3. Click **Create Cluster**
4. Enter cluster name: `sankofa-cluster`
5. Click **Create**
### Step 2: Add Second Node
1. Log in to R630-01 web UI: https://r630-01.sankofa.nexus:8006
2. Go to: **Datacenter** → **Cluster**
3. Click **Join Cluster**
4. Enter:
   - **Cluster Name**: `sankofa-cluster`
   - **Node IP**: `192.168.11.10` (ML110-01)
   - **Root Password**: (for ML110-01)
5. Click **Join**
### Step 3: Verify Cluster
On either node:
- Go to **Datacenter** → **Cluster**
- You should see both nodes listed
- Both nodes should show status "Online"
## Method 2: Using SSH and pvecm (Command Line)
### Step 1: Create Cluster on First Node
SSH into ML110-01:
```bash
ssh root@192.168.11.10
# Create cluster
pvecm create sankofa-cluster
# Verify
pvecm status
```
### Step 2: Add Second Node
SSH into R630-01:
```bash
ssh root@192.168.11.11
# Join cluster
pvecm add 192.168.11.10
# Verify
pvecm status
pvecm nodes
```
### Step 3: Handle Quorum (2-Node Cluster)
With two nodes, expected votes is already 2, so quorum requires both nodes to be online. If one node goes down, the survivor loses quorum and `/etc/pve` becomes read-only; you can temporarily lower the expected votes to keep operating:
```bash
# On the surviving node only (temporary; revert when the peer returns)
pvecm expected 1
pvecm status
```
For unattended resilience, add a third vote with an external QDevice (`pvecm qdevice setup <address>`, which requires a third host running corosync-qnetd).
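To see why a 2-node cluster is fragile, the majority rule corosync applies by default can be sketched as plain arithmetic (this is standard quorum math for illustration, not a Proxmox command):

```shell
# Majority quorum: a partition is quorate with floor(N/2) + 1 of N votes
nodes=2
needed=$(( nodes / 2 + 1 ))
echo "votes needed for quorum with $nodes nodes: $needed"
# → votes needed for quorum with 2 nodes: 2
```

With 2 nodes, 2 votes are needed, so losing either node drops the cluster below quorum; this is why the temporary `pvecm expected 1` workaround (or a QDevice) is needed.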
## Method 3: Using API (Limited)
The Proxmox API has limited cluster management capabilities. For full cluster creation, use Web UI or SSH.
### Check Cluster Status via API
```bash
source .env
# Check nodes in cluster
curl -k -H "Authorization: PVEAPIToken=${PROXMOX_TOKEN_ML110_01}" \
https://192.168.11.10:8006/api2/json/cluster/config/nodes
# Check cluster status
curl -k -H "Authorization: PVEAPIToken=${PROXMOX_TOKEN_ML110_01}" \
https://192.168.11.10:8006/api2/json/cluster/status
```
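The `/cluster/status` endpoint returns a JSON array mixing a `cluster` entry with per-node entries. A sketch of pulling the node names out with standard tools (the JSON below is a hand-written sample matching the API's response shape for this guide's cluster; with a live cluster, pipe the `curl` output instead):

```shell
# Sample response body; replace with: curl ... /api2/json/cluster/status
json='{"data":[{"type":"node","name":"ml110-01","online":1},{"type":"node","name":"r630-01","online":1}]}'
# Extract the "name" values without needing jq
printf '%s\n' "$json" | grep -o '"name":"[^"]*"' | cut -d'"' -f4
# → ml110-01
# → r630-01
```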
## Firewall Configuration
Ensure these ports are open between nodes:
- **8006**: Proxmox API (HTTPS)
- **5404-5405**: Corosync (cluster communication)
- **22**: SSH (for cluster operations)
- **3128**: Spice proxy (optional)
### Configure Firewall on Proxmox
Proxmox detects the local cluster network automatically and allows corosync traffic within it. On each node, verify the detected network and reload after any rule changes:
```bash
# Show the local network range the firewall treats as cluster-internal
pve-firewall localnet
# Apply rule changes and confirm the firewall state
pve-firewall restart
pve-firewall status
```
Additional rules can be managed under **Datacenter** → **Firewall** in the web UI.
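If the datacenter firewall is enabled, the clustering ports listed above can also be opened explicitly. A hypothetical `/etc/pve/firewall/cluster.fw` fragment (rule syntax follows the Proxmox firewall format; sources and policy should be adjusted to your network):

```
[OPTIONS]
enable: 1

[RULES]
IN ACCEPT -source 192.168.11.0/24 -p udp -dport 5404:5405 # corosync
IN ACCEPT -source 192.168.11.0/24 -p tcp -dport 8006      # web UI / API
IN ACCEPT -source 192.168.11.0/24 -p tcp -dport 22        # SSH
```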
## Verification
### Check Cluster Status
```bash
# Via API
curl -k -H "Authorization: PVEAPIToken=${PROXMOX_TOKEN_ML110_01}" \
https://192.168.11.10:8006/api2/json/cluster/status
# Via SSH (on node)
pvecm status
pvecm nodes
```
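For scripting, the quorate state can be checked by grepping the `pvecm status` output. A sketch using a hard-coded sample of that output so the parsing can be shown offline (in practice, pipe the real command: `pvecm status | grep -q 'Quorate:.*Yes'`):

```shell
# Sample of the relevant `pvecm status` lines
sample='Quorum information
------------------
Nodes:            2
Quorate:          Yes'
if printf '%s\n' "$sample" | grep -q 'Quorate:[[:space:]]*Yes'; then
  echo "cluster is quorate"
else
  echo "cluster NOT quorate"
fi
# → cluster is quorate
```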
### Test Cluster Operations
1. Create a VM on ML110-01
2. Verify it appears in cluster view
3. Try migrating VM between nodes
4. Verify storage is accessible from both nodes
## Troubleshooting
### Nodes Can't Join Cluster
1. **Check network connectivity**:
```bash
ping <other-node-ip>
```
2. **Check firewall**:
```bash
iptables -L -n | grep <other-node-ip>
```
3. **Check corosync**:
```bash
systemctl status corosync
corosync-cmapctl | grep members
```
### Quorum Issues
For 2-node cluster:
```bash
# Set expected votes
pvecm expected 2
# Check quorum
pvecm status
```
### Cluster Split-Brain
If the cluster partitions, restore quorum on one side only:
```bash
# On the node that should remain authoritative
pvecm expected 1
pvecm status
```
Never lower expected votes on both partitions at once; each side would then accept writes independently, which is the split-brain condition itself. When connectivity returns, corosync merges the membership and expected votes revert to the configured value.
## Post-Cluster Setup
After cluster is created:
1. **Verify both nodes visible**:
- Check Datacenter → Cluster in web UI
- Both nodes should be listed
2. **Configure shared storage** (if needed):
- Set up NFS, Ceph, or other shared storage
- Add storage to cluster
3. **Test VM operations**:
- Create VM on one node
- Verify it's visible on both nodes
- Test migration
4. **Update Crossplane ProviderConfig**:
- Cluster name can be used in provider config
- VMs can be created on cluster level
## Related Documentation
- [Inter-Instance Connectivity](./INTER_INSTANCE_CONNECTIVITY.md)
- [Deployment Guide](./DEPLOYMENT_GUIDE.md)
- [Task List](./TASK_LIST.md)