# Proxmox Cluster Setup Guide

**Last Updated**: 2024-12-19

## Overview

This guide explains how to create a Proxmox cluster between ML110-01 and R630-01.

## Prerequisites

- ✅ Both instances on the same network (192.168.11.0/24) - **Met**
- ✅ Network connectivity between instances - **Confirmed**
- ✅ API access to both instances - **Working**
- ⚠️ SSH access to nodes (for corosync configuration)
- ⚠️ Firewall rules for clustering ports (5404-5405)

## Cluster Configuration

- **Cluster Name**: sankofa-cluster
- **Node 1**: ML110-01 (192.168.11.10)
- **Node 2**: R630-01 (192.168.11.11)

## Method 1: Using Proxmox Web UI (Recommended)

### Step 1: Create Cluster on First Node

1. Log in to the ML110-01 web UI: https://ml110-01.sankofa.nexus:8006
2. Go to **Datacenter** → **Cluster**
3. Click **Create Cluster**
4. Enter the cluster name: `sankofa-cluster`
5. Click **Create**

### Step 2: Add Second Node

1. Log in to the R630-01 web UI: https://r630-01.sankofa.nexus:8006
2. Go to **Datacenter** → **Cluster**
3. Click **Join Cluster**
4. Enter the join details (or paste the **Join Information** string copied from ML110-01's **Datacenter** → **Cluster** → **Join Information**):
   - **Peer Address**: `192.168.11.10` (ML110-01)
   - **Password**: root password for ML110-01
5. Click **Join**

### Step 3: Verify Cluster

On either node:

- Go to **Datacenter** → **Cluster**
- You should see both nodes listed
- Both nodes should show status "Online"

## Method 2: Using SSH and pvecm (Command Line)

### Step 1: Create Cluster on First Node

SSH into ML110-01:

```bash
ssh root@192.168.11.10

# Create the cluster
pvecm create sankofa-cluster

# Verify
pvecm status
```

### Step 2: Add Second Node

SSH into R630-01:

```bash
ssh root@192.168.11.11

# Join the cluster
pvecm add 192.168.11.10

# Verify
pvecm status
pvecm nodes
```

### Step 3: Configure Quorum (2-Node Cluster)

A two-node cluster has exactly two votes, so it loses quorum whenever either node is offline. For production use, add a third vote with a QDevice (`pvecm qdevice setup <qdevice-ip>`). As a temporary workaround while one node is down, lower the expected vote count on the surviving node:

```bash
# On the surviving node only (temporary)
pvecm expected 1

# Check quorum state
pvecm status
```

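The vote arithmetic behind quorum can be sketched quickly. This is a generic illustration of the majority rule, not a Proxmox command:

```shell
# Quorum requires a strict majority: floor(N/2) + 1 of N votes.
quorum_needed() { echo $(( $1 / 2 + 1 )); }

echo "2 nodes -> $(quorum_needed 2) votes needed"  # both nodes must be up
echo "3 nodes -> $(quorum_needed 3) votes needed"  # tolerates one failure
```

This is why a two-node cluster cannot tolerate any failure on its own, while three votes (two nodes plus a QDevice) tolerate one.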
## Method 3: Using API (Limited)

The Proxmox API has limited cluster-management capabilities. For full cluster creation, use the web UI or SSH.

### Check Cluster Status via API

```bash
source .env

# Check nodes in the cluster
curl -k -H "Authorization: PVEAPIToken=${PROXMOX_TOKEN_ML110_01}" \
  https://192.168.11.10:8006/api2/json/cluster/config/nodes

# Check cluster status
curl -k -H "Authorization: PVEAPIToken=${PROXMOX_TOKEN_ML110_01}" \
  https://192.168.11.10:8006/api2/json/cluster/status
```

Note the header format is `PVEAPIToken=USER@REALM!TOKENID=UUID`, so the variable is expected to hold the full `user@realm!tokenid=secret` string.

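If you want to script against the status endpoint, the response can be checked without extra tooling. The JSON below is a trimmed, hypothetical sample of the `data` array, not captured output:

```shell
# Hypothetical trimmed /cluster/status response, for illustration only.
response='{"data":[{"type":"node","name":"ml110-01","online":1},{"type":"node","name":"r630-01","online":1}]}'

# Crude grep-based count of nodes reporting online (jq is cleaner if installed)
online=$(grep -o '"online":1' <<<"$response" | wc -l)
echo "nodes online: $online"
```

A result below the node count you expect means at least one node is down or not yet joined.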
## Firewall Configuration

Ensure these ports are open between nodes:

- **8006/tcp**: Proxmox web UI and API (HTTPS)
- **5404-5405/udp**: Corosync (cluster communication)
- **22/tcp**: SSH (for cluster operations)
- **3128/tcp**: SPICE proxy (optional)

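Reachability of the TCP ports can be spot-checked from either node with bash's built-in `/dev/tcp`. A minimal sketch, assuming this guide's node IPs; note that corosync's 5404-5405 are UDP and cannot be probed this way:

```shell
# Probe a TCP port using bash's /dev/tcp (no nc required).
probe() {
  if timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null; then
    echo "$1:$2 open"
  else
    echo "$1:$2 blocked or closed"
  fi
}

# Peer IP from this guide; adjust for your environment.
PEER=192.168.11.11
for port in 22 8006 3128; do
  probe "$PEER" "$port"
done
```

A "blocked or closed" result for 22 or 8006 points at a firewall rule or a stopped service, and should be fixed before attempting to join the cluster.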
### Configure Firewall on Proxmox

The Proxmox firewall automatically allows intra-cluster traffic (including corosync) on the detected local network. Verify what was detected on each node:

```bash
# Show the auto-detected local network used for cluster traffic
pve-firewall localnet

# Check that the firewall service is running
pve-firewall status
```

If the detected network is wrong, override the `local_network` alias in `/etc/pve/firewall/cluster.fw`.

## Verification

### Check Cluster Status

```bash
# Via API
curl -k -H "Authorization: PVEAPIToken=${PROXMOX_TOKEN_ML110_01}" \
  https://192.168.11.10:8006/api2/json/cluster/status

# Via SSH (on a node)
pvecm status
pvecm nodes
```

### Test Cluster Operations

1. Create a VM on ML110-01
2. Verify it appears in the cluster view
3. Try migrating the VM between nodes
4. Verify storage is accessible from both nodes

## Troubleshooting

### Nodes Can't Join Cluster

1. **Check network connectivity**:

   ```bash
   ping <other-node-ip>
   ```

2. **Check firewall**:

   ```bash
   iptables -L -n | grep <other-node-ip>
   ```

3. **Check corosync**:

   ```bash
   systemctl status corosync
   corosync-cmapctl | grep members
   ```

### Quorum Issues

If a two-node cluster loses quorum because one node is down, lower the expected votes on the surviving node:

```bash
# Temporarily allow the surviving node to operate alone
pvecm expected 1

# Check quorum
pvecm status
```

### Cluster Split-Brain

If the two nodes lose contact with each other, both sides lose quorum and block changes. Recover on one side only (never on both, or their configurations will diverge):

```bash
# On the node that should keep running
pvecm expected 1
```

Once connectivity is restored, corosync re-forms the membership and quorum returns automatically.

## Post-Cluster Setup

After the cluster is created:

1. **Verify both nodes are visible**:
   - Check **Datacenter** → **Cluster** in the web UI
   - Both nodes should be listed

2. **Configure shared storage** (if needed):
   - Set up NFS, Ceph, or other shared storage
   - Add the storage to the cluster

3. **Test VM operations**:
   - Create a VM on one node
   - Verify it is visible on both nodes
   - Test migration

4. **Update Crossplane ProviderConfig**:
   - The cluster name can be used in the provider config
   - VMs can be created at the cluster level

## Related Documentation

- [Inter-Instance Connectivity](./INTER_INSTANCE_CONNECTIVITY.md)
- [Deployment Guide](./DEPLOYMENT_GUIDE.md)
- [Task List](./TASK_LIST.md)
|