# Ceph Installation Guide for Proxmox
**Last Updated**: 2024-12-19
**Infrastructure**: 2-node Proxmox cluster (ML110-01, R630-01)
## Overview
Ceph is a distributed storage system that provides object, block, and file storage. This guide covers installing Ceph on the Proxmox infrastructure to provide distributed storage for VMs.
## Architecture
### Cluster Configuration
**Nodes**:
- **ML110-01** (192.168.11.10): Ceph Monitor, OSD, Manager
- **R630-01** (192.168.11.11): Ceph Monitor, OSD, Manager
**Network**: 192.168.11.0/24
### Ceph Components
1. **Monitors (MON)**: Track cluster state (minimum 1, recommended 3+)
2. **Managers (MGR)**: Provide monitoring and management interfaces
3. **OSDs (Object Storage Daemons)**: Store data on disks
4. **MDS (Metadata Servers)**: For CephFS (optional)
### Storage Configuration
**For 2-node setup**:
- Reduced redundancy (size=2, min_size=1): only two copies of each object exist, and with min_size=1 the cluster keeps serving I/O from a single surviving copy, so a second failure means data loss
- With only two monitors, quorum requires both to be up; losing either monitor halts the cluster
- Suitable for development/testing
- For production, add a third node or use external storage
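
If a pool was already created with the defaults, the reduced-redundancy settings can also be applied per pool after the fact; a minimal sketch (the pool name `rbd` is an assumption, substitute your own):

```bash
# Inspect current replication for a pool
ceph osd pool get rbd size
ceph osd pool get rbd min_size

# Apply the 2-node settings described above
ceph osd pool set rbd size 2
ceph osd pool set rbd min_size 1
```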
## Prerequisites
### Hardware Requirements
**Per Node**:
- CPU: 4+ cores recommended
- RAM: 4GB+ for Ceph services
- Storage: Dedicated disks/partitions for OSDs
- Network: 1Gbps+ (10Gbps recommended)
### Software Requirements
- Proxmox VE 9.1+
- SSH access to all nodes
- Root or sudo access
- Network connectivity between nodes
## Installation Steps
### Step 1: Prepare Nodes
```bash
# On both nodes, update system
apt update && apt upgrade -y
# Install prerequisites
apt install -y chrony python3-pip
```
### Step 2: Configure Hostnames and Network
```bash
# On ML110-01
hostnamectl set-hostname ml110-01
echo "192.168.11.10 ml110-01 ml110-01.sankofa.nexus" >> /etc/hosts
echo "192.168.11.11 r630-01 r630-01.sankofa.nexus" >> /etc/hosts
# On R630-01
hostnamectl set-hostname r630-01
echo "192.168.11.10 ml110-01 ml110-01.sankofa.nexus" >> /etc/hosts
echo "192.168.11.11 r630-01 r630-01.sankofa.nexus" >> /etc/hosts
```
### Step 3: Install Ceph
```bash
# Add the Ceph repository key (apt-key is deprecated and removed on current Debian)
mkdir -p /etc/apt/keyrings
wget -qO /etc/apt/keyrings/ceph.asc https://download.ceph.com/keys/release.asc
# Match the Ceph release and Debian codename to your base system (quincy/bullseye shown)
echo "deb [signed-by=/etc/apt/keyrings/ceph.asc] https://download.ceph.com/debian-quincy/ bullseye main" > /etc/apt/sources.list.d/ceph.list
# Update and install
apt update
apt install -y ceph ceph-common ceph-mds
# Install ceph-deploy, which the cluster-initialization steps below rely on
pip3 install ceph-deploy
```
### Step 4: Create Ceph User
```bash
# On both nodes, create ceph user
useradd -d /home/ceph -m -s /bin/bash ceph
echo "ceph ALL = (root) NOPASSWD:ALL" | tee /etc/sudoers.d/ceph
chmod 0440 /etc/sudoers.d/ceph
```
### Step 5: Configure SSH Key Access
```bash
# On ML110-01 (deployment node)
su - ceph
ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
ssh-copy-id ceph@ml110-01
ssh-copy-id ceph@r630-01
```
### Step 6: Initialize Ceph Cluster
```bash
# On ML110-01 (deployment node), as the ceph user created earlier
cd ~
mkdir ceph-cluster
cd ceph-cluster
# Create cluster configuration
ceph-deploy new ml110-01 r630-01
# Append network settings and reduced redundancy for the 2-node cluster
cat >> ceph.conf << EOF
[global]
osd pool default size = 2
osd pool default min size = 1
osd pool default pg num = 128
osd pool default pgp num = 128
public network = 192.168.11.0/24
cluster network = 192.168.11.0/24
EOF
# Install Ceph on all nodes
ceph-deploy install ml110-01 r630-01
# Create initial monitor
ceph-deploy mon create-initial
# Deploy admin key
ceph-deploy admin ml110-01 r630-01
```
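
The `pg num = 128` used above follows the common heuristic of roughly 100 placement groups per OSD, divided by the replica count and rounded up to a power of two. A small sketch of that calculation (the function name is illustrative, not a Ceph tool):

```shell
# Heuristic: pg_num ≈ (OSDs * 100) / replicas, rounded up to the next power of two
pg_count() {
  local osds=$1 replicas=$2
  local target=$(( (osds * 100 + replicas - 1) / replicas ))
  local pg=1
  while [ "$pg" -lt "$target" ]; do pg=$(( pg * 2 )); done
  echo "$pg"
}

pg_count 2 2   # 2 OSDs, size=2 -> 128
```

With one OSD per node and size=2, the heuristic lands on 128, matching the value in ceph.conf above.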
### Step 7: Add OSDs
```bash
# List available disks
ceph-deploy disk list ml110-01
ceph-deploy disk list r630-01
# Prepare disks (replace /dev/sdX with actual disk)
ceph-deploy disk zap ml110-01 /dev/sdb
ceph-deploy disk zap r630-01 /dev/sdb
# Create OSDs
ceph-deploy osd create --data /dev/sdb ml110-01
ceph-deploy osd create --data /dev/sdb r630-01
```
### Step 8: Deploy Manager
```bash
# Deploy manager daemon
ceph-deploy mgr create ml110-01 r630-01
```
### Step 9: Verify Cluster
```bash
# Check cluster status
ceph -s
# Check OSD status
ceph osd tree
# Check health
ceph health
```
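
For automation, the health status can be checked from a script; a minimal sketch (the function name is an assumption, and in a live cluster the string would come from `ceph health`):

```shell
# Return 0 only when the reported status is HEALTH_OK
is_healthy() {
  case "$1" in
    HEALTH_OK*) return 0 ;;
    *)          return 1 ;;
  esac
}

# In a live cluster this would be: status="$(ceph health)"
status="HEALTH_OK"
if is_healthy "$status"; then echo "cluster healthy"; else echo "cluster degraded"; fi
```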
## Proxmox Integration
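The CephFS storage added below requires at least one metadata server and a filesystem, neither of which the installation steps above created. A minimal sketch using the same tooling (PG counts here are illustrative; run on the deployment node as the ceph user, from the ceph-cluster directory):

```bash
# Deploy metadata servers for CephFS
ceph-deploy mds create ml110-01 r630-01

# Create data and metadata pools, then the filesystem itself
ceph osd pool create cephfs_data 64
ceph osd pool create cephfs_metadata 32
ceph fs new cephfs cephfs_metadata cephfs_data
```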
### Step 1: Create Ceph Storage Pool in Proxmox
```bash
# On Proxmox nodes, create Ceph storage
pvesm add cephfs ceph-storage --monhost 192.168.11.10,192.168.11.11 --username admin --fsname cephfs
```
### Step 2: Create RBD Pool for Block Storage
```bash
# Create RBD pool
ceph osd pool create rbd 128 128
# Initialize pool for RBD
rbd pool init rbd
# Create storage in Proxmox
pvesm add rbd rbd-storage --pool rbd --monhost 192.168.11.10,192.168.11.11 --username admin
```
### Step 3: Configure Proxmox Storage
1. **Via Web UI**:
- Datacenter → Storage → Add
- Select "RBD" or "CephFS"
- Configure connection details
2. **Via CLI**:
```bash
# RBD storage
pvesm add rbd ceph-rbd --pool rbd --monhost 192.168.11.10,192.168.11.11 --username admin --content images,rootdir
# CephFS storage
pvesm add cephfs ceph-fs --monhost 192.168.11.10,192.168.11.11 --username admin --fsname cephfs --content iso,backup
```
## Configuration Files
### ceph.conf
```ini
[global]
fsid = <cluster-fsid>
mon initial members = ml110-01, r630-01
mon host = 192.168.11.10, 192.168.11.11
public network = 192.168.11.0/24
cluster network = 192.168.11.0/24
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
osd pool default size = 2
osd pool default min size = 1
osd pool default pg num = 128
osd pool default pgp num = 128
```
## Monitoring
### Ceph Dashboard
```bash
# Enable dashboard module
ceph mgr module enable dashboard
# Generate a self-signed certificate so the dashboard can serve HTTPS
ceph dashboard create-self-signed-cert
# Create dashboard user (recent releases require the password via a file)
echo -n '<password>' > /tmp/dashboard_pass
ceph dashboard ac-user-create admin -i /tmp/dashboard_pass administrator
rm /tmp/dashboard_pass
# Access dashboard
# https://ml110-01.sankofa.nexus:8443
```
### Prometheus Integration
```bash
# Enable prometheus module
ceph mgr module enable prometheus
# Metrics endpoint
# http://ml110-01.sankofa.nexus:9283/metrics
```
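
On the Prometheus side, a scrape job pointing at that endpoint might look like the following (the job name and scrape interval are assumptions, adjust to your Prometheus setup):

```yaml
scrape_configs:
  - job_name: 'ceph'
    scrape_interval: 15s
    static_configs:
      - targets: ['ml110-01.sankofa.nexus:9283']
```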
## Maintenance
### Adding OSDs
```bash
ceph-deploy disk zap <node> /dev/sdX
ceph-deploy osd create --data /dev/sdX <node>
```
### Removing OSDs
```bash
ceph osd out <osd-id>
# Wait for rebalancing to complete, then stop the daemon on its host
systemctl stop ceph-osd@<osd-id>
ceph osd crush remove osd.<osd-id>
ceph auth del osd.<osd-id>
ceph osd rm <osd-id>
```
### Cluster Health
```bash
# Check status
ceph -s
# Check detailed health
ceph health detail
# Check OSD status
ceph osd tree
```
## Troubleshooting
### Common Issues
1. **Clock Skew**: Ensure NTP is configured
```bash
systemctl enable chrony
systemctl start chrony
```
2. **Network Issues**: Verify connectivity
```bash
ping ml110-01
ping r630-01
```
3. **OSD Issues**: Check OSD status
```bash
ceph osd tree
systemctl status ceph-osd@<id>
```
## Security
### Firewall Rules
```bash
# Allow Ceph ports
ufw allow 3300/tcp # Monitors (msgr2)
ufw allow 6789/tcp # Monitors (msgr1)
ufw allow 6800:7300/tcp # OSDs and MGRs
ufw allow 8443/tcp # Dashboard
ufw allow 9283/tcp # Prometheus metrics
```
### Authentication
- Use cephx authentication (default)
- Rotate keys regularly
- Limit admin access
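
One way to limit admin access in practice is to give Proxmox its own key scoped to the RBD pool instead of using `client.admin`; a sketch (the client name `client.proxmox` is an assumption):

```bash
# Create a key restricted to RBD usage on the "rbd" pool
ceph auth get-or-create client.proxmox \
  mon 'profile rbd' \
  osd 'profile rbd pool=rbd'

# Inspect the resulting key and capabilities
ceph auth get client.proxmox
```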
## Related Documentation
- [Ceph Official Documentation](https://docs.ceph.com/)
- [Proxmox Ceph Integration](https://pve.proxmox.com/pve-docs/chapter-pveceph.html)
- [Storage Configuration](../proxmox/STORAGE_CONFIGURATION.md)