# Ceph Installation Guide for Proxmox

**Last Updated**: 2024-12-19
**Infrastructure**: 2-node Proxmox cluster (ML110-01, R630-01)

## Overview

Ceph is a distributed storage system that provides object, block, and file storage. This guide covers installing Ceph on the Proxmox infrastructure to provide distributed storage for VMs.

**Note**: Proxmox VE ships its own Ceph packaging and tooling (`pveceph install`), and upstream Ceph has deprecated `ceph-deploy` in favor of cephadm. The `ceph-deploy` workflow below is preserved as written for this cluster.

## Architecture

### Cluster Configuration

**Nodes**:

- **ML110-01** (192.168.11.10): Ceph Monitor, OSD, Manager
- **R630-01** (192.168.11.11): Ceph Monitor, OSD, Manager

**Network**: 192.168.11.0/24

### Ceph Components

1. **Monitors (MON)**: Track cluster state (minimum 1, recommended 3+)
2. **Managers (MGR)**: Provide monitoring and management interfaces
3. **OSDs (Object Storage Daemons)**: Store data on disks
4. **MDS (Metadata Servers)**: For CephFS (optional)

### Storage Configuration

**For 2-node setup**:

- Reduced redundancy (size=2, min_size=1)
- Suitable for development/testing
- Two monitors are no more resilient than one: quorum is lost if either fails
- For production, add a third node or use external storage

## Prerequisites

### Hardware Requirements

**Per Node**:

- CPU: 4+ cores recommended
- RAM: 4GB+ for Ceph services
- Storage: Dedicated disks/partitions for OSDs
- Network: 1Gbps+ (10Gbps recommended)

### Software Requirements

- Proxmox VE 9.1+
- SSH access to all nodes
- Root or sudo access
- Network connectivity between nodes

## Installation Steps

### Step 1: Prepare Nodes

```bash
# On both nodes, update system
apt update && apt upgrade -y

# Install prerequisites
apt install -y chrony gnupg python3-pip

# On the deployment node (ML110-01) only, install ceph-deploy
pip3 install ceph-deploy
```

### Step 2: Configure Hostnames and Network

```bash
# On ML110-01
hostnamectl set-hostname ml110-01
echo "192.168.11.10 ml110-01 ml110-01.sankofa.nexus" >> /etc/hosts
echo "192.168.11.11 r630-01 r630-01.sankofa.nexus" >> /etc/hosts

# On R630-01
hostnamectl set-hostname r630-01
echo "192.168.11.10 ml110-01 ml110-01.sankofa.nexus" >> /etc/hosts
echo "192.168.11.11 r630-01 r630-01.sankofa.nexus" >> /etc/hosts
```

### Step 3: Install Ceph

The Ceph release and Debian codename in the repository line must match your Proxmox base OS; `quincy`/`bullseye` is shown here, so adjust both for newer releases.

```bash
# Add Ceph repository key and source (apt-key is deprecated on current Debian)
wget -q -O- 'https://download.ceph.com/keys/release.asc' | gpg --dearmor -o /usr/share/keyrings/ceph.gpg
echo "deb [signed-by=/usr/share/keyrings/ceph.gpg] https://download.ceph.com/debian-quincy/ bullseye main" > /etc/apt/sources.list.d/ceph.list

# Update and install
apt update
apt install -y ceph ceph-common ceph-mds
```

### Step 4: Create Deployment User

Do not name this user `ceph`: the Ceph packages reserve that account for the daemons themselves.

```bash
# On both nodes, create the deployment user
useradd -d /home/cephdeploy -m -s /bin/bash cephdeploy
echo "cephdeploy ALL = (root) NOPASSWD:ALL" | tee /etc/sudoers.d/cephdeploy
chmod 0440 /etc/sudoers.d/cephdeploy
```

### Step 5: Configure SSH Key Access

```bash
# On ML110-01 (deployment node)
su - cephdeploy
ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
ssh-copy-id cephdeploy@ml110-01
ssh-copy-id cephdeploy@r630-01
```

### Step 6: Initialize Ceph Cluster

```bash
# On ML110-01 (deployment node)
cd ~
mkdir ceph-cluster
cd ceph-cluster

# Create cluster configuration
ceph-deploy new ml110-01 r630-01

# Edit ceph.conf to add network and reduce redundancy for 2-node
cat >> ceph.conf << EOF
[global]
osd pool default size = 2
osd pool default min size = 1
osd pool default pg num = 128
osd pool default pgp num = 128
public network = 192.168.11.0/24
cluster network = 192.168.11.0/24
EOF

# Install Ceph on all nodes
ceph-deploy install ml110-01 r630-01

# Create initial monitor
ceph-deploy mon create-initial

# Deploy admin key
ceph-deploy admin ml110-01 r630-01
```

### Step 7: Add OSDs

```bash
# List available disks
ceph-deploy disk list ml110-01
ceph-deploy disk list r630-01

# Prepare disks (replace /dev/sdX with actual disk)
ceph-deploy disk zap ml110-01 /dev/sdb
ceph-deploy disk zap r630-01 /dev/sdb

# Create OSDs
ceph-deploy osd create --data /dev/sdb ml110-01
ceph-deploy osd create --data /dev/sdb r630-01
```

### Step 8: Deploy Manager

```bash
# Deploy manager daemon
ceph-deploy mgr create ml110-01 r630-01
```

### Step 9: Verify Cluster

```bash
# Check cluster status
ceph -s

# Check OSD status
ceph osd tree

# Check health
ceph health
```

## Proxmox Integration

### Step 1: Create Ceph Storage Pool in Proxmox

Note that `pvesm add cephfs` requires an existing CephFS filesystem served by an MDS (created with `ceph fs new`).

```bash
# On Proxmox nodes, create Ceph storage
pvesm add cephfs ceph-storage --monhost 192.168.11.10,192.168.11.11 --username admin --fsname cephfs
```
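The `128` placement-group figure used throughout this guide comes from the common sizing rule of thumb: total PGs roughly equal to OSDs × 100 / replicas, rounded up to a power of two. A minimal sketch of that arithmetic (the `pg_count` helper name is an illustration, not a Ceph command):

```shell
# Rule-of-thumb PG count: osds * 100 / replicas, rounded up to the
# next power of two.
pg_count() {
  local osds=$1 replicas=$2
  local target=$(( osds * 100 / replicas ))
  local pg=1
  while [ "$pg" -lt "$target" ]; do
    pg=$(( pg * 2 ))
  done
  echo "$pg"
}

pg_count 2 2   # two OSDs, size=2 -> prints 128
```

For this 2-OSD, size=2 cluster the rule lands on 128, matching the pool settings above; recalculate if you add OSDs or change the replica count.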
### Step 2: Create RBD Pool for Block Storage

```bash
# Create RBD pool
ceph osd pool create rbd 128 128

# Initialize pool for RBD
rbd pool init rbd

# Create storage in Proxmox
pvesm add rbd rbd-storage --pool rbd --monhost 192.168.11.10,192.168.11.11 --username admin
```

### Step 3: Configure Proxmox Storage

1. **Via Web UI**:
   - Datacenter → Storage → Add
   - Select "RBD" or "CephFS"
   - Configure connection details

2. **Via CLI**:

```bash
# RBD storage
pvesm add rbd ceph-rbd --pool rbd --monhost 192.168.11.10,192.168.11.11 --username admin --content images,rootdir

# CephFS storage
pvesm add cephfs ceph-fs --monhost 192.168.11.10,192.168.11.11 --username admin --fsname cephfs --content iso,backup
```

## Configuration Files

### ceph.conf

```ini
[global]
fsid = <cluster-fsid, generated by ceph-deploy new>
mon initial members = ml110-01, r630-01
mon host = 192.168.11.10, 192.168.11.11
public network = 192.168.11.0/24
cluster network = 192.168.11.0/24
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
osd pool default size = 2
osd pool default min size = 1
osd pool default pg num = 128
osd pool default pgp num = 128
```

## Monitoring

### Ceph Dashboard

```bash
# Enable dashboard module
ceph mgr module enable dashboard

# Generate a self-signed certificate so the dashboard serves HTTPS
ceph dashboard create-self-signed-cert

# Create dashboard user (password is read from a file, not the command line)
ceph dashboard ac-user-create admin -i /path/to/password_file administrator

# Access dashboard
# https://ml110-01.sankofa.nexus:8443
```

### Prometheus Integration

```bash
# Enable prometheus module
ceph mgr module enable prometheus

# Metrics endpoint
# http://ml110-01.sankofa.nexus:9283/metrics
```

## Maintenance

### Adding OSDs

```bash
ceph-deploy disk zap <node> /dev/sdX
ceph-deploy osd create --data /dev/sdX <node>
```

### Removing OSDs

```bash
ceph osd out osd.<id>
ceph osd crush remove osd.<id>
ceph auth del osd.<id>
ceph osd rm osd.<id>
```
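For unattended monitoring, a `ceph health` probe can be wrapped in a small cron-able script. A minimal sketch under these assumptions: the `check_ceph_health` name is hypothetical, and the status command is passed as arguments so the probe can be exercised without a live cluster (on a real node you would invoke it as `check_ceph_health ceph health`):

```shell
# Hypothetical probe: print OK when the cluster reports HEALTH_OK,
# otherwise an alert line carrying the raw status. The status command
# is passed in so a stub (e.g. echo) can stand in for `ceph health`.
check_ceph_health() {
  local status
  status=$("$@" 2>/dev/null) || status="UNREACHABLE"
  if [ "$status" = "HEALTH_OK" ]; then
    echo "OK"
  else
    echo "ALERT: $status"
  fi
}

check_ceph_health echo HEALTH_OK   # stubbed run; prints OK
```

Because the command is injected, the same function reports `ALERT: UNREACHABLE` when the mon cannot be contacted at all (the status command fails), which is useful to distinguish from a degraded-but-reachable cluster.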
### Cluster Health

```bash
# Check status
ceph -s

# Check detailed health
ceph health detail

# Check OSD status
ceph osd tree
```

## Troubleshooting

### Common Issues

1. **Clock Skew**: Ensure NTP is configured (on Debian the chrony unit is `chrony`, not `chronyd`)

   ```bash
   systemctl enable chrony
   systemctl start chrony
   ```

2. **Network Issues**: Verify connectivity

   ```bash
   ping ml110-01
   ping r630-01
   ```

3. **OSD Issues**: Check OSD status

   ```bash
   ceph osd tree
   systemctl status ceph-osd@<id>
   ```

## Security

### Firewall Rules

```bash
# Allow Ceph ports
ufw allow 3300/tcp       # Monitors (msgr2)
ufw allow 6789/tcp       # Monitors (legacy msgr1)
ufw allow 6800:7300/tcp  # OSDs
ufw allow 8443/tcp       # Dashboard
```

### Authentication

- Use cephx authentication (default)
- Rotate keys regularly
- Limit admin access

## Related Documentation

- [Ceph Official Documentation](https://docs.ceph.com/)
- [Proxmox Ceph Integration](https://pve.proxmox.com/pve-docs/chapter-pveceph.html)
- [Storage Configuration](../proxmox/STORAGE_CONFIGURATION.md)