How To Configure Single-Node Ceph Cluster To Run Properly

July 4, 2016 | LINUX HOWTO, STORAGE

Ceph is designed to be a fault-tolerant, scalable storage system. In a production environment, a cluster is therefore expected to have at least three Ceph nodes. If you can only afford a single node for now, or only need a single Ceph node for testing, you will run into some problems. A single-node Ceph cluster considers itself degraded because, by default, it looks for another node to replicate data to, and you will not be able to use it. This how-to shows how to reconfigure a single Ceph node so that it becomes usable. It works if your Ceph node has at least two OSDs available. For background, see the introduction to Ceph in our previous article.


CRUSH is the algorithm that Ceph uses to determine how and where to place data to satisfy replication and resiliency rules. The CRUSH Map gives CRUSH a view of what the cluster physically looks like, and the replication rules for each node. We will obtain a copy of the CRUSH Map from the Ceph node, edit it to replicate data only within the node's OSDs, then re-insert it into the Ceph node, overwriting the existing CRUSH Map. This will allow the single-node Ceph cluster to operate in a clean state, ready and willing to serve requests.

Obtaining The CRUSH Map

Log in to your Ceph admin node. This may be the same machine as your Ceph storage node, if that is how it was installed. All of the following commands are run from the Ceph admin node.

Extract the cluster CRUSH Map and save it as a file named "crush_map_compressed":

ceph osd getcrushmap -o crush_map_compressed


This is a compiled binary file that Ceph interprets directly, so we need to decompile it into a text format that we can edit. The following command decompiles the CRUSH Map file we extracted and saves the contents to a file named "crush_map_decompressed":

crushtool -d crush_map_compressed -o crush_map_decompressed
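Since the edited map will later overwrite the running one, it may be worth keeping an untouched copy of the original before editing anything. The following is only a sketch, assuming a POSIX shell on the admin node. The function name `fetch_crush_map` and the `.orig` suffix are made up for illustration and are not part of Ceph; the two commands inside are the same ones used above.

```shell
# Hypothetical helper (not part of Ceph): extract the CRUSH map,
# keep a pristine copy, and convert it to editable text.
fetch_crush_map() {
    compiled="${1:-crush_map_compressed}"
    text="${2:-crush_map_decompressed}"

    # Pull the binary map from the cluster; stop if the cluster is unreachable.
    ceph osd getcrushmap -o "$compiled" || return 1

    # Untouched copy that can be re-injected later if the edit goes wrong.
    cp "$compiled" "$compiled.orig" || return 1

    # Convert the binary map to text.
    crushtool -d "$compiled" -o "$text"
}
```

Calling `fetch_crush_map` with no arguments uses the file names from this article. If the modified map ever needs to be rolled back, the saved copy can be re-inserted with `ceph osd setcrushmap -i crush_map_compressed.orig`.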

Now open up the decompressed CRUSH file with your favorite text editor. Assuming that the Ceph node is named "Storage-01", and that it has 6 OSDs, the CRUSH Map should look similar to this:

# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1
tunable straw_calc_version 1

# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
device 3 osd.3
device 4 osd.4
device 5 osd.5

# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root

# buckets
host Storage-01 {
	id -2		# do not change unnecessarily
	# weight 21.792
	alg straw
	hash 0	# rjenkins1
	item osd.0 weight 3.632
	item osd.1 weight 3.632
	item osd.2 weight 3.632
	item osd.3 weight 3.632
	item osd.4 weight 3.632
	item osd.5 weight 3.632
}
root default {
	id -1		# do not change unnecessarily
	# weight 21.792
	alg straw
	hash 0	# rjenkins1
	item Storage-01 weight 21.792
}

# rules
rule replicated_ruleset {
	ruleset 0
	type replicated
	min_size 1
	max_size 10
	step take default
	step chooseleaf firstn 0 type host
	step emit
}

# end crush map

Pay attention to the bottom section that starts with "# rules" - that is the section that defines how replication is done across the cluster.
Take this line:

step chooseleaf firstn 0 type host

and change the "host" to "osd". It should look like this:

step chooseleaf firstn 0 type osd

With this change, CRUSH picks individual OSDs rather than hosts as the targets for each replica, so it is satisfied when the copies land on different OSDs, even if those OSDs are on the same host. The cluster can then reach an active and clean state once data has been replicated between the node's OSDs.

Save the change.
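For scripted or repeated test setups, the same one-word edit can be made without opening an editor. This is a minimal sketch, assuming the rule line reads exactly as in the map shown above. The function name `set_osd_failure_domain` is hypothetical, and the `.bak` file is just a convenience copy of the text before the edit.

```shell
# Hypothetical helper: switch the chooseleaf step from "host" to "osd"
# in a decompiled CRUSH map, keeping a backup of the original text.
set_osd_failure_domain() {
    map_file="$1"

    # Keep the unedited text so the change can be compared or reverted.
    cp "$map_file" "$map_file.bak" || return 1

    # Only lines of the form "step chooseleaf firstn 0 type host" are touched;
    # the "type 1 host" entry in the "# types" section is left alone.
    sed 's/^\([[:space:]]*step chooseleaf firstn 0 type \)host[[:space:]]*$/\1osd/' \
        "$map_file.bak" > "$map_file"
}
```

For example, `set_osd_failure_domain crush_map_decompressed` followed by `grep chooseleaf crush_map_decompressed` should show the line ending in "type osd". If your map contains several rules, every matching chooseleaf line is rewritten, so review the result before compiling it.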

Insert The CRUSH Map

Now that we have a modified CRUSH Map, let's insert it back into the cluster to override the running CRUSH Map configuration.

Compile it again:

crushtool -c crush_map_decompressed -o new_crush_map_compressed
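Before inserting the new map, you may want to dry-run it offline. The sketch below assumes that your version of `crushtool` provides the `--test`, `--rule`, `--num-rep` and `--show-bad-mappings` options described in its man page (check `crushtool --help` if unsure). The function name `check_crush_map`, the default of 2 replicas, and the choice to treat any output as a problem are assumptions made for illustration; rule 0 corresponds to "ruleset 0" in the map above.

```shell
# Hypothetical helper: dry-run a compiled CRUSH map before injecting it.
# Any output from the test run is printed and treated as something to review.
check_crush_map() {
    compiled="${1:-new_crush_map_compressed}"
    replicas="${2:-2}"   # assumed replica count; match your pools' "size"

    # Simulate placements for rule 0 and report inputs that could not be
    # mapped to the requested number of OSDs.
    report="$(crushtool -i "$compiled" --test --rule 0 \
                --num-rep "$replicas" --show-bad-mappings 2>&1)" || return 1

    if [ -n "$report" ]; then
        printf '%s\n' "$report"
        return 1
    fi
    echo "no bad mappings reported for $replicas replicas"
}
```

Running `check_crush_map new_crush_map_compressed 2` on the admin node prints either the confirmation line or whatever the test run reported. With the edited rule and at least two OSDs, the expectation is that nothing is reported for two replicas, but treat that as something to verify on your own cluster.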

Then insert it using the ceph CLI tool:

ceph osd setcrushmap -i new_crush_map_compressed

If you check the cluster status immediately with "ceph -s", you might catch the node still replicating data to its other OSDs, but it will eventually look like this:

ceph@ceph-admin:~/os-cluster$ ceph -s
    cluster 15ac0bfc-9c48-4992-a2f6-b710d8f03ff4
     health HEALTH_OK
     monmap e1: 1 mons at {Storage-01=}
            election epoch 9, quorum 0 Storage-01
     osdmap e105: 6 osds: 6 up, 6 in
            flags sortbitwise
      pgmap v426885: 624 pgs, 11 pools, 77116 MB data, 12211 objects
            150 GB used, 22164 GB / 22315 GB avail
                 624 active+clean
  client io 0 B/s rd, 889 B/s wr, 0 op/s rd, 0 op/s wr

It is now showing an active+clean state.
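Rather than re-running the status command by hand, a test script can wait for the cluster to settle. This is a rough sketch that polls `ceph health` until its output begins with HEALTH_OK. The function name `wait_for_health_ok` and the default timeout and interval are assumptions chosen for illustration, not recommended values.

```shell
# Hypothetical helper: poll "ceph health" until it reports HEALTH_OK
# or the timeout expires.
wait_for_health_ok() {
    timeout_s="${1:-600}"   # give up after this many seconds (assumed default)
    interval_s="${2:-5}"    # pause between polls (assumed default)
    waited=0
    status=""

    while [ "$waited" -lt "$timeout_s" ]; do
        status="$(ceph health 2>/dev/null)"
        case "$status" in
            HEALTH_OK*) echo "$status"; return 0 ;;
        esac
        sleep "$interval_s"
        waited=$((waited + interval_s))
    done

    echo "timed out after ${timeout_s}s; last status: $status" >&2
    return 1
}
```

For example, `wait_for_health_ok 900 10` waits up to 15 minutes and checks every 10 seconds. On a node holding a lot of data, replication between OSDs can take a while, so pick a timeout that suits your setup.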


Although Ceph is designed for a high-availability, multi-node setup, it can be reconfigured to run as a single-node cluster. Be aware of the risk of data loss when running Ceph in that configuration: every replica lives on the same host, so losing the host means losing all copies. Still, this allows test setups to be built and low-level SLAs to be fulfilled. Red Hat recently announced Red Hat Ceph Storage 2, with enhanced object storage capabilities and improved ease of use.
