TREXman

asked on

Ceph Cluster won't get healthy after one node fails

Hello,
we are playing around with Ceph (version Luminous) as a new storage server.
For the cluster we have 3 physical Ubuntu nodes which act as the OSD, monitor and iSCSI gateway nodes. For installation purposes we have one additional virtual Ubuntu admin node (ceph-deploy).
We followed the official installation documentation (http://docs.ceph.com) for the setup.

The installation went fine and the cluster is up and running, but if we shut down one node, the cluster stays degraded and does not recover on its own.
If I power the node back on, the cluster gets healthy again after a few seconds.

Any ideas what we can do or check?
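So far we have looked at ceph -s and ceph osd tree (output below). A few more detailed checks we could run (standard Luminous commands, nothing cluster-specific assumed):

ceph health detail              # lists the degraded/undersized PGs and the monitor that is down
ceph pg dump_stuck undersized   # PGs that currently cannot reach the configured replica count
ceph osd crush rule dump        # shows which bucket type the rule spreads replicas across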

The cluster has one pool:
Name: rbd
PGs: 400
Size: 3
Min_size: 2
CRUSH map: autogenerated
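A quick way to double-check those pool settings (standard Luminous CLI, run from the admin or a monitor node):

ceph osd pool get rbd size        # expect 3
ceph osd pool get rbd min_size    # expect 2
ceph osd pool get rbd pg_num      # expect 400
ceph osd pool get rbd crush_rule  # name of the autogenerated rule the pool uses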
 
Cluster status with all 3 nodes up:
  cluster:
    id:     551080bb-eada-44e4-bcbe-7c952dbca781
    health: HEALTH_OK

  services:
    mon:         3 daemons, quorum HBceph01,HBceph02,HBceph03
    mgr:         HBceph01(active), standbys: HBceph03, HBceph02
    osd:         12 osds: 12 up, 12 in
    tcmu-runner: 3 daemons active

  data:
    pools:   1 pools, 400 pgs
    objects: 1666 objects, 6416 MB
    usage:   148 GB used, 89275 GB / 89424 GB avail
    pgs:     400 active+clean

  io:
    client:   5401 B/s rd, 1941 B/s wr, 4 op/s rd, 0 op/s wr



Cluster status with only 2 nodes up (HBceph02 shut down):
  cluster:
    id:     551080bb-eada-44e4-bcbe-7c952dbca781
    health: HEALTH_WARN
            Degraded data redundancy: 1689/5067 objects degraded (33.333%), 390 pgs degraded, 400 pgs undersized
            1/3 mons down, quorum HBceph01,HBceph03

  services:
    mon:         3 daemons, quorum HBceph01,HBceph03, out of quorum: HBceph02
    mgr:         HBceph01(active), standbys: HBceph03
    osd:         12 osds: 8 up, 8 in
    tcmu-runner: 2 daemons active

  data:
    pools:   1 pools, 400 pgs
    objects: 1689 objects, 6523 MB
    usage:   97894 MB used, 59520 GB / 59616 GB avail
    pgs:     1689/5067 objects degraded (33.333%)
             390 active+undersized+degraded
             10  active+undersized

  io:
    client:   3715 B/s rd, 3 op/s rd, 0 op/s wr



OSD tree while HBceph02 is down:
ID CLASS WEIGHT   TYPE NAME         STATUS REWEIGHT PRI-AFF
-1       87.33000 root default
-3       29.11000     host HBceph01
 0   hdd  7.27699         osd.0         up  1.00000 1.00000
 1   hdd  7.27699         osd.1         up  1.00000 1.00000
 2   hdd  7.27699         osd.2         up  1.00000 1.00000
 3   hdd  7.27699         osd.3         up  1.00000 1.00000
-5       29.11000     host HBceph02
 4   hdd  7.27699         osd.4       down        0 1.00000
 5   hdd  7.27699         osd.5       down        0 1.00000
 6   hdd  7.27699         osd.6       down        0 1.00000
 7   hdd  7.27699         osd.7       down        0 1.00000
-7       29.11000     host HBceph03
 8   hdd  7.27699         osd.8         up  1.00000 1.00000
 9   hdd  7.27699         osd.9         up  1.00000 1.00000
10   hdd  7.27699         osd.10        up  1.00000 1.00000
11   hdd  7.27699         osd.11        up  1.00000 1.00000


ASKER CERTIFIED SOLUTION
David Johnson, CD

(The accepted solution text is only available to Experts Exchange members.)
TREXman

ASKER

Hello David,

Thank you for your hint - you are right.
I was looking at the wrong failure domain: in the CRUSH map it is the node (host), not the OSD.
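
That explains the behaviour: with the failure domain at host level and only three hosts, Ceph has nowhere to place the third replica while one node is down, so the PGs stay undersized and degraded until the node returns. For anyone hitting the same thing, a minimal sketch of how to inspect the rule and, only if replicas on the same host are acceptable, switch the pool to an OSD failure domain. These are standard Luminous commands; "replicated_rule" is the usual name of the autogenerated rule, and "rbd-osd" is a made-up name for the example:

ceph osd crush rule dump replicated_rule                     # the chooseleaf step shows "type": "host"
ceph osd crush rule create-replicated rbd-osd default osd    # hypothetical new rule with failure domain = osd
ceph osd pool set rbd crush_rule rbd-osd                     # point the rbd pool at the new rule

Keep in mind that with an OSD failure domain a single host failure can take out every copy of an object, so keeping the host failure domain and either accepting HEALTH_WARN while a node is down or adding a fourth node is usually the safer choice.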