Ceph Cluster won't get healthy after one node fails

Hello,
we are experimenting with Ceph (version Luminous) as a new storage server. The cluster consists of 3 physical Ubuntu nodes that act as the OSD, monitor and iSCSI gateway nodes, plus one virtual Ubuntu admin node for deployment (ceph-deploy). We followed the official installation documentation (http://docs.ceph.com) for the setup.

The installation went fine and the cluster is up and running, but when we shut down one node, the cluster stays degraded. As soon as we power the node back on, the cluster becomes healthy again within a few seconds.

Any ideas what we can do or check?

The cluster has one pool (the setup commands are sketched below the list):
Name: rbd
PGs: 400
Size: 3
Min_size: 2
Crush-Map: autogenerated
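
In case it helps, this is roughly how the pool was set up and how the settings can be verified (the values are the ones listed above; the exact create command is from memory, so treat it as a sketch):

  # create a replicated pool with 400 placement groups
  ceph osd pool create rbd 400 400 replicated

  # replica count and the minimum number of replicas required to keep serving I/O
  ceph osd pool set rbd size 3
  ceph osd pool set rbd min_size 2

  # verify the settings
  ceph osd pool ls detail
  ceph osd pool get rbd size
  ceph osd pool get rbd min_size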
 
Cluster with all 3 nodes
  cluster:
    id:     551080bb-eada-44e4-bcbe-7c952dbca781
    health: HEALTH_OK

  services:
    mon:         3 daemons, quorum HBceph01,HBceph02,HBceph03
    mgr:         HBceph01(active), standbys: HBceph03, HBceph02
    osd:         12 osds: 12 up, 12 in
    tcmu-runner: 3 daemons active

  data:
    pools:   1 pools, 400 pgs
    objects: 1666 objects, 6416 MB
    usage:   148 GB used, 89275 GB / 89424 GB avail
    pgs:     400 active+clean

  io:
    client:   5401 B/s rd, 1941 B/s wr, 4 op/s rd, 0 op/s wr



Cluster with 2 nodes
  cluster:
    id:     551080bb-eada-44e4-bcbe-7c952dbca781
    health: HEALTH_WARN
            Degraded data redundancy: 1689/5067 objects degraded (33.333%), 390 pgs degraded, 400 pgs undersized
            1/3 mons down, quorum HBceph01,HBceph03

  services:
    mon:         3 daemons, quorum HBceph01,HBceph03, out of quorum: HBceph02
    mgr:         HBceph01(active), standbys: HBceph03
    osd:         12 osds: 8 up, 8 in
    tcmu-runner: 2 daemons active

  data:
    pools:   1 pools, 400 pgs
    objects: 1689 objects, 6523 MB
    usage:   97894 MB used, 59520 GB / 59616 GB avail
    pgs:     1689/5067 objects degraded (33.333%)
             390 active+undersized+degraded
             10  active+undersized

  io:
    client:   3715 B/s rd, 3 op/s rd, 0 op/s wr



ID CLASS WEIGHT   TYPE NAME         STATUS REWEIGHT PRI-AFF
-1       87.33000 root default
-3       29.11000     host HBceph01
 0   hdd  7.27699         osd.0         up  1.00000 1.00000
 1   hdd  7.27699         osd.1         up  1.00000 1.00000
 2   hdd  7.27699         osd.2         up  1.00000 1.00000
 3   hdd  7.27699         osd.3         up  1.00000 1.00000
-5       29.11000     host HBceph02
 4   hdd  7.27699         osd.4       down        0 1.00000
 5   hdd  7.27699         osd.5       down        0 1.00000
 6   hdd  7.27699         osd.6       down        0 1.00000
 7   hdd  7.27699         osd.7       down        0 1.00000
-7       29.11000     host HBceph03
 8   hdd  7.27699         osd.8         up  1.00000 1.00000
 9   hdd  7.27699         osd.9         up  1.00000 1.00000
10   hdd  7.27699         osd.10        up  1.00000 1.00000
11   hdd  7.27699         osd.11        up  1.00000 1.00000


TREXman asked:

David Johnson, CD, MVP (Owner) commented:
Isn't that the way it is supposed to respond? If a node goes missing, the cluster reports the affected placement groups as degraded; with a replica size of 3 and only three hosts there is nowhere left to re-create the missing copies, so it stays degraded. When the node re-attaches and syncs, the cluster goes back to a healthy status.

Why are you expecting it to be different?
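
If you want to double-check that the data is only waiting for the missing replicas and has not been lost, something along these lines should show the affected PGs as undersized rather than inactive (just a sketch using the standard ceph CLI):

  # overall health with per-check detail
  ceph health detail

  # list PGs that are stuck undersized / degraded
  ceph pg dump_stuck undersized
  ceph pg dump_stuck degraded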

TREXman (Author) commented:
Hello David,

thank you for the hint - you are right.
I was looking at the wrong failure domain: in the autogenerated CRUSH map it is the host (node), not the OSD.
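
For anyone who hits the same thing, the failure domain can be checked in the CRUSH rule; a rough sketch, assuming the default rule name from a Luminous setup:

  # list and dump the CRUSH rules
  ceph osd crush rule ls
  ceph osd crush rule dump replicated_rule

  # the relevant step in the dump looks roughly like this:
  #   { "op": "chooseleaf_firstn", "num": 0, "type": "host" }

With "type": "host" and only three hosts for a pool of size 3, that explains why the PGs stayed undersized while one node was down.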