Building a hyperconverged Kubernetes cluster with MicroK8s and Ceph

This guide explains how to build a highly-available, hyperconverged Kubernetes cluster using MicroK8s, Ceph and MetalLB on commodity hardware or virtual machines. This could be useful for small production deployments, dev/test clusters, or a nerdy toy.

Other guides are available – this one is written from a sysadmin point of view, focusing on stability and ease of maintenance. I prefer to avoid running random scripts or fetching binaries that are then unmanaged and unmanageable. This guide uses package managers and operators wherever possible. I’ve also attempted to explain each step so readers can gain some understanding instead of just copying and pasting the commands. However, this does not absolve you from having a decent background of the components, and it is strongly recommended that you are familiar with kubectl/Kubernetes and Ceph in particular.

The technological landscape moves so fast so these instructions may become outdated quickly. I’ll link to upstream documentation wherever possible so you can check for updated versions.

Finally, this is a fairly simplistic guide that gives you the minimum possible configuration. There are many other components and configurations that you can add, and it also takes no account of security with RBAC etc.


There are a few of considerations when choosing your hardware or virtual “hardware” for use as Kubernetes nodes.

  • MicroK8s requires at least 3 nodes to work in HA mode, so we’ll start with 3 VMs
  • While MicroK8s is quite lightweight, by the time you start adding the storage capability you will need a reasonable amount of memory. Recommended minimum spec for this guide is 2 CPUs and 4GB RAM. More is obviously better, depending on your workload.
  • Each VM will need two block devices (disks). One should be partitioned, formatted and used as a normal OS disk, and the other should be left untouched so it can be claimed by Ceph later. The OS disk will also contain cached container images so could get quite large. I’ve allowed 16GB for the OS disk, and Ceph requires a minimum of 10GB for its disk.
  • If running in VirtualBox, place all VMs either in the same NAT network, or bridged to the host network. Ideally have static IPs.
  • If you are running on bare metal, make sure the machines are on the same network, or at least on networks that can talk to each other.

In my case, I used VirtualBoxc and created 3 identical VMs, kube01, kube02 and kube03.

Operating system

This guide focuses on CentOS/Fedora but should be applicable to many distributions with minor tweaks. I have started with a CentOS 8 minimal installation. Fedora Server or Ubuntu Server would also work just as well but you’ll need to tweak some of the commands.

  • Don’t create a swap partition on these machines
  • Make sure ntp is enabled for accurate time
  • Make sure the VMs have static IPs or DHCP reservations, so their IPs won’t change



Snap is a package manager that contains MicroK8s. It comes preinstalled on Ubuntu, but if you’re on CentOS, Fedora or others, you’ll need to install it on all your nodes.

sudo dnf -y install epel-release
sudo dnf -y install snapd
sudo systemctl enable --now snapd
sudo ln -s /var/lib/snapd/snap /snap



MicroK8s is a lightweight, pre-packaged Kubernetes distribution which is easy to use and works well for small deployments. It’s a lot more straightforward than following Kubernetes the hard way.


Install MicroK8s 1.19.1 or greater from Snap on all your nodes:

sudo snap install microk8s --classic --channel=latest/edge
microk8s status --wait-ready
echo 'alias kubectl="microk8s kubectl"' >> ~/.bashrc

The first time you run microk8s status, you will be prompted to add your user to the microk8s group. Follow the instructions and log in again.

Enable HA mode


Enable MicroK8s HA mode on all nodes, which allows any of the worker nodes to also behave as a master, instead of just being a worker node. This must be enabled before nodes are joined to the master. On some versions of MicroK8s this is enabled by default.

microk8s enable ha-cluster

Add firewall rules


Create firewall rules for your nodes, so they can communicate with each other.

Enable clustering


Enable Microk8s clustering, which allows you to add multiple worker nodes to your existing master node

Run this on the first node only:

[jonathan@kube01 ~]$ microk8s add-node
From the node you wish to join to this cluster, run the following:
microk8s join

Then execute the join command on the second node, to join it to the master.

[jonathan@kube02 ~]$ microk8s join
Contacting cluster at
Waiting for this node to finish joining the cluster. ..

Repeat for the third node and remember to run the add-node command for each node you add, so they all get a unique token.

Verify that they are correctly joined:

[jonathan@kube01 ~]$ kubectl get nodes
NAME                         STATUS   ROLES    AGE   VERSION   Ready    <none>   35h   v1.19.1-34+08a87c75adb55c   Ready    <none>   35h   v1.19.1-34+08a87c75adb55c   Ready    <none>   35h   v1.19.1-34+08a87c75adb55c

Finally make sure that full HA mode is enabled:

[jonathan@kube01 ~]$ microk8s status
microk8s is running
high-availability: yes
  datastore master nodes:
  datastore standby nodes: none




Enable some basic addons across the cluster to provide a usable experience. Run this on any one node.

microk8s enable dns rbac


We’ve already checked that all 3 nodes are up. Now let’s make sure pods are being scheduled on all nodes:

[jonathan@kube01 ~]$ kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                                                READY   STATUS              RESTARTS   AGE    IP             NODE                      
kube-system   calico-node-bqqqd                                   1/1     Running             0          112m
kube-system   calico-node-z4sxd                                   1/1     Running             0          110m
kube-system   calico-kube-controllers-847c8c99d-4qblz             1/1     Running             0          115m
kube-system   coredns-86f78bb79c-t2sgt                            1/1     Running             0          109m
kube-system   calico-node-t5skc                                   1/1     Running             0          111m

With the cluster in a health and operational state, let’s add the hyperconverged storage. From now on, all steps can be run on kube01.


Ceph is a clustered storage engine which can present its storage to Kubernetes as block storage or a filesystem. We will use the Rook operator to manage our Ceph deployment.



These steps are taken verbatim from the official Rook docs. Check the link above to make sure you are using the latest version of Rook.

First we install the Rook operator, which automates the rest of the Ceph installation.

git clone --single-branch --branch release-1.4
cd rook/cluster/examples/kubernetes/ceph
kubectl create -f common.yaml
kubectl create -f operator.yaml
kubectl -n rook-ceph get pod

Wait until the rook-ceph-operator pod and the rook-discover pods are all Running. This took a few minutes for me. Then we can create the actual Ceph cluster.

kubectl create -f cluster.yaml
kubectl -n rook-ceph get pod

This command will probably take a while – be patient. The operator creates various pods including canaries, monitors, a manager, and provisioners. There will be periods where it looks like it isn’t doing anything, but don’t be tempted to intervene. You can check what the operator is doing by reading its log:

kubectl -n rook-ceph logs rook-ceph-operator-775d4b6c5f-52r87



Install the Ceph toolbox and connect to it so we can run some checks.

kubectl create -f toolbox.yaml
kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0]}') bash

OSDs are the individual pieces of storage. Make sure all 3 are available and check the overall health of the cluster.

[root@rook-ceph-tools-6967fc698d-5f4sh /]# ceph status
    id:     e37a9364-b2e4-42ba-a7c0-c7276bc2083d
    health: HEALTH_OK
    mon: 3 daemons, quorum a,b,d (age 2m)
    mgr: a(active, since 33s)
    osd: 3 osds: 3 up (since 89s), 3 in (since 89s)
    pools:   1 pools, 1 pgs
    objects: 0 objects, 0 B
    usage:   3.0 GiB used, 45 GiB / 48 GiB avail
    pgs:     1 active+clean
[root@rook-ceph-tools-6967fc698d-5f4sh /]# ceph osd status
ID  HOST                         USED  AVAIL  WR OPS  WR DATA  RD OPS  RD DATA  STATE      
 0  1027M  14.9G      0        0       0        0   exists,up  
 1  1027M  14.9G      0        0       0        0   exists,up  
 2  1027M  14.9G      0        0       0        0   exists,up  

Block storage


Ceph can provide persistent block storage to Kubernetes as a storage class which can be consumed by one pod at any one time.

kubectl create -f csi/rbd/storageclass.yaml

Verify that the block storageclass is available:

[jonathan@kube01 ~]$ kubectl get storageclass
rook-ceph-block   Delete          Immediate           true                   3m53s



Ceph can provide persistent storage which can be consumed across multiple pods simultaneously by providing a filesystem layer.

kubectl create -f filesystem.yaml

Use the toolbox again to verify that there is a metadata service (mds) available:

[root@rook-ceph-tools-6967fc698d-5f4sh /]# ceph status
    id:     e37a9364-b2e4-42ba-a7c0-c7276bc2083d
    health: HEALTH_OK
    mon: 3 daemons, quorum a,b,d (age 36m)
    mgr: a(active, since 34m)
    mds: myfs:1 {0=myfs-b=up:active} 1 up:standby-replay
    osd: 3 osds: 3 up (since 35m), 3 in (since 35m)
  task status:
    scrub status:
        mds.myfs-a: idle
        mds.myfs-b: idle
    pools:   4 pools, 97 pgs
    objects: 22 objects, 2.2 KiB
    usage:   3.0 GiB used, 45 GiB / 48 GiB avail
    pgs:     97 active+clean
    client:   852 B/s rd, 1 op/s rd, 0 op/s wr

Now we can create a new storageclass based on the filesystem:

kubectl create -f csi/cephfs/storageclass.yaml

Verify the storageclass is present:

[jonathan@kube01 ceph]$ kubectl get storageclass
rook-ceph-block (default)      Delete          Immediate           true                   49m
rook-cephfs          Delete          Immediate           true                   34m


It’s easy to consume the new Ceph storage. Use the storageClassName rook-ceph-block in ReadWriteOnce mode for persistent storage for a single pod, or rook-cephfs in ReadWriteMany mode for persistent storage that can be shared between pods.

apiVersion: v1
kind: PersistentVolumeClaim
  name: ceph-rbd-pvc
  storageClassName: rook-ceph-block
  - ReadWriteOnce
      storage: 20Gi
apiVersion: v1
kind: PersistentVolumeClaim
  name: cephfs-pvc
  storageClassName: rook-cephfs
  - ReadWriteMany
      storage: 1Gi



Probably the simplest way to expose web applications on your cluster is to use an Ingress. This binds to ports 80+443 on all your nodes and listens for http+https requests. It will effectively do name-based virtual hosting, terminate your SSL and will direct your web traffic to a Kubernetes Service with an internal ClusterIP which acts as a simple load balancer. This will require you to set up external round robin DNS to point your A record at all 3 of the node IPs.

microk8s enable ingress
sudo firewall-cmd --permanent --add-service http
sudo firewall-cmd --permanent --add-service https
sudo firewall-cmd --reload



If you want to set up more advanced load balancing, consider using MetalLB. It will load balance your Kubernetes Service and present it on a single virtual IP.


MetalLB will prompt you for one or more ranges of IPs that it can use for load-balancing. It should be fine to accept the default suggestion.

[jonathan@kube01 ~]$ microk8s enable metallb
Enabling MetalLB
Enter each IP address range delimited by comma (e.g. ','):,


Once MetalLB is installed and configured, to expose a service externally, simply create it with spec.type set to LoadBalancer, and MetalLB will do the rest.

It’s important to note that in the default config, the vIP will only appear on one of your nodes and that node will act as the entry point for all traffic before it gets load balanced between nodes, so this could be a bottleneck in busy environments.

apiVersion: v1
kind: Service
  name: nginx
  - port: 80
    targetPort: 80
    app: nginx
  type: LoadBalancer


You now have a fully-featured Kubernetes cluster with high availability, clustered storage, ingress, and load balancing. The possibilities are endless!

If you spot any mistakes, improvements or versions that need updating, please drop a comment below.

3 thoughts on “Building a hyperconverged Kubernetes cluster with MicroK8s and Ceph

  1. Hello,
    that helped me very much, thank you.

    I had to set ROOK_CSI_KUBELET_DIR_PATH: to “/var/snap/microk8s/common/var/lib/kubelet”
    Otherwise, it tries to use /var/lib/kubelet, in which case you would need to manually create the expected dir layout.

    At least, thats true for v1.5.9


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: