When Kubernetes Collapses: Lessons from a Home Cluster Meltdown
September 02, 2023 • Matthew Duong • kubernetes • 1 minute read
During an intense weekend of running workloads for Coderone, an AI game programming tournament, and Headbot, a project involving generative AI avatars, my 6-node MicroK8s cluster unexpectedly crashed. Symptoms ranged from stuck pods and workloads to crippling network errors and sluggish kubectl commands.
After deep analysis, I traced the problems to corrupted control-plane data storage in MicroK8s. MicroK8s uses dqlite as its datastore, which is lightweight but less mature than etcd, the store commonly used in production-grade clusters.
Dqlite is praised for its simplicity and resource efficiency but falls short in terms of maturity and high availability. Etcd is designed for high availability and data consistency but demands more resources and a deeper understanding of its operational aspects.
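To make the high-availability comparison concrete: both etcd and dqlite replicate data with Raft consensus, which needs a majority of voting members to stay healthy. The sketch below only does that majority arithmetic. The function names are made up for illustration, and how many of a cluster's nodes actually act as voting members depends on the distribution and its configuration, which is not covered here.

```python
# Minimal sketch of Raft-style majority arithmetic (illustrative helper names,
# not part of any etcd, dqlite, or MicroK8s API).

def quorum(voters: int) -> int:
    """Smallest majority among `voters` voting members."""
    if voters < 1:
        raise ValueError("need at least one voting member")
    return voters // 2 + 1


def fault_tolerance(voters: int) -> int:
    """How many voting members can fail while a majority remains."""
    return voters - quorum(voters)


if __name__ == "__main__":
    for n in range(1, 8):
        print(f"{n} voters: quorum={quorum(n)}, "
              f"tolerates {fault_tolerance(n)} failure(s)")
```

One thing the arithmetic shows is that an even number of voters adds no extra tolerance: 6 voters survive the same number of failures as 5, while needing a larger majority.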
After abandoning MicroK8s, I evaluated other Kubernetes distributions: k3s and RKE2.
K3s was promising but didn't meet my high-availability requirements. RKE2, on the other hand, proved to be reliable and secure, working seamlessly with a cluster of 6 nodes.
For single-node or experimental setups, MicroK8s is a great choice, especially with its range of free add-ons. For more robust, scalable solutions, RKE2 stands out as a better option. For a detailed account of my experience and further insights, check out the full article.