KubeAcademy by VMware
Bootstrapping Using Cluster API Part 1: Concepts, Components, and Terminology

Cluster API aims to bring Kubernetes-style declarative APIs to the process of creating, configuring, and managing Kubernetes clusters. This lesson provides an introduction to Cluster API, and covers concepts, components, and terminology.

Scott Lowe

Principal Field Engineer at Kong, Inc.

Focused on Kubernetes and related cloud-native technologies. Helps customers deploy and deliver applications with cloud-native and open source technologies.


Hi, everyone. My name is Scott Lowe and I'm a staff architect at VMware. Today I'd like to talk to you about Cluster API. Specifically, I'd like to provide an introduction to some of the concepts, components, and terminology involved with Cluster API. I'll follow this video up with a second video that actually does some hands-on work using Cluster API to deploy a Kubernetes cluster on AWS. But first, let's dig into a definition of what Cluster API is and what it attempts to accomplish.

Pulling from the upstream GitHub repo where the Kubernetes community works on the Cluster API project, Cluster API is defined as an effort to bring declarative, Kubernetes-style APIs to cluster creation, cluster configuration, and cluster management. Basically, Cluster API is the answer to the question, "What if we could use Kubernetes to manage Kubernetes?" You're probably already aware that the Kubernetes APIs are declarative, meaning that a user of the API defines the desired state of the cluster. In other words, I want this service to exist, or I want this many pods to be running, or I want this ingress definition to be present. Kubernetes is then responsible for taking that desired state as expressed by the user and making it become the actual state.
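
To make "desired state" concrete, here's a minimal sketch using a core Kubernetes object; the names used here (such as hello-web) are purely illustrative.

```shell
# Declare desired state: three replicas of an nginx-based Deployment.
# Kubernetes, not the user, is responsible for making this come true.
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-web            # illustrative name
spec:
  replicas: 3                # "I want this many pods to be running"
  selector:
    matchLabels:
      app: hello-web
  template:
    metadata:
      labels:
        app: hello-web
    spec:
      containers:
      - name: web
        image: nginx:1.25
EOF
```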

This process of making desired state an actual state match is called reconciliation and the reconciliation loop that sits at the core of Kubernetes is something we discuss in more detail in the Kubernetes one-on-one course. If you're not familiar with the importance of that reconciliation loop sitting at the heart of Kubernetes, I recommend you have a look at the Kubernetes one-on-one course for more detail. But now that we have an idea of what Cluster API attempts to do, let's take a look at some of the concepts that are involved.
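
One simple way to watch reconciliation happen, continuing the illustrative Deployment above, is to change the desired state and observe Kubernetes converge on it:

```shell
# Change the desired state, then watch the reconciliation loop close the gap.
kubectl scale deployment hello-web --replicas=5   # hello-web is the illustrative Deployment above
kubectl get pods -l app=hello-web -w              # watch pods appear until actual state matches desired state
```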

First, as I mentioned already, Cluster API is designed to be declarative. You, as a user and consumer of Cluster API, define what the desired state of the cluster or clusters should be, and then Kubernetes is responsible for reconciling that desired state into actual state. Cluster API is also designed to be platform agnostic, meaning it can be used on multiple platforms to deploy Kubernetes clusters, not only on premises but also in public clouds such as AWS or Azure. However, in order to be platform agnostic and work on multiple platforms, Cluster API does have to implement some platform-specific details, and those platform-specific details are broken out into providers. So there's an AWS provider, a vSphere provider, an Azure provider, and so on. These providers implement the provider-specific details that enable the generic Cluster API implementation to work on that particular platform.
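
To make "provider-specific details" a little more tangible, here's a rough sketch of the kind of object the AWS provider defines (we'll cover the CRD machinery behind objects like this next). The API version and field names vary between releases and the values are illustrative, but the idea is that platform details, such as the AWS region, live in the provider's objects rather than in the generic Cluster API ones.

```shell
# Sketch of a provider-specific (AWS) object; apiVersion and fields may differ in your release.
cat <<EOF | kubectl apply -f -
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AWSCluster
metadata:
  name: demo-cluster         # illustrative name
spec:
  region: us-west-2          # platform-specific detail handled by the AWS provider
  sshKeyName: default        # illustrative; an existing EC2 key pair name
EOF
```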

The components involved in Cluster API include, first, Custom Resource Definitions, or CRDs. Custom Resource Definitions are a way to extend the Kubernetes API to include new kinds of objects. By default, the Kubernetes API includes objects like pods, services, and deployments. Using CRDs allows us to create entirely new types of objects that are managed by the Kubernetes API, and Cluster API uses a series of custom resource definitions to create new objects, like clusters and machines. Custom resources are simply instances of a custom resource definition. So when I create a cluster using the Cluster CRD, it creates a custom resource that is known as a cluster. However, we still need something to implement the reconciliation loop for these custom API objects, for these CRDs, and that's where the controllers come in. Controllers are responsible for implementing and managing the reconciliation loop that takes the desired state from the user and turns it into the actual state, or the realized state.
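
As a sketch of what an instance of one of these CRDs looks like, here's a minimal Cluster custom resource. The API version and exact fields differ somewhat between Cluster API releases, and the names are illustrative.

```shell
# A minimal Cluster custom resource (an instance of the Cluster CRD).
# It references the provider-specific AWSCluster object sketched earlier.
cat <<EOF | kubectl apply -f -
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: demo-cluster
spec:
  clusterNetwork:
    pods:
      cidrBlocks: ["192.168.0.0/16"]
  infrastructureRef:         # points at the provider-specific object
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: AWSCluster
    name: demo-cluster
EOF
```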

Now, to help solidify this idea of custom resource definitions and controllers, I'd like to log in to a system. So let's flip over to our terminal and log in to an AWS EC2 instance where I have a Kubernetes cluster running that's been enabled for Cluster API. First, let's just do a kubectl get nodes. You'll see I have a couple of nodes here running. Okay, great. Now let's look at the custom resource definitions that are present on the system. You can see there's a lot here. This particular cluster is running Calico, and Calico uses a lot of different CRDs. So let's filter that out. We'll just do a grep and search for cluster, and this will cut it down. You'll see now that we are looking at a bunch of Cluster API-related custom resource definitions. So we have some generic Cluster API objects: clusters, machines, machine sets, machine deployments, kubeadm configs, and kubeadm config templates. And we also have some provider-specific objects.
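
For reference, the inspection commands in this part of the walkthrough look roughly like this (the exact CRD names vary a bit between Cluster API releases):

```shell
kubectl get nodes                  # the nodes of the cluster we're inspecting
kubectl get crd                    # everything, including Calico's many CRDs
kubectl get crd | grep cluster     # narrow the list down to Cluster API-related CRDs
```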

So AWS clusters, AWS machines, AWS machine templates. And these are part of the Cluster API provider for AWS, often just referred to as CAPA. Now remember that the custom resource definitions are only part of the mix; we also need controllers. So let's look at pods. Here we're looking at the pods that are in the capa-system namespace, and if I do a kubectl get namespaces, you'll see I have three Cluster API-specific namespaces. So looking at capa-system, which I looked at earlier, there's a controller manager that's responsible for managing the AWS-specific CRDs, the instances of those CRDs. If I look at the capi-system namespace, you'll see there's a controller manager there, and if I look at the cabpk-system namespace (the kubeadm bootstrap provider, CABPK), you'll see there's a controller manager there as well. The controller manager is responsible for running the controllers, and the controllers are in turn responsible for implementing the reconciliation loop for a particular custom resource definition. So you can see here I have a cluster that is prepared for Cluster API: it has the custom resource definitions, and it has the controllers in place.
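
The namespace and pod inspection looks roughly like this. The namespace names shown match the older releases this walkthrough reflects; newer releases use slightly different names, so treat these as indicative.

```shell
kubectl get namespaces | grep -E 'capi|capa|cabpk'   # the Cluster API-related namespaces
kubectl get pods -n capa-system                      # controller manager for the AWS provider (CAPA)
kubectl get pods -n capi-system                      # controller manager for core Cluster API (CAPI)
kubectl get pods -n cabpk-system                     # controller manager for the kubeadm bootstrap provider (CABPK)
```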

Now finally, let's wrap up with some terminology. When we have a Kubernetes cluster that has the Cluster API CRDs and controllers installed and operating, we call that a management cluster. It is a cluster that is Cluster API aware, and it is able to use those custom resource definitions and those controllers to create other clusters. Those other clusters that it creates we refer to as workload clusters. Workload clusters are not Cluster API aware, but this is where you will deploy your applications. So you'll have a long-lived management cluster that is Cluster API aware, and you will use kubectl to interact with that management cluster in order to create, manage, configure, and destroy one or more workload clusters, which is where your applications will be deployed.
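
In practice, that workflow looks something like the sketch below. The cluster name is illustrative, and the clusterctl get kubeconfig command reflects more recent releases; older releases instead expose the kubeconfig through a Secret named <cluster-name>-kubeconfig.

```shell
# On the management cluster: list the workload clusters it manages.
kubectl get clusters

# Retrieve the kubeconfig for a workload cluster, then deploy applications to it.
clusterctl get kubeconfig demo-cluster > demo-cluster.kubeconfig
kubectl --kubeconfig=demo-cluster.kubeconfig get nodes
kubectl --kubeconfig=demo-cluster.kubeconfig apply -f my-app.yaml   # my-app.yaml is illustrative
```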

Early versions of Cluster API also used something called the bootstrap cluster, which was a temporary, or ephemeral, cluster that was used to instantiate the first Cluster API-enabled cluster, a.k.a. the management cluster. Subsequent revisions of Cluster API have done away with, or deprecated, the bootstrap cluster. So now users who want to use Cluster API are expected to create the management cluster on their own and install the Cluster API components into it to make it into a management cluster.
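
One common way to do that today is to start with any conformant cluster (a local kind cluster is convenient) and then install the Cluster API components into it with clusterctl. The exact commands and flags depend on your release and chosen provider, so treat this as a sketch rather than a recipe.

```shell
# Create a plain cluster, then turn it into a management cluster.
kind create cluster --name capi-mgmt       # capi-mgmt is an illustrative name; any conformant cluster works
clusterctl init --infrastructure aws       # installs the Cluster API CRDs and controllers, plus the AWS provider
```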

And that's a quick look at Cluster API and some of its concepts, components, and terminology. In the next video, we'll use Cluster API to actually create a Kubernetes cluster on AWS. Thanks for watching.
