Consolidation

In the previous section we configured the default provisioner with a specific ttlSecondsAfterEmpty. This instructs Karpenter to remove a node once it has been empty for ttlSecondsAfterEmpty seconds. Note that Karpenter takes DaemonSets into consideration when deciding whether a node is empty. We also know that nodes can be removed when they reach ttlSecondsUntilExpired. This is ideal to force node termination on the cluster while bringing in new nodes that will pick up the latest AMIs.

Automated deprovisioning is configured through the Provisioner spec .ttlSecondsAfterEmpty, .ttlSecondsUntilExpired and .consolidation.enabled fields. If these are not configured, Karpenter does not apply default values for them and will not terminate nodes.
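
For reference, a provisioner that relies on the ttl-based settings instead of consolidation could look roughly like the following minimal sketch (the values are illustrative, and the providerRef assumes the same default AWSNodeTemplate used below):

apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  ttlSecondsAfterEmpty: 30        # remove a node 30 seconds after it becomes empty
  ttlSecondsUntilExpired: 2592000 # expire nodes after 30 days so replacements pick up fresh AMIs
  providerRef:
    name: default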

There is another way to configure Karpenter to deprovision nodes, called Consolidation. This mode is preferred for workloads such as microservices and is incompatible with setting ttlSecondsAfterEmpty. When set in consolidation mode, Karpenter works to actively reduce cluster cost by identifying when nodes can be removed because their workloads will run on other nodes in the cluster, and when nodes can be replaced with cheaper variants due to a change in the workloads.

Before we proceed to see how Consolidation works, let’s change the default provisioner configuration:

cat <<EOF | kubectl apply -f -
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  consolidation:
    enabled: true
  labels:
    intent: apps
  requirements:
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["on-demand"]
    - key: karpenter.k8s.aws/instance-size
      operator: NotIn
      values: [nano, micro, small, medium, large]
  limits:
    resources:
      cpu: 1000
      memory: 1000Gi
  ttlSecondsUntilExpired: 2592000
  providerRef:
    name: default
---
apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
  name: default
spec:
  subnetSelector:
    alpha.eksctl.io/cluster-name: ${CLUSTER_NAME}
  securityGroupSelector:
    alpha.eksctl.io/cluster-name: ${CLUSTER_NAME}
  tags:
    KarpenerProvisionerName: "default"
    NodeType: "karpenter-workshop"
    IntentLabel: "apps"
EOF
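
One way to confirm that consolidation is now enabled is to read the field back from the provisioner:

kubectl get provisioner default -o jsonpath='{.spec.consolidation.enabled}'

This should print true.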

Challenge

You can use Kube-ops-view or just the plain kubectl CLI to visualize the changes and answer the questions below. In the answers we will provide the CLI commands that will help you check the responses. Remember: to get the URL of Kube-ops-view you can run the following command: kubectl get svc kube-ops-view | tail -n 1 | awk '{ print "Kube-ops-view URL = http://"$4 }'. Note that the current version of Kube-ops-view sometimes takes time to reflect the correct state of the cluster.
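
You can also follow Karpenter's consolidation decisions in its controller logs. Assuming Karpenter was installed in the karpenter namespace with the standard Helm chart labels, a command along these lines will tail them:

kubectl logs -f -n karpenter -l app.kubernetes.io/name=karpenter -c controller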

Before we start to deep dive into consolidation, let's set up the environment in an initial state. First, let's scale the inflate application to 3 replicas, so that we provision a small node. You can do that by running:

kubectl scale deployment inflate --replicas 3
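
To see where the inflate pods were scheduled (assuming the deployment uses the app=inflate label), you can run:

kubectl get pods -l app=inflate -o wide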

You can then check the number of nodes using the following command: kubectl get nodes. Once there are 3 nodes in the cluster you can run again:

kubectl scale deployment inflate --replicas 10

That will create yet another node. In total there should now be 4 nodes: 2 for the managed nodegroup, and 2 on-demand nodes, one of size xlarge holding 3 of our inflate replicas and one of size 2xlarge.
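
To verify the instance sizes and capacity types that Karpenter picked, you can list the nodes together with the relevant labels:

kubectl get nodes -L node.kubernetes.io/instance-type -L karpenter.sh/capacity-type -L intent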

Answer the following questions. You can expand each question to get a detailed answer and validate your understanding.

1) Scale the inflate deployment to 6 replicas. What should happen?


2) What should happen when we move to just 3 replicas?


3) Increase the replicas to 10. What will happen if we change the provisioner to support both on-demand and spot? (A sketch of the requirement change is shown below.)

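If you want to try the change from question 3 yourself, a minimal sketch is to widen the karpenter.sh/capacity-type requirement in the provisioner shown earlier to allow both values, leaving everything else unchanged, and re-apply the manifest:

    - key: karpenter.sh/capacity-type
      operator: In
      values: ["spot", "on-demand"]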

4) Scale the inflate service to 6 replicas. What should happen?


5) Scale the inflate service to 3 replicas. What should happen?


6) What other scenarios could prevent Consolidation events in your cluster?


7) Scale the replicas to 0.

In preparation for the next section, scale replicas to 0 using the following command.

kubectl scale deployment inflate --replicas 0
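
With consolidation enabled and no inflate replicas left, Karpenter should remove the now-empty nodes it provisioned. You can watch them disappear with:

kubectl get nodes --watch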

What have we learned in this section:

In this section we have learned:

  • Karpenter can be configured to consolidate workloads using the .consolidation.enabled field. The .ttlSecondsAfterEmpty and .consolidation.enabled settings are mutually exclusive within a provisioner.

  • Consolidation helps to reduce the overall cost of the cluster in two situations. Delete can occur when the capacity of a node can be safely distributed to other nodes. Replace occurs when a node can be replaced by a smaller node, thus reducing the cost of the cluster.

  • Consolidation takes into consideration multiple nodes, but only acts on one node at a time. The node selected is the one that minimises the disruption in the cluster.

  • Replace Consolidation also includes events where instances are moved from on-demand to Spot; however, Karpenter does not trigger Replace to make a Spot node smaller, as this can have an impact on the level of interruptions.

  • Karpenter uses cordon and drain best practices to terminate nodes. To make this safer, Karpenter adds a finalizer, so that a Kubernetes delete node command results in a graceful termination that removes the node safely from the cluster, as shown in the example below.
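
For example, you can inspect the finalizer on a Karpenter-provisioned node and then delete the node gracefully (the node name is a placeholder):

kubectl get node <karpenter-node-name> -o jsonpath='{.metadata.finalizers}'
kubectl delete node <karpenter-node-name>

The first command shows the finalizer Karpenter added; the second results in the node being cordoned, drained and removed from the cluster rather than deleted abruptly.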