Consolidation

Please note: this version of the EKS and Karpenter workshop is deprecated since the launch of the Karpenter v1beta1 APIs, and has moved to a new home on AWS Workshop Studio here: Karpenter: Amazon EKS Best Practice and Cloud Cost Optimization.

This workshop remains here as a reference for those who have used it before, or for those who want to reference it for running Karpenter on the v1alpha5 API.

In the previous section we configured the default provisioner with a specific ttlSecondsAfterEmpty. This instructs Karpenter to remove a node once it has been empty for that many seconds; note that Karpenter takes DaemonSets into consideration, so DaemonSet pods do not count as keeping a node non-empty. We also know that nodes can be removed when they reach ttlSecondsUntilExpired. This is ideal for forcing node termination on the cluster while bringing in new nodes that pick up the latest AMIs.

Automated deprovisioning is configured through the ProvisionerSpec .ttlSecondsAfterEmpty, .ttlSecondsUntilExpired and .consolidation.enabled fields. If these are not configured, Karpenter will not apply default values for them and will not terminate nodes.
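For reference, this is a minimal sketch of how the two TTL fields sit on a v1alpha5 Provisioner; the values are illustrative, not the workshop defaults:

apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  # Remove a node 30 seconds after its last non-DaemonSet pod leaves
  ttlSecondsAfterEmpty: 30
  # Expire nodes after 30 days so replacements pick up the latest AMIs
  ttlSecondsUntilExpired: 2592000
  providerRef:
    name: default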

There is another way to configure Karpenter to deprovision nodes, called Consolidation. This mode is preferred for workloads such as microservices and is incompatible with setting ttlSecondsAfterEmpty. When set in consolidation mode, Karpenter works to actively reduce cluster cost by identifying when nodes can be removed because their workloads will run on other nodes in the cluster, and when nodes can be replaced with cheaper variants due to a change in the workloads.

Before we proceed to see how Consolidation works, let’s change the default provisioner configuration:

cat <<EOF | kubectl apply -f -
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  consolidation:
    enabled: true
  labels:
    intent: apps
  requirements:
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["on-demand"]
    - key: karpenter.k8s.aws/instance-size
      operator: NotIn
      values: [nano, micro, small, medium, large]
  limits:
    resources:
      cpu: 1000
      memory: 1000Gi
  ttlSecondsUntilExpired: 2592000
  providerRef:
    name: default
---
apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
  name: default
spec:
  subnetSelector:
    karpenter.sh/discovery: eksspotworkshop
  securityGroupSelector:
    karpenter.sh/discovery: eksspotworkshop
  tags:
    KarpenterProvisionerName: "default"
    NodeType: "karpenter-workshop"
    IntentLabel: "apps"
EOF
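
Once applied, you can confirm that consolidation is enabled on the provisioner; the jsonpath query below is one way to check, and should print true:

kubectl get provisioner default -o jsonpath='{.spec.consolidation.enabled}'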

When consolidation is enabled it is recommended you configure requests=limits for all non-CPU resources. As an example, pods that have a memory limit that is larger than the memory request can burst above the request. If several pods on the same node burst at the same time, this can cause some of the pods to be terminated due to an out of memory (OOM) condition. Consolidation can make this more likely to occur as it works to pack pods onto nodes only considering their requests.
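As an illustration, the container resources below pin memory so a pod cannot burst above its request; the figures are made up for this sketch, not taken from the inflate deployment:

    resources:
      requests:
        cpu: "1"
        memory: 1.5Gi
      limits:
        memory: 1.5Gi  # equal to the request, so no memory bursting

Note that the CPU limit is deliberately left unset: the requests=limits guidance applies to non-CPU resources such as memory.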

Challenge

You can use kube-ops-view or plain kubectl CLI to visualize the changes and answer the questions below. Remember: to get the URL of kube-ops-view you can run the following command kubectl get svc kube-ops-view | tail -n 1 | awk '{ print "Kube-ops-view URL = http://"$4 }'. Note that the current version of kube-ops-view sometimes takes time to reflect the correct state of the cluster.

Before we deep dive into consolidation, let's set up the environment in an initial state. First, let's scale the inflate application to 3 replicas so that we provision a small node. You can do that by running:

kubectl scale deployment inflate --replicas 3
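
While the pods schedule, you can follow Karpenter's scaling decisions in its controller logs. This assumes Karpenter runs as the karpenter deployment in the karpenter namespace with a controller container, as set up earlier in this workshop:

kubectl logs -f deployment/karpenter -n karpenter -c controller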

You can then check the number of nodes using the following command: kubectl get nodes. Once there are 3 nodes in the cluster, you can run again:

kubectl scale deployment inflate --replicas 10

That will bring up yet another node. In total there should now be 4 nodes: 2 from the managed nodegroup, and 2 on-demand nodes provisioned by Karpenter (an xlarge holding 3 of our inflate replicas, and a 2xlarge).
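One way to verify this is to list the Karpenter-provisioned nodes (they carry the intent=apps label from our provisioner) together with their instance type and capacity type:

kubectl get nodes -l intent=apps -L node.kubernetes.io/instance-type,karpenter.sh/capacity-type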

Answer the following questions to validate your understanding.

1) Scale the inflate deployment to 6 replicas. What should happen?


2) What should happen when we move to just 3 replicas?


3) Increase the replicas to 10. What will happen if we change the provisioner to support both on-demand and spot?

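While working through the questions, the following commands are handy for observing consolidation in action; they assume the same labels and deployment layout used above:

kubectl get nodes -l intent=apps -w
kubectl logs -f deployment/karpenter -n karpenter -c controller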