Running a node pool isn’t enough to seriously reduce your costs.

Are Node Pools Sabotaging Your Cost Optimization Efforts?

AWS recommends running highly available EKS clusters with worker nodes in node pools (Auto Scaling groups) spread across multiple Availability Zones.

 

This choice sure makes sense in terms of reliability. But does it help to optimize Kubernetes clusters for cost? 🤔

 

In many cases, using node pools leads to sub-optimal utilization and makes you waste a lot of the cloud resources you pay for.

What is a node pool anyway?

A node pool is a group of nodes inside one cluster that all have the same configuration. In the context of AWS, a node pool is essentially an Auto Scaling group. 

 

If a scaling policy is enabled, the Auto Scaling group adjusts the group's desired capacity between the specified minimum and maximum values, launching or terminating instances as needed. You can also use scheduled actions to scale on a schedule.
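To make this concrete, here is a minimal eksctl sketch of a node pool on EKS — each node group becomes an Auto Scaling group bounded by the min/max values the scaler works within. The cluster name, region, and sizes below are illustrative assumptions, not values from this email.

```yaml
# Illustrative eksctl config: each managed node group maps to one
# Auto Scaling group. Names, region, and sizes are example values.
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: demo-cluster       # hypothetical cluster name
  region: us-east-1
managedNodeGroups:
  - name: pool-a
    instanceType: m5.large
    minSize: 2             # scaler never drops below this
    desiredCapacity: 3
    maxSize: 6             # scaler never grows past this
```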

But there’s a caveat

The problem with node pools is that they often end up being just partially full.

 

You can easily end up with a collection of nodes containing more capacity than needed and pay for resources that are not being used.

Solution? A single node pool with maximum utilization

Imagine how great it would be to drop the multiple-node-pool approach and avoid the mounting costs of idle cloud resources.

 

Instead of keeping a set of partially full nodes, you could run a single node pool where all the nodes are full, leaving no room for cloud waste.

Increase node utilization in 4 steps

1. Cloud resource optimization

 

Pick the right instance type and size, with the goal of avoiding overprovisioning your nodes. Make sure that the instance you choose addresses the CPU, memory, and network requirements of your application.

 

2. Pod resource configuration

 

Set realistic requests and limits for CPU and memory. Check out this guide to learn how to do that.
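As a sketch of what realistic requests and limits look like in a pod spec (the workload name, image, and numbers below are illustrative assumptions):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web                # hypothetical workload
spec:
  containers:
    - name: app
      image: nginx:1.25
      resources:
        requests:          # what the scheduler reserves on a node
          cpu: "250m"
          memory: "256Mi"
        limits:            # hard ceiling enforced at runtime
          cpu: "500m"
          memory: "512Mi"
```

Requests drive bin-packing: the scheduler places pods based on requested capacity, so inflated requests translate directly into half-empty nodes.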

3. Pod affinity and anti-affinity

 

Label your pods consistently, and the Kubernetes scheduler can co-locate pods matching particular labels or label sets on the same node, or keep them apart on different nodes. More details on this topic here. 

 

You can also specify these rules within the affinity section using the podAffinity and podAntiAffinity fields in the pod spec.

  • Pod affinity attracts a pod to nodes that already run pods meeting particular conditions. 
  • Pod anti-affinity prevents pods from running on the same node as pods matching particular criteria.
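The two rules above can be sketched in a single pod spec — here a hypothetical cache pod that is co-located with pods labeled app=web but never shares a node with another cache pod (all names and labels are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cache
  labels:
    app: cache
spec:
  affinity:
    podAffinity:           # attract: run where an app=web pod already runs
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              app: web
          topologyKey: kubernetes.io/hostname
    podAntiAffinity:       # repel: never two app=cache pods on one node
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              app: cache
          topologyKey: kubernetes.io/hostname
  containers:
    - name: cache
      image: redis:7
```

The topologyKey defines the scope of "same place" — kubernetes.io/hostname means per node, while a zone label would spread or group pods per Availability Zone.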

4. Node affinity

 

You can control how pods get matched to nodes by using node labels that tell kube-scheduler where to place your pods. 

 

You can specify that by adding the .spec.affinity.nodeAffinity field to your pod spec. 

 

Remember that if you specify both nodeSelector and nodeAffinity, both must be satisfied for the pod to be scheduled. 

 

There are two types of node affinity:   

  • requiredDuringSchedulingIgnoredDuringExecution – the scheduler schedules the pod only onto a node that meets the rule.
  • preferredDuringSchedulingIgnoredDuringExecution – the scheduler tries to find a node matching the rule, but still schedules the pod even if no suitable node is found. 
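Both node affinity types can appear in one pod spec — a hard rule the node must satisfy plus a soft preference. The zone, instance type, and pod name below are illustrative assumptions:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: batch-job          # hypothetical workload
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:   # hard rule
        nodeSelectorTerms:
          - matchExpressions:
              - key: topology.kubernetes.io/zone
                operator: In
                values: ["us-east-1a"]
      preferredDuringSchedulingIgnoredDuringExecution:  # soft preference
        - weight: 1
          preference:
            matchExpressions:
              - key: node.kubernetes.io/instance-type
                operator: In
                values: ["m5.large"]
  containers:
    - name: worker
      image: busybox:1.36
      command: ["sleep", "3600"]
```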


This is quite a lot of work. But it’s definitely worth it, since roughly a third of the cloud capacity teams provision goes to waste.

At CAST AI, we automated this into a feature to save engineers tons of time managing instances.

 

Cheers,

Allen

 

Found this email useful? Forward it to your friends and colleagues who need more Kubernetes best practices in their lives!

CAST AI Group Inc., 111 NE 1st Street, 8th Floor #1041, Miami, Florida 33132, United States

Manage Subscriptions
