
Karpenter - An Intro
What is Karpenter?
- was developed by AWS in 2021
- is not tied to a specific cloud provider. Implementations exist for:
  - AWS
  - Azure
  - Google Cloud
Karpenter is a high-performance, open-source Kubernetes autoscaler
- it improves efficiency
- can reduce the cost of running workloads on a cluster
- monitors unschedulable pods
- automatically provisions a correctly sized compute resource
- attempts to meet the specific demands of the running application
Karpenter is application focused
- decisions about compute resources are based on pod requirements
- it does not rely on predefined infrastructure such as fixed node groups
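To make this concrete, here is a sketch of the kind of pod spec Karpenter reads its scheduling signals from. The pod name, image, and taint key are hypothetical, but the fields shown (resource requests, node selectors, tolerations) are the ones Karpenter evaluates:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: inference-worker              # hypothetical example workload
spec:
  nodeSelector:
    karpenter.sh/capacity-type: spot  # well-known label: ask for Spot capacity
  tolerations:
    - key: "gpu-workload"             # hypothetical taint on GPU nodes
      operator: "Exists"
      effect: "NoSchedule"
  containers:
    - name: app
      image: example.com/inference:latest  # placeholder image
      resources:
        requests:
          cpu: "2"                    # Karpenter sizes the node from these requests
          memory: 4Gi
```

If no existing node satisfies these requests and constraints, the pod sits in a Pending state, which is exactly the signal Karpenter acts on.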
What does it do?
- monitors and finds pending pods (pods that cannot be scheduled due to a lack of resources)
- continuously monitors and checks with the Kubernetes scheduler
- it looks for pods that are in a pending state
- it then evaluates the pod requirements of those pending pods
- looks at resource requests - CPU, Memory, GPU
- also looks at node selectors, affinities, and tolerations
- using this information, it provisions a node
- it makes a ‘just in time’ (JIT) determination
- and deploys the most cost-effective, appropriately sized compute resource available
- using all of this information (node utilization, node workloads, empty nodes), it can also consolidate workloads onto fewer, more efficient nodes
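The steps above are driven by a single declarative resource. As a sketch (the field names follow the Karpenter v1 NodePool API, but the specific values here are illustrative assumptions), one NodePool can express both the provisioning constraints and the consolidation behaviour:

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]  # allow Spot with On-Demand fallback
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
      nodeClassRef:                      # provider-specific node settings (AWS shown)
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized  # pack workloads onto fewer nodes
    consolidateAfter: 1m
  limits:
    cpu: "100"                           # cap total CPU this pool may provision
```

Karpenter picks any instance type that satisfies the `requirements` and the pending pods' requests, rather than scaling a fixed node group.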
Karpenter vs Cluster Autoscaler (CAS)
Karpenter offers a flexible approach and, as such, has some advantages over the traditional Kubernetes Cluster Autoscaler:
| Feature | Karpenter | Cluster Autoscaler (CAS) |
|---|---|---|
| Provisioning model | Application-driven. Makes real-time decisions based on pending pods and provisions right-sized nodes on demand. | Infrastructure-driven. Manages and scales predefined node groups, only launching new nodes from those fixed specifications. |
| Performance | Faster. Reacts immediately to unschedulable pods and bypasses Auto Scaling Groups to call the cloud provider’s API directly, leading to provisioning in seconds. | Slower. Scans the cluster periodically and relies on Auto Scaling Groups, which can take several minutes to respond. |
| Cost optimization | High. Uses efficient “bin-packing” algorithms to consolidate workloads onto the fewest, most cost-effective instances. Supports using Spot Instances and falling back to On-Demand. | Lower. Can result in over-provisioning because it must scale in fixed increments based on the predefined node groups. |
| Management overhead | Low. Manages diverse workload capacity with a single, declarative NodePool resource instead of dozens of node groups. | High. Requires manually managing multiple node groups to support different instance types and configurations. |
| Multi-cloud | Growing. Initially developed for AWS, it now has official support for Azure and is expanding to other providers. | Mature. Has broad, mature integrations with all major cloud providers. |
Have questions or want to share your own experience? Leave a comment below!
(I will set up comments eventually ;)