Kubernetes Cost Optimisation: Reduce Cloud Bills by 40%

Kubernetes gives you enormous flexibility — but that flexibility comes with a cost trap. Clusters that start small balloon quickly when teams over-provision, leave idle workloads running, or skip resource limits entirely.

This post covers the exact strategies we use at CogniVeu to help clients cut their Kubernetes cloud bills by 30–40% without sacrificing reliability.

1. Right-size your resource requests

The single biggest source of waste is over-provisioned requests. If every pod requests 2 CPU but only uses 0.3, you are reserving almost 7x the CPU your workloads consume, and paying for the extra nodes needed to hold those reservations.

How to fix it:

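Deploy Goldilocks: it runs VPA in recommendation mode and surfaces right-sizing suggestions per workload in a dashboard.
Set requests to the p95 of actual usage, not the peak. kubectl top pods only gives a point-in-time reading; for a percentile over days or weeks, query Prometheus using rate() over container_cpu_usage_seconds_total.
Set limits only when you need hard caps (latency-sensitive services). For batch workloads, skip CPU limits to allow bursting on spare capacity. Memory is not compressible, so a pod that bursts past its memory request can still be OOM-killed or evicted when the node runs short.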
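
As a rough sketch of the first step: Goldilocks watches namespaces that carry its opt-in label and creates a recommendation-only VPA for each workload it finds there. The manifests below show that namespace label and what an equivalent hand-written VPA might look like. The names team-a and api are placeholders, and the VPA assumes the Vertical Pod Autoscaler CRDs are already installed in the cluster.

```yaml
# Opt a namespace in to Goldilocks ("team-a" is a placeholder name).
apiVersion: v1
kind: Namespace
metadata:
  name: team-a
  labels:
    goldilocks.fairwinds.com/enabled: "true"
---
# Recommendation-only VPA for a hypothetical Deployment called "api".
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: api
  namespace: team-a
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  updatePolicy:
    updateMode: "Off"   # compute recommendations only; never evict or resize pods
```

Recommendations then appear under status.recommendation of the VPA object (for example via kubectl describe vpa api -n team-a) as well as in the Goldilocks dashboard.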

# Before: over-provisioned
resources:
  requests:
    cpu: "2"
    memory: "4Gi"
 
# After: right-sized
resources:
  requests:
    cpu: "300m"
    memory: "512Mi"
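
One way to arrive at a figure like 300m is a Prometheus recording rule that tracks the 95th percentile of CPU usage per container. This is a sketch rather than a tuned rule: the group name, the rule name, the 7-day window, and the 5-minute resolution are all assumptions to adjust for your own retention and scrape settings.

```yaml
groups:
  - name: rightsizing            # hypothetical group name
    rules:
      # p95 of per-container CPU usage (in cores) over the last 7 days.
      # rate() turns the cumulative CPU-seconds counter into cores in use;
      # the [7d:5m] subquery samples that rate every 5 minutes.
      - record: container:cpu_usage_cores:p95_7d   # hypothetical rule name
        expr: |
          quantile_over_time(
            0.95,
            rate(container_cpu_usage_seconds_total{container!=""}[5m])[7d:5m]
          )
```

A result of 0.3 for a container corresponds to a request of 300m. Subqueries over long windows can be costly to evaluate, so a shorter window may be a sensible starting point on large clusters.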

2. Use Spot / Preemptible nodes for batch and stateless workloads

Spot instances on AWS (Spot or Preemptible VMs on GKE) typically cost 60–80% less than On-Demand, in exchange for the provider being able to reclaim them at short notice. Most stateless services, CI runners, and batch jobs can tolerate those interruptions.

Karpenter makes this easy:

apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: spot-pool
spec:
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
        - key: node.kubernetes.io/instance-type
          operator: In
          values: ["m5.large", "m5.xlarge", "m5a.large"]
      nodeClassRef:
        name: default
  disruption:
    consolidationPolicy: WhenUnderutilized

Karpenter will prefer Spot and fall back to On-Demand automatically.
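
Workloads that cannot tolerate interruption can be kept off Spot capacity, and those that can will usually benefit from a disruption budget. The sketch below assumes two hypothetical workloads: a Deployment named checkout that is pinned to On-Demand nodes through the karpenter.sh/capacity-type node label, and a stateless service labelled app: web that is protected by a PodDisruptionBudget. All names, the image, and the replica counts are placeholders.

```yaml
# Pin an interruption-sensitive workload to On-Demand capacity.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: checkout                 # hypothetical workload
spec:
  replicas: 2
  selector:
    matchLabels:
      app: checkout
  template:
    metadata:
      labels:
        app: checkout
    spec:
      nodeSelector:
        karpenter.sh/capacity-type: on-demand   # keep these pods off Spot nodes
      containers:
        - name: checkout
          image: registry.example.com/checkout:1.0   # placeholder image
---
# Keep at least two replicas of a Spot-friendly service running during drains.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb                  # hypothetical name
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: web                   # hypothetical label on the stateless service
```

The budget limits how many replicas a voluntary drain, such as consolidation, can remove at once. It only helps if the service runs more replicas than minAvailable.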

3. Enable cluster autoscaler consolidation

Idle nodes are expensive. Both Cluster Autoscaler and Karpenter can consolidate underutilised nodes.

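Karpenter: Set consolidationPolicy: WhenUnderutilized (shown above). Karpenter does not use a fixed utilisation threshold; it drains and terminates a node when its pods fit on other nodes, or replaces the node when a cheaper one would hold them.
Cluster Autoscaler: --scale-down-utilization-threshold (default 0.5) marks a node as a scale-down candidate when the sum of its pods' requests falls below that fraction of the node's allocatable capacity, so at 0.5 the trigger is 50%. Raise the value to consolidate more aggressively.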
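
For Cluster Autoscaler, these settings are passed as command-line flags on its own Deployment. The fragment below is a sketch of where they go, not a complete manifest. The container name and cloud provider are assumptions, so keep whatever your existing manifest or Helm values already use.

```yaml
# Fragment of the cluster-autoscaler Deployment's pod template.
spec:
  template:
    spec:
      containers:
        - name: cluster-autoscaler             # assumed container name
          command:
            - ./cluster-autoscaler
            - --cloud-provider=aws                     # assumption: an EKS cluster
            - --scale-down-utilization-threshold=0.5   # value from the text above
            - --scale-down-unneeded-time=10m           # how long a node must stay unneeded before removal
```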

4. Schedule non-critical workloads with low-priority classes

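Use PriorityClass so that dev/staging pods are the first to be preempted or evicted when capacity is tight, making room for production without adding nodes. Pods with no class get priority 0 unless a global default is defined, so production workloads need a class of their own with a higher value than the one below.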

apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: low-priority
value: 100
preemptionPolicy: Never
globalDefault: false
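
A PriorityClass only takes effect once pod specs reference it through priorityClassName. The sketch below pairs the class above with a higher-valued class for production and a staging Deployment that opts in to low-priority. The names production, staging-api, the value 1000, and the image are illustrative assumptions.

```yaml
# Higher-valued class for production workloads (hypothetical name and value).
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: production
value: 1000
globalDefault: false
description: "Production services; may preempt lower-priority pods."
---
# A staging workload that references the low-priority class defined above.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: staging-api              # hypothetical workload
spec:
  replicas: 1
  selector:
    matchLabels:
      app: staging-api
  template:
    metadata:
      labels:
        app: staging-api
    spec:
      priorityClassName: low-priority
      containers:
        - name: api
          image: registry.example.com/api:1.0   # placeholder image
```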

5. Namespace-level chargeback with Kubecost

You can't optimise what you can't measure. Kubecost gives you per-namespace, per-team, per-label cost breakdowns with no code changes.

Once teams see their own costs, over-provisioning drops dramatically.
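
Per-team and per-label breakdowns depend on workloads carrying consistent labels. A sketch of a labelled Deployment follows. The label keys team and env, their values, and the workload name are placeholders chosen for illustration, not keys that Kubecost requires.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: invoice-worker           # hypothetical workload
  namespace: payments            # hypothetical namespace
spec:
  replicas: 1
  selector:
    matchLabels:
      app: invoice-worker
  template:
    metadata:
      labels:
        app: invoice-worker
        team: payments           # hypothetical label used to group costs by team
        env: production          # hypothetical label used to split prod from non-prod
    spec:
      containers:
        - name: worker
          image: registry.example.com/invoice-worker:1.0   # placeholder image
```

The labels sit on the pod template because cost is attributed to running pods, so labels placed only on the Deployment object itself would not reach them.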

Results

Applying these five techniques across a 40-node EKS cluster for one of our clients produced:

Technique                                    Monthly Saving
Right-sizing requests                        ~$1,200
Spot instance migration (70% of workloads)   ~$2,800
Consolidation (removed 8 idle nodes)         ~$960
Total                                        ~$4,960 / month

Conclusion

Kubernetes cost optimisation is not a one-time exercise — it is a continuous practice. Start with right-sizing (highest ROI, lowest risk), add Karpenter for intelligent node provisioning, and use Kubecost to maintain visibility as the cluster grows.

Need help implementing this for your cluster? Get in touch with CogniVeu.
