
Autoscaling Intro

Need for a Solution

Cost savings: provision only the resources that are required instead of paying for idle capacity.
Improved application availability.
Efficient resource utilization.
Elasticity: the ability of an application to adapt to unpredictable traffic patterns.
Fault tolerance and recovery: either recover from a failure or mask it so that it is never noticed.
Simplified management of applications.

Kubernetes Scaling

Cluster Scaling:
- Changing the number of nodes in the cluster.
- Expands the cluster’s total capacity (CPU, RAM, GPU, disk, …).
- Improves cluster availability.
Pod Scaling:
- Increasing or decreasing the number of pod replicas (HPA) or the resources allocated to each pod (VPA).
- Improves application availability and efficiency.
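
To make horizontal pod scaling more concrete, here is a minimal sketch of the replica calculation described in the Kubernetes Horizontal Pod Autoscaler documentation: the desired replica count is the current count multiplied by the ratio of the observed metric to the target metric, rounded up. The function name and arguments are hypothetical, and the real controller also applies a tolerance, minimum and maximum bounds, and stabilization, all of which are left out here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric):
    """Sketch of the documented HPA ratio: ceil(current * observed / target).

    Hypothetical helper for illustration; it ignores the tolerance,
    min/max replica bounds, and stabilization used by the real controller.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    return math.ceil(current_replicas * current_metric / target_metric)


# 4 replicas averaging 90% CPU against a 50% target need more replicas.
scale_up = desired_replicas(4, 90, 50)

# 10 replicas averaging 25% CPU against a 50% target can be halved.
scale_down = desired_replicas(10, 25, 50)
```

In this sketch, load above the target pushes the replica count up and load below the target pulls it down, which is the elasticity described earlier.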

Manual HPA

Manual horizontal scaling means editing the replicas field in the Kubernetes Deployment manifest and applying the change.
This is repetitive and error-prone, and setting a wrong value can cause serious incidents.
Scaling can change the behaviour of applications and their efficiency.
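
As an illustration of the manual edit described above, the following sketch updates spec.replicas in a Deployment manifest held as a Python dictionary and rejects values outside a safe range. The function name, the sample manifest, and the min_replicas and max_replicas bounds are assumptions made for this example, not part of Kubernetes itself.

```python
import copy


def set_replicas(manifest, replicas, min_replicas=1, max_replicas=10):
    """Return a copy of a Deployment manifest with spec.replicas updated.

    min_replicas and max_replicas are hypothetical guard rails for this
    sketch, meant to catch the kind of wrong value a manual edit can introduce.
    """
    if isinstance(replicas, bool) or not isinstance(replicas, int):
        raise TypeError("replicas must be an integer")
    if not min_replicas <= replicas <= max_replicas:
        raise ValueError(
            f"replicas={replicas} is outside the allowed range "
            f"{min_replicas}..{max_replicas}"
        )
    updated = copy.deepcopy(manifest)  # leave the original manifest untouched
    updated["spec"]["replicas"] = replicas
    return updated


# Hypothetical Deployment manifest, trimmed to the fields used here.
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "web"},
    "spec": {"replicas": 2},
}

scaled = set_replicas(deployment, 5)
```

The updated manifest would still have to be written out and applied to the cluster by hand, so the process stays repetitive even when the value is validated.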

Manual VPA

When each replica or pod needs more resources, it must be scaled up vertically by allocating more resources to every instance.
This replaces all existing instances with new ones, which can cause downtime and reduce application availability.
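
The manual vertical change can be sketched in the same way. The example below sets CPU and memory requests and limits for one container in a Deployment's pod template. The function name, the container name, and the resource values are assumptions chosen for illustration.

```python
import copy


def set_container_resources(manifest, container_name, requests, limits):
    """Return a copy of a Deployment manifest with new resources for one container.

    Hypothetical helper: it edits spec.template.spec.containers[*].resources,
    which is part of the pod template.
    """
    updated = copy.deepcopy(manifest)  # leave the original manifest untouched
    containers = updated["spec"]["template"]["spec"]["containers"]
    for container in containers:
        if container["name"] == container_name:
            container["resources"] = {
                "requests": dict(requests),
                "limits": dict(limits),
            }
            return updated
    raise KeyError(f"no container named {container_name!r} in the pod template")


# Hypothetical Deployment manifest, trimmed to the fields used here.
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "web"},
    "spec": {
        "replicas": 2,
        "template": {
            "spec": {
                "containers": [
                    {
                        "name": "app",
                        "image": "example/app:1.0",
                        "resources": {
                            "requests": {"cpu": "250m", "memory": "256Mi"},
                            "limits": {"cpu": "500m", "memory": "512Mi"},
                        },
                    }
                ]
            }
        },
    },
}

resized = set_container_resources(
    deployment,
    "app",
    requests={"cpu": "500m", "memory": "512Mi"},
    limits={"cpu": "1", "memory": "1Gi"},
)
```

Because the resources sit inside the pod template, applying a manifest changed this way makes the Deployment roll out replacement pods, which is where the availability risk comes from.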