Autoscaling Intro
Need of Solution
- Cost savings: provision only the resources that are required instead of over-provisioning and wasting them.
- Improved application availability.
- Efficient resource utilization.
- Elasticity: the application's ability to adapt to unpredictable traffic patterns.
- Fault tolerance and recovery: either recover from a failure or mask it so that it is never visible to users.
- Simplified management of applications.
Kubernetes Scaling
- Cluster Scaling:
  - Adding or removing nodes in the cluster.
  - Adding nodes expands the cluster's total capacity (CPU, RAM, GPU, disk, ...).
  - Improves cluster availability.
- Pod Scaling:
  - Increasing or decreasing the number of pod replicas (horizontal scaling, HPA) or the resources allocated to each pod (vertical scaling, VPA).
  - Improves application availability and efficiency.
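As a rough illustration of how horizontal pod scaling chooses a replica count, the sketch below applies the ratio rule described in the Kubernetes HPA documentation: desired = ceil(current replicas * current metric / target metric). The function name, the default bounds, and the sample numbers are assumptions made for this example; the real controller also applies a tolerance and stabilization behaviour that this sketch leaves out.

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float,
                     min_replicas: int = 1,
                     max_replicas: int = 10) -> int:
    """Return the replica count suggested by the HPA ratio rule.

    min_replicas and max_replicas are illustrative defaults, not
    values taken from the notes above.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    # Scale the replica count by how far the metric is from its target.
    raw = math.ceil(current_replicas * (current_metric / target_metric))
    # Clamp the result to the configured bounds.
    return max(min_replicas, min(max_replicas, raw))


# 4 replicas averaging 90% CPU against a 60% target suggests scaling out.
print(desired_replicas(4, 90.0, 60.0))
```

When the metric is above the target the ratio exceeds 1 and the count grows; when it is below the target the count shrinks, always within the configured bounds.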
Manual HPA
- Manual horizontal scaling involves editing the replicas field of the Kubernetes Deployment manifest and applying the change.
- This is repetitive and error-prone, and a wrong value can have serious consequences.
- Scaling can change the behaviour of applications and their efficiency.
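The manual edit described above can be sketched as a small helper that changes spec.replicas on a manifest held as a Python dictionary. The Deployment named web, the helper set_replicas, and the MAX_REPLICAS guard are hypothetical and exist only to show how a basic check can catch a mistyped value before it is applied.

```python
import copy

# Hypothetical safety cap for this example; pick a value that fits the cluster.
MAX_REPLICAS = 50

# Minimal Deployment manifest as a dict (the name "web" is an assumption).
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "web"},
    "spec": {"replicas": 2},
}


def set_replicas(manifest: dict, replicas: int) -> dict:
    """Return a copy of the manifest with spec.replicas changed."""
    if isinstance(replicas, bool) or not isinstance(replicas, int):
        raise TypeError("replicas must be an integer")
    if not 0 <= replicas <= MAX_REPLICAS:
        raise ValueError(f"replicas must be between 0 and {MAX_REPLICAS}")
    # Work on a copy so the original manifest stays untouched.
    updated = copy.deepcopy(manifest)
    updated.setdefault("spec", {})["replicas"] = replicas
    return updated


scaled = set_replicas(deployment, 5)
print(scaled["spec"]["replicas"])
```

The same change can be made directly against a cluster with kubectl scale deployment web --replicas=5, which carries the same risk of a wrong value being typed by hand.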
Manual VPA
- When each replica or pod needs more resources, it must be scaled up vertically by increasing the resources allocated to every instance.
- This replaces all existing instances with new ones, which can cause downtime and reduce application availability.
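A minimal sketch of a manual vertical change, assuming a Deployment manifest held as a Python dictionary: the helper below rewrites the resource requests of every container, and a second helper shows that the pod template has changed. In a Deployment, a change to the pod template is what triggers a rollout that replaces the pods, whereas a change to replicas alone does not. The names set_requests and pod_template_changed, the container app, and the sample values are all hypothetical.

```python
import copy

# Minimal Deployment manifest (names and image are assumptions).
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "web"},
    "spec": {
        "replicas": 2,
        "template": {
            "spec": {"containers": [{"name": "app", "image": "example/app:1.0"}]}
        },
    },
}


def set_requests(manifest: dict, cpu: str, memory: str) -> dict:
    """Return a copy with the given requests set on every container."""
    updated = copy.deepcopy(manifest)
    for container in updated["spec"]["template"]["spec"]["containers"]:
        resources = container.setdefault("resources", {})
        resources["requests"] = {"cpu": cpu, "memory": memory}
    return updated


def pod_template_changed(old: dict, new: dict) -> bool:
    """True when the pod template differs, which implies pod replacement."""
    return old["spec"]["template"] != new["spec"]["template"]


resized = set_requests(deployment, "500m", "256Mi")
print(pod_template_changed(deployment, resized))
```

Checking for a template change before applying a manifest is one way to anticipate whether an edit will restart the workload.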