Autoscaling Intro

  • Cost savings: only the resources that are actually required are provisioned, so capacity is not wasted.
  • Improved application availability.
  • Efficient resource utilization.
  • Elasticity: the ability of an application to adapt to unpredictable traffic patterns.
  • Fault tolerance and recovery: the application either recovers from a failure or is never visibly affected by it.
  • Simplified management of applications.
  • Cluster Scaling:

    • Changing the number of nodes in the cluster.
    • Increases the cluster’s total capacity (CPU, RAM, GPU, disk, …) when nodes are added.
    • Improves cluster availability.
  • Pod Scaling:

    • Increasing or decreasing the number of pod replicas with the Horizontal Pod Autoscaler (HPA), or the resources of each pod with the Vertical Pod Autoscaler (VPA).
    • Improves application availability and efficiency.
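
As a rough sketch of how the HPA mentioned above picks a replica count, the Kubernetes documentation describes a ratio formula: desired replicas = ceil(current replicas × current metric / target metric). The function and parameter names below are illustrative only, and the 0.1 tolerance is assumed from the documented default. The real controller also accounts for missing metrics, pod readiness, and stabilization windows, which this sketch ignores.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric, tolerance=0.1):
    """Replica count suggested by the documented HPA ratio formula.

    Simplified sketch: ignores pod readiness, missing metrics and
    stabilization windows that the real controller considers.
    """
    ratio = current_metric / target_metric
    # Inside the tolerance band the replica count is left unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


def clamp(replicas, min_replicas, max_replicas):
    """Keep the result inside the configured minimum and maximum."""
    return max(min_replicas, min(replicas, max_replicas))


# 4 replicas at 90% average CPU with a 60% target: scale up.
print(desired_replicas(4, 90, 60))
# 4 replicas at 20% average CPU with a 60% target: scale down.
print(desired_replicas(4, 20, 60))
# The same scale-up, capped by a maximum of 5 replicas.
print(clamp(desired_replicas(4, 90, 60), 2, 5))
```

Rounding up with ceil means the calculation never suggests fewer replicas than the ratio requires, and the clamp step mirrors the minimum and maximum replica bounds set on an HPA object.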
  • Manual scaling involves editing the replicas field of the Kubernetes Deployment manifest and applying the change.
  • This is repetitive and error-prone, and a wrong value can cause serious failures.
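
One way to reduce the risk of a wrong value is to change the replica count through a small helper that validates the input before the manifest is applied. The following is a minimal sketch, not an established tool: the manifest is a hypothetical Deployment named web, and max_replicas is an assumed safety limit invented for this example, not a Kubernetes setting.

```python
import copy
import json


def set_replicas(manifest, replicas, max_replicas=50):
    """Return a copy of a Deployment manifest with spec.replicas changed.

    max_replicas is a hypothetical safety limit for this sketch.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(replicas, bool) or not isinstance(replicas, int):
        raise TypeError("replicas must be an integer")
    if not 0 <= replicas <= max_replicas:
        raise ValueError(f"replicas must be between 0 and {max_replicas}")
    updated = copy.deepcopy(manifest)  # leave the original untouched
    updated["spec"]["replicas"] = replicas
    return updated


# Hypothetical Deployment manifest, written as a Python dict.
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "web"},
    "spec": {
        "replicas": 2,
        "selector": {"matchLabels": {"app": "web"}},
        "template": {
            "metadata": {"labels": {"app": "web"}},
            "spec": {"containers": [{"name": "web", "image": "nginx"}]},
        },
    },
}

scaled = set_replicas(deployment, 5)
print(json.dumps(scaled, indent=2))
```

The printed JSON could be saved to a file and applied with kubectl apply -f, which accepts JSON as well as YAML manifests. A typo such as 5000 instead of 5 is rejected before anything reaches the cluster.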
  • Scaling can change the behaviour of applications and their efficiency.
  • When each replica or pod needs more resources, it must be scaled up vertically by adding resources to every instance.
  • This replaces all existing instances with new ones, which can cause downtime and reduce application availability.
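
The reason a vertical change replaces the instances can be sketched as follows. A Deployment starts a rollout only when its pod template (spec.template) changes. A replica change leaves the template alone, while a new resource request modifies it. The manifest and the helper names below are hypothetical and exist only for illustration.

```python
import copy


def with_cpu_request(manifest, cpu):
    """Return a copy with every container's CPU request set to `cpu`."""
    updated = copy.deepcopy(manifest)
    for container in updated["spec"]["template"]["spec"]["containers"]:
        requests = container.setdefault("resources", {}).setdefault("requests", {})
        requests["cpu"] = cpu
    return updated


def triggers_rollout(old, new):
    """A Deployment replaces its pods only when spec.template differs."""
    return old["spec"]["template"] != new["spec"]["template"]


# Hypothetical Deployment with one container requesting 250m CPU.
base = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "web"},
    "spec": {
        "replicas": 3,
        "selector": {"matchLabels": {"app": "web"}},
        "template": {
            "metadata": {"labels": {"app": "web"}},
            "spec": {
                "containers": [
                    {
                        "name": "web",
                        "image": "nginx",
                        "resources": {"requests": {"cpu": "250m"}},
                    }
                ]
            },
        },
    },
}

# Horizontal change: only the replica count differs.
horizontal = copy.deepcopy(base)
horizontal["spec"]["replicas"] = 6

# Vertical change: each pod asks for more CPU.
vertical = with_cpu_request(base, "500m")

print(triggers_rollout(base, horizontal))  # False
print(triggers_rollout(base, vertical))    # True
```

In this sketch the horizontal change only adds pods next to the existing ones, whereas the vertical change alters the template, so every running pod has to be replaced by one built from the new template.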