Cluster Proportional Autoscaler (CPA)

CPA scales essential cluster services such as DNS, the network controller, the metrics server and many more as the cluster grows. These and similar cluster-required services need to scale with the size of the cluster.
CPA monitors cluster size, measured as the node count, the CPU core count, or both. It can count only schedulable nodes, or include unschedulable nodes as well.
CPA then scales the replicas of the target service in proportion to that size.

**CPA scaling formula:**

desired replicas = base replicas + (nodes or CPUs) * scaling factor

The base replicas are the default number of replicas that exist even when the cluster has no nodes. As nodes are provisioned and become schedulable, more replicas are added according to the scaling factor.
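As a rough illustration of the formula above, the following Python sketch computes the desired replica count. The function name `desired_replicas` and the sample numbers are hypothetical, and rounding up to a whole replica is an assumption that the formula itself does not state.

```python
import math

def desired_replicas(base_replicas: int, cluster_size: int, scaling_factor: float) -> int:
    """Conceptual formula: base + size * factor, rounded up (assumption)."""
    return base_replicas + math.ceil(cluster_size * scaling_factor)

# Hypothetical values: 1 base replica plus 0.5 replicas per node.
for nodes in (0, 4, 5):
    print(nodes, "nodes ->", desired_replicas(1, nodes, 0.5), "replicas")
```

With no nodes only the base replica remains, and each additional pair of nodes adds one replica.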

CPA talks to the Kubernetes API, both to read the cluster size and to update the replica count of the target deployments.

Scaling Modes

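Ladder:
- Scaling happens in discrete steps, like the rungs of a ladder.
- The replica count increases or decreases when the cluster size crosses a configured threshold.
- Gives controlled, predictable scaling.
Linear:
- Scaling is continuous and proportional to the cluster size.
- Adjustments are gradual as the cluster size changes.
- Suits dynamic workloads with fluctuating demand, for better resource utilization.
When both a core-based and a node-based value are configured, the larger of the two resulting replica counts is taken.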
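The two modes can be sketched in Python as follows. This is a simplified illustration, not CPA's actual code. The function names are hypothetical. The ladder lookup assumes that a `[threshold, replicas]` pair applies once the cluster size reaches the threshold, and that the first pair applies below it. The linear rule assumes one replica per `cores_per_replica` cores or `nodes_per_replica` nodes, rounded up and clamped to a minimum and maximum.

```python
def ladder_replicas(size: int, steps: list[tuple[int, int]]) -> int:
    """Return the replicas of the highest step whose threshold is <= size."""
    ordered = sorted(steps)
    replicas = ordered[0][1]  # assumption: the first step applies below its threshold
    for threshold, step_replicas in ordered:
        if size >= threshold:
            replicas = step_replicas
    return replicas

def linear_replicas(cores: int, nodes: int, cores_per_replica: int,
                    nodes_per_replica: int, min_replicas: int, max_replicas: int) -> int:
    """Proportional replica count, rounded up and clamped to [min, max]."""
    by_cores = -(-cores // cores_per_replica)  # integer ceiling division
    by_nodes = -(-nodes // nodes_per_replica)
    wanted = max(by_cores, by_nodes)           # the larger value is taken
    return max(min_replicas, min(wanted, max_replicas))

# Hypothetical cluster: 100 cores across 5 nodes.
cores_steps = [(1, 1), (64, 3), (512, 5)]
nodes_steps = [(1, 1), (2, 2)]
print(max(ladder_replicas(100, cores_steps), ladder_replicas(5, nodes_steps)))
print(linear_replicas(100, 5, 2, 1, 1, 100))
```

The ladder result only changes when a threshold is crossed, while the linear result moves with every change in cores or nodes until it hits the configured bounds.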

Installation

helm repo add cluster-proportional-autoscaler https://kubernetes-sigs.github.io/cluster-proportional-autoscaler
helm repo update
helm install cluster-proportional-autoscaler cluster-proportional-autoscaler/cluster-proportional-autoscaler --values CHART_CONF_FILE -n NAMESPACE --create-namespace

The namespace is normally a Kubernetes system namespace, typically kube-system.
The install creates a pod cluster-proportional-autoscaler-*, a service account cluster-proportional-autoscaler, a cluster role cluster-proportional-autoscaler (for watching nodes), a role cluster-proportional-autoscaler, and a role binding cluster-proportional-autoscaler.

Configuration

Before installing the chart, add the configuration to the config section of CHART_CONF_FILE.
Provide either ladder or linear, not both at the same time.

config:
  # Set only one of ladder or linear. If one deployment needs ladder and another needs linear, install the chart twice (two CPA instances).
  ladder:
    coresToReplicas:
      - [ 1, 1 ]
      - [ 64, 3 ]
      - [ 512, 5 ]
      - [ 1024, 7 ]
      - [ 2048, 10 ]
      - [ 4096, 15 ]
    nodesToReplicas:
      - [ 1, 1 ]
      - [ 2, 2 ]
    includeUnschedulableNodes: true | false
  linear:
    coresPerReplica: 2
    nodesPerReplica: 1
    min: 1
    max: 100
    scaleDownLimit: VALUE # (0, 1] Limits the maximum fraction by which the autoscaler can decrease the number of replicas in one polling interval.
    dampeningPeriodSeconds: VALUE # It defines a cooldown period between autoscaling actions to prevent rapid and frequent scaling up and down (oscillations or flapping). During this period, scaling actions are suppressed, stabilizing the system.
    scaleUpLimit: VALUE # (0, 1] Limits the maximum fraction by which the autoscaler can increase the number of replicas in one polling interval.
    preventSinglePointFailure: true | false # When set to true, CPA ensures that the number of replicas is never just one if the cluster has more than one node.
    includeUnschedulableNodes: true | false
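
Because only one mode may be set per CPA instance, the values can be sanity-checked before installing. The sketch below works on an already-parsed mapping, with a plain Python dict standing in for the `config` section. The helper name `selected_mode` is hypothetical and is not part of CPA or Helm.

```python
def selected_mode(config: dict) -> str:
    """Return 'ladder' or 'linear'; raise if none or both are configured."""
    modes = [mode for mode in ("ladder", "linear") if config.get(mode)]
    if len(modes) != 1:
        found = ", ".join(modes) if modes else "none"
        raise ValueError(f"expected exactly one of ladder/linear, found: {found}")
    return modes[0]

# Stand-in for the parsed config section of CHART_CONF_FILE.
config = {
    "linear": {"coresPerReplica": 2, "nodesPerReplica": 1, "min": 1, "max": 100},
}
print(selected_mode(config))
```

A config that sets both modes, or neither, raises a `ValueError` here, which mirrors the rule that a second mode needs a second chart installation.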