EKS - Horizontal Pod Autoscaling (HPA) ¶

Step-01: Introduction ¶

What is Horizontal Pod Autoscaling?
How does HPA work?
How is HPA configured?

Step-02: Install Metrics Server ¶

# Verify whether Metrics Server is already installed
kubectl -n kube-system get deployment/metrics-server

# Install Metrics Server
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml

# Verify
kubectl get deployment metrics-server -n kube-system
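
A Deployment that exists is not necessarily serving metrics yet, and HPA cannot compute CPU utilization until it is. The sketch below goes one step further than the verify command above; check_metrics_server is a hypothetical helper name chosen for this example, and the 120s timeout is an arbitrary assumption.

```shell
# Sketch: confirm that Metrics Server is rolled out and actually serving metrics.
# check_metrics_server is a hypothetical helper name, not part of kubectl.
check_metrics_server() {
  # Block until the Deployment reports its replicas as available (or time out).
  kubectl -n kube-system rollout status deployment/metrics-server --timeout=120s || return 1
  # The metrics API is registered as an APIService; AVAILABLE should read True.
  kubectl get apiservice v1beta1.metrics.k8s.io || return 1
  # If this prints CPU and memory figures per node, HPA can read them too.
  kubectl top nodes
}
```

Run it as check_metrics_server. If kubectl top nodes reports that metrics are not available yet, waiting a minute after installation and retrying is usually enough.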

Step-03: Review and Deploy our Application ¶

# Deploy
kubectl apply -f kube-manifests/

# List Pods, Deploy & Service
kubectl get pod,svc,deploy

# Access Application (only if the worker nodes are in a public subnet)
kubectl get nodes -o wide
http://<Worker-Node-Public-IP>:31231

Kubernetes Manifests ¶

apiVersion: apps/v1
kind: Deployment
metadata:
  name: hpa-demo-deployment
  labels:
    app: hpa-nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hpa-nginx
  template:
    metadata:
      labels:
        app: hpa-nginx
    spec:
      containers:
      - name: hpa-nginx
        image: stacksimplify/kubenginx:1.0.0
        ports:
        - containerPort: 80
        resources:
          requests:
            memory: "128Mi"
            cpu: "100m"
          limits:
            memory: "500Mi"
            cpu: "200m"          
---
apiVersion: v1
kind: Service
metadata:
  name: hpa-demo-service-nginx
  labels:
    app: hpa-nginx
spec:
  type: NodePort
  selector:
    app: hpa-nginx
  ports:
  - port: 80
    targetPort: 80
    nodePort: 31231

Step-04: Create a Horizontal Pod Autoscaler resource for the "hpa-demo-deployment" ¶

The kubectl autoscale command below creates an autoscaler that targets 50 percent average CPU utilization for the deployment, with a minimum of one pod and a maximum of ten pods. Utilization is measured relative to each pod's CPU request (100m in the manifest above).
When the average CPU utilization is below 50 percent, the autoscaler tries to reduce the number of pods in the deployment, down to a minimum of one.

When the average CPU utilization is above 50 percent, the autoscaler tries to increase the number of pods in the deployment, up to a maximum of ten.

# Template
kubectl autoscale deployment <deployment-name> --cpu-percent=50 --min=1 --max=10

# Replace <deployment-name> with our deployment name
kubectl autoscale deployment hpa-demo-deployment --cpu-percent=50 --min=1 --max=10

# Describe HPA
kubectl describe hpa/hpa-demo-deployment 

# List HPA
kubectl get horizontalpodautoscaler.autoscaling/hpa-demo-deployment
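
To make the 50 percent target concrete, the Kubernetes documentation gives the core HPA calculation as desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue). The following is a minimal sketch of that arithmetic only; desired_replicas is a hypothetical helper, and it ignores the controller's tolerance band and its handling of pods that are not ready.

```shell
# desired_replicas CURRENT_REPLICAS CURRENT_UTIL_PERCENT TARGET_UTIL_PERCENT [MIN] [MAX]
# Hypothetical helper illustrating the HPA formula; not part of kubectl.
# The function body runs in a subshell, so its variables stay local.
desired_replicas() (
  current=$1
  util=$2
  target=$3
  min=${4:-1}
  max=${5:-10}
  # Integer ceiling of current * util / target.
  desired=$(( (current * util + target - 1) / target ))
  # Clamp to the --min and --max given to kubectl autoscale.
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
)

desired_replicas 1 200 50   # 1 pod at 200% of its CPU request -> 4
desired_replicas 3 90 50    # 3 pods averaging 90% -> 6
desired_replicas 4 10 50    # 4 pods averaging 10% -> 1
desired_replicas 10 200 50  # would be 40, capped at the maximum -> 10
```

Because the manifest sets a CPU limit of 200m against a request of 100m, a single pod can report up to roughly 200 percent utilization, which is why one busy pod can trigger a jump to several replicas.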

Step-05: Create the load & Verify how HPA is working ¶

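# Generate Load
# (kubectl run creates a plain Pod by default since v1.18; the deprecated --generator=run-pod/v1 flag is only needed on older kubectl versions)
kubectl run apache-bench -i --tty --rm --image=httpd -- ab -n 500000 -c 1000 http://hpa-demo-service-nginx.default.svc.cluster.local/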

# List all HPA
kubectl get hpa

# List specific HPA
kubectl get hpa hpa-demo-deployment 

# Describe HPA
kubectl describe hpa/hpa-demo-deployment 

# List Pods
kubectl get pods
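
The load generator keeps its terminal busy, so the checks above are easiest to run from a second terminal. A small sketch that shows the HPA once and then streams replica changes; watch_scaling is a hypothetical helper name, and it assumes the HPA and the Deployment share the name hpa-demo-deployment, as they do when the HPA is created with kubectl autoscale.

```shell
# watch_scaling is a hypothetical helper name; the kubectl calls are standard.
watch_scaling() {
  name="${1:-hpa-demo-deployment}"
  # TARGETS shows current/target CPU utilization; REPLICAS shows the pod count.
  kubectl get hpa "$name"
  # Stream replica-count changes of the Deployment until interrupted with Ctrl+C.
  kubectl get deployment "$name" --watch
}
```

Run watch_scaling while the load test is active. The TARGETS column should climb above 50% before the replica count starts to increase.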

Step-06: Cooldown / Scaledown ¶

The default cooldown period is 5 minutes.
Once the average CPU utilization of the pods drops below 50%, HPA starts terminating pods until it reaches the configured minimum of 1 pod.
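
The 5 minute cooldown corresponds to the scale-down stabilization window: the controller remembers the replica counts it recommended during the window and acts on the highest of them, so a brief dip in load does not remove pods right away. A minimal sketch of that selection, assuming the recent recommendations are passed in as arguments; stabilized_replicas is a hypothetical helper, not a kubectl feature.

```shell
# stabilized_replicas RECOMMENDATION...
# Prints the highest of the recommendations made within the stabilization
# window, which is the value a scale-down is allowed to settle on.
# The function body runs in a subshell, so its variables stay local.
stabilized_replicas() (
  highest=0
  for r in "$@"; do
    if [ "$r" -gt "$highest" ]; then highest=$r; fi
  done
  echo "$highest"
)

stabilized_replicas 6 4 2 1   # load just stopped; 6 is still in the window -> 6
stabilized_replicas 1 1 1     # five minutes later only low values remain -> 1
```

This is why kubectl get hpa can show low CPU utilization for several minutes while the replica count stays unchanged.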

Step-07: Clean-Up ¶

# Delete HPA
kubectl delete hpa hpa-demo-deployment

# Delete Deployment & Service
kubectl delete -f kube-manifests/

AWS EKS - Elastic Kubernetes Service - Masterclass ¶

Step-08: Imperative vs Declarative for HPA ¶

From Kubernetes v1.18 onwards, the autoscaling/v2beta2 API supports configurable scaling behavior: HPA scaling policies can be defined declaratively in YAML through the behavior field of the HPA object.

Behaviors are specified separately for scaling up and scaling down, in the scaleUp and scaleDown sections under the behavior field.

behavior:
  scaleDown:
    stabilizationWindowSeconds: 300
    policies:
    - type: Percent
      value: 100
      periodSeconds: 15
  scaleUp:
    stabilizationWindowSeconds: 0
    policies:
    - type: Percent
      value: 100
      periodSeconds: 15
    - type: Pods
      value: 4
      periodSeconds: 15
    selectPolicy: Max
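
As a worked example of the scaleUp section above: each policy limits how much the replica count may grow within its periodSeconds, and selectPolicy: Max picks whichever policy allows the larger change. The sketch below is a simplification under that reading; scale_up_limit is a hypothetical helper, and the result would still be capped by the HPA's maximum replica count.

```shell
# scale_up_limit CURRENT_REPLICAS [PERCENT] [PODS]
# Upper bound on replicas after one 15 second period with selectPolicy: Max.
# Defaults mirror the policies above: Percent 100 and Pods 4.
# The function body runs in a subshell, so its variables stay local.
scale_up_limit() (
  current=$1
  percent=${2:-100}
  pods=${3:-4}
  # Pods allowed by the Percent policy, rounded up.
  by_percent=$(( (current * percent + 99) / 100 ))
  # selectPolicy: Max takes the policy that permits the larger increase.
  if [ "$by_percent" -gt "$pods" ]; then step=$by_percent; else step=$pods; fi
  echo $(( current + step ))
)

scale_up_limit 1    # Pods policy wins: 1 + 4 -> 5
scale_up_limit 5    # Percent policy wins: 5 + 5 -> 10
scale_up_limit 10   # 10 + 10 -> 20, before the maximum replica cap
```

With few replicas the Pods policy dominates and allows a fast initial ramp, while with many replicas the Percent policy allows doubling every period.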

Reference: Select v1.18 from the version menu in the top right corner of the Kubernetes website to see the v1.18 documentation.
https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/
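
For comparison with the imperative kubectl autoscale command from Step-04, here is a sketch of what an equivalent declarative HPA could look like with a behavior section attached. The hpa_manifest helper is hypothetical, and the manifest is an illustration assembled from the values used in this tutorial, not a file shipped in kube-manifests/. It defaults to autoscaling/v2beta2 as described above; on newer clusters, where that API version is no longer served, pass autoscaling/v2 instead.

```shell
# hpa_manifest [API_VERSION]
# Prints a HorizontalPodAutoscaler manifest for hpa-demo-deployment.
# Hypothetical helper; the default API version matches the v1.18 text above.
hpa_manifest() {
  api_version="${1:-autoscaling/v2beta2}"
  cat <<EOF
apiVersion: ${api_version}
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-demo-deployment
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hpa-demo-deployment
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15
EOF
}
```

To apply it, pipe the output to kubectl: hpa_manifest | kubectl apply -f - (or hpa_manifest autoscaling/v2 | kubectl apply -f - on newer clusters). Delete the HPA created in Step-04 first if it still exists, since both use the same name.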

References ¶

Metrics Server Releases ¶

https://github.com/kubernetes-sigs/metrics-server/releases

Horizontal Pod Autoscaling - Scale based on many types of metrics ¶

https://v1-16.docs.kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/