# Auto-scaling

## Introduction

Flexible auto-scaling is a significant advantage for Patchworks users - it means you don't pay for a predetermined capacity that might only be required during peak periods, such as Black Friday.&#x20;

Our flexible auto-scaling architecture gives you peace of mind by allowing you to start on your preferred plan, with the ability to exceed soft limits as needed. If you require more resources, you can transition to a higher tier seamlessly, or manage overages with ease.

Auto-scaling adjusts computing resources dynamically, based on demand - ensuring efficient, cost-effective resource management that's always aligned with real-time demand. The auto-scaling process breaks down into four stages:

<table data-card-size="large" data-view="cards"><thead><tr><th></th><th></th><th></th><th data-hidden data-card-cover data-type="files"></th></tr></thead><tbody><tr><td>The system continuously monitors traffic and resource usage (CPU, memory).</td><td></td><td></td><td><a href="https://2440044887-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FLYNcUBVQwSkOMG6KjZfz%2Fuploads%2FbAeDtdK1u7zkBMI2kkYO%2Fauto%20scale%20tile%201.png?alt=media&#x26;token=e6ed08ea-4bab-45bc-be30-7a4439505f8c">auto scale tile 1.png</a></td></tr><tr><td>When usage exceeds predefined thresholds, the auto-scaler triggers.</td><td></td><td></td><td><a href="https://2440044887-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FLYNcUBVQwSkOMG6KjZfz%2Fuploads%2FSrKlasBPvXrjAv2M37Up%2Fauto%20scale%20tile%202.png?alt=media&#x26;token=b67ea196-dac2-43af-bdbb-8b4bb7b8e023">auto scale tile 2.png</a></td></tr><tr><td>Additional resources/pods are deployed to handle the increased load.</td><td></td><td></td><td><a href="https://2440044887-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FLYNcUBVQwSkOMG6KjZfz%2Fuploads%2FSJ6A0nWMvJoxfOHapdr8%2Fauto%20scale%20tile%203.png?alt=media&#x26;token=de333352-526d-4508-a55a-9d7d97ba5d0b">auto scale tile 3.png</a></td></tr><tr><td>When demand drops, resources are reduced to optimise costs.</td><td></td><td></td><td><a href="https://2440044887-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FLYNcUBVQwSkOMG6KjZfz%2Fuploads%2F3rS004qfrwdGQzDdU9Ic%2Fauto%20scale%20tile%204.png?alt=media&#x26;token=97c53baa-1e70-47d1-b572-1b6edacfbf19">auto scale tile 4.png</a></td></tr></tbody></table>

## Kubernetes pods & nodes

At Patchworks, every process flow shape has its own microservice and its own Kubernetes pod(s). The diagram below shows how this works:

<div data-full-width="true"><figure><img src="https://2440044887-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FLYNcUBVQwSkOMG6KjZfz%2Fuploads%2FxZMFT5jrfznM4OayjIgd%2Farch%200.png?alt=media&#x26;token=34517bee-a080-4588-a0fd-bc22f2deaf5c" alt=""><figcaption></figcaption></figure></div>

### Kubernetes pod auto-scaling

Metrics for Kubernetes pods are scraped from Horizon using Prometheus. These metrics are queried by KEDA and - when the given threshold is reached - auto-scaling takes place. This process is shown below:&#x20;

<figure><img src="https://2440044887-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FLYNcUBVQwSkOMG6KjZfz%2Fuploads%2FknGoJmpZc05WWVxOyut9%2Farch%201.png?alt=media&#x26;token=d5d4f8fe-bc58-40b8-83f0-7be12419ca45" alt=""><figcaption></figcaption></figure>

1. The *Prometheus JSON exporter* scrapes process counts for each Core microservice from *Horizon*.
2. *Prometheus* scrapes metrics from the *JSON exporter*.
3. *KEDA* queries *Prometheus*, checking whether any Core microservice has reached the process threshold (set to **8**).
4. If the process threshold is reached, *KEDA* scales out the pods for that Core microservice.
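The KEDA side of this flow (steps 3 and 4) can be sketched as a `ScaledObject` with a Prometheus trigger. This is an illustrative sketch only, not the actual Patchworks configuration: the resource names, metric name, Prometheus address, and replica counts are all assumptions.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: core-microservice-scaler      # hypothetical name
spec:
  scaleTargetRef:
    name: core-microservice           # Deployment to scale (hypothetical)
  minReplicaCount: 1
  maxReplicaCount: 10                 # assumed upper bound
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring:9090   # assumed address
        query: sum(core_microservice_process_count)        # hypothetical metric from the JSON exporter
        threshold: "8"                # the process threshold described above
```

With a trigger like this, KEDA evaluates the query on a polling interval and adds pod replicas whenever the returned value divided by the threshold exceeds the current replica count.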

### Kubernetes node auto-scaling

The *Kubernetes cluster auto-scaler* monitors **pods** and decides when a **node** needs to be added. A **node** is added if a **pod** needs to be scheduled and there aren't sufficient resources to fulfil the request. This process is shown below:&#x20;

<figure><img src="https://2440044887-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FLYNcUBVQwSkOMG6KjZfz%2Fuploads%2FeHagWduRXa9HLbtl1Pe2%2Farch%202b.png?alt=media&#x26;token=f81c5475-4f4e-4f31-83a5-2a4f6f129f53" alt=""><figcaption></figcaption></figure>

1. The *Kubernetes scheduler* reads the resource request for a **pod** and decides if there are enough resources on an existing **node**. If *yes*, the **pod** is assigned to the **node**. If *no*, the **pod** is set to a `Pending` state and cannot start.
2. The *Kubernetes auto-scaler* detects that a **pod** cannot schedule due to a lack of resources.
3. The *Kubernetes auto-scaler* adds a new **node** to the **cluster node pool** - at which point, the *Kubernetes scheduler* detects the new **node** and schedules the **pod** on the new **node**.
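The resource request the scheduler reads in step 1 is declared on the pod spec itself. A minimal sketch follows; the names, image, and values are illustrative assumptions, not Patchworks' actual settings.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: core-microservice-pod                 # hypothetical name
spec:
  containers:
    - name: core-microservice
      image: example/core-microservice:1.0    # hypothetical image
      resources:
        requests:
          cpu: "500m"      # the scheduler uses these requests to pick a node
          memory: "256Mi"  # if no node can satisfy them, the pod stays Pending
```

If no node in the pool has 500m CPU and 256Mi of memory unreserved, the pod remains `Pending` with a `FailedScheduling` event - the signal the cluster auto-scaler watches for before adding a node.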
