# Auto-scaling

## Introduction

Flexible auto-scaling is a significant advantage for Patchworks users: you don't pay for a predetermined capacity that might only be required during peak periods, such as Black Friday.

Our flexible, auto-scaling architecture gives peace of mind by allowing you to start on your preferred plan, with the ability to exceed soft limits as needed. If you require more resources, you can transition to a higher tier seamlessly, or manage overages with ease.

Auto-scaling adjusts computing resources dynamically so that capacity stays aligned with real-time demand, keeping resource management efficient and cost-effective. The auto-scaling process breaks down into four stages:

<table data-card-size="large" data-view="cards"><thead><tr><th></th><th></th><th></th><th data-hidden data-card-cover data-type="files"></th></tr></thead><tbody><tr><td>The system continuously monitors traffic and resource usage (CPU, memory).</td><td></td><td></td><td><a href="/files/mTCXJRh2j7VhkUg2N1xz">/files/mTCXJRh2j7VhkUg2N1xz</a></td></tr><tr><td>When usage exceeds predefined thresholds, the auto-scaler triggers.</td><td></td><td></td><td><a href="/files/P6nO83UOM58kKNiu9uzi">/files/P6nO83UOM58kKNiu9uzi</a></td></tr><tr><td>Additional resources/pods are deployed to handle the increased load.</td><td></td><td></td><td><a href="/files/pVXjARoxfSiHXFSiLYbU">/files/pVXjARoxfSiHXFSiLYbU</a></td></tr><tr><td>When demand drops, resources are reduced to optimise costs.</td><td></td><td></td><td><a href="/files/DKiL5xZLHVjWJkRx7FDp">/files/DKiL5xZLHVjWJkRx7FDp</a></td></tr></tbody></table>
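
The four stages can be sketched as a simple threshold loop. This is an illustrative model only, not the Patchworks implementation: the `ScalingPolicy` class, its threshold values, and the replica limits are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    # All values are illustrative assumptions, not Patchworks settings.
    scale_up_threshold: float = 0.80    # utilisation above this adds a replica
    scale_down_threshold: float = 0.30  # utilisation below this removes a replica
    min_replicas: int = 1
    max_replicas: int = 10


def next_replica_count(current: int, utilisation: float, policy: ScalingPolicy) -> int:
    """Decide the replica count after one monitoring sample."""
    if utilisation > policy.scale_up_threshold:
        # Stages 2 and 3: threshold exceeded, so deploy more capacity.
        return min(current + 1, policy.max_replicas)
    if utilisation < policy.scale_down_threshold:
        # Stage 4: demand has dropped, so release capacity.
        return max(current - 1, policy.min_replicas)
    return current  # usage is within bounds, nothing to do


def simulate(samples: list[float], policy: ScalingPolicy) -> list[int]:
    """Stage 1: feed a series of utilisation samples through the policy."""
    replicas = policy.min_replicas
    history = []
    for utilisation in samples:
        replicas = next_replica_count(replicas, utilisation, policy)
        history.append(replicas)
    return history
```

For example, `simulate([0.5, 0.9, 0.95, 0.5, 0.1, 0.1], ScalingPolicy())` returns `[1, 2, 3, 3, 2, 1]`: capacity grows while usage is high and shrinks again once demand falls.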

## Kubernetes pods & nodes

At Patchworks, every process flow shape has its own microservice and its own Kubernetes pod(s). The diagram below shows how this works:

<div data-full-width="true"><figure><img src="/files/GAlObYIO4culr1ceXo93" alt=""><figcaption></figcaption></figure></div>

### Kubernetes pod auto-scaling

Metrics for Kubernetes pods are scraped from Horizon using Prometheus. KEDA queries these metrics and, when the given threshold is reached, auto-scaling takes place. This process is shown below:

<figure><img src="/files/FfJv98qa6FzLRXS1jULr" alt=""><figcaption></figcaption></figure>

1. The *Prometheus JSON exporter* scrapes *Horizon* for the process count of each Core microservice.
2. *Prometheus* scrapes metrics from the *JSON exporter*.
3. *KEDA* queries *Prometheus*, checking if any Core microservice has reached the process threshold (set to **8**).
4. If the process threshold is reached, *KEDA* scales the Core microservice pod.
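
As a rough illustration of steps 3 and 4, the sketch below turns a process count into a desired pod count. It assumes the threshold of 8 acts as a per-pod target, which is how the Kubernetes Horizontal Pod Autoscaler treats average-value targets; this page does not state how the Patchworks scaler is configured. The `desired_replicas` function and the `min_replicas` and `max_replicas` limits are hypothetical.

```python
import math

PROCESS_THRESHOLD = 8  # process threshold quoted in step 3


def desired_replicas(
    process_count: int,
    threshold: int = PROCESS_THRESHOLD,
    min_replicas: int = 1,   # hypothetical lower bound
    max_replicas: int = 20,  # hypothetical upper bound
) -> int:
    """Return how many pods are needed so each handles at most `threshold` processes."""
    wanted = math.ceil(process_count / threshold)
    # Clamp to the configured bounds so the service never scales to zero
    # or beyond the allowed maximum.
    return max(min_replicas, min(wanted, max_replicas))
```

Under these assumptions, a microservice running 8 processes stays on one pod, while 9 processes would call for a second pod.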

### Kubernetes node auto-scaling

The *Kubernetes cluster auto-scaler* monitors **pods** and decides when a **node** needs to be added. A **node** is added if a **pod** needs to be scheduled and there aren't sufficient resources to fulfil the request. This process is shown below:

<figure><img src="/files/DIoWAkxoq7wW3iSe9Fw8" alt=""><figcaption></figcaption></figure>

1. The *Kubernetes scheduler* reads the resource request for a **pod** and decides whether there are enough resources on an existing **node**. If *yes*, the **pod** is assigned to the **node**. If *no*, the **pod** is set to a `pending` state and cannot start.
2. The *Kubernetes cluster auto-scaler* detects that a **pod** cannot be scheduled due to a lack of resources.
3. The *Kubernetes cluster auto-scaler* adds a new **node** to the **cluster node pool**. At this point, the *Kubernetes scheduler* detects the new **node** and schedules the **pod** on it.
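
The interaction between the scheduler and the cluster auto-scaler can be modelled in a few lines. This is a simplified sketch, not real Kubernetes code: it considers CPU requests only, and the `Pod`, `Node`, `schedule`, and `add_nodes_for_pending` names, along with the 4-core node size, are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    name: str
    cpu_request: float      # cores requested by the pod
    state: str = "pending"


@dataclass
class Node:
    name: str
    cpu_capacity: float     # cores available on the node
    pods: list[Pod] = field(default_factory=list)

    def free_cpu(self) -> float:
        return self.cpu_capacity - sum(p.cpu_request for p in self.pods)


def schedule(pod: Pod, nodes: list[Node]) -> bool:
    """Step 1: place the pod on the first node with enough free CPU."""
    for node in nodes:
        if node.free_cpu() >= pod.cpu_request:
            node.pods.append(pod)
            pod.state = "running"
            return True
    pod.state = "pending"   # no node can take it, so it cannot start
    return False


def add_nodes_for_pending(pods: list[Pod], nodes: list[Node], node_cpu: float = 4.0) -> None:
    """Steps 2 and 3: add a node for each pod that is stuck in pending."""
    for pod in pods:
        if pod.state != "pending" or schedule(pod, nodes):
            continue
        if pod.cpu_request > node_cpu:
            continue        # a new node of this size would not help
        nodes.append(Node(f"node-{len(nodes) + 1}", node_cpu))
        schedule(pod, nodes)  # the scheduler now finds room on the new node
```

With one 4-core node, a pod requesting 3 cores is scheduled immediately, and a second pod requesting 2 cores stays `pending` until `add_nodes_for_pending` grows the pool and it lands on the new node.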

