# Adding & configuring a de-dupe shape

## Introduction

The **de-dupe** shape is used to identify and then remove duplicate entries from an incoming payload. For more background information please see our [De-dupe shape](/product-documentation/process-flows/building-process-flows/process-flow-shapes/advanced-shapes/de-dupe-shape.md) page.

## Need to know

{% hint style="danger" %}
Tracked de-dupe data is retained for 90 days after it's added to a data pool.
{% endhint %}

{% hint style="info" %}
Currently, the de-dupe shape supports JSON payloads.
{% endhint %}

## Adding a de-dupe shape

To add and configure a new **de-dupe** shape, follow the steps below.

**Step 1**\
In your process flow, add the **de-dupe** shape in the usual way:

<div align="left"><figure><img src="/files/h8FINyX8ND7u8oRNOctJ" alt="" width="375"><figcaption></figcaption></figure></div>

**Step 2**\
Select a **source integration** and **endpoint** to determine where the incoming payload to be de-duped originates - for example:

<div align="left"><figure><img src="/files/sE3aUoqVDOgJf8RSD05H" alt="" width="352"><figcaption></figcaption></figure></div>

{% hint style="info" %}
If your incoming data arrives via a [manual payload](/product-documentation/process-flows/building-process-flows/process-flow-shapes/standard-shapes/manual-payload-shape.md), [API request](broken://pages/EVn1AesQJtjyoIw7p0HZ), or [webhook](/product-documentation/process-flows/building-process-flows/process-flow-shapes/standard-shapes/trigger-shape/trigger-shape-webhook.md), you can remove any default source instance and endpoint selections:

![](/files/vR8caSjQJ2vjeQOlrHep)
{% endhint %}

**Step 3**\
Move down to the **behaviour** field and select the required option.

{% hint style="info" %}
For more information about these options please see our [De-dupe shape behaviour](/product-documentation/process-flows/building-process-flows/process-flow-shapes/advanced-shapes/de-dupe-shape.md#behaviour) section.
{% endhint %}

**Step 4**\
Move down to the **data pool** field and select the required [data pool](/product-documentation/process-flows/building-process-flows/process-flow-shapes/advanced-shapes/de-dupe-shape.md#data-pools).

<div align="left"><figure><img src="/files/X9pt9b3haIS0b7seVmvc" alt="" width="357"><figcaption></figcaption></figure></div>

{% hint style="info" %}
If necessary, you can create a data pool 'on the fly' using the **create data pool** option. For more information please see [Adding a new data pool via the de-dupe shape](/product-documentation/process-flows/building-process-flows/process-flow-shapes/advanced-shapes/de-dupe-shape/working-with-data-pools.md#adding-a-new-data-pool-via-the-de-dupe-shape).
{% endhint %}

**Step 5**\
In the **key field**, select/enter the data field to be used for matching duplicate records. How you do this depends on how the incoming data is being received - please see the options below:

<details>

<summary><img src="/files/mjTTZzyEQN7pfYDU3Ixl" alt="" data-size="line"> I want to choose a field from the schema associated with a connector endpoint</summary>

If the incoming payload for the de-dupe shape is received from a connection shape, you'll find that the de-dupe shape settings default to the same connection instance and endpoint. In this case, the `key field` allows you to navigate the schema that's associated with the endpoint, and select the required data item:

<img src="/files/Q3xGWbAwv8qtXGPwq6nY" alt="" data-size="original">

</details>

<details>

<summary><img src="/files/mjTTZzyEQN7pfYDU3Ixl" alt="" data-size="line"> I want to specify a field manually</summary>

If the incoming payload for the de-dupe shape is received via [manual payload](/product-documentation/process-flows/building-process-flows/process-flow-shapes/standard-shapes/manual-payload-shape.md), [API request](broken://pages/EVn1AesQJtjyoIw7p0HZ), or [webhook](/product-documentation/process-flows/building-process-flows/process-flow-shapes/standard-shapes/trigger-shape/trigger-shape-webhook.md), there is no associated instance/endpoint and therefore no known data schema. In this case, enter the `key field` value manually as the dot notation path to the required field in your data, for example `*.customerID`:

<img src="/files/Ltq4QJfI1n6Wf7WsuRvw" alt="" data-size="original">
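
To illustrate what a path such as `*.customerID` points at, here is a minimal Python sketch. It is not how the de-dupe shape is implemented: the `resolve_path` helper and the sample payload are hypothetical, and the sketch assumes that `*` stands for each item in an array.

```python
import json


def resolve_path(data, path):
    """Return every value matched by a dot notation path.

    Hypothetical helper for illustration only. Assumes '*' matches
    each item of a list and any other segment is an object key.
    """
    current = [data]
    for part in path.split("."):
        matched = []
        for node in current:
            if part == "*" and isinstance(node, list):
                matched.extend(node)
            elif isinstance(node, dict) and part in node:
                matched.append(node[part])
        current = matched
    return current


# Sample incoming JSON payload (made up for this example).
payload = json.loads("""
[
  {"customerID": "C-1001", "email": "ana@example.com"},
  {"customerID": "C-1002", "email": "ben@example.com"},
  {"customerID": "C-1001", "email": "ana@example.com"}
]
""")

key_values = resolve_path(payload, "*.customerID")
print(key_values)  # ['C-1001', 'C-1002', 'C-1001']
```

In this sample, the first and third records share the same `customerID`, so they would be candidates for matching as duplicates.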

</details>

<details>

<summary><img src="/files/mjTTZzyEQN7pfYDU3Ixl" alt="" data-size="line"> I want to use variables to define a dynamic key field</summary>

If the incoming payload for the de-dupe shape is received via [manual payload](/product-documentation/process-flows/building-process-flows/process-flow-shapes/standard-shapes/manual-payload-shape.md), [API request](broken://pages/EVn1AesQJtjyoIw7p0HZ), or [webhook](/product-documentation/process-flows/building-process-flows/process-flow-shapes/standard-shapes/trigger-shape/trigger-shape-webhook.md), you can generate the key field value dynamically using [payload](/product-documentation/process-flows/building-process-flows/dynamic-variables/payload-variables.md), [flow](/product-documentation/process-flows/building-process-flows/dynamic-variables/flow-variables.md) and [metadata](/product-documentation/process-flows/building-process-flows/dynamic-variables/metadata-variables.md) variables.

<img src="/files/syNjVGNiHPyQI5H59IMO" alt="" data-size="original">

Any combination of [payload](/product-documentation/process-flows/building-process-flows/dynamic-variables/payload-variables.md), [flow](/product-documentation/process-flows/building-process-flows/dynamic-variables/flow-variables.md) and [metadata](/product-documentation/process-flows/building-process-flows/dynamic-variables/metadata-variables.md) variables can be used to form the key field value. For more information please see our [Dynamic variables](/product-documentation/process-flows/building-process-flows/dynamic-variables.md) section.

</details>

{% hint style="info" %}
The selection that you make here determines how the payload is adjusted when duplicate data is removed. For more information please see [How duplicate data is handled](/product-documentation/process-flows/building-process-flows/process-flow-shapes/advanced-shapes/de-dupe-shape.md#how-duplicate-data-is-handled).
{% endhint %}
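
As a rough mental model of key-based de-duplication against a data pool, the Python sketch below may help. It is an assumption-laden illustration, not the shape's actual implementation: the `de_dupe` function and the dictionary-based pool are hypothetical, and it assumes that records whose key is already tracked are dropped while new keys are tracked. The real outcome depends on the behaviour option you selected. The 90-day retention mirrors the note in the Need to know section.

```python
from datetime import datetime, timedelta, timezone

# Tracked keys are retained for 90 days after being added to the pool.
RETENTION = timedelta(days=90)


def de_dupe(records, key, pool, now=None):
    """Drop records whose key value is already tracked in the pool.

    Hypothetical sketch: 'pool' is a dict mapping key value -> time added.
    """
    now = now or datetime.now(timezone.utc)

    # Forget tracked keys that are older than the retention period.
    expired = [k for k, added in pool.items() if now - added > RETENTION]
    for k in expired:
        del pool[k]

    unique = []
    for record in records:
        value = record[key]
        if value in pool:
            continue  # Already tracked, so treat as a duplicate.
        pool[value] = now  # Track the new key value.
        unique.append(record)
    return unique


pool = {}
start = datetime(2024, 1, 1, tzinfo=timezone.utc)
records = [
    {"customerID": "C-1001"},
    {"customerID": "C-1002"},
    {"customerID": "C-1001"},
]

first_run = de_dupe(records, "customerID", pool, now=start)
second_run = de_dupe(records, "customerID", pool, now=start + timedelta(days=1))
print(len(first_run), len(second_run))  # 2 0
```

In this sketch the second run returns nothing because both key values were tracked by the first run and have not yet expired.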

**Step 6**\
Select the payload format:

<div align="left"><figure><img src="/files/gzHXAtQVgIPkxk8UoolM" alt="" width="359"><figcaption></figcaption></figure></div>

**Step 7**\
Save the shape.
