# Importing & exporting de-dupe data

## Introduction

If required, you can [import existing data into a de-dupe pool](#importing-de-dupe-data). For example, you may have records that you know have been processed elsewhere and want to ensure that they aren't processed via Patchworks.

Conversely, you can [export de-dupe pool data to a CSV file](#exporting-a-de-dupe-data-pool), for use outside of Patchworks.

## Need to know

#### Export file format

De-dupe data exports are produced in CSV format, with fields delimited by a single comma only.

The exported file includes two columns, with the headers `value` and `entity_type_id`. For example:

{% code lineNumbers="true" %}

```
value,entity_type_id
testPerson1@wearepatchworks.com,47
testPerson2@wearepatchworks.com,47
testPerson3@wearepatchworks.com,47
```

{% endcode %}
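If you want to work with an exported file outside of Patchworks, the format above can be read with any standard CSV parser. The following is a minimal Python sketch, not an official Patchworks tool. The function name `read_dedupe_export` is hypothetical, and treating `entity_type_id` as an integer is an assumption based on the sample above.

```python
import csv
import io
from collections import Counter


def read_dedupe_export(text):
    """Parse export CSV text into a list of (value, entity_type_id) tuples."""
    reader = csv.DictReader(io.StringIO(text))
    # The export is documented as having exactly these two headers.
    if reader.fieldnames != ["value", "entity_type_id"]:
        raise ValueError(f"unexpected headers: {reader.fieldnames}")
    # Assumption: entity_type_id is always a whole number, as in the sample.
    return [(row["value"], int(row["entity_type_id"])) for row in reader]


sample = """value,entity_type_id
testPerson1@wearepatchworks.com,47
testPerson2@wearepatchworks.com,47
testPerson3@wearepatchworks.com,47
"""

rows = read_dedupe_export(sample)
# Count how many tracked values belong to each entity type.
counts = Counter(entity_type_id for _, entity_type_id in rows)
print(len(rows), dict(counts))
```

To read a downloaded file instead of the inline sample, pass the file's contents, for example `read_dedupe_export(open("export.csv", newline="").read())`, where `export.csv` is a placeholder file name.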

### Imports

#### Import approach

When de-dupe data values are imported:

* All records in the import file are added to the data pool as new items
* Any existing items in the data pool are unchecked and unchanged
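Because existing items are not checked during an import, you may prefer to remove values that are already tracked before you import a file. One possible approach, sketched below in Python, is to export the data pool first and compare your candidate file against it locally. This is an illustration only: the function names and sample values are hypothetical, and both inputs are assumed to use the `value,entity_type_id` layout.

```python
import csv
import io


def read_pairs(text):
    """Return the (value, entity_type_id) pairs found in de-dupe CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    return [(row["value"], row["entity_type_id"]) for row in reader]


def only_new(candidate_text, exported_text):
    """Keep candidate rows that are absent from the export, dropping repeats."""
    seen = set(read_pairs(exported_text))
    fresh = []
    for pair in read_pairs(candidate_text):
        if pair not in seen:
            seen.add(pair)  # also guards against repeats within the candidate file
            fresh.append(pair)
    return fresh


# Hypothetical sample data for illustration.
exported = "value,entity_type_id\na@example.com,47\nb@example.com,47\n"
candidate = (
    "value,entity_type_id\n"
    "b@example.com,47\n"
    "c@example.com,47\n"
    "c@example.com,47\n"
)

print(only_new(candidate, exported))
```

In this sample, only `c@example.com` is kept, and it is kept once.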

#### Import file format

To import de-dupe values, the import file must use the same format as [export files](#export-file-format), **with the same headers**. That is:

{% code lineNumbers="true" %}

```
value,entity_type_id
value,id
value,id
value,id
```

{% endcode %}

Where:

* The `value` is the key field value that you are matching on
* The `entity_type_id` is the internal Patchworks id for the entity type associated with the key field that you are using to match duplicates. This id must be present for every entry in your CSV file. You can download a list of ids by following the steps detailed [later on this page](#downloading-the-patchworks-entity-id-type-list).

{% hint style="warning" %}
Import files cannot exceed 5MB.
{% endhint %}
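If you generate import files programmatically, a small script can produce the required headers and check the size limit before you upload. The sketch below is one way to do this in Python and is not part of Patchworks. The function name, the sample values, and the reading of "5MB" as 5,000,000 bytes (the more conservative interpretation) are all assumptions.

```python
import csv
import io

# Assumption: "5MB" is read conservatively as 5,000,000 bytes.
MAX_IMPORT_BYTES = 5_000_000


def build_import_csv(values, entity_type_id):
    """Return CSV text with the required headers and one row per value."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["value", "entity_type_id"])
    for value in values:
        # Every row carries the entity type id, as the import format requires.
        writer.writerow([value, entity_type_id])
    text = buffer.getvalue()
    size = len(text.encode("utf-8"))
    if size > MAX_IMPORT_BYTES:
        raise ValueError(f"import file is {size} bytes, over the 5MB limit")
    return text


# Hypothetical key values; 47 is the entity type id used in the export sample.
csv_text = build_import_csv(["order-1001", "order-1002"], 47)
print(csv_text)
```

If the size check fails, one option is to split the values across several smaller files and import each in turn.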

## Exporting a de-dupe data pool

To export/download a de-dupe data pool, follow the steps below.

**Step 1**\
Log into the Patchworks dashboard, then select the **settings** option:

<div align="left"><figure><img src="https://2440044887-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FLYNcUBVQwSkOMG6KjZfz%2Fuploads%2FMQogzR2C31P77eCnaFjf%2Fexport%20cross%20ref%20lookup%203.png?alt=media&#x26;token=5ce09d36-e35e-40cb-9c2e-631ef1c3f82a" alt="" width="214"><figcaption></figcaption></figure></div>

...followed by the **data pools** option:

<div align="left"><figure><img src="https://2440044887-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FLYNcUBVQwSkOMG6KjZfz%2Fuploads%2FH1Q526JAVNVFsmvvvU49%2Fimport%20de-dupe%20data%201.png?alt=media&#x26;token=a0af16e6-2b28-4ac1-a081-9b25e8bb7f2f" alt=""><figcaption></figcaption></figure></div>

**Step 2**\
Click the name of the data pool that you want to export:

<figure><img src="https://2440044887-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FLYNcUBVQwSkOMG6KjZfz%2Fuploads%2FOglatceT4quHR3f76Vwb%2Fchoose%20data%20pool.png?alt=media&#x26;token=26ec2b6d-be6e-4fbc-aaa0-f3829b3809d7" alt=""><figcaption></figcaption></figure>

Alternatively, you can [create a new data pool](https://doc.wearepatchworks.com/product-documentation/~/changes/J8IbZkP6ASUZu2oBhGi2/process-flows/building-process-flows/process-flow-shapes/advanced-shapes/working-with-data-pools#adding-a-data-pool).

**Step 3**\
With the data pool in edit mode, move to the lower **tracked de-dupe data** panel and click the **download** button:

<div align="left"><figure><img src="https://2440044887-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FLYNcUBVQwSkOMG6KjZfz%2Fuploads%2FA2TtvZ3DoKI81o8UDImK%2Fdownload%20data%20pool%201.png?alt=media&#x26;token=274f3b6e-10a4-4004-b94f-ec65c9c7c5b5" alt=""><figcaption></figcaption></figure></div>

**Step 4**\
The download job is added to a queue and a confirmation message is displayed:

<div align="left"><figure><img src="https://2440044887-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FLYNcUBVQwSkOMG6KjZfz%2Fuploads%2FrchCBXyZOTEnm71XRd75%2Fdownload%20data%20pool%202.png?alt=media&#x26;token=294c713f-680d-4535-8ee7-38aef88a052d" alt=""><figcaption></figcaption></figure></div>

**Step 5**\
When your download is ready, you'll receive an email that includes a link to retrieve the file from the **file downloads** page. If you can't or don't want to use this link, you can access this page manually - click **data pools** in the breadcrumb trail at the top of the page:

<div align="left"><figure><img src="https://2440044887-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FLYNcUBVQwSkOMG6KjZfz%2Fuploads%2FjsfGUsC10rtwFrhziA7z%2Fdownload%20data%20pool%203.png?alt=media&#x26;token=d4704b1a-50c1-4974-9583-c2d80ebe0478" alt="" width="563"><figcaption></figcaption></figure></div>

...followed by the **settings** element option:

<div align="left"><figure><img src="https://2440044887-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FLYNcUBVQwSkOMG6KjZfz%2Fuploads%2FckGNMZ7NYerl9bioXtfp%2Fexport%20data%20pool%204.png?alt=media&#x26;token=3e3f2999-5a3b-446e-bd48-c86460c7eb41" alt="" width="563"><figcaption></figcaption></figure></div>

**Step 6**\
Select the **file downloads** option from the **settings** page:

<figure><img src="https://2440044887-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FLYNcUBVQwSkOMG6KjZfz%2Fuploads%2FwSifiMmM7wYx4dQ8L2KN%2Fexport%20data%20pool%205.png?alt=media&#x26;token=00a3c475-b1a3-4ebd-9874-7d44e75e0820" alt=""><figcaption></figcaption></figure>

**Step 7**\
On the **file downloads** page, you'll find any exports that have been completed for your company profile in the last hour:

<figure><img src="https://2440044887-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FLYNcUBVQwSkOMG6KjZfz%2Fuploads%2F6ZDDSxcHVJ3GQ4TUqdz5%2Fexport%20data%20pool%206.png?alt=media&#x26;token=1ae8dd81-65d7-41f7-85ea-076c5f5a1b3a" alt=""><figcaption></figcaption></figure>

{% hint style="info" %}
This list may include exports from different parts of the dashboard, not just data pools (for example, [run log](https://doc.wearepatchworks.com/product-documentation/~/changes/J8IbZkP6ASUZu2oBhGi2/process-flows/error-reporting-and-exception-handling/run-logs-and-queue/working-with-run-logs/downloading-run-logs) and [cross-reference lookup](https://doc.wearepatchworks.com/product-documentation/~/changes/J8IbZkP6ASUZu2oBhGi2/process-flows/cross-reference-lookups/importing-and-exporting-cross-reference-lookups) data exports are added here).
{% endhint %}

**Step 8**\
Click the **download** button for your job - the associated CSV file is saved to the default downloads folder for your browser.

{% hint style="info" %}
Download files are cleared after one hour. If you don't manage to download your file within this time, don't worry - just run the export again to create a new one.
{% endhint %}

## Downloading the Patchworks entity id type list

If you want to import data into a de-dupe data pool, you need to ensure that each record in your CSV file includes an [entity\_type\_id](#import-file-format). To find which id you should use, follow the steps below to download a current list.

**Step 1**\
Log into the Patchworks dashboard, then select the **settings** option:

<div align="left"><figure><img src="https://2440044887-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FLYNcUBVQwSkOMG6KjZfz%2Fuploads%2FMQogzR2C31P77eCnaFjf%2Fexport%20cross%20ref%20lookup%203.png?alt=media&#x26;token=5ce09d36-e35e-40cb-9c2e-631ef1c3f82a" alt="" width="214"><figcaption></figcaption></figure></div>

...followed by the **data pools** option:

<div align="left"><figure><img src="https://2440044887-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FLYNcUBVQwSkOMG6KjZfz%2Fuploads%2FH1Q526JAVNVFsmvvvU49%2Fimport%20de-dupe%20data%201.png?alt=media&#x26;token=a0af16e6-2b28-4ac1-a081-9b25e8bb7f2f" alt=""><figcaption></figcaption></figure></div>

**Step 2**\
Click the **download entity types** button at the top of the page:

<figure><img src="https://2440044887-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FLYNcUBVQwSkOMG6KjZfz%2Fuploads%2FVeN1WZg0v0hREXpd9lsY%2Fentity%20types%20list.png?alt=media&#x26;token=0190099d-22b0-4bfc-ba7a-9162c327c04d" alt=""><figcaption></figcaption></figure>

**Step 3**\
A CSV file is saved to the default downloads folder for your browser.
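Once you have the file, you can search it for the id that you need. The Python sketch below shows one way to do this. The column headers of the entity types file are not documented on this page, so the headers `id` and `name` and the sample rows are purely hypothetical. Check the headers in your downloaded file and pass the real names to the function.

```python
import csv
import io


def find_entity_type_ids(text, search, name_column, id_column):
    """Return ids of rows whose name contains `search` (case-insensitive)."""
    reader = csv.DictReader(io.StringIO(text))
    missing = {name_column, id_column} - set(reader.fieldnames or [])
    if missing:
        raise KeyError(f"columns not found in file: {sorted(missing)}")
    return [
        row[id_column]
        for row in reader
        if search.lower() in row[name_column].lower()
    ]


# Hypothetical content: the real file's headers and values may differ.
sample = "id,name\n1,Example entity A\n2,Example entity B\n"

print(find_entity_type_ids(sample, "entity b", name_column="name", id_column="id"))
```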

## Importing de-dupe data

To import data into a de-dupe data pool, follow the steps below.

**Step 1**\
Log into the Patchworks dashboard, then select the **settings** option:

<div align="left"><figure><img src="https://2440044887-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FLYNcUBVQwSkOMG6KjZfz%2Fuploads%2FMQogzR2C31P77eCnaFjf%2Fexport%20cross%20ref%20lookup%203.png?alt=media&#x26;token=5ce09d36-e35e-40cb-9c2e-631ef1c3f82a" alt="" width="214"><figcaption></figcaption></figure></div>

...followed by the **data pools** option:

<div align="left"><figure><img src="https://2440044887-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FLYNcUBVQwSkOMG6KjZfz%2Fuploads%2FH1Q526JAVNVFsmvvvU49%2Fimport%20de-dupe%20data%201.png?alt=media&#x26;token=a0af16e6-2b28-4ac1-a081-9b25e8bb7f2f" alt=""><figcaption></figcaption></figure></div>

**Step 2**\
If you want to import data into an existing data pool, click the name of the required data pool from the list:

<figure><img src="https://2440044887-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FLYNcUBVQwSkOMG6KjZfz%2Fuploads%2FOglatceT4quHR3f76Vwb%2Fchoose%20data%20pool.png?alt=media&#x26;token=26ec2b6d-be6e-4fbc-aaa0-f3829b3809d7" alt=""><figcaption></figcaption></figure>

Alternatively, you can [create a new data pool](https://doc.wearepatchworks.com/product-documentation/~/changes/J8IbZkP6ASUZu2oBhGi2/process-flows/building-process-flows/process-flow-shapes/advanced-shapes/working-with-data-pools#adding-a-data-pool).

**Step 3**\
Move to the lower **tracked de-dupe data** panel and click the **import** button:

<div align="left"><figure><img src="https://2440044887-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FLYNcUBVQwSkOMG6KjZfz%2Fuploads%2FYbwbOA8oeszKOW3XRYgT%2Fimport%20de-dupe%20data%203.png?alt=media&#x26;token=87e57e85-5124-4945-a3f8-469e823ec4c4" alt=""><figcaption></figcaption></figure></div>

**Step 4**\
Navigate to the CSV file that you want to import and select it:

<div align="left"><figure><img src="https://2440044887-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FLYNcUBVQwSkOMG6KjZfz%2Fuploads%2FT4LAYH66h3t8CvGfK8N0%2Fimport%20de-dupe%20data%204.png?alt=media&#x26;token=9830a392-b2d1-4cab-907a-6269aa9ec37b" alt="" width="375"><figcaption></figcaption></figure></div>

**Step 5**\
The file is uploaded and displayed as a button - click this button to complete the import:

<div align="left"><figure><img src="https://2440044887-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FLYNcUBVQwSkOMG6KjZfz%2Fuploads%2FQV8ppchh6bot6VZFTfdj%2Fimport%20de-dupe%20data%205.png?alt=media&#x26;token=5131c7f6-50cb-4bc7-8b15-643380f960f6" alt=""><figcaption></figcaption></figure></div>

**Step 6**\
The import is completed - all records in the import file are added to the data pool as new items, and existing items are left unchanged:

<figure><img src="https://2440044887-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FLYNcUBVQwSkOMG6KjZfz%2Fuploads%2FCyps45voL8fPfReionzG%2Fimport%20de-dupe%20data%206.png?alt=media&#x26;token=c8ea82ad-56ff-41ef-a149-a3f81e84b965" alt=""><figcaption></figcaption></figure>

{% hint style="info" %}
You may need to refresh the page to view the updated data pool.
{% endhint %}
