Quickstart

Clean your first spreadsheet in under five minutes: upload a CSV, read the agent's verdicts on each duplicate group, and download a clean file.

This quickstart walks you through one full run: messy file in, clean file out. No account needed.

Open the tool

Go to the dedupe tool. You'll see a drop zone with two buttons: Choose CSV and Try a sample. If this is your first time, click Try a sample to run a built-in messy customer list and see the whole flow before you upload your own data.

Upload your CSV

Drag a CSV onto the drop zone, or click Choose CSV and pick a file. Any list with duplicate rows works — contacts, customers, members, event attendees, accounts. The free tier cleans up to 1,000 rows per run; larger files are a Pro feature.
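If you want to check whether a file fits the free tier before uploading, a local row count is enough. The sketch below is only an illustration, not part of the tool: the names `count_data_rows` and `fits_free_tier` are hypothetical, and it assumes the first line is a header that does not count toward the 1,000-row cap and that blank lines are ignored. The tool itself may count differently.

```python
import csv
import io

# Cap taken from this quickstart; Pro raises it to 100,000 rows per run.
FREE_TIER_ROW_CAP = 1_000


def count_data_rows(csv_text: str) -> int:
    """Count non-empty data rows, assuming the first row is a header."""
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    next(reader, None)  # skip the header row (assumption: one is present)
    return sum(1 for row in reader if row)  # blank lines parse as [] and are skipped


def fits_free_tier(csv_text: str) -> bool:
    """True when the file is within the free-tier cap under the assumptions above."""
    return count_data_rows(csv_text) <= FREE_TIER_ROW_CAP
```

For a file on disk, read it with `open(path, newline="", encoding="utf-8").read()` and pass the text to `fits_free_tier`.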

Watch it match

The matching engine works out the rules from your columns — no setup, no threshold tuning — and groups rows that refer to the same real-world entity. This usually takes a few seconds.

Read the agent's verdicts

Each duplicate group gets a verdict from the review agent:

Confirmed — the rows really are the same entity.
Uncertain — the evidence is thin or mixed; a human should look.
Rejected — the agent thinks the engine over-merged two different things.

Rejected and Uncertain groups are floated to the top under Needs review, each with a one-line explanation. Click any group to expand its rows and see which one is kept and which are dropped.
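The ordering described above can be pictured as a simple partition. This is a sketch under stated assumptions, not the tool's code: the `Group` shape and the `order_for_review` name are hypothetical, and the relative order of Rejected and Uncertain groups within Needs review is not specified here, so the sketch simply keeps each group's original position.

```python
from dataclasses import dataclass
from typing import List

# Verdicts that the quickstart says land under "Needs review".
NEEDS_REVIEW = {"Rejected", "Uncertain"}


@dataclass
class Group:
    """Hypothetical shape of one duplicate group as shown in the results."""
    verdict: str            # "Confirmed", "Uncertain", or "Rejected"
    explanation: str        # the agent's one-line explanation
    row_indices: List[int]  # positions of the group's rows in the uploaded file


def order_for_review(groups: List[Group]) -> List[Group]:
    """Put Rejected and Uncertain groups first, keeping relative order otherwise."""
    needs_review = [g for g in groups if g.verdict in NEEDS_REVIEW]
    confirmed = [g for g in groups if g.verdict not in NEEDS_REVIEW]
    return needs_review + confirmed
```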

Download the clean file

Click Download clean CSV. The file contains one survivor row per duplicate group plus every row that wasn't a duplicate. It is assembled in your browser and saved to your machine as yourfile-cleaned.csv.

What you get back

The cleaned CSV keeps your original columns and column order. For each duplicate group, the first row (in original file order) is kept as the survivor and the rest are dropped. Rows that weren't part of any group pass through untouched.
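To make the survivor rule concrete, here is a minimal sketch of it in Python. It is an illustration only: the tool assembles the file in your browser, and the function name `clean_csv` and the representation of groups as lists of zero-based row positions are assumptions made for this example.

```python
import csv
import io
from typing import List


def clean_csv(csv_text: str, groups: List[List[int]]) -> str:
    """Keep the first row of each duplicate group and every ungrouped row.

    `groups` holds zero-based positions of data rows (header excluded);
    this representation is hypothetical.
    """
    reader = csv.DictReader(io.StringIO(csv_text, newline=""))
    rows = list(reader)
    fieldnames = reader.fieldnames or []  # original columns, original order

    dropped = set()
    for group in groups:
        if not group:
            continue
        survivor = min(group)  # first row in original file order
        dropped.update(i for i in group if i != survivor)

    out = io.StringIO(newline="")
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for i, row in enumerate(rows):
        if i not in dropped:  # survivors and ungrouped rows pass through unchanged
            writer.writerow(row)
    return out.getvalue()
```

For example, with four rows where positions 0 and 2 form one group, the output keeps rows 0, 1, and 3 in their original order.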

Bigger files and repeat work

The free tier is built for a one-off clean of a file with up to 1,000 rows. If you run the same kind of list every cycle, or your files are bigger, Pro raises the cap to 100,000 rows per run and adds saved configs and the Excel add-in.

Next steps

How it works

The two-stage pipeline behind a run: deterministic matching, then the AI review.

Why fuzzy matching wins

The near-duplicates Excel's "Remove Duplicates" can never catch.
