Duplicates

General Ledger. Accounting use-case

{% embed url="https://owl-analytics.com/general-ledger" %}

Whether you're looking for a fuzzy-match percentage or a single-client cleanup, Owl's duplicate detection can help you sort and rank the likelihood of duplicate data.

-f file:///home/ec2-user/single_customer.csv \
-d "," \
-ds customers \
-rd 2018-01-08 \
-dupe \
-dupenocase \
-depth 4

User Table has duplicate user entry

Carrisa Rimmer vs Carrissa Rimer
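To make the idea of a match percentage concrete, here is a minimal sketch that scores the two names above with Python's standard-library `difflib`. This is not Owl's matching algorithm, which this page does not describe; `SequenceMatcher` is swapped in purely for illustration, and the helper name `match_percent` is hypothetical. Lowercasing both strings before comparing is an assumption about what a case-insensitive comparison (in the spirit of `-dupenocase`) would do.

```python
from difflib import SequenceMatcher


def match_percent(left: str, right: str, ignore_case: bool = True) -> float:
    """Return a 0-100 similarity score for two strings.

    Illustrative only: uses difflib's ratio, not Owl's scoring.
    """
    if ignore_case:
        # Assumption: case-insensitive matching folds both sides to lowercase.
        left, right = left.lower(), right.lower()
    return 100.0 * SequenceMatcher(None, left, right).ratio()


score = match_percent("Carrisa Rimmer", "Carrissa Rimer")
print(f"Carrisa Rimmer vs Carrissa Rimer: {score:.1f}% match")
```

Under this particular metric the pair scores above 90%, which is consistent with treating it as a likely duplicate; a different similarity metric would produce a different number.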

ATM customer data with only an 88% match

In most cases, a match below 90% is a false positive. Each dataset is a bit different, but in many cases you should tune your duplicate detection to a match of roughly 90% or higher for interesting findings.
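The effect of a cutoff can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Owl's implementation: similarity comes from the standard-library `difflib`, the function name `likely_duplicates` and the 90% default are hypothetical, and the sample list mixes the two names from this page with two made-up names added only to show a pair that falls below the cutoff.

```python
from difflib import SequenceMatcher
from itertools import combinations


def likely_duplicates(values, threshold=90.0):
    """Score every pair of values and keep those at or above the threshold.

    Illustrative only: difflib stands in for Owl's scoring.
    Returns (score, left, right) tuples, highest score first.
    """
    kept = []
    for left, right in combinations(values, 2):
        # Compare case-insensitively, then scale the ratio to a percentage.
        ratio = SequenceMatcher(None, left.lower(), right.lower()).ratio()
        score = 100.0 * ratio
        if score >= threshold:
            kept.append((score, left, right))
    return sorted(kept, reverse=True)


# Hypothetical sample: the last two names are invented for illustration.
names = ["Carrisa Rimmer", "Carrissa Rimer", "John Smith", "Jon Smyth"]

for score, left, right in likely_duplicates(names):
    print(f"{score:.1f}%  {left}  <->  {right}")
```

Lowering `threshold` surfaces more candidate pairs at the cost of more false positives, so the cutoff is worth tuning per dataset.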
