Finding duplicate values using a self-join relationship

Finding, sorting, and replacing data > Finding records > Finding duplicate values using a self-join relationship

This procedure uses a self-join relationship and a calculation field referencing the relationship to determine when duplicate records exist.

To find duplicate records except the first instance:

1. If you plan to delete the duplicate records that you find, make a backup copy of the file.

See Saving and copying files.

2. Identify a field that determines a unique entity in your file.

For example, in a Contacts database, the Last Name field is probably not a good choice, because you might have several people with the same last name. Employee ID is a better choice. You can also create a calculation field (returning a text result) that combines data in several fields to make a unique identifier. An example formula is First Name & Last Name & Phone Number.

Note The field you choose should not be a repeating field.

3. Define a self-join relationship.

Use your chosen identifying field as the match field in both tables in the relationship. See Creating and changing relationships.

4. Create a new calculation field named Check Duplicate with the formula:

If(Count(<name of the table’s second occurrence>::<match field>) > 1; "Duplicates"; "Unique")

5. Click Show All in the status toolbar.

6. Perform a find for Duplicates in the Check Duplicates field.

All records with duplicates are marked Duplicates.

Once set up as above, this system will identify duplicate records automatically as they are created.

Was this topic helpful? Send feedback.