deduplicating

TheToeOfPooh

Active Member
Joined
May 21, 2004
Messages
254
There are mnay postings on the Web but none that I have managed to condense into a simple format.

We currently have a database of UK names and addresses which gets ammended and updated as required.

Is there a relatively simple method to quickly identify duplicates
e.g. Mr J Jones 12 High Street Derby DE21 1QP
is the same as
Mr John Jones 12 Hihg Street Derby DE21 1QP
Full FirstName but "typo" in Address.

My understanding especially when incorporating external data, is to run Filters\Sorts (using Excel?), before deciding what is "clean" as opposed to "dirty" data?

We will then be running this against Post Codes to determine matches
e.g. How many are in Post Code area GL20?

Any advice will be much appreciated as always.

Bernard
 

Excel Facts

Copy formula down without changing references
If you have =SUM(F2:F49) in F50; type Alt+' in F51 to copy =SUM(F2:F49) to F51, leaving the formula in edit mode. Change SUM to COUNT.

Norie

Well-known Member
Joined
Apr 28, 2004
Messages
76,318
Office Version
  1. 365
Platform
  1. Windows
Mike

That was what I was going to suggest at first, and it may still be useful.

But it wouldn't pick up the type of example given in the OP.

I don't think there is a 'simple' way to do this sort of thing.
 

Andrew Fergus

MrExcel MVP
Joined
Sep 9, 2004
Messages
5,437
I agree this isn't simple but given the UK postcode identifies the street, I would try doing a match with the postcode, street number, surname and the individual's initial and possibly the salutation. It may not be 100% accurate but may be close enough. It also depends on how the data has been stored.....

HTH, Andrew. :)
 

TheToeOfPooh

Active Member
Joined
May 21, 2004
Messages
254
My thanks for these pointers.

A number of associated De Duplicating Queries got us down into the data as far as we need.

Cheers.

Bernard
 

Forum statistics

Threads
1,147,621
Messages
5,742,188
Members
423,710
Latest member
Duarte85

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Top