Need to ID dups in two databases

elise12345

New Member
Joined
May 1, 2020
Messages
1
Office Version
  1. 2010
Platform
  1. Windows
Hello, I'm new here :)

I have two databases, and I need to find duplicate records between them. The complicated part: the databases are from two different companies, so there are no two records match perfectly. Both databases contain "address," but again, the records don't match.....a portion of them does (see below). Is there a way to create a function that perhaps matches a portion of the records instead of the entire cell? I am thinking I write a function that just compares the address number -- in the example below it would be "8300.". Thanks in advance for the help! I work for a non-profit, and we need to return this data to get funding.

Example data set, for a duplicate record:
database 1 address field: 8300 ST HELENA HWY,RUTHERFORD, CA 94573Census Tract: 2015.00
database 2 address field: 8300 ST HELENA HWY

Elise
 

Excel Facts

Shade all formula cells
To shade all formula cells: Home, Find & Select, Formulas to select all formulas. Then apply a light fill color.
Is it the case that the the text in the fields are exactly the same, up until a point? Your example above looks as though the two entries are identical up until the second entry ends and from which point, the first entry continues.

If they always, at least partially, identical, then you could make a function takes the two entries, calculates the length of each and then tests whether the shorter string is equal to the first X characters of the longer string. In your example, the identical portion is 18 characters long, so you could see whether it is the second entry is identical to the first 18 characters of the first entry. It's not perfect, and it assumes that the identical portion is going to be the length of the shorter of the two strings.

You could also set a counter to count how many of the leftmost characters are the same, and then assume that the entry that has the most characters the same as the one you're looking for is likely the match. Failing that, there is always fuzzy logic. That's a function that will look at the text in each string, and then return a percentage result which indicates the degree of similarity.

From memory, there is a fuzzy logic add-in that Microsoft made, but I think only ever released in beta. I could be wrong. I'll check, but will also see if I can find some fuzzy logic code.

I will see if I can find it.
 
Upvote 0

Forum statistics

Threads
1,215,003
Messages
6,122,655
Members
449,091
Latest member
peppernaut

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top