Finding name discrepancies

gravanoc · Jul 7, 2022

Besides going through two lists of names one by one, I'm trying to think of some formulas for finding naming discrepancies. To illustrate, here is a name & some variations that may occur:

1) Andrew Baltic: normal/default spelling
2) "Andrew  Baltic ": extra space in between names & after last name
3) Andréw Baltic: accent or other diacritical mark
4) Drew Baltic: first name replaced with nickname or alias
5) Andrew Baltik: misspelling of last name
6) andrew baltic: lowercase name

These seem to be the most common, but I may be leaving possibilities out.

Perfection isn't the goal: sometimes there is only a first name, which I can't reliably match with a full name, and sometimes the name is too butchered to match the default or ideal name. To determine the default name, I typically use the first or most common instance.

So far I've used a few techniques to help find these discrepancies: removing duplicates, sorting, and some formulas. The TRIM function helps with #2, and using LOWER on both the default & positional name is good for #6 (the positional name being the one in the current row as I drag a formula down, which may be a match or not), but the others are trickier. For #4 & #5, I'm considering using some combination of LEFT, MID, & RIGHT on the default & positional name. For #3 I'm not sure what to use other than trying something with UNICODE to identify accent marks. Thanks for any suggestions.
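For #4 & #5, one possible alternative to slicing with LEFT, MID, & RIGHT is to score how similar two names are and flag the close-but-not-identical ones for manual review. The sketch below is Python rather than an Excel formula, and it is only an illustration: the function names, the 0.8 threshold, and the unrelated name "Maria Lopez" are all assumptions made up for this example. It will catch a nickname like "Drew" only because it happens to share most letters with "Andrew"; something like "Bob" for "Robert" would slip through.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity score between two names, ignoring case."""
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()


def flag_near_matches(default_name, candidates, threshold=0.8):
    """Return candidates that resemble default_name without being identical.

    threshold is an arbitrary cut-off chosen for illustration; it would
    need tuning against real data.
    """
    flagged = []
    for candidate in candidates:
        if candidate.casefold() == default_name.casefold():
            continue  # exact match (ignoring case), nothing to review
        if similarity(default_name, candidate) >= threshold:
            flagged.append(candidate)
    return flagged


# "Maria Lopez" is a hypothetical unrelated name added for contrast.
names = ["Andrew Baltic", "Drew Baltic", "Andrew Baltik", "Maria Lopez"]
to_review = flag_near_matches("Andrew Baltic", names)
```

A flagged name is only a candidate for review. The score cannot tell a misspelling from a different person with a similar name.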

rlv01 · Jul 7, 2022

Applying some sort of 'normalizing' function to the names before any comparison will take care of 1, 2, 3, and 6. You are never going to get a comprehensive solution to 4 or 5, because there is no way of knowing whether apparent nicknames or misspellings really are nicknames or misspellings, or belong to completely different people (absent other identifying information). This is why most organizations assign unique ID numbers to their employees.
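As a rough illustration of what such a 'normalizing' function might look like, here is a minimal sketch in Python (not an Excel formula; the function name `normalize_name` is made up for this example). It assumes that stripping accents, collapsing spaces, and ignoring case is acceptable for the data at hand, which may not hold for every language or naming convention.

```python
import unicodedata


def normalize_name(name: str) -> str:
    """Return a comparison key: no accents, single spaces, case-insensitive."""
    # Decompose accented characters (é becomes e plus a combining mark),
    # then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # split() with no arguments discards leading/trailing whitespace and
    # treats any run of whitespace as one separator.
    single_spaced = " ".join(without_marks.split())
    # casefold() is a stronger form of lower() meant for caseless comparison.
    return single_spaced.casefold()


variants = ["Andrew Baltic", "Andrew  Baltic ", "Andréw Baltic", "andrew baltic"]
keys = {normalize_name(v) for v in variants}
```

With this approach, variations 1, 2, 3, and 6 all reduce to the same key, while "Drew Baltic" and "Andrew Baltik" still produce different keys and need other handling.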
