Anonymize identifying information without losing data

yewborn

New Member
Joined: Mar 30, 2020
Messages: 1
Office Version: 2016
Platform: MacOS
I have a large dataset (70,000 rows) on property ownership, which includes personally identifying information (people's names). I want to anonymize the data so that other users cannot identify the property owners. However, I need to preserve the internal fidelity of the names, so that users can still tell when multiple properties have the same owner. To complicate things further, the data are littered with typos, so the same name may appear several times with slightly different spellings. Here's an example:

Property number    Owner name
1                  Jeremiah Wilson
2                  Emily Chang
3                  Emily Chang
4                  Jeremaih Wilson
5                  Jeremiah W. Brown

In the above example data, I want to achieve the following:
  • the names in column 2 (Owner name) cannot be identified
  • the names for properties 2 and 3 are identical to each other
  • the names for properties 1 and 4 are similar enough to show that the difference is a typo
  • the names for properties 1 and 5 are clearly different
In an ideal world, the solution would be relatively simple to implement. I am requesting these data from a government agency, and I need to give them instructions for anonymizing the data while preserving the internal fidelity of the names.
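
To make the four goals above concrete, here is a minimal Python sketch of one possible approach, not a vetted procedure. It interprets "similar enough" as "shares a group label" instead of producing look-alike scrambled names. Each name is replaced by a keyed hash (identical spellings get identical IDs), and near-identical spellings are given a common group label using the standard library's `difflib.SequenceMatcher`. The secret key, the 0.9 similarity threshold, and the helper names `normalize`, `pseudonym` and `anonymize` are all assumptions made for illustration.

```python
import hashlib
import hmac
from difflib import SequenceMatcher

# Hypothetical values: the agency would choose and keep its own secret key,
# and tune the threshold against a sample of its real data.
SECRET_KEY = b"replace-with-a-private-key"
SIMILARITY_THRESHOLD = 0.9


def normalize(name):
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(name.lower().split())


def pseudonym(name):
    """Keyed hash of the normalized name: the same spelling gives the same ID."""
    digest = hmac.new(SECRET_KEY, normalize(name).encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def anonymize(names):
    """Return one (owner_id, group_id) pair per input name, in input order."""
    representatives = []  # (normalized spelling, group_id), first spelling seen per group
    group_of = {}         # normalized spelling -> group_id
    rows = []
    for name in names:
        key = normalize(name)
        if key not in group_of:
            # Greedy clustering: join the first group whose representative is close enough.
            for rep, group_id in representatives:
                if SequenceMatcher(None, rep, key).ratio() >= SIMILARITY_THRESHOLD:
                    group_of[key] = group_id
                    break
            else:
                # No close match, so this spelling starts a new group.
                group_of[key] = "G{}".format(len(representatives) + 1)
                representatives.append((key, group_of[key]))
        rows.append((pseudonym(name), group_of[key]))
    return rows


sample = [
    "Jeremiah Wilson",
    "Emily Chang",
    "Emily Chang",
    "Jeremaih Wilson",
    "Jeremiah W. Brown",
]
for number, (owner_id, group_id) in enumerate(anonymize(sample), start=1):
    print(number, owner_id, group_id)
```

With the threshold assumed here, properties 2 and 3 get the same owner ID and group, properties 1 and 4 get different owner IDs but the same group, and property 5 lands in a group of its own. Only the two output columns would be released, and the key would stay with the agency. The greedy comparison against every group representative may be slow on 70,000 rows, and the threshold would need checking against real data, since short names can match too easily.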

I apologize if I'm not using the correct terminology for aspects of this problem. Thank you for your help.
 
