Best Approach for Data cleansing in excel

WorkPie

New Member
Joined
Mar 22, 2013
Messages
2
I am doing a project that entails @15 excel files from different data sources. The data is however the same. There is a name, email, address etc and 2 fields of attributes- say an indicator of their expertise. There are duplicates in the names, emails, and attribute columns. There are also unique attributes that I want to keep. Example:

File1
NameEmailCategory 1Category 2
Mary SmithMary@smith.comExpert
John DoeJohnDoe@gmail.com
Jane BellJaneBell@gmailcom
File2
NameEmailCategory 1Category 2
Mary SmithMary @smith.comEntrepreneurSponsor
Joe JeffersonJaneB@gmail.com
File2
NameEmailCategory 1Category 2
Mary SmithMarysmith@gmail.comProspect
Robert JonesRobert@test.com
Bill RobertsJaneB@gmail.com
File Desired
NameEmail 1Email 2Category 1Category2Catagory3Category 4
Mary SmithMary@Smith.comMarySmith@gmail.comExpertEntrepreneurSponsorProspect

<tbody>
</tbody>

This is a simple join in a robust database- but I have never figured out a real solution in excel. I am fine with some sort of repetitive process, but I am going to need to do this same thing over and over in the future, and I need to think through the right way to approach this now, before I spend a lot of time messing around with it. Any and all help would be fantastic. WP

PS- is powerpivot likely to help here?
 

Excel Facts

Convert text numbers to real numbers
Select a column containing text numbers. Press Alt+D E F to quickly convert text to numbers. Faster than "Convert to Number"
I am doing a project that entails @15 excel files from different data sources. The data is however the same. There is a name, email, address etc and 2 fields of attributes- say an indicator of their expertise. There are duplicates in the names, emails, and attribute columns. There are also unique attributes that I want to keep. Example:

File1
Name
Email
Category 1
Category 2
Mary Smith
Mary@smith.com
Expert
John Doe
JohnDoe@gmail.com
Jane Bell
JaneBell@gmailcom
File2
Name
Email
Category 1
Category 2
Mary Smith
Mary @smith.com
Entrepreneur
Sponsor
Joe Jefferson
JaneB@gmail.com
File2
Name
Email
Category 1
Category 2
Mary Smith
Marysmith@gmail.com
Prospect
Robert Jones
Robert@test.com
Bill Roberts
JaneB@gmail.com
File Desired
Name
Email 1
Email 2
Category 1
Category2
Catagory3
Category 4
Mary Smith
Mary@Smith.com
MarySmith@gmail.com
Expert
Entrepreneur
Sponsor
Prospect

<TBODY>
</TBODY>

This is a simple join in a robust database- but I have never figured out a real solution in excel. I am fine with some sort of repetitive process, but I am going to need to do this same thing over and over in the future, and I need to think through the right way to approach this now, before I spend a lot of time messing around with it. Any and all help would be fantastic. WP

PS- is powerpivot likely to help here?


Have you considered simply linking the excel files as tables to an Access DB and performing the joins you refer to?
 
Upvote 0

Forum statistics

Threads
1,203,205
Messages
6,054,136
Members
444,703
Latest member
pinkyar23

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top