Please help on data Cleaning, lookups and merging

Mangolili

Board Regular
Joined
Jun 21, 2011
Messages
54
Hi All,

I have been working with large excel files ranging from 500,000 to 800,000 plus rows of data.
Cleaning, matching and merging data is time consuming. I am wondering if you have an good approach of dealing with large data files?
Any good data cleaning and matching tools to recommend?

How I work with the large data files currently:
1. Put the shifted columns back to the appropriate columns
2. Clean out data that's not useful
3. Match the customers from the original file to other files that's different.
4. Since there's no IDs in the original file, I look up the data using customer_name, address, email address and city to match other data files, if there's a match, the customerID will show up via vlookup.
5. At first, I started with vlooking up the email, then the address
6. After vlooking up like that, I have two different columns with different IDS.
7. I wanted to merge them together on as one column. Currently, I use a filter on ID using Address to get rid of the blanks and N/A.
And for the ID using email column, I use = to the values of the " ID using address" column

For example:
CustomerNameID using emailID using address
ABC company123N/A
HairSalon556N/A4351
BeautySupplyN/A235325
Makeupforever789789
HairExtension5437N/A
BeautySupplyN/A345567

<tbody>
</tbody>

CustomerNameID using email
ABC company123
HairSalon5564351
BeautySupply235325
Makeupforever789
HairExtension5437
BeautySupply34556

<tbody>
</tbody>

I think this method works well for small data files. However, for large data files, it is very time consuming.
It took me almost a week to do all the data cleaning, ID Vlookups and data merging. If would be great if you have any idea of how to do this more efficiently. Thank you so much.
 

Excel Facts

Is there a shortcut key for strikethrough?
Ctrl+S is used for Save. Ctrl+5 is used for Strikethrough. Why Ctrl+5? When you use hashmarks to count |||| is 4, strike through to mean 5.

Forum statistics

Threads
1,215,049
Messages
6,122,864
Members
449,097
Latest member
dbomb1414

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top