Data cleaning email addresses by matching text array

midijay

New Member
Joined
Apr 22, 2017
Messages
2
excel-problem.PNG


So this has been driving me nuts for hours. I have a list of more than 10,000 email addresses that I need to "clean" of all the "personal" email addresses such as "gmail.com" etc. I was thinking to simply use a MATCH formula to match for each row, whether ANY of the strings in column C match the related email address. Hopefully my screenshot here explains this. If the output was a number for a match, and N/A for no match then I can just filter on that column to show the kept/cleaned rows.

Please note I've got 3000 entries in column C based on a list of known personal email domains. I've made the email domain in cell A3 "mail.com" in this example because one formula I tried before would give a false positive based on "part" of gmail.com matching it. I need to match the FULL text in column C to partial text in column A (or if positive to be specific, the END characters).

Any help would be hugely appreciated. Have scoured the web, perhaps with poor choice of search terms, but without success. Most searches result in using VLOOKUP or MATCH but it doesn't seem to work in this scenario. I think match is intended to work the other way around, i.e. find the first instance of GMAIL.COM in my email list and return the position, which is no help alas.
 

Excel Facts

Shade all formula cells
To shade all formula cells: Home, Find & Select, Formulas to select all formulas. Then apply a light fill color.
excel-problem.PNG


So this has been driving me nuts for hours. I have a list of more than 10,000 email addresses that I need to "clean" of all the "personal" email addresses such as "gmail.com" etc. I was thinking to simply use a MATCH formula to match for each row, whether ANY of the strings in column C match the related email address. Hopefully my screenshot here explains this. If the output was a number for a match, and N/A for no match then I can just filter on that column to show the kept/cleaned rows.

Please note I've got 3000 entries in column C based on a list of known personal email domains. I've made the email domain in cell A3 "mail.com" in this example because one formula I tried before would give a false positive based on "part" of gmail.com matching it. I need to match the FULL text in column C to partial text in column A (or if positive to be specific, the END characters).

Any help would be hugely appreciated. Have scoured the web, perhaps with poor choice of search terms, but without success. Most searches result in using VLOOKUP or MATCH but it doesn't seem to work in this scenario. I think match is intended to work the other way around, i.e. find the first instance of GMAIL.COM in my email list and return the position, which is no help alas.
You have a long description there which makes deciphering what you want a little confusing. Are you simply looking to retrieve all the text in front of the @ sign? If so...

=LEFT(A2,FIND("@",A2)-1)
 
Upvote 0
I am not able to view images, so can you show some sample data here please?

Maybe this will get you started though...
A​
B​
C​
1​
123@hotmail.comhotmailhotmail
2​
123@gmail.comgmailgmail
3​
123@private.comprivateprivate
4​
123@company1.com
#N/A​
5​
123@company2.com
#N/A​
B1=LOOKUP(9.99999999999999E+307,SEARCH($C$1:$C$3,A1),$C$1:$C$3)
copied down
C1:C3 would contain the addy's you want to find.

You could then either filter in - or out - the #NA's
 
Upvote 0
Sorry if I confused, I guess i was trying to be clear but actually rambled on.

A bit more searching and I stumbled across a post that solved this.

http://blog.contextures.com/archives/2012/10/09/find-text-with-index-and-match/

I honestly can't say I understand the final formula but it works, although you do have to press Ctrl+Shift+Enter to complete the formula as an array formula which I've never done before.
Don't leave us hanging... What were you actually looking to retrieve from those email addresses? What formula did you finally end up with?
 
Upvote 0

Forum statistics

Threads
1,215,456
Messages
6,124,939
Members
449,197
Latest member
k_bs

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top