Extract Names from long String with specific criteria?

jeffcoleky

Active Member
Joined
May 24, 2011
Messages
274
Can someone help me create either a macro or a formula to extract the first 4 names out of each row in column "A". By "Help me create" I mean "Do it for me" because I'm absolutely lost as to where to begin...:confused:

The Raw data is all in column A and I want each name output to Columns B, C, D, E depending on how many are found. Any names after the 4th in any given row are not needed. I'd also like to re-order the name to be "First Last" instead of "LAST, FIRST, MIDDLE".

I've placed examples in this sheet of how I expect it will look when finished. If this request for help is unreasonable let me know and I'll try to find another way around this issue.

DOWNLOAD SAMPLE DATA BELOW HERE

Consistent Facts about the Raw Data (Column A):

  1. If more than one name is contained in the string, each is separated by a semicolon
  2. Name formats are "Last, First, Middle;" or "Last, First;"
  3. Names will always have at least one comma
  4. Company names will have at least one or more spaces separating the words
  5. If a business name is longer than 20 characters long, a ", " is added to the string for each 20 characters

Macro or Formula should Ignore value if any of the following are true:

  1. Charaters (including spaces) exceed 19 characters (because it must be a company name)
  2. String contains "&" (because it must be a company name)
  3. Duplicate to value already gathered from same row
  4. String is after the 3rd semicolon (only need values for the first 4)

-Jeff


Excel 2013
ABCDE
1DESIRED OUTPUT
2
RAW DATA​
Name1Name2Name3Name4
3HART, JENNIFER, R; GENERAL REVENUE INC; COMMONWEALTH OF OHIO DIVIS, ION OF UNEMPLOYMENT, INSURANCEJennifer Hart
4MILLER, JEANNIE, L; MILLER, JEANNIE; MILLER, JOHN, U; MILLER, JOHN; MORTON CENTER; COMMONWEALTH OF OHIO FINAN, CE & ADMINISTRATION, CABINETJeannie MillerJohn Miller
5DELACRUZ, CHRIS; DELACRUZ, CHRIS, BChris Delacruz
6UNITED STATES OF AMERICA SECRE, TARY OF HOUSING & UR, BAN DEVELOPMENT; HARRELL, ESTELLE, C; HARRELL, DORSEY, EEstelle HarrellDorsey Harrell
7RAGAINS, SEBRINA, N; RAGAINS, RICHARD, C; RAGAINS, RICHARD; UNIVERSITY OF UROLOGY PLLC; TIPTON & UNROE PSC; MIDLANE PARK NEIGHBORHOOD ASSN, INCSabrina RagainsRichard RagainsUniversity Of Urology Pllc
8WILCHER, ANTHONY, L; REPUBLIC BANK & TRUST CO
9SIMPSON, JACQUELINE
10LE, SON, VAN; LE, THUYLIEU, THISon LeThi Thuylieu
11SCRUGGS, ALMA, H; SCRUGGS, ALMA, JEAN; CITIFINANCIAL SERVICES INC; COMMERCIAL CREDIT CORP; UNION TRUST INC; ARROWOOD IDEMNITY CO; ROYAL INDEMNITY; CAPITAL ONE BANK USA NA; CAPITAL ONE BANK; ANDERSSON ANDERSSON COUNTY ME, TRO GOVERNMENTAlma Scruggs Citifinancial Services Inc Citifinancial Services Inc
12COLVIN, TODD; COLVIN, ERIC; BANK OF ANDERSSON & TRUST CO; THE ESTATE OF ROBERT WAYNE COLVINTodd ColvinEric Colvin
13KNIGHT, NOLAN, ANolan Knight
14TATE, MARIO, EXR; TATE, MARIO, A EXR; UNITED STATES OF AMERICA SECRE, TARY OF HOUSING & UR, BAN DEVELOPMENT; ANDERSSON ANDERSSON COUNTY ME, TRO DEPARTMENT OF CO, DES & REGULATIONS; CITIMORTGAGE INC; COMMONWEALTH OF OHIO CABIN, ET FOR HUMAN RESOURC, ES DIVISION OF UNEMP; BILARCZYK, NADINEMario Tate
15SMITH, LORA, A; SMITH, LORA, ASHLEY; SMITH, ASHLEY; PROVIDIAN NATIONAL BANK; SMITH, JOSEPH; SMITH, JOSEPH, JR; SMITH, JOSEPH, L JR; BROWN, MICHAEL; NATIONAL CITY REAL ESTATE SERV, ICES LLC; NATIONAL CITY MORTGAGE CO; ARROW FINANCIAL SERVICES LLC; OEM COATINGS; AUTOVEST LLC; SMITH, MARTHA, EXR; ANDERSSON ANDERSSON COUNTY ME, TRO GOVERNMENT; CNAC; COMMONWEALTH OF OHIO WORKF, ORCE DEVELOPMENT CAB, INET EX REL DIVISION; EQUITABLE FINANCIAL SERVICES L, LC; CAPITAL ONE BANK USA NA; CAPITAL ONE BANK; ALBAN USED CARS I LLC; ALBAN USED CARS INC; FIRST RESOLUTION INVESTMENT CO, RP; COMMONWEALTH OF OHIO FINAN, CE & ADMINISTRATION, CABINET; BBC CHECK ADVANCE LLC; UNITED STATES OF AMERICA DEPAR, TMENT OF THE TREASUR, Y INTERNAL REVENUE S; NORTON HEALTHCARE INC; REPUBLIC BANK & TRUST CO; ECKART LLC; STONE CREEK FINANCIAL; ANDERSSON ANDERSSON COUNTY ME, TRO REVENUE COMMISSI, ONLora SmithAshley Smith
16FOREE, MICHAEL, W JR; FOREE, MICHAEL, JR; FOREE, DEMITRIA, V; FOREE, DEMETRIA, V; FOREE, DEMITRIA; NORTON AUDUBON HOSPITAL; NORTON HEALTHCARE INC; DR GREGORY CECIL DMD LLC; ANDERSSON ANDERSSON COUNTY ME, TRO DEPARTMENT OF IN, SPECTIONS PERMITS &Michael ForeeDemitria Foree
17TRAN, NGOC; LE, HAJ, P; LE, HAI, PNgoc TranHaj Le
18DEWALT, DAVID, J; DEWALT, KESHA, R; CHEMICAL MORTGAGE CODavid DewaltKesha DewaltMortgage Chemical
19ABERNATHY, VELDA, J; COLVIN, WILMA, M; ABERNATHY, WILMA, M; TRILOGY HEALTHCARE OF JEFFERSO, N LLC; FRANCISCAN HEALTH CARE CENTERVelda AbernathyWilma ColvinWilma Abernathy
20ZACHARY, REGINA, C; B&B FUNDING LLCRegina Zachary
21FRENCH, STENNIS; FRENCH, REGINA; NATIONAL CITY MORTGAGE CO; SECRETARY OF HOUSING & URBAN D, EVELOPMENT; HOUSEHOLD FINANCE CORP II; CAPITAL ONE BANKStennis FrenchRegina French
22CRAIG, JOHN, P JR; BLACK, LINDA; OHIO HOUSING CORPJohn CraigLinda Black
23CLAY, PAUL; CLAY, PAUL, EUGENE III; CLAY, NANCY, GPaul ClayNancy Clay
24BREWER, CRAIG, L; BREWER, DONNA; BREYER, DONNA, Y; ANDERSSON ANDERSSON COUNTY ME, TRO GOVERNMENT; ANDERSSON ANDERSSON COUNTY ME, TRO DEPARTMENT OF CO, DES & REGULATIONS; CLINICAL PATHOLOGY ASSOCIATES, INC; CPA LABS; BARDSTOWN IMMEDIATE CARE CENTE, RCraig BrewerDonna Brewer
25CRAWFORD, ROY; CRAWFORD, TRESSA; COMMONWEALTH OF OHIO EDUCA, TION CABINET EX REL, DIVISION OF UNEMPLOY; BILLTOWN FARM HOMEOWNERS ASSN, INC; DEUTSCHE BANK NATIONAL TRUST C, O, TR; FIRST FRANKLIN MORTGAGE LOAN T, RUST, TRRoy CrawfordTressa Crawford

<tbody>
</tbody>
Sheet1
 

Excel Facts

Why does 9 mean SUM in SUBTOTAL?
It is because Sum is the 9th alphabetically in Average, Count, CountA, Max, Min, Product, StDev.S, StDev.P, Sum, VAR.S, VAR.P.
Can you clarify a couple of points for us?

1) Why is there no output for Rows 8 and 9?

2) Why is the company name outputted in Message #18 (it is not a person's name because it does not have a comma within the first 20 characters)?

3) In Row 11, why doesn't the company name CITIFINANCIAL SERVICES INC have a comma after the last "E" in the word "SERVICES" (that is the position of the 20 character in the company name which is 26 characters long)?
 
Upvote 0
Can you clarify a couple of points for us?

1) Why is there no output for Rows 8 and 9?

Mistake on my part, updated Sample Data Google Doc to add names.

2) Why is the company name outputted in Message #18 (it is not a person's name because it does not have a comma within the first 20 characters)?

Mistake on my part, updated Sample Data Google Doc to remove.

3) In Row 11, why doesn't the company name CITIFINANCIAL SERVICES INC have a comma after the last "E" in the word "SERVICES" (that is the position of the 20 character in the company name which is 26 characters long)?

I looked at a larger dataset and it appears estimate of 20 is incorrect. It's 30 characters will get a comma, NOT including the precluding " ". So " 123456789012345678901234567890" won't generally have a comma added.

So amend the rule: from 20 characters to 31 (which includes the space).

Sorry, i didn't realize i was so sloppy. should be fixed now.
 
Upvote 0
Mistake on my part, updated Sample Data Google Doc to remove.
When I click the link you posted in Message #1 for the sample data, the webpage tells me it is locked. You need to make it shared so we can download it.
 
Upvote 0
Fixed, thanks for your attention to this for me!
Got it. Just checking, this is what you have for Row 10...

LE, SON, VAN; LE, THUYLIEU, THI

You processed the names as "Son Le" and "Thi Thuylieu". Why isn't the second name "Thuylieu Le" instead?

Also, will your data always start on Row 3 (it is hard to tell if Row 1 is a real header or simply a description)?

Where did the name "Mario Tate" come from on Row 14?
 
Last edited:
Upvote 0
Got it. Just checking, this is what you have for Row 10...

LE, SON, VAN; LE, THUYLIEU, THI

You processed the names as "Son Le" and "Thi Thuylieu". Why isn't the second name "Thuylieu Le" instead?

Also, will your data always start on Row 3 (it is hard to tell if Row 1 is a real header or simply a description)?

Where did the name "Mario Tate" come from on Row 14?
One more question in addition to the above ones... can we make the assumption that if the text has at least one space without a comma in front of it, then it must be a company name and, as such, should not be copied as a result name? Seems like that is reasonable assumption given all of your names have a comma in front of each space in it.
 
Upvote 0
One more question in addition to the above ones... can we make the assumption that if the text has at least one space without a comma in front of it, then it must be a company name and, as such, should not be copied as a result name? Seems like that is reasonable assumption given all of your names have a comma in front of each space in it.

Yes
 
Upvote 0
Got it. Just checking, this is what you have for Row 10...

LE, SON, VAN; LE, THUYLIEU, THI

You processed the names as "Son Le" and "Thi Thuylieu". Why isn't the second name "Thuylieu Le" instead?

Right, another of my mistakes. Corrected in Google Docs.

Also, will your data always start on Row 3 (it is hard to tell if Row 1 is a real header or simply a description)?

Yes

Where did the name "Mario Tate" come from on Row 14?

I don't understand why you question this one? It's the first name in row 14, why would it not be used? (In this instance, "Exr" Stands for "executor" as in "executor of an estate". So it's still a name we want)

"TATE, MARIO, EXR; TATE, MARIO, A EXR; UNITED STATES OF AMERICA SECRE, TARY OF HOUSING & UR, BAN DEVELOPMENT; ANDERSSON ANDERSSON COUNTY ME, TRO DEPARTMENT OF CO, DES & REGULATIONS; CITIMORTGAGE INC; COMMONWEALTH OF OHIO CABIN, ET FOR HUMAN RESOURC, ES DIVISION OF UNEMP; BILARCZYK, NADINE"
 
Upvote 0

Forum statistics

Threads
1,215,236
Messages
6,123,799
Members
449,127
Latest member
Cyko

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top