Formula to extract text from a string searching backwards

datadrvn

Board Regular
Joined
Apr 7, 2005
Messages
86
Been hunting for a few hours without success. I'm trying to extract a person's name from the middle of a cell. The names are in Last, First order but have text before / after within the cell.

Ex: 792-12345678-Test Project-Smith, David - technician

For the most part, the text in the cell isn't standardized so the text before / after the names varies both in length and characters, i.e., sometimes there are only two sections separated by hypens and other times they're separated by spaces. The only constants are that the there is always and only one comma in the string (between the Last and First names) and the character just before the last name is a hyphen (-). There are no middle initials of other name elements to consider.

I'm trying to extract just the name portion. I can use MID() to find the comma and go forward. However, I can't figure out how to start at the position of the comma and work backwards to find the 1st hyphen to the left of the comma, which would signify the beginning of the name information. Once I can get to the starting point, I believe I can get the rest.
 

Excel Facts

Excel motto
Not everything I do at work revolves around Excel. Only the fun parts.
This will give you the first name:
=TRIM(MID(A1,FIND(",",A1)+1,FIND("-",A1,FIND(",",A1))-FIND(",",A1)-1))

This array formula (confirmed with CTRL+SHIFT+ENTER) will give you the last name:

=MID(A1,FIND(",",A1)-MIN((FIND(",",A1))-(ROW(1:1000)*--(MID(SUBSTITUTE(LEFT(A1,FIND(",",A1)),"-","#"),ROW(1:1000),1)="#")))+1,MIN((FIND(",",A1))-(ROW(1:1000)*--(MID(SUBSTITUTE(LEFT(A1,FIND(",",A1)),"-","#"),ROW(1:1000),1)="#")))-1)
 
Upvote 0
Thank you, Ben. This works. The first name is great because it only returns the first name. The one for the second name returns all the text before it...but with what you've provided, I can go from there.

Thank you, again. Wayne
 
Upvote 0
You're not just getting Smith? I tested it on many variations of your data and it works...

Again, you need to enter this as an array formula - when you paste it into the cell, hit CTRL+SHIFT+ENTER to confirm.

Does that help?
 
Upvote 0
DOH...yepper...sorry to use your time like that...totally works as you said. If you care to, can you help me understand the role that the "ROW(1:1000)" is playing in this? Otherwise, I greatly appreciate your time and knowledge.

Wayne
 
Upvote 0
Does this regular formula work for you?
Code:
=TRIM(RIGHT(SUBSTITUTE(LEFT(A1,FIND(",",A1)),"-",REPT(" ",100)),100)
&" "&LEFT(MID(A1,FIND(",",A1)+2,LEN(A1)),FIND(" ",MID(A1,FIND(",",A1)+2,LEN(A1)))))
 
Upvote 0
Does this regular formula work for you?
Code:
=TRIM(RIGHT(SUBSTITUTE(LEFT(A1,FIND(",",A1)),"-",REPT(" ",100)),100)
&" "&LEFT(MID(A1,FIND(",",A1)+2,LEN(A1)),FIND(" ",MID(A1,FIND(",",A1)+2,LEN(A1)))))
I knew an MVP would get a simpler formula ... Thanks!


DOH...yepper...sorry to use your time like that...totally works as you said. If you care to, can you help me understand the role that the "ROW(1:1000)" is playing in this? Otherwise, I greatly appreciate your time and knowledge.

Wayne

First let me shorten the formula - I had an unnecessary SUBSTITUTE in there...

=MID(A1,FIND(",",A1)-MIN((FIND(",",A1))-(ROW(1:1000)*--(MID(LEFT(A1,FIND(",",A1)),ROW(1:1000),1)="-")))+1,MIN((FIND(",",A1))-(ROW(1:1000)*--(MID(SUBSTITUTE(LEFT(A1,FIND(",",A1)),"-","#"),ROW(1:1000),1)="#")))-1)

ROW(1:1000) in an array formula simply generates a range of all the numbers between 1 and 1,000 (what I figured the maximum length of your text would be). Stuffed into the second argument of the MID formula, it returns a range of each character in your string separately.

We only do that MID after we do a LEFT, ending at the comma. So now we have all the characters before the comma in a range. We then check each character for ="-" which will return 1 (TRUE) if the character is a hyphen and 0 (FALSE) if it is not. So we have a range something like 0;0;0;0;0;0;1;0;0;0;0;1;0;0;0;1 - where the 1's are in the positions of the hyphens.

Then we multiply that by ROW(1:1000) again, to return the position number of the hyphens: 0;0;0;0;0;0;7;0;0;0;0;12;0;0;0;16. Then we take the position point of the comma - say it's 21, and we do that minus all the numbers in the array - so we get 21;21;21;21;21;21;14;21;21;21;21;9;21;21;21;5 and we use MIN to get the lowest one of those - so now we have the starting number (16) and the length (5) of the characters between the last hyphen and the comma.

The rest is just a MID formula :)
 
Upvote 0
I really don't even know what to say. Amazing just seems inadequate. Thank you all for your help. Simply amazing!
 
Upvote 0
No problem :) Glad I could help (y) Appreciate the feedback!

As I anticipated, an expert produced a waaaaaay simpler, shorter formula ;)
 
Upvote 0

Forum statistics

Threads
1,215,327
Messages
6,124,290
Members
449,149
Latest member
mwdbActuary

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top