Parsing Names in Separate Cells

dwgnome

Active Member
Joined
Dec 18, 2005
Messages
441
Need assistance on a smart formula (no VBA) that will reverse the order of a list of names - parsing into separate columns.

The formula needs to be able to recognize/distinguish that if a comma(s) is present, then the names are listed as last, first middle. If there is a period this means the name has a middle initial or that a suffix is present. So,if the names are listed as last, first and middle then in the next columns will have in the proper order first name, middle name or initial (keep the period) and last name (loosing the comma of course).

If the original name order is in normal order - first middle and last then the formula should just pass through without any change. Would need to preserve generational titles such as Jr., Sr., III etc. Not sure where the best place for the generational titles should end up. Probably as part of the last name. See examples:

Col A Col B Col C Col D Col E
William J. Clinton William J. Clinton
Churchill, Winston Leonard Spencer Winston Leonard Spencer Churhill
Bates, John John Bates
James Knoll Polk James Knoll Polk
Taft, William H. William H. Taft
Carter, James, Earl, Jr. James Earl Carter Jr.
Rockefeller, John Davidson, IV John Davison Rockefeller IV
Gates, William, H., III William H. Gates III
St. James, Michael, Ken Michael Ken St. James
Cher Cher

I tried using many variations of FIND, SEARCH, LEN and searching all the Excel boards to no avail.

Thank you in advance for any suggestions.
 
mikerickson,

Your modification to fmlName worked to rid of the commas in the middle name - thank you. However, I find it fustrating that the titles cannot be separated from them. In testing, in a fourth column I was able to place the formula =(MID(I7,FIND(" ",I7),4)) to separate the Jr. IV and such, but it is not very sophisticated at this point. Do like the named range approach, but I agree, it would be easier to edit if using straight formula in helper columns.

Peter,

The original request was to divide into separate cells, but I was willing to entertain drsarao's approach if it would give me satisfactory name reversal. I can see now that I will not be able to cover all possibilites such as the examples you mention, however, I was hoping to at least cover those in the names I presented. Is there a way to modify your formulas to rid of the commas in the middle names?

Thanks to all for contributing my inquiry.
 
Upvote 0

Excel Facts

VLOOKUP to Left?
Use =VLOOKUP(A2,CHOOSE({1,2},$Z$1:$Z$99,$Y$1:$Y$99),2,False) to lookup Y values to left of Z values.
In testing, in a fourth column I was able to place the formula =(MID(I7,FIND(" ",I7),4)) to separate the Jr. IV and such, but it is not very sophisticated at this point.
What is in I7?



Is there a way to modify your formulas to rid of the commas in the middle names?
Sure - it's actually a very slight simplification to my E2 formula:

=IF(C2,IF(B2,TRIM(SUBSTITUTE(SUBSTITUTE(A2,F2&", "&D2,""),",","")),MID(A2,LEN(D2)+2,LEN(A2)-LEN(D2&F2)-2)),"")



Similarly, each column of the helper column method could be given a name, hidding all that work.

The advantage of the helper column method is ease of editing the formula.
The name approach takes up less spreadsheet space and looks slicker.

Probably the very slickest approach would be to use the helper column method, testing, testing, testing until the formula are exactly what you want and then folding it into a named structure.
Or just hide the helper columns.
 
Upvote 0
I7 holds the range name midName as provided by mikerickson, in this case, Earl, Jr. The next column has lastName and the next column K has my little formula I described above.

Your modification to E2 worked well. Thank you.

Quite the challange for me to get the middle name formula right so that it can isolate the name(s) without the titles.:(


I think if I can come up with ground rules for the formula - perhaps someone can adapt them to complete the tasks in my original quest. So if no commas, then pass thorugh otherwise the following apply:

1 comma only - the first word up to the comma represents the last name. Any other name separated by a space is assumed to be first then middle.


2 commas - first word up to the comma would be last name, second word up to the second comma would represent the first name, all remaining words would represent middle name(s)

3 commas - fist word would be last name, second word first name, third word(s) would be middle name. Anything after that would be title Jr. Sr. III etc.

For situations like Taft, William H. Jr. apply a further condition to 1 comma only rule, so that if there is one letter followed by a period then that becomes the middle name and remainder the title. The idea of having two middle initials is just too rare to worry about.

Does this help in developing consistency and expectations? I'm sure I did not cover all situations, but if I can just get this I will then be successful.
 
Upvote 0
Certainly, a clear, consistent set of rules is what would be required. I'm still not too sure about this one
2 commas - first word up to the comma would be last name, second word up to the second comma would represent the first name, all remaining words would represent middle name(s)
Wouldn't that turn

Rockefeller, John Davidson, IV

into

First name: John Davidson
Middle name(s): IV
Last name: Rockefeller
 
Upvote 0
You raised a good point, however I failed to write in the assumption that with the 2 comma case, there will be only one word -representing the first name. HTH with a possible solution.
 
Upvote 0
Would really like to see this solved.:eek:
Sure, but we have to some real work (that is, paid) sometimes. :biggrin:

Besides, with such a variety of data formats, this is not a trivial task!

I won't be surprised if this doesn't actually work with all your data, though I hope it does. I think it fits the rules so far and is about as far as I would persevere with such formulas.

Excel Workbook
ABCDEFGHIJ
1Comma 1PosComma CountSpace 1PosSpace after First nameSuffixFirstMiddleLastSuffix
2William J. Clinton00880WilliamJ.Clinton 
3Churchill, Winston Leonard Spencer10111190WinstonLeonard SpencerChurchill
4Bates, John617120JohnBates
5James Knoll Polk00660JamesKnollPolk
6Taft, William H.516140WilliamH.Taft
7Carter, James, Earl, Jr.738151JamesEarlCarterJr.
8Rockefeller, John Davidson, IV12213181JohnDavidsonRockefellerIV
9Gates, William, H., III637161WilliamH.GatesIII
10St. James, Michael, Ken1024200MichaelKenSt. James
11Cher00050Cher
12Taft, William H. Jr.516141WilliamH.TaftJr.
13Winston Leonard Spencer Churchill00880WinstonLeonard SpencerChurchill
14St. James, Michael Ken, Sr1024191MichaelKenSt. JamesSr
Parse names
 
Upvote 0
Peter,

Let it be said that you have gone beyond the call of duty on this one!:pray:

Your $million dollar solution works perfectly.

Have a good day.
 
Upvote 0
I got completely lost in formulas. Here is a VBA solution:
Copy it a new VB module and run on a backup copy of your data.
I think it honors all your conditions. See if you can spot some anomalies.
Code:
Const START = 2 'row number to start processing from
Sub ParseNames()
'parses names in "A" column of active sheet starting at second row.
'Segments are written back in Cols B, C, D, E and F (First, Middle, Last, Suffix, Full)
Dim nms() '(1-First, 2-Middle, 3-Last, 4-Suffix, 5-Full)
LRow = ActiveSheet.Range("A" & Rows.Count).End(xlUp).Row
cm = ","  'comma
sp = " "  'space
For i = START To LRow
    ReDim nms(1 To 1, 1 To 5)
    nm = Trim(ActiveSheet.Range("A" & i))
    slen = Len(nm) 'string length
    cm1 = InStr(nm, cm) 'first comma
    cm2 = InStr(cm1 + 1, nm, cm) 'second
    cm3 = IIf(cm2, InStr(cm2 + 1, nm, cm), 0) 'look for comma3 only if comma2 found
    sp1 = InStr(nm, sp) 'first space
    sp2 = InStr(sp1 + 1, nm, sp) 'second
    sp3 = IIf(sp2, InStr(sp2 + 1, nm, sp), 0) 'look for space3 only if space2 found
    If cm1 Then 'look for middle initial
        mdi = InStr(sp1, nm, ".") 'period after first space
        If mdi Then If Mid(nm, mdi - 2, 1) = sp Then Else mdi = 0
    End If
    If cm3 Then
        nms(1, 3) = Left(nm, cm1 - 1)
        nms(1, 1) = Mid(nm, cm1 + 2, cm2 - cm1 - 2)
        nms(1, 2) = Mid(nm, cm2 + 2, cm3 - cm2 - 2)
        nms(1, 4) = Mid(nm, cm3 + 2)
    ElseIf cm2 Then
        nms(1, 3) = Left(nm, cm1 - 1)
        nms(1, 1) = Mid(nm, cm1 + 2, cm2 - cm1 - 2)
        nms(1, 2) = Mid(nm, cm2 + 2)
    ElseIf cm1 Then
        nms(1, 3) = Left(nm, cm1 - 1)
        If sp3 Then
            nms(1, 1) = Mid(nm, cm1 + 2, sp2 - cm1 - 2)
            If mdi Then
                nms(1, 2) = Mid(nm, sp2 + 1, sp3 - sp2 - 1)
                nms(1, 4) = Mid(nm, sp3 + 1)
            Else
                nms(1, 2) = Mid(nm, sp2 + 1)
            End If
        ElseIf sp2 Then
            nms(1, 1) = Mid(nm, cm1 + 2, sp2 - cm1 - 2)
            nms(1, 2) = Mid(nm, sp2 + 1)
        Else
            nms(1, 1) = Mid(nm, sp1 + 1)
        End If
    Else  'no commas at all
        If sp3 Then
            nms(1, 1) = Left(nm, sp1 - 1)
            nms(1, 2) = Mid(nm, sp1 + 1, sp2 - sp1 - 1)
            nms(1, 3) = Mid(nm, sp2 + 1, sp3 - sp2 - 1)
            nms(1, 4) = Mid(nm, sp3 + 1) '4th word is taken as suffix
        ElseIf sp2 Then
            nms(1, 1) = Left(nm, sp1 - 1)
            nms(1, 2) = Mid(nm, sp1 + 1, sp2 - sp1 - 1)
            nms(1, 3) = Mid(nm, sp2 + 1)
        ElseIf sp1 Then
            nms(1, 1) = Left(nm, sp1 - 1)
            nms(1, 3) = Mid(nm, sp1 + 1)
        Else
            nms(1, 1) = nm
        End If
    End If
    nms(1, 5) = nms(1, 1) & sp & nms(1, 2) & sp & nms(1, 3) & sp & nms(1, 4)
    ActiveSheet.Range(Cells(i, 2), Cells(i, 6)) = nms 'write to worksheet
Next i
End Sub
Here is a sample run:
Excel Workbook
ABCDEF
1NameFirstMiddleLastTitleFull
2Bates, JohnJohnBatesJohn Bates
3Carter, James, Earl, Jr.JamesEarlCarterJr.James Earl Carter Jr.
4CherCherCher
5Churchill, Winston Leonard SpencerWinstonLeonard SpencerChurchillWinston Leonard Spencer Churchill
6Gates, William, H., IIIWilliamH.GatesIIIWilliam H. Gates III
7James Knoll PolkJamesKnollPolkJames Knoll Polk
8John WayneJohnWayneJohn Wayne
9Rockefeller, John Davidson, IVJohn DavidsonIVRockefellerJohn Davidson IV Rockefeller
10Rockefeller, John, Davidson, IVJohnDavidsonRockefellerIVJohn Davidson Rockefeller IV
11St. James, Michael, KenMichaelKenSt. JamesMichael Ken St. James
12St. James, Michael, Ken JrMichaelKen JrSt. JamesMichael Ken Jr St. James
13St. James, Michael, Ken, JrMichaelKenSt. JamesJrMichael Ken St. James Jr
14Taft, William H.WilliamH.TaftWilliam H. Taft
15Taft, William H. Jr.WilliamH.TaftJr.William H. Taft Jr.
16William J. ClintonWilliamJ.ClintonWilliam J. Clinton
17William J. Clinton JrWilliamJ.ClintonJrWilliam J. Clinton Jr
18Winston Leonard Spencer ChurchillWinstonLeonardSpencerChurchillWinston Leonard Spencer Churchill
Sheet8
Excel 2003
"Rockefeller, John Davidson, IV" should have a comma after John and "St. James, Michael, Ken Jr" after Ken as per conditions.
No way to differentiate double middle name in comma-less names so "Churchill" goes into suffix! Does not matter for the full name though.
If you have large amount of data then VBA maybe a better solution.
Bottomline is - keep your data organized into different columns at the outset itself.
Maybe you run this macro then delete the original Column A.
 
Last edited:
Upvote 0
Thank you drsarao for chiming in and offering a solution.:)

I tend to shy from VBA due to no experience in using them. If you are willing, could you provide steps on how to incorporate your VBA solution into my backup file?

Thank you.
 
Upvote 0

Forum statistics

Threads
1,214,976
Messages
6,122,541
Members
449,089
Latest member
davidcom

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top