Parsing Names & String Manipulation

Gravedigger

New Member
Joined
Jul 21, 2010
Messages
12
I am new to this site and drsarao's work on this thread was awesome: http://www.mrexcel.com/forum/showthread.php?t=424754&page=3

I tried to PM him, but no reply. I have a cemetery spreadsheet that I need to work on (hobby) and I need to parse the names (50,000 names). If I give you some examples, can you whip me up something? I am brand new to macros and VB, but I followed the examples on the earlier thread and they worked (not entirely for my situation). If you can get something working I could donate through Paypal.

Here are some of the example names in A1:
unknown
Smith, Paul
Smith, Paul L.
Smith, Paul Lee
Smith, Paul Lee James
Smith, Paul L., Dr.

Smith, Paul L., ,M.D.
Smith, Paul , Mr.
Smith, , Mr.
Smith, Mrs.
Smith, Paul L., , Jr.
Smith, Paul L., Jr.
Smith, Paul L., , Sr.
Smith, Paul L., Sr.
Smith, Paul R., , II
Smith, Paul R., , III
Lewis, Simeon (Infant of)
Norton, C. (Unknown )
Parker, Infant of Everett J.

Spreadsheet has names listed in column A.
I would like:
Column B: (Mr. Mrs. Or Dr.) here
Column C: Last Name (comma gone)
Column D: First Name
Column E: Middle (name, names, initial, initials, or blank if nothing)
Column F: (Jr. Sr. II III M.D. PhD) here
Column G:Extra stuff like: infant of, dau of son of etc Anything in between () but minus the ()
Thanks
 
Hi, Change the 2 lines in the second sub (similar to below), to the lines below.
Code:
dn.Offset(, 10) = Mid(dn, InStr(dn, Ray(5)), (InStr(dn, "Co.") + 6) - InStr(dn, Ray(5)))
    dn.Offset(, 11) = Right(dn, Len(dn) - (InStr(dn, "Co") + 4))
Regards Mick
 
Upvote 0

Excel Facts

What do {} around a formula in the formula bar mean?
{Formula} means the formula was entered using Ctrl+Shift+Enter signifying an old-style array formula.
That fixed those problems. Thanks.
I also noticed that some of the dates get converted wrong.
Here are some examples:
<table style="border-collapse: collapse; width: 365pt;" border="0" cellpadding="0" cellspacing="0" width="486"><col style="width: 365pt;" width="486"> <tbody><tr style="height: 15pt;" height="20"> <td class="xl63" style="height: 15pt; width: 365pt;" height="20" width="486">Aaron Morris M. 1-20-1841 6-5-1913 4 VT Infantry Co. D Lincoln Wyuka Lancaster</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl63" style="height: 15pt;" height="20">Abbey Frank 1-1-1848 3-6-1929 38 WI Infantry Co. I Fairbury Fairbury Jefferson</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl63" style="height: 15pt;" height="20">Abbink Gerrit 6-1837 9-10-1901 9 WI Infantry+ Co. A Holland Holland Lancaster</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl63" style="height: 15pt;" height="20">Abbott Elmer Y. 2-12-1845 9-30-1906 146 IL Infantry Co. H Pilger K of P Stanton</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl63" style="height: 15pt;" height="20">Abbott Hiram 2-3-1841 8-25-1892 4 MI Cavalry Capt Co. M St.Edward Evergreen Boone</td> </tr> </tbody></table>

Gets converted to this:
<table style="border-collapse: collapse; width: 432pt;" border="0" cellpadding="0" cellspacing="0" width="576"><col style="width: 48pt;" span="9" width="64"> <tbody><tr style="height: 15pt;" height="20"> <td class="xl63" style="height: 15pt; width: 48pt;" height="20" width="64">Aaron</td> <td class="xl63" style="width: 48pt;" width="64">Morris</td> <td class="xl63" style="width: 48pt;" width="64">M.</td> <td class="xl63" style="width: 48pt;" align="right" width="64">1</td> <td class="xl63" style="width: 48pt;" align="right" width="64">20</td> <td style="width: 48pt;" align="right" width="64">1841</td> <td style="width: 48pt;" align="right" width="64">5</td> <td style="width: 48pt;" align="right" width="64">5</td> <td style="width: 48pt;" align="right" width="64">1913</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl63" style="height: 15pt;" height="20">Abbey</td> <td class="xl63">Frank</td> <td class="xl63">
</td> <td class="xl63" align="right">1</td> <td class="xl63" align="right">1</td> <td align="right">1848</td> <td align="right">6</td> <td align="right">6</td> <td align="right">1929</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl63" style="height: 15pt;" height="20">Abbink</td> <td class="xl63">Gerrit</td> <td class="xl63">
</td> <td class="xl63" align="right">1</td> <td class="xl63" align="right">1</td> <td align="right">1837</td> <td align="right">10</td> <td align="right">10</td> <td align="right">1901</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl63" style="height: 15pt;" height="20">Abbott</td> <td class="xl63">Elmer</td> <td class="xl63">Y.</td> <td class="xl63" align="right">12</td> <td class="xl63" align="right">12</td> <td align="right">1845</td> <td align="right">9</td> <td align="right">30</td> <td align="right">1906</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl63" style="height: 15pt;" height="20">Abbott</td> <td class="xl63">Hiram</td> <td class="xl63">
</td> <td class="xl63" align="right">3</td> <td class="xl63" align="right">3</td> <td align="right">1841</td> <td align="right">8</td> <td align="right">25</td> <td align="right">1892</td> </tr> </tbody></table>
This is the last little hiccup and then i think I am set.
Much appreciated.
Tony
 
Upvote 0
Hi, I think its because the date function i used, is based on a date at the turn of the century.
Change the 2 lines in red , shown below.
These lines are also at the bottom of the Second Sub.
Rich (BB code):
If IsDate(Ray(3)) Then
        dn.Offset(, 4) = Split(Ray(3), "/")(1): dn.Offset(, 5) = Split(Ray(3), "/")(0): dn.Offset(, 6) = Split(Ray(3), "/")(2)
    ElseIf Ray(3) = "NO-DOB" Then
        dn.Offset(, 4) = "": dn.Offset(, 5) = "": dn.Offset(, 6) = ""
    End If
    dn.Offset(, 7) = Split(Ray(4), "/")(1): dn.Offset(, 8) = Split(Ray(4), "/")(0): dn.Offset(, 9) = Split(Ray(4), "/")(2)
    Next dn
Regards Mick
 
Upvote 0
That fixed it. I appreciate your time. You are a gifted programmer.
Your code will do 90% of the spreadsheet parsing for me. I can handle the rest of it.

Thanks-
Tony
 
Upvote 0
MickG-
Another spreadsheet. Only 5,000 on this one.
Here are the examples (all in one cell).
If there is a second middle name, then it is a Maiden name.
the date after the b. is birth and after the d. is death
the cityand or state is listed after the birth date and after death date

<table style="border-collapse: collapse; width: 488pt;" border="0" cellpadding="0" cellspacing="0" width="651"><col style="width: 488pt;" width="651"> <tbody><tr style="height: 15pt;" height="20"> <td class="xl63" style="height: 15pt; width: 488pt;" height="20" width="651">Abley, Alice Louisa Falk, b. 10/16/1881, Dewey, Ill, d. 1/23/1960, Alliance NE</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl63" style="height: 15pt;" height="20">Abley, Amos D., b. 6/12/1872, New Troy, Mich, d. 2/12/1942, Alliance NE</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl63" style="height: 15pt;" height="20">Abley, Cliff H. d. 1897</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl63" style="height: 15pt;" height="20">Abley, Emory F., b. 7/29/1868, Three Oaks, Mich, d. 9/7/1967, Hemingford NE</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl63" style="height: 15pt;" height="20">Abley, Evertt A., d. 7/29/12</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl63" style="height: 15pt;" height="20">Abley, Fredolin, no dates</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl63" style="height: 15pt;" height="20">Antrim, Ilene B. Frock, b. 11/14/1902, Bosworth, Mo D.12/2/1995, Arizona</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl63" style="height: 15pt;" height="20">Appleyard, Douglas L. d. 4/18/1950</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl63" style="height: 15pt;" height="20">Appleyard, J. Lloyd, b. 7/5/1913, Goehner NE, d. 10/22/1972, Lexington NE</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl63" style="height: 15pt;" height="20">Appleyard, Lela Mae Bowser, b. 6/21/1912, Genoa NE, d. 8/8/1998, Scottsbluff NE</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl63" style="height: 15pt;" height="20">Appleyard, Leslter R., b. 5/31/1916, Gohner NE, d. 11/8/1965, Scottsbluff NE</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl63" style="height: 15pt;" height="20">Bartos, Elsie Maud Leavitt, b. 5/30/1894, Hemingford NE, d. 4/10/1932, Hemingford, NE</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl63" style="height: 15pt;" height="20">Bartos, Frank, b. 5/21/1887, Nebraska, d. 12/2/1976, Hemingford NE</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl63" style="height: 15pt;" height="20">Bartos, Infant Son d. 1/14/1921</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl63" style="height: 15pt;" height="20">Bartos, Infant Son d. 1/16/1924</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl63" style="height: 15pt;" height="20">Bartos, Infant Son d. 10/25/1921</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl63" style="height: 15pt;" height="20">Bartos, Joanna Zvacek, b. 5/24/1856, Czechoslovakia, d. 6/6/1941, Box Butte Co., NE</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl63" style="height: 15pt;" height="20">Bartos, Louis, b. 8/19/1893, Box Butte Co. NE, d. 9/27/1986, Alliance NE, WWI</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl63" style="height: 15pt;" height="20">Bartos, Margaret S. Kiester, b. 2/6/1905, Box Butte Co., NE D.12/27/1996, Hemingford, NE</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl63" style="height: 15pt;" height="20">Bartos, Maurice, b. 8/12/1847, Austria, d. 2/19/1940, Box Butte Co. NE</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl63" style="height: 15pt;" height="20">Bartos, William, b. 9/7/1896, Pawnee, Okla, d. 8/10/1981, Alliance NE</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl63" style="height: 15pt;" height="20">Bass, Elvira, b. 3/1/1836, d. 12/18/1902</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl63" style="height: 15pt;" height="20">Bass, Moses, b. 10/1/1828, d. 2/8/1902</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl63" style="height: 15pt;" height="20">Basse, Conradina, b. 1/2/1905, d. 1900</td> </tr> </tbody></table>

What I would like to see parsed:

Col B: Last Name
Col C: First name
Col D: Middle Initial or blank if there is no initial listed
Col E: Maiden name
Col F: Month # (if one is listed)
Col G: Day # (if one is listed)
Col H: Year
Col I: City and or State after birth date
Col H: Month # (if one is listed)
Col I: Day # (if one is listed)
Col J: Year
Col K: City and or State after death date

Once again- YOU ROCK!
 
Upvote 0
Hi, Try this:-
NB:- Paste data into new sheet Start Cell "A1", The code will alter the basic Data, and output Resulting data Starting Column "B".
Code:
[COLOR="Navy"]Sub[/COLOR] MG30Aug49
[COLOR="Navy"]Dim[/COLOR] st [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]String,[/COLOR] ray, Nams, Ac [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Integer,[/COLOR] Nm [COLOR="Navy"]As[/COLOR] Variant
[COLOR="Navy"]Dim[/COLOR] Rng [COLOR="Navy"]As[/COLOR] Range, Dn [COLOR="Navy"]As[/COLOR] Range
[COLOR="Navy"]Set[/COLOR] Rng = Range(Range("A1"), Range("A" & Rows.Count).End(xlUp))
[COLOR="Navy"]For[/COLOR] [COLOR="Navy"]Each[/COLOR] Dn [COLOR="Navy"]In[/COLOR] Rng
Dn = Replace(Replace(Dn, "no dates", ""), "Infant Son", "")
Dn = Replace(Replace(Dn, "d.", "D."), "b.", "B.")
[COLOR="Navy"]If[/COLOR] IsNumeric(Mid(Dn, InStrRev(Dn, "D.") + 3, 1)) [COLOR="Navy"]Then[/COLOR]
Dn.Characters(InStrRev(Dn, "D."), 2).Text = "@ "
[COLOR="Navy"]End[/COLOR] If
[COLOR="Navy"]If[/COLOR] IsNumeric(Mid(Dn, InStrRev(Dn, "B.") + 3, 1)) Or IsNumeric(Mid(Dn, InStr(Dn, "B.") + 3, 1)) [COLOR="Navy"]Then[/COLOR]
Dn.Characters(InStrRev(Dn, "B."), 2).Text = "@ "
[COLOR="Navy"]End[/COLOR] If
st = Dn
ray = Split(st, "@")
[COLOR="Navy"]For[/COLOR] Ac = 0 To UBound(ray)
Nm = Split(ray(Ac), " ")
[COLOR="Navy"]Select[/COLOR] [COLOR="Navy"]Case[/COLOR] Ac
Case Is = 0: Dn.Offset(, 1).Resize(, UBound(Nm)) = Split(ray(Ac), " ") '[COLOR="Green"][B]: MsgBox "0  " & ray(0)[/B][/COLOR]
[COLOR="Navy"]Case[/COLOR] [COLOR="Navy"]Is[/COLOR] = 1
[COLOR="Navy"]If[/COLOR] InStr(Split(ray(Ac), ",")(0), "/") = 0 [COLOR="Navy"]Then[/COLOR] ray(Ac) = " / /" & Trim(ray(Ac))
Dn.Offset(, 5).Resize(, 3) = Split(Split(ray(Ac), ",")(0), "/")
[COLOR="Navy"]If[/COLOR] UBound(Split(ray(Ac), ",")) > 1 [COLOR="Navy"]Then[/COLOR]
Dn.Offset(, 8) = Split(ray(Ac), ",")(1)
[COLOR="Navy"]End[/COLOR] If
[COLOR="Navy"]Case[/COLOR] [COLOR="Navy"]Is[/COLOR] = 2
[COLOR="Navy"]If[/COLOR] InStr(Split(ray(Ac), ",")(0), "/") = 0 [COLOR="Navy"]Then[/COLOR] ray(Ac) = " / /" & Trim(ray(Ac))
Dn.Offset(, 9).Resize(, 3) = Split(Split(ray(Ac), ",")(0), "/")
[COLOR="Navy"]If[/COLOR] UBound(Split(ray(Ac), ",")) > 1 [COLOR="Navy"]Then[/COLOR]
Dn.Offset(, 12) = Split(ray(Ac), ",")(1)
[COLOR="Navy"]End[/COLOR] If
[COLOR="Navy"]End[/COLOR] Select
[COLOR="Navy"]Next[/COLOR] Ac
[COLOR="Navy"]Next[/COLOR] Dn
[COLOR="Navy"]End[/COLOR] [COLOR="Navy"]Sub[/COLOR]
Regards Mick
 
Upvote 0
MickG-
That works great. I am very appreciative of all the work you have done towards these projects for me. The family of these loved ones will be able to find them faster and maybe answer some genealogy questions.

Thanks again-
Tony
 
Upvote 0
Mick-
Is there an easy way to split a single cell into multiples if there are symbols dividing the information?

Example in A1:
anything @ can be @ in these spaces @ even 123123

Parsed into:
B1: anything
C1: can be
D1: in these spaces
E1: even 123123

Is there a way to have it keep searching for @'s until there are no more (in case one has 3, another one has 2, and another one has 4 or 5?

Thanks
 
Upvote 0
Hi, In principal , Yes., but it really depends on how the data is set out.
Each Cell in column "A" of your recent data was split by "@", as you will be aware.
I used "@" to replace "d." and "b." because in some rows there where duplicates of the Letters which would have caused confusion.
In each Row of data the letters "d." or "b." where succeeded by a date and then an a location, this made it relatively easy to parse.
Because there where on two pieces of data for each "Born" and "Died" (Date & Location) I new this would only Take 4 Columns for each.
If there where more data it could be more problematical
If it is possible to find more distnct features to split the data on all, then splitting subsequent data is possible.


Regards Mick
 
Upvote 0
I have been using your last bit of code to work a few other lists and they are a little different. Can you please tweak it a bit?

Here is example data:
<table style="border-collapse: collapse; width: 268pt;" border="0" cellpadding="0" cellspacing="0" width="357"> <col style="width: 268pt;" width="357"> <tbody><tr style="height: 15pt;" height="20"> <td class="xl63" style="height: 15pt; width: 268pt;" height="20" width="357">BROWN, Litha, B. Mar 5, 1875, D. Jan 20, 1961</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl63" style="height: 15pt;" height="20">BROWN, Lula V. Crutchfield, B. Jan 25, 1878, D. May 20, 1939</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl63" style="height: 15pt;" height="20">BROWN, Melba Arneice, B. Dec 4, 1960, D. Dec 4, 1960</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl63" style="height: 15pt;" height="20">BROWN, Minnie, B. Apr 20, 1883, D. Jan 18, 1914</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl63" style="height: 15pt;" height="20">BROWN, Pearl Evelyn, B. Mar 3, 1900, D. Oct 5, 1976</td> </tr> </tbody> </table>
Here is what your MS30Aug49 does:
<table style="border-collapse: collapse; width: 528pt;" border="0" cellpadding="0" cellspacing="0" width="704"><col style="width: 48pt;" span="11" width="64"> <tbody><tr style="height: 15pt;" height="20"> <td style="height: 15pt; width: 48pt;" height="20" width="64">BROWN,</td> <td style="width: 48pt;" width="64">Litha,</td> <td style="width: 48pt;" width="64">B.</td> <td style="width: 48pt;" width="64">Mar</td> <td style="width: 48pt;" width="64">5,</td> <td style="width: 48pt;" width="64">1875,</td> <td style="width: 48pt;" width="64">D.</td> <td style="width: 48pt;" width="64">Jan</td> <td style="width: 48pt;" width="64">20,</td> <td style="width: 48pt;" width="64">
</td> <td style="width: 48pt;" width="64">
</td> </tr> <tr style="height: 15pt;" height="20"> <td style="height: 15pt;" height="20">BROWN,</td> <td>Lula</td> <td>V.</td> <td>Crutchfield,</td> <td>B.</td> <td>Jan</td> <td>25,</td> <td>1878,</td> <td>D.</td> <td>May</td> <td>20,</td> </tr> <tr style="height: 15pt;" height="20"> <td style="height: 15pt;" height="20">BROWN,</td> <td>Melba</td> <td>Arneice,</td> <td>B.</td> <td>Dec</td> <td>4,</td> <td>1960,</td> <td>D.</td> <td>Dec</td> <td>4,</td> <td>
</td> </tr> <tr style="height: 15pt;" height="20"> <td style="height: 15pt;" height="20">BROWN,</td> <td>Minnie,</td> <td>B.</td> <td>Apr</td> <td>20,</td> <td>1883,</td> <td>D.</td> <td>Jan</td> <td>18,</td> <td>
</td> <td>
</td> </tr> <tr style="height: 15pt;" height="20"> <td style="height: 15pt;" height="20">BROWN,</td> <td>Pearl</td> <td>Evelyn,</td> <td>B.</td> <td>Mar</td> <td>3,</td> <td>1900,</td> <td>D.</td> <td>Oct</td> <td>5,</td> <td>
</td> </tr> </tbody></table>
Like to see:
Col B: Last Name
Col C: First name
Col D: Middle Initial or name or blank if there is no initial listed
Col E: Extra name (see above Crutchfield)
Col F: Month # (if one is listed)
Col G: Day # (if one is listed)
Col H: Year
Col I: Month # (if one is listed)
Col J: Day # (if one is listed)
Col K: Year
I can find and replace the months to convert to numbers, unless it is an easy bit of code. Commas removed would be nice, but again I can find and replace those. I do not need the B. or D. in columns for birth and death.

Once again...thanks.
Tony
 
Upvote 0

Forum statistics

Threads
1,215,514
Messages
6,125,271
Members
449,219
Latest member
daynle

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top