Parsing Text

Alex O

Active Member
Joined
Mar 16, 2009
Messages
345
Office Version
  1. 365
Platform
  1. Windows
I need a formula (in I2) that will look at H2 and parse out 5608 Ave K then J2 parse out Birmingham AL and K2 35208. This started out relatively easy enough....but the addresses that have Dr, Ave, Ln, etc are messing me up! Any ideas??

Thanks

<TABLE style="BACKGROUND-COLOR: #ffffff; PADDING-LEFT: 2pt; PADDING-RIGHT: 2pt; FONT-FAMILY: Arial,Arial; FONT-SIZE: 10pt" border=1 cellSpacing=0 cellPadding=0><COLGROUP><COL style="WIDTH: 30px; FONT-WEIGHT: bold"><COL style="WIDTH: 248px"><COL style="WIDTH: 64px"><COL style="WIDTH: 64px"><COL style="WIDTH: 64px"></COLGROUP><TBODY><TR style="TEXT-ALIGN: center; BACKGROUND-COLOR: #cacaca; FONT-SIZE: 8pt; FONT-WEIGHT: bold"><TD> </TD><TD>H</TD><TD>I</TD><TD>J</TD><TD>K</TD></TR><TR style="HEIGHT: 34px"><TD style="TEXT-ALIGN: center; BACKGROUND-COLOR: #cacaca; FONT-SIZE: 8pt">1</TD><TD style="TEXT-ALIGN: center; FONT-WEIGHT: bold">Billing Address</TD><TD> </TD><TD> </TD><TD> </TD></TR><TR style="HEIGHT: 34px"><TD style="TEXT-ALIGN: center; BACKGROUND-COLOR: #cacaca; FONT-SIZE: 8pt">2</TD><TD style="TEXT-ALIGN: left">5608 Ave K Birmingham AL 35208</TD><TD> </TD><TD> </TD><TD> </TD></TR><TR style="HEIGHT: 34px"><TD style="TEXT-ALIGN: center; BACKGROUND-COLOR: #cacaca; FONT-SIZE: 8pt">3</TD><TD style="TEXT-ALIGN: left">105 Shire Dr Pelham AL 35214</TD><TD> </TD><TD> </TD><TD> </TD></TR><TR style="HEIGHT: 17px"><TD style="TEXT-ALIGN: center; BACKGROUND-COLOR: #cacaca; FONT-SIZE: 8pt">4</TD><TD style="BACKGROUND-COLOR: #ffff00; FONT-WEIGHT: bold"> </TD><TD> </TD><TD> </TD><TD> </TD></TR><TR style="HEIGHT: 34px"><TD style="TEXT-ALIGN: center; BACKGROUND-COLOR: #cacaca; FONT-SIZE: 8pt">5</TD><TD style="TEXT-ALIGN: left">105 Shire Dr Pelham AL 35214</TD><TD> </TD><TD> </TD><TD> </TD></TR><TR style="HEIGHT: 34px"><TD style="TEXT-ALIGN: center; BACKGROUND-COLOR: #cacaca; FONT-SIZE: 8pt">6</TD><TD style="TEXT-ALIGN: left">7735 Old Birmingham Hwy Cottondale AL 35453</TD><TD> </TD><TD> </TD><TD> </TD></TR><TR style="HEIGHT: 34px"><TD style="TEXT-ALIGN: center; BACKGROUND-COLOR: #cacaca; FONT-SIZE: 8pt">7</TD><TD style="TEXT-ALIGN: left">109 Blackwell Ln Oakman AL 35579</TD><TD> </TD><TD> </TD><TD> </TD></TR><TR style="HEIGHT: 51px"><TD style="TEXT-ALIGN: center; BACKGROUND-COLOR: #cacaca; FONT-SIZE: 8pt">8</TD><TD style="TEXT-ALIGN: left">2005 Old Montgomery Hwy Birmingham AL 35244</TD><TD> </TD><TD> </TD><TD> </TD></TR></TBODY></TABLE>
 
Hi Alex,

Here is my shot...

Workbook Setup:

The first sheet would be a new added worksheet.
Change the tab name to "CityNames".
Change the sheet's codename to "shtCityNames".

The second sheet is the one you have the complete addresses in Col H.
I changed the tab name to "DataSheet".
Change the sheet's CodeName to "shtDataSheet"

Go to http://en.wikipedia.org/wiki/List_of_cities_in_Alabama and copy the contents of the table of cities. Paste this to cell A1 on CityNames.
This sheet will then look something like:
Excel Workbook
ABCD
1AbbevilleHenry County2,987&0000000000000015.61000015.61 sq mi (40.42 km2)
2AdamsvilleJefferson County4,965&0000000000000019.61000019.61 sq mi (50.79 km2)
3AddisonWinston County723&0000000000000003.5100003.51 sq mi (9.10 km2)
4AkronHale County521&0000000000000000.5500000.55 sq mi (1.44 km2)
5AlabasterShelby County22,619&0000000000000020.53000020.53 sq mi (53.17 km2)
CityNames


It does not matter if the cells are overly tall or too narrow etc., and we'll ditch the empty cells after un-merging. Of course I am only showing several cells for the post; you should have 500 rows data in the first column.

In a Standard Module:
Rich (BB code):
Option Explicit
    
Sub SplitAddresses()
'//*********************************************************************************//
'// Dim REX As RegExp, rexMatches As MatchCollection if Early-Bound                 //
'//*********************************************************************************//
Dim _
aryInput            As Variant, REX             As Object, _
aryOutput           As Variant, rexMatches      As Object, _
aryCityNames        As Variant, rngData         As Range, _
ele                 As Variant, rngOutput       As Range, _
i                   As Long
    
    Set REX = CreateObject("VBScript.RegExp")
    With shtCityNames
        aryCityNames = .Range(.Range("A1"), .Cells(.Rows.Count, "A").End(xlUp)).Value
        ReDim aryOutput(1 To UBound(aryCityNames, 1), 1 To 3)
    End With
    
    With shtDataSheet
        Set rngData = .Range(.Range("H2"), .Cells(.Rows.Count, "H").End(xlUp))
        Set rngOutput = rngData.Offset(, 1).Resize(, 3)
        rngOutput.Resize(.Rows.Count - rngOutput.Row).ClearContents
    End With
    aryInput = rngData.Value
    
    With REX
        .Global = False
        .IgnoreCase = False
        For i = 1 To UBound(aryInput, 1)
            If Not aryInput(i, 1) = Empty Then
                For Each ele In aryCityNames
                    .Pattern = "(.+?)(" & ele & ")(.+)"
                    If .Test(aryInput(i, 1)) Then
                        Set rexMatches = .Execute(aryInput(i, 1))
                        aryOutput(i, 1) = rexMatches(0).SubMatches(0)
                        aryOutput(i, 2) = rexMatches(0).SubMatches(1)
                        aryOutput(i, 3) = rexMatches(0).SubMatches(2)
                        Exit For
                    End If
                Next
            End If
        Next
    End With
    rngOutput.Value = aryOutput
    rngOutput.EntireColumn.AutoFit
End Sub
    
Sub SetUpCityNames()
Dim _
i           As Long, _
x           As Long, _
y           As Long, _
aryInput    As Variant, _
aryPattern  As Variant, _
rCell       As Range, _
Pattern     As String
    
    Application.ScreenUpdating = False
    With shtCityNames
        .Columns("B:D").Delete
        With .Range(.Range("A1"), .Cells(.Rows.Count, "A").End(xlUp))
            .Hyperlinks.Delete
            .HorizontalAlignment = xlGeneral
            .VerticalAlignment = xlBottom
            .WrapText = False
            .Orientation = 0
            .MergeCells = False
            For i = .Range(.Address).Rows.Count To 1 Step -1
                If Not .Cells(i, "A").Value = vbNullString Then
                    .Cells(i, "A").Value = .Cells(i, "A").Value & Chr(32) & "AL"
                    .Cells(i, "A").TextToColumns Destination:=.Cells(i, "A"), _
                                                 DataType:=xlDelimited, _
                                                 TextQualifier:=xlDoubleQuote, _
                                                 ConsecutiveDelimiter:=True, _
                                                 Tab:=False, Semicolon:=False, _
                                                 Comma:=False, Space:=True, _
                                                 Other:=False, _
                                                 FieldInfo:=Array(Array(1, xlTextFormat), _
                                                                  Array(2, xlTextFormat))
                Else
                    .Cells(i, "A").EntireRow.Delete
                End If
            Next
            .Columns("A:A").Insert
            .Columns("A:F").EntireColumn.AutoFit
            .Rows.EntireRow.AutoFit
        End With
        
        aryInput = _
            .Range(.Range("B1"), _
                   .Cells(RangeFound(.Cells, , , , , xlByRows).Row, _
                          RangeFound(.Cells, , , , , xlByColumns).Offset(, 1).Column) _
                   ).Value
        
        ReDim aryPattern(1 To UBound(aryInput, 1), 1 To 1)
        
        For x = LBound(aryInput, 1) To UBound(aryInput, 1)
            
            Pattern$ = "\b"
            For y = LBound(aryInput, 2) To UBound(aryInput, 2)
                
                Select Case aryInput(x, y)
                Case Empty
                    Pattern$ = Left(Pattern$, InStrRev(Pattern$, "\") - 1)
                    Exit For
                Case "OOR"
                    Pattern$ = Left(Pattern$, InStrRev(Pattern$, "\") - 1) & "\b|"
                Case Else
                    Pattern$ = Pattern$ & aryInput(x, y) & "\ {0,2}"
                End Select
            Next
            aryPattern(x, 1) = Pattern$ & "\b"
        Next
        .Range("A1").Resize(UBound(aryPattern, 1)).Value = aryPattern
        .Range("A1").EntireColumn.AutoFit
    End With
    Application.ScreenUpdating = True
End Sub
    
Function RangeFound(SearchRange As Range, _
                    Optional FindWhat As String = "*", _
                    Optional StartingAfter As Range, _
                    Optional LookAtTextOrFormula As XlFindLookIn = xlValues, _
                    Optional LookAtWholeOrPart As XlLookAt = xlPart, _
                    Optional SearchRowCol As XlSearchOrder = xlByRows, _
                    Optional SearchUpDn As XlSearchDirection = xlPrevious, _
                    Optional bMatchCase As Boolean = False) As Range
    
    If StartingAfter Is Nothing Then
        Set StartingAfter = SearchRange(1)
    End If
    
    Set RangeFound = SearchRange.Find(What:=FindWhat, _
                                      After:=StartingAfter, _
                                      LookIn:=LookAtTextOrFormula, _
                                      LookAt:=LookAtWholeOrPart, _
                                      SearchOrder:=SearchRowCol, _
                                      SearchDirection:=SearchUpDn, _
                                      MatchCase:=bMatchCase)
End Function

Now, run SetUpCityNames(). The CityNames sheet should now look like:
Excel Workbook
ABCD
1\bAbbeville\ {0,2}AL\bAbbevilleAL
2\bAdamsville\ {0,2}AL\bAdamsvilleAL
3\bAddison\ {0,2}AL\bAddisonAL
4\bAkron\ {0,2}AL\bAkronAL
5\bAlabaster\ {0,2}AL\bAlabasterAL
6\bAlbertville\ {0,2}AL\bAlbertvilleAL
7\bAlexander\ {0,2}City\ {0,2}AL\bAlexanderCityAL
8\bAliceville\ {0,2}AL\bAlicevilleAL
9\bAllgood\ {0,2}AL\bAllgoodAL
10\bAltoona\ {0,2}AL\bAltoonaAL
CityNames


SetUpCityNames() only needs run once, but you may wish to save it in case you add other states later. Plunking in a fake variable as an arg will keep it from being run accidently (eg - Sub SetUpCityNames(Fake)).

Although I left the names for ease of reading, you can delete Columns B and thereafter on the CityNames sheet, as only the patterns in Column A are used for the parsing code.

With the below sample data in Col H, after running SplitAddresses:

Excel Workbook
HIJK
1
2110 N. Hills Dr Akron AL 33423-12323110 N. Hills DrAkron AL33423-12323
3334 Old Miner's Rd Bayou La Batre AL 32132334 Old Miner's RdBayou La Batre AL32132
4
55608 Ave K Birmingham AL 352085608 Ave KBirmingham AL35208
6105 Shire Dr Pelham AL 35214105 Shire DrPelham AL35214
710512 N. SR 88 Alexander City AL 3521410512 N. SR 88Alexander City AL35214
8105 Shire Dr Pelham AL 35214105 Shire DrPelham AL35214
97735 Old Birmingham Hwy Cottondale AL 35453
10109 Blackwell Ln Oakman AL 35579109 Blackwell LnOakman AL35579
112005 Old Montgomery Hwy BirminghamAL 352442005 Old Montgomery HwyBirminghamAL35244
122005 Old Montgomery Hwy Bayou La Batre AL 352442005 Old Montgomery HwyBayou La Batre AL35244
DataSheet


This should handle multiple word names (rows 3,7,12), missing data (row 4), errant missing space (row 11) or extra space (row 12).

It will not correct errant missing/extra spaces, but hopefully these are few. I did see that you had a Cottondale listed, which will not return, as its not in our list. For any odd names that are in the wiki list, you could add the pattern to CityNames after running SetUpCityNames.

Hope that helps,

Mark
 
Upvote 0

Excel Facts

Control Word Wrap
Press Alt+Enter to move to a new row in a cell. Lets you control where the words wrap.
Very Nice...problem solved! I couldn't agree more hiker95...
 
Upvote 0

Forum statistics

Threads
1,215,352
Messages
6,124,449
Members
449,160
Latest member
nikijon

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top