Parse words out of string

Nuz

Board Regular
Joined
Aug 16, 2010
Messages
88
I have a string consisting of multiple words which I'd like to break down into separate words by VBA. For example the initial string is an address like "Street 66 123456 LONDON". I need to parse the city, postal code and the street out of that string and put them into separate variables.

The problem is that sometimes the string can be like "Street 66 123456 LONDON (CITY), i.e. the city is not one word but two or more. That's why for the postal code I can't always just find the second word from the right since it is not always the postal code.

I know that:
-the city is always the right-most word or words,
-postal code is always left of city
-street address is everything else and starts always from the left

Is there a foolproof way to parse out these words from the initial string?
 

Excel Facts

Waterfall charts in Excel?
Office 365 customers have access to Waterfall charts since late 2016. They were added to Excel 2019.
I rushed this a bit as this is a flying visit to the forum but try....

Code:
Function ParseString(pStr As String, pComponent As String) As String

Dim sCnt As Integer, sHld() As String, x As Integer, y As Integer
Dim q As Integer, pcAdj As Integer, p As Integer

sHld = Split(pStr, " ")

For x = UBound(sHld()) To 0 Step -1
    For y = Len(sHld(x)) To 1 Step -1
        If IsNumeric(Mid(sHld(x), y, 1)) = True Then
            'City is everything above current x which is post code
            If LCase(pComponent) = "city" Then
                For q = x + 1 To UBound(sHld())
                    If ParseString <> "" Then
                        ParseString = ParseString & " "
                    End If
                    ParseString = ParseString & sHld(q)
                Next
                Exit Function
            End If
            
            pcAdj = 0
            For p = Len(sHld(x - 1)) To 1 Step -1
                If IsNumeric(Mid(sHld(x - 1), p, 1)) = True Then
                    pcAdj = 1
                End If
            Next
            If LCase(pComponent) = "postcode" Then
                If pcAdj = 1 Then
                    ParseString = sHld(x - 1) & sHld(x)
                Else
                    ParseString = sHld(x)
                End If
                Exit Function
            End If
            If LCase(pComponent) = "address" Then
                For q = 0 To x - 1 - pcAdj
                    If ParseString <> "" Then
                        ParseString = ParseString & " "
                    End If
                    ParseString = ParseString & sHld(q)
                Next
                Exit Function
            End If
        End If
    Next
Next

End Function

You then pass the string to that function and it will return the bit you need (hopefully) based on passing "city","postcode" or "address" as the pComponent value

eg

Code:
Sub TestParse()

MsgBox ParseString("123 Ken Strt SW13 2NL London (City)", "City")
MsgBox ParseString("123 Ken Strt SW13 2NL London (City)", "PostCode")
MsgBox ParseString("123 Ken Strt SW13 2NL London (City)", "Address")
End Sub

Should return the expected values. It is based on checking backwards for a numeric value which we SHOULD be able to assume will be part of the post code, based on your specifications.

Good luck
 
Upvote 0
Very nice. Works well.
Only problem that remains is the postal code if it has leading zeros. For example, if the postal code within the string is 001234, the function returns it as 1234, dropping out the leading zeros.
Is there any way to fix it?
 
Upvote 0
Just figured it out.
Since i will eventually put these values into worksheet cells, I just set the cell format to "Text" rather tham numerical value and the leading zeros are displayed correctly.

Big thanks for your time on this.
 
Upvote 0
Nuz

I was just investigating other approaches to this problem and was wondering if you may be able to supply a few more (varied) samples of initial strings and the corresponding desired results?

Also, are the initial strings in a worksheet column with the results to be placed in columns beside?
 
Last edited:
Upvote 0
Initial strings are in one column (say colun A). Desired results are put to columns next to it (columns B, C and D).

Examples:
(a) "Street 66 012345 LONDON"
(b) "Street 66-99 012345 LONDON"
(c) "Street 66 (Street 88) 012345 LONDON"
(d) "Street 66 / Street 88 012345 LONDON"
(e) "Street 66 012345 LONDON (CITY)

Desired results:
(a) "Street 66", "012345", "LONDON"
(b) "Street 66-99", "012345", "LONDON"
(c) "Street 66 (Street 88)", "012345", "LONDON"
(d) "Street 66 / Street 88", "012345", "LONDON"
(e) "Street 66", "012345", "LONDON (CITY)"

But considering my original data (~ 200 addresses), I got an acceptable result by using the funcion presented here earlier. Of course any additional suggestions appreciated.
 
Upvote 0
Hi Nuz

You can only get a solution as good as the specification of the problem. If you do not define rigorously how to split the text you won't get a trustworthy result.

Looking at your examples, it seems that one could say:

1 - the text always has a 6 digit number

2 - the text should be split like this:

< Text before the 6 digit number > < 6 digit number > < Text after the 6 digit number >

Is this correct?
 
Upvote 0
Sorry about the too general inctructions.
Postal code is not necessarily 6 digits. I would put the process this way:

1) For postal code, give the first series of numbers (any number of digits) iterating from the right
2) For city, give everything that lies on the right side of the postal code
3) For address, give everything that lies on the left side of the postal code

For example, "Street 66 99 66 2345 LONDON" would be broken down to "Street 66 99 66" , "2345", "LONDON"

Does that make it specific enough?
 
Upvote 0
Does that make it specific enough?

Yes!

Try this udf:

Code:
Function SplitAddress(s As String, lInd As Long) As String
Static RegEx As Object
 
If RegEx Is Nothing Then
    Set RegEx = CreateObject("VBScript.RegExp")
    RegEx.Pattern = "^(.+)\b(\d+)(.+)$"
End If
 
If RegEx.test(s) Then SplitAddress = Trim(RegEx.Execute(s)(0).submatches(lInd - 1))
End Function


<table border="1" cellpadding="1" style="background:#FFF; border-collapse:collapse;border-width:2px;border-color:#CCCCCC;font-family:Arial,Arial; font-size:10pt" ><tr><th style="border-width:1px;border-color:#888888;background:#9CF " > </th><th style="border-width:1px;border-color:#888888;background:#9CF; text-align:center" >A</th><th style="border-width:1px;border-color:#888888;background:#9CF; text-align:center" >B</th><th style="border-width:1px;border-color:#888888;background:#9CF; text-align:center" >C</th><th style="border-width:1px;border-color:#888888;background:#9CF; text-align:center" >D</th><th style="border-width:1px;border-color:#888888;background:#9CF; text-align:center" width=30 >E</th></tr><tr><td style="border-width:1px;border-color:#000000; padding-left:0.5em; padding-top:0.4em; padding-right:0.5em; padding-bottom:0.25em;background:#9CF; text-align:center; " ><b>1</b></td><td style="border-width:1px;border-color:#000000; padding-left:0.5em; padding-top:0.4em; padding-right:0.5em; padding-bottom:0.25em;text-align:left;border-width: 1px;border-color:#888888; ">Street 66 012345 LONDON</td><td style="border-width:1px;border-color:#000000; padding-left:0.5em; padding-top:0.4em; padding-right:0.5em; padding-bottom:0.25em;text-align:left;border-width: 1px;border-color:#888888; ">Street 66</td><td style="border-width:1px;border-color:#000000; padding-left:0.5em; padding-top:0.4em; padding-right:0.5em; padding-bottom:0.25em;text-align:left;border-width: 1px;border-color:#888888; ">012345</td><td style="border-width:1px;border-color:#000000; padding-left:0.5em; padding-top:0.4em; padding-right:0.5em; padding-bottom:0.25em;text-align:left;border-width: 1px;border-color:#888888; ">LONDON</td><td style="border-width:1px;border-color:#000000; padding-left:0.5em; padding-top:0.4em; padding-right:0.5em; padding-bottom:0.25em;text-align:right;border-width: 1px;border-color:#888888; "> </td></tr><tr><td style="border-width:1px;border-color:#000000; padding-left:0.5em; padding-top:0.4em; padding-right:0.5em; padding-bottom:0.25em;background:#9CF; text-align:center; " ><b>2</b></td><td style="border-width:1px;border-color:#000000; padding-left:0.5em; padding-top:0.4em; padding-right:0.5em; padding-bottom:0.25em;text-align:left;border-width: 1px;border-color:#888888; ">Street 66-99 012345 LONDON</td><td style="border-width:1px;border-color:#000000; padding-left:0.5em; padding-top:0.4em; padding-right:0.5em; padding-bottom:0.25em;text-align:left;border-width: 1px;border-color:#888888; ">Street 66-99</td><td style="border-width:1px;border-color:#000000; padding-left:0.5em; padding-top:0.4em; padding-right:0.5em; padding-bottom:0.25em;text-align:left;border-width: 1px;border-color:#888888; ">012345</td><td style="border-width:1px;border-color:#000000; padding-left:0.5em; padding-top:0.4em; padding-right:0.5em; padding-bottom:0.25em;text-align:left;border-width: 1px;border-color:#888888; ">LONDON</td><td style="border-width:1px;border-color:#000000; padding-left:0.5em; padding-top:0.4em; padding-right:0.5em; padding-bottom:0.25em;text-align:right;border-width: 1px;border-color:#888888; "> </td></tr><tr><td style="border-width:1px;border-color:#000000; padding-left:0.5em; padding-top:0.4em; padding-right:0.5em; padding-bottom:0.25em;background:#9CF; text-align:center; " ><b>3</b></td><td style="border-width:1px;border-color:#000000; padding-left:0.5em; padding-top:0.4em; padding-right:0.5em; padding-bottom:0.25em;text-align:left;border-width: 1px;border-color:#888888; ">Street 66 (Street 88) 012345 LONDON</td><td style="border-width:1px;border-color:#000000; padding-left:0.5em; padding-top:0.4em; padding-right:0.5em; padding-bottom:0.25em;text-align:left;border-width: 1px;border-color:#888888; ">Street 66 (Street 88)</td><td style="border-width:1px;border-color:#000000; padding-left:0.5em; padding-top:0.4em; padding-right:0.5em; padding-bottom:0.25em;text-align:left;border-width: 1px;border-color:#888888; ">012345</td><td style="border-width:1px;border-color:#000000; padding-left:0.5em; padding-top:0.4em; padding-right:0.5em; padding-bottom:0.25em;text-align:left;border-width: 1px;border-color:#888888; ">LONDON</td><td style="border-width:1px;border-color:#000000; padding-left:0.5em; padding-top:0.4em; padding-right:0.5em; padding-bottom:0.25em;text-align:right;border-width: 1px;border-color:#888888; "> </td></tr><tr><td style="border-width:1px;border-color:#000000; padding-left:0.5em; padding-top:0.4em; padding-right:0.5em; padding-bottom:0.25em;background:#9CF; text-align:center; " ><b>4</b></td><td style="border-width:1px;border-color:#000000; padding-left:0.5em; padding-top:0.4em; padding-right:0.5em; padding-bottom:0.25em;text-align:left;border-width: 1px;border-color:#888888; ">Street 66 / Street 88 012345 LONDON</td><td style="border-width:1px;border-color:#000000; padding-left:0.5em; padding-top:0.4em; padding-right:0.5em; padding-bottom:0.25em;text-align:left;border-width: 1px;border-color:#888888; ">Street 66 / Street 88</td><td style="border-width:1px;border-color:#000000; padding-left:0.5em; padding-top:0.4em; padding-right:0.5em; padding-bottom:0.25em;text-align:left;border-width: 1px;border-color:#888888; ">012345</td><td style="border-width:1px;border-color:#000000; padding-left:0.5em; padding-top:0.4em; padding-right:0.5em; padding-bottom:0.25em;text-align:left;border-width: 1px;border-color:#888888; ">LONDON</td><td style="border-width:1px;border-color:#000000; padding-left:0.5em; padding-top:0.4em; padding-right:0.5em; padding-bottom:0.25em;text-align:right;border-width: 1px;border-color:#888888; "> </td></tr><tr><td style="border-width:1px;border-color:#000000; padding-left:0.5em; padding-top:0.4em; padding-right:0.5em; padding-bottom:0.25em;background:#9CF; text-align:center; " ><b>5</b></td><td style="border-width:1px;border-color:#000000; padding-left:0.5em; padding-top:0.4em; padding-right:0.5em; padding-bottom:0.25em;text-align:left;border-width: 1px;border-color:#888888; ">Street 66 012345 LONDON (CITY)</td><td style="border-width:1px;border-color:#000000; padding-left:0.5em; padding-top:0.4em; padding-right:0.5em; padding-bottom:0.25em;text-align:left;border-width: 1px;border-color:#888888; ">Street 66</td><td style="border-width:1px;border-color:#000000; padding-left:0.5em; padding-top:0.4em; padding-right:0.5em; padding-bottom:0.25em;text-align:left;border-width: 1px;border-color:#888888; ">012345</td><td style="border-width:1px;border-color:#000000; padding-left:0.5em; padding-top:0.4em; padding-right:0.5em; padding-bottom:0.25em;text-align:left;border-width: 1px;border-color:#888888; ">LONDON (CITY)</td><td style="border-width:1px;border-color:#000000; padding-left:0.5em; padding-top:0.4em; padding-right:0.5em; padding-bottom:0.25em;text-align:right;border-width: 1px;border-color:#888888; "> </td></tr><tr><td style="border-width:1px;border-color:#000000; padding-left:0.5em; padding-top:0.4em; padding-right:0.5em; padding-bottom:0.25em;background:#9CF; text-align:center; " ><b>6</b></td><td style="border-width:1px;border-color:#000000; padding-left:0.5em; padding-top:0.4em; padding-right:0.5em; padding-bottom:0.25em;text-align:right;border-width: 1px;border-color:#888888; "> </td><td style="border-width:1px;border-color:#000000; padding-left:0.5em; padding-top:0.4em; padding-right:0.5em; padding-bottom:0.25em;text-align:right;border-width: 1px;border-color:#888888; "> </td><td style="border-width:1px;border-color:#000000; padding-left:0.5em; padding-top:0.4em; padding-right:0.5em; padding-bottom:0.25em;text-align:right;border-width: 1px;border-color:#888888; "> </td><td style="border-width:1px;border-color:#000000; padding-left:0.5em; padding-top:0.4em; padding-right:0.5em; padding-bottom:0.25em;text-align:right;border-width: 1px;border-color:#888888; "> </td><td style="border-width:1px;border-color:#000000; padding-left:0.5em; padding-top:0.4em; padding-right:0.5em; padding-bottom:0.25em;text-align:right;border-width: 1px;border-color:#888888; "> </td></tr><tr><td colspan=6 style="background:#9CF; padding-left:1em" > [Book1]Sheet2</td></tr></table>
 
Upvote 0
P. S. The formula I used, in B1:

=SplitAddress($A1,COLUMNS($B1:B1))

Copy down and across.
 
Upvote 0

Forum statistics

Threads
1,224,595
Messages
6,179,798
Members
452,943
Latest member
Newbie4296

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top