Extract 5 digit numbers from cell and separate them by |

dougbohr

New Member
Joined
Jun 4, 2020
Messages
6
Office Version
  1. 2013
Platform
  1. Windows
Hello, this is my first post. Any help is greatly appreciated!

I need to reformat construction cost codes that are coming from business contacts in Outlook. I am exporting a .csv file from Outlook that contains hundreds of rows (each row is a company contact). When I export, the cost code column cell also contains a text description. I ONLY need the (5) digit cost code number, and if there are multiple cost codes, they need to be separated by a vertical bar |

Example:

"00815 Land Surveyors;MBE / WBE / DBE; Subcontractor" = "00815"
"13000 Special Construction; Subcontractor" = "13000"
"02525 Curbing;02900 Landscaping;03300 Concrete;16100 Eleectrical;MBE / WBE / DBE;Subcontractor" = "02525 | 02900 | 03300 | 16100"
"08100 Hollow Metal;08200 Wood Doors;08700 Hardware;Supplier"= "08100 | 08200 | 08700"

Is there any way to extract ONLY the (5) digit cost code, and separate them by "space, vertical bar, space" (if the cell contains more than 1 cost code)?

Thanks!
Doug Bohr
 

Excel Facts

Last used cell?
Press Ctrl+End to move to what Excel thinks is the last used cell.
Hi

You could use regular expressions. That's a way of finding patterns in texts. So if you're looking for anything that matches 5 digits, you would use the Regular Expression: \d{5} . Here the \d means digit.

I don't know how your worksheet is structured (i.e. which cells the data is in), but you should be able to use the function below from a worksheet. You just put this code in a module in your workbook, and then you can call it from the worksheet by typing into a cell: =GetRegexDigits(TARGETCELL, "(\d{5})")

Book1
CD
900815 Land Surveyors;MBE / WBE / DBE; Subcontractor 00815
1013000 Special Construction; Subcontractor 13000
1102525 Curbing;02900 Landscaping;03300 Concrete;16100 Eleectrical;MBE / WBE / DBE;Subcontractor 02525 | 02900 | 03300 | 16100
1208100 Hollow Metal;08200 Wood Doors;08700 Hardware;Supplier08100 | 08200 | 08700
Sheet1
Cell Formulas
RangeFormula
D9:D12D9=GetRegexDigits(C9,"(\d{5})")


VBA Code:
Function GetRegexDigits(Str As String, reg As String) As String
    Dim tmpRegex As Object
    On Error Resume Next
    Set tmpRegex = CreateObject("VBScript.RegExp")
    tmpRegex.Pattern = reg
    tmpRegex.Global = True
    tmpRegex.IgnoreCase = True
    If index < 0 Then index = 0
    If tmpRegex.Test(Str) Then
        Set matches = tmpRegex.Execute(Str)
        For Each Match In matches
            tmpresult = tmpresult & Match & " | "
        Next
        tmpresult = Left(tmpresult, Len(tmpresult) - 2)
        GetRegexDigits = tmpresult
        Exit Function
    End If
    GetRegexDigits = ""
End Function
 
Upvote 0
For those who might be interested, here is a non-RegExp UDF (user defined function) that can be used...
VBA Code:
Function GetDigitNums(ByVal S As String) As String
  Dim X As Long, Arr As Variant
  For X = 1 To Len(S)
    If Mid(S, X, 1) Like "[!0-9]" Then Mid(S, X) = " "
  Next
  Arr = Split(Application.Trim(S))
  For X = 0 To UBound(Arr)
    If Len(Arr(X)) <> 5 Then Arr(X) = ""
  Next
  GetDigitNums = Replace(Application.Trim(Join(Arr)), " ", " | ")
End Function
This UDF takes only one argument... the text to be processed (which can be a quoted text string or a cell reference that contains the text); so, to call it, you would use something like this...

=GetDigitNums(A1)
 
Last edited:
Upvote 0
Beautiful guys, both of these work perfectly.
I can see these being very useful.
 
Upvote 0
This is amazing!! Thanks so much for your help. How do I buy you a beer?

Doug
 
Upvote 0
Sorry to be a pain... I've noticed an oversight on my end. :(:cry:

The end result needs to be the 5 digit number with a dash after the first two digits.

My original example:

00815 needs to be 00-815
13000 needs to be 13-000
02525 | 02900 | 03300 | 16100 needs to be 02-525 | 02-900 | 03-300 | 16-100

I've tried to format your end result to add the dashes:

=LEFT(AC10,2)&"-"&MID(AC10,3,8)&"-"&MID(AC10,11,8)&"-"&MID(AC10,19,8)&"-"&MID(AC10,27,8)

But, because some companies have 1 cost code, and another might have 5 cost codes... I will get extra dashes at the end:

02-525 | 02-900 | 03-300 | 16-100
09-310 | 09-650 | 09-680 --
02-600 | 02-200 ---
16-100 ----

Sorry to ask again! But all help is greatly appreciated.

Thanks,
Doug
 
Upvote 0
You are in luck... I do not drink beer (or any alcoholic beverages), so you do not need to wonder about that.

As for your oversight, this slight modification (shown in red) to the code I posted earlier will handle it...
Rich (BB code):
Function GetDigitNums(ByVal S As String) As String
  Dim X As Long, Arr As Variant
  For X = 1 To Len(S)
    If Mid(S, X, 1) Like "[!0-9]" Then Mid(S, X) = " "
  Next
  Arr = Split(Application.Trim(S))
  For X = 0 To UBound(Arr)
    If Len(Arr(X)) <> 5 Then Arr(X) = "" Else Arr(X) = Format(Arr(X), "00-000")
  Next
  GetDigitNums = Replace(Application.Trim(Join(Arr)), " ", " | ")
End Function
 
Last edited:
Upvote 0
I would jist like to say that I prefer Rick's code to my own, and I think it makes far more sense to use his than a regular expression.

But on the off-chance anyone needs it, I set out below the equivalent modification (with some further amends because I noticed some issues with the code). I should point out that addition of the hyphen was shamelessly plagiarised from Rick's modified code:

VBA Code:
Function GetRegexDigits(strSource As String, strPattern As String) As String
    Dim tmpRegex As Object, Match As Object, Matches As Object, tmpResult As String
    On Error Resume Next
    Set tmpRegex = CreateObject("VBScript.RegExp")
    tmpRegex.Pattern = strPattern
    tmpRegex.Global = True
    tmpRegex.IgnoreCase = True
    If tmpRegex.Test(strSource) Then
        Set Matches = tmpRegex.Execute(strSource)
        For Each Match In Matches
            tmpResult = tmpResult & Format(Match, "00-000") & " | "
        Next
        tmpResult = Left(tmpResult, Len(tmpResult) - 2)
        GetRegexDigits = tmpResult
    Else
        GetRegexDigits = ""
    End If

    Set tmpRegex = Nothing
    Set Match = Nothing
    Set Matches = Nothing
End Function

Thank you, Rick.
 
Upvote 0
would you try these formula

=CHOOSE(1+LEN(A1)-LEN(SUBSTITUTE(A1,"|","")),LEFT(A1,2)&"-"&MID(A1,3,8),LEFT(A1,2)&"-"&MID(A1,3,8)&"-"&MID(A1,11,8),LEFT(A1,2)&"-"&MID(A1,3,8)&"-"&MID(A1,11,8)&"-"&MID(A1,19,8),LEFT(A1,2)&"-"&MID(A1,3,8)&"-"&MID(A1,11,8)&"-"&MID(A1,19,8)&"-"&MID(A1,27,8))
 
Upvote 0

Forum statistics

Threads
1,214,929
Messages
6,122,317
Members
449,081
Latest member
tanurai

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top