One common pattern to extract Numerical Values using Regular Expressions

NimishK

Well-known Member
Joined
Sep 4, 2015
Messages
684
Hello

I want to use correct Syntax for Extracting Numerical Values using Regular Expressions

The below are the different Patterns of Numerical Values from different Text files

NOTE there are other Numerical Contents eg date, Time, Prices which are also in the Content of Text file. For time being
I would not like to extract those date, Time, Price Values Etc.

I would only like single pattern defined to extract below Data as shown NOTE there is Space before the numerical digit begin
1009 1020 8988 909
1961 0050 1902
1212 2161 7823 2004
1005 6033 7102 35
3 9911 9797 3406

VBA Code:
Dim numStr As Str
Dim rgx As Object
Set rgx = New RegExp

rgx.Pattern ="\s\d{4}\s\d{4}\s\d{4}\s\d{3}?"
rgx.Pattern ="\s\d{1,}\s\d{1,}\s\d{1,}\s\d{3}?"
The above pattern only applicable for 1009 1020 8988 909 other Patterns not able to extract

I've used following Link to To Test Regular Expressions in order to obtain the pattern but somehow not successful

Will appreciate your valuable inputs

Thanks
NimishK
 

Excel Facts

Can you AutoAverage in Excel?
There is a drop-down next to the AutoSum symbol. Open the drop-down to choose AVERAGE, COUNT, MAX, or MIN
Try this (add a reference to Microsoft VBScript Regular Expressions 5.5):
VBA Code:
Public Sub Test()

    Dim RE As RegExp, Matches As MatchCollection
    Dim i As Long
   
    Set RE = New RegExp
   
    With RE
        .Global = True
        .Pattern = "(\d*\d)+"
        Set Matches = .Execute(" 1009 1020 8988 909")
        Debug.Print "Matches Count = " & Matches.Count
        For i = 0 To Matches.Count - 1
            Debug.Print ">" & Matches.Item(i) & "<"
        Next
    End With
   
End Sub
 
Upvote 0
John_W Sir,

Thank you so much for the input you have incorporated. But I am afraid to mention that there is misunderstanding.

Sir, I require a RegEx Pattern where it EXTRACTS / or displays the Numerical Values Only for the above 5 different cases mentioned in #1
NOTE there are other Numerical Contents eg date, Time, Prices which are also in the Content of Text file. For time being
I would not like to extract those date, Time, Price Values Etc.

with the Pattern (.Pattern = "(\d*\d)+")you have provided I am able to see following contents being Hi-Lited
ie 1009 1020 8988 909 and addtional digits below in text file (As per Post #1 of this thread not required)
77,350.00
Monday,11 Sep 2023 , 15:46:07

so basically it selects all the digits rather than specific pattern

Regards
NimishK
 
Upvote 0
So you want to match blocks of 3 or 4 numerical values on a line?

Try the pattern "(?:\s\d+){3,4}", or for extra precision to also match the start and end of the line, "^(?:\s\d+){3,4}$".

In this code's test data only the first five (per your example) text strings match the pattern. The text string "99 99 99 99" doesn't match because it doesn't start with a space. The text string " 1 2 3 4 5" doesn't match because it contains 5 numerical values.

VBA Code:
Public Sub Test()

    Dim RE As RegExp, Matches As MatchCollection
    Dim text As Variant
    Dim i As Long
 
    Set RE = New RegExp
 
    With RE
        .Global = True
        '.Pattern = "(\s\d*\d)+"
        .Pattern = "^(?:\s\d+){3,4}$"
        For Each text In Array(" 1009 1020 8988 909", " 1961 0050 1902", " 1212 2161 7823 2004", " 1005 6033 7102 35", " 3 9911 9797 3406", "$77,350.00", "Monday,11 Sep 2023 , 15:46:07", "1 2 3 $4", "99 99 99 99", " 1 2 3 4 5")
            Set Matches = .Execute(text)
            Debug.Print "Text = " & text
            Debug.Print "Matches Count = " & Matches.Count
            For i = 0 To Matches.Count - 1
                Debug.Print ">" & Matches.Item(i) & "<"
            Next
        Next
    End With
 
End Sub
 
Last edited:
Upvote 0
John_w Sir,

Thank you so much for your efforts for the perfect Regex Match. Indeed it worked Out for this case

In this code's test data only the first five (per your example) text strings match the pattern. The text string "99 99 99 99" doesn't match because it doesn't start with a space. The text string " 1 2 3 4 5" doesn't match because it contains 5 numerical values.
Yes in this case the Numerical Match starts with space. I don't know about future data whether it will have Space in beginning or Not.

Any general Pattern with Space or No space in the beginning. Bothering you quite a bit.

Thanks and Regards
NimishK
 
Upvote 0

Forum statistics

Threads
1,215,069
Messages
6,122,956
Members
449,096
Latest member
Anshu121

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top