Extract Words using Regex

Ombir

Active Member
Joined
Oct 1, 2015
Messages
433
Hi,

I've a string like "abc.txt def.txt lmn.png jkl.jpeg pvr.txt".

This pattern "\w*(?=\.txt)" extracts only the file names of the text files, i.e. abc, def, pvr.

Now I want the opposite of this. I want to extract the file names that are not text files, i.e. lmn and jkl.

Please suggest a pattern to match this. If possible, I would also like to extract the extensions along with the file names, like lmn.png and jkl.jpeg, for another scenario. Is this possible with a negative lookahead?
 

pgc01

MrExcel MVP
Joined
Apr 25, 2006
Messages
19,851
Hi

Not sure I understood correctly, but try:

Code:
Sub test()
Dim regex As Object, regexMatches As Object
Dim s As String, j As Long

s = "abc.txt def.txt lmn.png jkl.jpeg pvr.txt" ' test string

Set regex = CreateObject("VBScript.RegExp")
With regex
    .Pattern = "\w+(?!\.txt)\.\w{3,4}"
    .Global = True
    Set regexMatches = .Execute(s)
End With

' display matches
For j = 0 To regexMatches.Count - 1
    MsgBox regexMatches(j)
Next j

End Sub

Remark: as you see in the pattern I assumed an extension with 3 or 4 characters.
 

Ombir

Thanks pgc. You understood it correctly. It's working fine with extensions. What would the pattern be if I don't want the extensions in the final result, i.e. only lmn and jkl?
 

Ombir

I have now tried \w+(?!\.txt)(?=\.\w{3,4}) and it returns the file names without extensions. Do you have a better approach, or is it good to go?
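For the other scenario mentioned in the first post (name and extension together), one possible approach is to keep the pattern with the extension and wrap the two parts in capture groups, so that a single pass gives the full match, the name, and the extension. This is only a sketch: the Sub name testGroups is made up, and it assumes the same late-bound VBScript.RegExp object and test string used elsewhere in this thread.

```vba
Sub testGroups()
    Dim regex As Object, regexMatches As Object
    Dim s As String, j As Long

    s = "abc.txt def.txt lmn.png jkl.jpeg pvr.txt" ' test string from the thread

    Set regex = CreateObject("VBScript.RegExp")
    With regex
        ' group 1 = file name, group 2 = extension (assumed 3 or 4 characters)
        .Pattern = "(\w+)(?!\.txt)\.(\w{3,4})"
        .Global = True
        Set regexMatches = .Execute(s)
    End With

    ' For each match: full match, name only, extension only
    For j = 0 To regexMatches.Count - 1
        With regexMatches(j)
            Debug.Print .Value, .SubMatches(0), .SubMatches(1)
        End With
    Next j
End Sub
```

With the test string above, the Immediate window should show two lines: lmn.png with lmn and png, then jkl.jpeg with jkl and jpeg.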
 

pgc01

Yes, that's how I'd do it.

Remark: I should have ended the expression with (?= |$), meaning that after the 3 or 4 characters there is a space or the end of the string.

Code:
     .Pattern = "\w+(?!\.txt)\.\w{3,4}(?= |$)"

I wanted to specify an extension with 3 or 4 characters.
The pattern I posted before accepts things like abc.defgXXX, with more than 4 characters in the extension (1st error: it treats the extension as valid), and returns only the first 4 of them (2nd error: it looks as if everything was OK).
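A small sketch that shows the two behaviours side by side. The helper FirstMatch and the Sub CompareBoundary are hypothetical names introduced only for this illustration. The test string abc.defgXXX is the example from the remark above.

```vba
' Hypothetical helper: returns the first match of pat in s, or "" if there is none.
Function FirstMatch(ByVal pat As String, ByVal s As String) As String
    Dim regex As Object, m As Object
    Set regex = CreateObject("VBScript.RegExp")
    regex.Pattern = pat
    regex.Global = False
    Set m = regex.Execute(s)
    If m.Count > 0 Then FirstMatch = m(0).Value
End Function

Sub CompareBoundary()
    Dim s As String
    s = "abc.defgXXX" ' extension longer than 4 characters

    ' Without the trailing lookahead the match is cut off after 4 characters
    Debug.Print "[" & FirstMatch("\w+(?!\.txt)\.\w{3,4}", s) & "]"          ' [abc.defg]

    ' With (?= |$) the over-long extension is rejected, so nothing is returned
    Debug.Print "[" & FirstMatch("\w+(?!\.txt)\.\w{3,4}(?= |$)", s) & "]"   ' []
End Sub
```

The square brackets are only there to make the empty result visible in the Immediate window.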
 

pgc01

You're welcome. Thanks for the feedback.
 

Ombir

Hi Pgc,

I need your help again. JavaScript doesn't support negative lookbehind, therefore I'm using the regexr.com website to explain the problem. I'm using the PCRE engine on this website as it supports lookbehind.

String: Mr Ombir,Mr Ajay,Miss Sakshi,Miss Suhani

I'm looking for a pattern that uses negative lookbehind to extract the full names whose title is not Miss. So the pattern should match Mr Ombir and Mr Ajay.

If I replace Miss with Mr in the pattern, it should match the opposite, i.e. Miss Sakshi and Miss Suhani.

I've come up with this pattern, but it's not picking up the title.

(?<!Miss)\s[\s\w]+

Please help?
 

Ombir

I was just able to solve it using \w+(?<!Miss)\s\w+ but I don't understand how it works. Is it correct, or can it be improved?
 

pgc01

Hi

As you know, VBA does not support lookbehinds either.

In VBA, I'd use:

Code:
Sub test()
Dim regex As Object, regexMatches As Object
Dim s As String, j As Long

s = "Mr Ombir,Mr Ajay,Miss Sakshi,Miss Suhani" ' test string

Set regex = CreateObject("VBScript.RegExp")
With regex
    .Pattern = "(?:^|,)(?!Miss)([^,]+)"
    .Global = True
    Set regexMatches = .Execute(s)
End With

' display matches
For j = 0 To regexMatches.Count - 1
    MsgBox regexMatches(j).submatches(0)
Next j

End Sub
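For the opposite case asked about above (excluding Mr instead of Miss), the same lookahead idea can be parameterised. This is only a sketch: ListNamesWithoutTitle and testTitles are hypothetical names. It assumes the title contains no regex metacharacters. The \b after the title is an addition of mine, so that excluding "Mr" would not also exclude a longer title such as "Mrs".

```vba
' Hypothetical helper: prints the comma-separated names in s that do NOT start with title.
Sub ListNamesWithoutTitle(ByVal s As String, ByVal title As String)
    Dim regex As Object, regexMatches As Object, j As Long

    Set regex = CreateObject("VBScript.RegExp")
    With regex
        ' Start of string or a comma, not followed by the title as a whole word,
        ' then capture everything up to the next comma
        .Pattern = "(?:^|,)(?!" & title & "\b)([^,]+)"
        .Global = True
        Set regexMatches = .Execute(s)
    End With

    For j = 0 To regexMatches.Count - 1
        Debug.Print regexMatches(j).SubMatches(0)
    Next j
End Sub

Sub testTitles()
    Const s As String = "Mr Ombir,Mr Ajay,Miss Sakshi,Miss Suhani"
    ListNamesWithoutTitle s, "Miss"   ' prints Mr Ombir, then Mr Ajay
    ListNamesWithoutTitle s, "Mr"     ' prints Miss Sakshi, then Miss Suhani
End Sub
```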
 
