Thread: Regular expression pattern

  1. #1
    New Member
    Join Date
    Mar 2004
    Posts
    23

    Regular expression pattern

    Can someone please advise what the correct pattern is to return the text between two search strings?

    For example, in the HTML string below, I want to return the text between "selected>" and the next "<". In this example that is "NA".

    HTML Code:
    "<SELECT class=nfinput name=pace0><OPTION value=L>L</OPTION><OPTION value=P>P</OPTION><OPTION value=OP>OP</OPTION><OPTION value=M>M</OPTION><OPTION value=OM>OM</OPTION><OPTION value=BK>BK</OPTION><OPTION value=NA selected>NA</OPTION><OPTION value=SCR>SCR</OPTION></SELECT>"
    I've tried using lookahead/lookbehind:
    (?<=selected>)(.*?)(?=<)
    but it returns the error "Syntax error in regular expression" (5017).

    I tried it on Tushar Mehta's test website (http://www.tmehta.com/regexp/regexpfind.asp) as well as in VBA / Excel 2002 and got the same error in both.

  2. #2
    Board Regular cgcamal
    Join Date
    May 2007
    Location
    Tegucigalpa, Honduras C.A.
    Posts
    472

    Hi danoneill,

    To isolate the part you want, you can try the following:

    1-) Search for this regex and replace it with nothing
    Code:
    .*selected>

    2-) Search for this regex and replace it with nothing
    Code:
    <.*

    3-) You'll be left with the text you want
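
    As a rough sketch, the two replacements above could be chained in VBA as follows. The function name ExtractByStripping is made up for illustration, and the sketch assumes the HTML sits on a single line, because "." does not match line breaks in VBScript.RegExp.

    ```vba
    ' Hypothetical helper: chains the two search-and-replace steps above.
    Function ExtractByStripping(s As String) As String
        Dim tmp As String
        With CreateObject("VBScript.RegExp")
            ' Step 1: remove everything up to and including "selected>"
            .Pattern = ".*selected>"
            tmp = .Replace(s, "")
            ' Step 2: remove everything from the first remaining "<" onwards
            .Pattern = "<.*"
            ExtractByStripping = .Replace(tmp, "")
        End With
    End Function
    ```

    Note that if "selected>" is not present, step 1 changes nothing and step 2 simply cuts the string at its first "<", so the result is not meaningful in that case.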

    Hope this helps,

    Regards



  3. #3
    MrExcel MVP
    Join Date
    Apr 2006
    Posts
    19,722

    Hi

    Try:

    Code:
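    Function GetSelected(s As String) As String
        ' Late-bound VBScript regex object; no project reference needed
        With CreateObject("VBScript.RegExp")
            ' Match the whole string, capturing the text between "selected>" and the next "<"
            .Pattern = ".*selected>([^<]+)<.*"
            ' Replace the match with the first captured submatch ($1)
            GetSelected = .Replace(s, "$1")
        End With
    End Function
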
    Kind regards
    PGC

    To understand recursion, you must understand recursion.

  4. #4
    New Member
    Join Date
    Mar 2004
    Posts
    23

    Thanks pgc01, that works.

    Would you mind explaining the pattern and the .Replace?

    Dan

  5. #5
    MrExcel MVP
    Join Date
    Apr 2006
    Posts
    19,722

    Hi Dan

    I - the solution posted

    The pattern translates to:

    1 - any number of any characters

    followed by

    2 - selected>

    followed by

    3 - one or more characters other than "<"

    followed by

    4 - the character "<"

    followed by

    5 - any number of any characters

    This pattern matches your entire string, so the match includes every character in it.

    I enclosed part 3 in parentheses. This captures that part into a submatch; in this case, it's the first submatch.

    I then use the .Replace() method of the RegExp object to replace the whole string by the first submatch and that's what the function returns.

    Notice that the pattern assumes that the sequence "selected>" + "some characters" + "<" exists. If that's not the case, the pattern does not match, nothing is replaced, and the function returns the whole string. You may want to tweak the pattern so that an empty string is returned when the sequence does not exist, or else test first whether there's a pattern match and only then use the .Replace().
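
    The "test first" option could look roughly like the sketch below. The name GetSelectedOrEmpty is hypothetical; the pattern is the same one as in the posted solution.

    ```vba
    ' Hypothetical variant of GetSelected: returns "" when the sequence is missing.
    Function GetSelectedOrEmpty(s As String) As String
        With CreateObject("VBScript.RegExp")
            .Pattern = ".*selected>([^<]+)<.*"
            If .Test(s) Then
                ' Match found: replace it with the first submatch
                GetSelectedOrEmpty = .Replace(s, "$1")
            Else
                ' No match: return an empty string instead of the input
                GetSelectedOrEmpty = ""
            End If
        End With
    End Function
    ```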


    II - Another solution

    As usual, other approaches are possible. For example, in this next one I'm focusing only on the part "selected>" + "some characters" + "<". If this part exists, I return the value of the first submatch; otherwise the function returns an empty string.


    Code:
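    Function GetSelected(s As String) As String
        With CreateObject("VBScript.RegExp")
            ' Match only the "selected>" + text + "<" sequence
            .Pattern = "selected>([^<]+)<"
            ' Return the first submatch of the first match; "" if there is no match
            If .Test(s) Then GetSelected = .Execute(s)(0).SubMatches(0)
        End With
    End Function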

    Remark: in your post you say you tried a solution using a lookbehind. Lookbehind is not yet implemented in the RegExp object you use in VBA, which is why your pattern raised a syntax error.
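
    If you want to check whether a given pattern is accepted before relying on it, a small probe along these lines may help. This is only a sketch: PatternIsSupported is a made-up name, and it simply treats any run-time error raised while setting or using the pattern as "not supported".

    ```vba
    ' Hypothetical helper: True if VBScript.RegExp accepts the pattern.
    Function PatternIsSupported(pat As String) As Boolean
        Dim re As Object
        Set re = CreateObject("VBScript.RegExp")
        On Error Resume Next
        Err.Clear
        re.Pattern = pat
        re.Test ""      ' an unsupported pattern raises a run-time error
        PatternIsSupported = (Err.Number = 0)
        On Error GoTo 0
    End Function
    ```

    Given the error reported in the first post, PatternIsSupported("(?<=selected>)(.*?)(?=<)") should return False, while the capture-group pattern "selected>([^<]+)<" should return True.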

    HTH
    Kind regards
    PGC

  6. #6
    New Member
    Join Date
    Mar 2004
    Posts
    23

    Thanks again. I appreciate the time you've taken to explain it all.

  7. #7
    MrExcel MVP
    Join Date
    Apr 2006
    Posts
    19,722

    I'm glad it helped. Cheers!
    Kind regards
    PGC
