Extracting alphanumeric pattern "???###" from variable position in string

lmwilson

New Member
Joined
Mar 22, 2014
Messages
1
Hello all,

I'm running Windows 7, and Excel 2010.

I have a question very similar to the one in the thread entitled "Extracting data based on an alphanumeric pattern" which is here: http://www.mrexcel.com/forum/excel-questions/712101-extracting-data-based-alphanumeric-pattern.html

My cells in column A contain variable length strings of data including letters, characters, and numbers, and the pattern I am looking for, ???###, will occur once, in any position, within the string in the cell. I would like to extract the pattern only and have it appear in a cell in column B. I can't figure out how to modify the answer from the above thread to match my pattern.

Example of data in column A:

(A1) Supervisor: Dr. John Smith, record date opened July 5/2011, nsid ajs948, code CAF, DEF.
(A2) Jul. 9, 2010, has735, biomedical grad, key AAC

Desired result in column B:
(B1) ajs948
(B2) has735

Thank you for any help you can provide!
 

Excel Facts

Convert text numbers to real numbers
Select a column containing text numbers. Press Alt+D E F to quickly convert text to numbers. Faster than "Convert to Number"
Hello mwilson,

This macro will extract the substring matching the pattern you described. Case is ignored.
Code:
Sub ParseSubstring()

    Dim Cell    As Range
    Dim LastRow As Long
    Dim Matches As Object
    Dim RegExp  As Object
    Dim Rng     As Range
    
        Set Rng = Range("A1")
        LastRow = Cells(Rows.Count, Rng.Column).End(xlUp).Row
        If LastRow < Rng.Row Then Exit Sub
        
        Set Rng = Rng.Resize(LastRow - Rng.Row + 1, 1)
        
        Set RegExp = CreateObject("VBScript.RegExp")
        RegExp.IgnoreCase = True
        RegExp.Pattern = "\b[a-z]{3}[0-9]{3}\b"
        
        For Each Cell In Rng
            Set Matches = RegExp.Execute(Cell)
            If Matches.Count > 0 Then Cell.Offset(0, 1) = Matches(0)
        Next Cell
        
End Sub
 
Upvote 0
you could also use it as UDF so in B1:
Code:
=Extract3L3N(A1)

and in VBA editor (Alt+F11) insert module and the function:

Code:
Function Extract3L3N(c As String) As String
Dim colMatches As Object
    With CreateObject("VBScript.RegExp")
        .Global = False 'only first object - was True
        .Pattern = "[a-z]{3}[\d]{3}"
        If .Test(c) Then
           Set colMatches = .Execute(c)
           Extract3L3N = colMatches(0).Value
           Set colMatches = Nothing
        End If
    End With
End Function

If you do not want allow finding aaa121 in aaaa121 add \b at front and end of pattern so:
Code:
        .Pattern = "\b[a-z]{3}[\d]{3}\b"

It may be necessary to add reference to Microsoft VBScript Regular Expressions in VBA editor Tools->References
 
Upvote 0
Hello all,

I'm running Windows 7, and Excel 2010.

I have a question very similar to the one in the thread entitled "Extracting data based on an alphanumeric pattern" which is here: http://www.mrexcel.com/forum/excel-questions/712101-extracting-data-based-alphanumeric-pattern.html

My cells in column A contain variable length strings of data including letters, characters, and numbers, and the pattern I am looking for, ???###, will occur once, in any position, within the string in the cell. I would like to extract the pattern only and have it appear in a cell in column B. I can't figure out how to modify the answer from the above thread to match my pattern.

Example of data in column A:

(A1) Supervisor: Dr. John Smith, record date opened July 5/2011, nsid ajs948, code CAF, DEF.
(A2) Jul. 9, 2010, has735, biomedical grad, key AAC

Desired result in column B:
(B1) ajs948
(B2) has735
For us to know if we are giving you a correct solution, you should clarify some things for us...

1) The pattern you say you are looking for is ???###, but that would work for any three characters (letters, numbers or punctuation marks) followed by three digits, but your two examples show the question marks as three letters... is that what you meant?

2) If so, can those letters be either upper or lower case or only lower case as your two examples show?

3) Will the pattern you are looking for, when not at beginning or end of the text, always be surrounded by spaces or punctuation marks (like commas, parentheses, periods, etc) or could it appear inside other text? For example, if you had this text, would the red highlighted characters be considered your pattern...

"Some text plus nsidajs94873 followed by more text"

4) Will your text always have a pattern in it or could you have text that contained no pattern at all?
 
Upvote 0

Forum statistics

Threads
1,214,384
Messages
6,119,201
Members
448,874
Latest member
Lancelots

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top