Extracting Data According to Format From Text Field

Shuggoth

New Member
Joined
Mar 16, 2020
Messages
1
Office Version
  1. 2010
Platform
  1. Windows
Hi there,

I'm having trouble even finding a starting point with this. I've tried all manner of searches online but draw a blank - searches for extracting text based on format all bring back conditional formatting items etc. I suspect it may well need a VBA solution rather than a clever formula; if so, that's fine and I have some experience with VBA but I'm not sure how to even go about tackling the problem so would welcome any thoughts / guidance / code etc.

I have a spreadsheet with circa 40,000 rows of data and in column O for each row there is a text field that has 255 characters in the form of a manually written 'story' of sorts. As a result, there is no consistent format/layout for this text, other than it is all letters/numbers/punctuation so the data I am looking to extract will be in different positions within the cell every time.

Each cell will contain, at some point within the text, either a 14 digit number separated by a space either side of it (e.g. x 12345678901234 x), or a 6 digit number with a space followed by an 8 digit number (e.g. x 123456 12345678 x). There will be some cells where this data isn't present, some where it is only present once and some where it could be present multiple times. What I need to do is to somehow identify every instance of this pattern within every cell in column O and extract these numbers into another cell so that they can be used in other searches/filters/interrogation etc. Another complication is that there will be other numbers within the text format (e.g. dates amongst others) so I can't just pull out every number, it has to be ones that match this specific format.

Any help that anyone can offer would be appreciated - if you have any questions please let me know and I'll try to fill in as much as I can!


Regards,

Shug
 

Some videos you may like

Excel Facts

What is the shortcut key for Format Selection?
Ctrl+1 (the number one) will open the Format dialog for whatever is selected.

GWteB

Well-known Member
Joined
Sep 10, 2010
Messages
1,295
Office Version
  1. 2013
Platform
  1. Windows
Hi Shuggoth, welcome to the board. You're probably looking for something like this. This code searches in column "O" of the active worksheet for the two given patterns and puts the retrieved numbers (duplicates are omitted) in column "A" of a new added sheet.
VBA Code:
Public Sub GetNumbersByPattern()

    Dim NrList      As Object
    Dim arrTxt      As Variant
    Dim i           As Long
    Dim raSource    As Range
    Dim c           As Range
    Dim oWs         As Worksheet

    Set oWs = ThisWorkbook.ActiveSheet
    Set raSource = oWs.Range("O1:O" & oWs.Cells(oWs.Rows.Count, "O").End(xlUp).Row)

    Set NrList = CreateObject("System.Collections.ArrayList")

    For Each c In raSource
        If Len(c.Value) > 13 Then
            arrTxt = Split(c.Value, " ")
            For i = 0 To UBound(arrTxt) - 1
                If arrTxt(i) Like "##############" Then
                    If Not NrList.Contains(CStr(arrTxt(i))) Then NrList.Add CStr(arrTxt(i))
                ElseIf arrTxt(i) Like "######" Then
                    If arrTxt(i + 1) Like "########" Then
                        If Not NrList.Contains(CStr(arrTxt(i)) & " " & CStr(arrTxt(i + 1))) Then _
                                    NrList.Add CStr(arrTxt(i)) & " " & CStr(arrTxt(i + 1))
                    End If
                End If
            Next i
            If arrTxt(i) Like "##############" Then
                If Not NrList.Contains(CStr(arrTxt(i))) Then NrList.Add CStr(arrTxt(i))
            End If
        End If
    Next c

    Set oWs = ThisWorkbook.Sheets.Add
    oWs.Range("A1").Resize(NrList.count, 1).Value = WorksheetFunction.Transpose(NrList.ToArray)
    oWs.Columns("A:A").NumberFormat = "0"
    oWs.Columns("A:A").Columns.AutoFit
    NrList.Clear
    Set NrList = Nothing
    Set raSource = Nothing
    Set oWs = Nothing
End Sub
 

Watch MrExcel Video

Forum statistics

Threads
1,127,858
Messages
5,627,286
Members
416,236
Latest member
Lynchbox

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Top