Locate criteria in a string, pull data +/-5 characters to get price

HockeyDiablo

I have extracted a rather large XML file. Inside the file was an old method of how I stored my invoices. Here is the string of text that I have:

Data
sp 199 blah blah blah
blah blah 60 cbl blah
Door 999 blah blah blah
blah 299 sp door 888 blah blah

Here is what I am looking to get done...

A      B      C
SP     CBL    Door
199
       60
              999
299           888

 

Is the data all in one cell, or in four cells?


I'm sorry, the data will all be in one cell. I don't need to pull the headers, as I will set them manually, but I do need the pricing information pulled based on the header. This is an XML file from a very old program that I am trying to convert into invoices to update a brand new CRM. Our old method was just keying in a code ("sp", "cbl", "door", ...) and then placing the price before or after it. I am trying to automate as many of the 97k rows as I can!

Thank you
 
Hello HockeyDiablo,

The code below should be close to what you need. However, I changed your data a bit: each item ("sp", "Cbl", "Door") is now followed by its number. Any data not matching this layout will not be parsed.

The data is in cell "A1" on "Sheet1". The output goes to "Sheet2" starting in Row 2 Column "A".

Link to download the workbook...
Pull Data ver 1.xlsm

Macro Code
Code:
Global Keys     As Variant
Global Headers  As Collection
Global RegExp   As Object


Sub ParseInvoice(ByVal Text As String)

    ' RegExp is deliberately not declared here: a local Dim would shadow the
    ' Global above, and the pattern would be rebuilt on every call.
    Dim DataRow As Range
    Dim j       As Long
    Dim Key     As Variant
    Dim LastRow As Long
    Dim Match   As Object
    Dim Matches As Object
    Dim n       As Long
    Dim Wks     As Worksheet
    
        Set Wks = ThisWorkbook.Worksheets("Sheet2")
        
        ' Output starts at A2, or on the first row below any existing output.
        Set DataRow = Wks.Range("A2")
        LastRow = Wks.UsedRange.Cells.Find("*", , xlValues, xlWhole, xlByRows, xlPrevious, False, False, False).Row
        
        If LastRow >= DataRow.Row Then
            Set DataRow = Wks.Cells(LastRow + 1, DataRow.Column)
        End If
        
        ' First call only: map each header in row 1 to its column offset and
        ' join the headers into a regex alternation such as "SP|CBL|Door".
        If Headers Is Nothing Then
            Set Headers = New Collection
            For Each Key In Wks.Range(Wks.Cells(1, "A"), Wks.Cells(1, Wks.Columns.Count).End(xlToLeft)).Value
                Headers.Add n, Key
                If n > 0 Then Keys = Keys & "|" & Key Else Keys = Key
                n = n + 1
            Next Key
        End If
        
        ' First call only: match a header word, whitespace, then a whole number.
        If RegExp Is Nothing Then
            Set RegExp = CreateObject("VBScript.RegExp")
                RegExp.Global = True
                RegExp.IgnoreCase = True
                RegExp.Pattern = "\b(" & Keys & ")\s+(\d+)\b"
        End If
        
        Set Matches = RegExp.Execute(Text)
        
        ' Write each number under the column of the header it was found with.
        For Each Match In Matches
            For j = 0 To Match.SubMatches.Count - 1 Step 2
                DataRow.Offset(0, Headers(Match.SubMatches(j))).Value = Match.SubMatches(j + 1)
            Next j
        Next Match
        
End Sub


Sub Run()


    Dim Line As Variant
    Dim Text As String
    
        Text = ThisWorkbook.Worksheets("Sheet1").Range("A1")
    
        ThisWorkbook.Worksheets("Sheet2").UsedRange.Offset(1, 0).ClearContents
    
        For Each Line In Split(Text, vbLf)
            Call ParseInvoice(Line)
        Next Line
    
End Sub
 

Amazing work. I was able to get your program to run and it did exactly what I had asked, but unfortunately it doesn't work with my data. Would there be a way to keep a list of items on one sheet for the program to look for? That way it would still work when the wording for an item varies in the data, for example spring written as "sp", "springs", "spring" or "spr", depending on what the technician wrote.

Sample Data:
349 for belt with remote 59. so 408 but $326.00 total with 20% off coupon



So for this one it would be 349 belt; 59 remote; 326 total; 20% coupon
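
A lookup list like the one asked about could be sketched roughly as follows. Everything here is an assumption rather than something taken from the workbook: a sheet named "Items" (hypothetical) with the technician's spelling in column A and the invoice header it belongs to in column B, starting in row 2, and a made-up function name `LoadAliases`.

```vba
' Sketch only. Assumed layout: sheet "Items" (hypothetical name),
' column A = spelling used in the notes (sp, spr, spring, springs),
' column B = invoice header that spelling belongs to (e.g. SP).
Function LoadAliases(ByRef Pattern As String) As Object

    Dim Cell As Range
    Dim Dict As Object
    Dim Wks  As Worksheet

    Set Dict = CreateObject("Scripting.Dictionary")
    Dict.CompareMode = vbTextCompare        ' treat "SP" and "sp" as the same key
    Set Wks = ThisWorkbook.Worksheets("Items")

    ' Read every alias from A2 down to the last used cell in column A.
    For Each Cell In Wks.Range("A2", Wks.Cells(Wks.Rows.Count, "A").End(xlUp)).Cells
        If Len(Cell.Value) > 0 Then
            If Not Dict.Exists(CStr(Cell.Value)) Then
                Dict.Add CStr(Cell.Value), CStr(Cell.Offset(0, 1).Value)
            End If
        End If
    Next Cell

    ' Alternation for a regex, e.g. "sp|spr|spring|springs".
    Pattern = Join(Dict.Keys, "|")
    Set LoadAliases = Dict

End Function
```

The returned pattern could stand in for Keys when the macro builds RegExp.Pattern, and the dictionary would then translate whatever spelling was matched into the header to write under. Aliases containing regex special characters (such as "." or "$") would need escaping first, which this sketch does not do.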
 
Hello HockeyDiablo,

The more data you can provide me with the easier it will be to create a working solution. Can you post a copy of your workbook on a file sharing service? If you have sensitive information you do not want made public then perhaps you can email a copy.

In your examples the item and number appear in either order: item before number, or number before item. This will slow the parsing down quite a bit, because two checks will need to be made per item. Adding multiple entries for an item will slow it down even more. Since you have a large amount of data, I will make this process as quick as possible.
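
To illustrate the two checks, here is a minimal sketch of a single pattern that tries "item then number" first and "number then item" second. The function name `FindItemPrices` and the "item=number" output format are made up for illustration, and the sketch assumes the item and its number sit directly next to each other, separated only by whitespace.

```vba
' Sketch only: returns a Collection of "item=number" strings for one line.
' Keys is a regex alternation such as "sp|cbl|door".
Function FindItemPrices(ByVal Text As String, ByVal Keys As String) As Collection

    Dim Found As New Collection
    Dim M     As Object
    Dim Re    As Object

    Set Re = CreateObject("VBScript.RegExp")
    Re.Global = True
    Re.IgnoreCase = True
    ' Groups 1-2: item then number. Groups 3-4: number then item.
    Re.Pattern = "\b(?:(" & Keys & ")\s+(\d+)|(\d+)\s+(" & Keys & "))\b"

    For Each M In Re.Execute(Text)
        If Len(M.SubMatches(0)) > 0 Then
            ' First alternative matched: item is group 1, number is group 2.
            Found.Add LCase$(M.SubMatches(0)) & "=" & M.SubMatches(1)
        Else
            ' Second alternative matched: number is group 3, item is group 4.
            Found.Add LCase$(M.SubMatches(3)) & "=" & M.SubMatches(2)
        End If
    Next M

    Set FindItemPrices = Found

End Function
```

Called as FindItemPrices("blah 299 sp door 888 blah blah", "sp|cbl|door"), this should pick up 299 for sp and 888 for door. It would not catch a note like "349 for belt", where other words sit between the number and the item.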
 
I could provide data but then that would only work for the data I chose. I am going to have to find a way to create variables on the fly to adjust this XML file and place them back into invoices.
 
Hello HockeyDiablo,

Can you post a copy of the XML code?
 



There is nothing too detailed in the XML file, though: it's basically the example above with a carriage return between the lines. I replaced the "%0D%0A" with a unique symbol, then ran Text to Columns delimited on that symbol. I managed to remove name, address, city, state, zip, email, etc. Now all that remains are the notes that I am trying to turn into an invoice of sorts.

name= "349 for belt with remote 59.%0D%0A So 408 but $326.00 total%0D%0A with 20% off coupon"
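
The replace-and-split step described above could also be done in VBA instead of Text to Columns. A minimal sketch, assuming the raw attribute value is already in a string; the function name `SplitNote` is hypothetical.

```vba
' Sketch only: turns one raw note into an array of trimmed lines.
Function SplitNote(ByVal Raw As String) As Variant

    Dim i     As Long
    Dim Parts As Variant

    ' "%0D%0A" is the URL-encoded carriage return + line feed pair.
    Parts = Split(Replace(Raw, "%0D%0A", vbLf), vbLf)

    ' Remove the leading space left behind after each line break.
    For i = LBound(Parts) To UBound(Parts)
        Parts(i) = Trim$(Parts(i))
    Next i

    SplitNote = Parts

End Function
```

For the sample above this gives three lines: "349 for belt with remote 59.", "So 408 but $326.00 total" and "with 20% off coupon".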


Appreciate all the assistance.
 