Excel VBA Parse Column A into multiple columns help but with conditions

BalloutMoe

New Member
Joined
Jun 4, 2021
Messages
41
Office Version
  1. 365
Platform
  1. Windows
Hello, I have a TXT file that gets imported in excel and it always ends up in column A. I am trying to split the data up however. The middle columns with text usually vary. For example

1.000 1.000 EA MWPTS16 MIGHTY 16in TecSelect WIPER BLADE EA 1.9900 1.99
1.000 1.000 EA MWPTS18 MIGHTY 18in TecSelect WIPER BLADE EA 1.9900 1.99
1.000 1.000 EA MWPTS21 MIGHTY 21in TecSelect WIPER BLADE EA 1.9900 1.99
1.000 1.000 EA MLE 9012LL MIGHTY HIR2/12V LONG LIFE HALOGEN 16.9800 16.98
EA
1.000 1.000 EA MLE H11-55W MIGHTY HALOGEN HEADLAMP EA 7.9500 7.95
4.000 4.000 EA MSL 80-39 MIGHTY DRAIN PLUG EA 4.9500 19.80
1.000 1.000 EA MFPTC2064 MIGHTY TECSELECT CABIN AIR FILTER 5.4900 5.49
EA

I would like to split like this: 1.000 | 1.000 EA | rest of the text here | Then 1.9900 | 1.99. Ending up with 5 columns. However the length of the text varies every time. So if I split them normally some number won't be inline with each other. Can this be done in any specific way. Or any way to guide me get started.
 

Excel Facts

Enter current date or time
Ctrl+: enters current time. Ctrl+; enters current date. Use Ctrl+: Ctrl+; Enter for current date & time.

Marc L

Well-known Member
Joined
Apr 5, 2021
Messages
927
Office Version
  1. 2010
Platform
  1. Windows
So weird to zip the pdf file rather than the text file and as I can do nothin' with a pdf file …​
 

Peter_SSs

MrExcel MVP, Moderator
Joined
May 28, 2005
Messages
49,095
Office Version
  1. 365
Platform
  1. Windows
I was able to get that figured out by trial and error
Glad you got something figured out, though we don't know what that final result was. :unsure:

It looks like your requirement changed a little as originally there was to be 5 result columns and now 6. Also, originally you showed "EA" in the second column but now it looks like you don't want that at all.

I suspect your files where this is to be done will not be very large so speed might not be an issue. However, here are a couple of alternatives to consider in case.

A.
Using the original approach but with the changed conditions, makes the string manipulation a fair bit simpler. This is what I can up with.

VBA Code:
Sub Split6_1()
  Dim a As Variant
  Dim i As Long
 
  a = Range("A2", Range("A" & Rows.Count).End(xlUp)).Value2
  For i = 1 To UBound(a)
    a(i, 1) = StrReverse(Replace(StrReverse(Replace(a(i, 1), " ", ";", 1, 4)), " ", ";", 1, 2))
  Next i
  Application.ScreenUpdating = False
  With Range("B2").Resize(UBound(a))
    .Value = a
    .TextToColumns DataType:=xlDelimited, Semicolon:=True, Comma:=False, Space:=False, Other:=False, _
        FieldInfo:=Array(Array(1, 1), Array(2, 1), Array(3, 9), Array(4, 1), Array(5, 1), Array(6, 1), Array(7, 1))
    .Resize(, 6).Columns.AutoFit
  End With
  Application.ScreenUpdating = True
End Sub

B.
As the number of data rows increases, this becomes significantly faster than the code above.

VBA Code:
Sub Split6_2()
  Dim RX As Object
  Dim a As Variant
  Dim i As Long
 
  Set RX = CreateObject("VBScript.RegExp")
  RX.Pattern = "^([^ ]+)( )([^ ]+)( )([^ ]+ )([^ ]+)( )(.+)( )([^ ]+)( )([^ ]+)$"
  a = Range("A2", Range("A" & Rows.Count).End(xlUp)).Value2
  For i = 1 To UBound(a)
    a(i, 1) = RX.Replace(a(i, 1), "$1;$3;$6;$8;$10;$12")
  Next i
  Application.ScreenUpdating = False
  With Range("B2").Resize(UBound(a))
    .Value = a
    .TextToColumns DataType:=xlDelimited, Semicolon:=True, Comma:=False, Space:=False, Other:=False
    .Resize(, 6).Columns.AutoFit
  End With
  Application.ScreenUpdating = True
End Sub
 

BalloutMoe

New Member
Joined
Jun 4, 2021
Messages
41
Office Version
  1. 365
Platform
  1. Windows

ADVERTISEMENT

Glad you got something figured out, though we don't know what that final result was. :unsure:

It looks like your requirement changed a little as originally there was to be 5 result columns and now 6. Also, originally you showed "EA" in the second column but now it looks like you don't want that at all.

I suspect your files where this is to be done will not be very large so speed might not be an issue. However, here are a couple of alternatives to consider in case.

A.
Using the original approach but with the changed conditions, makes the string manipulation a fair bit simpler. This is what I can up with.

VBA Code:
Sub Split6_1()
  Dim a As Variant
  Dim i As Long
 
  a = Range("A2", Range("A" & Rows.Count).End(xlUp)).Value2
  For i = 1 To UBound(a)
    a(i, 1) = StrReverse(Replace(StrReverse(Replace(a(i, 1), " ", ";", 1, 4)), " ", ";", 1, 2))
  Next i
  Application.ScreenUpdating = False
  With Range("B2").Resize(UBound(a))
    .Value = a
    .TextToColumns DataType:=xlDelimited, Semicolon:=True, Comma:=False, Space:=False, Other:=False, _
        FieldInfo:=Array(Array(1, 1), Array(2, 1), Array(3, 9), Array(4, 1), Array(5, 1), Array(6, 1), Array(7, 1))
    .Resize(, 6).Columns.AutoFit
  End With
  Application.ScreenUpdating = True
End Sub

B.
As the number of data rows increases, this becomes significantly faster than the code above.

VBA Code:
Sub Split6_2()
  Dim RX As Object
  Dim a As Variant
  Dim i As Long
 
  Set RX = CreateObject("VBScript.RegExp")
  RX.Pattern = "^([^ ]+)( )([^ ]+)( )([^ ]+ )([^ ]+)( )(.+)( )([^ ]+)( )([^ ]+)$"
  a = Range("A2", Range("A" & Rows.Count).End(xlUp)).Value2
  For i = 1 To UBound(a)
    a(i, 1) = RX.Replace(a(i, 1), "$1;$3;$6;$8;$10;$12")
  Next i
  Application.ScreenUpdating = False
  With Range("B2").Resize(UBound(a))
    .Value = a
    .TextToColumns DataType:=xlDelimited, Semicolon:=True, Comma:=False, Space:=False, Other:=False
    .Resize(, 6).Columns.AutoFit
  End With
  Application.ScreenUpdating = True
End Sub
The problem is the actual column has all kinds of different data not just the one I posted. For example shop name address and what not. Take a look at the zipped link above from drop box if you do not mind. It has the text file and a sample of what I need to extract from it. I ended up with two modules, one of which is the one your provided and another doing some minor cleaning up here and there to get it to what I need. Below is the code your provided with a bit of modification. I just ran through the strreverse and replace code until I figured out the concept behind it. I am super new to this whole coding thing and do not know all the ins and outs. So I appreciated your code as I learned something new from it. I will take a look at the other ones and see what I come up with. But please do take a look at the zip files. Thank you

VBA Code:
Sub Split5()
  Dim a As Variant
  Dim i As Long
  
  a = Range("A2", Range("A" & Rows.Count).End(xlUp)).Value2
  For i = 1 To UBound(a)
   a(i, 1) = StrReverse(Replace(Replace(StrReverse(Replace(Replace(Replace(a(i, 1), " ", ";", 1, 5), ";", " ", 1, 2), " ", ";", 1, 1)), " ", ";", 3, 2), " ", ";", 1, 1))
  'a(i, 1) = Replace(Replace(Replace(a(i, 1), " ", ";", 1, 3), ";", " ", 1, 2), " ", ";", 1, 1) 'testing what this line of code does and what the array returns
  'a(i, 1) = Replace(Replace(Replace(a(i, 1), " ", ";", 1, 5), ";", " ", 1, 2), " ", ";", 1, 1)' test #2
  Next i
  Application.ScreenUpdating = False
  With Worksheets("Sheet1").Range("A1").Resize(UBound(a))
    .Value = a
    .TextToColumns DataType:=xlDelimited, Semicolon:=True, Comma:=False, Space:=False, Other:=False
    .Resize(, 6).Columns.AutoFit
  End With

  Application.ScreenUpdating = True
  Call Delete_entire_row_if_contains_text
End Sub

I still would like to add more things to it of course. but the whole goal was to extract the pdf into excel. I tried a few things but I could not rearrange the data how I wanted. I still would like to extract the item ID from column D. See attached dropbox link above. Thanks guys
 

Attachments

  • Mighty.jpg
    Mighty.jpg
    207.9 KB · Views: 3

Peter_SSs

MrExcel MVP, Moderator
Joined
May 28, 2005
Messages
49,095
Office Version
  1. 365
Platform
  1. Windows
Not entirely clear what you want to do about the separate invoices, but give this a try. It lets you choose the relevant text file and adds a new worksheet to the active workbook to receive the results.

VBA Code:
Sub GetData()
  Dim RX As Object
  Dim a As Variant
  Dim sFile As String, s As String, InvNo As String
  Dim k As Long
  Dim bStarted As Boolean, bInv As Boolean
  
  Set RX = CreateObject("VBScript.RegExp")
  RX.Pattern = "^([^ ]+)( )([^ ]+)( )([^ ]+ )([^ ]+)( )(.+)( )([^ ]+)( )([^ ]+)$"
  sFile = Application.GetOpenFilename()
  If sFile <> "False" Then
    Open sFile For Input As #1
    ReDim a(1 To Rows.Count, 1 To 1)
    Do Until EOF(1)
        Line Input #1, s
        s = Application.Trim(s)
        Select Case True
          Case s = "INVOICE"
            bInv = True
          Case bInv And IsNumeric(Left(s, 1)) And s <> InvNo
            k = k + 1
            a(k, 1) = "Inv: " & s
            InvNo = s
            bInv = False
          Case LCase(s) Like "carrier:*"
            bStarted = True
            bInv = False
          Case s Like "[*]*"
            bStarted = False
            bInv = False
          Case bStarted And IsNumeric(Left(s, 1))
            k = k + 1
            a(k, 1) = RX.Replace(s, "$1;$3;$6;$8;$10;$12")
            bInv = False
          Case Else
            bInv = False
        End Select
    Loop
    Close #1
    Sheets.Add
    With Range("A2").Resize(k)
      .Value = a
      .TextToColumns DataType:=xlDelimited, Semicolon:=True, Comma:=False, Space:=False, Other:=False
      .Resize(, 6).Rows(0).Value = Array("Ordered", "Shipped", "Item ID", "Item Description", "Unit Price", "Ext price")
      .Resize(, 6).EntireColumn.AutoFit
    End With
  End If
End Sub
 

BalloutMoe

New Member
Joined
Jun 4, 2021
Messages
41
Office Version
  1. 365
Platform
  1. Windows

ADVERTISEMENT

Not entirely clear what you want to do about the separate invoices, but give this a try. It lets you choose the relevant text file and adds a new worksheet to the active workbook to receive the results.

VBA Code:
Sub GetData()
  Dim RX As Object
  Dim a As Variant
  Dim sFile As String, s As String, InvNo As String
  Dim k As Long
  Dim bStarted As Boolean, bInv As Boolean
 
  Set RX = CreateObject("VBScript.RegExp")
  RX.Pattern = "^([^ ]+)( )([^ ]+)( )([^ ]+ )([^ ]+)( )(.+)( )([^ ]+)( )([^ ]+)$"
  sFile = Application.GetOpenFilename()
  If sFile <> "False" Then
    Open sFile For Input As #1
    ReDim a(1 To Rows.Count, 1 To 1)
    Do Until EOF(1)
        Line Input #1, s
        s = Application.Trim(s)
        Select Case True
          Case s = "INVOICE"
            bInv = True
          Case bInv And IsNumeric(Left(s, 1)) And s <> InvNo
            k = k + 1
            a(k, 1) = "Inv: " & s
            InvNo = s
            bInv = False
          Case LCase(s) Like "carrier:*"
            bStarted = True
            bInv = False
          Case s Like "[*]*"
            bStarted = False
            bInv = False
          Case bStarted And IsNumeric(Left(s, 1))
            k = k + 1
            a(k, 1) = RX.Replace(s, "$1;$3;$6;$8;$10;$12")
            bInv = False
          Case Else
            bInv = False
        End Select
    Loop
    Close #1
    Sheets.Add
    With Range("A2").Resize(k)
      .Value = a
      .TextToColumns DataType:=xlDelimited, Semicolon:=True, Comma:=False, Space:=False, Other:=False
      .Resize(, 6).Rows(0).Value = Array("Ordered", "Shipped", "Item ID", "Item Description", "Unit Price", "Ext price")
      .Resize(, 6).EntireColumn.AutoFit
    End With
  End If
End Sub
Thank you very much this code works great and the format is exactly what I need, however it seems like its skipping data. The text file is converted manually at the moment from pdf to accessible that contains a number of pages. I even tried one file that has three pages which is one invoice. It only took the first table data. Any help? Again I am going to be studying your code to help me understand better and learn the logic behind it. Thank you again very much
 

BalloutMoe

New Member
Joined
Jun 4, 2021
Messages
41
Office Version
  1. 365
Platform
  1. Windows
It is taking only the first page of an invoice number
 

Peter_SSs

MrExcel MVP, Moderator
Joined
May 28, 2005
Messages
49,095
Office Version
  1. 365
Platform
  1. Windows
It is taking only the first page of an invoice number
:oops: Try this one.

VBA Code:
Sub GetData_v2()
  Dim RX As Object
  Dim a As Variant
  Dim sFile As String, s As String, InvNo As String
  Dim k As Long
  Dim bInv As Boolean
  
  Set RX = CreateObject("VBScript.RegExp")
  RX.Pattern = "^([^ ]+)( )([^ ]+)( )([^ ]+ )([^ ]+)( )(.+)( )([^ ]+)( )([^ ]+)$"
  sFile = Application.GetOpenFilename()
  If sFile <> "False" Then
    Open sFile For Input As #1
    ReDim a(1 To Rows.Count, 1 To 1)
    Do Until EOF(1)
        Line Input #1, s
        s = Application.Trim(s)
        Select Case True
          Case s = "INVOICE"
            bInv = True
          Case bInv And IsNumeric(Left(s, 1)) And s <> InvNo
            k = k + 1
            a(k, 1) = "Inv: " & s
            InvNo = s
            bInv = False
          Case s Like "#*.* #*.* EA * #*.* #*.*#"
            k = k + 1
            a(k, 1) = RX.Replace(s, "$1;$3;$6;$8;$10;$12")
            bInv = False
          Case Else
            bInv = False
        End Select
    Loop
    Close #1
    Sheets.Add
    With Range("A2").Resize(k)
      .Value = a
      .TextToColumns DataType:=xlDelimited, Semicolon:=True, Comma:=False, Space:=False, Other:=False
      .Resize(, 6).Rows(0).Value = Array("Ordered", "Shipped", "Item ID", "Item Description", "Unit Price", "Ext price")
      .Resize(, 6).EntireColumn.AutoFit
    End With
  End If
End Sub
 

BalloutMoe

New Member
Joined
Jun 4, 2021
Messages
41
Office Version
  1. 365
Platform
  1. Windows
:oops: Try this one.

VBA Code:
Sub GetData_v2()
  Dim RX As Object
  Dim a As Variant
  Dim sFile As String, s As String, InvNo As String
  Dim k As Long
  Dim bInv As Boolean
 
  Set RX = CreateObject("VBScript.RegExp")
  RX.Pattern = "^([^ ]+)( )([^ ]+)( )([^ ]+ )([^ ]+)( )(.+)( )([^ ]+)( )([^ ]+)$"
  sFile = Application.GetOpenFilename()
  If sFile <> "False" Then
    Open sFile For Input As #1
    ReDim a(1 To Rows.Count, 1 To 1)
    Do Until EOF(1)
        Line Input #1, s
        s = Application.Trim(s)
        Select Case True
          Case s = "INVOICE"
            bInv = True
          Case bInv And IsNumeric(Left(s, 1)) And s <> InvNo
            k = k + 1
            a(k, 1) = "Inv: " & s
            InvNo = s
            bInv = False
          Case s Like "#*.* #*.* EA * #*.* #*.*#"
            k = k + 1
            a(k, 1) = RX.Replace(s, "$1;$3;$6;$8;$10;$12")
            bInv = False
          Case Else
            bInv = False
        End Select
    Loop
    Close #1
    Sheets.Add
    With Range("A2").Resize(k)
      .Value = a
      .TextToColumns DataType:=xlDelimited, Semicolon:=True, Comma:=False, Space:=False, Other:=False
      .Resize(, 6).Rows(0).Value = Array("Ordered", "Shipped", "Item ID", "Item Description", "Unit Price", "Ext price")
      .Resize(, 6).EntireColumn.AutoFit
    End With
  End If
End Sub
Thank you very much, I appreciate the help it works great
 

Watch MrExcel Video

Forum statistics

Threads
1,132,890
Messages
5,655,820
Members
418,239
Latest member
RSadler11

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Top