Split cells based on specific text which may occur multiple times in cell

charlie_580

Board Regular
Joined
Feb 3, 2007
Messages
56
Hi

I wonder if anyone can help?

I need to split multiple cells by code, however the code may occur within the original cell more than once. and example of my original cell has the following:

D11|1|2|text|numbers|text|numbers|text|numbers|text|numbers||||||||||||text|numbers|text|numbers|text|numbers|text|numbers|text|numbers|text|numberstext|numbers|text|numbers||| |text|numbers|text|555|D11|32232|text|numbers|text|numbers||||text|numbers|text|3442 F12|4332|text|numbers|text|numbers|text|numbers|text|numbers|||||5353 D11||text|numbers|text|numbers||||| D11||text|numbers|text|numbers| |text|numbers|text|numbers|text|numbers|text|numbers|F17 text|numbers|text|numbers|text|numbers |text|numbers|text|numbers|text|numbers D11||text|numbers|text|numbers|text|numbers|text|numbers|text|numbers|text|numbers| D11||text|numbers|text|numbers D11||text|numbers|text|numbers|text|numbers|text|numbers|text|numbers|text|numbers|text|D40|numbers|text|numbers D11|text|numbers|text|numbers|text|numbers|text|numbers|text|numbers|text|numbers

I need the text extracted from the first code up to the second code and so on. My first cell of extracted data would be
D11|1|2|text|numbers|text|numbers|text|numbers|text|numbers||||||||||||text|numbers|text|numbers|text|numbers|text|numbers|text|numbers|text|numberstext|numbers|text|numbers||| |text|numbers|text|555|

The second would be
D11|32232|text|numbers|text|numbers||||text|numbers|text|3442

and the third
F12|4332|text|numbers|text|numbers|text|numbers|text|numbers|||||5353
and so on.

Once extracted, I would like to then insert the number of rows required for each extracted cell and list them all in the one column.
If I don't make sense, or if anyone can help, let me know.
Thanks
 
One approach
VBA Code:
'
' This example assumes that the data (the text with the codes) is in cell A1
'
Sub CodeFindAndExtractExample()
    Dim WS As Worksheet
    Dim I As Long, J As Long, Ofs As Long, SLen As Long
    Dim S As String, DataStr As String
    Dim SA As Variant
    Dim CodeArr() As Variant, PosArr() As Variant

    Set WS = ActiveSheet

    DataStr = WS.Range("A1")                          'data is assumed to be in cell A1

    'find codes
    SA = Split(Application.Trim(Replace(DataStr, "|", " ")), " ")
    ReDim CodeArr(100)                                'increase this number if you think you will have more than 100 codes
    J = 0
    For I = LBound(SA) To UBound(SA)
        Select Case SA(I)
        Case "F12", "F17", "F18", "F19", "F20", "D04", "D11", "D20"    'your list of codes
            CodeArr(J) = SA(I)
            J = J + 1
        End Select
    Next I

    ReDim Preserve CodeArr(J - 1)
    ReDim PosArr(J - 1)

    'find code positions
    Ofs = 1
    For I = LBound(CodeArr) To UBound(CodeArr)
        PosArr(I) = InStr(Ofs, DataStr, CodeArr(I))
        Ofs = PosArr(I) + 1
    Next I

    'extract data for each code
    WS.Range("A3").Value = "Extracted Data:"
    For I = LBound(PosArr) To UBound(PosArr)
        If I < UBound(PosArr) Then
            SLen = PosArr(I + 1) - PosArr(I)
        Else
            SLen = Len(DataStr)
        End If
        S = Mid(DataStr, PosArr(I), SLen)

        WS.Range("A4").Offset(I) = S                  'write extracted data to worksheet, starting at cell A4
    Next I
End Sub
 
Upvote 0

Excel Facts

What does custom number format of ;;; mean?
Three semi-colons will hide the value in the cell. Although most people use white font instead.
Yes. Mostly preceeded by a bar and a space and followed by a space and a bar.

The numerical codes are more tricky. Sometimes they are in the above format but in some cases they are together eg 51;63 but these will also be within the bars.
Sorry. I should have been more specific. The codes beginning with a letter are always in this format. The numerical ones are the same with the difference being that sometimes there are 2 numbers separated by a semicolon.
Unless I am misunderstanding, you seem to be saying codes like D11 are always preceded by a bar then a space and always followed by a space then bar... but that is not the case in the example you posted in Message #1. Please clarify.

And I am still unclear about the numbers. Are they preceded by a bar then space and followed by a space then bar or not?

And finally, if you have a code like 51;63... is that to be broken only at the 51? In other words, the cell value it delineates would start with 51;63 as written (the 63 would be ignored as a delimiter here), correct?
 
Upvote 0
One approach
VBA Code:
'
' This example assumes that the data (the text with the codes) is in cell A1
'
Sub CodeFindAndExtractExample()
    Dim WS As Worksheet
    Dim I As Long, J As Long, Ofs As Long, SLen As Long
    Dim S As String, DataStr As String
    Dim SA As Variant
    Dim CodeArr() As Variant, PosArr() As Variant

    Set WS = ActiveSheet

    DataStr = WS.Range("A1")                          'data is assumed to be in cell A1

    'find codes
    SA = Split(Application.Trim(Replace(DataStr, "|", " ")), " ")
    ReDim CodeArr(100)                                'increase this number if you think you will have more than 100 codes
    J = 0
    For I = LBound(SA) To UBound(SA)
        Select Case SA(I)
        Case "F12", "F17", "F18", "F19", "F20", "D04", "D11", "D20"    'your list of codes
            CodeArr(J) = SA(I)
            J = J + 1
        End Select
    Next I

    ReDim Preserve CodeArr(J - 1)
    ReDim PosArr(J - 1)

    'find code positions
    Ofs = 1
    For I = LBound(CodeArr) To UBound(CodeArr)
        PosArr(I) = InStr(Ofs, DataStr, CodeArr(I))
        Ofs = PosArr(I) + 1
    Next I

    'extract data for each code
    WS.Range("A3").Value = "Extracted Data:"
    For I = LBound(PosArr) To UBound(PosArr)
        If I < UBound(PosArr) Then
            SLen = PosArr(I + 1) - PosArr(I)
        Else
            SLen = Len(DataStr)
        End If
        S = Mid(DataStr, PosArr(I), SLen)

        WS.Range("A4").Offset(I) = S                  'write extracted data to worksheet, starting at cell A4
    Next I
End Sub

Thanks for this. Much appreciated.

It works, however the data is in column P. Also, as there are up to 900 lines of data already (each line represents a different product) I would like to extract the data to the columns starting with Q so I could then try a Row to columns macro to insert the number of lines needed and move the data. I'm not sure what I'm asking is possible but I thought I'd ask and see.
 
Upvote 0
Unless I am misunderstanding, you seem to be saying codes like D11 are always preceded by a bar then a space and always followed by a space then bar... but that is not the case in the example you posted in Message #1. Please clarify.

And I am still unclear about the numbers. Are they preceded by a bar then space and followed by a space then bar or not?

And finally, if you have a code like 51;63... is that to be broken only at the 51? In other words, the cell value it delineates would start with 51;63 as written (the 63 would be ignored as a delimiter here), correct?

Again, I apologise. My initial example was created in notepad and copied across as I was typing out the post on the forum and the page refreshed and I lost it. All the error codes are in the format space bar space error code space bar space (eg " | D11 | ").

I have had a look at the lines with the numerical codes again, and to be honest, there aren't that many, so I'm not too worried if I have to separate those manually. The alphanumeric codes run into thousands though so these are the ones I'm most interested in.

Thanks
 
Upvote 0
What about the first code at the beginning of the cell's text... your example shows it starting with D11... is that correct or is there a space bar space in front of it?

As for the numbers, if I understand how they are delimited (space bar space on each side, whether a single number or two with a semi-colon,) then it is an easy matter to include them in the code so you won't have to search for them afterwards.
 
Upvote 0
No space or bar in front of the first code. Just after it.

If I'm reading
What about the first code at the beginning of the cell's text... your example shows it starting with D11... is that correct or is there a space bar space in front of it?


First code has no space or bar preceeding it.

As for the numbers, if I understand how they are delimited (space bar space on each side, whether a single number or two with a semi-colon,) then it is an easy matter to include them in the code so you won't have to search for them afterwards.

The numbers are the same format - space bar space number space bar space
The first number before the semi-colon (if present) is the identifying number and is where the split would be.

thanks
 
Upvote 0
Again, I apologise. My initial example was created in notepad and copied across as I was typing out the post on the forum and the page refreshed and I lost it.

Can you please post some corrected and accurate sample data?
 
Upvote 0
Need some accurate data to test this against.
VBA Code:
'
' This example assumes that the data (the text with the codes) is in column P and that row 1 is a header row.
' Extracted data to worksheet starting at col Q
'
Sub CodeFindAndExtractExample2()
    Dim WS As Worksheet
    Dim RangeOfCells As Range, R As Range
    Dim I As Long, J As Long, Ofs As Long, SLen As Long
    Dim S As String, DataStr As String
    Dim SA As Variant, NumCode As Variant
    Dim CodeArr() As Variant, PosArr() As Variant

    Application.ScreenUpdating = False
    Set WS = ActiveSheet

    Set RangeOfCells = WS.Range("P2:P" & WS.Range("P" & WS.Rows.Count).End(xlUp).Row)

    For Each R In RangeOfCells
        DataStr = LTrim(R.Value)
        If Left(DataStr, 1) <> "|" Then
            DataStr = "|" & DataStr
        End If
        'find codes
        SA = Split(Application.Trim(Replace(DataStr, " ", "")), "|")
        ReDim CodeArr(100)                            'increase this number if you think you will have more than 100 codes
        J = 0
        For I = LBound(SA) To UBound(SA)
            Select Case SA(I)
            Case "F12", "F17", "F18", "F19", "F20", "D40", "D11", "D20", "51", "63", "65"    'your list of codes
                CodeArr(J) = SA(I)
                J = J + 1
            Case Else
                If InStr(SA(I), ";") > 0 Then
                    NumCode = Split(Replace(SA(I), "|", ""), ";")
                    If UBound(NumCode) = 1 And IsNumeric(NumCode(0)) And IsNumeric(NumCode(1)) Then
                        CodeArr(J) = SA(I)
                        J = J + 1
                    End If
                End If
            End Select
        Next I

        ReDim Preserve CodeArr(J - 1)
        ReDim PosArr(J - 1)

        'find code positions
        Ofs = 1
        For I = LBound(CodeArr) To UBound(CodeArr)
            PosArr(I) = InStr(Ofs, DataStr, CodeArr(I))
            Ofs = PosArr(I) + 1
        Next I

        'extract data for each code
        For I = LBound(PosArr) To UBound(PosArr)
            If I < UBound(PosArr) Then
                SLen = PosArr(I + 1) - PosArr(I)
            Else
                SLen = Len(DataStr)
            End If
            S = Mid(DataStr, PosArr(I), SLen)
            R.Offset(, I + 1).Value = S               'write extracted data to worksheet, starting at col Q
        Next I
    Next R
    Application.ScreenUpdating = True
End Sub
 
Upvote 0
Does this macro work correctly with your data...
VBA Code:
Sub SplitAndRearrange()
  Dim R As Long, V As Variant, Data As Variant, Arr As Variant
  Data = Range("A1", Cells(Rows.Count, "A").End(xlUp)).Value
  For R = 1 To UBound(Data)
    For Each V In Array("| F12 |", "| F17 |", "| F18 |", "| F19 |", "| F20 |", "| D04 |", "| D11 |", "| D20 |", "| 51 |", "| 63 |", "| 65 |", "| 51;", "| 63;", "| 65;")
      Data(R, 1) = Replace(Data(R, 1), V, Chr(1) & V)
    Next
    Arr = Split(Data(R, 1), Chr(1) & "| ")
    Cells(Rows.Count, "B").End(xlUp).Offset(1).Resize(1 + UBound(Arr)) = Application.Transpose(Arr)
  Next
End Sub
 
Upvote 0
Need some accurate data to test this against.
VBA Code:
'
' This example assumes that the data (the text with the codes) is in column P and that row 1 is a header row.
' Extracted data to worksheet starting at col Q
'
Sub CodeFindAndExtractExample2()
    Dim WS As Worksheet
    Dim RangeOfCells As Range, R As Range
    Dim I As Long, J As Long, Ofs As Long, SLen As Long
    Dim S As String, DataStr As String
    Dim SA As Variant, NumCode As Variant
    Dim CodeArr() As Variant, PosArr() As Variant

    Application.ScreenUpdating = False
    Set WS = ActiveSheet

    Set RangeOfCells = WS.Range("P2:P" & WS.Range("P" & WS.Rows.Count).End(xlUp).Row)

    For Each R In RangeOfCells
        DataStr = LTrim(R.Value)
        If Left(DataStr, 1) <> "|" Then
            DataStr = "|" & DataStr
        End If
        'find codes
        SA = Split(Application.Trim(Replace(DataStr, " ", "")), "|")
        ReDim CodeArr(100)                            'increase this number if you think you will have more than 100 codes
        J = 0
        For I = LBound(SA) To UBound(SA)
            Select Case SA(I)
            Case "F12", "F17", "F18", "F19", "F20", "D40", "D11", "D20", "51", "63", "65"    'your list of codes
                CodeArr(J) = SA(I)
                J = J + 1
            Case Else
                If InStr(SA(I), ";") > 0 Then
                    NumCode = Split(Replace(SA(I), "|", ""), ";")
                    If UBound(NumCode) = 1 And IsNumeric(NumCode(0)) And IsNumeric(NumCode(1)) Then
                        CodeArr(J) = SA(I)
                        J = J + 1
                    End If
                End If
            End Select
        Next I

        ReDim Preserve CodeArr(J - 1)
        ReDim PosArr(J - 1)

        'find code positions
        Ofs = 1
        For I = LBound(CodeArr) To UBound(CodeArr)
            PosArr(I) = InStr(Ofs, DataStr, CodeArr(I))
            Ofs = PosArr(I) + 1
        Next I

        'extract data for each code
        For I = LBound(PosArr) To UBound(PosArr)
            If I < UBound(PosArr) Then
                SLen = PosArr(I + 1) - PosArr(I)
            Else
                SLen = Len(DataStr)
            End If
            S = Mid(DataStr, PosArr(I), SLen)
            R.Offset(, I + 1).Value = S               'write extracted data to worksheet, starting at col Q
        Next I
    Next R
    Application.ScreenUpdating = True
End Sub

This works better. However there are a few cases of codes not being picked up meaning there are 2 sets of codes within the one cell after extraction.

I've posted an example below
D11|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|||||TEXT|TEXT|TEXT|TEXT|TEXT|||||||||||TEXT|TEXT|TEXT||||| 63|TEXT|TEXT F19|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT||TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT||||||||||||TEXT|TEXT||||| F19|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT||TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT||||||||||||TEXT|TEXT||||| F19|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT||TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT||||||||||||TEXT|TEXT|||||

After the macro is run I have
Q1
D11|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|||||TEXT|TEXT|TEXT|TEXT|TEXT|||||||||||TEXT|TEXT|TEXT||||| 63|TEXT|TEXT


R1
F19|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT||TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT||||||||||||TEXT|TEXT|||||

S1
F19|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT||TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT||||||||||||TEXT|TEXT||||| F19|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT||TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT|TEXT||||||||||||TEXT|TEXT|||||

 
Upvote 0

Forum statistics

Threads
1,214,611
Messages
6,120,513
Members
448,967
Latest member
screechyboy79

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top