Split comma delimited data but ignoring commas between brackets

DeonM

New Member
Joined
Sep 18, 2014
Messages
23
Hi

I have a set of data in the following format:

codeother membersmodified
ABCBabalwa Lobishe (Economic Development, Tourism and Agriculture), Fikile Desi (Constituency Coordinator), Thembinkosi Mafana (Safety and Security), Nomamerika Magopeni (Sport, Recreation, Arts and Culture), Wandisile Jikeka (Corporate Services), Mbuyiseli Mkavu (Human Settlements), Andile Mfunda (Infrastructure, Engineering and Energy), Paticia Ndlovu (Public Health), Balu Naran (Budget and Treasury)2014-05-08 15:27:10

<tbody>
</tbody>

I need to split column B into separate rows and repeat the other data. The data is essentially comma delimited.

I have the following code that successfully splits Column B to new rows as needed:

Code:
Sub splitByColB()
    Dim r As Range, i As Long, ar
    Set r = Worksheets("Metropolitans").Range("B999999").End(xlUp)
    Do While r.Row > 1
        ar = Split(r.Value, ",")
        If UBound(ar) >= 0 Then r.Value = ar(0)
        For i = UBound(ar) To 1 Step -1
            r.EntireRow.Copy
            r.Offset(1).EntireRow.Insert
            r.Offset(1).Value = ar(i)
        Next
        Set r = r.Offset(-1)
    Loop
End Sub
My problem is that some of the data sometimes includes commas that appear between brackets (marked red in data above) that should be ignored as delimiters. I assume I should first replace all commas between brackets with something else and then reverse later, but have no idea how to do this.

Any help appreciated.
 

Some videos you may like

Excel Facts

Which Excel functions can ignore hidden rows?
The SUBTOTAL and AGGREGATE functions ignore hidden rows. AGGREGATE can also exclude error cells and more.

JazzSP8

Well-known Member
Joined
Sep 30, 2005
Messages
1,218
Alternatively, if every entry has bracketed content at the end then you can use ")" as your delimiter and append it back on again once the data has been split?
 

DeonM

New Member
Joined
Sep 18, 2014
Messages
23
Thanks. The entries don't always have bracketed position titles. So many are also just 'Name Surname, Name Surname, etc.' Then commas are the only delimiters.
 

sandy666

Well-known Member
Joined
Oct 24, 2015
Messages
5,449
Is that what you want?

codeother membersmodifiedcodeother membersmodified
ABCBabalwa Lobishe (Economic Development, Tourism and Agriculture), Fikile Desi (Constituency Coordinator), Thembinkosi Mafana (Safety and Security), Nomamerika Magopeni (Sport, Recreation, Arts and Culture), Wandisile Jikeka (Corporate Services), Mbuyiseli Mkavu (Human Settlements), Andile Mfunda (Infrastructure, Engineering and Energy), Paticia Ndlovu (Public Health), Balu Naran (Budget and Treasury), John Doe, Hamri Salaam
08/05/2014 15:27​
ABCBabalwa Lobishe (Economic Development, Tourism and Agriculture)
08/05/2014 15:27​
ABCFikile Desi (Constituency Coordinator)
08/05/2014 15:27​
ABCThembinkosi Mafana (Safety and Security)
08/05/2014 15:27​
ABCNomamerika Magopeni (Sport, Recreation, Arts and Culture)
08/05/2014 15:27​
ABCWandisile Jikeka (Corporate Services)
08/05/2014 15:27​
ABCMbuyiseli Mkavu (Human Settlements)
08/05/2014 15:27​
ABCAndile Mfunda (Infrastructure, Engineering and Energy)
08/05/2014 15:27​
ABCPaticia Ndlovu (Public Health)
08/05/2014 15:27​
ABCBalu Naran (Budget and Treasury)
08/05/2014 15:27​
ABCJohn Doe
08/05/2014 15:27​
ABCHamri Salaam
08/05/2014 15:27​
 
Last edited:

steve the fish

Well-known Member
Joined
Oct 20, 2009
Messages
8,080
Office Version
365
Platform
Windows
Altering your code a bit you could do this:

Code:
Sub splitByColB()
    Dim r As Range, i As Long, ar, l As Long, flag As Boolean
    Set r = Worksheets("Metropolitans").Range("B999999").End(xlUp)
    Do While r.Row > 1
        flag = False
        For l = 1 To Len(r)
            If Mid(r, l, 1) = "(" Then flag = True
            If Mid(r, l, 1) = ")" Then flag = False
            If flag = True And Mid(r, l, 1) = "," Then
                r = Left(r, l - 1) & "|" & Mid(r, l + 1, Len(r))
            End If
        Next
        ar = Split(r, ",")
        If UBound(ar) >= 0 Then r.Value = Trim(Replace((ar(0)), "|", ","))
        For i = UBound(ar) To LBound(ar) + 1 Step -1
            r.EntireRow.Copy
            r.Offset(1).EntireRow.Insert
            r.Offset(1).Value = Trim(Replace((ar(i)), "|", ","))
        Next
        Set r = r.Offset(-1)
    Loop
End Sub
Give it a test and see how you get on.
 

Peter_SSs

MrExcel MVP, Moderator
Joined
May 28, 2005
Messages
44,754
Office Version
365
Platform
Windows
You could also give this a try. At the moment, I have it writing results in columns E:G rather than over-writing the original data in columns A:C. Can change that later if it appears to be doing what you want.

Code:
Sub splitByColB_v2()
  Dim RX As Object
  Dim a As Variant, b As Variant, m As Variant
  Dim i As Long, j As Long, k As Long, uba2 As Long
  
  Set RX = CreateObject("VBScript.RegExp")
  RX.Global = True
  RX.Pattern = "([^\,].+?)(\(.+?\))?(?=\,|$)"
  a = Range("A2", Range("A" & Rows.Count).End(xlUp)).Resize(, 3).Value
  uba2 = UBound(a, 2)
  ReDim b(1 To Rows.Count, 1 To uba2)
  For i = 1 To UBound(a)
    For Each m In RX.Execute(a(i, 2))
      k = k + 1
      For j = 1 To uba2
        b(k, j) = a(i, j)
      Next j
      b(k, 2) = m
    Next m
  Next i
  Range("E2").Resize(k, uba2).Value = b
End Sub
Here is my sample data and results of code:

Excel Workbook
ABCDEFG
1codeother membersmodified
2ABCBabalwa Lobishe (Economic Development, Tourism and Agriculture), Fikile Desi (Constituency Coordinator), Thembinkosi Mafana (Safety and Security), Nomamerika Magopeni (Sport, Recreation, Arts and Culture), Wandisile Jikeka (Corporate Services), Mbuyiseli Mkavu (Human Settlements), Andile Mfunda (Infrastructure, Engineering and Energy), Paticia Ndlovu (Public Health), Balu Naran (Budget and Treasury)8/05/2014 15:27ABCBabalwa Lobishe (Economic Development, Tourism and Agriculture)8/05/2014 15:27
3DEFThembinkosi Mafana, Nomamerika Magopeni, Wandisile Jikeka (Corporate Services), Mbuyiseli Mkavu (Human Settlements), Andile Mfunda (Infrastructure, Engineering and Energy), Paticia Ndlovu (Public Health), Balu Naran (Budget and Treasury)xzscABCFikile Desi (Constituency Coordinator)8/05/2014 15:27
4ABCThembinkosi Mafana (Safety and Security)8/05/2014 15:27
5ABCNomamerika Magopeni (Sport, Recreation, Arts and Culture)8/05/2014 15:27
6ABCWandisile Jikeka (Corporate Services)8/05/2014 15:27
7ABCMbuyiseli Mkavu (Human Settlements)8/05/2014 15:27
8ABCAndile Mfunda (Infrastructure, Engineering and Energy)8/05/2014 15:27
9ABCPaticia Ndlovu (Public Health)8/05/2014 15:27
10ABCBalu Naran (Budget and Treasury)8/05/2014 15:27
11DEFThembinkosi Mafanaxzsc
12DEFNomamerika Magopenixzsc
13DEFWandisile Jikeka (Corporate Services)xzsc
14DEFMbuyiseli Mkavu (Human Settlements)xzsc
15DEFAndile Mfunda (Infrastructure, Engineering and Energy)xzsc
16DEFPaticia Ndlovu (Public Health)xzsc
17DEFBalu Naran (Budget and Treasury)xzsc
18
Sheet1
 
Last edited:

DeonM

New Member
Joined
Sep 18, 2014
Messages
23
Brilliant. That's exactly what I was looking for. Thank you.

When doing the last = Trim(Replace((ar(i)), "|", ",")) is there a simple way to also remove any carriage returns that may have been left over in column B?
 

DeonM

New Member
Joined
Sep 18, 2014
Messages
23
Altering your code a bit you could do this:

Code:
Sub splitByColB()
    Dim r As Range, i As Long, ar, l As Long, flag As Boolean
    Set r = Worksheets("Metropolitans").Range("B999999").End(xlUp)
    Do While r.Row > 1
        flag = False
        For l = 1 To Len(r)
            If Mid(r, l, 1) = "(" Then flag = True
            If Mid(r, l, 1) = ")" Then flag = False
            If flag = True And Mid(r, l, 1) = "," Then
                r = Left(r, l - 1) & "|" & Mid(r, l + 1, Len(r))
            End If
        Next
        ar = Split(r, ",")
        If UBound(ar) >= 0 Then r.Value = Trim(Replace((ar(0)), "|", ","))
        For i = UBound(ar) To LBound(ar) + 1 Step -1
            r.EntireRow.Copy
            r.Offset(1).EntireRow.Insert
            r.Offset(1).Value = Trim(Replace((ar(i)), "|", ","))
        Next
        Set r = r.Offset(-1)
    Loop
End Sub
Give it a test and see how you get on.



Brilliant. That's exactly what I was looking for. Thank you.

When doing the last = Trim(Replace((ar(i)), "|", ",")) is there a simple way to also remove any carriage returns that may have been left over in column B?

Thanks to all the other suggestions.
 

Peter_SSs

MrExcel MVP, Moderator
Joined
May 28, 2005
Messages
44,754
Office Version
365
Platform
Windows
.. is there a simple way to also remove any carriage returns that may have been left over in column B?
For my code I have
- adjusted for that, assuming that the "carriage return" is CHAR(10), and added TRIM.
- set to specifically target sheet 'Metropolitans' which my earlier code had not
- set to now over-write the original data

Code:
Sub splitByColB_v3()
  Dim RX As Object
  Dim a As Variant, b As Variant, m As Variant
  Dim i As Long, j As Long, k As Long, uba2 As Long
  
  Set RX = CreateObject("VBScript.RegExp")
  RX.Global = True
  RX.Pattern = "([^\,].+?)(\(.+?\))?(?=\,|$)"
  With Sheets("Metropolitans")
    a = .Range("A2", .Range("A" & .Rows.Count).End(xlUp)).Resize(, 3).Value
    uba2 = UBound(a, 2)
    ReDim b(1 To Rows.Count, 1 To uba2)
    For i = 1 To UBound(a)
      For Each m In RX.Execute(Replace(a(i, 2), Chr(10), ""))
        k = k + 1
        For j = 1 To uba2
          b(k, j) = a(i, j)
        Next j
        b(k, 2) = Trim(m)
      Next m
    Next i
    .Range("A2").Resize(k, uba2).Value = b
  End With
End Sub

Thanks to all the other suggestions.
Not sure if you tried the other suggestions or how much data you have but for me, testing on about 200 rows of data like your sample data, my code took about 0.2 seconds versus nearly 30 seconds for the other code.
 

DeonM

New Member
Joined
Sep 18, 2014
Messages
23
For my code I have
- adjusted for that, assuming that the "carriage return" is CHAR(10), and added TRIM.
- set to specifically target sheet 'Metropolitans' which my earlier code had not
- set to now over-write the original data

Code:
Sub splitByColB_v3()
  Dim RX As Object
  Dim a As Variant, b As Variant, m As Variant
  Dim i As Long, j As Long, k As Long, uba2 As Long
  
  Set RX = CreateObject("VBScript.RegExp")
  RX.Global = True
  RX.Pattern = "([^\,].+?)(\(.+?\))?(?=\,|$)"
  With Sheets("Metropolitans")
    a = .Range("A2", .Range("A" & .Rows.Count).End(xlUp)).Resize(, 3).Value
    uba2 = UBound(a, 2)
    ReDim b(1 To Rows.Count, 1 To uba2)
    For i = 1 To UBound(a)
      For Each m In RX.Execute(Replace(a(i, 2), Chr(10), ""))
        k = k + 1
        For j = 1 To uba2
          b(k, j) = a(i, j)
        Next j
        b(k, 2) = Trim(m)
      Next m
    Next i
    .Range("A2").Resize(k, uba2).Value = b
  End With
End Sub

Not sure if you tried the other suggestions or how much data you have but for me, testing on about 200 rows of data like your sample data, my code took about 0.2 seconds versus nearly 30 seconds for the other code.
Hey Peter

Thanks for that. It works perfectly. I have been testing with about 200 rows too - and get similar time results. Your code is much, much faster. My total dataset has about 10 000 rows. Even though I only really have to run this procedure once to convert legacy data, the speed will be a big factor.

Much appreciated.
 

Watch MrExcel Video

Forum statistics

Threads
1,099,916
Messages
5,471,488
Members
406,766
Latest member
Parasoner

This Week's Hot Topics

Top