INDEX and MATCH to find duplicates, then CONCATENATE all Text in a single column that matches the results of the INDEX|MATCH, or Loop to find dupes?

blbat

Active Member
Joined
Mar 24, 2010
Messages
325
This is going to be painful to explain- apologies in advance!

I have an exported Excel Spreadsheet that contains a record set in which each row is uniquely identified by the "Agent Name" Column and the "Ticket #" Column. Only one unique Ticket number can be assigned per sale, per Agent.

Unfortunately for me, the data base that this file is exported from is butchering the export because the last field ("Remarks") is a free-form text field, and the text is messed up with hidden line breaks, and other hidden characters in the data base itself...the result is the export of each record is getting duplicate rows of the same sale, with the "Remarks" field being split out in multiple rows.

(I don't have admin rights on the data-base, so can't fix the problem on that side)

I hope the following illustrates what I'm trying to do...

This is an example of what is exported, Note column E text:

1626970715613.png


This is what it SHOULD be:

1626970728235.png


any help would be appreciated!

blbat
Excel 2016
 

Excel Facts

How to change case of text in Excel?
Use =UPPER() for upper case, =LOWER() for lower case, and =PROPER() for proper case. PROPER won't capitalize second c in Mccartney

DanteAmor

Well-known Member
Joined
Dec 3, 2018
Messages
14,177
Office Version
  1. 2007
Platform
  1. Windows
Your data on sheet 1, starting at A1.
The results on sheet 2, from A1 onwards.

VBA Code:
Sub ConcatenateText()
  Dim a As Variant, b As Variant
  Dim dic As Object
  Dim i As Long, j As Long, k As Long
  
  Set dic = CreateObject("Scripting.Dictionary")
  a = Sheets("Sheet1").Range("A1:E" & Sheets("Sheet1").Range("A" & Rows.Count).End(3).Row).Value
  ReDim b(1 To UBound(a, 1), 1 To UBound(a, 2))
  
  For i = 1 To UBound(a)
    If Not dic.exists(a(i, 1) & "|" & a(i, 3)) Then
      j = j + 1
      dic(a(i, 1) & "|" & a(i, 3)) = j
      For k = 1 To UBound(a, 2)
        b(j, k) = a(i, k)
      Next
    Else
      j = dic(a(i, 1) & "|" & a(i, 3))
      b(j, UBound(a, 2)) = b(j, UBound(a, 2)) & " " & a(i, UBound(a, 2))
    End If
  Next
  
  Sheets("Sheet2").Range("A1").Resize(j, UBound(b, 2)).Value = b
End Sub
 
Solution

blbat

Active Member
Joined
Mar 24, 2010
Messages
325
Dante!!!

I never cease to be amazed at some of your solutions.
This functions on first try.

Now, I am struggling to learn from what you wrote here-

>I don't understand why the use of ReDim in the: "ReDim b(1 To UBound(a, 1), 1 To UBound(a, 2))"
line... b was dimmed as a variant, but never set to any value prior to this line? I know there is logic at work here, I just don't know it.

>I understand your use of the Scripting Dictionary,
but let me see if I read this correct:

j = dic(a(i, 1) & "|" & a(i, 3))

you're pipe-ing the contents of Column 1 and Column 3 into one Variable "J"
then iterating through the rows where those two are the same; correct?

Thanks again...this is just great.
 

DanteAmor

Well-known Member
Joined
Dec 3, 2018
Messages
14,177
Office Version
  1. 2007
Platform
  1. Windows
I wrote some comments hoping to clarify your doubts.

ReDim b(1 To UBound(a, 1), 1 To UBound(a, 2))
The matrix 'b' is resized to receive the "possible" result, with a total of rows from the matrix 'a' UBound (a, 1) and the number of columns from the same matrix 'a' UBound(a, 2).
I mean "possible" results, because they can be the same number of rows in the matrix 'a' in the case of no duplicates. Or it can be fewer rows, if there are duplicates. In either case, the matrix 'b' must be sized to receive the maximum number of rows.
_____________________________________________________________


j = dic(a(i, 1) & "|" & a(i, 3))
To explain that line, I must go back in the code.

VBA Code:
    If Not dic.exists(a(i, 1) & "|" & a(i, 3)) Then
      j = j + 1
      dic(a(i, 1) & "|" & a(i, 3)) = j

If the index of the data in columns 1 "|" 3, does not exist, then two pieces of data are stored in the dictionary, the key that contains the data in columns 1 "|" 3 and the value of j.
At the beginning j is equal to 0, then there is the line:
j = j + 1
j = 0 + 1
j = 1
so the item for the key is 1
That means, in the logic of this code, that in row 1 we have the first key

So, let's see what happens if the key of the data in columns 1 "|" 3 already exist.
j = dic(a(i, 1) & "|" & a(i, 3))
With that statement I retrieve the row number according to the index of the data in columns 1 "|" 3.

Let's run the code to make it more understandable

First time:
not exists
j = j +1
j = 0 + 1
j = 1
key = "agent one|20110"
item = 1
b(1, 5) = "the text in this cell"

Second time:
exists
j = dic(a(i, 1) & "|" & a(i, 3))
j = 1
b(1, 5) = b(1, 5) & " " & "is suppose to be"
b(1, 5) = "the text in this cell" & "is suppose to be"

...

fifth time
not exists
j = j +1
j = 1 + 1
j = 2
key = "agent one|20025"
item = 2
b(2, 5) = "text one"

sixth time:
exists
j = dic(a(i, 1) & "|" & a(i, 3))
I retrieve the item stored in the dictionary according to the data
j = 2
b(2, 5) = b(2, 5) & " " & "text two"
b(2, 5) = "text one" & "text two"

I hope it helps.
 

blbat

Active Member
Joined
Mar 24, 2010
Messages
325

ADVERTISEMENT

i'm going to re-read this a few times!
 

blbat

Active Member
Joined
Mar 24, 2010
Messages
325
What a good link!
Now I know why you used Late binding by Dimming the Dictionary as an Object.
 
Master Excel Bundle

Excel contains over 450 functions, with more added every year. That’s a huge number, so where should you start? Right here with this bundle.

Forum statistics

Threads
1,152,043
Messages
5,767,797
Members
425,436
Latest member
MSPaperclipMan

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Top