Merge duplicates in ColA while picking most frequent value in ColB

c2rn75

New Member
Joined
Sep 14, 2016
Messages
2
Hi,

I have a spreadsheet with 130000 lines, and I would really appreciate some help in merging some of the data. I have searched the forum for help, and even though I found some related post, it did not solve my problem. I have lost of rows with duplicates of an unique ID (column A), and each ID may be associated with different values (Column B). I need to merge all the duplicate ID's into one row, and then pick the most frequent value in Column B as the new value for that row. Column A contain numbers, and Column B contain text strings. If there are eg 2 values in Column B that occur equally often, I dont care which one is chosen.

I have this:

A B
1a
1a
1a
1b
1c
1d
2a
2b
2b
2b
2c
3a
3a
3a
3a
3g
4c
4c
4d
4e
4f

<colgroup><col span="2"></colgroup><tbody>
</tbody>

I want this:

1a
2b
3a
4c

<colgroup><col span="2"></colgroup><tbody>
</tbody>

Help on this is much appreciated, because manually doing this is not going to happen :)

Thanks
Cato
 

Excel Facts

Highlight Duplicates
Home, Conditional Formatting, Highlight Cells, Duplicate records, OK to add pink formatting to any duplicates in selected range.
1a1aa
1a1aaa
1a1aaaa
1b1baaab
1c1caaabc
1d1daaabcdaaabcd
2a2aa
2b2bab
2b2babb
2b2babbb
2c2cabbbcabbbc
3a3aa
3a3aaa
3a3aaaa
3a3aaaaa
3g3gaaaagaaaag
4c4cc
4c4ccc
4d4dccd
4e4eccde
4f4fccdefccdef
I want this:
1a
2bby concatenating all the letters against 1,2,3,4
3aand then finding the longest concatenation
4c
1daaabcdaaabcd
2cabbbcabbbc
3gaaaagaaaag
4fccdefccdef
that reduces your 130,000 rows somewhat
I do not know how long your text strings are
or how many different text strings there are

<colgroup><col span="13"></colgroup><tbody>
</tbody>
 
Upvote 0
Try this for data starting "A2"and results starting "C1".
Code:
[COLOR="Navy"]Sub[/COLOR] MG14Sep50
[COLOR="Navy"]Dim[/COLOR] Ray             [COLOR="Navy"]As[/COLOR] Variant
[COLOR="Navy"]Dim[/COLOR] n               [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Long[/COLOR]
[COLOR="Navy"]Dim[/COLOR] Dic             [COLOR="Navy"]As[/COLOR] Object
[COLOR="Navy"]Dim[/COLOR] k               [COLOR="Navy"]As[/COLOR] Variant
[COLOR="Navy"]Dim[/COLOR] p               [COLOR="Navy"]As[/COLOR] Variant
[COLOR="Navy"]Dim[/COLOR] c               [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Long[/COLOR]
[COLOR="Navy"]Dim[/COLOR] oMax            [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Long[/COLOR]
    
 Ray = Range(Range("A2"), Range("A" & Rows.Count).End(xlUp)).Resize(, 2)
 [COLOR="Navy"]Set[/COLOR] Dic = CreateObject("Scripting.Dictionary")
    Dic.CompareMode = 1
   [COLOR="Navy"]For[/COLOR] n = 1 To UBound(Ray)
            [COLOR="Navy"]If[/COLOR] Not Dic.exists(Ray(n, 1)) [COLOR="Navy"]Then[/COLOR]
                [COLOR="Navy"]Set[/COLOR] Dic(Ray(n, 1)) = CreateObject("Scripting.Dictionary")
            [COLOR="Navy"]End[/COLOR] If
        
            [COLOR="Navy"]If[/COLOR] Not Dic(Ray(n, 1)).exists(Ray(n, 2)) [COLOR="Navy"]Then[/COLOR]
                Dic(Ray(n, 1)).Add (Ray(n, 2)), 1
            [COLOR="Navy"]Else[/COLOR]
                Dic(Ray(n, 1)).Item(Ray(n, 2)) = Dic(Ray(n, 1)).Item(Ray(n, 2)) + 1
            [COLOR="Navy"]End[/COLOR] If
    [COLOR="Navy"]Next[/COLOR] n
   
    [COLOR="Navy"]For[/COLOR] [COLOR="Navy"]Each[/COLOR] k [COLOR="Navy"]In[/COLOR] Dic.Keys
        c = c + 1
        Cells(c, "C") = k
           [COLOR="Navy"]For[/COLOR] [COLOR="Navy"]Each[/COLOR] p [COLOR="Navy"]In[/COLOR] Dic(k)
               [COLOR="Navy"]If[/COLOR] Dic(k).Item(p) > oMax [COLOR="Navy"]Then[/COLOR]
                    oMax = Dic(k).Item(p)
                    Cells(c, "D") = p
               [COLOR="Navy"]End[/COLOR] If
            [COLOR="Navy"]Next[/COLOR] p
            oMax = 0
    [COLOR="Navy"]Next[/COLOR] k
[COLOR="Navy"]End[/COLOR] [COLOR="Navy"]Sub[/COLOR]
Regards Mick
 
Upvote 0

Forum statistics

Threads
1,214,649
Messages
6,120,732
Members
448,987
Latest member
marion_davis

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top