making two uinque list with "similarity" and "counting"

aminexcel

Board Regular
Joined
May 2, 2009
Messages
63
Mr.Excel Members
with hello and regard
i have Column A and B of this table. Column A code are commodity codes and column B are countries that import this commodities (my data are in 325678 rows and i am working with excel 2007):

<table style="border-collapse: collapse; width: 303pt;" border="0" cellpadding="0" cellspacing="0" width="404"><col style="width: 41pt;" width="55"> <col style="width: 23pt;" width="30"> <col style="width: 39pt;" width="52"> <col style="width: 41pt;" width="55"> <col style="width: 159pt;" width="212"> <tbody><tr style="height: 15pt;" height="20"> <td class="xl67" style="height: 15pt; width: 41pt;" align="center" height="20" width="55">A</td> <td class="xl67" style="border-left: medium none; width: 23pt;" align="center" width="30">B</td> <td class="xl67" style="border-left: medium none; width: 39pt;" align="center" width="52">C</td> <td class="xl67" style="border-left: medium none; width: 41pt;" align="center" width="55">D</td> <td class="xl67" style="border-left: medium none; width: 159pt;" align="center" width="212">E</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl68" style="height: 15pt; border-top: medium none;" align="center" height="20">8-digit</td> <td class="xl68" style="border-top: medium none; border-left: medium none;" align="center">iso</td> <td class="xl68" style="border-top: medium none; border-left: medium none;" align="center">6-digit</td> <td class="xl68" style="border-top: medium none; border-left: medium none;" align="center">unique</td> <td class="xl68" style="border-top: medium none; border-left: medium none;" align="center">Number of different countries</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl66" style="height: 15pt; border-top: medium none;" align="center" height="20">01041000</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="center">ARE</td> <td class="xl65" style="border-top: medium none; border-left: medium none;" align="center">010410</td> <td class="xl65" style="border-top: medium none; border-left: medium none;" align="center">010410</td> <td class="xl69" style="border-top: medium none; border-left: medium none;" align="center">1</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl66" style="height: 15pt; border-top: medium none;" align="center" height="20">01042000</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="center">AFG</td> <td class="xl65" style="border-top: medium none; border-left: medium none;" align="center">010420</td> <td class="xl65" style="border-top: medium none; border-left: medium none;" align="center">010420</td> <td class="xl69" style="border-top: medium none; border-left: medium none;" align="center">4</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl66" style="height: 15pt; border-top: medium none;" align="center" height="20">01042010</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="center">AFG</td> <td class="xl65" style="border-top: medium none; border-left: medium none;" align="center">010420</td> <td class="xl65" style="border-top: medium none; border-left: medium none;" align="center">040130</td> <td class="xl69" style="border-top: medium none; border-left: medium none;" align="center">1</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl66" style="height: 15pt; border-top: medium none;" align="center" height="20">01042020</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="center">QAT</td> <td class="xl65" style="border-top: medium none; border-left: medium none;" align="center">010420</td> <td class="xl65" style="border-top: medium none; border-left: medium none;" align="center">040310</td> <td class="xl69" style="border-top: medium none; border-left: medium none;" align="center">1</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl66" style="height: 15pt; border-top: medium none;" align="center" height="20">01042030</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="center">IRQ</td> <td class="xl65" style="border-top: medium none; border-left: medium none;" align="center">010420</td> <td class="xl65" style="border-top: medium none; border-left: medium none;" align="center">040630</td> <td class="xl69" style="border-top: medium none; border-left: medium none;" align="center">1</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl66" style="height: 15pt; border-top: medium none;" align="center" height="20">01042040</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="center">KWT</td> <td class="xl65" style="border-top: medium none; border-left: medium none;" align="center">010420</td> <td class="xl65" style="border-top: medium none; border-left: medium none;" align="center">040690</td> <td class="xl69" style="border-top: medium none; border-left: medium none;" align="center">3</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl66" style="height: 15pt; border-top: medium none;" align="center" height="20">04013000</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="center">TJK</td> <td class="xl65" style="border-top: medium none; border-left: medium none;" align="center">040130</td> <td class="xl65" style="border-top: medium none; border-left: medium none;" align="center">060310</td> <td class="xl69" style="border-top: medium none; border-left: medium none;" align="center">4</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl66" style="height: 15pt; border-top: medium none;" align="center" height="20">04031090</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="center">USA</td> <td class="xl65" style="border-top: medium none; border-left: medium none;" align="center">040310</td> <td class="xl70" style="border-top: medium none; border-left: medium none;" align="center">
</td> <td class="xl70" style="border-top: medium none; border-left: medium none;" align="center">
</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl66" style="height: 15pt; border-top: medium none;" align="center" height="20">04063000</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="center">SAU</td> <td class="xl65" style="border-top: medium none; border-left: medium none;" align="center">040630</td> <td class="xl70" style="border-top: medium none; border-left: medium none;" align="center">
</td> <td class="xl70" style="border-top: medium none; border-left: medium none;" align="center">
</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl66" style="height: 15pt; border-top: medium none;" align="center" height="20">04069000</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="center">CIV</td> <td class="xl65" style="border-top: medium none; border-left: medium none;" align="center">040690</td> <td class="xl70" style="border-top: medium none; border-left: medium none;" align="center">
</td> <td class="xl70" style="border-top: medium none; border-left: medium none;" align="center">
</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl66" style="height: 15pt; border-top: medium none;" align="center" height="20">04069010</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="center">PAK</td> <td class="xl65" style="border-top: medium none; border-left: medium none;" align="center">040690</td> <td class="xl70" style="border-top: medium none; border-left: medium none;" align="center">
</td> <td class="xl70" style="border-top: medium none; border-left: medium none;" align="center">
</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl66" style="height: 15pt; border-top: medium none;" align="center" height="20">04069020</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="center">GBR</td> <td class="xl65" style="border-top: medium none; border-left: medium none;" align="center">040690</td> <td class="xl70" style="border-top: medium none; border-left: medium none;" align="center">
</td> <td class="xl70" style="border-top: medium none; border-left: medium none;" align="center">
</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl66" style="height: 15pt; border-top: medium none;" align="center" height="20">04069070</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="center">CIV</td> <td class="xl65" style="border-top: medium none; border-left: medium none;" align="center">040690</td> <td class="xl70" style="border-top: medium none; border-left: medium none;" align="center">
</td> <td class="xl70" style="border-top: medium none; border-left: medium none;" align="center">
</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl66" style="height: 15pt; border-top: medium none;" align="center" height="20">06031040</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="center">KAZ</td> <td class="xl65" style="border-top: medium none; border-left: medium none;" align="center">060310</td> <td class="xl70" style="border-top: medium none; border-left: medium none;" align="center">
</td> <td class="xl70" style="border-top: medium none; border-left: medium none;" align="center">
</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl66" style="height: 15pt; border-top: medium none;" align="center" height="20">06031060</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="center">RUS</td> <td class="xl65" style="border-top: medium none; border-left: medium none;" align="center">060310</td> <td class="xl70" style="border-top: medium none; border-left: medium none;" align="center">
</td> <td class="xl70" style="border-top: medium none; border-left: medium none;" align="center">
</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl66" style="height: 15pt; border-top: medium none;" align="center" height="20">06031070</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="center">UKR</td> <td class="xl65" style="border-top: medium none; border-left: medium none;" align="center">060310</td> <td class="xl70" style="border-top: medium none; border-left: medium none;" align="center">
</td> <td class="xl70" style="border-top: medium none; border-left: medium none;" align="center">
</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl66" style="height: 15pt; border-top: medium none;" align="center" height="20">06031080</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="center">TWN</td> <td class="xl65" style="border-top: medium none; border-left: medium none;" align="center">060310</td> <td class="xl70" style="border-top: medium none; border-left: medium none;" align="center">
</td> <td class="xl70" style="border-top: medium none; border-left: medium none;" align="center">
</td> </tr> </tbody></table>
i want a macro that reproduce columns C, D and E:
column C = left(A,2)
column D = unique list of C
column E = number of different countries in column B for each column D codes
note: similar codes in column C may have similar or different countries in column B

thanks a lot
aminexcel
 
Last edited:

Excel Facts

Select all contiguous cells
Pressing Ctrl+* (asterisk) will select the "current region" - all contiguous cells in all directions.
Try this:
Code:
Option Explicit

Sub Summarize()
'JBeaucaire  2/22/2010
Dim Lastrow As Long
Application.ScreenUpdating = False
Lastrow = Range("A" & Rows.Count).End(xlUp).Row

Range("C:E").ClearContents
Range("C1") = "6-digit"
Range("C2:C" & Lastrow).FormulaR1C1 = "=LEFT(RC1,6)"
Range("C1:C" & Lastrow).AdvancedFilter Action:=xlFilterCopy, CopyToRange:=Range("D1"), Unique:=True
Range("D1:E1") = ["Unique","Number of Countries"}]

Lastrow = Range("D" & Rows.Count).End(xlUp).Row
    With Range("E2:E" & Lastrow)
        .FormulaR1C1 = "=COUNTIF(C1,RC3&""*"")"
        .Value = .Value
    End With

Application.ScreenUpdating = True
End Sub
 
Upvote 0
Yes, you're right, the final column was incorrect. Sorry about that. I hoped a simple worksheet formula would suffice, but it didn't. So try this instead...
Code:
Option Explicit

Sub Summarize()
'JBeaucaire  2/22/2010
Dim Lastrow As Long, BottomRow As Long
Dim u As Long, c As Long, MyCount As Long
Dim Buf As String, Uniques As Variant, Countries As Variant

Application.ScreenUpdating = False
Lastrow = Range("A" & Rows.Count).End(xlUp).Row

Range("C:E").ClearContents
Range("C1") = "6-digit"
Range("C2:C" & Lastrow).FormulaR1C1 = "=LEFT(RC1,6)"
Range("C1:C" & Lastrow).AdvancedFilter Action:=xlFilterCopy, CopyToRange:=Range("D1"), Unique:=True
Range("D1:E1") = [{"Unique","Number of Countries"}]

BottomRow = Range("D" & Rows.Count).End(xlUp).Row
Uniques = Application.WorksheetFunction.Transpose(Range("D2:D" & BottomRow).Value2)
Buf = ","

For u = LBound(Uniques) To UBound(Uniques)
    Range("C:C").AutoFilter Field:=1, Criteria1:=Uniques(u)
    Countries = Application.WorksheetFunction.Transpose(Range("B2:B" & Lastrow).SpecialCells(xlCellTypeVisible).Value2)
    If IsArray(Countries) Then
        For c = LBound(Countries) To UBound(Countries)
            If Not InStr(Buf, Countries(c)) > 0 Then
                Buf = Buf & Countries(c) & ","
                    MyCount = MyCount + 1
            End If
        Next c
        Uniques(u) = MyCount
        MyCount = 0
    Else
        Uniques(u) = 1
    End If
    Buf = ","
Next u

ActiveSheet.AutoFilterMode = False
ActiveSheet.Range("E2").Resize(UBound(Uniques)).Value = Application.WorksheetFunction.Transpose(Uniques)
Application.ScreenUpdating = True
End Sub
 
Upvote 0
it does not work
it can not reproduce C, D and E columns described in question, please if it is possible, try your macro on this two column before sending:

<table style="border-collapse: collapse; width: 72pt;" border="0" cellpadding="0" cellspacing="0" width="96"><col style="width: 42pt;" width="56"> <col style="width: 30pt;" width="40"> <tbody><tr style="height: 15pt;" height="20"> <td style="height: 15pt; width: 42pt;" align="center" height="20" width="56">8-digit </td> <td style="width: 30pt;" align="center" width="40">iso </td> </tr> <tr style="height: 15pt;" height="20"> <td style="height: 15pt;" align="center" height="20">1041000</td> <td align="center">ARE </td> </tr> <tr style="height: 15pt;" height="20"> <td style="height: 15pt;" align="center" height="20">1042000</td> <td align="center">AFG </td> </tr> <tr style="height: 15pt;" height="20"> <td style="height: 15pt;" align="center" height="20">1042010</td> <td align="center">AFG </td> </tr> <tr style="height: 15pt;" height="20"> <td style="height: 15pt;" align="center" height="20">1042020</td> <td align="center">QAT </td> </tr> <tr style="height: 15pt;" height="20"> <td style="height: 15pt;" align="center" height="20">1042030</td> <td align="center">IRQ </td> </tr> <tr style="height: 15pt;" height="20"> <td style="height: 15pt;" align="center" height="20">1042040</td> <td align="center">KWT </td> </tr> <tr style="height: 15pt;" height="20"> <td style="height: 15pt;" align="center" height="20">4013000</td> <td align="center">TJK </td> </tr> <tr style="height: 15pt;" height="20"> <td style="height: 15pt;" align="center" height="20">4031090</td> <td align="center">USA </td> </tr> <tr style="height: 15pt;" height="20"> <td style="height: 15pt;" align="center" height="20">4063000</td> <td align="center">SAU </td> </tr> <tr style="height: 15pt;" height="20"> <td style="height: 15pt;" align="center" height="20">4069000</td> <td align="center">CIV </td> </tr> <tr style="height: 15pt;" height="20"> <td style="height: 15pt;" align="center" height="20">4069010</td> <td align="center">PAK </td> </tr> <tr style="height: 15pt;" height="20"> <td style="height: 15pt;" align="center" height="20">4069020</td> <td align="center">GBR </td> </tr> <tr style="height: 15pt;" height="20"> <td style="height: 15pt;" align="center" height="20">4069070</td> <td align="center">CIV </td> </tr> <tr style="height: 15pt;" height="20"> <td style="height: 15pt;" align="center" height="20">6031040</td> <td align="center">KAZ </td> </tr> <tr style="height: 15pt;" height="20"> <td style="height: 15pt;" align="center" height="20">6031060</td> <td align="center">RUS </td> </tr> <tr style="height: 15pt;" height="20"> <td style="height: 15pt;" align="center" height="20">6031070</td> <td align="center">UKR </td> </tr> <tr style="height: 15pt;" height="20"> <td style="height: 15pt;" align="center" height="20">6031080</td> <td align="center">TWN </td> </tr> </tbody></table>
 
Upvote 0
It works great.

Excel Workbook
ABCDE
18-digitiso***
201041000ARE***
301042000AFG***
401042010AFG***
501042020QAT***
601042030IRQ***
701042040KWT***
804013000TJK***
904031090USA***
1004063000SAU***
1104069000CIV***
1204069010PAK***
1304069020GBR***
1404069070CIV***
1506031040KAZ***
1606031060RUS***
1706031070UKR***
1806031080TWN***
BEFORE
Excel Workbook
ABCDE
18-digitiso6-digitUniqueNumber of Countries
201041000ARE0104100104101
301042000AFG0104200104204
401042010AFG0104200401301
501042020QAT0104200403101
601042030IRQ0104200406301
701042040KWT0104200406903
804013000TJK0401300603104
904031090USA040310**
1004063000SAU040630**
1104069000CIV040690**
1204069010PAK040690**
1304069020GBR040690**
1404069070CIV040690**
1506031040KAZ060310**
1606031060RUS060310**
1706031070UKR060310**
1806031080TWN060310**
AFTER


NOTE: Column A is text values, your sample data in post #1 shows leading zeros and the only way to do that and have the zeros actually exist is for the data to BE text. Your column C shows the leading zeros as part of the 6-digit string, so the zeros have to be real...thus text strings.

When the column A is changed to numbers like you changed in post #6, everything is different. Does that mean column C should become 5 digits (leading zero stripped there, too) or what?

I can adjust the macro so that is works with numerics in column A, but you'll need to present the full sample data set again, like in post #1. This would be notably different.
 
Last edited:
Upvote 0
Assuming you want column A to be numeric and column C to be 5-digits for the match, then this adjusted version creates the same results as your original post:

Excel Workbook
ABCDE
18-digitiso5-digitUniqueNumber of Countries
21041000ARE10410104101
31042000AFG10420104204
41042010AFG10420401301
51042020QAT10420403101
61042030IRQ10420406301
71042040KWT10420406903
84013000TJK40130603104
94031090USA40310
104063000SAU40630
114069000CIV40690
124069010PAK40690
134069020GBR40690
144069070CIV40690
156031040KAZ60310
166031060RUS60310
176031070UKR60310
186031080TWN60310
Sheet3


Code:
Sub SummarizeNumeric()
'JBeaucaire  2/23/2010
Dim Lastrow As Long, BottomRow As Long
Dim u As Long, c As Long, MyCount As Long
Dim Buf As String, Uniques As Variant, Countries As Variant

Application.ScreenUpdating = False
Lastrow = Range("A" & Rows.Count).End(xlUp).Row

Range("C:E").ClearContents
Range("C1") = "5-digit"
Range("C2:C" & Lastrow).FormulaR1C1 = "=LEFT(RC1,5)+0"
Range("C1:C" & Lastrow).AdvancedFilter Action:=xlFilterCopy, CopyToRange:=Range("D1"), Unique:=True
Range("D1:E1") = [{"Unique","Number of Countries"}]

BottomRow = Range("D" & Rows.Count).End(xlUp).Row
Uniques = Application.WorksheetFunction.Transpose(Range("D2:D" & BottomRow).Value2)
Buf = ","


For u = LBound(Uniques) To UBound(Uniques)
    Range("C:C").AutoFilter Field:=1, Criteria1:=Uniques(u)
    Countries = Application.WorksheetFunction.Transpose(Range("B2:B" & Lastrow).SpecialCells(xlCellTypeVisible).Value2)
    If IsArray(Countries) Then
        For c = LBound(Countries) To UBound(Countries)
            If Not InStr(Buf, Countries(c)) > 0 Then
                Buf = Buf & Countries(c) & ","
                    MyCount = MyCount + 1
            End If
        Next c
        Uniques(u) = MyCount
        MyCount = 0
    Else
        Uniques(u) = 1
    End If
    Buf = ","
Next u

ActiveSheet.AutoFilterMode = False
ActiveSheet.Range("E2").Resize(UBound(Uniques)).Value = Application.WorksheetFunction.Transpose(Uniques)
Application.ScreenUpdating = True
End Sub
 
Upvote 0

Forum statistics

Threads
1,214,387
Messages
6,119,225
Members
448,877
Latest member
gb24

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top