# Unique list of occurrences in another tab help

#### abutler1986

##### New Member
Hi Experts,

I was wondering if you can please assist me.

I need to somehow create a list of unique entities based on a column, then for each occurrence show the percentage of the total? Just to make it slightly harder, I only want to take the first two characters from each occurrence.

Example:

Column A:
HP18 8GH
HP18 8GH
HP18 8GH
SE21 17H
SE21 17H
LU7 6HH
LU8 1JJ
HP17 7HG
S045 4HW

9 in Total..

I need to show on a separate tab as:

HP - 4 - (4/9)
SE - 2 - (2/9)
LU - 2 - (2/9)
SO - 1 - (1/9)

Hope someone can help me!

Thanks,

### Excel Facts

To shade all formula cells: Home, Find & Select, Formulas to select all formulas. Then apply a light fill color.
Data column "A", Results column "A" sheet2.
Code:
``````[COLOR="Navy"]Sub[/COLOR] MG26May23
[COLOR="Navy"]Dim[/COLOR] Rng [COLOR="Navy"]As[/COLOR] Range, Dn [COLOR="Navy"]As[/COLOR] Range, n [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Long,[/COLOR] Q [COLOR="Navy"]As[/COLOR] Variant
[COLOR="Navy"]Set[/COLOR] Rng = Range(Range("A2"), Range("A" & Rows.Count).End(xlUp))
[COLOR="Navy"]With[/COLOR] CreateObject("scripting.dictionary")
.CompareMode = vbTextCompare
ReDim Ray(1 To Rng.Count)
[COLOR="Navy"]For[/COLOR] [COLOR="Navy"]Each[/COLOR] Dn [COLOR="Navy"]In[/COLOR] Rng
[COLOR="Navy"]If[/COLOR] Not .Exists(Left(Dn.Value, 2)) [COLOR="Navy"]Then[/COLOR]
n = n + 1
Ray(n) = Left(Dn.Value, 2) & " - 1 - (1/" & Rng.Count & ")"
[COLOR="Navy"]Else[/COLOR]
Q = .Item(Left(Dn.Value, 2))
Q(1) = Q(1) + 1
Ray(Q(0)) = Left(Dn.Value, 2) & " -" & Q(1) & " - (" & Q(1) & "/" & Rng.Count & ")"
.Item(Left(Dn.Value, 2)) = Q
[COLOR="Navy"]End[/COLOR] If
[COLOR="Navy"]Next[/COLOR]
Sheets("Sheet2").Range("A1").Resize(.Count) = Application.Transpose(Ray)
[COLOR="Navy"]End[/COLOR] [COLOR="Navy"]Sub[/COLOR]``````
Regards Mick

Data column "A", Results column "A" sheet2.
Code:
``````[COLOR=Navy]Sub[/COLOR] MG26May23
[COLOR=Navy]Dim[/COLOR] Rng [COLOR=Navy]As[/COLOR] Range, Dn [COLOR=Navy]As[/COLOR] Range, n [COLOR=Navy]As[/COLOR] [COLOR=Navy]Long,[/COLOR] Q [COLOR=Navy]As[/COLOR] Variant
[COLOR=Navy]Set[/COLOR] Rng = Range(Range("A2"), Range("A" & Rows.Count).End(xlUp))
[COLOR=Navy]With[/COLOR] CreateObject("scripting.dictionary")
.CompareMode = vbTextCompare
ReDim Ray(1 To Rng.Count)
[COLOR=Navy]For[/COLOR] [COLOR=Navy]Each[/COLOR] Dn [COLOR=Navy]In[/COLOR] Rng
[COLOR=Navy]If[/COLOR] Not .Exists(Left(Dn.Value, 2)) [COLOR=Navy]Then[/COLOR]
n = n + 1
Ray(n) = Left(Dn.Value, 2) & " - 1 - (1/" & Rng.Count & ")"
[COLOR=Navy]Else[/COLOR]
Q = .Item(Left(Dn.Value, 2))
Q(1) = Q(1) + 1
Ray(Q(0)) = Left(Dn.Value, 2) & " -" & Q(1) & " - (" & Q(1) & "/" & Rng.Count & ")"
.Item(Left(Dn.Value, 2)) = Q
[COLOR=Navy]End[/COLOR] If
[COLOR=Navy]Next[/COLOR]
Sheets("Sheet2").Range("A1").Resize(.Count) = Application.Transpose(Ray)
[COLOR=Navy]End[/COLOR] [COLOR=Navy]Sub[/COLOR]``````
Regards Mick

Thanks for this, to be a royal pain though is there anyway of doing this in a formula, so to not have to use a macro basically?

Hi all,

to be confirmed with control+shift+enter and copied down

Code:
``=IFERROR(INDEX(LEFT(\$A\$2:\$A\$10,2),SMALL(IF(ISNUMBER(MATCH(ROW(\$A\$2:\$A\$10)-1,MATCH(LEFT(\$A\$2:\$A\$10,2),LEFT(\$A\$2:\$A\$10,2),0),0)),MATCH(LEFT(\$A\$2:\$A\$10,2),LEFT(\$A\$2:\$A\$10,2),0),""),ROW(A1)))&" - "&SUMPRODUCT(--(LEFT(A2:A10,2)=INDEX(LEFT(\$A\$2:\$A\$10,2),SMALL(IF(ISNUMBER(MATCH(ROW(\$A\$2:\$A\$10)-1,MATCH(LEFT(\$A\$2:\$A\$10,2),LEFT(\$A\$2:\$A\$10,2),0),0)),MATCH(LEFT(\$A\$2:\$A\$10,2),LEFT(\$A\$2:\$A\$10,2),0),""),ROW(A1)))))&" -  ("&SUMPRODUCT(--(LEFT(A2:A10,2)=INDEX(LEFT(\$A\$2:\$A\$10,2),SMALL(IF(ISNUMBER(MATCH(ROW(\$A\$2:\$A\$10)-1,MATCH(LEFT(\$A\$2:\$A\$10,2),LEFT(\$A\$2:\$A\$10,2),0),0)),MATCH(LEFT(\$A\$2:\$A\$10,2),LEFT(\$A\$2:\$A\$10,2),0),""),ROW(A1)))))&"/"&COUNTA(A\$2:A\$10)&")","")``

Vba is the right way

Regards

Hi again

if you are using Excel 2010 or newer versions, the formula does not need array status. In B2 to be copied down:

Code:
``=IFERROR(INDEX(LEFT(A\$2:A\$10,2),AGGREGATE(15,6,ROW(\$A\$2:\$A\$10)-1/(FREQUENCY(MATCH(LEFT(A\$2:A\$10,2),INDEX(LEFT(A\$2:A\$10,2),),0),MATCH(LEFT(A\$2:A\$10,2),INDEX(LEFT(A\$2:A\$10,2),),0))>0),ROW(A1)))&" - "&SUMPRODUCT(--(LEFT(\$A\$2:\$A\$10,2)=INDEX(LEFT(A\$2:A\$10,2),AGGREGATE(15,6,ROW(\$A\$2:\$A\$10)-1/(FREQUENCY(MATCH(LEFT(A\$2:A\$10,2),INDEX(LEFT(A\$2:A\$10,2),),0),MATCH(LEFT(A\$2:A\$10,2),INDEX(LEFT(A\$2:A\$10,2),),0))>0),ROW(A1)))))&" - "&"("&SUMPRODUCT(--(LEFT(\$A\$2:\$A\$10,2)=INDEX(LEFT(A\$2:A\$10,2),AGGREGATE(15,6,ROW(\$A\$2:\$A\$10)-1/(FREQUENCY(MATCH(LEFT(A\$2:A\$10,2),INDEX(LEFT(A\$2:A\$10,2),),0),MATCH(LEFT(A\$2:A\$10,2),INDEX(LEFT(A\$2:A\$10,2),),0))>0),ROW(A1)))))&"/"&COUNTA(\$A\$2:\$A\$10)&")","")``

Regards

Hi again

if you are using Excel 2010 or newer versions, the formula does not need array status. In B2 to be copied down:

Code:
``=IFERROR(INDEX(LEFT(A\$2:A\$10,2),AGGREGATE(15,6,ROW(\$A\$2:\$A\$10)-1/(FREQUENCY(MATCH(LEFT(A\$2:A\$10,2),INDEX(LEFT(A\$2:A\$10,2),),0),MATCH(LEFT(A\$2:A\$10,2),INDEX(LEFT(A\$2:A\$10,2),),0))>0),ROW(A1)))&" - "&SUMPRODUCT(--(LEFT(\$A\$2:\$A\$10,2)=INDEX(LEFT(A\$2:A\$10,2),AGGREGATE(15,6,ROW(\$A\$2:\$A\$10)-1/(FREQUENCY(MATCH(LEFT(A\$2:A\$10,2),INDEX(LEFT(A\$2:A\$10,2),),0),MATCH(LEFT(A\$2:A\$10,2),INDEX(LEFT(A\$2:A\$10,2),),0))>0),ROW(A1)))))&" - "&"("&SUMPRODUCT(--(LEFT(\$A\$2:\$A\$10,2)=INDEX(LEFT(A\$2:A\$10,2),AGGREGATE(15,6,ROW(\$A\$2:\$A\$10)-1/(FREQUENCY(MATCH(LEFT(A\$2:A\$10,2),INDEX(LEFT(A\$2:A\$10,2),),0),MATCH(LEFT(A\$2:A\$10,2),INDEX(LEFT(A\$2:A\$10,2),),0))>0),ROW(A1)))))&"/"&COUNTA(\$A\$2:\$A\$10)&")","")``

Regards

Hi thanks for this but perhaps I have confused matters.

The main bulk of the formula just needs to take the first two characters from a column and de-dupe

Sheet 1:

LU7 2GH
LU7 2GH
LU7 2GH
LU7 2GH
LU7 2GH
HP21 8YH
HP21 8YH
HP21 8YH
HP21 8YH
HP21 8YH
HP21 8YH
HP21 8YH
HP21 8YH
HP21 8YH
HP21 8YH
SE18 9JJ
SE18 9JJ
SE18 9JJ
SE18 9JJ
SE18 9JJ
SE18 9JJ
SE18 9JJ
SE18 9JJ

Sheet 2:

 String (A) No. of Occurrences (B) Percentage (C) LU 5 =B2/B6 HP 11 =B3/B6 SE 8 =B4/B6 Total =B2+B3+B4

<tbody>
</tbody>

Hi,

using excel 2010, you could use the first segment of previous formula (...untill &" -"&sumproduct...)

Code:
``=IFERROR(INDEX(LEFT(A\$2:A\$10,2),AGGREGATE(15,6,ROW(\$A\$2:\$A\$10)-1/(FREQUENCY(MATCH(LEFT(A\$2:A\$10,2),INDEX(LEFT(A\$2:A\$10,2),),0),MATCH(LEFT(A\$2:A\$10,2),INDEX(LEFT(A\$2:A\$10,2),),0))>0),ROW(A1))),"")``

No need to control+shift+enter

-----------------------------

Excel 2007, to be confirmed with control+shift+enter and then copied down

Code:
``=IFERROR(INDEX(LEFT(\$A\$2:\$A\$10,2),SMALL(IF(ISNUMBER(MATCH(ROW(\$A\$2:\$A\$10)-1,MATCH(LEFT(\$A\$2:\$A\$10,2),LEFT(\$A\$2:\$A\$10,2),0),0)),MATCH(LEFT(\$A\$2:\$A\$10,2),LEFT(\$A\$2:\$A\$10,2),0),""),ROW(A1))),"")``

Regards

Hope it helps

Last edited:
Hi again

Book1
ABCDE
2LU7 2GHLU55/23
3LU7 2GHHP1010/23
4LU7 2GHSE88/23
5LU7 2GH
6LU7 2GH
7HP21 8YH
8HP21 8YH
9HP21 8YH
10HP21 8YH
11HP21 8YH
12HP21 8YH
13HP21 8YH
14HP21 8YH
15HP21 8YH
16HP21 8YH
17SE18 9JJ
18SE18 9JJ
19SE18 9JJ
20SE18 9JJ
21SE18 9JJ
22SE18 9JJ
23SE18 9JJ
24SE18 9JJ
Foglio1
Cell Formulas
RangeFormula
D2=COUNTIF(\$A\$2:\$A\$1000,C2&"*")

Hope it helps

Last edited:
Suggest you use a pivot table. SQL could be like,

Code:
``````SELECT LEFT(YourHeader,2) AS [First2Chars]
FROM [YourWorksheetName\$]``````

Readily handles huge datasets & is a smart way to work.

Manual set up. Save your data file. CTRL-N for new file, ALT-D-P to start wizard, choose external data source at first step. Get Data, Excel files, OK. Choose your file, OK to any message about no tables then via options select system tables. See worksheet/s names. Choose yours then a field continue to end of wizard & take option to edit in MS Query. Via 'SQL' icon edit SQL. OK to this & any message about not being able to graphically represent. 'Open door' icon to exit MS Query and finish pivot table. If you like, resultant worksheet can be moved into your source data file.

There are other ways to efficiently handle data, but the manual pivot table is easy to explain & generally well known. If pivot tables are unfamiliar to you, suggest you google. (If you have large datasets I strongly recommend you avoid array formulas, and even all formulas for very large datasets. A database may also be better than Excel for large datasets.)

regards

1) Either a pivot system...

Sheet1

 Row\Col A​ B​ 1​ Items T-Items 2​ LU7 2GH LU 3​ LU7 2GH LU 4​ LU7 2GH LU 5​ LU7 2GH LU 6​ LU7 2GH LU 7​ HP21 8YH HP 8​ HP21 8YH HP 9​ HP21 8YH HP 10​ HP21 8YH HP 11​ HP21 8YH HP 12​ HP21 8YH HP 13​ HP21 8YH HP 14​ HP21 8YH HP 15​ HP21 8YH HP 16​ HP21 8YH HP 17​ SE18 9JJ SE 18​ SE18 9JJ SE 19​ SE18 9JJ SE 20​ SE18 9JJ SE 21​ SE18 9JJ SE 22​ SE18 9JJ SE 23​ SE18 9JJ SE 24​ SE18 9JJ SE

B2, just enter and copy down:
Rich (BB code):
``````
=LEFT(A2,2)
``````

Now run a pivot table...

 Row\Col A​ B​ 2​ 3​ Row Labels Count of T-Items 4​ HP 10​ 5​ LU 5​ 6​ SE 8​ 7​ Grand Total 23​ 8​

You can add a column showing the percentage distribution per t-item (transformed item).

2) Or a formula system...

Sheet1

 Row\Col A​ 1​ Items 2​ LU7 2GH 3​ LU7 2GH 4​ LU7 2GH 5​ LU7 2GH 6​ LU7 2GH 7​ HP21 8YH 8​ HP21 8YH 9​ HP21 8YH 10​ HP21 8YH 11​ HP21 8YH 12​ HP21 8YH 13​ HP21 8YH 14​ HP21 8YH 15​ HP21 8YH 16​ HP21 8YH 17​ SE18 9JJ 18​ SE18 9JJ 19​ SE18 9JJ 20​ SE18 9JJ 21​ SE18 9JJ 22​ SE18 9JJ 23​ SE18 9JJ 24​ SE18 9JJ

Select the data range in column A and name the selection DATA using either the Name Box or the option Formulas | Name Manager. Note that we can define DATA as a dynamic named range if needed i.e., if it's not a static, unchanging range.

Define Ivec by means of Formulas | Name Manager as referring to:
Rich (BB code):
``````
=ROW(DATA)-ROW(INDEX(DATA,1,1))+1
``````

Sheet2

 Row\Col A​ B​ C​ 1​ 3​ 23​ 2​ Distinct T-Items Count % distrib 3​ LU 5​ 21.74%​ 4​ HP 10​ 43.48%​ 5​ SE 8​ 34.78%​ 6​

A1, control+shift+enter (cse), not just enter:
Rich (BB code):
``````
=SUM(IF(FREQUENCY(IF(DATA<>"",MATCH(DATA,DATA,0)),Ivec),1))
``````

A3, cse and copy down:
Rich (BB code):
``````
=IF(ROWS(\$A\$3:A3)<=\$A\$1,
LEFT(INDEX(DATA,SMALL(IF(FREQUENCY(IF(DATA<>"",
MATCH(DATA,DATA,0)),Ivec),Ivec),ROWS(\$A\$3:A3))),2),"")
``````

B3, just enter and copy down:
Rich (BB code):
``````
=COUNTIFS(DATA,A3&"*")
``````

C1, just enter:
Rich (BB code):
``````
=SUM(B:B)
``````

C3, just enter and copy down:
Rich (BB code):
``````
=B3/\$C\$1
``````

See the workbook which works out the foregoing set ups: https://dl.dropboxusercontent.com/u...rence count, and percentage distribution.xlsx.

3) Run a pivot table with SQL (see Fazza's duggestion).

Replies
1
Views
304
Replies
7
Views
562
Replies
2
Views
3K

### Forum statistics

1,196,042
Messages
6,013,051
Members
441,746
Latest member
ArtemisAlex ### We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.

### Which adblocker are you using?    1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option. Go back

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com". Go back

### Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button. Go back

### Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button. Go back