Unique list of occurrences in another tab help

abutler1986

New Member
Joined
Mar 11, 2015
Messages
34
Hi Experts,

I was wondering if you can please assist me.

I need to somehow create a list of unique entities based on a column, then for each occurrence show the percentage of the total? Just to make it slightly harder, I only want to take the first two characters from each occurrence.

Example:

Column A:
HP18 8GH
HP18 8GH
HP18 8GH
SE21 17H
SE21 17H
LU7 6HH
LU8 1JJ
HP17 7HG
S045 4HW

9 in Total..

I need to show on a separate tab as:

HP - 4 - (4/9)
SE - 2 - (2/9)
LU - 2 - (2/9)
SO - 1 - (1/9)

Hope someone can help me!

Thanks,
 

Excel Facts

Test for Multiple Conditions in IF?
Use AND(test, test, test, test) or OR(test, test, test, ...) as the logical_test argument of IF.
Data column "A", Results column "A" sheet2.
Code:
[COLOR="Navy"]Sub[/COLOR] MG26May23
[COLOR="Navy"]Dim[/COLOR] Rng [COLOR="Navy"]As[/COLOR] Range, Dn [COLOR="Navy"]As[/COLOR] Range, n [COLOR="Navy"]As[/COLOR] [COLOR="Navy"]Long,[/COLOR] Q [COLOR="Navy"]As[/COLOR] Variant
[COLOR="Navy"]Set[/COLOR] Rng = Range(Range("A2"), Range("A" & Rows.Count).End(xlUp))
[COLOR="Navy"]With[/COLOR] CreateObject("scripting.dictionary")
.CompareMode = vbTextCompare
ReDim Ray(1 To Rng.Count)
[COLOR="Navy"]For[/COLOR] [COLOR="Navy"]Each[/COLOR] Dn [COLOR="Navy"]In[/COLOR] Rng
    [COLOR="Navy"]If[/COLOR] Not .Exists(Left(Dn.Value, 2)) [COLOR="Navy"]Then[/COLOR]
        n = n + 1
        Ray(n) = Left(Dn.Value, 2) & " - 1 - (1/" & Rng.Count & ")"
        .Add Left(Dn.Value, 2), Array(n, 1)
    [COLOR="Navy"]Else[/COLOR]
        Q = .Item(Left(Dn.Value, 2))
        Q(1) = Q(1) + 1
        Ray(Q(0)) = Left(Dn.Value, 2) & " -" & Q(1) & " - (" & Q(1) & "/" & Rng.Count & ")"
        .Item(Left(Dn.Value, 2)) = Q
    [COLOR="Navy"]End[/COLOR] If
[COLOR="Navy"]Next[/COLOR]
Sheets("Sheet2").Range("A1").Resize(.Count) = Application.Transpose(Ray)
[COLOR="Navy"]End[/COLOR] [COLOR="Navy"]Sub[/COLOR]
Regards Mick
 
Upvote 0
Data column "A", Results column "A" sheet2.
Code:
[COLOR=Navy]Sub[/COLOR] MG26May23
[COLOR=Navy]Dim[/COLOR] Rng [COLOR=Navy]As[/COLOR] Range, Dn [COLOR=Navy]As[/COLOR] Range, n [COLOR=Navy]As[/COLOR] [COLOR=Navy]Long,[/COLOR] Q [COLOR=Navy]As[/COLOR] Variant
[COLOR=Navy]Set[/COLOR] Rng = Range(Range("A2"), Range("A" & Rows.Count).End(xlUp))
[COLOR=Navy]With[/COLOR] CreateObject("scripting.dictionary")
.CompareMode = vbTextCompare
ReDim Ray(1 To Rng.Count)
[COLOR=Navy]For[/COLOR] [COLOR=Navy]Each[/COLOR] Dn [COLOR=Navy]In[/COLOR] Rng
    [COLOR=Navy]If[/COLOR] Not .Exists(Left(Dn.Value, 2)) [COLOR=Navy]Then[/COLOR]
        n = n + 1
        Ray(n) = Left(Dn.Value, 2) & " - 1 - (1/" & Rng.Count & ")"
        .Add Left(Dn.Value, 2), Array(n, 1)
    [COLOR=Navy]Else[/COLOR]
        Q = .Item(Left(Dn.Value, 2))
        Q(1) = Q(1) + 1
        Ray(Q(0)) = Left(Dn.Value, 2) & " -" & Q(1) & " - (" & Q(1) & "/" & Rng.Count & ")"
        .Item(Left(Dn.Value, 2)) = Q
    [COLOR=Navy]End[/COLOR] If
[COLOR=Navy]Next[/COLOR]
Sheets("Sheet2").Range("A1").Resize(.Count) = Application.Transpose(Ray)
[COLOR=Navy]End[/COLOR] [COLOR=Navy]Sub[/COLOR]
Regards Mick


Thanks for this, to be a royal pain though is there anyway of doing this in a formula, so to not have to use a macro basically?
 
Upvote 0
Hi all,

to be confirmed with control+shift+enter and copied down


Code:
=IFERROR(INDEX(LEFT($A$2:$A$10,2),SMALL(IF(ISNUMBER(MATCH(ROW($A$2:$A$10)-1,MATCH(LEFT($A$2:$A$10,2),LEFT($A$2:$A$10,2),0),0)),MATCH(LEFT($A$2:$A$10,2),LEFT($A$2:$A$10,2),0),""),ROW(A1)))&" - "&SUMPRODUCT(--(LEFT(A2:A10,2)=INDEX(LEFT($A$2:$A$10,2),SMALL(IF(ISNUMBER(MATCH(ROW($A$2:$A$10)-1,MATCH(LEFT($A$2:$A$10,2),LEFT($A$2:$A$10,2),0),0)),MATCH(LEFT($A$2:$A$10,2),LEFT($A$2:$A$10,2),0),""),ROW(A1)))))&" -  ("&SUMPRODUCT(--(LEFT(A2:A10,2)=INDEX(LEFT($A$2:$A$10,2),SMALL(IF(ISNUMBER(MATCH(ROW($A$2:$A$10)-1,MATCH(LEFT($A$2:$A$10,2),LEFT($A$2:$A$10,2),0),0)),MATCH(LEFT($A$2:$A$10,2),LEFT($A$2:$A$10,2),0),""),ROW(A1)))))&"/"&COUNTA(A$2:A$10)&")","")


Vba is the right way

Regards
 
Upvote 0
Hi again

if you are using Excel 2010 or newer versions, the formula does not need array status. In B2 to be copied down:

Code:
=IFERROR(INDEX(LEFT(A$2:A$10,2),AGGREGATE(15,6,ROW($A$2:$A$10)-1/(FREQUENCY(MATCH(LEFT(A$2:A$10,2),INDEX(LEFT(A$2:A$10,2),),0),MATCH(LEFT(A$2:A$10,2),INDEX(LEFT(A$2:A$10,2),),0))>0),ROW(A1)))&" - "&SUMPRODUCT(--(LEFT($A$2:$A$10,2)=INDEX(LEFT(A$2:A$10,2),AGGREGATE(15,6,ROW($A$2:$A$10)-1/(FREQUENCY(MATCH(LEFT(A$2:A$10,2),INDEX(LEFT(A$2:A$10,2),),0),MATCH(LEFT(A$2:A$10,2),INDEX(LEFT(A$2:A$10,2),),0))>0),ROW(A1)))))&" - "&"("&SUMPRODUCT(--(LEFT($A$2:$A$10,2)=INDEX(LEFT(A$2:A$10,2),AGGREGATE(15,6,ROW($A$2:$A$10)-1/(FREQUENCY(MATCH(LEFT(A$2:A$10,2),INDEX(LEFT(A$2:A$10,2),),0),MATCH(LEFT(A$2:A$10,2),INDEX(LEFT(A$2:A$10,2),),0))>0),ROW(A1)))))&"/"&COUNTA($A$2:$A$10)&")","")


Regards
 
Upvote 0
Hi again

if you are using Excel 2010 or newer versions, the formula does not need array status. In B2 to be copied down:

Code:
=IFERROR(INDEX(LEFT(A$2:A$10,2),AGGREGATE(15,6,ROW($A$2:$A$10)-1/(FREQUENCY(MATCH(LEFT(A$2:A$10,2),INDEX(LEFT(A$2:A$10,2),),0),MATCH(LEFT(A$2:A$10,2),INDEX(LEFT(A$2:A$10,2),),0))>0),ROW(A1)))&" - "&SUMPRODUCT(--(LEFT($A$2:$A$10,2)=INDEX(LEFT(A$2:A$10,2),AGGREGATE(15,6,ROW($A$2:$A$10)-1/(FREQUENCY(MATCH(LEFT(A$2:A$10,2),INDEX(LEFT(A$2:A$10,2),),0),MATCH(LEFT(A$2:A$10,2),INDEX(LEFT(A$2:A$10,2),),0))>0),ROW(A1)))))&" - "&"("&SUMPRODUCT(--(LEFT($A$2:$A$10,2)=INDEX(LEFT(A$2:A$10,2),AGGREGATE(15,6,ROW($A$2:$A$10)-1/(FREQUENCY(MATCH(LEFT(A$2:A$10,2),INDEX(LEFT(A$2:A$10,2),),0),MATCH(LEFT(A$2:A$10,2),INDEX(LEFT(A$2:A$10,2),),0))>0),ROW(A1)))))&"/"&COUNTA($A$2:$A$10)&")","")


Regards

Hi thanks for this but perhaps I have confused matters.

The main bulk of the formula just needs to take the first two characters from a column and de-dupe

Sheet 1:

LU7 2GH
LU7 2GH
LU7 2GH
LU7 2GH
LU7 2GH
HP21 8YH
HP21 8YH
HP21 8YH
HP21 8YH
HP21 8YH
HP21 8YH
HP21 8YH
HP21 8YH
HP21 8YH
HP21 8YH
SE18 9JJ
SE18 9JJ
SE18 9JJ
SE18 9JJ
SE18 9JJ
SE18 9JJ
SE18 9JJ
SE18 9JJ

Sheet 2:

String (A)No. of Occurrences (B)Percentage (C)
LU5=B2/B6
HP11=B3/B6
SE8=B4/B6
Total=B2+B3+B4

<tbody>
</tbody>
 
Upvote 0
Hi,

using excel 2010, you could use the first segment of previous formula (...untill &" -"&sumproduct...)


Code:
=IFERROR(INDEX(LEFT(A$2:A$10,2),AGGREGATE(15,6,ROW($A$2:$A$10)-1/(FREQUENCY(MATCH(LEFT(A$2:A$10,2),INDEX(LEFT(A$2:A$10,2),),0),MATCH(LEFT(A$2:A$10,2),INDEX(LEFT(A$2:A$10,2),),0))>0),ROW(A1))),"")

No need to control+shift+enter

-----------------------------

Excel 2007, to be confirmed with control+shift+enter and then copied down

Code:
=IFERROR(INDEX(LEFT($A$2:$A$10,2),SMALL(IF(ISNUMBER(MATCH(ROW($A$2:$A$10)-1,MATCH(LEFT($A$2:$A$10,2),LEFT($A$2:$A$10,2),0),0)),MATCH(LEFT($A$2:$A$10,2),LEFT($A$2:$A$10,2),0),""),ROW(A1))),"")

Regards



Hope it helps
 
Last edited:
Upvote 0
Hi again


Book1
ABCDE
2LU7 2GHLU55/23
3LU7 2GHHP1010/23
4LU7 2GHSE88/23
5LU7 2GH
6LU7 2GH
7HP21 8YH
8HP21 8YH
9HP21 8YH
10HP21 8YH
11HP21 8YH
12HP21 8YH
13HP21 8YH
14HP21 8YH
15HP21 8YH
16HP21 8YH
17SE18 9JJ
18SE18 9JJ
19SE18 9JJ
20SE18 9JJ
21SE18 9JJ
22SE18 9JJ
23SE18 9JJ
24SE18 9JJ
Foglio1
Cell Formulas
RangeFormula
D2=COUNTIF($A$2:$A$1000,C2&"*")


Hope it helps
 
Last edited:
Upvote 0
Suggest you use a pivot table. SQL could be like,

Code:
SELECT LEFT(YourHeader,2) AS [First2Chars]
FROM [YourWorksheetName$]

Readily handles huge datasets & is a smart way to work.

Manual set up. Save your data file. CTRL-N for new file, ALT-D-P to start wizard, choose external data source at first step. Get Data, Excel files, OK. Choose your file, OK to any message about no tables then via options select system tables. See worksheet/s names. Choose yours then a field continue to end of wizard & take option to edit in MS Query. Via 'SQL' icon edit SQL. OK to this & any message about not being able to graphically represent. 'Open door' icon to exit MS Query and finish pivot table. If you like, resultant worksheet can be moved into your source data file.

There are other ways to efficiently handle data, but the manual pivot table is easy to explain & generally well known. If pivot tables are unfamiliar to you, suggest you google. (If you have large datasets I strongly recommend you avoid array formulas, and even all formulas for very large datasets. A database may also be better than Excel for large datasets.)

regards
 
Upvote 0
1) Either a pivot system...

Sheet1

Row\Col
A​
B​
1​
ItemsT-Items
2​
LU7 2GHLU
3​
LU7 2GHLU
4​
LU7 2GHLU
5​
LU7 2GHLU
6​
LU7 2GHLU
7​
HP21 8YHHP
8​
HP21 8YHHP
9​
HP21 8YHHP
10​
HP21 8YHHP
11​
HP21 8YHHP
12​
HP21 8YHHP
13​
HP21 8YHHP
14​
HP21 8YHHP
15​
HP21 8YHHP
16​
HP21 8YHHP
17​
SE18 9JJSE
18​
SE18 9JJSE
19​
SE18 9JJSE
20​
SE18 9JJSE
21​
SE18 9JJSE
22​
SE18 9JJSE
23​
SE18 9JJSE
24​
SE18 9JJSE

B2, just enter and copy down:
Rich (BB code):

=LEFT(A2,2)

Now run a pivot table...

Row\Col
A​
B​
2​
3​
Row LabelsCount of T-Items
4​
HP
10​
5​
LU
5​
6​
SE
8​
7​
Grand Total
23​
8​

You can add a column showing the percentage distribution per t-item (transformed item).

2) Or a formula system...

Sheet1

Row\Col
A​
1​
Items
2​
LU7 2GH
3​
LU7 2GH
4​
LU7 2GH
5​
LU7 2GH
6​
LU7 2GH
7​
HP21 8YH
8​
HP21 8YH
9​
HP21 8YH
10​
HP21 8YH
11​
HP21 8YH
12​
HP21 8YH
13​
HP21 8YH
14​
HP21 8YH
15​
HP21 8YH
16​
HP21 8YH
17​
SE18 9JJ
18​
SE18 9JJ
19​
SE18 9JJ
20​
SE18 9JJ
21​
SE18 9JJ
22​
SE18 9JJ
23​
SE18 9JJ
24​
SE18 9JJ

Select the data range in column A and name the selection DATA using either the Name Box or the option Formulas | Name Manager. Note that we can define DATA as a dynamic named range if needed i.e., if it's not a static, unchanging range.

Define Ivec by means of Formulas | Name Manager as referring to:
Rich (BB code):

=ROW(DATA)-ROW(INDEX(DATA,1,1))+1

Sheet2

Row\Col
A​
B​
C​
1​
3​
23​
2​
Distinct T-ItemsCount% distrib
3​
LU
5​
21.74%​
4​
HP
10​
43.48%​
5​
SE
8​
34.78%​
6​

A1, control+shift+enter (cse), not just enter:
Rich (BB code):

=SUM(IF(FREQUENCY(IF(DATA<>"",MATCH(DATA,DATA,0)),Ivec),1))

A3, cse and copy down:
Rich (BB code):

=IF(ROWS($A$3:A3)<=$A$1,
   LEFT(INDEX(DATA,SMALL(IF(FREQUENCY(IF(DATA<>"",
   MATCH(DATA,DATA,0)),Ivec),Ivec),ROWS($A$3:A3))),2),"")

B3, just enter and copy down:
Rich (BB code):

=COUNTIFS(DATA,A3&"*")

C1, just enter:
Rich (BB code):

=SUM(B:B)

C3, just enter and copy down:
Rich (BB code):

=B3/$C$1

See the workbook which works out the foregoing set ups: https://dl.dropboxusercontent.com/u...rence count, and percentage distribution.xlsx.

3) Run a pivot table with SQL (see Fazza's duggestion).
 
Upvote 0

Forum statistics

Threads
1,214,406
Messages
6,119,330
Members
448,888
Latest member
Arle8907

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top