Calculate Duplicate Frequency

MisterMeehan

New Member
Joined
Jun 22, 2015
Messages
10
Hi,


New poster here :wink:


I have a large dataset (~100,000 rows) where I am trying to count the number of duplicate values in a given column. (There are a lot!)


I would like to generate two columns showing which numbers are duplicates (column 1) and how many times (frequency) that each number is repeated (column 2). Please see example below.

How do I do that?



Sorry if this is an obvious thing to work out, but I have no experience with anything other than rudimentary tasks in Excel. Furthermore, I cannot see an example of how to do this in the archives.


I trust this can be calculated using a formula because I wouldn’t know how to apply any Visual Basic. Please advise how to work this out in easy steps.


Thanks for any help given.



Numbers

DuplicateNo.

Frequency

30

10

2

30

20

4

10

30

2

10

40

3

40

40

40

20

20

20

20


 
You are right. The pivot table approach fell over when we put that many records in. We are now trying another approach, back to formulae.

Even the processing the spreadsheets on a 4GB memory PC slows down to a crawl, typically taking a few hours per run through. Anyway... battle on.

Thanks for your kind follow up.

My comment was related rather to the array-processing approach kumar lama suggests, not to the pivot table approach. If the latter is failing for efficiency reasons, the former certainly will. You probably need more memory.

Running Advanced Filter for a unique list than doing counting as DRSteele suggests should be worth trying.

And resorting to Access or SQL will certainly succeed.

A speedy formula approach can also be tried, which trades off memory against speed. That runs as follows:

Row\Col
A​
B​
C​
D​
E​
F​
G​
1​
Numbers
0
4​
2​
30​
1​
#Unique List#Frequency
3​
30​
30​
2​
4​
10​
2​
10​
2​
5​
10​
40​
3​
6​
40​
3​
20​
4​
7​
40​
8​
40​
9​
20​
4​
10​
20​
11​
20​
12​
20​

C1: 0 (This is mandatory.)

C2, enter and copy down:

=IF(ISNUMBER(MATCH(B2,$B$1:B1,0)),"",LOOKUP(9.99999999999999E+307,$C$1:C1)+1)

E1, just enter:

=LOOKUP(9.99999999999999E+307,C:C)

E3, enter and copy down:

=IF(ROWS($E$3:E3)<=$E$1,LOOKUP(ROWS($E$3:E3),C:C,B:B),"")

F3, enter and copy down:

=IF(E3="","",COUNTIFS(B:B,E3))


=LOOKUP(9.99999999999999E+307,C:C)
 
Upvote 0

Excel Facts

Remove leading & trailing spaces
Save as CSV to remove all leading and trailing spaces. It is faster than using TRIM().
Thanks. Looking into things like that to streamline the process.

Just a thought: if this isn't a live and active list where you'll always have to do this, then filter it: set up the data properly with column headers and the like, then use Filter, Unique Records Only, Extract to Another sheet. Advanced filter may help too.
 
Upvote 0

Forum statistics

Threads
1,215,262
Messages
6,123,938
Members
449,134
Latest member
NickWBA

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top