Extract Unique Value from List More Than 1 Million

kumara_faith

Well-known Member
Joined
Aug 19, 2006
Messages
918
Office Version
  1. 365
Hi,

I have a few sheets with a long list of duplicate values and the total list is more than 2.5 million. Example is 5 sheets and each sheets have about 500,000 of values. I need to produce a final list of all 5 sheets unique values , meaning there should not be any duplicates with the total list from all 5 sheets combined values.

Normally I would just combine all list in one sheet and use the UNIQUE function to extract the unique values. Since Excel can only accommodate up to 1 million in one sheet, is there another way to accomplish this ?
 

Excel Facts

Shade all formula cells
To shade all formula cells: Home, Find & Select, Formulas to select all formulas. Then apply a light fill color.
Are you happy with a VBA solution?
If so what are the sheet names & where on those sheets are your lists?
 
Upvote 0
You could use something like this, as long as you don't have too many distinct values
+Fluff 1.xlsm
A
1County
2Surrey
3Greater London
4Somerset
5Devon
6Bedfordshire
7Greater London
8East Sussex
9West Yorkshire
10Hertfordshire
11Cambridgeshire
12Gloucestershire
13Tyne and Wear
14Lancashire
15Hampshire
16Greater London
17Greater London
18West Yorkshire
19Hampshire
20Greater London
21Hertfordshire
22Staffordshire
23Tyne and Wear
24Nottinghamshire
25Wiltshire
26Staffordshire
27Derbyshire
28Leicestershire
29County Durham
30Greater London
31
Sheet1


+Fluff 1.xlsm
A
1County
2West Sussex
3Greater London
4Lincolnshire
5Berkshire
6Leicestershire
7South Yorkshire
8Surrey
9Leicestershire
10West Midlands
11Nottinghamshire
12Northamptonshire
13Staffordshire
14Wiltshire
15Merseyside
16Norfolk
17Hertfordshire
18Nottinghamshire
19East Riding of Yorkshire
20Hertfordshire
21Berkshire
22Greater London
23Hampshire
24Essex
25Essex
26Warwickshire
27Berkshire
28Surrey
29Tyne and Wear
30West Yorkshire
Sheet2


+Fluff 1.xlsm
A
1County
2Cumbria
3Greater London
4Greater London
5Greater London
6West Midlands
7Greater Manchester
8Greater London
9Greater London
10West Midlands
11Bedfordshire
12Greater London
13Derbyshire
14Essex
15Gloucestershire
16Cambridgeshire
17Greater London
18Kent
19Greater London
20Essex
21Bristol
22Kent
23Norfolk
24Lancashire
25West Midlands
26Cheshire
27West Midlands
28West Midlands
29Warwickshire
30Tyne and Wear
Sheet3


+Fluff 1.xlsm
ABCD
1
2SurreyWest SussexCumbriaSurrey
3Greater LondonGreater LondonGreater LondonGreater London
4SomersetLincolnshireWest MidlandsSomerset
5DevonBerkshireGreater ManchesterDevon
6BedfordshireLeicestershireBedfordshireBedfordshire
7East SussexSouth YorkshireDerbyshireEast Sussex
8West YorkshireSurreyEssexWest Yorkshire
9HertfordshireWest MidlandsGloucestershireHertfordshire
10CambridgeshireNottinghamshireCambridgeshireCambridgeshire
11GloucestershireNorthamptonshireKentGloucestershire
12Tyne and WearStaffordshireBristolTyne and Wear
13LancashireWiltshireNorfolkLancashire
14HampshireMerseysideLancashireHampshire
15StaffordshireNorfolkCheshireStaffordshire
16NottinghamshireHertfordshireWarwickshireNottinghamshire
17WiltshireEast Riding of YorkshireTyne and WearWiltshire
18DerbyshireHampshireDerbyshire
19LeicestershireEssexLeicestershire
20County DurhamWarwickshireCounty Durham
21Tyne and WearWest Sussex
22West YorkshireLincolnshire
23Berkshire
24South Yorkshire
25West Midlands
26Northamptonshire
27Merseyside
28Norfolk
29East Riding of Yorkshire
30Essex
31Warwickshire
32Cumbria
33Greater Manchester
34Kent
35Bristol
36Cheshire
37
Sheet4
Cell Formulas
RangeFormula
A2:A20A2=UNIQUE(FILTER(Sheet1!A2:A10000,Sheet1!A2:A10000<>""))
B2:B22B2=UNIQUE(FILTER(Sheet2!A2:A9971,Sheet2!A2:A9971<>""))
C2:C17C2=UNIQUE(FILTER(Sheet3!A2:A9941,Sheet3!A2:A9941<>""))
D2:D36D2=LET(Rngs,(A2#,B2#,C2#),Rws,MAX(ROWS(A2#),ROWS(B2#),ROWS(C2#)),Seq,SEQUENCE(Rws*AREAS(Rngs),,0),Ary,IFERROR(INDEX(Rngs,MOD(Seq,Rws)+1,1,INT(Seq/Rws)+1),""),UNIQUE(FILTER(Ary,Ary<>"")))
Dynamic array formulas.
 
Upvote 0
Hi Fluff,

Thank you so much for the solution and it worked. Appreciate your valuable time and patience. ?

Hi Alex,

Thank you for the additional solutions and appreciate it.
 
Upvote 0
I believe this is an alternative to the D2 formula:

Excel Formula:
=LET(r1,ROWS(A2#),r2,ROWS(B2#)+r1,r3,ROWS(C2#)+r2,seq,SEQUENCE(r3),m1,MATCH(seq-1,CHOOSE({1,2,3},0,r1,r2)),INDEX((A2#,B2#,C2#),seq-CHOOSE(m1,0,r1,r2),1,m1))
 
Upvote 0
Glad we could help & thanks for the feedback.
 
Upvote 0

Forum statistics

Threads
1,213,506
Messages
6,114,027
Members
448,543
Latest member
MartinLarkin

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top