Big table with long texts (>500chars/cell) eliminate duplicates create matrix

CsJHUN · Mar 23, 2021

Hi guys,
I have a 26 column, 30k rows base table
30k rows belong to 106 doc
There is duplication in 30k rows (concatenate each column for each row) Removing duplication not work, always miss ~100 rows, and there is duplication within that 100.
I tried the built-in remove duplicates, used macro and used arrays to remove duplicated items. Looks like not working (11925 unique row) or the next step missing something
The concatenate cell lenght (as value) is over 500, some even over 1k character long
I want to create a matrix, with unique rows and docs on column.
Here is a sample from start



start
Item	doc
a	1
b	1
c	1
b	2
c	2
d	2
a	3
d	3

(30k rows, 26 column)
And the result



Result
Item / Doc	1	2	3
a	x		x
b	x	x
c	x	x
d		x	x

(~11k row, 106 column)

step i tried:
- create the 30k rows from the 106 doc (each separate xlsx file)
- concatenate the data for each row (its a must, imho, thats makes it unique)
- create unique list of docs (106 pcs)
- create unique list of the 30k row. (~11k pcs)
- find the 30k row if its fit inside the 11k, which is not! some of the 30k has 0 result on 11k
- mark the 1st occurence of the 11k in each column of the 106 docs

Any advice is welcome.
Cheers
J

Big table with long texts (>500chars/cell) eliminate duplicates create matrix

CsJHUN

Active Member

Excel Facts

Similar threads

Forum statistics

Share this page

Big table with long texts (>500chars/cell) eliminate duplicates create matrix

CsJHUN

Active Member

Excel Facts

Similar threads

Forum statistics

Share this page

We've detected that you are using an adblocker.

Which adblocker are you using?

Disable AdBlock

Disable AdBlock Plus

Disable uBlock Origin

Disable uBlock