How to find TRUE DUPLICATES?

anmac1789

New Member
Joined
Dec 6, 2020
Messages
43
Office Version
  1. 365
Platform
  1. Windows
Hello, I am trying to find secondary duplicates together with Matching a specific path. For example, I am trying to match - size, date created, date modified, date accessed and highlight these rows across the entire table while not highlighting other rows for each these 4 properties are not matching. For example, suppose that there is an error made somehow and 2 files have the same 3 dates and times, size and name but when viewing the photos, they are completely different picture, if this is the case, then is there a 5th property that I can check for to make sure two files are TRULY distinct ? Btw, how can I use textjoin with conditional formatting in "use a formula to determine which cells to format" under the new formatting rule?

FYI - I used voidtool's software called Everything to populate a list with filenames, sizes, 3 dates and times -- date created, date modified, date accessed (4 properties so far). I choose not to match the name because there can be duplicates (such as file1.jpg, file1 (1).jpg)
 

Excel Facts

Excel Joke
Why can't spreadsheets drive cars? They crash too often!
Hi,

There might be an easier solution, but below should work.
I've created two helper-columns (Concat in column A and G), and then just did a countif, see below (notice you should probably change from semicolon to comma) .
If the count is above 0, there is a duplicate.
You can also implement this in conditional format, if you want the colors, but in this way, you can simply use the filter function.

1665120865304.png


/Skovgaard
 

Attachments

  • 1665120801462.png
    1665120801462.png
    20.3 KB · Views: 6
Upvote 0
Hi,

There might be an easier solution, but below should work.
I've created two helper-columns (Concat in column A and G), and then just did a countif, see below (notice you should probably change from semicolon to comma) .
If the count is above 0, there is a duplicate.
You can also implement this in conditional format, if you want the colors, but in this way, you can simply use the filter function.

View attachment 75662

/Skovgaard
What if the problem becomes that for example 2 filenames (XnKTgnu7Ny1LvkvaulNrqOHVCsk=PHOTO-2019-11-08-23-08-12, tkTMpj3cPFl0H1ofBoGlRzWUiN4=PHOTO-2019-11-08-23-08-10) have the same size of 8,238 bytes. Both of these filenames are located in J:\photos app and have ‎Friday, ‎November ‎08, ‎2019, ‏‎10:08:10 PM as date created, date modified and date accessed. The same two duplicate files are located in other folders with the date created, date modified and date accessed set as ‎Friday, ‎November ‎08, ‎2019, ‏‎11:08:10 PM.

What is the reasoning for creating 2 concat columns?

excel demonstration.png


Please seee above image for illustrative purposes of the problem
 
Upvote 0
Then you exclude the path in your concat formula. It should only contain the exact values, that you want to compare.
I thought you were comparing two lists, if all transactions are in one list, you only need one concat.

/Skovgaard
 
Upvote 0
Then you exclude the path in your concat formula. It should only contain the exact values, that you want to compare.
I thought you were comparing two lists, if all transactions are in one list, you only need one concat.

/Skovgaard
Oh no, there is only one list to compare but even when running a countif function, it basically re-created another concat list anyway...therefore still no way to isolate and show exact duplicates...please see the below picture
new concat list.png
 
Upvote 0
Oh no, there is only one list to compare but even when running a countif function, it basically re-created another concat list anyway...therefore still no way to isolate and show exact duplicates...please see the below picture
View attachment 75675
Put you concat list in e.g. column G, and include the filename also like this: =A2&C2&d2&E2&F2.
Then put put you countif in column H like this: =COUNTIF($G$2:$G9,G2)

/Skovgaard
 
Upvote 0
Put you concat list in e.g. column G, and include the filename also like this: =A2&C2&d2&E2&F2.
Then put put you countif in column H like this: =COUNTIF($G$2:$G9,G2)

/Skovgaard
What if I didn't want to take into account the name because duplicate files could also be file1.jpg or file1 (1).jpg?
 
Upvote 0
I guess a workaround is that for each countif value, assign a colour to it matching the "block" of concated cells and run a condition that in that block if all values are the same (all count if values are one number) then the block of cobtacted values are true unique duplicates. If the countif values are different or appear as "kinks" of different colour then that block of same size dupes are not true duplicates. I know that was a lot but does the reasoning make sense ? So i guess the question now becomes to find true duplicates, how to set up a dependent conditions ?
 
Upvote 0
In exactly that situation, you could put in the first 5 characters in the concat (file1), but I guess that wont suit for the rest of the filenames.

How would you define a TRUE Duplicate from the given data then?

/Skovgaard
 
Upvote 0
In exactly that situation, you could put in the first 5 characters in the concat (file1), but I guess that wont suit for the rest of the filenames.

How would you define a TRUE Duplicate from the given data then?

/Skovgaard
Ive explained a little bit in detail in post 8 althought in a very hack and slash manner lol sorry
 
Upvote 0

Forum statistics

Threads
1,215,217
Messages
6,123,670
Members
449,115
Latest member
punka6

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top