Match Duplicate Records

amerifax

Board Regular
Joined
Dec 17, 2008
Messages
247
I'm trying to find duplicate occurrences in my spreadsheet.

1. I have seven columns of data.
2. It is the first four columns that might reflect a duplication.
3. When there is a duplication the duplicate roles are always contiguous.
4. Within the spreadsheet some roles might have two or three roles that are duplicates

In the seventh row "G" I would like to place an "X" for all roles that have a duplicate value.
Example:
Smith John - 2314 Main - Waukesha Wi 53022 - 262-255-1212 - X
Smith John - 2314 Main - Waukesha Wi 53022 - 262-255-1212
Smith John - 2314 Main - Waukesha Wi 53022 - 262-255-1212
Smith Tom - 2314 Main - Waukesha Wi 53022 - 262-255-1212
Johnson Bob - 1221 Main - Milwaukee WI 53027 - 414-444-8989 - X
Johnson Bob - 1221 Main - Milwaukee WI 53027 - 414-444-8989

Thanks
Bob
 

Excel Facts

How can you turn a range sideways?
Copy the range. Select a blank cell. Right-click, Paste Special, then choose Transpose.
One solution would be to enter a helper column that concatenates the columns of data, ie,

HTML:
=A1&B1&C1 etc etc
Then next to that column, enter

HTML:
=if(countif(D:D,D2>1)=TRUE,"X","")
 
Upvote 0
I tried
=IF(COUNTIF(D5:D6,D6>1)=TRUE,"X","")

D5 and D6 being the phone number. I placed formula in the column that I want the X to appear in. It did hi-light the the phone numbers, as if to compare, but nothing happened.

I must be doing something wrong.

Bob
 
Upvote 0
The formula checks the range to see how many times the reference value appears.

(The reference value is our concatenated string - we do this because using just one cell, first name for example, could have multiple values but actually be different entries. It is less likely that FIRST NAME + LAST NAME + ADDRESS + PHONE NUMBER is going to be the same for two different people.)

If the number of times that value appears is greater than 1, we know it has duplicates and tell Excel to return an "X" (could be anything you want). If not, we leave it blank.
 
Upvote 0
The formula checks the range to see how many times the reference value appears.

(The reference value is our concatenated string - we do this because using just one cell, first name for example, could have multiple values but actually be different entries. It is less likely that FIRST NAME + LAST NAME + ADDRESS + PHONE NUMBER is going to be the same for two different people.)

If the number of times that value appears is greater than 1, we know it has duplicates and tell Excel to return an "X" (could be anything you want). If not, we leave it blank.
=if(countif(D:D,D2>1)=TRUE,"X","")
Did you mean this:

=IF(COUNTIF(D:D,D2)>1,"X","")
 
Upvote 0
>>gehusi<<

I have a similar problem understanding how this works
http://www.mrexcel.com/forum/showthread.php?p=2728113#post2728113

=if(countif(D:D,D2>1)=TRUE,"X","")

My values that affect the duplication are cells a1 to d1 or as in next record A2 to D2. It seems that D:D would be the range. And D2>1 would be the condition. It almost seems to me that D:D would be a variable, and I'm quite certain that's not the case. But I truly need an understanding on how D:D can represent my value or range.

Bob

>>T. Valko<<
Did you mean this:
=IF(COUNTIF(D:D,D2)>1,"X","")
I was thinks, that usually gets me into a lot of trouble, I was to use the example and fill it in for my needs. Not having a clue what I'm doing that with my interpretation.

Bob

P.S. Now it's time for a break. And now that James and Casey are both out of the writing, me and the wife are anxiously waiting to see Scott take it all.

As in American Idle.
 
Last edited:
Upvote 0
If your data is set up in the following manner:

Column A = Name
Column B = Address
Column C = City
Column D = State
Column E = Phone Number

Then enter the following in column F:

Code:
=A1&B1&C1&D1&E1

And then the following in column G:

Code:
=IF(COUNTIF(F:F,F1)>1,"X","")

(hat tip to Valko as I mistyped this formula originally)

Column G should give you an "X" in any duplicates that appear in your data set.
 
Upvote 0
I remember the day it took dBase IV 3 1/2 days to run a calculation. I feel like I'm going back in that direction in Excel 2007. I'm using the formula against 137,732 records. It takes several hours to find out which phones have a duplication and Mark them with an "X" in a dedicated column.

Am I doing something wrong? My first thought is to jump ship and try to do it in Access.

I have found where I can set calculation to automatic or manual. Is there a way to handle the long list of calculations were there turned off. And then can be highlighted and calculated by using a hot key of some sort.

Bob
 
Upvote 0
I guess it depends on what you are ultimately trying to accomplish. Will you be deleting the duplicate records? Something else? There may be a more efficient approach out there, short of switching programs entirely.
 
Upvote 0

Forum statistics

Threads
1,224,587
Messages
6,179,740
Members
452,940
Latest member
rootytrip

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top