How do I look for "incorrect duplicates"?

kmprice710

Board Regular
Joined
Jan 8, 2014
Messages
87
Office Version
  1. 2019
Platform
  1. Windows
I have a list of sales. Each customer has an assigned Customer ID. Each row has the Customer ID. But I have found out that the same Customer ID has been incorrectly assigned to more than one customer.

How do I look for duplicate Customer IDs that are incorrectly assigned on a sheet in which some "correct" duplicates appear?

DateCustomer NameAddressCustomer_ID
1/1/2001Jim Jones2565 Gross PointJONJI01
1/2/2001Jim Johnson1024 AveryJONJI02
1/3/2001Jim JamesP.O. Box 32JAMJI01
1/4/2001Jim Mason2754 CentralMASJI01
1/5/2001Jim Newton1520 CosgroveNEWJI01
1/6/2001Jim Newsome502 BoystonNEWJI01
1/7/2001Jim Newsome502 BoystonNEWJI01
1/7/2001Jim SmithPO Box 510SMIJI01
1/8/2001Jim Smith
902 BirchSMIJI02
1/9/2001Jim Smith3254 N. Lake ShoreSMIJI01
1/10/2001Jim Jones2565 Gross PointJONJI01
1/11/2001Jim Jones4891 4th St.JONJI01
1/12/2001Jim Jones2565 Gross PointJONJI01

<tbody>
</tbody>

For example, in this dummy example, SMIJI01 has been assigned to multiple customers --which is incorrect --but Jim Smith on 2565 Gross Point, the one who was first assigned SMIJI01, CORRECTLY appears three times on this sheet. The ID are supposed to be assigned based on the first 3 letters of the customer's last name and first two letters of the first name and then based on the address. If the ID is taken, a number 02, 03 ... etc. should be assigned.

How do I write a conditional statement that will look at the customer name and the address fields and find the "incorrect" duplicates only?
 
Last edited:
Please confirm last column:
DateCustomer NameAddressCust_IDcorrected
1/1/01Jim Jones2565 Gross PointJONJI01JONJI1
1/2/01Jim Johnson1024 AveryJONJI02JOHJI1
1/3/01Jim JamesP.O. Box 32JAMJI01JAMJI1
1/4/01Jim Mason2754 CentralMASJI01MASJI1
1/5/01Jim Newton1520 CosgroveNEWJI01NEWJI1
1/6/01Jim Newsome502 BoystonNEWJI01NEWJI2
1/7/01Jim Newsome502 BoystonNEWJI01NEWJI2
1/7/01Jim SmithPO Box 510SMIJI01SMIJI1
1/8/01Jim Smith902 BirchSMIJI02SMIJI2
1/9/01Jim Smith3254 N. Lake ShoreSMIJI01SMIJI3
1/10/01Jim Jones2565 Gross PointJONJI01JONJI1
1/11/01Jim Jones4891 4th St.JONJI01JONJI2
1/12/01Jim Jones2565 Gross PointJONJI01JONJI1

<tbody>
</tbody>

Thanks for your reply. Unfortunately, the IDs must have two numerical digits at the end, like 01, 02, 03... 11, 21... etc. because we have about 10 Jim Smiths alone.

So add a digit in front of the last number, usually 0, and you have it right.
 
Upvote 0

Excel Facts

Repeat Last Command
Pressing F4 adds dollar signs when editing a formula. When not editing, F4 repeats last command.
Sounds like the database needs re-design (to set it up properly). Then it will be MUCH easier to maintain and use.

I don't disagree, but I do not have the authority in the organization to re-do it. I just been ordered to be it fixed. :(
 
Upvote 0
With 3 helper columns in per example G1:
Helper_1Helper_2Helper_3
Jim Jones2565 Gross PointJONJI11
Jim Johnson1024 AveryJOHJI11
Jim JamesP.O. Box 32JAMJI11
Jim Mason2754 CentralMASJI11
Jim Newton1520 CosgroveNEWJI11
Jim Newsome502 BoystonNEWJI12
Jim Newsome502 BoystonNEWJI22
Jim SmithPO Box 510SMIJI11
Jim Smith902 BirchSMIJI12
Jim Smith3254 N. Lake ShoreSMIJI13
Jim Jones2565 Gross PointJONJI21
Jim Jones4891 4th St.JONJI12
Jim Jones2565 Gross PointJONJI31

<tbody>
</tbody>

Formula in G2 is =B2&C2
Formula in H2 is =UPPER(LEFT(TRIM(MID(SUBSTITUTE($B2," ",REPT(" ",999)),COLUMN(B2)*999-998,999)),3)&LEFT(TRIM(MID(SUBSTITUTE($B2," ",REPT(" ",999)),COLUMN(A2)*999-998,999)),2))&COUNTIFS($B$2:B2,B2,$C$2:C2,C2)
Formula in I2 is =IF(COUNTIFS($B$2:B2,B2,$C$2:C2,C2)=1,COUNTIF($H$2:H2,H2),INDEX($I$2:I2,MATCH(B2&C2,$G$2:G2,0)))

new formula in D2 is =UPPER(LEFT(TRIM(MID(SUBSTITUTE($B2," ",REPT(" ",999)),COLUMN(B2)*999-998,999)),3)&LEFT(TRIM(MID(SUBSTITUTE($B2," ",REPT(" ",999)),COLUMN(A2)*999-998,999)),2))&TEXT(IF(COUNTIFS($B$2:B2,B2,$C$2:C2,C2)=1,COUNTIF($H$2:H2,H2),INDEX($I$2:I2,MATCH(B2&C2,$G$2:G2,0))),"00")

Giving you this:
DateCustomer NameAddressCust_ID
1/1/01Jim Jones2565 Gross PointJONJI01
1/2/01Jim Johnson1024 AveryJOHJI01
1/3/01Jim JamesP.O. Box 32JAMJI01
1/4/01Jim Mason2754 CentralMASJI01
1/5/01Jim Newton1520 CosgroveNEWJI01
1/6/01Jim Newsome502 BoystonNEWJI02
1/7/01Jim Newsome502 BoystonNEWJI02
1/7/01Jim SmithPO Box 510SMIJI01
1/8/01Jim Smith902 BirchSMIJI02
1/9/01Jim Smith3254 N. Lake ShoreSMIJI03
1/10/01Jim Jones2565 Gross PointJONJI01
1/11/01Jim Jones4891 4th St.JONJI02
1/12/01Jim Jones2565 Gross PointJONJI01

<tbody>
</tbody>
 
Last edited:
Upvote 0
With 3 helper columns in per example G1:
Helper_1Helper_2Helper_3
Jim Jones2565 Gross PointJONJI11
Jim Johnson1024 AveryJOHJI11
Jim JamesP.O. Box 32JAMJI11
Jim Mason2754 CentralMASJI11
Jim Newton1520 CosgroveNEWJI11
Jim Newsome502 BoystonNEWJI12
Jim Newsome502 BoystonNEWJI22
Jim SmithPO Box 510SMIJI11
Jim Smith902 BirchSMIJI12
Jim Smith3254 N. Lake ShoreSMIJI13
Jim Jones2565 Gross PointJONJI21
Jim Jones4891 4th St.JONJI12
Jim Jones2565 Gross PointJONJI31
<tbody> </tbody>
Formula in G2 is =B2&C2 Formula in H2 is =UPPER(LEFT(TRIM(MID(SUBSTITUTE($B2," ",REPT(" ",999)),COLUMN(B2)*999-998,999)),3)&LEFT(TRIM(MID(SUBSTITUTE($B2," ",REPT(" ",999)),COLUMN(A2)*999-998,999)),2))&COUNTIFS($B$2:B2,B2,$C$2:C2,C2) Formula in I2 is =IF(COUNTIFS($B$2:B2,B2,$C$2:C2,C2)=1,COUNTIF($H$2:H2,H2),INDEX($I$2:I2,MATCH(B2&C2,$G$2:G2,0))) new formula in D2 is =UPPER(LEFT(TRIM(MID(SUBSTITUTE($B2," ",REPT(" ",999)),COLUMN(B2)*999-998,999)),3)&LEFT(TRIM(MID(SUBSTITUTE($B2," ",REPT(" ",999)),COLUMN(A2)*999-998,999)),2))&TEXT(IF(COUNTIFS($B$2:B2,B2,$C$2:C2,C2)=1,COUNTIF($H$2:H2,H2),INDEX($I$2:I2,MATCH(B2&C2,$G$2:G2,0))),"00") Giving you this:
DateCustomer NameAddressCust_ID
1/1/01Jim Jones2565 Gross PointJONJI01
1/2/01Jim Johnson1024 AveryJOHJI01
1/3/01Jim JamesP.O. Box 32JAMJI01
1/4/01Jim Mason2754 CentralMASJI01
1/5/01Jim Newton1520 CosgroveNEWJI01
1/6/01Jim Newsome502 BoystonNEWJI02
1/7/01Jim Newsome502 BoystonNEWJI02
1/7/01Jim SmithPO Box 510SMIJI01
1/8/01Jim Smith902 BirchSMIJI02
1/9/01Jim Smith3254 N. Lake ShoreSMIJI03
1/10/01Jim Jones2565 Gross PointJONJI01
1/11/01Jim Jones4891 4th St.JONJI02
1/12/01Jim Jones2565 Gross PointJONJI01
<tbody> </tbody>
Thanks for this! I'm trying to adapt it to the database because we have a ton of other columns and I had already combined the first and last name fields for this example. In the Helper_2 formula, what is the formula referencing from column A?
 
Upvote 0
Helper_2 formula will create the first set of customer ID based on Columns B and C, Column A is used a a 'count'
Did I answer your query?
 
Upvote 0
Helper_2 formula will create the first set of customer ID based on Columns B and C, Column A is used a a 'count'
Did I answer your query?
 
Upvote 0

Forum statistics

Threads
1,214,878
Messages
6,122,062
Members
449,064
Latest member
scottdog129

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top