Identifying Duplicate Values

crimsonchin68

New Member
Joined
Sep 28, 2018
Messages
2
Hi MrExcel,

I'm working with a large data dump (>3000 rows) that contains multiple columns but only two values that I'm primarily interested in: messages and message codes. Every message has a code and every code has a message, and what is supposed to be true is that every code has exactly one message and every message text has exactly one code. However, the input method for codes and messages is manual, meaning sometimes mistakes occur. Generally, this manifests itself as a typo in a message, leading to two different messages for the same code. It is also possible that there are identical messages with two or more different codes. I'm interested in finding those examples of duplicate codes and duplicate messages.

Right now we're using a pivot table with messages and codes as the axes. If my data was perfect there would be only one column per row filled in, and the value filled in would represent the number of times the code and its corresponding message appears in the system. When there are mistakes, multiple columns are populated, meaning that one message is used X times with code Z and Y times with code W. In the right of this table we have a "grand total" per uses of each message. We then use a COUNTIF formula to see if a row's grand total appears elsewhere in the same row, which would mean that there is only one code for the message. A "1" populates the COUNTIF column if the message has one code and "0" if it does not. We then use these rows tabbed "0" to filter the original data dump, because at that point the data beyond message and message code is important.

This seems to work in identifying codes that have multiple different messages, but in using VLOOKUP sometimes identifying messages that have multiple different codes fails because of the 255 character limit. So I have two questions: 1) can you think of a generally more efficient way of doing this process, or 2) can you think of a workaround to the VLOOKUP #VALUE error that occurs after the pivot table step?
 

Excel Facts

When did Power Query debut in Excel?
Although it was an add-in in Excel 2010 & Excel 2013, Power Query became a part of Excel in 2016, in Data, Get & Transform Data.
Do you have a list of the unique messages and codes? Are there sometimes two or more messages and two or more codes in the same columns? I would first separate each message and code so they're unique values (1-1) then compare it to the unique list to make sure that each one doesn't have any input errors and flag the ones that do.

Edit: also using match and index might be a way around the limit, can you use your own functions in VBA?
 
Last edited:
Upvote 0
Unfortunately what I'm dealing with is supposed to be the master list of messages and their codes (it's definitely a flawed system) so there's no list that contains every correct message and its correct code. What I'm thinking of doing right now is using VBA to create a hash of each message to compare rather than using the messages themselves.
 
Upvote 0
I'd start with the ones you do know are correct and create a new excel file that is the start of a master list of the codes. How do you know when a code is right or wrong?
 
Upvote 0

Forum statistics

Threads
1,215,474
Messages
6,125,024
Members
449,204
Latest member
LKN2GO

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top