Compare 2 columns and remove duplicates

viper20

New Member
Joined
Sep 6, 2011
Messages
7
Hello,

I have 2 columns A and B column A has about 700 rows and column B has about 400 rows. Some of the data appears in both column A and B so what i want to do is take any of the data that are not duplicates and move them to column C, the data in the 2 fields does not directly match up from left to right not sure if i explained that right but i will provide an example to demonstrate what i mean:

Column A Column B

apple watermelon
orange peaches
grape apple
watermelon kiwi
blueberries
strawberries
mango
peaches
kiwi
pineapple

So how can i get the unique fields from column A to appear in Column C?

Thanks guys
 

Excel Facts

Test for Multiple Conditions in IF?
Use AND(test, test, test, test) or OR(test, test, test, ...) as the logical_test argument of IF.
You can take column A and copy it to column C, then, take column B and copy it to the end of the data in Column C. You now have the data from Column A and B in one long string in column C. If you have 2007 Excel, there is a button that allows you to "remove duplicates". If you are still in 2003 (like I am), Data Sort on column C and in Column D, type the following formula: =if(C1=C2,1,0). If you have a header, the formula would drop down one row. Copy, Paste Special, Values so that the formula is now hard coded to show either "1" or "0". Data Sort on Column D and remove all the "1's".

Hope this helps!
 
Upvote 0
May Be this will help
this is your data
<b>Excel 2007</b><table cellpadding="2.5px" rules="all" style=";background-color: #FFFFFF;border: 1px solid;border-collapse: collapse; border-color: #A6AAB6"><colgroup><col width="25px" style="background-color: #E0E0F0" /><col /><col /></colgroup><thead><tr style=" background-color: #E0E0F0;text-align: center;color: #161120"><th></th><th>A</th><th>B</th></tr></thead><tbody><tr ><td style="color: #161120;text-align: center;">1</td><td style=";">apple</td><td style=";">watermelon</td></tr><tr ><td style="color: #161120;text-align: center;">2</td><td style=";">orange</td><td style=";">peaches</td></tr><tr ><td style="color: #161120;text-align: center;">3</td><td style=";">grape</td><td style=";">apple</td></tr><tr ><td style="color: #161120;text-align: center;">4</td><td style=";">watermelon</td><td style=";">kiwi</td></tr><tr ><td style="color: #161120;text-align: center;">5</td><td style=";">blueberries </td><td style="text-align: right;;"></td></tr><tr ><td style="color: #161120;text-align: center;">6</td><td style=";">strawberries</td><td style="text-align: right;;"></td></tr><tr ><td style="color: #161120;text-align: center;">7</td><td style=";">mango</td><td style="text-align: right;;"></td></tr><tr ><td style="color: #161120;text-align: center;">8</td><td style=";">peaches</td><td style="text-align: right;;"></td></tr><tr ><td style="color: #161120;text-align: center;">9</td><td style=";">kiwi</td><td style="text-align: right;;"></td></tr><tr ><td style="color: #161120;text-align: center;">10</td><td style=";">pineapple</td><td style="text-align: right;;"></td></tr></tbody></table><p style="width:3.6em;font-weight:bold;margin:0;padding:0.2em 0.6em 0.2em 0.5em;border: 1px solid #A6AAB6;border-top:none;text-align: center;background-color: #E0E0F0;color: #161120">Sheet1</p><br /><br />

after run this code
Code:
Sub Test()
Dim c As Range, d As Range, LR As Long
LR = Range("A" & Rows.Count).End(xlUp).Row

For Each c In Range("A1:A" & LR)
    If Not WorksheetFunction.CountIf(Range("B1:B" & LR), c.Value) >= 1 Then
        i = i + 1
        Range("C" & i).Value = c.Value
    End If
Next c

End Sub

the result will be
<b>Excel 2007</b><table cellpadding="2.5px" rules="all" style=";background-color: #FFFFFF;border: 1px solid;border-collapse: collapse; border-color: #A6AAB6"><colgroup><col width="25px" style="background-color: #E0E0F0" /><col /><col /><col /></colgroup><thead><tr style=" background-color: #E0E0F0;text-align: center;color: #161120"><th></th><th>A</th><th>B</th><th>C</th></tr></thead><tbody><tr ><td style="color: #161120;text-align: center;">1</td><td style=";">apple</td><td style=";">watermelon</td><td style=";">orange</td></tr><tr ><td style="color: #161120;text-align: center;">2</td><td style=";">orange</td><td style=";">peaches</td><td style=";">grape</td></tr><tr ><td style="color: #161120;text-align: center;">3</td><td style=";">grape</td><td style=";">apple</td><td style=";">blueberries </td></tr><tr ><td style="color: #161120;text-align: center;">4</td><td style=";">watermelon</td><td style=";">kiwi</td><td style=";">strawberries</td></tr><tr ><td style="color: #161120;text-align: center;">5</td><td style=";">blueberries </td><td style="text-align: right;;"></td><td style=";">mango</td></tr><tr ><td style="color: #161120;text-align: center;">6</td><td style=";">strawberries</td><td style="text-align: right;;"></td><td style=";">pineapple</td></tr><tr ><td style="color: #161120;text-align: center;">7</td><td style=";">mango</td><td style="text-align: right;;"></td><td style="text-align: right;;"></td></tr><tr ><td style="color: #161120;text-align: center;">8</td><td style=";">peaches</td><td style="text-align: right;;"></td><td style="text-align: right;;"></td></tr><tr ><td style="color: #161120;text-align: center;">9</td><td style=";">kiwi</td><td style="text-align: right;;"></td><td style="text-align: right;;"></td></tr><tr ><td style="color: #161120;text-align: center;">10</td><td style=";">pineapple</td><td style="text-align: right;;"></td><td style="text-align: right;;"></td></tr></tbody></table><p style="width:3.6em;font-weight:bold;margin:0;padding:0.2em 0.6em 0.2em 0.5em;border: 1px solid #A6AAB6;border-top:none;text-align: center;background-color: #E0E0F0;color: #161120">Sheet1</p><br /><br />
 
Upvote 0
Hey Yahya,

This is exactly what i am looking to do stupid question where do i run that code you provided?

Thanks
 
Upvote 0
you are welcome
============
and if you care
you can do it by formula

Test It Please
<b>Excel 2007</b><table cellpadding="2.5px" rules="all" style=";background-color: #FFFFFF;border: 1px solid;border-collapse: collapse; border-color: #A6AAB6"><colgroup><col width="25px" style="background-color: #E0E0F0" /><col /><col /><col /></colgroup><thead><tr style=" background-color: #E0E0F0;text-align: center;color: #161120"><th></th><th>A</th><th>B</th><th>C</th></tr></thead><tbody><tr ><td style="color: #161120;text-align: center;">1</td><td style=";">apple</td><td style=";">watermelon</td><td style=";">orange</td></tr><tr ><td style="color: #161120;text-align: center;">2</td><td style=";">orange</td><td style=";">peaches</td><td style=";">grape</td></tr><tr ><td style="color: #161120;text-align: center;">3</td><td style=";">grape</td><td style=";">apple</td><td style=";">blueberries </td></tr><tr ><td style="color: #161120;text-align: center;">4</td><td style=";">watermelon</td><td style=";">kiwi</td><td style=";">strawberries</td></tr><tr ><td style="color: #161120;text-align: center;">5</td><td style=";">blueberries </td><td style="text-align: right;;"></td><td style=";">mango</td></tr><tr ><td style="color: #161120;text-align: center;">6</td><td style=";">strawberries</td><td style="text-align: right;;"></td><td style=";">pineapple</td></tr><tr ><td style="color: #161120;text-align: center;">7</td><td style=";">mango</td><td style="text-align: right;;"></td><td style=";"></td></tr><tr ><td style="color: #161120;text-align: center;">8</td><td style=";">peaches</td><td style="text-align: right;;"></td><td style=";"></td></tr><tr ><td style="color: #161120;text-align: center;">9</td><td style=";">kiwi</td><td style="text-align: right;;"></td><td style=";"></td></tr><tr ><td style="color: #161120;text-align: center;">10</td><td style=";">pineapple</td><td style="text-align: right;;"></td><td style=";"></td></tr></tbody></table><p style="width:3.6em;font-weight:bold;margin:0;padding:0.2em 0.6em 0.2em 0.5em;border: 1px solid #A6AAB6;border-top:none;text-align: center;background-color: #E0E0F0;color: #161120">Sheet1</p><br /><br /><table width="85%" cellpadding="2.5px" rules="all" style=";border: 2px solid black;border-collapse:collapse;padding: 0.4em;background-color: #FFFFFF" ><tr><td style="padding:6px" ><b>Array Formulas</b><table cellpadding="2.5px" width="100%" rules="all" style="border: 1px solid;text-align:center;background-color: #FFFFFF;border-collapse: collapse; border-color: #A6AAB6"><thead><tr style=" background-color: #E0E0F0;color: #161120"><th width="10px">Cell</th><th style="text-align:left;padding-left:5px;">Formula</th></tr></thead><tbody><tr><th width="10px" style=" background-color: #E0E0F0;color: #161120">C1</th><td style="text-align:left">{=IFERROR(<font color="Blue">INDEX(<font color="Red">$A$1:$A$10,SMALL(<font color="Green">IF(<font color="Purple">ISNA(<font color="Teal">MATCH(<font color="#FF00FF">$A$1:$A$10,$B$1:$B$4,0</font>)</font>),ROW(<font color="Teal">$A$1:$A$10</font>)-ROW(<font color="Teal">$A$1</font>)+1</font>),ROW(<font color="Purple">1:1</font>)</font>)</font>),""</font>)}</td></tr></tbody></table><b>Entered with Ctrl+Shift+Enter.</b> If entered correctly, Excel will surround with curly braces {}.
<b>Note: Do not try and enter the {} manually yourself</b></td></tr></table><br />
 
Upvote 0
This code is not working when the data is like this (See below table). My expectation is 1103 & 1106 should not be pulled in Column C since it exists in both.

ColumnAColumnBColumnC
110270131103
110311011106
110411021110
110511041153
110611051188
110711071189
111024171199
11532485
11883618
11893649
11995129
1102
1103
1104
1105
1106

<tbody>
</tbody>


How can I revise the code?

Thanks!
 
Last edited:
Upvote 0
try PowerQuery
Code:
[SIZE=1]let
    Source = Excel.CurrentWorkbook(){[Name="Table9"]}[Content],
    #"Merged Queries" = Table.NestedJoin(Source,{"ColumnA"},Table10,{"ColumnB"},"Table10",JoinKind.LeftAnti),
    #"Removed Columns" = Table.RemoveColumns(#"Merged Queries",{"Table10"}),
    #"Removed Duplicates" = Table.Distinct(#"Removed Columns")
in
    #"Removed Duplicates"[/SIZE]


ColumnAColumnBColumnA
1102​
7013​
1110​
1103​
1101​
1153​
1104​
1102​
1188​
1105​
1104​
1189​
1106​
1105​
1199​
1107​
1107​
1110​
2417​
1153​
2485​
1188​
3618​
1189​
3649​
1199​
5129​
1102​
1103​
1104​
1105​
1106​
 
Last edited:
Upvote 0
Thanks for Power query suggestion, but actually I need a macro to achieve this.
One more thing what i noticed is, this is happening when the data in column B exceeds column B
 
Upvote 0
One more thing what i noticed is, this is happening when the data in column B exceeds column B
If you look back to the original question, one of the specifications was that column A had more data than column B.

If your data always has column B longer than column A, you could just swap all column references in the code. Is that your circumstance or can either column be the longer one?

Also, in that original sample data, all the values in the shorter list appeared in the longer list. Is that your circumstance too or can either column contain unique entries?
 
Last edited:
Upvote 0

Forum statistics

Threads
1,216,085
Messages
6,128,733
Members
449,465
Latest member
TAKLAM

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top