Reconcile Two Columns BUT also Consider duplicate values

Sufiyan97

Well-known Member
Joined
Apr 12, 2019
Messages
1,538
Office Version
  1. 365
  2. 2013
Platform
  1. Windows
I have two columns A and B I want to find Number in column A but NOT in column B, Number in column B but not in column A.

Book1
AB
1AB
211
322
432
543
654
Sheet1


I am using this formula by CA_Punit

C2: =IFERROR(INDEX($A$2:$A$17,AGGREGATE(15,6,1/(1/(ISERROR(MATCH($A$2:$A$17,$B$2:$B$17,0))*(ROW($A$2:$A$17)-ROW($A$2)+1))),ROWS(B$2:$C2))),"")
Range: C2:C6

D2: =IFERROR(INDEX($B$2:$B$17,AGGREGATE(15,6,1/(1/(ISERROR(MATCH($B$2:$B$17,$A$2:$A$17,0))*(ROW($A$2:$A$17)-ROW($A$2)+1))),ROWS($D$2:D2))),"")
Range: D2:D6

I am getting below result:
Book1
CD
1In Column A but not column BIn Column B but not column A
250
300
400
500
600
Sheet1
Cell Formulas
RangeFormula
C2:C6C2=IFERROR(INDEX($A$2:$A$17,AGGREGATE(15,6,1/(1/(ISERROR(MATCH($A$2:$A$17,$B$2:$B$17,0))*(ROW($A$2:$A$17)-ROW($A$2)+1))),ROWS(B$2:$C2))),"")
D2:D6D2=IFERROR(INDEX($B$2:$B$17,AGGREGATE(15,6,1/(1/(ISERROR(MATCH($B$2:$B$17,$A$2:$A$17,0))*(ROW($A$2:$A$17)-ROW($A$2)+1))),ROWS($D$2:D2))),"")


But I want result as below i.e. the 2 Number in column B is 2 times so it should show that 2 number is not in column A

Book1
FG
1In Column A but not column BIn Column B but not column A
252
300
400
500
600
Sheet1
 

sandy666

just for fun...
WSCreditWS_not_CreditCresit_not_WS
50375.78300872.04
30050793.131934
375.78251456.251804.72
793.1323.874610.767.95
1456.25872.04123.4588
251934543.2199
23.87375.78
375.7874
741804.72
4610.76315.56
315.56375.78
2929
375.787.95
123.4588
00
543.2199

Power Query:
// WS_not_Credit
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    AB = List.Difference(Source[WS], Source[Credit])
in
    AB
Power Query:
// Cresit_not_WS
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    BA = List.Difference(Source[Credit], Source[WS])
in
    BA
Thank you! sandy666
This also works perfectly but formula is easier to work with.
 
Upvote 0

Excel Facts

Can you sort left to right?
To sort left-to-right, use the Sort dialog box. Click Options. Choose "Sort left to right"
just for fun...
WSCreditWS_not_CreditCresit_not_WS
50375.78300872.04
30050793.131934
375.78251456.251804.72
793.1323.874610.767.95
1456.25872.04123.4588
251934543.2199
23.87375.78
375.7874
741804.72
4610.76315.56
315.56375.78
2929
375.787.95
123.4588
00
543.2199

Power Query:
// WS_not_Credit
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    AB = List.Difference(Source[WS], Source[Credit])
in
    AB
Power Query:
// Cresit_not_WS
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    BA = List.Difference(Source[Credit], Source[WS])
in
    BA
Hello sandy666

When I was working with above formula with 7K+ rows Exel gets freeze but your solution with power query works very well and not freezing sheet. Now I understand the power of power query.

Thank you very much.
 
Upvote 0
Try:

Book1
ABCD
1Withdrawal StatementCreditIn Column A but not in column BIn column B but not in column A
250375.78300872.04
330050793.131934
4375.78251456.251804.72
5793.1323.874610.767.95
61456.25872.04123.4588
7251934543.2199
823.87375.78  
9375.7874  
10741804.72  
114610.76315.56  
12315.56375.78  
132929  
14375.787.95  
15
16123.4588
1700
18543.2199
19
Sheet12
Cell Formulas
RangeFormula
C2:C14C2=IFERROR(INDEX(A:A,AGGREGATE(15,6,ROW($A$2:$A$500)/($A$2:$A$500>0)/(COUNTIF(OFFSET($A$1,0,0,ROW($A$2:$A$500)-ROW($A$2)+2),$A$2:$A$500)>COUNTIF($B$2:$B$500,$A$2:$A$500)),ROWS($C$2:$C2))),"")
D2:D14D2=IFERROR(INDEX(B:B,AGGREGATE(15,6,ROW($B$2:$B$500)/($B$2:$B$500>0)/(COUNTIF(OFFSET($B$1,0,0,ROW($B$2:$B$500)-ROW($B$2)+2),$B$2:$B$500)>COUNTIF($A$2:$A$500,$B$2:$B$500)),ROWS($D$2:$D2))),"")

The formula is pretty tricky, and I don't think I have time to explain in full how it works, but I'll try to give the gist of it. You may want to do a search on the AGGREGATE function, which is key to the whole thing.

We start out with an array of all the rows we want to look at:

=IFERROR(INDEX(A:A,AGGREGATE(15,6,ROW($A$2:$A$500)/($A$2:$A$500>0)/(COUNTIF(OFFSET($A$1,0,0,ROW($A$2:$A$500)-ROW($A$2)+2),$A$2:$A$500)>COUNTIF($B$2:$B$500,$A$2:$A$500)),ROWS($C$2:$C2))),"")

This is represented internally like {2,3,4,5,6,...}. Next we want to exclude rows that have zero values, like row 15 and 17. The part in blue:

=IFERROR(INDEX(A:A,AGGREGATE(15,6,ROW($A$2:$A$500)/($A$2:$A$500>0)/(COUNTIF(OFFSET($A$1,0,0,ROW($A$2:$A$500)-ROW($A$2)+2),$A$2:$A$500)>COUNTIF($B$2:$B$500,$A$2:$A$500)),ROWS($C$2:$C2))),"")

checks each cell in the range and returns TRUE or FALSE, which creates another array like this {TRUE,TRUE,TRUE,TRUE,FALSE,TRUE,...}. Then we take each individual element in the first array, and divide it by the corresponding element in the next array, and TRUE is equivalent to 1, and FALSE is equivalent to 0, so we'd end up with something like: {2,3,#DIV/0!,5,6,#DIV/0!,8...}.

The next bit is tricky. Consider this example:

Book1 (version 1).xlsb
ABCDEFGHIJK
1Withdrawal StatementCredit
253COUNTIF(A2:A10,A2)3COUNTIF(B2:B10,A2)1D2>G2?TRUE
344COUNTIF(A3:A10,A3)1COUNTIF(B2:B10,A2)1D3>G3?FALSE
455COUNTIF(A4:A10,A4)2COUNTIF(B2:B10,A2)1D4>G4?TRUE
556COUNTIF(A5:A10,A5)1COUNTIF(B2:B10,A2)1D5>G5?FALSE
Sheet17


5 occurs 3 times in column A, and once in column B, so we want to return 2 values of 5. So for each row, we calculate the D and G formulas, and make the comparison in column J, to get the TRUE/FALSE result in column K. Note in column D the range changes with each row, but the range is constant in column G. For rows 2 and 4 we get TRUE, so we would end up outputting 2 values of 5. The K column is converted into an array like before, and it's divided into the previous array, so we might get a few more #DIV/0! values: {2,3,#DIV/0!,#DIV/0!,6,#DIV/0!,...}.

You can see the COUNTIF() > COUNTIF() within the formula. Note that the first COUNTIF has to use OFFSET to generate the variable ranges for each row.

Finally, the ostensible main function of AGGREGATE kicks in. Given an array {2,3,#DIV/0!,#DIV/0!,6,#DIV/0!,...}, find the nth SMALLest value (the 15 in AGGREGATE), ignoring errors (the 6 value). So this would return row numbers 2, 3, and 6. The final ROWS($C$2:$C2) is used to generate the counter n for 1, 2, 3, etc.

Hope this helps!
 
Upvote 0
The formula is pretty tricky, and I don't think I have time to explain in full how it works, but I'll try to give the gist of it. You may want to do a search on the AGGREGATE function, which is key to the whole thing.

We start out with an array of all the rows we want to look at:

=IFERROR(INDEX(A:A,AGGREGATE(15,6,ROW($A$2:$A$500)/($A$2:$A$500>0)/(COUNTIF(OFFSET($A$1,0,0,ROW($A$2:$A$500)-ROW($A$2)+2),$A$2:$A$500)>COUNTIF($B$2:$B$500,$A$2:$A$500)),ROWS($C$2:$C2))),"")

This is represented internally like {2,3,4,5,6,...}. Next we want to exclude rows that have zero values, like row 15 and 17. The part in blue:

=IFERROR(INDEX(A:A,AGGREGATE(15,6,ROW($A$2:$A$500)/($A$2:$A$500>0)/(COUNTIF(OFFSET($A$1,0,0,ROW($A$2:$A$500)-ROW($A$2)+2),$A$2:$A$500)>COUNTIF($B$2:$B$500,$A$2:$A$500)),ROWS($C$2:$C2))),"")

checks each cell in the range and returns TRUE or FALSE, which creates another array like this {TRUE,TRUE,TRUE,TRUE,FALSE,TRUE,...}. Then we take each individual element in the first array, and divide it by the corresponding element in the next array, and TRUE is equivalent to 1, and FALSE is equivalent to 0, so we'd end up with something like: {2,3,#DIV/0!,5,6,#DIV/0!,8...}.

The next bit is tricky. Consider this example:

Book1 (version 1).xlsb
ABCDEFGHIJK
1Withdrawal StatementCredit
253COUNTIF(A2:A10,A2)3COUNTIF(B2:B10,A2)1D2>G2?TRUE
344COUNTIF(A3:A10,A3)1COUNTIF(B2:B10,A2)1D3>G3?FALSE
455COUNTIF(A4:A10,A4)2COUNTIF(B2:B10,A2)1D4>G4?TRUE
556COUNTIF(A5:A10,A5)1COUNTIF(B2:B10,A2)1D5>G5?FALSE
Sheet17


5 occurs 3 times in column A, and once in column B, so we want to return 2 values of 5. So for each row, we calculate the D and G formulas, and make the comparison in column J, to get the TRUE/FALSE result in column K. Note in column D the range changes with each row, but the range is constant in column G. For rows 2 and 4 we get TRUE, so we would end up outputting 2 values of 5. The K column is converted into an array like before, and it's divided into the previous array, so we might get a few more #DIV/0! values: {2,3,#DIV/0!,#DIV/0!,6,#DIV/0!,...}.

You can see the COUNTIF() > COUNTIF() within the formula. Note that the first COUNTIF has to use OFFSET to generate the variable ranges for each row.

Finally, the ostensible main function of AGGREGATE kicks in. Given an array {2,3,#DIV/0!,#DIV/0!,6,#DIV/0!,...}, find the nth SMALLest value (the 15 in AGGREGATE), ignoring errors (the 6 value). So this would return row numbers 2, 3, and 6. The final ROWS($C$2:$C2) is used to generate the counter n for 1, 2, 3, etc.

Hope this helps!
Thank you very much Eric!

I need to go through your explanation step by step because I am trying to decode formula since you have provided it in October 2020 but not getting it.

Thank you very much again for your time and explanation!
 
Upvote 0

Forum statistics

Threads
1,214,653
Messages
6,120,750
Members
448,989
Latest member
mariah3

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top