CSE Array with 38,000 lines - 24-hours later it’s still calculating

Rawlings

New Member
Joined
Jun 1, 2018
Messages
5
I have a spreadsheet A - AZ, 38,000 lines. Seeking matches for 5,000 in the 38,000 lines. Often I work up to 65,000 lines A - AZ but always multiple lookups. I’ve never tried Index - Match on this size file.

The actual formula looks at six ‘conditions to match’ to return a match.

The simplified version of my formula is:
{=Index($A$2:$A$38000,Match(1,($C$2:$C$38000=$AD2)*($D$2:$D$38000=$AF2),0),4)} of course C-S-E from formula line.

I know the formula works, and it’s calculating line by line but 24-hours later it’s still calculating about a line per second.

Question: will this actually finish or does anyone have a better idea?

thanks!
 

Excel Facts

What do {} around a formula in the formula bar mean?
{Formula} means the formula was entered using Ctrl+Shift+Enter signifying an old-style array formula.
How many copies? I copied it from line 2 down to line 38,000 in column AS. However, from my initial post, I need to correct the formula from {=Index($A$2:$A$38000,Match(1,....

It it should have read:
{=Index($A$2:$AS$38000,Match(1,($C$2:$C$38000=$AD2)*($D$2:$D$38000=$AF2),0),4)}

each row the formula changes to =$AD3 and =$AF3, =$AD4 and =$AF4, =$AD5 and =$AF5, etc.

My appologies. I’m retyping this to my IPad since my desktop is working at 99% capacity.
 
Upvote 0
Have you over-simplifies your formula?
... or perhaps created a more complex formula than is required?

The formula in post 3 seems to simply return AF2 or an error, depending on if a row exists with AD2 in column C and AF2 in column D.
That is, it returns the same as

=IF(COUNTIFS(C$2:C$38000,AD2,D$2:D$38000,AF2),AF2,"#N/A")
 
Upvote 0
Reply to post #4 .

In the formula {=Index($A$2:$AS$38000,Match(1,($C$2:$C$38000=$AD2)*($D$2:$D$38000=$AF2),0),4)} isn't the formula returning a value only (or #N/A if not a match) when column AD matches column C AND when column AF matches column D?
I found what worked and tried it on a massive task. However, I tried your formula and yes Sir, I have created a more complex formula than required. I often have to look at several conditions to match to get a result and Index and Match seemed to be the only alternative to vlookups I could find, when multiple conditions existed. Thank you!
 
Upvote 0
Is this slover than a COUNTIFS set up?

Control+shift+enter, not just enter, and copy down:

=IF(ISNUMBER(MATCH(AD2,IF($D$2:$D$38000=$AF2,$C$2:$C$38000),0)),$AF2,"")
 
Upvote 0
Is this slover than a COUNTIFS set up?

Control+shift+enter, not just enter, and copy down:

=IF(ISNUMBER(MATCH(AD2,IF($D$2:$D$38000=$AF2,$C$2:$C$38000),0)),$AF2,"")
Aladin, just a question.
Do you know if your construct is this likely to slow more/less/same than the COUNTIFS when the actual number of column matches are included? I was under the impression that COUNTIFS was pretty good with multi-conditions.
The actual formula looks at six ‘conditions to match’ to return a match.

The simplified version of my formula is:
{=Index($A$2:$A$38000,Match(1,($C$2:$C$38000=$AD2)*($D$2:$D$38000=$AF2),0),4)}
 
Upvote 0
Aladin, just a question.
Do you know if your construct is this likely to slow more/less/same than the COUNTIFS when the actual number of column matches are included? I was under the impression that COUNTIFS was pretty good with multi-conditions.


I was thinking of: https://www.mrexcel.com/forum/excel-questions/38643-auto-flag-list.html#post178581 This performance problem also shows up in formulas for distinct count -- one that uses match vs one that invokes countif(s).
 
Upvote 0
I was thinking of: https://www.mrexcel.com/forum/excel-questions/38643-auto-flag-list.html#post178581 This performance problem also shows up in formulas for distinct count -- one that uses match vs one that invokes countif(s).
I'm not sure of the relevance of that thread since COUNTIFS did not exist then and the comparison were only on one column. My doubt was that as additional column matches were introduced the relative performance of COUNTIFS should improve. That doubt is now supported by the following testing.

I have generated random (numerical) sample data of the scale indicated by this OP.
38,000 rows in 6 columns C:H
4800 rows in 6 columns AD:AI
Expanded the post 7 formula to match data in all 6 columns and copied it down the 4800 rows
Expanded the post 4 formula to match data in all 6 columns and copied it down the 4800 rows
Both columns of formulas produced the same results.
Timed the recalculation of each column individually.
Time to recalculate the 4,800 array formula cells: 112 seconds
Time to recalculate the 4,800 COUNTIFS formula cells: 12 seconds
 
Upvote 0

Forum statistics

Threads
1,215,452
Messages
6,124,916
Members
449,195
Latest member
Stevenciu

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top