Calculation time on Vlookup

mrderekw24

New Member
Joined
Jan 22, 2012
Messages
7
Hello Everyone,

A team I support uses a workbook that has been stripped down to the bare bones for troubleshooting purposes and still has terrible calculation times. Here are a few examples of the calculation process times, which pegs the CPU at 100%. Now it's down to only two sheets at the moment. 400,000 rows by 13 rows on 1st sheet. 119,000 rows by 2 columns on 2nd sheet (table array). It takes 5 minutes to delete 5000 rows while trying to strip it down by the way. We have tested on about 10 different machines to this point.

Filtered on three columns.
Formula1 example =+I378531+H378531
Vlookup1 exmaple =VLOOKUP(C378530,Sheet1!A$2:B$119070,2,FALSE)

Calculation times (on a filter tick on vlookup column);
1 min. 4%
3 min. 13%
5 min. 23%
7 min. 38%
10 min. 47%

I know diagnosing with this info will not happen but can anyone help me with suggestions of speeding up this process (besides turning off auto-calc)? We are ordering beefy hardware and software that will cut this time in half as testing has proven. That's a good start. But 7 minutes for a simple filter tick or save is still not good. Multiple this by 100 filters per day and this team is just sitting around most of the time. They often just go to lunch and sometimes return and it's still calculating. They would love to get to 1 million rows (like Microsoft advertised) but can't even effectively work with 200,000 rows. I believe this team would pay for consulting if they could triple the rows and cut down the calculations by 4x. Turning off auto calculation is a band-aid and they still have to manually calculate 50 times per day. Why they need to recalculate on Vlookups that much per day is beyond me and goes way over my head. So answering questions to their methodology or workflow will be tough for me. Access and DB's will not work according to them. They need the flexibility to filter and manipulate on the fly all day long.
Thank you,
Derek
 

Excel Facts

Why does 9 mean SUM in SUBTOTAL?
It is because Sum is the 9th alphabetically in Average, Count, CountA, Max, Min, Product, StDev.S, StDev.P, Sum, VAR.S, VAR.P.
Large numbers of vlookup formulas will slow down a workbook as you are seeing. Could describe what the workbook contains and how & what it's used for? Where did this database originate from - ie: is it imported from a database program or was it all manually entered?

You could copy/paste values the vlookup formulas if they would never change - then use the vlookup formulas only for new data. This would certainly speed things up but is only viable if the vlookup results would never change.

If the data is being imported & refreshed from an external database, you could bring in the results of the vlookup in the external query where is can be efficiently processed.
 
Upvote 0
OK. Let me take a stab at this. Please note I am in desktop support. I install software and setup computers and such (way basic). Our company has a team for just about everything you can possibly imagine. Literately hundreds of different teams worldwide but strangely enough not a team for Excel if you will. The people that I am trying to help could answer every question you have but for whatever reason I am the middle man who knows jack about what and why they do what they do. Let me take a shot at it though.

Patient Data (Medical)
Column A (Database) Example = FY10, FY11, FY12 (they filter on this column)
Column B (Patient Type) Example = OP (out patient), IP (In patient) (the filter on this column also)
Column C (Patient Number) Example = Unique ID
Column D (Cases) My guess is number of cases
Column E (Days) Example 1-5 (maybe how many days they stayed)
Column F (Calenar 2011) Example = x and blanks (no clue on this)
Column H (main OR Case 30 min. CDM, blah blah) example 1-999 and have no idea what this is.
Column J (surgical case) example 1-4 (formula =+I378531+H378531) still no clue on what it means.
Column K (surgical or medical) Example = surgical or medical
Column L (Nursery Admit) Example = Nursery Admit and blanks (no clue on this column as far as how it fits in the puzzle).
Column M (FY11 Visits) Example = 1-45 and #N/A. This is the vlookup column. example =VLOOKUP(C378531,Sheet1!A$2:B$119070,2,FALSE)

Second sheet (table array)
Column A (patient ID) Unique ID
Column B (Count of Serv DT) lookup column - example 1-25


I believe the source is from our AS400 system. Exports to .PRN which is new to me. This team opens the PRN and I don't know what they do from there with it (import, save as, ??)

As far as the copy/paste values suggestion. That goes over my head and I wouldn't try to explain to me but might make perfect sense to this team I am trying to help.

Last question/suggestion on the imported and refreshed also went way over my head.

I will state one thing I think make kinda make me wonder. This problem workbook seems to have thousands of empty columns to the right (goes forever) and also about 600,000 rows that are empty below the data. If I copy and paste into a new workbook I don't automatically see a ton of empty rows and columns past the data in the table. Because I am remedial in Excel I use ASAP utilities to do what you can do with just keystrokes. I tried multiple time to delete empty rows and columns and it just never finishes. It pretty much freezes and after 1 hour i just killed the task. My thought is the data export into the .PRN expands the excel file to 65,000 columns and 1 million rows even though it's not even close to that. If excel thinks it has data (x thousands columns by 1 million rows) I would expect it to choke badly. This may make zero sense but i really have no clue. I copied and pasted these two sheet into a new workbook and it performed nicely (less than a minute in calculation mode). I was sure I was on to something but then noticed it only copied the filtered rows which ended up being much smaller then the original and I didn't see all the extra rows and columns. Original file after stripping data and sheets was about 90 MB. After copy and paste it was only 6 MB and obviously it will perform better. If you haven't caught on I am totally lost. I think the best thing to do is either encourage a consultant or get them on this forum so they can answer your questions. Sorry for this incredibly boring post. I have put myself to sleep.

Thank you,
Derek



<table border="0" cellpadding="0" cellspacing="0" width="94"><col width="94"><tr height="81"> <td class="xl96" style="height: 60.75pt; width: 71pt;" height="81" width="94">
</td></tr></table>




<table border="0" cellpadding="0" cellspacing="0" width="1000"><col width="75"><col width="60"><col width="76"><col width="46"><col width="40"><col width="77"><col width="71"><col width="94"><col width="100"><col width="59"><col width="125"><col width="113"><col width="64"><tr height="81"> <td class="xl94" style="height: 60.75pt; width: 56pt;" height="81" width="75">
</td><td class="xl94" style="width: 45pt;" width="60">
</td><td class="xl94" style="width: 57pt;" width="76">
</td><td class="xl94" style="width: 35pt;" width="46">
</td><td class="xl94" style="width: 30pt;" width="40">
</td><td class="xl95" style="width: 58pt;" width="77">
</td><td class="xl94" style="width: 53pt;" width="71">
</td><td class="xl94" style="width: 71pt;" width="94">
</td><td class="xl94" style="width: 75pt;" width="100">
</td><td class="xl95" style="width: 44pt;" width="59">
</td><td class="xl95" style="width: 94pt;" width="125">
</td><td class="xl95" style="width: 85pt;" width="113">
</td><td class="xl94" style="width: 48pt;" width="64">
</td></tr></table>
 
Upvote 0
Is Sheet1!A$2:B$119070 sorted in ascending order on column A? If not, are you willing to sort the area as indicated?
 
Upvote 0
I have tried that a few times. After researching a bit, I did indeed try this and it didn't seem to help on the calculation times. I will however try again tomorrow just to make sure I didn't do something wrong on the other attempts.

I will anything at this point. I have a test file that I can try anything.

Thank you,
Derek
 
Upvote 0
I have tried that a few times. After researching a bit, I did indeed try this and it didn't seem to help on the calculation times. I will however try again tomorrow just to make sure I didn't do something wrong on the other attempts.

I will anything at this point. I have a test file that I can try anything.

Thank you,
Derek

That means you are willing to sort the data area on column A in ascending order. That allows you
to run a way faster look up formula:

Either...
Code:
[FONT=Lucida Console]=IF(VLOOKUP(C378530,Sheet1!A$2:A$119070,1,1)=C378530,[/FONT]
[FONT=Lucida Console] VLOOKUP(C378530,Sheet1!A$2:A$119070,2,1),"")[/FONT]

Or...
Code:
[FONT=Lucida Console]=IF(LOOKUP(C378530,Sheet1!A$2:A$119070)=C378530,[/FONT]
[FONT=Lucida Console] LOOKUP(C378530,Sheet1!A$2:A$119070,Sheet1!B$2:B$119070),"")[/FONT]
 
Upvote 0
Hello Everyone,

A team I support uses a workbook that has been stripped down to the bare bones for troubleshooting purposes and still has terrible calculation times. Here are a few examples of the calculation process times, which pegs the CPU at 100%. Now it's down to only two sheets at the moment. 400,000 rows by 13 rows on 1st sheet. 119,000 rows by 2 columns on 2nd sheet (table array). It takes 5 minutes to delete 5000 rows while trying to strip it down by the way. We have tested on about 10 different machines to this point.

Filtered on three columns.
Formula1 example =+I378531+H378531
Vlookup1 exmaple =VLOOKUP(C378530,Sheet1!A$2:B$119070,2,FALSE)

Calculation times (on a filter tick on vlookup column);
1 min. 4%
3 min. 13%
5 min. 23%
7 min. 38%
10 min. 47%

I know diagnosing with this info will not happen but can anyone help me with suggestions of speeding up this process (besides turning off auto-calc)? We are ordering beefy hardware and software that will cut this time in half as testing has proven. That's a good start. But 7 minutes for a simple filter tick or save is still not good. Multiple this by 100 filters per day and this team is just sitting around most of the time. They often just go to lunch and sometimes return and it's still calculating. They would love to get to 1 million rows (like Microsoft advertised) but can't even effectively work with 200,000 rows. I believe this team would pay for consulting if they could triple the rows and cut down the calculations by 4x. Turning off auto calculation is a band-aid and they still have to manually calculate 50 times per day. Why they need to recalculate on Vlookups that much per day is beyond me and goes way over my head. So answering questions to their methodology or workflow will be tough for me. Access and DB's will not work according to them. They need the flexibility to filter and manipulate on the fly all day long.
Thank you,
Derek
Tips on improving efficiency:

http://www.decisionmodels.com/
 
Upvote 0
Try doing each of the following independently to determine its effect on speed:
1. Delete all vlookup formulas (compare speed before & after deletion)
2. Delete all other formulas
3. Delete all rows & columns after the end of data (delete the "entire row & column", not just the "cell contents"
4. Delete half the rows of data
5. Double the rows of data, ie: copy all data below the end of data

If the users really need visibility to all rows of data, (eg: you can't exclude data because it is more than "X" years), you may want to investigate downloading this file to a true database program such as SQL or MSAccess. Then all calculations can be done by queries within SQL or Access and Excel can be set up to pull whatever external data the user need to see at that moment in time.

Regarding the 2nd sheet (table array), this could be a pivot table summarizing the data in the 1st sheet to calculate the number of visits per patient ID during the year. It could also be formula based. In any event, if it is created within Excel, it could be a source of the slowdown. Try deleting this sheet when you delete the vlookup formulas to see the impact on speed.

Regarding my previous post on trying to copy/paste values, what I'm referring to is the ability within Excel to replace a formula with the resulting calculation from that formula. The calculated amount remains, but the formula is gone. Formulas such as vlookup can really slow down speed in a workbook so this technique can retain all the data by removing the speed-sapping formula.
 
Upvote 0
That means you are willing to sort the data area on column A in ascending order. That allows you
to run a way faster look up formula:

Either...
Code:
[FONT=Lucida Console]=IF(VLOOKUP(C378530,Sheet1!A$2:A$119070,1,1)=C378530,[/FONT]
[FONT=Lucida Console] VLOOKUP(C378530,Sheet1!A$2:A$119070,2,1),"")[/FONT]
Or...
Code:
[FONT=Lucida Console]=IF(LOOKUP(C378530,Sheet1!A$2:A$119070)=C378530,[/FONT]
[FONT=Lucida Console] LOOKUP(C378530,Sheet1!A$2:A$119070,Sheet1!B$2:B$119070),"")[/FONT]

Are you freaking kidding me. I don't believe my eyes right now. Option 2 appears to calculate in 2 seconds. I still can't believe it and am thinking I must have done something wrong. I keep filtering on column M (old vlookup column) at it takes only seconds while the original is taking 15 minutes plus. after I wait for the original to calculate I check against your method and I have the same results. Let me do a bit more testing and pick myself off the floor and will update.
 
Upvote 0
Issue solved. As far as my untrained eyes can tell anyway. Going from around 15 minutes per calculation x 30 times a day x 6 people for over a year, is some serious hours lost. I thank all of you for your replies and contributions.

This site works and I am a fan.

Derek
 
Upvote 0

Forum statistics

Threads
1,215,261
Messages
6,123,942
Members
449,134
Latest member
NickWBA

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top