Huge Union Query Taking hours

crozfader

Board Regular
Joined
Mar 23, 2011
Messages
86
I'm trying to execute the query below... I has been running for more than one hour, and my computer is not too responsive anymore. I'm just trying to put these 26 tables that contain the same data (different months) into one table.

Each table has about 50,000 lines. So the total would have around 1.2 millions lines, which is not that huge. My Access DB is around 600 megs before the union...

What's the best way to create this master table if union doesn't work? Append maybe?

Suggestions are appreciated.

Thank you.



SELECT * INTO Item_Sales
FROM (
SELECT *, ITEM_NUMBER & '_' & PRODUCT & '_' & LOC AS IPL, ITEM_NUMBER & '_' & PRODUCT AS IP FROM Apr_2010_f UNION
SELECT *, ITEM_NUMBER & '_' & PRODUCT & '_' & LOC AS IPL, ITEM_NUMBER & '_' & PRODUCT AS IP FROM Apr_2011_f UNION
SELECT *, ITEM_NUMBER & '_' & PRODUCT & '_' & LOC AS IPL, ITEM_NUMBER & '_' & PRODUCT AS IP FROM Aug_2009_f UNION
SELECT *, ITEM_NUMBER & '_' & PRODUCT & '_' & LOC AS IPL, ITEM_NUMBER & '_' & PRODUCT AS IP FROM Aug_2010_f UNION
SELECT *, ITEM_NUMBER & '_' & PRODUCT & '_' & LOC AS IPL, ITEM_NUMBER & '_' & PRODUCT AS IP FROM Dec_2009_f UNION
SELECT *, ITEM_NUMBER & '_' & PRODUCT & '_' & LOC AS IPL, ITEM_NUMBER & '_' & PRODUCT AS IP FROM Dec_2010_f UNION
SELECT *, ITEM_NUMBER & '_' & PRODUCT & '_' & LOC AS IPL, ITEM_NUMBER & '_' & PRODUCT AS IP FROM Feb_2010_f UNION
SELECT *, ITEM_NUMBER & '_' & PRODUCT & '_' & LOC AS IPL, ITEM_NUMBER & '_' & PRODUCT AS IP FROM Feb_2011_f UNION
SELECT *, ITEM_NUMBER & '_' & PRODUCT & '_' & LOC AS IPL, ITEM_NUMBER & '_' & PRODUCT AS IP FROM Jan_2010_f UNION
SELECT *, ITEM_NUMBER & '_' & PRODUCT & '_' & LOC AS IPL, ITEM_NUMBER & '_' & PRODUCT AS IP FROM Jan_2011_f UNION
SELECT *, ITEM_NUMBER & '_' & PRODUCT & '_' & LOC AS IPL, ITEM_NUMBER & '_' & PRODUCT AS IP FROM Jul_2009_f UNION
SELECT *, ITEM_NUMBER & '_' & PRODUCT & '_' & LOC AS IPL, ITEM_NUMBER & '_' & PRODUCT AS IP FROM Jul_2010_f UNION
SELECT *, ITEM_NUMBER & '_' & PRODUCT & '_' & LOC AS IPL, ITEM_NUMBER & '_' & PRODUCT AS IP FROM Jun_2009_f UNION
SELECT *, ITEM_NUMBER & '_' & PRODUCT & '_' & LOC AS IPL, ITEM_NUMBER & '_' & PRODUCT AS IP FROM Jun_2010_f UNION
SELECT *, ITEM_NUMBER & '_' & PRODUCT & '_' & LOC AS IPL, ITEM_NUMBER & '_' & PRODUCT AS IP FROM Jun_2011_f UNION
SELECT *, ITEM_NUMBER & '_' & PRODUCT & '_' & LOC AS IPL, ITEM_NUMBER & '_' & PRODUCT AS IP FROM Mar_2010_f UNION
SELECT *, ITEM_NUMBER & '_' & PRODUCT & '_' & LOC AS IPL, ITEM_NUMBER & '_' & PRODUCT AS IP FROM Mar_2011_f UNION
SELECT *, ITEM_NUMBER & '_' & PRODUCT & '_' & LOC AS IPL, ITEM_NUMBER & '_' & PRODUCT AS IP FROM May_2009_f UNION
SELECT *, ITEM_NUMBER & '_' & PRODUCT & '_' & LOC AS IPL, ITEM_NUMBER & '_' & PRODUCT AS IP FROM May_2010_f UNION
SELECT *, ITEM_NUMBER & '_' & PRODUCT & '_' & LOC AS IPL, ITEM_NUMBER & '_' & PRODUCT AS IP FROM May_2011_f UNION
SELECT *, ITEM_NUMBER & '_' & PRODUCT & '_' & LOC AS IPL, ITEM_NUMBER & '_' & PRODUCT AS IP FROM Nov_2009_f UNION
SELECT *, ITEM_NUMBER & '_' & PRODUCT & '_' & LOC AS IPL, ITEM_NUMBER & '_' & PRODUCT AS IP FROM Nov_2010_f UNION
SELECT *, ITEM_NUMBER & '_' & PRODUCT & '_' & LOC AS IPL, ITEM_NUMBER & '_' & PRODUCT AS IP FROM Oct_2009_f UNION
SELECT *, ITEM_NUMBER & '_' & PRODUCT & '_' & LOC AS IPL, ITEM_NUMBER & '_' & PRODUCT AS IP FROM Oct_2010_f UNION
SELECT *, ITEM_NUMBER & '_' & PRODUCT & '_' & LOC AS IPL, ITEM_NUMBER & '_' & PRODUCT AS IP FROM Sep_2009_f UNION
SELECT *, ITEM_NUMBER & '_' & PRODUCT & '_' & LOC AS IPL, ITEM_NUMBER & '_' & PRODUCT AS IP FROM Sep_2010_f );
 

Excel Facts

Using Function Arguments with nested formulas
If writing INDEX in Func. Arguments, type MATCH(. Use the mouse to click inside MATCH in the formula bar. Dialog switches to MATCH.
After waiting one hour, got an 'out of memory' error (more than 2GB file).

I'll wait for suggestions before create a new database linking to the one containing the 26 tables.

Thx
 
Upvote 0
Access can cope with 1.2 million, but that doesn't mean it's a good idea using it for that many records.

One thing you could try with your UNION query would be to use UNION ALL instead of UNION.

Without the ALL then a UNION returns unique records.

So in this case it's going to check all 50000 records in each table against each other, and the 50000 records in all the other tables.

That's quite a lot of checking and it'll do it even if all the records are unique.

Of course I'm assuming you don't need the unique records thing.

Did you try an append?

You could set that up to do one table at a time, and just change the table name in the FROM clause for each table.

Obviously that wouldn't be much fun but it would probably be a bit less time consuming than waiting for the UNION to finish.

Also if one of the appends fail you'll still have all the previous data you appended.

You try that append again or continue and go back to it at a later stage.
 
Upvote 0
Thanks for your reply.

I've created a new DB linking the one containing the 26 tables. My query is executing. I was not aware of the distinction between union and union all.


Let me make sure I understand. I have about 20 columns, for UNION to actually find and delete a duplicate, then all 26 columns would have to be equal for two records, correct?

I think I will go ahead and try union all to see how much it speeds it up.

Thanks
 
Upvote 0
As far as I know that's how it works and I have noticed the difference in speed between using ALL and not using it.

I'm sure you'll find a better explanation elsewhere.

One thing though, nothing is deleted.

By the way UNION ALL was only a suggestion to speed up the UNION.

I'd actually recommend appending the data table by table instead of using UNION.
 
Upvote 0
I tried your suggestion, and the Query is 75% done and it has only been 5 minutes... Huge difference. On top of that, I'm calling the Query from a linked DB.

Thanks!
 
Upvote 0
Which suggestion?

UNION ALL or append?
 
Upvote 0
Union All!!

Append is too manual. I need to repeat this process several times.

Btw it litteraly took 5 mins... Instead of more than an hour for UNION.
 
Upvote 0
UNION ALL is definitely quicker. If you want unique records once the query is run, do this:

1. Create a MAKE-TABLE query that pushes the UNION ALL output into a new table. You will also need a KEY field (at this stage empty).
2. Create a unique key (it may need to be concatenated) and write the output to the field using an UPDATE query.
3. Create an index on the key field; it should not be unique
4. Delete the duplicate records -- see here for a tutorial

Note: The reason for pushing the query into a table is two-fold. Performance is much better on raw data than on complex queries. And indexing gets terminally scrambled in UNION queries, so searching etc is slow and often error-prone.

Denis
 
Last edited:
Upvote 0

Forum statistics

Threads
1,224,521
Messages
6,179,285
Members
452,902
Latest member
Knuddeluff

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top