SQL TOP n from SQL query?

Jaymond Flurrie

Well-known Member
Joined
Sep 22, 2008
Messages
919
Office Version
  1. 365
Platform
  1. Windows
I have a working SQL query:


Code:
SELECT SUM(a10a), SUM(a11a), SUM(a12a), SUM(a13a), SUM(a14a), SUM(a15a), SUM(a16a), SUM(a17a), SUM(a19a), SUM(a20a), SUM(a21a), SUM(a22a), SUM(a23a), SUM(a24a), SUM(a25a), SUM(a26a), SUM(a30a), SUM(a31a), SUM(a32a), SUM(a33a), SUM(a34a), SUM(a35a), SUM(a36a), SUM(a37a), SUM(a38a), SUM(a39a), SUM(a40a), SUM(a41a), SUM(a42a), SUM(a43a), SUM(a44a), SUM(a45a), SUM(a48a), SUM(a49a), SUM(a50a), SUM(a53a), SUM(a54a), SUM(a59a), SUM(a60a), SUM(a61a), SUM(a62a), SUM(a63a), SUM(a64a), SUM(a65a), SUM(a66a), SUM(a67a), SUM(a68a), SUM(a69a), SUM(a70a), SUM(a71a), SUM(a72a), SUM(a73a), SUM(a74a), SUM(a75a), SUM(a76a), SUM(a77a), SUM(a78a), SUM(a79a), SUM(a80a), SUM(a82a), SUM(a83a), SUM(a84a), SUM(a85a), SUM(a86a), SUM(a87a), SUM(a88a), SUM(a89a) FROM [Ostoaineisto$];

But how do I turn this to TOP n query? Ie. Let's says these 10, 11, 12... 89 are the sums that this query results. How do I select the columns where this sum is in Top 40 of those sums?

I'm pretty sure the answer should include that query mentioned above as a subquery and I'd guess it should treat that one above with "AS" and probably also somehow "IN", but I'm not sure what the actual query-line should be.

You can also simplify your answer quite a lot, take for example a "TOP 2" and that working SQL query from above as, say,
Code:
SELECT SUM(a10a), SUM(a11a), SUM(a12a) FROM [Ostoaineisto$];
just to keep it simple.

I would imagine it goes something like
Code:
SELECT TOP 40 FROM (SELECT SUM(a10a), SUM(a11a), SUM(a12a) FROM [Ostoaineisto$]);
, but please help me to complete this one.
 

Excel Facts

Best way to learn Power Query?
Read M is for (Data) Monkey book by Ken Puls and Miguel Escobar. It is the complete guide to Power Query.
Jaymond

Do you actually have 89 fields that you want to sum?

Have you considered restructuring the data into something a bit more normalized?

If you did it would make both your original query and the TOP n query a lot easier to do.
 
Upvote 0
Jaymond

Do you actually have 89 fields that you want to sum?

Have you considered restructuring the data into something a bit more normalized?

If you did it would make both your original query and the TOP n query a lot easier to do.

I don't think normalization is possible. Each column has only ones and zeros besides the header row and none of those columns has relationship from one to another, excluding one single field - the header column. So to keep the number of columns manageable, I need to cut out the ones that doesn't have the sum among the top-40 ie. I'll pick those top-40 and throw the rest away. And that 89 columns is just a test sample.
 
Upvote 0
Why do the columns need to be related?

Perhaps you could make a 2 field table with the first field being the current field name and the oher field being the 1/0.

You could then sum them with a simple query.

Or perhaps you can't, what type of data is this?

Are you sure what you want to do couldn't be done in something like Excel?

All you would need to do for the sum part would be to add a SUM at the bottom of each field.
 
Upvote 0
Why do the columns need to be related?

Perhaps you could make a 2 field table with the first field being the current field name and the oher field being the 1/0.

You could then sum them with a simple query.

Or perhaps you can't, what type of data is this?

Are you sure what you want to do couldn't be done in something like Excel?

All you would need to do for the sum part would be to add a SUM at the bottom of each field.

Otherwise a good idea, but one huge problem comes from actually putting the data "as is" to Excel. The real data I have is 2400 columns and a million rows. I need to cut the data down a lot, and a random sample of 100k from that million rows is ok, but I need to first cut down the number of the columns to 40 and those 40 have to be the ones with the biggest sums. As far as I can think this thru, the smartest way is to make SQL query against that original data (it's in a 6GB flatfile that I'll break down to a couple of Access databases) and then handle the SQL query results in Excel. I already know how to pick those 100k random rows, but this column part is more of a problem because of the requirement of those sums. Yes, I can query each sum first to Excel, then check the top-40 of them and finally query against the Access DBs by having the names of those top-40 as the columns I want to select (like you suggested), but I think this is possible to do relatively easily with SQL just like that.

And before anyone thinks of asking, no, SQL Server and Oracle are not options. This has to be done with Office alone.
 
Upvote 0
Jaymond

I'm not so sure it is possible using straight SQL, not easily anyway.

Have you considered code?

If you have all the records in a table in Access you could go through the table field by field getting the sum for each.

That could be added to an array, or even another table.

Actually I think another table would be best - much easier to sort/query than an array.

Anyway, once you've got all the sums in a table run a TOP n query against it that returns the field names you want.

Then construct another, simple SELECT query, by looping through that query.

This would be the query that would return the data you want, maybe.:)

By the way, hasn't Access got a 256 field limit?

Are you going to deal with that by breaking up the data into multiple tables?

Does that make any sense?
 
Upvote 0
Jaymond

I'm not so sure it is possible using straight SQL, not easily anyway.

Have you considered code?

If you have all the records in a table in Access you could go through the table field by field getting the sum for each.

That could be added to an array, or even another table.

Actually I think another table would be best - much easier to sort/query than an array.

Anyway, once you've got all the sums in a table run a TOP n query against it that returns the field names you want.

Then construct another, simple SELECT query, by looping through that query.

This would be the query that would return the data you want, maybe.:)

By the way, hasn't Access got a 256 field limit?

Are you going to deal with that by breaking up the data into multiple tables?

Does that make any sense?

I guess the smartest way is to count these 2400 sums to a table like you suggested. Can I get somehow the name of the column I sum to be included in SQL-query results, or is the smartest way to write these column names "by hand" (actually, by parsing from the flatfile) to the table?

Oh and Access has 256 field and 2 GB database size limit, that's why I would split that 6GB flatfile to pieces. I'm not sure whether it's smarter to split it into three databases with multiple tables or to 15 databases.
 
Upvote 0
I was thinking of writing the sums to a table with 2 fields, one for the field name and one for the sum.

I've actually had a think and you could probably do it with 2400 queries (or however many columns there are and a UNION query.

But with code you could just loop through each field, run a simple query to get the sum/count and then just insert the field name and sum into the 'totals' table.

Perhaps it's me but that sounds easier than creating some hugh UNION query which is going to be prone to errors.
 
Upvote 0
I was thinking of writing the sums to a table with 2 fields, one for the field name and one for the sum.

I've actually had a think and you could probably do it with 2400 queries (or however many columns there are and a UNION query.

But with code you could just loop through each field, run a simple query to get the sum/count and then just insert the field name and sum into the 'totals' table.

Perhaps it's me but that sounds easier than creating some hugh UNION query which is going to be prone to errors.

This UNION query would naturally be automated (ie. created in a loop) and tested out with much smaller data. If it works with three, it probably works with 2400 - unless there's some Access-limitation.

When I have ran queries with Excel thru ADODB, the queries have been pretty slow and taken quite a while to execute, no matter how simple the query is and no matter how small recordset it results. Are Access-queries faster if they are run from VBA? I already have code to return sum of each column, but I would need to query for field names in a table too. I mean, if I have field names "field1a", "field2a" and "field3a", what's the query to return recordset of "field1a", "field2a" and "field3a" instead of whatever data these columns actually contain?
 
Upvote 0

Forum statistics

Threads
1,215,086
Messages
6,123,035
Members
449,092
Latest member
ikke

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top