Joining multiple SQL databases in Power Query

masplin

Active Member
Joined
May 10, 2010
Messages
413
I'm brand new to Power Query, but been using Powerpivot for a while. i'm looking to convert all my PP SQL queries into Power query so that I can use the cool paramaterisation some kind people explained to me.

My main query combines 5 different tables using the following SQL

Code:
SELECT        Reporting_Transactions_Fact.Location, Reporting_Transactions_Fact.TransID, Reporting_BusDates.BusDate,              Reporting_Plus_Fact.LineNum, Reporting_Plus_Fact.PluNo, Reporting_Plus_Fact.Qty, Reporting_Plus_Fact.TotalValue , 
               Reporting_Transactions_Fact.SaleTotal,
                      Reporting_Plus_Fact.ClerkID, Reporting_Terminals.EcrID, Reporting_Plus_Fact.AdjustmentsValue, 
                      Reporting_Plus_Fact.TaxNonAddValue, Reporting_Plus_Fact.SeatingArea, Reporting_Times.Hour, Reporting_Times.Minute                          
FROM            Reporting_Terminals INNER JOIN
                         Reporting_Plus_Fact ON Reporting_Terminals.EcrID = Reporting_Plus_Fact.EcrID INNER JOIN
                         Reporting_BusDates ON Reporting_Plus_Fact.BusDateID = Reporting_BusDates.BusDateID INNER JOIN
                         Reporting_Transactions_Fact ON Reporting_Plus_Fact.TransID = Reporting_Transactions_Fact.TransID  INNER JOIN
                         Reporting_Times ON Reporting_Plus_Fact.TimeID = Reporting_Times.TimeID 




WHERE        Reporting_Plus_Fact.CorrectionType = 0 AND Reporting_Transactions_Fact.SalesMode = 0 AND Reporting_Transactions_Fact.SalesType = 0 AND
              Reporting_BusDates.BusDate < '2016-01-01'  AND Reporting_BusDates.BusDate >'2015-07-31' 
                        AND  Reporting_Transactions_Fact.Location >0000

So I guess I need to convert this into "M", but i have even less clue about that than I do SQL!


  1. Is there any straight forward way to convert SQL to M?
  2. Assuming not I loaded up the 5 tables in PQ and set about merging them using the join function. This created 2 new questions
  3. I joined the first 2 tables together , but one has 10mio rows so took forever to do the join. Should I filter down the first table to something manageable and then remove the filters once all 5 tables are joined?
  4. After the first join completed I see the columns of the first table with a new column "table". What do I click on next to join the 3rd table? Do I click on the table that contains the other join field or on this new "Table"?
Thanks for any advice
Mike
 

Excel Facts

What does custom number format of ;;; mean?
Three semi-colons will hide the value in the cell. Although most people use white font instead.
1.Is there any straight forward way to convert SQL to M?

Answer: No – you need to speak “M” (Power Query’s language). The code will be translated into SQL-statements (you can see them if you use the Profiler-tool for SQL-server). This means that you have to create the joins - like you apparently did - by merging the tables:

2.Assuming not I loaded up the 5 tables in PQ and set about merging them using the join function. This created 2 new questions
3.I joined the first 2 tables together , but one has 10mio rows so took forever to do the join. Should I filter down the first table to something manageable and then remove the filters once all 5 tables are joined?


Answer: Yes, you got it!
As long as you are designing your queries, it makes sense to shorten your long tables (which would normally be your Data/Fact-tables) by dynamic filters – even if you wouldn’t need them later on.

So in your example here, you would work you way backward through the SQL-statements: Import the tables that contain the filters and apply these filters (as narrow as possible) and don’t load to data model, but “Only create connection”. Then import the tables that are not going to be filtered according to the SQL-statement (but apply filters if necessary during design phase).

4.After the first join completed I see the columns of the first table with a new column "table". What do I click on next to join the 3rd table? Do I click on the table that contains the other join field or on this new "Table"?


Answer: Yes - you need to build the joins sequentially here - just do as the SQL-statement reads (in the FROM-section).

So after you created your first join between “Reporting_Terminals” and “”Reporting_Plus_Fact” you click on the expand-arrows of the table and only select the fields that are included in the SELECT-section of your query + the ID that you need to connect the following table to connect (ON-part of the FROM-section): “Reporting_Plus_Fact.BusDateID” here. This is the key that you’re going to join the next query/table on: “Reporting_BusDates”…

This should bring back the results quite quickly, because “query folding” should take place: Meaning that the filtering of the data is done at the server and only the filtered results are returned to Power Query and need to be read there (which is what takes the time). You can then save the workbook and set the filter values of your parameters to the “real” ones – which will result in a longer loading time (but shorter as if this would happen in the query-editor). Query folding should still happen for the query in your example.

However – there are actions that would prevent query folding to happen which would result in dramatically decreased performance – have a look at my blog post how this could happen and what do to about it. (Warning: This could sound Chinese again :) )
 
Last edited:
Upvote 0
On last dumb question . Once ive done the merge I have BusDateID and BusDateID2, but actually I don't need either so can i remove the columns without breaking the merge? I assume so
 
Upvote 0
It was all going so well!!! So have my original 5 queries. I've merged them all into the first one and all looks great. I renamed the query with all the merges to "Transaction". Then i clicked close & load to...chose make connection and load to data model. Then the trouble started!!!

firstly it started loading all 5 tables not just the merged one. I panicked and ended the import. i went back to query editor and tried to delete the 4 original queries that i no longer need, but it says i can't as it is referenced by the main query. So there must be a way to only load the finished merged query and not the components or you duplicate all the data?

Secondarily the "close & load to" is now greyed out so I can only "close & load", but doing that does nothing. I tried a refresh all on my data model but only pulling my old SQL queries. so where has by query gone and how do I get it working again?

so close but.....

Thanks
Mike
 
Upvote 0
Sorry, should have said this as well: You only load the last query that did the consolidation to the data model. The other queries stay "Only create connection". (You need to keep them as they have their jobs to do with every refresh).

The greyed-out-field is in fact a bit strange in Power Query: You can only use it when you decide where to load the query the first time. Instead now you have to edit the settings from Excel: Power Query -> Workbook Queries -> Show Pane -> right-click with your mouse on the query you want to edit: Load To: There is the dialogue again.

Now that you've been thrown head first into Power Query, I can only recommend to get it going :) This book is an excellent starting point for Excel users: Excelguru Help Site - M is for (Data) Monkey
 
Upvote 0
easy when you know how!!! Good idea I'll buy the book. Thanks so much for your help as think I can now parameterise this main query and make a big leap forward in usability.

Mike
 
Upvote 0
sorry be annoying now. Is it better to apply all the filters in the top table so they are all in one place or or to apply the filters to each sub table. I did the latter as thought it would cut down the amount of data being passed to the merge. However makes it less transparent what filters you have applied if you have to look through 5 tables.
 
Upvote 0
As a rule of thumb I'd apply the filters as soon as possible. Folding might be possible at later stages as well, but you cannot be sure then.

Re transparency of your code using my "Management Studio" Cleaning Your #PowerBI Power Query Code | SQL with Dustin Ryan might help.
On sheet "Analyze" you'd see the code of all your queries in a readable format at a glance. You can also easily filter or search there for keywords like "SelectRows".
 
Upvote 0
Hi Imke

Sorry to be a pest, but got the whole thing working great and all the parameters feeding perfectly. so feeling pretty pleased with myself until I did a refresh!. If I put the parameters into my SQL query the refresh take 1min 11s, but if I put it in power query the refresh takes 3 min. Id this what you expect? I've gained flexibility for a huge decrease in performance. i'm guessing not or everyone wouldn't be so thrilled about Power query. Any idea why it would be so much slower as this is a non starter time wise as only a very small portion of the full data set so could be adding an hour to a full load.

Maybe I added some inefficient steps when merging the tables as renamed a few columns, removed unnecessary ones, add one calculated column.

Thanks for any tips

Mike
 
Upvote 0

Forum statistics

Threads
1,214,386
Messages
6,119,214
Members
448,874
Latest member
b1step2far

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top