PowerQuery - Using Parameter in SQL query before filtering [huge datasource]

Nyanko

Active Member
Joined
Sep 1, 2005
Messages
437
Hi,
I'm using PowerQuery to access and return data in an SQL table. The entire unfiltered table is enormous so I would rather the SQL query return the selected data rather than to return it all and then use PQ to filter it.

I have saved the start and end date into the functions : fnGetStartDate and fnGetEnddate using 2 versions of the below code

Rich (BB code):
let
    Source = Excel.CurrentWorkbook(){[Name="tblParameters"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Parameter", type text}, {"Value", type date}}),
    Value = #"Changed Type"{0}[Value]
in
    Value

The query I need to return contains the half hourly energy use for a site and I only want to return a weeks worth...

Rich (BB code):
let
    Source = Odbc.Query("dsn=SystemsLink", 
        "SELECT Lookup.Lookup_Name, Contacts.Name, DataProfile.Date, #(lf)Sum(DataProfile.[00:30]) AS [00:30], Sum(DataProfile.[01:00]) AS [01:00], Sum(DataProfile.[01:30]) AS [01:30], Sum(DataProfile.[02:00]) AS [02:00], Sum(DataProfile.[02:30]) AS [02:30], Sum(DataProfile.[03:00]) AS [03:00], Sum(DataProfile.[03:30]) AS [03:30], Sum(DataProfile.[04:00]) AS [04:00], Sum(DataProfile.[04:30]) AS [04:30], Sum(DataProfile.[05:00]) AS [05:00], Sum(DataProfile.[05:30]) AS [05:30], Sum(DataProfile.[06:00]) AS [06:00], Sum(DataProfile.[06:30]) AS [06:30], Sum(DataProfile.[07:00]) AS [07:00], Sum(DataProfile.[07:30]) AS [07:30], Sum(DataProfile.[08:00]) AS [08:00], Sum(DataProfile.[08:30]) AS [08:30], Sum(DataProfile.[09:00]) AS [09:00], Sum(DataProfile.[09:30]) AS [09:30], Sum(DataProfile.[10:00]) AS [10:00], Sum(DataProfile.[10:30]) AS [10:30], Sum(DataProfile.[11:00]) AS [11:00], Sum(DataProfile.[11:30]) AS [11:30], Sum(DataProfile.[12:00]) AS [12:00], Sum(DataProfile.[12:30]) AS [12:30], Sum(DataProfile.[13:00]) AS [13:00], Sum(DataProfile.[13:30]) AS [13:30], Sum(DataProfile.[14:00]) AS [14:00], Sum(DataProfile.[14:30]) AS [14:30], Sum(DataProfile.[15:00]) AS [15:00], Sum(DataProfile.[15:30]) AS [15:30], Sum(DataProfile.[16:00]) AS [16:00], Sum(DataProfile.[16:30]) AS [16:30], Sum(DataProfile.[17:00]) AS [17:00], Sum(DataProfile.[17:30]) AS [17:30], Sum(DataProfile.[18:00]) AS [18:00], Sum(DataProfile.[18:30]) AS [18:30], Sum(DataProfile.[19:00]) AS [19:00], Sum(DataProfile.[19:30]) AS [19:30], Sum(DataProfile.[20:00]) AS [20:00], Sum(DataProfile.[20:30]) AS [20:30], Sum(DataProfile.[21:00]) AS [21:00], Sum(DataProfile.[21:30]) AS [21:30], Sum(DataProfile.[22:00]) AS [22:00], Sum(DataProfile.[22:30]) AS [22:30], Sum(DataProfile.[23:00]) AS [23:00], Sum(DataProfile.[23:30]) AS [23:30], Sum(DataProfile.[24:00]) AS [24:00]
            FROM Lookup INNER JOIN ((Contacts INNER JOIN Points ON Contacts.Id = Points.Contacts_Id) INNER JOIN DataProfile ON Points.Id = DataProfile.Point_Id) ON Lookup.Lookup_Id = Contacts.Group_1
            GROUP BY Lookup.Lookup_Name, Contacts.Name, DataProfile.Date, DataProfile.Date
            HAVING (((DataProfile.Date) Between '6/4/2018' And '6/10/2018'))
            ORDER BY DataProfile.Date;"),
    #"Filtered Rows" = Table.SelectRows(#"Reordered Columns1", each ([Site] = fnGetSiteName))
in
    #"Filtered Rows"

I am using a fnGetSiteName to filter after the data has been requested - but how can I replace the date values with fnGetStartDate and fnGetEndDate ?

Thanks in advance
 

Some videos you may like

Excel Facts

Remove leading & trailing spaces
Save as CSV to remove all leading and trailing spaces. It is faster than using TRIM().

peter789

Board Regular
Joined
Nov 20, 2016
Messages
115
I'm sure what you request is possible. It's on my list of things to look into at some stage where I have inherited complex SQL connection strings to cope with.
However can't you connect from power query directly to the database tables? If you structure the merge operation correctly the magic of query folding will take place and the filtering will take place before the entire dataset is downloaded?
Peter
 

Ron Coderre

MrExcel MVP
Joined
Jan 14, 2009
Messages
2,316
I would do it this way

•I would change this line in your SQL:
Code:
     HAVING (((DataProfile.Date) Between '6/4/2018' And '6/10/2018'))
To this:
Code:
     HAVING (((DataProfile.Date) Between 'optStartDate' And 'optEndDate'))


• Add 4 steps before "Source" (Using the advance editor)
Code:
StartDate = Date.ToText(fnGetStartDate, "yyyy-MMM-dd"),
EndDate = Date.ToText(fnGetEndDate, "yyyy-MMM-dd"),
BaseSqlText = (put your sql code here),
SqlText = Text.Replace(Text.Replace(BaseSqlText, "optStartDate", StartDate), "optEndDate", EndDate),


THEN...Use this source step:
Code:
Source = Odbc.Query("dsn=SystemsLink", SqlText),
(Actually, I would read the SQL from a range in Excel...but I don't want to make this too complicated for you)

Is that something you can work with?
 

Nyanko

Active Member
Joined
Sep 1, 2005
Messages
437
That is amazing.

Worked like a charm, thank you so much. This has now given me a strategy to use when writing other queries.
 

Nyanko

Active Member
Joined
Sep 1, 2005
Messages
437
I'm sure what you request is possible. It's on my list of things to look into at some stage where I have inherited complex SQL connection strings to cope with.
However can't you connect from power query directly to the database tables? If you structure the merge operation correctly the magic of query folding will take place and the filtering will take place before the entire dataset is downloaded?
Peter

After reading this article about Native Queries : https://www.excelguru.ca/blog/2015/04/22/native-database-query-security-prompts/

I'm starting to have a look at your suggestion and the concept of "Query Folding", however one of the four SQL tables (DataProfile) I connect to is at least 3,500,000 rows and 28 columns

I have bought in to PQ all the tables I need and created a merge query to deal with the joins, however every time i change a parameter the merge query loads all of the 3.5 million rows before filtering which is a load time of 2.3 minutes compared to the SQL query which loads in 2 seconds.

Am I doing this right ?

Code:
let
    Source = Odbc.DataSource("dsn=SystemsLink", [HierarchicalNavigation=true]),
    ENERGISE_EMSQL_Database = Source{[Name="XXX",Kind="Database"]}[Data],
    dbo_Schema = XXX_Database{[Name="dbo",Kind="Schema"]}[Data],
    Lookup_Table = dbo_Schema{[Name="Lookup",Kind="Table"]}[Data],
    #"Removed Other Columns" = Table.SelectColumns(Lookup_Table,{"Lookup_Id", "Lookup_Name"}),

//JOIN CLIENTS TO SITES    
    #"Merged Queries" = Table.NestedJoin(#"Removed Other Columns",{"Lookup_Id"},tblContacts,{"Group_1"},"Contacts",JoinKind.Inner),
    #"Filtered Rows" = Table.SelectRows(#"Merged Queries", each ([Lookup_Name] = fnGetClient)),
    #"Renamed Columns" = Table.RenameColumns(#"Filtered Rows",{{"Lookup_Name", "Client"}}),
    #"Removed Columns" = Table.RemoveColumns(#"Renamed Columns",{"Lookup_Id"}),
    #"Expanded Contacts" = Table.ExpandTableColumn(#"Removed Columns", "Contacts", {"Id", "Name"}, {"Contacts.Id", "Contacts.Name"}),
    #"Renamed Columns1" = Table.RenameColumns(#"Expanded Contacts",{{"Contacts.Name", "Site"}}),
    #"Filtered Rows1" = Table.SelectRows(#"Renamed Columns1", each ([Site] = fnGetSite)),
    
//JOIN SITES TO METERS
    #"Merged Queries1" = Table.NestedJoin(#"Filtered Rows1",{"Contacts.Id"},tblPoints,{"Contacts_Id"},"Points",JoinKind.LeftOuter),
    #"Removed Columns1" = Table.RemoveColumns(#"Merged Queries1",{"Contacts.Id"}),
    #"Expanded Points" = Table.ExpandTableColumn(#"Removed Columns1", "Points", {"Id"}, {"Points.Id"}),
    
//JOIN METERS TO HALF HOURLY DATA POINTS
    #"Merged Queries2" = Table.NestedJoin(#"Expanded Points",{"Points.Id"},tblDataProfile,{"Point_Id"},"tblDataProfile",JoinKind.LeftOuter),
    #"Expanded tblDataProfile" = Table.ExpandTableColumn(#"Merged Queries2", "tblDataProfile", {"Date", "00:30", "01:00", "01:30", "02:00", "02:30", "03:00", "03:30", "04:00", "04:30", "05:00", "05:30", "06:00", "06:30", "07:00", "07:30", "08:00", "08:30", "09:00", "09:30", "10:00", "10:30", "11:00", "11:30", "12:00", "12:30", "13:00", "13:30", "14:00", "14:30", "15:00", "15:30", "16:00", "16:30", "17:00", "17:30", "18:00", "18:30", "19:00", "19:30", "20:00", "20:30", "21:00", "21:30", "22:00", "22:30", "23:00", "23:30", "24:00", "TotalUnits"}, {"tblDataProfile.Date", "tblDataProfile.00:30", "tblDataProfile.01:00", "tblDataProfile.01:30", "tblDataProfile.02:00", "tblDataProfile.02:30", "tblDataProfile.03:00", "tblDataProfile.03:30", "tblDataProfile.04:00", "tblDataProfile.04:30", "tblDataProfile.05:00", "tblDataProfile.05:30", "tblDataProfile.06:00", "tblDataProfile.06:30", "tblDataProfile.07:00", "tblDataProfile.07:30", "tblDataProfile.08:00", "tblDataProfile.08:30", "tblDataProfile.09:00", "tblDataProfile.09:30", "tblDataProfile.10:00", "tblDataProfile.10:30", "tblDataProfile.11:00", "tblDataProfile.11:30", "tblDataProfile.12:00", "tblDataProfile.12:30", "tblDataProfile.13:00", "tblDataProfile.13:30", "tblDataProfile.14:00", "tblDataProfile.14:30", "tblDataProfile.15:00", "tblDataProfile.15:30", "tblDataProfile.16:00", "tblDataProfile.16:30", "tblDataProfile.17:00", "tblDataProfile.17:30", "tblDataProfile.18:00", "tblDataProfile.18:30", "tblDataProfile.19:00", "tblDataProfile.19:30", "tblDataProfile.20:00", "tblDataProfile.20:30", "tblDataProfile.21:00", "tblDataProfile.21:30", "tblDataProfile.22:00", "tblDataProfile.22:30", "tblDataProfile.23:00", "tblDataProfile.23:30", "tblDataProfile.24:00", "tblDataProfile.TotalUnits"}),
    
//HARD CODE THESE DATES FOR NOW - TO BE REPLACED BY PARAMETERS
    #"Filtered Rows2" = Table.SelectRows(#"Expanded tblDataProfile", each [tblDataProfile.Date] >= #date(2018, 5, 14) and [tblDataProfile.Date] <= #date(2018, 5, 15)),
    #"Removed Columns2" = Table.RemoveColumns(#"Filtered Rows2",{"Points.Id"}),
    #"Renamed Columns2" = Table.RenameColumns(#"Removed Columns2",{{"tblDataProfile.Date", "Date"}, {"tblDataProfile.00:30", "00:30"}, {"tblDataProfile.01:00", "01:00"}, {"tblDataProfile.01:30", "01:30"}, {"tblDataProfile.02:00", "02:00"}, {"tblDataProfile.02:30", "02:30"}, {"tblDataProfile.03:00", "03:00"}, {"tblDataProfile.03:30", "03:30"}, {"tblDataProfile.04:00", "04:00"}, {"tblDataProfile.04:30", "04:30"}, {"tblDataProfile.05:00", "05:00"}, {"tblDataProfile.05:30", "05:30"}, {"tblDataProfile.06:00", "06:00"}, {"tblDataProfile.06:30", "06:30"}, {"tblDataProfile.07:00", "07:00"}, {"tblDataProfile.07:30", "07:30"}, {"tblDataProfile.08:00", "08:00"}, {"tblDataProfile.08:30", "08:30"}, {"tblDataProfile.09:00", "09:00"}, {"tblDataProfile.09:30", "09:30"}, {"tblDataProfile.10:00", "10:00"}, {"tblDataProfile.10:30", "10:30"}, {"tblDataProfile.11:00", "11:00"}, {"tblDataProfile.11:30", "11:30"}, {"tblDataProfile.12:00", "12:00"}, {"tblDataProfile.12:30", "12:30"}, {"tblDataProfile.13:00", "13:00"}, {"tblDataProfile.13:30", "13:30"}, {"tblDataProfile.14:00", "14:00"}, {"tblDataProfile.14:30", "14:30"}, {"tblDataProfile.15:00", "15:00"}, {"tblDataProfile.15:30", "15:30"}, {"tblDataProfile.16:00", "16:00"}, {"tblDataProfile.16:30", "16:30"}, {"tblDataProfile.17:00", "17:00"}, {"tblDataProfile.17:30", "17:30"}, {"tblDataProfile.18:00", "18:00"}, {"tblDataProfile.18:30", "18:30"}, {"tblDataProfile.19:00", "19:00"}, {"tblDataProfile.19:30", "19:30"}, {"tblDataProfile.20:00", "20:00"}, {"tblDataProfile.20:30", "20:30"}, {"tblDataProfile.21:00", "21:00"}, {"tblDataProfile.21:30", "21:30"}, {"tblDataProfile.22:00", "22:00"}, {"tblDataProfile.22:30", "22:30"}, {"tblDataProfile.23:00", "23:00"}, {"tblDataProfile.23:30", "23:30"}, {"tblDataProfile.24:00", "24:00"}, {"tblDataProfile.TotalUnits", "TotalUnits"}})
in
    #"Renamed Columns2"
 

Watch MrExcel Video

Forum statistics

Threads
1,109,020
Messages
5,526,297
Members
409,694
Latest member
bastos21

This Week's Hot Topics

Top