Power Query Data Source - Not Pulling from Sub Folders

richh

Board Regular
Joined
Jun 24, 2007
Messages
245
Office Version
  1. 365
  2. 2016
MS Excel 2016

This is the first time I've used a query, so I could very well have constructed the applied steps incorrectly.

My problem is that records from newer files are not present.

The query is designed to compile lists of reviewed records from weekly reports. They are the same lists, so the duplicate entries are deleted once the files are compiled.

I built a query that (should) pull records from subfolders housed in the direct data source folder; however, the query appears to only be returning files from one subfolder. My directory is structured as follows:
  • Data Source
    • Month
      • Week <-- data appears to only be coming from this folder
      • Week <-- data from here is not present
      • Week <-- No data yet
      • Week
    • Month
      • Etc...

The current results of my query show that there are 53 files to pull from, 34 of which are stored in the 1st month's 1st week folder. The other 19 are stored in the 1st month's 2nd week folder.

My query (if I constructed it correctly) should pull data in from a file, sort the files based upon completion date (if the record is incomplete, the field is left blank, otherwise a date is there), then remove the duplicates. I was hoping that doing so would remove the old rows that do not have a completed date.

Any help would be appreciated!
 

Excel Facts

Best way to learn Power Query?
Read M is for (Data) Monkey book by Ken Puls and Miguel Escobar. It is the complete guide to Power Query.
Without seeing your PowerQuery in detail (copy it from the Advanced Editor option) it is difficult to comment. I started the process by selecting New Query, From File, From Folder. then navigate to the top level directory 'Data Source' in your case, and then choose the 'Transform Option' from there you should be able to filter out the files you want before continuing. I'm sure its possible. For info the PowerQuery I created read as follows and resulted in the list of files in spreadsheets folder and included files in the subdirectories.

Power Query:
let
    Source = Folder.Files("C:\Users\peter\Dropbox\Spreadsheets")
in
    Source
 
Upvote 0
I'm not certain, but if I sort the list when there is a date in descending order, then remove the duplicates, it should remove those that either have no date or those that have older dates, correct?

The older files should appear something like...

Primary IDReview Date
123402/03/2021
234502/04/2021
3456
4567

Newer files may come in as follows..

Primary IDReview Date
123402/03/2021
234503/10/2021 (indicates the record was re-reviewed)
345603/03/2021
4567

My expectation from the query is as follows...

Primary IDReview Date (Sorted Newest to Oldest)
234503/10/2021
345603/03/2021
123402/03/2021
4567


However, I am still seeing only the records from the older files.

Power Query:
let
    Source = Folder.Files("[Directory]...\Weekly Data Reports"),
    #"Filtered Hidden Files1" = Table.SelectRows(Source, each [Attributes]?[Hidden]? <> true),
    #"Invoke Custom Function1" = Table.AddColumn(#"Filtered Hidden Files1", "Transform File from FSP Weekly Data Reports", each #"Transform File from FSP Weekly Data Reports"([Content])),
    #"Renamed Columns1" = Table.RenameColumns(#"Invoke Custom Function1", {"Name", "Source.Name"}),
    #"Removed Other Columns1" = Table.SelectColumns(#"Renamed Columns1", {"Source.Name", "Transform File from FSP Weekly Data Reports"}),
    #"Expanded Table Column1" = Table.ExpandTableColumn(#"Removed Other Columns1", "Transform File from FSP Weekly Data Reports", Table.ColumnNames(#"Transform File from FSP Weekly Data Reports"(#"Sample File"))),
    //#"Changed Type" = A Bunch of changes,
    #"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"Source.Name"}),
    #"Sorted Rows" = Table.Sort(#"Removed Columns",{{"Review Date", Order.Descending}}),
    #"Removed Duplicates" = Table.Distinct(#"Sorted Rows", {"Primary ID"}),
    #"Removed Blank Rows" = Table.SelectRows(#"Removed Duplicates", each not List.IsEmpty(List.RemoveMatchingItems(Record.FieldValues(_), {"", null}))),
    #"Added Custom" = Table.AddColumn(#"Removed Blank Rows", "Unrev Students", each if[Review Date] = null then 1 else 0)
in
    #"Added Custom"
 
Upvote 0

Forum statistics

Threads
1,216,126
Messages
6,129,007
Members
449,480
Latest member
yesitisasport

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top