How to filter multiple files for latest version of each file

Judoman

New Member
I am using Power Query in Excel for the first time. Got query to work fine but have too much unneeded data.
Query combines 1000+ files. Each file has ~150 rows of data. Files are named using serial number (SN) and Revision letter for each file.
There are multiple files of different revision levels for each serial number. The file name is shown to left of each row of data.

I only need the rows of data for the latest/highest revision of each serial number. That would equal ~ 33 rows for each serial number. This would give me ~11,000 rows of data instead of the unneeded 540,000+ rows.

So question 1 is can this be done in power query?

If yes then question 2 is how is it done? Need specific details on how to do it as I am new at this in PQ.

Here is an example of 2 Revs for SN 133. I only need the rows that have Rev B to be displayed.

Source.NameOPTIONOPTION SELECTED
APM Form for F2X DN133 SN133 Rev A.xlsAircraft ModelF-2000
APM Form for F2X DN133 SN133 Rev A.xlsWinglets InstalledYes
APM Form for F2X DN133 SN133 Rev A.xlsABOC Installed
APM Form for F2X DN133 SN133 Rev A.xlsBASC Installed700
APM Form for F2X DN133 SN133 Rev A.xlsBattery Voltage Drop SuppressionNo
APM Form for F2X DN133 SN133 Rev B.xlsAircraft ModelF-2000
APM Form for F2X DN133 SN133 Rev B.xlsWinglets InstalledYes
APM Form for F2X DN133 SN133 Rev B.xlsABOC Installed
APM Form for F2X DN133 SN133 Rev B.xlsBASC Installed700
APM Form for F2X DN133 SN133 Rev B.xlsBattery Voltage Drop SuppressionNo

<tbody>
</tbody>
 

theBardd

Rules violation
Here is one way, it assume the serial number is 3 items from the end

Code:
let
    Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
    columns.Type = Table.TransformColumnTypes(Source,{{"Source.Name", type text}, {"OPTION", type text}, {"OPTION SELECTED", type any}}),
    serial.Singles = Table.Distinct(columns.Type, {"Source.Name"}),
    source.Split = Table.AddColumn(serial.Singles, "Source.Split", each Text.Split([Source.Name]," ")),
    serial.Exttract = Table.AddColumn(source.Split, "Serial", each [Source.Split]{List.Count([Source.Split])-3}),
    serial.rev.Extract = Table.AddColumn(serial.Exttract, "SerialRev", each [Source.Split]{List.Count([Source.Split])-3}&List.Last([Source.Split])),
    columns.Stripdown = Table.RemoveColumns(serial.rev.Extract,{"OPTION", "OPTION SELECTED", "Source.Split"}),
    index0.Add = Table.AddIndexColumn(columns.Stripdown, "Index", 0, 1),
    index1.Add = Table.AddIndexColumn(index0.Add, "Index.1", 1, 1),
    next.Join = Table.NestedJoin(index1.Add,{"Index.1"},index1.Add,{"Index"},"Added Index1",JoinKind.LeftOuter),
    columns.Stripdown = Table.RemoveColumns(next.Join,{"SerialRev", "Index", "Index.1"}),
    next.Extpand = Table.ExpandTableColumn(columns.Stripdown, "Added Index1", {"Serial"}, {"Next.Serial"}),
    tokeep.Test = Table.AddColumn(next.Extpand, "ToKeep", each if [Serial]<>[Next.Serial] then true else false),
    tokeep.Filter = Table.SelectRows(tokeep.Test, each ([ToKeep] = true)),
    allrows.Retrieve = Table.NestedJoin(columns.Type,{"Source.Name"},tokeep.Filter,{"Source.Name"},"Filtered Rows",JoinKind.Inner),
    table.Remove = Table.RemoveColumns(allrows.Retrieve,{"Filtered Rows"})
in
    table.Remove
 
Last edited:

Judoman

New Member
When your code was used I got:
Expression.SyntaxError: Token Eof expected.

In list of workbook queries I have a warning of Download did not complete.

Here is my original code that shows all revs of each line item:

let
Source = Folder.Files("J:\Dept 240\APM\EASy II\Requests\Processed Requests\F2X"),
#"Filtered Hidden Files1" = Table.SelectRows(Source, each [Attributes]?[Hidden]? <> true),
#"Invoke Custom Function1" = Table.AddColumn(#"Filtered Hidden Files1", "Transform File from F2X", each #"Transform File from F2X"([Content])),
#"Renamed Columns1" = Table.RenameColumns(#"Invoke Custom Function1", {"Name", "Source.Name"}),
#"Removed Other Columns1" = Table.SelectColumns(#"Renamed Columns1", {"Source.Name", "Transform File from F2X"}),
#"Expanded Table Column1" = Table.ExpandTableColumn(#"Removed Other Columns1", "Transform File from F2X", Table.ColumnNames(#"Transform File from F2X"(#"Sample File"))),
#"Changed Type" = Table.TransformColumnTypes(#"Expanded Table Column1",{{"Source.Name", type text}, {"Column1", type text}, {"Column2", type text}, {"Column3", type any}, {"Column4", type text}, {"Column5", type text}, {"Column6", type any}, {"Column7", type any}, {"Column8", type any}, {"Column9", Int64.Type}, {"Column10", type text}, {"Column11", type any}, {"Column12", type any}, {"Column13", type any}, {"Column14", type any}}),
#"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"Column3"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Column2", "OPTION"}, {"Column5", "OPTION SELECTED"}})
in
#"Renamed Columns"
 

theBardd

Rules violation
I didn't get that error, but I did get a duplicate step-name.

This version works for me but it is just one file, not a folder like yours, so it needs modifying.

Code:
let
    Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
    columns.Type = Table.TransformColumnTypes(Source,{{"Source.Name", type text}, {"OPTION", type text}, {"OPTION SELECTED", type any}}),
    serial.Singles = Table.Distinct(columns.Type, {"Source.Name"}),
    source.Split = Table.AddColumn(serial.Singles, "Source.Split", each Text.Split([Source.Name]," ")),
    serial.Exttract = Table.AddColumn(source.Split, "Serial", each [Source.Split]{List.Count([Source.Split])-3}),
    serial.rev.Extract = Table.AddColumn(serial.Exttract, "SerialRev", each [Source.Split]{List.Count([Source.Split])-3}&List.Last([Source.Split])),
    columns.Stripdown = Table.RemoveColumns(serial.rev.Extract,{"OPTION", "OPTION SELECTED", "Source.Split"}),
    index0.Add = Table.AddIndexColumn(columns.Stripdown, "Index", 0, 1),
    index1.Add = Table.AddIndexColumn(index0.Add, "Index.1", 1, 1),
    next.Join = Table.NestedJoin(index1.Add,{"Index.1"},index1.Add,{"Index"},"Added Index1",JoinKind.LeftOuter),
    columns2.Stripdown = Table.RemoveColumns(next.Join,{"SerialRev", "Index", "Index.1"}),
    next.Extpand = Table.ExpandTableColumn(columns2.Stripdown, "Added Index1", {"Serial"}, {"Next.Serial"}),
    tokeep.Test = Table.AddColumn(next.Extpand, "ToKeep", each if [Serial]<>[Next.Serial] then true else false),
    tokeep.Filter = Table.SelectRows(tokeep.Test, each ([ToKeep] = true)),
    allrows.Retrieve = Table.NestedJoin(columns.Type,{"Source.Name"},tokeep.Filter,{"Source.Name"},"Filtered Rows",JoinKind.Inner),
    table.Remove = Table.RemoveColumns(allrows.Retrieve,{"Filtered Rows"})
in
    table.Remove
 

Judoman

New Member
Now I can see the steps under Applied Steps. But get Expression.Error: We couldn't find an Excel table named 'Table3'.Details:
Table3

What is Table 3? looks like it would be a file name. I put in name of my file but it did not work.
 

theBardd

Rules violation
As I said, my code is for one file, and that file is current workbook. I am querying a table called Table3, you will need to change it to get all of your files in first before you use the transforming code I give.
 

Judoman

New Member
As I'm new to power query I'm not sure how to get all my files in. I have an excel worksheet that contains all the files. Worksheet has total of 14 columns but I only need to use the 3 I made sample from. Do I enter the column (table) names? You already have the names of my 3 columns in what you sent me.
 

theBardd

Rules violation
Doesn't this part of the code that you supplied do that, grabbing all the files in the folder? I couldn't test it as I don't have any files.
Code:
let
    Source = Folder.Files("J:\Dept 240\APM\EASy II\Requests\Processed Requests\F2X"),
    #"Filtered Hidden Files1" = Table.SelectRows(Source, each [Attributes]?[Hidden]? <> true),
    #"Invoke Custom Function1" = Table.AddColumn(#"Filtered Hidden Files1", "Transform File from F2X", each #"Transform File from F2X"([Content])),
    #"Renamed Columns1" = Table.RenameColumns(#"Invoke Custom Function1", {"Name", "Source.Name"}),
    #"Removed Other Columns1" = Table.SelectColumns(#"Renamed Columns1", {"Source.Name", "Transform File from F2X"}),
    #"Expanded Table Column1" = Table.ExpandTableColumn(#"Removed Other Columns1", "Transform File from F2X", Table.ColumnNames(#"Transform File from F2X"(#"Sample File"))),
Append my code starting at columns.TYpe, change my first row refence to the previous step, and tweak it all.
 

Judoman

New Member
Sorry for not replying sooner. Been out sick a few days. Looks like I should add your code to mine. But I don't know where to add it. Nor do I know exactly what/how to change your first row to previous step - -change to what? Then how/what do I tweak it all? Below is what I currently have in Advanced Editor using your code and mine.

let
Source = Folder.Files("J:\Dept 240\APM\EASy II\Requests\Processed Requests\F2X"),
#"Filtered Hidden Files1" = Table.SelectRows(Source, each [Attributes]?[Hidden]? <> true),
#"Invoke Custom Function1" = Table.AddColumn(#"Filtered Hidden Files1", "Transform File from F2X", each #"Transform File from F2X"([Content])),
#"Renamed Columns1" = Table.RenameColumns(#"Invoke Custom Function1", {"Name", "Source.Name"}),
#"Removed Other Columns1" = Table.SelectColumns(#"Renamed Columns1", {"Source.Name", "Transform File from F2X"}),
#"Expanded Table Column1" = Table.ExpandTableColumn(#"Removed Other Columns1", "Transform File from F2X", Table.ColumnNames(#"Transform File from F2X"(#"Sample File"))),






#"Changed Type" = Table.TransformColumnTypes(#"Expanded Table Column1",{{"Source.Name", type text}, {"Column1", type text}, {"Column2", type text}, {"Column3", type any}, {"Column4", type text}, {"Column5", type text}, {"Column6", type any}, {"Column7", type any}, {"Column8", type any}, {"Column9", Int64.Type}, {"Column10", type text}, {"Column11", type any}, {"Column12", type any}, {"Column13", type any}, {"Column14", type any}}),
#"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"Column3"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Column2", "OPTION"}, {"Column5", "OPTION SELECTED"}})
in
#"Renamed Columns"


let
Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
columns.Type = Table.TransformColumnTypes(Source,{{"Source.Name", type text}, {"OPTION", type text}, {"OPTION SELECTED", type any}}),
serial.Singles = Table.Distinct(columns.Type, {"Source.Name"}),
source.Split = Table.AddColumn(serial.Singles, "Source.Split", each Text.Split([Source.Name]," ")),
serial.Exttract = Table.AddColumn(source.Split, "Serial", each [Source.Split]{List.Count([Source.Split])-3}),
serial.rev.Extract = Table.AddColumn(serial.Exttract, "SerialRev", each [Source.Split]{List.Count([Source.Split])-3}&List.Last([Source.Split])),
columns.Stripdown = Table.RemoveColumns(serial.rev.Extract,{"OPTION", "OPTION SELECTED", "Source.Split"}),
index0.Add = Table.AddIndexColumn(columns.Stripdown, "Index", 0, 1),
index1.Add = Table.AddIndexColumn(index0.Add, "Index.1", 1, 1),
next.Join = Table.NestedJoin(index1.Add,{"Index.1"},index1.Add,{"Index"},"Added Index1",JoinKind.LeftOuter),
columns2.Stripdown = Table.RemoveColumns(next.Join,{"SerialRev", "Index", "Index.1"}),
next.Extpand = Table.ExpandTableColumn(columns2.Stripdown, "Added Index1", {"Serial"}, {"Next.Serial"}),
tokeep.Test = Table.AddColumn(next.Extpand, "ToKeep", each if [Serial]<>[Next.Serial] then true else false),
tokeep.Filter = Table.SelectRows(tokeep.Test, each ([ToKeep] = true)),
allrows.Retrieve = Table.NestedJoin(columns.Type,{"Source.Name"},tokeep.Filter,{"Source.Name"},"Filtered Rows",JoinKind.Inner),
table.Remove = Table.RemoveColumns(allrows.Retrieve,{"Filtered Rows"})
in
table.Remove
 

Some videos you may like

This Week's Hot Topics

Top