Power Query: Delete this, Delete those, or delete nothing?


February 25, 2020 - by

Note

This is one of a series of articles detailing solutions sent in for the Podcast 2316 challenge.

I am having a great time going through the 29 different solutions to the Data Cleansing challenge.

In my solution, and in many solutions, I came down to the point where I wanted to delete the subtotal row.

Need to delete some rows
Need to delete some rows


So far, though, there have been two very different ways of dealing with this:

  • Chris McNeil used Pivot Quarters, Fill Up, and kept only the rows I deleted!
  • Michael Karpfen kept everything. Why delete the total rows and add them back later as a total column?

Chris McNeil’s Method

Chris added the Index and Modulo column as described here.

But then he tool the Modulo and pivoted it to go across the columns.

Chris McNeil’s Method
Chris McNeil’s Method

This left a bunch of null values, which he filled with Fill Up.

Later, he kept only the employee names. In real life, his method is not going to work as written, because you will have real employee names like Andy, Bob, Charlie instead of Employee1, Employee2, Employee3, but there are other ways to identify those rows to keep.

Here is Chris’s code

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Category Description", type text}, {"Dept. Total", type number}, {"Q1", type number}, {"Q2", type number}, {"Q3", type number}, {"Q4", Int64.Type}, {"Employee 1", type number}, {"Q12", type number}, {"Q23", type number}, {"Q34", Int64.Type}, {"Q45", Int64.Type}, {"Employee 2", Int64.Type}, {"Q16", Int64.Type}, {"Q27", Int64.Type}, {"Q38", Int64.Type}, {"Q49", Int64.Type}, {"Employee 3", Int64.Type}, {"Q110", Int64.Type}, {"Q211", Int64.Type}, {"Q312", Int64.Type}, {"Q413", Int64.Type}, {"Employee 4", type number}, {"Q114", type number}, {"Q215", type number}, {"Q316", type number}, {"Q417", Int64.Type}}),
    #"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Changed Type", {"Category Description"}, "Attribute", "Value"),
    #"Added Index" = Table.AddIndexColumn(#"Unpivoted Other Columns", "Index", 0, 1),
    #"Inserted Modulo" = Table.AddColumn(#"Added Index", "Modulo", each Number.Mod([Index], 5), type number),
    #"Pivoted Column" = Table.Pivot(Table.TransformColumnTypes(#"Inserted Modulo", {{"Modulo", type text}}, "en-US"), List.Distinct(Table.TransformColumnTypes(#"Inserted Modulo", {{"Modulo", type text}}, "en-US")[Modulo]), "Modulo", "Value"),
    #"Sorted Rows" = Table.Sort(#"Pivoted Column",{{"Index", Order.Ascending}}),
    #"Filled Up" = Table.FillUp(#"Sorted Rows",{"0", "1", "2", "3", "4"}),
    #"Filtered Rows" = Table.SelectRows(#"Filled Up", each ([Attribute] = "Employee 1" or [Attribute] = "Employee 2" or [Attribute] = "Employee 3" or [Attribute] = "Employee 4")),
    #"Sorted Rows1" = Table.Sort(#"Filtered Rows",{{"Attribute", Order.Ascending}, {"Category Description", Order.Ascending}}),
    #"Reordered Columns" = Table.ReorderColumns(#"Sorted Rows1",{"Category Description", "Attribute", "Index", "1", "2", "3", "4", "0"}),
    #"Renamed Columns" = Table.RenameColumns(#"Reordered Columns",{{"0", "Total"}, {"1", "Q1"}, {"2", "Q2"}, {"3", "Q3"}, {"4", "Q4"}}),
    #"Removed Columns" = Table.RemoveColumns(#"Renamed Columns",{"Index"})
in
    #"Removed Columns"

Michael Karpfen’s Method

Michael had a Modulo column like many other solutions. Where most people would delete Modulo = 0, Michael kept it, but added a label of TOTAL. If the Modulo is 0, call it Total, otherwise use the name from the Quarter column. This is clever.

Michael Karpfen’s Method
Michael Karpfen’s Method

That column on the right later was pivoted to go across.

Michael then did a reordered columns to move the Totals to the far right side.

Move to end
Move to end

Here is Michael’s code:

let
    Quelle = Excel.CurrentWorkbook(){[Name="UglyData"]}[Content],
    #"Höher gestufte Header" = Table.PromoteHeaders(Quelle, [PromoteAllScalars=true]),
    #"Geänderter Typ" = Table.TransformColumnTypes(#"Höher gestufte Header",{{"Category Description", type text}, {"Dept. Total", type number}, {"Q1", type number}, {"Q2", type number}, {"Q3", type number}, {"Q4", Int64.Type}, {"Employee 1", type number}, {"Q1_1", type number}, {"Q2_2", type number}, {"Q3_3", Int64.Type}, {"Q4_4", Int64.Type}, {"Employee 2", Int64.Type}, {"Q1_5", Int64.Type}, {"Q2_6", Int64.Type}, {"Q3_7", Int64.Type}, {"Q4_8", Int64.Type}, {"Employee 3", Int64.Type}, {"Q1_9", Int64.Type}, {"Q2_10", Int64.Type}, {"Q3_11", Int64.Type}, {"Q4_12", Int64.Type}, {"Employee 4", type number}, {"Q1_13", type number}, {"Q2_14", type number}, {"Q3_15", type number}, {"Q4_16", Int64.Type}}),
    #"Entpivotierte andere Spalten" = Table.UnpivotOtherColumns(#"Geänderter Typ", {"Category Description"}, "Attribut", "Wert"),
    #"Hinzugefügter Index" = Table.AddIndexColumn(#"Entpivotierte andere Spalten", "Index", 1, 1),
    #"Hinzugefügte benutzerdefinierte Spalte" = Table.AddColumn(#"Hinzugefügter Index", "Benutzerdefiniert", each Number.Mod([Index]-1,5)),
    #"Hinzugefügte benutzerdefinierte Spalte1" = Table.AddColumn(#"Hinzugefügte benutzerdefinierte Spalte", "Benutzerdefiniert.1", each if [Benutzerdefiniert]=0 then [Attribut] else null),
    #"Nach unten gefüllt" = Table.FillDown(#"Hinzugefügte benutzerdefinierte Spalte1",{"Benutzerdefiniert.1"}),
    #"Neu angeordnete Spalten" = Table.ReorderColumns(#"Nach unten gefüllt",{"Benutzerdefiniert.1", "Attribut", "Category Description", "Wert", "Index", "Benutzerdefiniert"}),
    #"Geänderter Typ1" = Table.TransformColumnTypes(#"Neu angeordnete Spalten",{{"Benutzerdefiniert", type text}}),
    #"Hinzugefügte benutzerdefinierte Spalte2" = Table.AddColumn(#"Geänderter Typ1", "Benutzerdefiniert.2", each if [Benutzerdefiniert] = "0" then "TOTAL" else "Q"&[Benutzerdefiniert]),
    #"Neu angeordnete Spalten1" = Table.ReorderColumns(#"Hinzugefügte benutzerdefinierte Spalte2",{"Attribut", "Category Description", "Benutzerdefiniert.1", "Wert", "Index", "Benutzerdefiniert", "Benutzerdefiniert.2"}),
    #"Entfernte Spalten" = Table.RemoveColumns(#"Neu angeordnete Spalten1",{"Attribut", "Index", "Benutzerdefiniert"}),
    #"Pivotierte Spalte" = Table.Pivot(#"Entfernte Spalten", List.Distinct(#"Entfernte Spalten"[Benutzerdefiniert.2]), "Benutzerdefiniert.2", "Wert", List.Sum),
    #"Neu angeordnete Spalten2" = Table.ReorderColumns(#"Pivotierte Spalte",{"Benutzerdefiniert.1", "Category Description", "Q1", "Q2", "Q3", "Q4", "TOTAL"}),
    #"Sortierte Zeilen" = Table.Sort(#"Neu angeordnete Spalten2",{{"Benutzerdefiniert.1", Order.Ascending}}),
    #"Umbenannte Spalten" = Table.RenameColumns(#"Sortierte Zeilen",{{"Benutzerdefiniert.1", "Employee Name"}})
in
    #"Umbenannte Spalten"

Return to the main page for the Podcast 2316 challenge.

Read the next article in this series: Power Query: Isolating the Rows that represent Quarters.