Best Method of "Cleaning-Up" Excel Report from Payroll Software

goob90

New Member
Joined
Nov 12, 2021
Messages
32
Office Version
  1. 365
Platform
  1. Windows
Hello!

I am not really sure how to ask what I need to do, other than ask for suggestions. I have the report that I have to download monthly and compare it with an insurance report from our vendor. The payroll software we use is outdated, but I have no other choice than to use it.

Right now, the file is an .xls structured as:
1637094096776.png

Ideally, I would like to have the file structured smiliar to:
1637094038892.png

I've included a link for a sample of the workbook. Keep in mind, I've only used one employee for the example (we have hundreds).


The task is very time-consuming and I know there has to be a way I can automate it. I typically use Power Query for these types of tasks, but I cannot not seem to get a suitable format.

Any help is greatly appreciated. Thanks!
 

Excel Facts

Last used cell?
Press Ctrl+End to move to what Excel thinks is the last used cell.

severynm

Active Member
Joined
Jan 8, 2021
Messages
335
Office Version
  1. 365
Platform
  1. Windows
Hello!

How does this work for you? Swap out the initial source path to whatever you need.

The data is pretty messy coming in, so I ended up just adding new columns in with some logic to rebuild up the Employee Name, Check Date, and Amount columns from scratch. Some filling of values up and down and then filtering out blanks gets us most of the way there. Add in some last steps to clean up and format, and the result looks pretty good to me.
Power Query:
let
    Source = Excel.Workbook(File.Contents("C:\Users\########\Desktop\data.xlsx"), null, true),
    Sheet1_Sheet = Source{[Item="Sheet1",Kind="Sheet"]}[Data],
    #"Removed Top Rows" = Table.Skip(Sheet1_Sheet,13),
    #"Removed Columns" = Table.RemoveColumns(#"Removed Top Rows",{"Column3", "Column4", "Column6", "Column7", "Column9", "Column10", "Column11", "Column13", "Column15", "Column5", "Column8", "Column12"}),
    #"Added Col 'Employee'" = Table.AddColumn(#"Removed Columns", "Employee", each if [Column2] <> null and Text.Contains([Column2],"-") then [Column2] else null, type text),
    #"Replaced Errors in 'Employee'" = Table.ReplaceErrorValues(#"Added Col 'Employee'", {{"Employee", null}}),
    #"Filled 'Employee' Up" = Table.FillUp(#"Replaced Errors in 'Employee'",{"Employee"}),
    #"Filled 'Employee' Down" = Table.FillDown(#"Filled 'Employee' Up",{"Employee"}),
    #"Added Col 'Check Date'" = Table.AddColumn(#"Filled 'Employee' Down", "Check Date", each if [Column2] is date then [Column2] else null, type date),
    #"Added Col 'Amount'" = Table.AddColumn(#"Added Col 'Check Date'", "Amount", each if [Column2] is date then [Column14] else null, type number),
    #"Removed Columns 2" = Table.RemoveColumns(#"Added Col 'Amount'",{"Column2", "Column14"}),
    #"Renamed Col 'Benefit'" = Table.RenameColumns(#"Removed Columns 2",{{"Column1", "Benefit"}}),
    #"Changed Type Col 'Benefit'" = Table.TransformColumnTypes(#"Renamed Col 'Benefit'",{{"Benefit", type text}}),
    #"Filled Down 'Benefit'" = Table.FillDown(#"Changed Type Col 'Benefit'",{"Benefit"}),
    #"Filtered Rows 'Check Date != null'" = Table.SelectRows(#"Filled Down 'Benefit'", each ([Check Date] <> null)),
    #"Split Col 'Employee'" = Table.SplitColumn(#"Filtered Rows 'Check Date != null'", "Employee", Splitter.SplitTextByDelimiter(" - ", QuoteStyle.Csv), {"Employee Number", "Employee Name"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Split Col 'Employee'",{{"Employee Number", Int64.Type}, {"Employee Name", type text}}),
    #"Extracted Text Before Delimiter 'Benefit'" = Table.TransformColumns(#"Changed Type", {{"Benefit", each Text.BeforeDelimiter(_, ":"), type text}})
in
    #"Extracted Text Before Delimiter 'Benefit'"

1637185914367.png
 
Solution

goob90

New Member
Joined
Nov 12, 2021
Messages
32
Office Version
  1. 365
Platform
  1. Windows
Hello!

How does this work for you? Swap out the initial source path to whatever you need.

The data is pretty messy coming in, so I ended up just adding new columns in with some logic to rebuild up the Employee Name, Check Date, and Amount columns from scratch. Some filling of values up and down and then filtering out blanks gets us most of the way there. Add in some last steps to clean up and format, and the result looks pretty good to me.
Power Query:
let
    Source = Excel.Workbook(File.Contents("C:\Users\########\Desktop\data.xlsx"), null, true),
    Sheet1_Sheet = Source{[Item="Sheet1",Kind="Sheet"]}[Data],
    #"Removed Top Rows" = Table.Skip(Sheet1_Sheet,13),
    #"Removed Columns" = Table.RemoveColumns(#"Removed Top Rows",{"Column3", "Column4", "Column6", "Column7", "Column9", "Column10", "Column11", "Column13", "Column15", "Column5", "Column8", "Column12"}),
    #"Added Col 'Employee'" = Table.AddColumn(#"Removed Columns", "Employee", each if [Column2] <> null and Text.Contains([Column2],"-") then [Column2] else null, type text),
    #"Replaced Errors in 'Employee'" = Table.ReplaceErrorValues(#"Added Col 'Employee'", {{"Employee", null}}),
    #"Filled 'Employee' Up" = Table.FillUp(#"Replaced Errors in 'Employee'",{"Employee"}),
    #"Filled 'Employee' Down" = Table.FillDown(#"Filled 'Employee' Up",{"Employee"}),
    #"Added Col 'Check Date'" = Table.AddColumn(#"Filled 'Employee' Down", "Check Date", each if [Column2] is date then [Column2] else null, type date),
    #"Added Col 'Amount'" = Table.AddColumn(#"Added Col 'Check Date'", "Amount", each if [Column2] is date then [Column14] else null, type number),
    #"Removed Columns 2" = Table.RemoveColumns(#"Added Col 'Amount'",{"Column2", "Column14"}),
    #"Renamed Col 'Benefit'" = Table.RenameColumns(#"Removed Columns 2",{{"Column1", "Benefit"}}),
    #"Changed Type Col 'Benefit'" = Table.TransformColumnTypes(#"Renamed Col 'Benefit'",{{"Benefit", type text}}),
    #"Filled Down 'Benefit'" = Table.FillDown(#"Changed Type Col 'Benefit'",{"Benefit"}),
    #"Filtered Rows 'Check Date != null'" = Table.SelectRows(#"Filled Down 'Benefit'", each ([Check Date] <> null)),
    #"Split Col 'Employee'" = Table.SplitColumn(#"Filtered Rows 'Check Date != null'", "Employee", Splitter.SplitTextByDelimiter(" - ", QuoteStyle.Csv), {"Employee Number", "Employee Name"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Split Col 'Employee'",{{"Employee Number", Int64.Type}, {"Employee Name", type text}}),
    #"Extracted Text Before Delimiter 'Benefit'" = Table.TransformColumns(#"Changed Type", {{"Benefit", each Text.BeforeDelimiter(_, ":"), type text}})
in
    #"Extracted Text Before Delimiter 'Benefit'"

View attachment 51513
Thank you so much!

I am running into a little issue:
1637257875011.png


I may not be inserting it right or missing something.

I import the worksheet in Power Query and select sheet one and "Transform Data"
1637257955404.png

I then open the advanced editor and see:
1637257979921.png

I then insert the code keeping the same source that my workbook has:


1637258125389.png


Any idea what I am doing wrong?
 

severynm

Active Member
Joined
Jan 8, 2021
Messages
335
Office Version
  1. 365
Platform
  1. Windows
Look closely - the navigation step (the second line of code after let) is different in your last two pictures. My code used an xlsx file for the import, so its possible that step might be slightly different for xls files. Try to use the navigation step you get after connecting to the xls file yourself, and then add the remaining parts I provided.
 

goob90

New Member
Joined
Nov 12, 2021
Messages
32
Office Version
  1. 365
Platform
  1. Windows
Look closely - the navigation step (the second line of code after let) is different in your last two pictures. My code used an xlsx file for the import, so its possible that step might be slightly different for xls files. Try to use the navigation step you get after connecting to the xls file yourself, and then add the remaining parts I provided.
You are a Godsend. Thank you so much. I resaved my file as an xlsx and that did the trick.
 

Forum statistics

Threads
1,181,848
Messages
5,932,442
Members
436,838
Latest member
rparthireddy

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Top