Help using power query and a mapping table to append multiple worksheets and files to one file

Stacker

Board Regular
Joined
Jul 11, 2021
Messages
87
Office Version
  1. 365
Platform
  1. Windows
I am following this guide and I successfully did the first 5 or so minutes and created a mapping table. I have three files, a blank template, and two files with dummy data.

Template

https://imgur.com/RfQ9joM

File 1


File 2


I want to append the contents of file 1 and 2, to the blank template. When i try to transform the sample file I used this formula.

= Table.TransformColumnNames(Table3_Table, each

List.Accumulate(Table.ToRecords(Table23),_,

(state, Current)=> Text.Replace(Text.Upper(state),current

[BEFORE], current [AFTER]) ))

Which presents this error:

Expression.Error: The name 'Table3_Table' wasn't recognized. Make sure it's spelled correctly.

Here is the source part of the transformation




Top is the header table but without any content

Table3 is the template but without any info so just headings of the final doc



Table 23 is the mapping table mentioned earlier in the vid.

What do i do?
 
Here is the transformation of the sample file I did

= Table.TransformColumnNames(Table1_Table, each List.Accumulate(Table.ToRecords(Table23), _ , (state, current)=>Text.Replace((state), current [BEFORE],current[AFTER]) ))

I get the subsequent error

Expression.Error: The column 'Ethnicity' of the table wasn't found.
Details:
Ethnicity

I played around with the final reddit query and it produces this



Book1
ABCDEFGHIJKLMNO
1NHS NumberDOBEthnicityPatientAddress1PatientAddress2Column1PatientAddress3PatientAddress4PatientPostcodeHospitalDateAdmittedDateTestedPositiveDateDischarged (if applicable)DischargeDestinationPatient Currently Admitted At Report Generation Time
212318/06/1935WHITE - BRITISH123 BUSY LANEDEWSBURYYorkWK12 123SACRED HEART01/01/202111/01/202112/01/2021NOT APPLICABLE -PATIENT DIED/STILL BIRTHN
312415/09/1944WHITE - BRITISH456 Bee DriveBridgeYorkWK12 124ARKHAM01/01/202101/01/202104/01/2021NOT APPLICABLE -PATIENT DIED/STILL BIRTHN
412518/06/1936Black124 BUSY LANEDEWSBURYTexasWK12 125SACRED HEART01/01/202111/01/202212/01/2022NOT APPLICABLE -PATIENT DIED/STILL BIRTH
512615/09/1945Black457 Bee DriveBridgeAtlantaWK12 126ARKHAM01/01/202101/01/202104/01/2022NOT APPLICABLE -PATIENT DIED/STILL BIRTH
612718/06/1937Black125 BUSY LANEDEWSBURYLondonWK12 127SACRED HEART01/01/202111/01/202312/01/2023NOT APPLICABLE -PATIENT DIED/STILL BIRTH
7NSacred Heart24/09/2021
8NSt Judes25/09/2021
9Mash26/09/2021
10Chicago Hope27/09/2021
11St Elmo28/09/2021
12
Sheet2
 
Last edited:
Upvote 0

Excel Facts

Whats the difference between CONCAT and CONCATENATE?
The newer CONCAT function can reference a range of cells. =CONCATENATE(A1,A2,A3,A4,A5) becomes =CONCAT(A1:A5)
I have abandoned using Leila's method. For your purposes it has a number of issues.
  1. It has a bug in it. It is using an unqualified Replace, and in your scenario,
    in Reddit1 you only want PatientAddress1 mapped but the Reddit2 mapping of PatientAddress is being applied to Reddit1's PatientAddress1, 2, 3 & 4

  2. The 2nd issue is, I assume that you want all column in your template in the output.
    This is easier to do using a different approach.
Can you review your Template, these columns seem to be duplicates:-
  • DateAdmitted & Admitted_date
  • Date discharged (if applicable) & Discharged_date
  • These could be different but you need to check
    Date tested Positive & Tested_date
If you are happy to change tack I can try and talk you through it but it does have quite a few moving parts, although conceptually simpler.

Essentially
  • Load from Current Workbook - Mapping table
  • Load from Current Workbook -Template
  • Load from folder both Reddit files
    • Create an index
    • Unpivot it
    • Map the columns
    • Pivot it back (using the index)
  • Using the Template Query Append Reddit file query
    • Filter out the single template row loaded
 
Last edited:
Upvote 0
I have abandoned using Leila's method. For your purposes it has a number of issues.
  1. It has a bug in it. It is using an unqualified Replace, and in your scenario,
    in Reddit1 you only want PatientAddress1 mapped but the Reddit2 mapping of PatientAddress is being applied to Reddit1's PatientAddress1, 2, 3 & 4

  2. The 2nd issue is, I assume that you want all column in your template in the output.
    This is easier to do using a different approach.
Can you review your Template, these columns seem to be duplicates:-
  • DateAdmitted & Admitted_date
  • Date discharged (if applicable) & Discharged_date
  • These could be different but you need to check
    Date tested Positive & Tested_date
If you are happy to change tack I can try and talk you through it but it does have quite a few moving parts, although conceptually simpler.

Essentially
  • Load from Current Workbook - Mapping table
  • Load from Current Workbook -Template
  • Load from folder both Reddit files
    • Create an index
    • Unpivot it
    • Map the columns
    • Pivot it back (using the index)
  • Using the Template Query Append Reddit file query
    • Filter out the single template row loaded
My supervisor said the duplicate columns are needed for other things and it's intentional. When you say Load from Current Workbook- Mapping Table do you mean create connections or Data- Get Data- From Workbook- Mapping Table table- as connection? Or do you mean actually load it onto a sheet? Anyhow, I created an index column, unpivoted said column but I am a little unsure how to map said columns
 
Upvote 0
Mapping table
Get Data > from range.
Connection only.

Reddit files
Get Data > from folder
Connection only
Add index column

Merge > Left outer join to mapping query.
Join on BEFORE column
Pull AFTER value into the query.
 
Upvote 0
Mapping table
Get Data > from range.
Connection only.

Reddit files
Get Data > from folder
Connection only
Add index column

Merge > Left outer join to mapping query.
Join on BEFORE column
Pull AFTER value into the query.
I tried to merge the index column and ok is greyed out but the other columns aren't, Is it cos I combined the two reddit files in the previous step?
 
Upvote 0
I am going to have to wind this up.

Here is my output. I can give you the code to make to do it but it is multiple queries.
You would need to be prepared to use these table names:
  • tblMapping - Mapping Table
  • tblTemplate - Template Table - NHS Number on first line called "TemplateRow1"
  • Reddit Files - Table1
The assuming is that the tblMapping, tblTemplate and the output is all in the same workbook (which is also the workbook with the queries).

I have not used any manually modified formulas, to produce the below it has been via the ribbon interface.

Main Incl Mapping Param1 v06.xlsx
IJKLMNOPQRSTUVWXYZAAAB
17NHS NumberDOBEthnicity (from Hospital)EthnicityPatientAddress1PostcodePatient UPRNPatient TertiaryClassificationHospitalDateAdmittedDate tested PositiveDate discharged (if applicable)Discharge DestinationInpatientAdmissionTimeInpatientDischargeTimeTrustCombinedAdmitted_dateTested_dateDischarged_date
1812318/06/1935WHITE - BRITISH123 BUSY LANEWK12 123SACRED HEART1/01/202111/01/202112/01/2021NOT APPLICABLE -PATIENT DIED/STILL BIRTHNOT APPLICABLE -PATIENT DIED/STILL BIRTH
1912415/09/1944WHITE - BRITISH456 Bee DriveWK12 124ARKHAM1/01/20211/01/20214/01/2021NOT APPLICABLE -PATIENT DIED/STILL BIRTHNOT APPLICABLE -PATIENT DIED/STILL BIRTH
2012518/06/1936Black124 BUSY LANEWK12 125SACRED HEART1/01/202111/01/202212/01/2022NOT APPLICABLE -PATIENT DIED/STILL BIRTHNOT APPLICABLE -PATIENT DIED/STILL BIRTH
2112615/09/1945Black457 Bee DriveWK12 126ARKHAM1/01/20211/01/20214/01/2022NOT APPLICABLE -PATIENT DIED/STILL BIRTHNOT APPLICABLE -PATIENT DIED/STILL BIRTH
2212718/06/1937Black125 BUSY LANEWK12 127SACRED HEART1/01/202111/01/202312/01/2023NOT APPLICABLE -PATIENT DIED/STILL BIRTHNOT APPLICABLE -PATIENT DIED/STILL BIRTH
232468918/06/1935WHITE - BRITISH123 BUSY LANEWK12 123Sacred Heart24/09/20211/01/2021123 Random
242469015/09/1944WHITE - BRITISH456 Bee DriveWK12 124St Judes25/09/20211/01/2021111 Random
252469118/06/1936Black124 BUSY LANEWK12 125Mash26/09/20211/01/2021333 Random
262469215/09/1945Black457 Bee DriveWK12 126Chicago Hope27/09/20211/01/2021444 Random
272469318/06/1937Black125 BUSY LANEWK12 127St Elmo28/09/20211/01/2021555 Random
Mapping
 
Upvote 0
I am going to have to wind this up.

Here is my output. I can give you the code to make to do it but it is multiple queries.
You would need to be prepared to use these table names:
  • tblMapping - Mapping Table
  • tblTemplate - Template Table - NHS Number on first line called "TemplateRow1"
  • Reddit Files - Table1
The assuming is that the tblMapping, tblTemplate and the output is all in the same workbook (which is also the workbook with the queries).

I have not used any manually modified formulas, to produce the below it has been via the ribbon interface.

Main Incl Mapping Param1 v06.xlsx
IJKLMNOPQRSTUVWXYZAAAB
17NHS NumberDOBEthnicity (from Hospital)EthnicityPatientAddress1PostcodePatient UPRNPatient TertiaryClassificationHospitalDateAdmittedDate tested PositiveDate discharged (if applicable)Discharge DestinationInpatientAdmissionTimeInpatientDischargeTimeTrustCombinedAdmitted_dateTested_dateDischarged_date
1812318/06/1935WHITE - BRITISH123 BUSY LANEWK12 123SACRED HEART1/01/202111/01/202112/01/2021NOT APPLICABLE -PATIENT DIED/STILL BIRTHNOT APPLICABLE -PATIENT DIED/STILL BIRTH
1912415/09/1944WHITE - BRITISH456 Bee DriveWK12 124ARKHAM1/01/20211/01/20214/01/2021NOT APPLICABLE -PATIENT DIED/STILL BIRTHNOT APPLICABLE -PATIENT DIED/STILL BIRTH
2012518/06/1936Black124 BUSY LANEWK12 125SACRED HEART1/01/202111/01/202212/01/2022NOT APPLICABLE -PATIENT DIED/STILL BIRTHNOT APPLICABLE -PATIENT DIED/STILL BIRTH
2112615/09/1945Black457 Bee DriveWK12 126ARKHAM1/01/20211/01/20214/01/2022NOT APPLICABLE -PATIENT DIED/STILL BIRTHNOT APPLICABLE -PATIENT DIED/STILL BIRTH
2212718/06/1937Black125 BUSY LANEWK12 127SACRED HEART1/01/202111/01/202312/01/2023NOT APPLICABLE -PATIENT DIED/STILL BIRTHNOT APPLICABLE -PATIENT DIED/STILL BIRTH
232468918/06/1935WHITE - BRITISH123 BUSY LANEWK12 123Sacred Heart24/09/20211/01/2021123 Random
242469015/09/1944WHITE - BRITISH456 Bee DriveWK12 124St Judes25/09/20211/01/2021111 Random
252469118/06/1936Black124 BUSY LANEWK12 125Mash26/09/20211/01/2021333 Random
262469215/09/1945Black457 Bee DriveWK12 126Chicago Hope27/09/20211/01/2021444 Random
272469318/06/1937Black125 BUSY LANEWK12 127St Elmo28/09/20211/01/2021555 Random
Mapping

Thank you, how do i implement the code?
 
Upvote 0
This is how my queries are organised and named.
For the reddit get from folder. You are best off creating the structure first by importing the file manually.
Filtering on files starting with Reddit and exploding Binary.

If you start with a clean slate hopefully the files will come out with the same names as the below for the Transform files.

Create a blank query for tblMapping & FinalRptTemplateFormat and drop in the code.

Also drop in the code for the other 2 shown below.

For you to minimise changes use the table names I mentioned previously.

1627220658407.png


tblMapping

Power Query:
let
    Source = Excel.CurrentWorkbook(){[Name="tblMapping"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"BEFORE", type text}, {"AFTER", type text}})
in
    #"Changed Type"

Transform Sample File

Power Query:
let
    Source = Excel.Workbook(Parameter1, null, true),
    Table1_Table = Source{[Item="Table1",Kind="Table"]}[Data],
    #"Added Index" = Table.AddIndexColumn(Table1_Table, "Index", 0, 1, Int64.Type),
    #"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Added Index", {"Index"}, "Attribute", "Value"),
    #"Merged Queries" = Table.NestedJoin(#"Unpivoted Other Columns", {"Attribute"}, tblMapping, {"BEFORE"}, "tblMapping", JoinKind.LeftOuter),
    #"Expanded tblMapping" = Table.ExpandTableColumn(#"Merged Queries", "tblMapping", {"AFTER"}, {"AFTER"}),
    #"Filtered Rows" = Table.SelectRows(#"Expanded tblMapping", each ([AFTER] <> null)),
    #"Sorted Rows" = Table.Sort(#"Filtered Rows",{{"Index", Order.Ascending}}),
    #"Removed Columns" = Table.RemoveColumns(#"Sorted Rows",{"Attribute"}),
    #"Pivoted Column" = Table.Pivot(#"Removed Columns", List.Distinct(#"Removed Columns"[AFTER]), "AFTER", "Value")
in
    #"Pivoted Column"

CombinedRedditFiles - You will need to change the source folder here

Power Query:
let
    Source = Folder.Files("C:\Users\QuestionNaming"),
    #"Filtered Rows" = Table.SelectRows(Source, each Text.StartsWith([Name], "Reddit")),
    #"Filtered Hidden Files1" = Table.SelectRows(#"Filtered Rows", each [Attributes]?[Hidden]? <> true),
    #"Invoke Custom Function1" = Table.AddColumn(#"Filtered Hidden Files1", "Transform File", each #"Transform File"([Content])),
    #"Renamed Columns1" = Table.RenameColumns(#"Invoke Custom Function1", {"Name", "Source.Name"}),
    #"Removed Other Columns1" = Table.SelectColumns(#"Renamed Columns1", {"Source.Name", "Transform File"}),
    #"Expanded Table Column1" = Table.ExpandTableColumn(#"Removed Other Columns1", "Transform File", Table.ColumnNames(#"Transform File"(#"Sample File"))),
    #"Removed Columns" = Table.RemoveColumns(#"Expanded Table Column1",{"Source.Name", "Index"})
in
    #"Removed Columns"


FinalRptTemplateFormat

Power Query:
let
    Source = Excel.CurrentWorkbook(){[Name="tblTemplate"]}[Content],
    #"Appended Query" = Table.Combine({Source, CombinedRedditFiles}),
    #"Filtered Rows" = Table.SelectRows(#"Appended Query", each ([NHS Number] <> "TemplateRow1")),
    #"Changed Type" = Table.TransformColumnTypes(#"Filtered Rows",{{"NHS Number", Int64.Type}, {"DOB", type date}, {"Ethnicity (from Hospital)", type text}, {"Ethnicity", type any}, {"PatientAddress1", type text}, {"Postcode", type text}, {"Patient UPRN", type any}, {"Patient TertiaryClassification", type any}, {"Hospital", type text}, {"DateAdmitted", type date}, {"Date tested Positive", type date}, {"Date discharged (if applicable)", type date}, {"Discharge Destination", type text}, {"InpatientAdmissionTime", type text}, {"InpatientDischargeTime", type any}, {"Trust", type any}, {"Combined", type any}, {"Admitted_date", type any}, {"Tested_date", type any}, {"Discharged_date", type any}})
in
    #"Changed Type"
 
Upvote 0
My Template table looks like this:

Main Incl Mapping Param1 v06.xlsx
ABCDEFGHIJKLMNOPQRST
3NHS NumberDOBEthnicity (from Hospital)EthnicityPatientAddress1PostcodePatient UPRNPatient TertiaryClassificationHospitalDateAdmittedDate tested PositiveDate discharged (if applicable)Discharge DestinationInpatientAdmissionTimeInpatientDischargeTimeTrustCombinedAdmitted_dateTested_dateDischarged_date
4TemplateRow1
Template


Your mapping table also still has issues to use this:-

Main Incl Mapping Param1 v06.xlsx
AB
1BEFOREAFTER
2NHS NumberNHS Number
3DOBDOB
4EthnicityEthnicity (from Hospital)
5PatientAddressPatientAddress1
6PatientAddress1PatientAddress1
7PatientPostcodePostcode
8HospitalHospital
9DateAdmittedDateAdmitted
10DateTestedPositiveDate tested Positive
11DateDischarged (if applicable)Date discharged (if applicable)
12DischargeDestinationDischarge Destination
13DateOfBirthDOB
14EthnicCategoryEthnicity (from Hospital)
15PostcodePostcode
16DateDischargedDate discharged (if applicable)
17AddressDischargedToDischarge Destination
18Patient Currently Admitted At Report Generation TimeInpatientAdmissionTime
19NHSNumberNHS Number
20Patient UPRNPatient UPRN
21Patient TertiaryClassificationPatient TertiaryClassification
22InpatientDischargeTimeInpatientDischargeTime
23TrustTrust
24CombinedCombined
25Admitted_dateAdmitted_date
26Tested_dateTested_date
27Discharged_dateDischarged_date
Mapping
 
Upvote 0

Forum statistics

Threads
1,215,432
Messages
6,124,860
Members
449,194
Latest member
HellScout

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top