[Power query] Transpose uneven data and combine multiple occurences by delimiter

Dutsj

New Member
Joined
Jan 10, 2022
Messages
14
Office Version
  1. 365
Platform
  1. Windows
So i got some rough data from an SQL, it is in JSON and it's poorly formatted. I've somehow managed to format it almost how i want it, now i'm just limited by my skills and i can't seem to find anyone with a similar issue. I've tried several things, split by list, group, transposing you name it, i just can't seem to get it right.

I've visualized some sample data because i think that explains it better. Basically i want to transpose the first columns rows into columns, and add the values. Everytime it gets to a new column 1, that means it's a new row. It's not identical, sometimes column4name will be missing, sometimes there's only 5 columnnames, other times there's duplicate columnnames that needs to be combined with a delimiter (comma and space ", "). The author and link columns are identical until the next column1, so they only need to be added once as a new column after the transposed columns.
1641824847504.png

Basically i want this data to be transformed into the table below it if possible.
ColumnNamesValuesAuthor (same value until next column1)Link (same value until next column1)
Column1Name (column1 will always be start of a new row)
1​
Author1http:/Link1.com
Column2Name
2​
Author1http:/Link1.com
Column3Name
3​
Author1http:/Link1.com
Column3Name (There can be duplicate column names per row, join by comma)
4​
Author1http:/Link1.com
Column5Name
5​
Author1http:/Link1.com
Column1Name (column1 will always be start of a new row)aAuthor2http:/Link2.com
Column2NamebAuthor2http:/Link2.com
Column3NamecAuthor2http:/Link2.com
Column4NamedAuthor2http:/Link2.com
Column5NameeAuthor2http:/Link2.com
Column6NamefAuthor2http:/Link2.com
Column7NamegAuthor2http:/Link2.com
Column7NamehAuthor2http:/Link2.com
Column7Name (There can be duplicate column names per row, join by comma)iAuthor2http:/Link2.com
Column1Name (column1 will always be start of a new row)
101​
Author3http:/Link3.com
Column2Name
102​
Author3http:/Link3.com
Column3Name
103​
Author3http:/Link3.com
Column3Name (There can be duplicate column names per row, join by comma)
104​
Author3http:/Link3.com
Column5Name
105​
Author3http:/Link3.com
Column6Name
106​
Author3http:/Link3.com
Column7Name
107​
Author3http:/Link3.com
Column8Name
108​
Author3http:/Link3.com
So from the above table to this table below
Column1NameColumn2NameColumn3NameColumn4NameColumn5NameColumn6NameColumn7NameColumn8NameAuthorLink
1​
2​
3, 4null
5​
nullnullnullAuthor1http:/Link1.com
abcdefg, h, inullAuthor2http:/Link2.com
101​
102​
103, 104null
105​
106​
107​
108​
Author3http:/Link3.com
 

Attachments

  • 1641824816509.png
    1641824816509.png
    64.6 KB · Views: 21
The query becomes much simpler and I would assume much faster if you add four helper columns to the original table as in the example below:

Book3
ABCDEFGHIJKLMNOPQRST
1ColumnNamesValuesAuthorLinkMultipleNumberJoinedKeepColumn1NameColumn2NameColumn3NameColumn5NameColumn4NameColumn6NameColumn7NameColumn8NameAuthorLink
2Column1Name1Author1http:/Link1.comFALSE01TRUE123, 45Author1http:/Link1.com
3Column2Name2Author1http:/Link1.comFALSE02TRUEabcedfg, h, iAuthor2http:/Link2.com
4Column3Name3Author1http:/Link1.comTRUE13, 4TRUE101102103, 104105106107108Author3http:/Link3.com
5Column3Name4Author1http:/Link1.comFALSE04FALSE
6Column5Name5Author1http:/Link1.comFALSE05TRUE
7Column1NameaAuthor2http:/Link2.comFALSE0aTRUE
8Column2NamebAuthor2http:/Link2.comFALSE0bTRUE
9Column3NamecAuthor2http:/Link2.comFALSE0cTRUE
10Column4NamedAuthor2http:/Link2.comFALSE0dTRUE
11Column5NameeAuthor2http:/Link2.comFALSE0eTRUE
12Column6NamefAuthor2http:/Link2.comFALSE0fTRUE
13Column7NamegAuthor2http:/Link2.comTRUE2g, h, iTRUE
14Column7NamehAuthor2http:/Link2.comTRUE1h, iFALSE
15Column7NameiAuthor2http:/Link2.comFALSE0iFALSE
16Column1Name101Author3http:/Link3.comFALSE0101TRUE
17Column2Name102Author3http:/Link3.comFALSE0102TRUE
18Column3Name103Author3http:/Link3.comTRUE1103, 104TRUE
19Column3Name104Author3http:/Link3.comFALSE0104FALSE
20Column5Name105Author3http:/Link3.comFALSE0105TRUE
21Column6Name106Author3http:/Link3.comFALSE0106TRUE
22Column7Name107Author3http:/Link3.comFALSE0107TRUE
23Column8Name108Author3http:/Link3.comFALSE0108TRUE
Sheet1
Cell Formulas
RangeFormula
E2:E23E2=A2=A3
F2:F22F2=IF(E2,MATCH(FALSE,E3:E$23,0),0)
G2:G23G2=TEXTJOIN(", ",1,OFFSET(B2,0,0,F2+1))
H3:H23H3=IF(F2,FALSE,TRUE)
F23F23=IF(E23,MATCH(FALSE,E$23:E24,0),0)


Power Query:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    FilterKeep = Table.SelectRows(Source, each [Keep]),
    RemoveCols = Table.RemoveColumns(FilterKeep,{"Values", "Multiple", "Number","Keep"}),
    Pivot = Table.Pivot(RemoveCols, List.Distinct(RemoveCols[ColumnNames]), "ColumnNames", "Joined"),
    lstReorder = let tcn = Table.ColumnNames(Pivot) in List.Skip(tcn,2) & List.FirstN(tcn,2),
    ReorderCols = Table.ReorderColumns(Pivot,lstReorder)
in
    ReorderCols
Looks good, but i forgot to mention that i'm trying to do this only in Power BI so i'm not sure how i would achieve the helper columns in power query, i'm not very skilled in it yet ?
 
Upvote 0

Excel Facts

Format cells as date
Select range and press Ctrl+Shift+3 to format cells as date. (Shift 3 is the # sign which sort of looks like a small calendar).
The query becomes much simpler and I would assume much faster if you add four helper columns to the original table as in the example below:
Looks good, but i forgot to mention that i'm trying to do this only in Power BI so i'm not sure how i would achieve the helper columns in power query, i'm not very skilled in it yet ?
Did you see my reply? I assume you don't have the answer since you didn't reply but i just wanted to make sure :)
 
Upvote 0
I think the pivot step (tbl9) may be the slow part. If you stop at tbl8 is it a lot faster?
 
Upvote 0
I think the pivot step (tbl9) may be the slow part. If you stop at tbl8 is it a lot faster?
Yea unfortunately, i've tried adding it step for step and it freezes on tbl9 after 10 minutes ?
 
Upvote 0

Forum statistics

Threads
1,214,544
Messages
6,120,126
Members
448,947
Latest member
test111

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top