How to group data based on header

learningpython

New Member
Joined
Oct 10, 2012
Messages
17
Hi Experts,

Note: This question was asked in Power Query: Grouping Data based on Color of the font or other ways - Microsoft Q&A Unfortunately they do not support power query anymore. so I have come back to my usual place of Excel to see if you support power query and delighted to see have a power query group.

Please can you help me with this. I have some .pdf presented to me and I have only Office 16 power query so I cannot read .PDF so I copied and pasted them in excel
This is what I want

The raw data in pdf has a Header with 10s of listings under each Header and about 10000 headers.
I have only office 16 professional and cannot read PDF so I copied it to a .xls file.

Example (sorry silly one!)
Purchase Order
Customer
Bill
receipt

Address
Door / Apt #
Street
City
State
Zip

etc long list

Now when I extract to .xls using copy and paste, I get the font but my challenge is the following

I need to Create a Power Pivot for each of the Header so I get the elements associated.
Example: When I select " Purchase Order ", I must get a listing of
"Customer
Bill
receipt "


one way is to add a column with a flash fill of associated Header but How can I do it in Power Query based on Font ?
Or any other best ways you advise would be appreciated.
 

Attachments

  • OutputExpected.PNG
    OutputExpected.PNG
    5 KB · Views: 11
  • Sample_Data.PNG
    Sample_Data.PNG
    3.7 KB · Views: 11

Excel Facts

Repeat Last Command
Pressing F4 adds dollar signs when editing a formula. When not editing, F4 repeats last command.
We cannot manipulate data in pictures. Please reload your data using XL2BB. See my signature for instructions.
 
Upvote 0
Hi Experts,

Note: This question was asked in Power Query: Grouping Data based on Color of the font or other ways - Microsoft Q&A Unfortunately they do not support power query anymore. so I have come back to my usual place of Excel to see if you support power query and delighted to see have a power query group.

Please can you help me with this. I have some .pdf presented to me and I have only Office 16 power query so I cannot read .PDF so I copied and pasted them in excel
This is what I want

The raw data in pdf has a Header with 10s of listings under each Header and about 10000 headers.
I have only office 16 professional and cannot read PDF so I copied it to a .xls file.

Example (sorry silly one!)
Purchase Order
Customer
Bill
receipt

Address
Door / Apt #
Street
City
State
Zip

etc long list

Now when I extract to .xls using copy and paste, I get the font but my challenge is the following

I need to Create a Power Pivot for each of the Header so I get the elements associated.
Example: When I select " Purchase Order ", I must get a listing of
"Customer
Bill
receipt "


one way is to add a column with a flash fill of associated Header but How can I do it in Power Query based on Font ?
Or any other best ways you advise would be appreciated.

We cannot manipulate data in pictures. Please reload your data using XL2BB. See my signature for instructions.
GroupingwithaFONT_OR_Color.xlsx
ABCD
1Input
2Purchase OrderNote: Header could be different font or color
3Customer
4Bill
5receipt
6AddressNote: Header could be different font or color
7Door / Apt #
8Street
9City
10State
11Zip
12
13Output Expected Example 1Output Expected Example 2 etc
14Purchase OrderAddress
15CustomerDoor / Apt #
16BillStreet
17receiptCity
18State
19Zip
20
Input


Apologies, Doing this first time, I hope I have followed instructions
 
Upvote 0
I can see your data. Power Query does not recognize color nor formatting. I have no idea how to help you with the current layout and your expected results. Good Luck finding a solution. If you change your expected results then post back and I will relook at if there is a possible PQ solution. I would also urge you to show us 6-8 records with expected results. One record is not sufficient.
 
Upvote 0
You need a way to identify the headers such as a custom function:

VBA Code:
Function IsHeader(rng As Range) As Boolean
    If rng.Font.Bold Or rng.Font.Size > 10 Then IsHeader = True
End Function

Then you can transform a table like the blue one below into the green one using the M code below:

Book7
ABCDEFG
1Column1Column2Purchase OrderAddress
2Purchase OrderTRUECustomerDoor / Apt #
3CustomerFALSEBillStreet
4BillFALSEreceiptCity
5receiptFALSEState
6AddressTRUEZip
7Door / Apt #FALSE
8StreetFALSE
9CityFALSE
10StateFALSE
11ZipFALSE
12
Sheet1
Cell Formulas
RangeFormula
B2:B11B2=IsHeader(A2)


Power Query:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    tblType = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type logical}}),
    tbl1 = Table.AddColumn(tblType, "Headers", each if [Column2] = true then [Column1] else null),
    tbl2 = Table.FillDown(tbl1,{"Headers"}),
    tbl3 = Table.SelectRows(tbl2, each ([Column2] = false)),
    headers = List.Distinct(tbl3[Headers]),
    lstCols = List.Generate(()=>0, (x)=> x< List.Count(headers), (x)=> x+1, (x)=> Table.SelectRows(tbl3, each _[Headers] = headers{x})[Column1]),
    tbl4 = Table.FromColumns(lstCols,headers)
in
    tbl4
 
Upvote 0

Forum statistics

Threads
1,214,632
Messages
6,120,649
Members
448,975
Latest member
sweeberry

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top