Parsing pdf into excel

GScott

New Member
Joined
Oct 12, 2013
Messages
28
Office Version
  1. 365
Platform
  1. Windows
What is a good way to parse pdf into excel?
It needs to be flexible, perhaps working from copied data from the pdf.

Thanks for any help
 

Excel Facts

Wildcard in VLOOKUP
Use =VLOOKUP("Apple*" to find apple, Apple, or applesauce
Bring the PDF into Power Query Editor. Manipulate your data and close and load to Excel.
 
Upvote 0
I need to provide better quality detail as this didn't really work.
The pdf contains credit card transactions and could look like this

Date Amount Description
30/07/2019 50.00 The description
goes over a couple
of rows I need to merge

Each month the statement will have different numbers of transactions however with each transaction having 3 lines of description.
Can power query be this flexible?
Do I need to recreate the query for each month or can I repurpose?
 
Upvote 0
Just noticed my mail has formatted correctly. There are 3 columns and the description column contents runover 3 rows underneath one another
 
Upvote 0
Using XL2BB, (no pictures), upload a sample of 8-10 records and then mock up the expected results for those records.

Your current explanation is confusing and I cannot visualize your needs.
 
Upvote 0
I have removed the solution mark from this thread.
If your query has not been solved, please do not mark it as such until it has been.
 
Upvote 0
Apologies for delay. XLS2bb took some work to install. You will notice that the description appears over 2 rows - I would like to merge these descriptions into one row
Power Query Output.xlsx
ABC
1DateAmountDescription
231/10/2023$11.61Sakuramobairu
31,100.00 JPY Rate:0.010554
402/11/2023$83.23Holiday Inn Santiago A Santiago
547,016.00 CLP Rate:0.001770
602/11/2023$25.77Whittakers
728.03 NZD Rate:0.919372
802/11/2023$20.23Vantage Bar
922.00 NZD Rate:0.919545
1009/11/2023$16.06Refugio Dickson Vertic Pto. Natales
119,326.00 CLP Rate:0.001722
12
13
14Preferred way of data presentation
1531/10/2023$11.61Sakuramobairu 1,100.00 JPY Rate:0.010554
16
Sheet1
 
Upvote 0
With Power Query

Book1
ABC
1DateAmountDescription
210/31/2023$11.61Sakuramobairu
31,100.00 JPY Rate:0.010554
411/2/2023$83.23Holiday Inn Santiago A Santiago
547,016.00 CLP Rate:0.001770
611/2/2023$25.77Whittakers
728.03 NZD Rate:0.919372
811/2/2023$20.23Vantage Bar
922.00 NZD Rate:0.919545
1011/9/2023$16.06Refugio Dickson Vertic Pto. Natales
119,326.00 CLP Rate:0.001722
12
13
14DateAmountNew Description
1510/31/202311.61Sakuramobairu 1,100.00 JPY Rate:0.010554
1611/2/202383.23Holiday Inn Santiago A Santiago 47,016.00 CLP Rate:0.001770
1711/2/202325.77Whittakers 28.03 NZD Rate:0.919372
1811/2/202320.23Vantage Bar 22.00 NZD Rate:0.919545
1911/9/202316.06Refugio Dickson Vertic Pto. Natales 9,326.00 CLP Rate:0.001722
Sheet1

Power Query:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Date", type date}, {"Amount", Currency.Type}}),
    #"Filled Down" = Table.FillDown(#"Changed Type",{"Date"}),
    #"Added Custom" = Table.AddColumn(#"Filled Down", "Custom", each if [Amount]= null then [Description] else null),
    #"Filled Up" = Table.FillUp(#"Added Custom",{"Custom"}),
    #"Filtered Rows" = Table.SelectRows(#"Filled Up", each ([Amount] <> null)),
    #"Merged Columns" = Table.CombineColumns(#"Filtered Rows",{"Description", "Custom"},Combiner.CombineTextByDelimiter(" ", QuoteStyle.None),"New Description")
in
    #"Merged Columns"
 
Upvote 0
Solution
Thanks, that solved it.
However the PDF can have a variable amount of pages and tables in it, depending on the month (I receive these pdf's monthly). Sometimes I need the query to extract data from 2 pages and then the next month I need to extract data from 5 pages. Can this flexibility be built into the code?
Thanks
 
Upvote 0

Forum statistics

Threads
1,215,330
Messages
6,124,308
Members
449,151
Latest member
JOOJ

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top