Advice on extracting multiple tables from multiple pdfs

Shortmeister1

Board Regular
Joined
Feb 19, 2008
Messages
204
Office Version
  1. 365
  2. 2016
  3. 2013
  4. 2010
  5. 2007
  6. 2003 or older
Platform
  1. Windows
Hi all

I've got an interesting problem for you that I am feeling fairly clueless about. I'm trying to create a dataset of my entire personal PDF bank statements. The data is a little messy to put it mildly and I am definitely out of my depth on this one.

My first attempt was relatively successful. I took all of the bank statements up to June '23, combined them and did an extensive but somewhat painful clean-up of the irrelevant table data.

Then Nat West Bank (bless them! :mad:) decided to radically change their statement table format from July. Once again in most cases, I'm capable of cleaning the data, although it is an even more complex procedure - combining all of the columns into one string and then selecting out the relevant data. Where I've come unstuck is that sometimes the individual transaction is in one distinct record and some times it has spilled onto a second row.

Clear as mud? Sigh!

If anyone knows how to deal with something like this, then I'd love you to respond, but I don't think it can be done as there are way too many random variables.

Thanks
Martin

(This is an excerpt from my real bank statement, but it is heavily anonymised. Please excuse the garish colours, but it does make it easier to see which transaction is which.)
2023-09-06 11_00_47-Book1 - Excel.png
 

Excel Facts

Control Word Wrap
Press Alt+Enter to move to a new row in a cell. Lets you control where the words wrap.
Is your sample worksheet the result of trying to import the PDF file directly into Excel, or did you write VBA code to import the records into Excel?
My experience has been that it is less frustrating to import the PDF data line-by-line with dedicated VBA code for each bank. My bank in Guatemala only provided PDF files, so that was my final solution.
It is also frustrating when banks decide to change the format of the their PDF's; which means creating new VBA code.
 
Upvote 0
One of the best tools for converting PDF to EXCEL files is ADOBE Convert PDF to Excel.
its results are amazing, and it also allows you to modify and edit the result before downloading it.
It may not be perfect 100%, but, from my experience with dozens of tools available, it is the best and very useful. :cool:

Just try it! :)
 
Upvote 0
Thanks for the replies guys. I'm trying to do this as a one-stop process using m-code i.e. Power Query. Whilst I posted in "Power Tools", I didn't explicitly state this. The output I posted is from Power Query. Sorry.

(Faisal, definitely not keen on putting financial data into an online tool! :) )

Bosquedeguate - I'm interested. The only problem I can see is that it isn't possible (??) to extract data from PDF to Excel using VBA without purchasing an API key from Adobe/adding an additional step to convert the document to Word. Do you have any clever ways to set up an ado connection that would just work? I'm all ears...
 
Upvote 0
Thanks for the replies guys. I'm trying to do this as a one-stop process using m-code i.e. Power Query. Whilst I posted in "Power Tools", I didn't explicitly state this. The output I posted is from Power Query. Sorry.

(Faisal, definitely not keen on putting financial data into an online tool! :) )

Bosquedeguate - I'm interested. The only problem I can see is that it isn't possible (??) to extract data from PDF to Excel using VBA without purchasing an API key from Adobe/adding an additional step to convert the document to Word. Do you have any clever ways to set up an ado connection that would just work? I'm all ears...
Regarding your question about an ADO connection ... I have no suggestions. I have a couple of financial institutions that only provide a CSV file as an export option (i.e. USAA and Navy Federal Credit Union). I have created VBA macro code in Excel to convert these files to QBO files that I can import into Quickbooks. Is CSV not an export option from your financial institution(s)?
 
Upvote 0

Forum statistics

Threads
1,215,139
Messages
6,123,264
Members
449,093
Latest member
Vincent Khandagale

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top