Using VBA Power Query to import PDF tables

jharding

New Member
Joined
May 30, 2018
Messages
9
I have a PDF that has three pages - each page has a header and detail and the last page has an additional footer.

I'd like to use Power Query to import the PDF tables into Excel

I ran the File ->Get data ->From File -> From PDF and it came back with 6 tables and three pages

I'd like to import the First table as my header, then tables 2, 4 and 6 as the detail and 6 will also have the footer.

I recorded a macro when I did this and it came back with the following code (PDF_Path substituted for hard path) for Table001

VBA Code:
Sub PDF_PQ(PDF_Path As String)

    ActiveWorkbook.Queries.Add Name:="Table001 (Page 1)", Formula:= _
        "let" & Chr(13) & "" & Chr(10) & "    Source = Pdf.Tables(File.Contents(""" & PDF_Path & """), [Implementation=""1.3""])," & Chr(13) & "" & Chr(10) & "    Table001 = Source{[Id=""Table001""]}[Data]," & Chr(13) & "" & Chr(10) & "    #""Changed Type"" = Table.TransformColumnTypes(Table001,{{""Column1"", type text}, {""Column2"", type text}})" & Chr(13) & "" & Chr(10) & "in" & _
        "" & Chr(13) & "" & Chr(10) & "    #""Changed Type"""
   
    ActiveWorkbook.Worksheets.Add
    With ActiveSheet.ListObjects.Add(SourceType:=0, Source:= _
        "OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=""Table001 (Page 1)"";Extended Properties=""""" _
        , Destination:=Range("$A$1")).QueryTable
        .CommandType = xlCmdSql
        .CommandText = Array("SELECT * FROM [Table001 (Page 1)]")
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = True
        .RefreshOnFileOpen = False
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .PreserveColumnInfo = True
        .ListObject.DisplayName = "Table001__Page_1"
        .Refresh BackgroundQuery:=False
    End With
    
End Sub

When I run this code from a button push event, it give me an error - "System Error & H80070057 (-2147024809). The Parameter is incorrect". I'm not sure what the issue is, it was generated from the system.
 

Excel Facts

Round to nearest half hour?
Use =MROUND(A2,"0:30") to round to nearest half hour. Use =CEILING(A2,"0:30") to round to next half hour.
I think I figured out the error. I already had a query with the name I was trying to use.

I guess my question now is - how do I create queries for PDFs of different page counts?

Process is - user selects a PDF file. That file name gets passed to power query. PQ figures out how many pages and tables there are in the PDF. Then I would want the tables imported via queries.

The way I am doing it now, there is a lot of manual entry for Table name, etc.
 
Upvote 0
A gentle reminder:

Cross-posting (posting the same question in more than one forum) is not against our rules, but the method of doing so is covered by #13 of the Forum Rules.

Be sure to follow & read the link at the end of the rule too!

Cross posted at:

There is no need to repeat the link(s) provided above but if you have posted the question at other places, please provide links to those as well.

If you do cross-post in the future and also provide links, then there shouldn’t be a problem.
 
Upvote 0

Forum statistics

Threads
1,215,097
Messages
6,123,076
Members
449,094
Latest member
mystic19

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top