VBA to Search Folder of PDF Docs & Return based on Criteria

xcellrodeo

Board Regular
Joined
Oct 27, 2008
Messages
200
Hi MrExcel Community

I would like to ask for some help with achieving the following:
BASE INFO: I have a folder with 50-100 PDF documents. Each PDF document has about 10-15 pages with a cover page.
The cover page is the first page.
TASK: Search the first page of each PDF Document to see if it matches any of the 1 to 4 criteria below AND has a date.
Each PDF document cover page will most likely have one of the Criteria 1 to 4 AND a Date.

CRITERIA 1) "Build Survey"
CRITERIA 2) "Build Sample"
CRITERIA 3) "Build Removal"
CRITERIA 4) "Build Qlty"

CRITERIA 5) "Date" followed by the actual date i.e 14 November 2020 Note: This is on every cover page!

RESULT: In Excel Sheet named 'Results', I would like the following results returned for each PDF searched.
Col A = Name of PDF file (i.e File Name)
Col B = Name of the Criteria found (as mentioned above Criteria 1-4)
Col C = 'Date' followed by the actual date in each document on page 1
NOTE: If the search comes across a PDF which does not fulfil all or any of the above criteria, I would like it still listed
in the Excel Sheet 'Results' but with comment added saying 'Not found'

Hope this makes sense.

Thanks so much for your help
:)

PS: Excel vs is 2010
 

Some videos you may like

Excel Facts

How to fill five years of quarters?
Type 1Q-2023 in a cell. Grab the fill handle and drag down or right. After 4Q-2023, Excel will jump to 1Q-2024. Dash can be any character.

brawnystaff

Board Regular
Joined
Aug 9, 2012
Messages
92
Office Version
  1. 365
In order for this to work, you would need Acrobat Pro installed on your computer and NOT just the Adobe Reader. Do you have Acrobat Pro?
 

xcellrodeo

Board Regular
Joined
Oct 27, 2008
Messages
200
In order for this to work, you would need Acrobat Pro installed on your computer and NOT just the Adobe Reader. Do you have Acrobat Pro?
Hi Brawnystaff, Thanks for your message. Unfortunately I only have Adobe Reader not the Pro vs. Out of interest, why is the Pro vs required since all that is needed in the above process is for the PDF documents to be opened and searched by Criteria on 1st page only? No changes to the PDF document is required. Thanks
 

brawnystaff

Board Regular
Joined
Aug 9, 2012
Messages
92
Office Version
  1. 365

ADVERTISEMENT

Typically, Excel cannot read PDF files natively and needs to use APIs/references available only in Acrobat Pro to see into them.

That being said, Excel 365 has a new feature under Power Query that imports PDF content as Tables. Believe this is only available in the "Insider Channel" and is not so much for searching PDFs as much as importing them, although I am sure something in M query could be used to return results for the keywords you are looking for.
 

Macropod

Retired Moderator
Joined
Aug 27, 2007
Messages
3,527

ADVERTISEMENT

Depending on the Office version, you could open & read the PDFs with Word. However, it seems the OP is only using Office 2010, which doesn't support that.
 

xcellrodeo

Board Regular
Joined
Oct 27, 2008
Messages
200
Typically, Excel cannot read PDF files natively and needs to use APIs/references available only in Acrobat Pro to see into them.
That being said, Excel 365 has a new feature under Power Query that imports PDF content as Tables. Believe this is only available in the "Insider Channel" and is not so much for searching PDFs as much as importing them, although I am sure something in M query could be used to return results for the keywords you are looking for.
Hi again, yes thats an interesting suggestion. However I don't have Office 365 I'm afraid. Looking at the Excel University link you sent, would it not be possible to create a VBA script to perform this function via a Loop? For example the 'Open a pdf file at specified page number and scale.' ...can't this function be used to open and copy over into say a designated Excel sheet all the first pages for each PDF in a given folder? Might end up with 50 - 100 front pages. Not sure how that would be imported though....Does that get downloaded as data that can then be subsequently searched for the Criteria I mention above or is it best to import all the first pages into a Word doc instead and then write a VBA script to open and search all docs in that word doc? Or is it best to write a VBA which converts 50 -100 PDFs into Word doc and then opens and checks each Word doc if one of the Criteria is matched? Thanks :)
 

xcellrodeo

Board Regular
Joined
Oct 27, 2008
Messages
200
Depending on the Office version, you could open & read the PDFs with Word. However, it seems the OP is only using Office 2010, which doesn't support that.
Hi Macropod, thanks for your comments. Please see my above reply to brawnystaff. Thanks
 

Macropod

Retired Moderator
Joined
Aug 27, 2007
Messages
3,527
FWIW, it doesn't have to be Office 365 - any Office version after Office 2010 will do.
 

Watch MrExcel Video

Forum statistics

Threads
1,127,328
Messages
5,624,036
Members
416,007
Latest member
csf

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Top