Good times everyone,
Using: Windows 7 Professional, Excel 2010 and IE 8.0
Please forgive my close to "absolute zero" knowledge of VBA; I am an absolute beginner, in the immortal words of David Bowie.
By Googling, cutting, pasting, and trial and error and error and error... I managed to prepare some code for what I needed, and so far I have been successful thanks to the work of many members and MVPs in this forum.
Now I am truly stuck. Please have mercy: don't send me to interpret code that does not contain the nomenclature listed below! :)
DESCRIPTION
01. Given an intranet URL, let's call it: doc_list
This is a URL (company intranet, so it is useless to post it - there is no access from outside) that contains a list of documents with hyperlinks.
02. How does doc_list look?
The URL doc_list has a list of documents, and the list is organized as follows:
10-2R1 Title of document 10
0101-R3 Title of document 0101
...
...
1820 Title of document 1820
There are about 2500 documents.
03. How does doc_list work?
The numbers on the left have hyperlinks to a pdf file, or sometimes an HTML file, that contains the document itself.
When clicking on the link provided, the target pdf or HTML file opens.
Looking at the hyperlinks in the example above: the pdf or HTML filenames "look like" the nomenclature shown in doc_list but are not identical. Sometimes the hyphen is gone and the "R" is converted to an "r". There is no consistency.
04. What do I need to do?
Create a VBA macro in Excel 2010 that will:
04.a - open the doc_list URL
04.b - FOR EACH DOCUMENT IN doc_list:
04.b.1 - Follow the link provided**
04.b.2 - create a folder named T-pdf in a path selected by the user (with a browse function similar to "Save As")
04.b.3 - save each target file of the hyperlinks in doc_list in the folder called T-pdf we just created
04.b.4 - print all the HTML files saved in folder T-pdf to pdf
04.b.5 - delete all the HTML files from folder T-pdf
04.b.6 - capture the filename of each pdf file into variable doc_name.
04.b.7 - calculate a new filename, save that name in variable newdoc_name.
The new name is consistent with the following nomenclature: T00xy-mRn.pdf or T0xyz-mRn.pdf or Twxyz-mRn.pdf
Where: m, R, and n could be present or not, and the number of leading zeroes is used to allow proper numerical sorting of the files.
I already have a procedure that calculates the correct filename newdoc_name once the current filename is captured in variable doc_name; let's call that procedure: run_index
04.b.8 - rename (or save as) each pdf file with the name calculated by run_index in 04.b.7 and stored in variable newdoc_name
04.c - end when all the documents listed in doc_list were processed
** the HTML source of doc_list contains all the links, perhaps it can be used.
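To make the request concrete, here is a rough, untested sketch of how steps 04.a to 04.b.3 might look. It is only one possible approach: it assumes doc_list can be opened by automating Internet Explorer and that the link targets can be fetched with the Windows URLDownloadToFile call using the current user's intranet credentials. The URL in DOC_LIST_URL is a placeholder, and the names DownloadDocList, FileNameFromUrl and IsDocument are made up for illustration. Steps 04.b.4 onward (printing HTML to pdf, run_index, renaming) are not covered.

```vba
Option Explicit

' Windows API call that downloads a URL to a local file (returns 0 on success).
Private Declare PtrSafe Function URLDownloadToFile Lib "urlmon" _
    Alias "URLDownloadToFileA" (ByVal pCaller As LongPtr, _
    ByVal szURL As String, ByVal szFileName As String, _
    ByVal dwReserved As Long, ByVal lpfnCB As LongPtr) As Long

Sub DownloadDocList()
    ' Placeholder address: replace with the real doc_list URL.
    Const DOC_LIST_URL As String = "http://intranet.example/doc_list"
    Dim ie As Object, lnk As Object
    Dim targetFolder As String, url As String, doc_name As String

    ' 04.b.2: let the user browse to a parent folder, then create T-pdf in it.
    With Application.FileDialog(msoFileDialogFolderPicker)
        .Title = "Select where to create the T-pdf folder"
        If .Show <> -1 Then Exit Sub          ' user cancelled
        targetFolder = .SelectedItems(1)
    End With
    If Right$(targetFolder, 1) <> "\" Then targetFolder = targetFolder & "\"
    targetFolder = targetFolder & "T-pdf"
    If Dir(targetFolder, vbDirectory) = "" Then MkDir targetFolder

    ' 04.a: open doc_list in Internet Explorer and wait until it has loaded.
    Set ie = CreateObject("InternetExplorer.Application")
    ie.Visible = False
    ie.navigate DOC_LIST_URL
    Do While ie.Busy Or ie.readyState <> 4    ' 4 = READYSTATE_COMPLETE
        DoEvents
    Loop

    ' 04.b.1 and 04.b.3: go through every link and save pdf/HTML targets.
    For Each lnk In ie.document.Links
        url = lnk.href
        doc_name = FileNameFromUrl(url)
        If IsDocument(doc_name) Then
            If URLDownloadToFile(0, url, targetFolder & "\" & doc_name, 0, 0) <> 0 Then
                Debug.Print "Could not download: " & url
            End If
        End If
    Next lnk

    ie.Quit
End Sub

' Returns the part of a URL after the last "/", without any ?query or #anchor.
Function FileNameFromUrl(ByVal url As String) As String
    Dim p As Long
    p = InStr(url, "?"): If p > 0 Then url = Left$(url, p - 1)
    p = InStr(url, "#"): If p > 0 Then url = Left$(url, p - 1)
    FileNameFromUrl = Mid$(url, InStrRev(url, "/") + 1)
End Function

' True when the file name ends in .pdf, .htm or .html (case-insensitive).
Function IsDocument(ByVal fileName As String) As Boolean
    Dim f As String
    f = LCase$(fileName)
    IsDocument = (Right$(f, 4) = ".pdf") Or (Right$(f, 4) = ".htm") _
                 Or (Right$(f, 5) = ".html")
End Function
```

In this sketch the folder is created once, before the loop, since every file goes into the same T-pdf folder. Any link that does not end in .pdf, .htm or .html (menus, navigation and so on) is simply skipped, and failed downloads are listed in the Immediate window.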
Hope you can help me with this. Please treat me as a complete illiterate - I don't mind... in fact: I deserve it!
Muchas gracias
a.