EXCEL 2010: VBA: Download .csv file with dynamic name from a secure website

mrsklb

New Member
Program: EXCEL 2010
OS: Windows 8.1

I am wanting to login to a site (I have a login script), navigate to a page, then download the `.CSV` file that will always end with a dynamic string due to it being 'custom'.

I have tried to access the site by recording a macro. However, as the data is not in a table the macro recorder is not able to pick up the actual address of the .csv file.

The display text is always:
Code:
Results [link]Click to Download[/link]
The html values are always shown as:
Code:
[COLOR=#800000][FONT=Consolas]<td[/FONT][/COLOR][COLOR=#000000][FONT=Consolas] [/FONT][/COLOR][COLOR=#FF0000][FONT=Consolas]class[/FONT][/COLOR][COLOR=#000000][FONT=Consolas]=[/FONT][/COLOR][COLOR=#0000FF][FONT=Consolas]"smallText"[/FONT][/COLOR][COLOR=#800000][FONT=Consolas]><b>[/FONT][/COLOR][COLOR=#000000][FONT=Consolas]Results[/FONT][/COLOR][COLOR=#800000][FONT=Consolas]</b>[/FONT][/COLOR][COLOR=#000000][FONT=Consolas] [/FONT][/COLOR][COLOR=#800000][FONT=Consolas]<a[/FONT][/COLOR][COLOR=#000000][FONT=Consolas] [/FONT][/COLOR][COLOR=#FF0000][FONT=Consolas]href[/FONT][/COLOR][COLOR=#000000][FONT=Consolas]=[/FONT][/COLOR][COLOR=#0000FF][FONT=Consolas]"vendor_report.php?report=custom [insert extremely long string here] >[/FONT][/COLOR][COLOR=#800000][FONT=Consolas]<u>[/FONT][/COLOR][COLOR=#000000][FONT=Consolas]Click to Download[/FONT][/COLOR][COLOR=#800000][FONT=Consolas]</u></a></td>[/FONT][/COLOR][COLOR=#000000][FONT=Consolas] [/FONT][/COLOR]
Without using a table, is there a way to get to this .csv & save it to my PC?
I am aware that the use of
Code:
<td>
denotes it is part of a table, but it is definitely not picking it up, I've gone through the site using the macro recorder and it's not picking up the inner contents from the page.

As a secondary solution (and not my preferred):
I had also thought to navigate to the site page, highlight the text, copy & paste to a spare sheet in my book, then use some code L42 previously wrote here (below) however I can't even get the copy & paste to work correctly.

Code:
    For Each hlink In ThisWorkbook.Sheets("NameOfYourSheet").Hyperlinks
        Set wb = Workbooks.Open(hlink.Address)
        wb.SaveAs saveloc & hlink.Range.Offset(0,1).Value & ".xlsx"
        wb.Close True
        Set wb = Nothing
    Next
Please advise. Thank you in advance.






[1]: Cross posted: stackoverflow
 
Last edited:

mrsklb

New Member
Hi there,
Upon rereading my post I realised that some of the formatting was incorrect, as such the code wasn't showing correctly.

Due to my issues with how the code looks I have substituted the following:
[ or < as {
] or > as }

1) Here is how the Download links looks:
Code:
Results [link]Click to Download[/link]
2) Here is how the download link looks in html:
Code:
{td class="smallText"}{b}Results{/b} {a href="vendor_report.php?report=custom [insert extremely long string here]}{u}Click to Download{/u}{/a}{/td}
3) Here is how I know this belongs in another table, however I can't figure out which one (I have looped through all tables):
Code:
{td}
Thank you
 

mrsklb

New Member
Bump

I have the link, however there is no file name, it looks like it may be a php mask of some sort. ends with: xxxxxxxx101a1cc0

Once clicked, the link then generates a ' Payments.csv ' file name & prompts the ' save as ' dialog.

I've tried a number of different ways to get through to the 'save as' dialog, however either IE crashes or the whole process stalls.

Please help.

Cheers.


IE = Ver: 11.09
Win = 8.1 x64
Excel = 2010
 

mrsklb

New Member
Hi John,

Thank you for your help, I've furthered my ability to get a file, and i've had some success.
On the 2nd link you posted I was able to get the code to work and it downloaded a .csv, however it contained the page details in HTML form, it didn't capture the actual .csv hiding under the string.

Code:
{!doctype html public "-//W3C//DTD HTML 4.01 Transitional//EN"}{html dir="LTR" lang="en"}
{head}
{meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"}
I've had to replace the < with { and > with } to keep the code intact.


Many thanks.
 

John_w

MrExcel MVP
You have to send the same form data that your browser would send. It might be a GET request or a POST request, or both in sequence. It all depends on the site.

Code:
Dim formData As String
formData = "?????"
<code>WHTTP.Open "POST", fileUrl, False
WHTTP.SetRequestHeader "Content-Type", "application/x-www-form-urlencoded"
WHTTP.Send formData
</code>Use the console tool in your browser (or Fiddler) to examine the request and response bodies to determine the form data.
 

mrsklb

New Member
Hi John,

Thanks for the direction, I've had a play with Fiddler, I don't know exactly what I'm looking for. I think this is far beyond what I can currently understand, however I have found these items :

Code:
GET /vendors.php?vf=vieworders HTTP/1.1
Code:
(under webforms) vf  (value) vieworders
When I click on the "Click to Download" button this pops up:
Code:
GET /vendor_report.php?report=custom&q={exceptionally long strong}
In the Chrome Console, this is the response I get:
Code:
Resource interpreted as Script but transferred with MIME type text/html: "https://seal.godaddy.com/getSeal?sealID={long id string}".
Resource interpreted as Document but transferred with MIME type text/csv: "https://www.{noname}store.com/vendor_report.php?report=custom&q={long}==&k={morelong}".
Could the above be causing the issue?

I've had success with the following, it allows me to pop open the download link, enter my username & password, but then I can't move further without manual intervention, tell IE that I want to save the file manually.
When I attempt to get the file without IE, I can't move forward at all, the Username & Password isn't ever accepted.

Code:
Sub GetDets_1XXXX3()
    'sales records are in table 25
  
    'Begin sales history retrieval
    'source: http://dailydoseofexcel.com/archives/2011/03/08/get-data-from-website-that-requires-a-login/
    'ensure tools/references/microsoft internet controls = check
    'ensure tools/references/microsoft forms 2.0 = check
    
    Dim ieApp As InternetExplorer
    Dim ieDoc As Object
    Dim ieTable As Object
    Dim clip As DataObject
     
     
     'create a new instance of ie
        On Error Resume Next
        Set ieApp = New InternetExplorer
    
     'you don’t need this, but it’s good for debugging
        ieApp.Visible = True
    
    'assume we’re not logged in and just go directly to the login page
        ieApp.Navigate "{{Jumbo String}}"
            'Application.Wait Now + TimeSerial(0, 0, 10)
        Application.Wait Now + TimeSerial(0, 0, 10)
        Do While ieApp.Busy: DoEvents: Loop
        Do Until ieApp.ReadyState = READYSTATE_COMPLETE: DoEvents: Loop
             
          Set ieDoc = ieApp.Document
     
      'fill in the login form – View Source from your browser to get the control names
        With ieDoc.forms(2)
            .UserName.Value = "name"
            .Password.Value = "pass"
            .Submit
        End With
        'Do While ieApp.Busy: DoEvents: Loop
        'Do Until ieApp.ReadyState = READYSTATE_COMPLETE: DoEvents: Loop
        Application.Wait Now + TimeSerial(0, 0, 20)
        
        'wait here for IE to popup the "Save" option
        
    'close 'er up
        ieApp.Quit
        Set ieApp = Nothing
        Sheets("XXXXX").Select
 

John_w

MrExcel MVP
Thanks for the direction, I've had a play with Fiddler, I don't know exactly what I'm looking for. I think this is far beyond what I can currently understand, however I have found these items :

Code:
GET /vendors.php?vf=vieworders HTTP/1.1
Code:
(under webforms) vf  (value) vieworders
You are sort of looking in the right place in Fiddler. The Inspectors - WebForms tab shows the parameter names and values separated by "&" in the URL query string - the part of the URL after the "?" character. For example page.php?index=23&id=2&actionID=112&a=3268b66be4108e96f2a4e67f2749bbdc&f=/data.csv

But you really need to look at the Inspectors - Raw tab for the relevant request (what Fiddler calls a session in its main window).

When I click on the "Click to Download" button this pops up:
Code:
GET /vendor_report.php?report=custom&q={exceptionally long strong}
Click that request (session) in the main Fiddler window and then click Inspectors - Raw tab. Click 'Open in Notepad' to make it easier to view the data. If that data contains the "Content-Type", "application/x-www-form-urlencoded" header then you will need to send the same form data in your WinHttp GET request as I previously posted. If the data contains a cookie (also seen via Inspectors - Cookies) then you may need to specify the cookie value (obtained from the IE.document) in a "Cookie" header, like this:
Code:
WHTTP.SetRequestHeader "Cookie", IE.document.cookie
In the Chrome Console, this is the response I get:
Code:
Resource interpreted as Script but transferred with MIME type text/html: "https://seal.godaddy.com/getSeal?sealID={long id string}".
Resource interpreted as Document but transferred with MIME type text/csv: "https://www.{noname}store.com/vendor_report.php?report=custom&q={long}==&k={morelong}".
Could the above be causing the issue?
No - the second request is Chrome requesting the csv file download. I'm not familiar with the Chrome developer tools, but just had a play around. Click that 2nd link in the Console and it takes you to the Network tab, where if you click the same path (link) it displays the request headers (the same as shown in Fiddler but in a different layout).

In summary, the technique is to use IE automation to log into the website, navigate to the required page and obtain the URL (href) for the file download link. Then use WinHttpRequest 5.1 to send a GET request to that URL to download file, specifying any headers as required. Then save the responseBody bytes in a file.

If you want to use IE automation for the whole process then there is code at http://www.mrexcel.com/forum/excel-questions/567033-automate-file-download-dialog-box-without-sendkeys.html which uses Windows API calls to handle the IE download dialogue. However I've only used it on IE8 and never tried to get it working on IE11.
 
Last edited:

Some videos you may like

This Week's Hot Topics

  • Get External Data (long shot question!)
    This is likely a long shot but I am wondering if it is at all possible for Excel to somehow 'change' the contents of a URL that is being linked to...
  • Importing multiple excel files into one spreadsheet
    Hi, I'm trying to import multiple excel files (with the same format into a single spreadsheet) so that each day's file is listed underneath the...
  • Cell Formatting
    Good Morning, I need to format a few different cells in the following manners: A1 has to always add a colon (:) after whatever is typed in by a...
  • How to copy multiple rows using If
    Hi all, I'm very new to VBA and have written this simple code to copy certain cells if a certain cell within that row contains any data. I need...
  • Workbook_Change stopped working !
    I am working on an app to speed up & automate processing of Credit Cards statements. After data is input from a CSV file, it is presented to the...
  • VBA If statement
    Dear All, I have two dates, where I'd like a message box to pop, if the dates are between this criteria. [CODE] sDate1 = #10/1/2019#...
Top