Results 1 to 4 of 4

Thread: VBA HTTP request to scrape webpage links for a URL
Thanks Thanks: 0 Likes Likes: 0

  1. #1
    Board Regular
    Join Date
    Jan 2018
    Posts
    230
    Post Thanks / Like
    Mentioned
    3 Post(s)
    Tagged
    0 Thread(s)

    Default VBA HTTP request to scrape webpage links for a URL

    I'm attempting to use HTTP requests using https://codingislove.com/parse-html-in-excel-vba/ as a guide in order to bypass having to use internet explorer. When I use the following macro it doesn't assign T_Object_S to do anything and skips over the for loop.
    Desired url: https://www.theice.com/publicdocs/fu...OTHist2019.csv

    Code:
    SUB  Get_File_URL() 
    Dim oStrm As Object, WinHttpReq As Object, T_Object_S As Object, html As New HTMLDocument, Extension As String, File_Name As String, TitleEM As Object, T_Object As HTMLHtmlElement  
    
      File= "https://www.theice.com/marketdata/reports/122"
    
    
    Set WinHttpReq = CreateObject("MSXML2.XMLHTTP")
        WinHttpReq.Open "GET", File, False 
        WinHttpReq.send
    
    
            html.body.innerHTML = WinHttpReq.responseText: Set T_Object_S = html.getElementsByClassName("document-list")
    
            For Each T_Object In T_Object_S
                
            TitleEM = T_Object.getElementsByTagName("li")(0)
                
            MsgBox TitleEM.getElementsByTagName("a")(0).innerText
    
            Next T_Object
    
    End SUB
    The code below does the same thing and with the correct result but uses internet explorer.

    Code:
    Sub Scrape_Web_ICE() 
    
        Dim internet As Object
        Dim internetdata As Object
        Dim div_result As Object
        Dim header_links As Object
        Dim link As Object
        Dim URL As String
    
    
        Set internet = CreateObject("InternetExplorer.Application")
        internet.Visible = True
    
    
        URL = "https://www.theice.com/marketdata/reports/122"
        internet.Navigate URL
    
    
        Do Until internet.ReadyState >= 4
            DoEvents
        Loop
    
    
        Application.Wait Now + TimeSerial(0, 0, 5)
    
    
        Set internetdata = internet.document
        Set div_result = internetdata.getElementsByClassName("true-grid-3")
    
    
    msgbox div_result.Item(1).all.Item(1).FirstChild.NextSibling.all.Item(0).href
     
    
    internet.Quit
    
    
    End Sub
    The targeted link:
    https://www.dropbox.com/s/8fy4rytokp...itled.png?dl=0
    Last edited by MoshiM; Jun 24th, 2019 at 04:34 AM.

  2. #2
    Board Regular Kyle123's Avatar
    Join Date
    Jan 2012
    Location
    Leeds, UK
    Posts
    2,641
    Post Thanks / Like
    Mentioned
    10 Post(s)
    Tagged
    2 Thread(s)

    Default Re: VBA HTTP request to scrape webpage links for a URL

    You need to call this url, rather than the parent:
    https://www.theice.com/CommitmentOfT...1&reportId=122

    The page you are calling is calling the url above, so in your first code, the elements you are looking for do not exist.

    You've also missed a "Set"
    Code:
    Sub Get_File_URL()
    Dim oStrm As Object, WinHttpReq As Object, T_Object_S As Object, html As New HTMLDocument, Extension As String, File_Name As String, TitleEM As Object, T_Object As HTMLHtmlElement
    
      File = "https://www.theice.com/CommitmentOfTradersReports.shtml?rcMode=1&reportId=122"
    
    
    Set WinHttpReq = CreateObject("MSXML2.XMLHTTP")
        WinHttpReq.Open "GET", File, False
        WinHttpReq.send
    
    
            html.body.innerHTML = WinHttpReq.responseText:
            
            Set T_Object_S = html.getElementsByClassName("document-list")
    
            For Each T_Object In T_Object_S
                
            Set TitleEM = T_Object.getElementsByTagName("li")(0)
                
            MsgBox TitleEM.getElementsByTagName("a")(0).innerText
    
            Next T_Object
    
    End Sub
    Though I don't know why you need to jump through these hoops as the url of the CSV doesn't seem to change, why not call it directly?
    Last edited by Kyle123; Jun 24th, 2019 at 06:18 AM.

  3. #3
    Board Regular Kyle123's Avatar
    Join Date
    Jan 2012
    Location
    Leeds, UK
    Posts
    2,641
    Post Thanks / Like
    Mentioned
    10 Post(s)
    Tagged
    2 Thread(s)

    Default Re: VBA HTTP request to scrape webpage links for a URL

    You could simplify the selection to the below

    Code:
        Set WinHttpReq = CreateObject("MSXML2.XMLHTTP")
        WinHttpReq.Open "GET", File, False
        WinHttpReq.send
    
        html.body.innerHTML = WinHttpReq.responseText
        
        Set T_Object_S = html.querySelector(".true-grid-3 > .document-list > li > a")
        MsgBox T_Object_S.innerText & " : " & "https://www.theice.com/" & T_Object_S.pathname

  4. #4
    Board Regular
    Join Date
    Jan 2018
    Posts
    230
    Post Thanks / Like
    Mentioned
    3 Post(s)
    Tagged
    0 Thread(s)

    Default Re: VBA HTTP request to scrape webpage links for a URL

    Thank you.Your solution worked.

    What if I wanted all the items like that or a specific one and not just the first? I googled your solution and if I understand correctly, I can use the following to get all of them
    Code:
    Set T_Object_S = html.querySelectorAll(".true-grid-3 > .document-list > li > a")
    but when I try and view the contents with the Locals Window it crashes the Excel file.
    Last edited by MoshiM; Jun 24th, 2019 at 11:11 AM.

Some videos you may like

User Tag List

Tags for this Thread

Like this thread? Share it with others

Like this thread? Share it with others

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •