Navigate and getting data from website

arunsjain

Board Regular
Joined
Apr 29, 2016
Messages
130
Office Version
  1. 365
Platform
  1. Windows
Hi All,


I am looking for VBA code where following steps areperformed:



Open website https://www.abs.gov.au/Price-Indexes-and-Inflation


Click on link “Consumer Price Index” on that page.


Click on “Download” tab on that page.


Open .Xls file from that page (do not save that file).


Copy Data in tab “Data” from open file and paste in currentfile in sheet 1 and close that open file without saving.


Could anyone please help regarding this? Highly appreciate your help.

Cheers!!!
 
Last edited:

Excel Facts

Who is Mr Spreadsheet?
Author John Walkenbach was Mr Spreadsheet until his retirement in June 2019.
Hi

This is the first part; I will be back during the week to complete it.

Code:
Sub Scrape()
Dim Browser As InternetExplorer, Document As HTMLDocument, Elements
Dim Element As IHTMLElement, doc, i%, r As Range
Set r = [a2]
Set Browser = New InternetExplorer
Browser.Visible = True
Browser.navigate "https://www.abs.gov.au/Price-Indexes-and-Inflation"
Do While Browser.Busy And Not Browser.readyState = READYSTATE_COMPLETE
    DoEvents
Loop
Set doc = Browser.Document
Set Elements = doc.getElementsByTagName("a")
For i = 0 To Elements.Length - 1
    If Elements(i).textContent = "Consumer Price Index" Then
        MsgBox "found"
        Elements(i).Click
    End If
Next
MsgBox Elements.Length, , "elements"
End Sub
 
Upvote 0
The page has 12 workbooks, which one do you want?

Code:
Sub Scrape()
Dim Browser As InternetExplorer, Document As HTMLDocument, Elements
Dim Element As IHTMLElement, doc, i%
Set Browser = New InternetExplorer
Browser.Visible = True
Browser.navigate "https://www.abs.gov.au/Price-Indexes-and-Inflation"
Do While Browser.Busy And Not Browser.readyState = READYSTATE_COMPLETE
    DoEvents
Loop
Set doc = Browser.Document
DoEvents
Set Elements = doc.getElementsByTagName("a")
For i = 0 To Elements.Length - 1
    If Elements(i).textContent = "Consumer Price Index" Then
        MsgBox "found consumer"
        Elements(i).Click
    End If
Next
MsgBox Elements.Length, , "a elements"
Set doc = Browser.Document
Set Elements = doc.getElementsByTagName("span")
MsgBox Elements.Length, , "span"
For i = 0 To Elements.Length - 1
    If Elements(i).textContent = "Downloads" Then
        MsgBox "found download"
        Elements(i).Click
    End If
Next
Set doc = Browser.Document
Set Elements = doc.getElementsByTagName("img")
MsgBox Elements.Length, , "Image elements"
End Sub
 
Upvote 0
Thank you so much Worf.

I need first file "TABLES1 and 2. CPI: All Groups, Index Numbers and Percentage Changes".


 
Upvote 0
IE automation is not completely reliable, but the code below brings up the dialog window to download the file.
Now I will decide whether to use the send keys method or something more elegant.

Code:
Sub Scrape()
Dim Browser As InternetExplorer, Document As HTMLDocument, Elements
Dim Element As IHTMLElement, doc, i%, r As Range, elem2, iel%
Set Browser = New InternetExplorer
Browser.Visible = True
Browser.navigate "https://www.abs.gov.au/Price-Indexes-and-Inflation"
Do While Browser.Busy And Not Browser.readyState = READYSTATE_COMPLETE
    DoEvents
Loop
Set doc = Browser.Document
DoEvents
Set Elements = doc.getElementsByTagName("a")
For i = 0 To Elements.Length - 1
    If Elements(i).textContent = "Consumer Price Index" Then
        MsgBox "found consumer"
        Elements(i).Click
    End If
Next
MsgBox Elements.Length, , "a elements"
Set doc = Browser.Document
Set Elements = doc.getElementsByTagName("span")
MsgBox Elements.Length, , "span"
For i = 0 To Elements.Length - 1
    If Elements(i).textContent = "Downloads" Then
        MsgBox "found download"
        Elements(i).Click
    End If
Next
Set doc = Browser.Document
Set elem2 = doc.getElementsByTagName("img")
iel = elem2.Length
MsgBox iel, , "Image elements"
For i = 0 To iel - 1
    Set elem2 = doc.getElementsByTagName("img")
    If elem2(i).Title Like "*136*" Then elem2(i).Click
Next
Set doc = Nothing
End Sub
 
Upvote 0
John

I did not test your method because the pop up at that site does not seem to be the download notification bar.
The code below uses send keys for the time being. The second macro copies the required data, which is the last step in this project.

Code:
[FONT=Verdana,Arial,Tahoma,Calibri,Geneva,sans-serif]Sub Scrape2()
Dim Browser As InternetExplorer, Document As HTMLDocument, Elements
Dim Element As IHTMLElement, doc, i%, r As Range, elem2, iel%, macrosec%
macrosec = Application.AutomationSecurity
Application.AutomationSecurity = msoAutomationSecurityLow   ' avoid protected view
Set Browser = New InternetExplorer
Browser.Visible = True
Browser.navigate "https://www.abs.gov.au/Price-Indexes-and-Inflation"
Do While Browser.Busy And Not Browser.readyState = READYSTATE_COMPLETE
    DoEvents
Loop
Set doc = Browser.Document
DoEvents
Set Elements = doc.getElementsByTagName("a")
For i = 0 To Elements.Length - 1
    If Elements(i).textContent = "Consumer Price Index" Then
        MsgBox "found consumer"
        Elements(i).Click
    End If
Next
MsgBox Elements.Length, , "a elements"
Set doc = Browser.Document
Set Elements = doc.getElementsByTagName("span")
For i = 0 To Elements.Length - 1
    If Elements(i).textContent = "Downloads" Then
        MsgBox "found download"
        Elements(i).Click
    End If
Next
Set doc = Browser.Document
Set elem2 = doc.getElementsByTagName("img")
iel = elem2.Length
MsgBox iel, , "Image elements"
For i = 0 To iel - 1
    Set elem2 = doc.getElementsByTagName("img")
    If elem2(i).Title Like "*136*" Then elem2(i).Click
Next
Application.Wait (Now + TimeValue("0:00:04"))
DoEvents
SendKeys "%a"           ' this is Alt + A
Set doc = Nothing
Application.AutomationSecurity = macrosec
End Sub[/FONT]

Code:
[FONT=Verdana,Arial,Tahoma,Calibri,Geneva,sans-serif]Sub TwoWorkbooks()
Dim orig As Workbook, dataw As Workbook
If Application.Workbooks.Count > 2 Then
    MsgBox "Too many workbooks opened..."
    Exit Sub
End If
Select Case Workbooks(1).Name Like "*Automation*"
    Case True
        Set orig = Workbooks(1)
        Set dataw = Workbooks(2)
    Case False
        Set orig = Workbooks(2)
        Set dataw = Workbooks(1)
End Select
MsgBox orig.Name & vbLf & dataw.Name
dataw.Sheets("data1").UsedRange.Copy orig.Sheets("sheet1").[a1]
dataw.Close 0
End Sub[/FONT]
 
Upvote 0
John

I did not test your method because the pop up at that site does not seem to be the download notification bar.
You are correct: it doesn't display the normal Download Notification Bar at the bottom of the IE window, but a "What do you want to do with xxxxx.xxx?" dialogue:

IuDNX5n.png


I have seen this before on another website and I've found a reason why it happens. I have updated my UIAutomation code to also handle this dialogue and will post details of my diagnosis in https://www.mrexcel.com/forum/gener...nclient-automate-save-file-download-ie11.html.
 
Upvote 0
Another way:
Code:
Option Explicit
Public Sub GetCPIData()

    Dim savePath    As String
    Dim wb          As Workbook
    
    savePath = Envrion("Temp") & "\test.xls"
    Set wb = SaveCPIWorkbook(savePath)
    
    wb.Sheets("Data1").UsedRange.Copy ThisWorkbook.Sheets("Sheet1").Range("a1")
    
    wb.Close False
    
    Kill savePath

End Sub



Public Function SaveCPIWorkbook(savePath As String) As Workbook
    
    Dim ifileNum As Long
    Dim fileBytes() As Byte
    
    ifileNum = FreeFile
    
    With CreateObject("winhttp.winhttprequest.5.1")
        .Open "GET", getDownloadUrl(), False
        .send
        fileBytes = .responseBody
        Kill savePath
        Open savePath For Binary As [URL=https://www.mrexcel.com/forum/usertag.php?do=list&action=hash&hash=ifileNum]#ifileNum[/URL] 
           Put [URL=https://www.mrexcel.com/forum/usertag.php?do=list&action=hash&hash=ifileNum]#ifileNum[/URL] , , fileBytes
        Close [URL=https://www.mrexcel.com/forum/usertag.php?do=list&action=hash&hash=ifileNum]#ifileNum[/URL] 
    End With
    
    Set savespiworkbook = Workbooks.Open(savePath)
    
End Function

Private Function getDownloadUrl()

    Const BaseUrl   As String = "https://www.abs.gov.au/"
    Dim doc         As HTMLDocument
    
    Set doc = New HTMLDocument
    
    doc.body.innerHTML = getSyncRequestResponse(BaseUrl & "Price-Indexes-and-Inflation")
    doc.body.innerHTML = getSyncRequestResponse(BaseUrl & getAnchorByContent(doc, "#element2list a", "Consumer Price Index").pathname)
    doc.body.innerHTML = getSyncRequestResponse(BaseUrl & getAnchorByContent(doc, "#tabsJ a", "Downloads").pathname)
    
    getDownloadUrl = BaseUrl & getAnchorByContent(doc, ".listentry a", "").pathname

End Function

Private Function getAnchorByContent(dom As HTMLDocument, selector As String, content As String) As HTMLAnchorElement
    
    Dim a           As Object
    Dim nodeList    As IHTMLDOMChildrenCollection
    Dim x           As Long
    
    Set nodeList = dom.querySelectorAll(selector)
    
    For x = 0 To nodeList.Length
        Set a = nodeList(x)
        If a.innerText = content Then
            Set getAnchorByContent = a
            Exit For
        End If
    Next x
    
End Function

Private Function getSyncRequestResponse(url As String) As String
    Static request As Object
    If request Is Nothing Then Set request = CreateObject("MSXML2.XMLHTTP")
    
    With request
        .Open "GET", url, False
        .send
        getSyncRequestResponse = .responseText
    End With
    
End Function
 
Upvote 0
Thank you so much for your help Worf and Kyle123. Highly appreciated.

Cheers!!
 
Last edited:
Upvote 0

Forum statistics

Threads
1,212,933
Messages
6,110,751
Members
448,295
Latest member
Uzair Tahir Khan

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top