Thanks Thanks:  0
Likes Likes:  0
Page 1 of 2 12 LastLast
Results 1 to 10 of 12

Thread: Excel vba - extracting data from webpage using msxml2.xmlhttp

  1. #1
    New Member
    Join Date
    Dec 2018
    Posts
    13
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Excel vba - extracting data from webpage using msxml2.xmlhttp

    Hi , The below link helped me to workout a old problem i was facing of copying picture from website to excel.

    https://www.mrexcel.com/forum/excel-...namic+url+cell

    Need help again.

    The Code and modules which wasgiven was very fast , it processed 500 entries in 20-25 seconds . i had another requirement of extracting data from webpage and i tried the below code and it takes around 3-4 seconds for one record , how can i change this to module format USING MSXML2.XMLHTTP

    First column of Asin9 array contains my 100 asin's for which i am trying to nextract a particular data from amazon.in site.

    Code-
    ====
    Sub Button1_Click()
    Dim StartTime As Double
    Dim SecondsElapsed As Double
    StartTime = Timer
    For i = 1 to 100
    On Error Resume Next
    Dim IE As New InternetExplorer
    IE.Visible = False
    IE.navigate "https://www.amazon.in/dp/" & Range("Asin9")(i, 1).Value
    Do
    DoEvents
    Loop Until IE.readyState = READYSTATE_COMPLETE
    Dim Doc As HTMLDocument
    Set Doc = IE.document
    Range("Asin9")(i, 13).Value = Doc.getElementById("imgTagWrapperId").innerHTML
    Next
    SecondsElapsed = Round(Timer - StartTime, 2)
    MsgBox "Refresh Completed in" & SecondsElapsed & " seconds", vbInformation
    End Sub


    Kindly advise!!

    Thanks,

    Nishant.

  2. #2
    MrExcel MVP
    Join Date
    Mar 2004
    Location
    Canada
    Posts
    18,305
    Post Thanks / Like
    Mentioned
    11 Post(s)
    Tagged
    3 Thread(s)

    Default Re: Excel vba - extracting data from webpage using msxml2.xmlhttp

    I haven't been able to fully test the following code, since I don't have the specific values contained in the first column of your data. Nevertheless, see if it provides you with the desired result...

    Code:
    Option Explicit
    
    Sub Button1_Click()
    
    
        Dim xmlReq As Object
        Dim htmlDoc As Object
        Dim htmlElement As Object
        Dim url As String
        Dim i As Long
        
        On Error GoTo errHandler
        
        Set xmlReq = CreateObject("MSXML2.XMLHTTP")
        Set htmlDoc = CreateObject("HTMLFile")
        
        For i = 1 To Range("Asin9").Rows.Count
            If Len(Range("Asin9")(i, 1)) > 0 Then
                url = "https://www.amazon.in/dp/" & Range("Asin9")(i, 1).Value
                With xmlReq
                    .Open "GET", url, False
                    .send
                End With
                If xmlReq.Status <> 200 Then
                    Range("Asin9")(i, 13).Value = "Error " & xmlReq.Status & ":  " & xmlReq.statusText
                Else
                    htmlDoc.body.innerHTML = xmlReq.responseText
                    Set htmlElement = htmlDoc.getElementById("imgTagWrapperId")
                    If htmlElement Is Nothing Then
                        Range("Asin9")(i, 13).Value = "Item not found"
                    Else
                        Range("Asin9")(i, 13).Value = htmlElement.innerHTML
                    End If
                End If
            End If
        Next i
            
    exitHandler:
        Set xmlReq = Nothing
        Set htmlDoc = Nothing
        Exit Sub
        
    errHandler:
        MsgBox "Error " & Err.Number & ": " & Err.Description, vbCritical, "Error"
        Resume exitHandler
    
    
    End Sub
    Hope this helps!

  3. #3
    New Member
    Join Date
    Dec 2018
    Posts
    13
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Excel vba - extracting data from webpage using msxml2.xmlhttp

    Hi Domenic , Thanks a lot , This worked fine and the speed has improved from 3-4 seconds to ~2.5 seconds but i could see one issue , when i run this for multiple entries 5/10/100/500 , the entries in between randomly misses out . so for example it processes around for 300 out of 500 or 70 out of 100 . I understand this can be because of network speed or my system CPU I/O but i get confused when i run the one for images as it never gives any misses. That is also on XMLHTTP

    I made few changes in code which you gave to just gather timinings and few other details.

    Sample entry -
    B077PW9V3J

    Sub Button1_Click()
    Dim xmlReq As Object
    Dim htmlDoc As Object
    Dim htmlElement As Object
    Dim htmlElement1 As Object
    Dim url As String
    Dim i As Long
    Application.ScreenUpdating = False
    Dim StartTime As Double
    Dim SecondsElapsed As Double
    Dim rRange As Range
    Dim rCell As Range
    StartTime = Timer
    'On Error GoTo errHandler
    Set ws = Worksheets("Sheet1")

    Set xmlReq = CreateObject("MSXML2.XMLHTTP")
    Set htmlDoc = CreateObject("HTMLFile")

    With ws
    LastRow = .Cells(.Rows.Count, "A").End(xlUp).Row
    Set rRange = .Range("A5:A" & LastRow)
    End With

    For Each rCell In rRange
    If Len(rCell) > 0 Then
    url = "https://www.amazon.in/dp/" & rCell.Value
    With xmlReq
    .Open "GET", url, False
    .send
    End With
    If xmlReq.Status <> 200 Then
    rCell.Offset(, 2).Value = "Error " & xmlReq.Status & ": " & xmlReq.statusText
    Else
    htmlDoc.body.innerHTML = xmlReq.responseText
    Set htmlElement = htmlDoc.getElementById("imgTagWrapperId")
    Set htmlElement1 = htmlDoc.getElementById("productTitle")
    If htmlElement Is Nothing Then
    rCell.Offset(, 2).Value = "Item not found"
    Else
    rCell.Offset(, 2).Value = htmlElement.innerHTML
    End If
    If htmlElement1 Is Nothing Then
    rCell.Offset(, 1).Value = "Item not found"
    Else
    rCell.Offset(, 1).Value = htmlElement1.innerText
    End If
    End If
    End If
    Next rCell
    Application.ScreenUpdating = True
    SecondsElapsed = Round(Timer - StartTime, 2)
    MsgBox "Refresh Completed in" & SecondsElapsed & " seconds", vbInformation

    exitHandler:
    Set xmlReq = Nothing
    Set htmlDoc = Nothing
    Exit Sub

    errHandler:
    MsgBox "Error " & Err.Number & ": " & Err.Description, vbCritical, "Error"
    Resume exitHandler

    End Sub

  4. #4
    MrExcel MVP
    Join Date
    Mar 2004
    Location
    Canada
    Posts
    18,305
    Post Thanks / Like
    Mentioned
    11 Post(s)
    Tagged
    3 Thread(s)

    Default Re: Excel vba - extracting data from webpage using msxml2.xmlhttp

    When you say that "the entries in between randomly misses out" what do you mean exactly? Do you get "Error 404: Not Found", or maybe "Item not found", or something else?

  5. #5
    New Member
    Join Date
    Dec 2018
    Posts
    13
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Excel vba - extracting data from webpage using msxml2.xmlhttp

    Hi Dominic, it says item not found, i made the same entry in all 500 rows and it gave data for randomly around 250-300 entries and for other it gave item not found.

  6. #6
    MrExcel MVP
    Join Date
    Mar 2004
    Location
    Canada
    Posts
    18,305
    Post Thanks / Like
    Mentioned
    11 Post(s)
    Tagged
    3 Thread(s)

    Default Re: Excel vba - extracting data from webpage using msxml2.xmlhttp

    That means that the page source for the entry does not contain an element whose ID is imgTagWrapperId. If you manually check the page source for the entry, do you see an element with that ID?

  7. #7
    New Member
    Join Date
    Dec 2018
    Posts
    13
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Excel vba - extracting data from webpage using msxml2.xmlhttp

    Hi Domenic, thanks, actually that should not be the case as I copied same entry in all 500 fields, while processing the script it gave item not found for many same entries randomly. I feel that's the system cpu issues but I never got that while processing the script for getting the images.

  8. #8
    MrExcel MVP
    Join Date
    Mar 2004
    Location
    Canada
    Posts
    18,305
    Post Thanks / Like
    Mentioned
    11 Post(s)
    Tagged
    3 Thread(s)

    Default Re: Excel vba - extracting data from webpage using msxml2.xmlhttp

    Can you provide us with a sample entry that returns "Item not found" ?

  9. #9
    New Member
    Join Date
    Dec 2018
    Posts
    13
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Excel vba - extracting data from webpage using msxml2.xmlhttp

    Sample entry is B077PW9V3J, just put the same entry in 500 rows in column A and try the script

  10. #10
    MrExcel MVP
    Join Date
    Mar 2004
    Location
    Canada
    Posts
    18,305
    Post Thanks / Like
    Mentioned
    11 Post(s)
    Tagged
    3 Thread(s)

    Default Re: Excel vba - extracting data from webpage using msxml2.xmlhttp

    It's probably because the website contains dynamic content. So, when a response is received, the content hasn't fully loaded. Therefore, I would suggest that you stick with using Internet Explorer.
    Last edited by Domenic; Jan 13th, 2019 at 03:02 PM.

Some videos you may like

User Tag List

Tags for this Thread

Like this thread? Share it with others

Like this thread? Share it with others

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •