I want to toggle between Microsoft Internet Controls and MSXMLHTTP on the fly while sharing as much code as possible

mctabish (New Member, joined Nov 2, 2009, 33 messages)
I need to scrape some sites. Most of the time I can use MSXMLHTTP, but there are times I need to access the fully loaded page to get some of the data.
There are two ways I would like to control this. The first is before launching the scrape, via a userform, when I already know that I will need to load the page.
The other is when I get an error while trying to reach a piece of info.
It is my understanding that once I get the document into the doc variable, the processing of the data should be the same.

So, here is what I see as some pseudo code.
VBA Code:
Public Need_IeError As Boolean
Public Need_IeForm As Boolean
'Both Need_Ie flags start out False.
'The form has a check box: if a certain piece of info is requested, it sets Need_IeForm to True.
'The control module then builds each url in a For...Next loop and calls the scrape function.
Sub Control()
    'launch the form, get the options, then start the loop
    For i = 1 To 
        url = whatever from ws
        If Scrape(url) Then
            'progress update to form
        Else
            'Failed! Set Need_IeError to True, then rerun the scrape
            Need_IeError = True
            Scrape url
            'turn off the error flag
            Need_IeError = False
        End If
    Next
End Sub

Function Scrape(url) As Boolean

'If either Need_Ie flag is True, launch IE mode.
'Here I am inserting the actual code I am trying. The MSXMLHTTP portion already works with all fields, so that is what I need the IE portion to mimic.

    If ieMode Then
        Dim oHttp As New InternetExplorer
        oHttp.Visible = True
        oHttp.navigate url
        Do
            DoEvents
        Loop Until oHttp.readyState = READYSTATE_COMPLETE
        'Set oHttp = oHttp.document
        HTMLdoc.body.innerHTML = oHttp.document

    Else
        On Error Resume Next
        Set oHttp = New MSXML2.XMLHTTP60
        If Err.Number <> 0 Then
            Set oHttp = CreateObject("MSXML.XMLHTTPRequest")
            MsgBox "An error occurred while creating an MSXML2.XMLHTTP60 object; falling back to MSXML.XMLHTTPRequest"
        End If
        On Error GoTo 0
        If oHttp Is Nothing Then
            MsgBox "For some reason I wasn't able to make a MSXML2.XMLHTTP object"
            Exit Function
        End If

        'Request the URL with the XMLHTTP object
        oHttp.Open "GET", url, False
        oHttp.send
        HTMLdoc.body.innerHTML = oHttp.responseText
    End If


'From here I process the page by calling functions that return strings, like this:
s_Get_Instructions = Get_Instructions(HTMLdoc.body)
'or
sGet_VolPrice = Get_VolPrice(HTMLdoc)
'Each function would return the string "ERROR" on failure. When that happens, set Need_IeError to True
'and exit Scrape with False, using an If block like this:
If sGet_VolPrice = "ERROR" Then
    Need_IeError = True
    Scrape = False
    Exit Function
End If

'process all of the data as needed....

End Function

The issues I am running into are that oHttp ends up being declared twice, and that in IE mode, getting an element like this fails:
s_header = HTMLdoc.getElementsByTagName("h1")(0).innerText
But this works in MSXMLHTTP mode.
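
One arrangement that might address both points is sketched below. It is an untested sketch, not a confirmed fix. It assumes early-bound references to Microsoft Internet Controls, Microsoft HTML Object Library, and Microsoft XML, v6.0. The names GetDocument, useIE, ie, req, and Demo are hypothetical and do not come from the code above, and the URL in Demo is only a placeholder. The browser and the request object get separate variable names, so nothing is declared twice. In IE mode the markup is copied from ie.document.body.innerHTML rather than from the document object itself, so both modes fill the same kind of HTMLDocument.

```vba
' Assumed references (Tools > References):
'   Microsoft Internet Controls, Microsoft HTML Object Library, Microsoft XML, v6.0
Option Explicit

' Hypothetical helper: returns a parsed document from either IE or MSXML2.
Public Function GetDocument(ByVal url As String, ByVal useIE As Boolean) As MSHTML.HTMLDocument
    Dim doc As MSHTML.HTMLDocument
    Dim ie As SHDocVw.InternetExplorer   ' browser object, used only in IE mode
    Dim req As MSXML2.XMLHTTP60          ' request object, used only in XMLHTTP mode

    Set doc = New MSHTML.HTMLDocument

    If useIE Then
        Set ie = New SHDocVw.InternetExplorer
        ie.Visible = True
        ie.navigate url
        ' Wait until the browser reports that the page has finished loading
        Do While ie.Busy Or ie.readyState <> READYSTATE_COMPLETE
            DoEvents
        Loop
        ' Copy the rendered markup (a string), not the document object
        doc.body.innerHTML = ie.document.body.innerHTML
        ie.Quit
    Else
        Set req = New MSXML2.XMLHTTP60
        req.Open "GET", url, False
        req.send
        doc.body.innerHTML = req.responseText
    End If

    Set GetDocument = doc
End Function

' Hypothetical usage: the same parsing line serves both modes.
Public Sub Demo()
    Dim HTMLdoc As MSHTML.HTMLDocument
    Dim s_header As String

    ' Placeholder URL; pass True as the second argument to force IE mode
    Set HTMLdoc = GetDocument("https://example.com", False)
    s_header = HTMLdoc.getElementsByTagName("h1")(0).innerText
    Debug.Print s_header
End Sub
```

With something like this, Scrape could call GetDocument(url, Need_IeError Or Need_IeForm) once and hand the result to Get_Instructions and Get_VolPrice unchanged. Pages that build their content with script after the load completes may still need an extra wait in IE mode.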



Any help/thoughts?
Thanks
Bruce
 
