Extracting web data from a sub web page

99bobster99

Board Regular
Joined
Jul 19, 2004
Messages
95
Hi,

I am using the "CreateObject("InternetExplorer.Application")" method to extract data from a webpage. This webpage is a sub page that I cannot find the data for, even after filtering through all the ".document.all(x)" items? Any ideas on how to obtain this "sub page" data, since there is other data that I need which is on the "main" page? I don't want to have to call up this sub page separately since I'm already seeing it (physically able to view it on the main page but not able to extract it) from the main page? Any ideas?
 
s = URLSource("http://apps.cnbc.com/view.asp?YYY330_XPDfu5vdhz9HyCRjr3G70R0A9hRO7fWtFsoR8c7jlUv40AuQQ+bZ2e2tRbDtPKi0jNSra6i07nSOl37m3JhZcF6UFhBqkWisiu/agZBraCwSPH4Ti6lcGQ==&uid=stocks/delayedQuotes&symbol=msft")

Ken. I had a problem with the URL. It changed every time the page auto refreshed...

A question for you guys about the IE tabs. Every since I upgraded to IE 7, I have had problems controlling instances of IE. Any way around this besides using the browser control in a userform? That is my fix for now unless there is another option...

Am having all kinds of problems posting here because of the HTML in the code. Anyway, follow the intructions below. To use in a worksheet cell, you will need to enter the function as an array.

For example, suppose that cel A1 contains "msft". Select B1:D1 and enter this function into the formula bar. =OLH(A2) Confirm with CTRL-Shift-Enter. This will place the Today's Open in B1, Day High in C1, and Day Low in D1.
<img src="http://home.fuse.net/tstom/0704081308.jpg">

Download the working example if you have any problems...

<a href="http://home.fuse.net/tstom/0704081311.zip"><img src="http://home.fuse.net/tstom/zip.gif"width="48"height="48"border="0"></a> <a href="http://home.fuse.net/tstom/0704081311.zip">0704081311.zip</a>

Here was my approach.
Add UserForm1 with Browser1.
Userform Code:
Code:
Option Explicit

Private DocumentComplete As Boolean

Friend Function GetFrameSrc(Symbol As String, Optional TimeOutSeconds As Integer = 10) As String

    Dim e As HTMLHtmlElement, s As String
    
    WebBrowser1.Navigate "http://www.cnbc.com/id/15837290?q=" & Symbol
    WaitUntilReady TimeOutSeconds
    
    On Error Resume Next
    For Each e In WebBrowser1.Document.all
        s = e.src
        If InStr(s, "delayedQuotes&symbol=" & Symbol) <> 0 Then
            GetFrameSrc = e.src
            Exit Function
        End If
    Next
End Function

Private Sub WaitUntilReady(Optional TimeOutSeconds As Integer = 10)
    Dim TimeOutTime As Date
    
    TimeOutTime = Now + TimeSerial(0, 0, TimeOutSeconds)
    DocumentComplete = False
    Do Until DocumentComplete And WebBrowser1.ReadyState = READYSTATE_COMPLETE And Not WebBrowser1.Busy
        DoEvents
    Loop
End Sub

Private Sub WebBrowser1_DocumentComplete(ByVal pDisp As Object, URL As Variant)
    DocumentComplete = True
End Sub

UDF and supporting procedures in a standard module.
Code:
Option Explicit

Private uf As UserForm1
Private Request As WinHttpRequest

Sub ExampleCallFromVBA()
    Dim OpenLowHigh() As Currency
    OpenLowHigh = OLH("msft")
    MsgBox "Today’s Open = " & OpenLowHigh(0) & vbCrLf & _
           "Day High = " & OpenLowHigh(1) & vbCrLf & _
           "Day Low = " & OpenLowHigh(2)
End Sub

'use in worksheet or from VBA
Public Function OLH(Symbol As String) As Currency()
    Dim FrameContentWindowSrc As String, Response As String
    Dim OHLTemp(2) As Currency
    
    If uf Is Nothing Then Set uf = New UserForm1
    FrameContentWindowSrc = uf.GetFrameSrc(Symbol)
    
    If FrameContentWindowSrc <> "" Then
                
        If Request Is Nothing Then Set Request = New WinHttpRequest
        Request.Open "GET", FrameContentWindowSrc, False
        Request.send
        Response = Request.responseText
        
        ReDim OLH(2)
        
        Dim p As Long
        Dim p2 As Long
       
        p = InStr(Response, "Today’s Open")
        p = InStrRev(Response, "</span>", p)
        p2 = InStrRev(Response, ">", p) + 1
        OHLTemp(0) = Mid(Response, p2, p - p2)
        
        p = InStr(Response, "Day Low")
        p = InStr(p, Response, "<b>") + 4
        p2 = InStr(p, Response, "<")
        OHLTemp(1) = Mid(Response, p, p2 - p)
        
        p = InStr(Response, "Day High")
        p = InStr(p, Response, "<b>") + 4
        p2 = InStr(p, Response, "<")
        OHLTemp(2) = Mid(Response, p, p2 - p)
        
    End If
    
    OLH = OHLTemp
    
End Function

</b>I could not get the code with embedded HTML to render. Here is a pic.
<img src="http://home.fuse.net/tstom/0704081302.jpg">
 
Upvote 0

Excel Facts

Move date out one month or year
Use =EDATE(A2,1) for one month later. Use EDATE(A2,12) for one year later.
Tom, that is very cool!

Would I filter throught the "Response" variable to extract the real-time data and add these values as another column on the spreadsheet? Do you paste the response onto the spreadsheet and then decide where to set your delimiters?
 
Upvote 0
"Would I filter throught the "Response" variable to extract the real-time data and add these values as another column on the spreadsheet? Do you paste the response onto the spreadsheet and then decide where to set your delimiters?"

Bob. I don't understand... Are you trying to maintain a real-time loop? If so, I hope you are only watching several stocks because this is the long road to get data. Please explain... :)

I'm thinking that you are under the impression that this data will update automatically without further ado from us. Not so. This is just an example to grab a value or two with the added convenience of doing so from worksheet cells. If you are watching more than a few stocks, this will bog down in a hurry. If you do want to attemp to grab this data real time as the page auto-refreshes, there are many events using the DOM that may or may not be fired. I don't know.
 
Upvote 0
This function can be used to get the changing url for the frame similar to what John posted. One can then use a similar method to get the other data or use the other routine that I posted which is faster.

Code:
Function cnbc()
     'This project includes references to "Microsoft Internet Controls", shdocvw.dll and
     '"Microsoft HTML Object Library", mshtml.tlb
     
     'Variable declarations
    Dim myIE As New InternetExplorer 'New '
    Dim myURL As String
    Dim myDoc As HTMLDocument
    Dim s As String
     
     'Set starting URL and search string
    myURL = "http://www.cnbc.com/id/15837290?q=msft"
     
     'Make IE navigate to the URL and make browser visible
    myIE.navigate myURL
    myIE.Visible = False
     
     'Wait for the page to load
    Do While myIE.Busy Or myIE.readyState <> READYSTATE_COMPLETE
        DoEvents
    Loop
     
     'Set IE document into object    
    Set myDoc = myIE.document

     
s = myDoc.getElementById("IFRAME_ID0EKLAE15840257").src
    
    myIE.Quit
    Set myIE = Nothing
    Set myDoc = Nothing
    cnbc = s
     
End Function
 
Upvote 0
Thank you for the info Tom/Ken.

I was hoping I could extract both the real-time and the delayed data during the single navigation to the main URL. From what you guys have mentioned, it sounds like I need to navigate to the main page, extract the real-time data and then extract the current URL for the delayed data. I would then navigate to this "delayed data" URL and once there, extract the delayed data. Is this right?

To clarify my response to Tom's earlier posting, your method sounds more efficient if you could extract the real-time data as well during a single navigation. My question was, I see the "Response" string variable in your code, is the real-time data contained within this text string? If so, what is a simple technique to find that data's location in order to extract it? I was suggesting temporarily copying the entire string into a worksheet cell, find it by searching, once found I would create the Instr function to suite. The problem I'm having is that I can't copy to the worksheet cell, it kicks back errors. I'm wondering if this is because the code is part of an array?
 
Upvote 0
The Last price is in the response string. It is also at the first URL. Current volume is only in the first URL. Are these the two items you are looking for? If not, what data are you trying to get exactly?
 
Upvote 0
"Why not use a standard Web Query from Data > Import External Data?"

Because it's not nearly as much fun as writing a bunch of code and doing it the hard way. :) Amazing that not a one of us mentioned this earlier.

If you want the current colume, you will have to perform two separate queries.
 
Upvote 0
The web query method works fine, manually. Saving the query or refreshing it only gets the last table from that 2nd table's frame I suspect. I would need to test the query when the market is open.

Bob, you might consider another source like Yahoo to get what you need. You can probably get the same info there. The main advantage that you need is going to be speed.

Check out this link for yahoo queries. http://www.ozgrid.com/forum/showthread.php?t=95502
 
Upvote 0
Why use Excel or VBA in the first place?:)
 
Upvote 0

Forum statistics

Threads
1,213,489
Messages
6,113,953
Members
448,535
Latest member
alrossman

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top