Page 3 of 3 FirstFirst 123
Results 21 to 30 of 30

Extracting web data from a sub web page

This is a discussion on Extracting web data from a sub web page within the Excel Questions forums, part of the Question Forums category; Norie, what would be a better option than Excel or VBA for obtaining the data? The data I'm looking to ...

  1. #21
    Board Regular
    Join Date
    Jul 2004
    Posts
    95

    Default Re: Extracting web data from a sub web page

    Norie, what would be a better option than Excel or VBA for obtaining the data?

    The data I'm looking to retrieve is both the "real-time" last trade and volume data (which is located under the Options header) as well as the "delayed" day low, day high and today's open data (which is located under the Microsoft Corp header) at this URL;

    http://www.cnbc.com/id/15837290?q=msft

    I'd like to obtain both sets of data ("real-time" data is on the main URL and "delayed" data is on the iframe URL) in a single navigation (if possible) as per Tom's xls file. How do I get both sets of data using Tom's xls file?

  2. #22
    Board Regular
    Join Date
    Mar 2002
    Location
    Cincinnati, Ohio, USA
    Posts
    6,824

    Default Re: Extracting web data from a sub web page

    "Why not use a standard Web Query from Data > Import External Data?"

    Because the frame's URL changes constantly. So I suppose a web query would not work repeatedly.

    Bob. The additional data, that you did not mention before, can be retrieved from the results in my previous reply. I don't know which volume you are after.


  3. #23
    Board Regular
    Join Date
    Jul 2004
    Posts
    95

    Default Re: Extracting web data from a sub web page

    I'd like to retrieve the "Last Trade", "Volume" and "Time Stamp" from the upper half of the webpage (located in the upper "real-time" banner, in the region indicated by your upper circled Volume on your bitmap). And at the same time, I'd like to retrieve the "Day Low", "Day High" and "Today's Open" from the bottom half of the webpage (located in the lower "delayed" banner, in the region indicated by your lower circled Volume on your bitmap).
    Can I do this using your xls file you sent earlier?

  4. #24
    Board Regular
    Join Date
    Jul 2004
    Posts
    95

    Default Re: Extracting web data from a sub web page

    Thank you everyone for your contributions!!

    I used John/Kenneth's methods mentioned earlier and am making some headway! Looks like this is the way to go taking the IE "no more than 1 document link" security enforcement into consideration.

    Thank you again everyone for your input, I will post my final code once I get it working 100%.

  5. #25
    Board Regular
    Join Date
    Mar 2002
    Location
    Cincinnati, Ohio, USA
    Posts
    6,824

    Default Re: Extracting web data from a sub web page

    Bob. How often are you going to query this site? Are you trying to import the data into the spreadsheet as often as it updates? How many stocks are you watching?

  6. #26
    Board Regular
    Join Date
    Jul 2004
    Posts
    95

    Default Re: Extracting web data from a sub web page

    Hi Tom,

    I'd like to query the site on demand manually (vb icon on the spreadsheet). I'd be updating a single symbol at a time.

  7. #27
    Board Regular
    Join Date
    Mar 2002
    Location
    Cincinnati, Ohio, USA
    Posts
    6,824

    Default Re: Extracting web data from a sub web page

    Hi Bob. I had some fun with this just to see what could be done. Here are two example workbooks using the same code. One shows how to impliment this as a custom worksheet function to get realtime and openhighlow data on demand. The other shows how to automatically update a range of cells with the realtime data. I'll post the code in my next reply.

    UDF Example:
    0708081250.328272.UDF.zip

    Realtime Data Sync Example:
    0708081250.328272.RTD.zip

  8. #28
    Board Regular
    Join Date
    Mar 2002
    Location
    Cincinnati, Ohio, USA
    Posts
    6,824

    Default Re: Extracting web data from a sub web page

    Example code using the webbrowser control and MSHTML object library.

    Add a userform. Name it frmDataItem. Add WebBrowser1. (web browser control). There is no code in the userform.

    Add a class module named OHLArgs
    Code:
    Option Explicit
    
    Public Symbol As String
    
    Public OpenText As String
    Public LowText As String
    Public HighText As String
    
    Public OpenValue As Currency
    Public LowValue As Currency
    Public HighValue As Date
    
    Public HasError As Boolean
    Add a class module named RTDArgs
    Code:
    Option Explicit
    
    Public Symbol As String
    
    Public LastText As String
    Public VolumeText As String
    Public TimeText As String
    
    Public LastValue As Currency
    Public VolumeValue As Currency
    Public TimeValue As Date
    
    Public HasError As Boolean
    Add a class module named RTDItem
    Code:
    Option Explicit
    
    Private Declare Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
    
    Private Const UrlPart As String = "http://www.cnbc.com/id/15837290?q="
    
    Public Event UpdateAvailable(Symbol As String, RealTimeData As RTDArgs)
    Public Event TimedOut(Symbol As String)
    
    'requires a reference to Microsoft HTML Object Library
    Private WithEvents TimeSpan As HTMLSpanElement
    Private LastSpan As HTMLSpanElement
    Private VolumeSpan As HTMLSpanElement
    
    Private HTMLDoc As HTMLDocument
    
    Private WithEvents WebB As WebBrowser
    
    Private pFrm As frmDataItem
    Private pSymbol As String
    Private pEnableEvents As Boolean
    Private pConnected As Boolean
    Private pTimeOutSeconds As Integer
    Private pSessionID As String
    Private pFrameSource As String
    Private pForceRefresh As Boolean
    Private pRealTimeData(2) As String
    Private pOpenHighLow(2) As String
    Private pCancelNavigation As Boolean
    Private pLastRefreshTime As Date
    Private pCallerIsUDF As Boolean
    
    Friend Function OpenHighLow() As OHLArgs
        If Not pConnected Then Exit Function
        If DateDiff("n", pLastRefreshTime, Now) > 1 Then Me.Refresh
        
        Dim Args As OHLArgs
        Set Args = New OHLArgs
        
        With Args
            .Symbol = pSymbol
            .OpenText = pOpenHighLow(0)
            .HighText = pOpenHighLow(1)
            .LowText = pOpenHighLow(2)
        
            If IsNumeric(pOpenHighLow(0)) Then
                .OpenValue = CCur(pOpenHighLow(0))
            Else
                .HasError = True
            End If
        
            If IsNumeric(pOpenHighLow(1)) Then
                .HighValue = CCur(pOpenHighLow(1))
            Else
                .HasError = True
            End If
            
            If IsNumeric(pOpenHighLow(2)) Then
                .LowValue = CDate(pOpenHighLow(2))
            Else
                .HasError = True
            End If
        End With
        
        Set OpenHighLow = Args
    End Function
    
    Friend Property Get RTD() As RTDArgs
        If Not pConnected Then Exit Property
        Dim Args As RTDArgs
        
        Set Args = New RTDArgs
        
        With Args
            .Symbol = pSymbol
            .LastText = pRealTimeData(0)
            .VolumeText = pRealTimeData(1)
            .TimeText = pRealTimeData(2)
        
            If IsNumeric(pRealTimeData(0)) Then
                .LastValue = CCur(pRealTimeData(0))
            Else
                .HasError = True
            End If
        
            If IsNumeric(pRealTimeData(1)) Then
                .VolumeValue = CCur(pRealTimeData(1))
            Else
                .HasError = True
            End If
            
            If IsDate(pRealTimeData(2)) Then
                .TimeValue = CDate(pRealTimeData(2))
            Else
                .HasError = True
            End If
        End With
        
        Set RTD = Args
    End Property
    
    Friend Function Connect(Symbol As String, _
       Optional EnableEvents As Boolean, _
       Optional TimeOutSeconds As Integer = 10, _
       Optional CallerIsUDF As Boolean) As Boolean
       
        Dim TimeOutTime As Date
        
        Me.TimeOutSeconds = TimeOutSeconds
        pCallerIsUDF = CallerIsUDF
        
        Me.Symbol = Symbol
        
        If Not pCallerIsUDF Then
            Me.EnableEvents = EnableEvents
        End If
        
        TimeOutTime = Now + TimeSerial(0, 0, TimeOutSeconds)
        Do Until Me.Connected
            Sleep 250
            If Now > TimeOutTime Then Exit Do
        Loop
        
        Connect = Me.Connected
        pLastRefreshTime = Now
    
    End Function
    
    Private Function GetProperties() As Boolean
        Dim Script As HTMLScriptElement, Frame As HTMLObjectElement
        Dim TempFrameSource As String
    
        On Error Resume Next
        Set HTMLDoc = WebB.Document
            
        For Each Script In HTMLDoc.Scripts
            If InStr(Script.innerHTML, "cnbc_RunQuoteStrip") <> 0 Then
                pSessionID = Replace(Split(Script.innerHTML, ",")(3), "'", "")
                Exit For
            End If
        Next
    
        For Each Frame In HTMLDoc.all
            TempFrameSource = Frame.src
            If InStr(TempFrameSource, "delayedQuotes&symbol=" & Symbol) <> 0 Then
                pFrameSource = TempFrameSource
                Exit For
            End If
        Next
        
        Set LastSpan = HTMLDoc.getElementById("WSODQ_" & pSymbol & "_LAST_0_" & pSessionID)
        Set VolumeSpan = HTMLDoc.getElementById("WSODQ_" & pSymbol & "_VOLUME_0_" & pSessionID)
        Set TimeSpan = HTMLDoc.getElementById("WSODQ_" & pSymbol & "_LASTTIME_0_" & pSessionID)
        
        If (Not LastSpan Is Nothing) And (Not VolumeSpan Is Nothing) And (Not TimeSpan Is Nothing) Then
            If LastSpan.innerText <> "" And VolumeSpan.innerText <> "" And TimeSpan.innerText <> "" And pFrameSource <> "" Then
                GetProperties = True
                Exit Function
            End If
        End If
    End Function
    
    Private Sub Class_Initialize()
        Set pFrm = New frmDataItem
        Set WebB = pFrm.WebBrowser1
    End Sub
    
    Private Sub Class_Terminate()
        Set WebB = Nothing
        Unload pFrm
        Set pFrm = Nothing
    End Sub
    
    Friend Sub Refresh()
        pForceRefresh = True
        Me.Symbol = pSymbol
        pLastRefreshTime = Now
    End Sub
    
    Friend Property Let TimeOutSeconds(ByVal vNewValue As Integer)
        If vNewValue < 5 Then vNewValue = 5
        pTimeOutSeconds = vNewValue
    End Property
    
    Friend Property Get TimeOutSeconds() As Integer
        TimeOutSeconds = pTimeOutSeconds
    End Property
    
    Friend Property Get EnableEvents() As Boolean
        EnableEvents = pEnableEvents
    End Property
    
    Friend Property Let EnableEvents(ByVal vNewValue As Boolean)
        pEnableEvents = vNewValue
    End Property
    
    Friend Property Get Symbol() As String
        Symbol = pSymbol
    End Property
    
    Friend Property Let Symbol(ByVal vNewValue As String)
        Dim TimeOutTime As Date, doc(1) As HTMLDocument
        
        If pSymbol <> vNewValue Or pForceRefresh Then
            pForceRefresh = False
            pSymbol = vNewValue
    '        pFrm.Show vbModeless
            pCancelNavigation = False
            TimeOutTime = Now + TimeSerial(0, 0, Int(TimeOutSeconds))
            WebB.Navigate UrlPart & pSymbol
            Do Until GetProperties
                If Now > TimeOutTime Then Exit Do
                DoEvents
            Loop
            WebB.Stop
            Set doc(0) = New HTMLDocument
            Set doc(1) = doc(0).createDocumentFromUrl(pFrameSource, "")
            Do Until doc(1).ReadyState = "complete"
                DoEvents
            Loop
            pOpenHighLow(0) = doc(1).getElementById("right").Children(0).Children(0).innerText
            pOpenHighLow(1) = doc(1).getElementById("stockQuote").getElementsByTagName("B")(2).innerText
            pOpenHighLow(2) = doc(1).getElementById("stockQuote").getElementsByTagName("B")(4).innerText
        End If
        Set LastSpan = HTMLDoc.getElementById("WSODQ_" & pSymbol & "_LAST_0_" & pSessionID)
        pConnected = True
    End Property
    
    Friend Property Get Connected() As Boolean
        Connected = pConnected
    End Property
    
    Private Sub TimeSpan_onpropertychange()
        pRealTimeData(0) = LastSpan.innerText
        pRealTimeData(1) = VolumeSpan.innerText
        pRealTimeData(2) = TimeSpan.innerText
        If pEnableEvents Then RaiseEvent UpdateAvailable(pSymbol, Me.RTD)
    End Sub
    
    Private Sub WebB_BeforeNavigate2(ByVal pDisp As Object, URL As Variant, Flags As Variant, TargetFrameName As Variant, PostData As Variant, Headers As Variant, Cancel As Boolean)
        If pCallerIsUDF Then
            Cancel = pCancelNavigation
            pCancelNavigation = True
        End If
    End Sub

    Example usage as a UDF.
    Code:
    Option Explicit
    
    Private RtdUdf As RTDItem
    
    Function OHL_UDF(Symbol As String) As Variant
        
        Dim v As Variant, OHLTemp() As Variant, OHLArgs As OHLArgs
        
        Application.Volatile False
        
        If RtdUdf Is Nothing Then
            Set RtdUdf = New RTDItem
            RtdUdf.Connect Symbol:=Symbol, CallerIsUDF:=True
        Else
            RtdUdf.Symbol = Symbol
        End If
        
        'Today's Open    Day High    Day Low Last    Volume
        'Time    Last Refresh Time   Status
        
        ReDim OHLTemp(4)
        Set OHLArgs = RtdUdf.OpenHighLow
        
        With OHLArgs
            If .HasError Then
                OHLTemp(0) = .OpenText
                OHLTemp(1) = .HighText
                OHLTemp(2) = .LowText
                OHLTemp(4) = "ERROR"
            Else
                OHLTemp(0) = .OpenValue
                OHLTemp(1) = .HighValue
                OHLTemp(2) = .LowValue
                OHLTemp(4) = "OK"
            End If
        End With
        
        OHLTemp(3) = Now
        OHL_UDF = OHLTemp
    End Function
    
    Function RTD_UDF(Symbol As String) As Variant()
        
        Dim v As Variant, RTDTemp() As Variant, RTDArgs As RTDArgs
        
        Application.Volatile False
        
        If RtdUdf Is Nothing Then
            Set RtdUdf = New RTDItem
            RtdUdf.Connect Symbol:=Symbol, CallerIsUDF:=True
        Else
            RtdUdf.Symbol = Symbol
        End If
        
        'Today's Open    Day High    Day Low Last    Volume
        'Time    Last Refresh Time   Status
        
        ReDim RTDTemp(4)
        Set RTDArgs = RtdUdf.RTD
        
        With RTDArgs
            If .HasError Then
                RTDTemp(0) = .LastText
                RTDTemp(1) = .VolumeText
                RTDTemp(2) = .TimeText
                RTDTemp(4) = "ERROR"
            Else
                RTDTemp(0) = .LastValue
                RTDTemp(1) = .VolumeValue
                RTDTemp(2) = .TimeValue
                RTDTemp(4) = "OK"
            End If
        End With
        
        RTDTemp(3) = Now
        RTD_UDF = RTDTemp
    End Function
    Example usage to get realtime data.
    Code:
    Option Explicit
    
    Private WithEvents RTD As RTDItem
    
    Sub StartRTD()
        
        Set RTD = New RTDItem
        RTD.Connect Symbol:="msft", EnableEvents:=True, TimeOutSeconds:=10, CallerIsUDF:=False
        If Not RTD.Connected Then
            MsgBox "Connection failed..."
        End If
            
    End Sub
    
    Sub StopRTD()
        Set RTD = Nothing
    End Sub
    
    Private Sub CommandButton1_Click()
        StartRTD
    End Sub
    
    Private Sub CommandButton2_Click()
        StopRTD
    End Sub
    
    Private Sub RTD_UpdateAvailable(Symbol As String, RealTimeData As RTDArgs)
        Dim DestRng As Range, RTDArgs As RTDArgs
        
        Set DestRng = [b2:e2]
      
        Set RTDArgs = RTD.RTD
        With RTDArgs
            If Not .HasError Then
                DestRng(1) = .LastValue
                DestRng(2) = .VolumeValue
                DestRng(3) = .TimeValue
                DestRng(4) = "Ok"
            Else
                DestRng(1) = .LastText
                DestRng(2) = .VolumeText
                DestRng(3) = .TimeText
                DestRng(4) = "Error"
            End If
        End With
         
        
    End Sub
    
    Private Sub RTD_TimedOut(Symbol As String)
        MsgBox "Timed Out..."
    End Sub

  9. #29
    Board Regular
    Join Date
    Mar 2002
    Location
    Cincinnati, Ohio, USA
    Posts
    6,824

    Default Re: Extracting web data from a sub web page

    If you get errors, increase the timeout.

  10. #30
    Board Regular
    Join Date
    Jul 2004
    Posts
    95

    Default Re: Extracting web data from a sub web page

    Thank you Tom, that works great! That is exactly what I was looking for! Thank you for the insight into how these embedded frames work!

    Thank you to everyone who contributed on this thread!

Page 3 of 3 FirstFirst 123

Like this thread? Share it with others

Like this thread? Share it with others

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  


DMCA.com