Extracting web data from a sub web page

99bobster99

Board Regular
Joined
Jul 19, 2004
Messages
95
Hi,

I am using the "CreateObject("InternetExplorer.Application")" method to extract data from a webpage. This webpage is a sub page that I cannot find the data for, even after filtering through all the ".document.all(x)" items? Any ideas on how to obtain this "sub page" data, since there is other data that I need which is on the "main" page? I don't want to have to call up this sub page separately since I'm already seeing it (physically able to view it on the main page but not able to extract it) from the main page? Any ideas?
 

Excel Facts

Difference between two dates
Secret function! Use =DATEDIF(A2,B2,"Y")&" years"&=DATEDIF(A2,B2,"YM")&" months"&=DATEDIF(A2,B2,"MD")&" days"
By sub page to you mean a frame?

If so then take a look at the frames collection of the main document and it's document object.
 
Upvote 0
I tried searching all the "frames" as you suggested, but could not find this sub page? I'm not sure what this sub section of the webpage is, but when I tried creating a "Web Query" for it, in Excel, there is a different URL mentioned that imports only this "sub web page", as I call it? I could do another .Navigate call to this sub page, but I'm determined now to find out how to drill down to find it in the Main web page??
 
Upvote 0
Without seeing the web site's source code or the web page link and what you need, it is hard to advise further.
 
Upvote 0
The webpage I'm trying to extract info from is;

http://www.cnbc.com/id/15837290?q=msft

I'd like to retrieve the "Today's Open", "Day Low" and "Day High" values?? I can extract the banner above (Last Trade, Volume and time stamp), but not the other data. It seems to be on another webpage link??
 
Upvote 0
I found an article on HTMLWindowProxy that searches sub frames on a webpage. It didn't work very well, is this the route to go or is there another avenue?
 
Upvote 0
The Last Trade and Volume values on the banner are on http://www.cnbc.com/id/15837290?q=msft, whereas the Day Low, Day High etc. values are indeed in a separate frame, which is hosted by a different domain, http://apps.cnbc.com.

Consequently, because of the cross-frame security restrictions in Internet Explorer which prevent a script accessing the document object of a frame in a different URL, we can't access the HTML document object containing the stock data directly from the frameset.

Instead, we read the src attribute of the < iframe > and if it starts with 'http://', deem this to be the URL containing the stock data and then navigate to that URL.

Code:
Private Function getStockFrameURL(ByVal mDoc As MSHTML.IHTMLDocument2) As String

    Dim mElements As MSHTML.IHTMLElementCollection
    Dim mFrameElement As MSHTML.HTMLFrameElement
    Dim i As Integer
    
    'Loop through all < iframe > elements, looking for the first http:// frame
    
    Set mElements = mDoc.getElementsByTagName("IFRAME")

    getStockFrameURL = ""
    i = 0
    While i < mElements.Length And getStockFrameURL = ""
        Set mFrameElement = mElements.Item(i)
        Debug.Print mFrameElement.src
        If InStr(mFrameElement.src, "http://") = 1 Then
            getStockFrameURL = mFrameElement.src
        End If
        i = i + 1
    Wend

End Function
You would call this function using:

Code:
sURL = getStockFrameURL(IE.document)

IE.Navigate sURL
Assuming that IE is your CreateObject("InternetExplorer.Application") object and it currently holds the cnbc.com stock page.

Note: The above code requires the Microsoft HTML Object Library. Set this in the VB Editor via Tools - References.

Post your code if you're not sure how to fit my code in.
 
Upvote 0
For the parsing method that I discussed, you can use this. You may need to verify that the Frame's URL does not change day to day.

Be sure to add the Reference as commented below. Note how I used two possible methods to parse the data in my MidStr routine.

The forum had a problem posting the routine where I constructed the unique string for Day High. See the Quote section that follows for the correct syntax.

Code:
Sub ParseSource()
  Dim s As String
  
  s = URLSource("http://apps.cnbc.com/view.asp?YYY330_XPDfu5vdhz9HyCRjr3G70R0A9hRO7fWtFsoR8c7jlUv40AuQQ+bZ2e2tRbDtPKi0jNSra6i07nSOl37m3JhZcF6UFhBqkWisiu/agZBraCwSPH4Ti6lcGQ==&uid=stocks/delayedQuotes&symbol=msft")
  
  'Today's Open
  Debug.Print "Today's Open: ", MidStr(s, ">", "
Today’s Open")
  'Day Low
  Debug.Print "Day Low:", MidStr(s, "$", "Beta", 53)
  'Day High
  Dim q As String
  q = """"
  Debug.Print "Day High", MidStr(s, "$", _
    "" & vbCrLf & vbTab & vbTab & _
    "" & vbCrLf & vbTab & vbTab & _
    "Ask")
End Sub

'Finds mid string from sTo and then back to sFrom. So, make sTo unique.
Function MidStr(str As String, sFrom As String, sTo As String, Optional toOffset As Integer = 0) As String
  Dim strSub As String, sBegPos As Long, sEndPos As Long
  
  sEndPos = InStr(str, sTo) - toOffset
  strSub = Left(str, sEndPos)
  sBegPos = InStrRev(strSub, sFrom) + 1
  
  MidStr = Mid(strSub, sBegPos, sEndPos - sBegPos)
End Function


Function URLSource(url As String) As String
'Add Reference for Microsoft WinHTTP Services, version 5.1, for this early bind method.
    Dim Request As New WinHttpRequest
    
    Request.Open "GET", url, False
    Request.send
    URLSource = Request.responseText
End Function

Debug.Print "Day High", MidStr(s, "$", _
"</b></td>" & vbCrLf & vbTab & vbTab & _
"<td></td>" & vbCrLf & vbTab & vbTab & _
"<td class=" & q & "dotsBelow" & q & "><div>Ask")
 
Last edited:
Upvote 0
The forum had a problem with part of my code that neither quote nor code tags could handle. It simply adds the vbcrlf and vbtab codes and the q to add quotes to make a unique string.

If you want it, PM or email me.
 
Upvote 0

Forum statistics

Threads
1,213,535
Messages
6,114,194
Members
448,554
Latest member
Gleisner2

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top