Yahoo finance browser version error preventing web scraping

PhilAck

New Member
Joined
Jun 29, 2019
Messages
4
I have been scraping data from Yahoo Finance pages for years with VBA code in Excel (now in Excel 2010) by opening an IE window and working with the data from the document HTML. Aside from adjusting for Y.F. making changes often I've been able to get everything I need either from stock quote pages or from portfolios I have created. That is until earlier this month. Suddenly this message began appearing on many pages opened in Internet Explorer:
"Some parts of this page is not supported on your current browser version. Please upgrade the browser to latest."
I'm working on a Win 10 PC with Excel 2010 and IE 11.
the message doesn't appear on Edge, Firefox, Opera, or on my iphone. One reason is that IE is no longer supported by Microsoft and Finance Yahoo has, I guess, decided to follow their encouragement to stop development on web pages for IE. I'm sure there are other reasons but it has laid bare the vulnerability of anyone's VBA code that relies on opening an IE window for any scraping. Trouble is, I can't seem to find another approach that avoids the error message.
I found some great APIs for a lot of the data but there are other pages (such as the portfolio pages) and other sites that I also scrape data from that I either can't find an API for or have confirmed that the company doesn't offer an API for the data. So I need a way to scrape data.
VBA in Excel (and I guess other Office products) only supports Internet Explorer. There is no native way to open a different browser. I sure would welcome some help and discussion on this issue. I understand that Office 365 is working on a new webview for Edge but that is of little help to those of us who don't subscribe.
Here is a list of the ways I have tried to get around this and the results I have had so far (all failures). Please comment, offer suggestions for further research, or laugh and tell me to give up (hopefully not that). Also please tell me how to format this better if it looks bad; I'm new at this. To decide if an approach is successful I looked for the error message above and for specific data for stocks such as AAPL, ABT, ADBE in my portfolios.
1) VBA in an Excel macro:
Code:
Sub newget()
    Dim resp        As String
    Dim xmlhttp   As New MSXML2.XMLHTTP60
    
        With xmlhttp
            .Open "GET", "https://finance.yahoo.com/quote/AAPL/key-statistics?p=AAPL", False 
            .setRequestHeader "User-Agent", "Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A5376e Safari/8536.25"
            .send
            resp = .ResponseText
        End With
End Sub
The beginning of this response includes: html id="atomic" class="NoJs netscape desktop" lang="en-US"
The response includes the error message and is missing the chart. When I use the URL for one of my portfolios, the message is included but the stock data is missing.

2) also VBA in Excel Macro:
Same as above except xmlhttp is defined as New WinHttpRequest and the .setRequestHeader line is removed.
The beginning of this response includes: html id="atomic" class="NoJs netscape desktop" lang="en-US"
The results are the same.

3) I tried using Selenium (both basic and wrapper) and figured out it doesn't work with Firefox any more. I couldn't get it to work with Opera either. I was able to get some response from PhantomJS but haven't been able to get the data back in a way that I can examine it.
Code:
Sub newerget()
    Dim selWD As SeleniumWrapper.WebDriver
    Dim objnode1        As Object
    Dim objNode2        As Object
    
    Set selWD = New SeleniumWrapper.WebDriver
    selWD.Start "phantomjs", "https://finance.yahoo.com/quote/AAPL/key-statistics?p=AAPL"
    selWD.setTimeout ("120000")
    selWD.setImplicitWait (5000)
    Set objnode1 = selWD.findElementById("atomic")
End Sub
I don't know enough of the Selenium commands to look more closely at the response but the set objnode1 statement returned an error.

4) I gave up and wrote a Python script with BeautifulSoup to try to get the data:
Code:
from bs4 import BeautifulSoup
from urllib.request import urlopen
html=urlopen('https://finance.yahoo.com/quote/AAPL/key-statistics?p=AAPL')
print(html.read())
The beginning of this response includes: html id="atomic" class="NoJs featurephone" lang="en-US"
This also returned the error message and when I used the URL for one of my portfolios it did not return any stock data.


Sorry for the long post. I've been searching and trying as much as I can but I'm stumped. I'm hoping I have overlooked something simple (I can hope) or that I will have at least saved someone else some time so they don't have to try the same things again that I've tried. I find it hard to believe that all of these approaches are being caught by Finance Yahoo and are being treated the same way as they treat IE but that's what my testing is showing. Any help or suggestions would be most appreciated. By the way the first line of the HTML (as seen from the developers tools) in various browsers looks like this:
Firefox: html id="atomic" class="firefox desktop JsEnabled layoutEnhance(TwoColumnLayout) CollapsibleUh onDemandFocusSupport HideNavrail hasScrolled scrollDown" lang="en-US"
IE: html class=" lightweight ie-11 desktop JsEnabled" id=atomic" lang="en-US"
Opera (surprisingly): html id="atomic" class="chrome desktop JsEnabled layoutEnhance(TwoColumnLayout) CollapsibleUh onDemandFocusSupport hasScrolled scrollDown ShowNavrail" lang="en-US"


Could the JSEnabled and the NoJS things be a clue to the problem here? There seems to be a pattern but I have no clue what it means or how to do anything about it if it is a clue.

Thanks
Phil
 

Some videos you may like

Excel Facts

Is there a shortcut key for strikethrough?
Ctrl+S is used for Save. Ctrl+5 is used for Strikethrough. Why Ctrl+5? When you use hashmarks to count |||| is 4, strike through to mean 5.

lrobbo314

Well-known Member
Joined
Jul 14, 2008
Messages
2,951
Office Version
  1. 365
  2. 2019
  3. 2016
Platform
  1. Windows
Not sure what data you're trying to pull exactly, but if you enter 'https://finance.yahoo.com/quote/AAPL/key-statistics?p=AAPL' as a new Power Query query, and select the source as 'From Web', I see available tables to import as

Balance Sheet
Cash Flow Statement
Dividends & Splits
Fiscal Year
Income Statement
Management Effectiveness
Profitablity
Share Statistics
and Stock Price History

Could be what you're looking for.
 

PhilAck

New Member
Joined
Jun 29, 2019
Messages
4
I just did a quick search on this as I haven't heard of it before. It looks like it's available on Excel 2016. I only have Excel 2010 right now. Is there something similar in Excel 2010?
thanks
 

lrobbo314

Well-known Member
Joined
Jul 14, 2008
Messages
2,951
Office Version
  1. 365
  2. 2019
  3. 2016
Platform
  1. Windows
I think you can download and install it from here.
 

PhilAck

New Member
Joined
Jun 29, 2019
Messages
4
Thanks for the info. I had heard about PowerPivot but had never seen it (your link is for PowerPivot). Very Interesting and I'm happy to see I can get it as an add-in to Excel 2010. I found the link for Power Query https://www.microsoft.com/en-us/download/details.aspx?id=39379 and installed that too. I agree it works for the URL I gave you and that will help for some of my data needs. However, when I tried to load a portfolio page I got the same error as all my other attempts and an interesting line at the top of the Web View pane that says "Table highlighting is disabled because this page uses Internet Explorer's Compatibility Mode." which may be why it is triggering the error.
Also, and I probably wasn't clear about this in my initial post, but the portfolios are only available with a login on Finance Yahoo and unlike the old web query (which of course no longer works) I can't log in on the web view. So even if the error message wasn't there, I don't think this would work for these pages or the ones on the other sites I need to get data from which are also behind a login. Thanks anyway.
I'm open to trying any other approaches if anyone thinks of any.
Thanks
 

Watch MrExcel Video

Forum statistics

Threads
1,118,703
Messages
5,573,712
Members
412,548
Latest member
wallisonlac
Top