Yahoo finance browser version error preventing web scraping

PhilAck

New Member
Joined
Jun 29, 2019
Messages
4
I have been scraping data from Yahoo Finance pages for years with VBA code in Excel (now in Excel 2010) by opening an IE window and working with the data from the document HTML. Aside from adjusting for Y.F. making changes often I've been able to get everything I need either from stock quote pages or from portfolios I have created. That is until earlier this month. Suddenly this message began appearing on many pages opened in Internet Explorer:
"Some parts of this page is not supported on your current browser version. Please upgrade the browser to latest."
I'm working on a Win 10 PC with Excel 2010 and IE 11.
the message doesn't appear on Edge, Firefox, Opera, or on my iphone. One reason is that IE is no longer supported by Microsoft and Finance Yahoo has, I guess, decided to follow their encouragement to stop development on web pages for IE. I'm sure there are other reasons but it has laid bare the vulnerability of anyone's VBA code that relies on opening an IE window for any scraping. Trouble is, I can't seem to find another approach that avoids the error message.
I found some great APIs for a lot of the data but there are other pages (such as the portfolio pages) and other sites that I also scrape data from that I either can't find an API for or have confirmed that the company doesn't offer an API for the data. So I need a way to scrape data.
VBA in Excel (and I guess other Office products) only supports Internet Explorer. There is no native way to open a different browser. I sure would welcome some help and discussion on this issue. I understand that Office 365 is working on a new webview for Edge but that is of little help to those of us who don't subscribe.
Here is a list of the ways I have tried to get around this and the results I have had so far (all failures). Please comment, offer suggestions for further research, or laugh and tell me to give up (hopefully not that). Also please tell me how to format this better if it looks bad; I'm new at this. To decide if an approach is successful I looked for the error message above and for specific data for stocks such as AAPL, ABT, ADBE in my portfolios.
1) VBA in an Excel macro:
Code:
Sub newget()
    Dim resp        As String
    Dim xmlhttp   As New MSXML2.XMLHTTP60
    
        With xmlhttp
            .Open "GET", "https://finance.yahoo.com/quote/AAPL/key-statistics?p=AAPL", False 
            .setRequestHeader "User-Agent", "Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A5376e Safari/8536.25"
            .send
            resp = .ResponseText
        End With
End Sub
The beginning of this response includes: html id="atomic" class="NoJs netscape desktop" lang="en-US"
The response includes the error message and is missing the chart. When I use the URL for one of my portfolios, the message is included but the stock data is missing.

2) also VBA in Excel Macro:
Same as above except xmlhttp is defined as New WinHttpRequest and the .setRequestHeader line is removed.
The beginning of this response includes: html id="atomic" class="NoJs netscape desktop" lang="en-US"
The results are the same.

3) I tried using Selenium (both basic and wrapper) and figured out it doesn't work with Firefox any more. I couldn't get it to work with Opera either. I was able to get some response from PhantomJS but haven't been able to get the data back in a way that I can examine it.
Code:
Sub newerget()
    Dim selWD As SeleniumWrapper.WebDriver
    Dim objnode1        As Object
    Dim objNode2        As Object
    
    Set selWD = New SeleniumWrapper.WebDriver
    selWD.Start "phantomjs", "https://finance.yahoo.com/quote/AAPL/key-statistics?p=AAPL"
    selWD.setTimeout ("120000")
    selWD.setImplicitWait (5000)
    Set objnode1 = selWD.findElementById("atomic")
End Sub
I don't know enough of the Selenium commands to look more closely at the response but the set objnode1 statement returned an error.

4) I gave up and wrote a Python script with BeautifulSoup to try to get the data:
Code:
from bs4 import BeautifulSoup
from urllib.request import urlopen
html=urlopen('https://finance.yahoo.com/quote/AAPL/key-statistics?p=AAPL')
print(html.read())
The beginning of this response includes: html id="atomic" class="NoJs featurephone" lang="en-US"
This also returned the error message and when I used the URL for one of my portfolios it did not return any stock data.


Sorry for the long post. I've been searching and trying as much as I can but I'm stumped. I'm hoping I have overlooked something simple (I can hope) or that I will have at least saved someone else some time so they don't have to try the same things again that I've tried. I find it hard to believe that all of these approaches are being caught by Finance Yahoo and are being treated the same way as they treat IE but that's what my testing is showing. Any help or suggestions would be most appreciated. By the way the first line of the HTML (as seen from the developers tools) in various browsers looks like this:
Firefox: html id="atomic" class="firefox desktop JsEnabled layoutEnhance(TwoColumnLayout) CollapsibleUh onDemandFocusSupport HideNavrail hasScrolled scrollDown" lang="en-US"
IE: html class=" lightweight ie-11 desktop JsEnabled" id=atomic" lang="en-US"
Opera (surprisingly): html id="atomic" class="chrome desktop JsEnabled layoutEnhance(TwoColumnLayout) CollapsibleUh onDemandFocusSupport hasScrolled scrollDown ShowNavrail" lang="en-US"


Could the JSEnabled and the NoJS things be a clue to the problem here? There seems to be a pattern but I have no clue what it means or how to do anything about it if it is a clue.

Thanks
Phil
 

Excel Facts

Can Excel fill bagel flavors?
You can teach Excel a new custom list. Type the list in cells, File, Options, Advanced, Edit Custom Lists, Import, OK
Not sure what data you're trying to pull exactly, but if you enter 'https://finance.yahoo.com/quote/AAPL/key-statistics?p=AAPL' as a new Power Query query, and select the source as 'From Web', I see available tables to import as

Balance Sheet
Cash Flow Statement
Dividends & Splits
Fiscal Year
Income Statement
Management Effectiveness
Profitablity
Share Statistics
and Stock Price History

Could be what you're looking for.
 
Upvote 0
I just did a quick search on this as I haven't heard of it before. It looks like it's available on Excel 2016. I only have Excel 2010 right now. Is there something similar in Excel 2010?
thanks
 
Upvote 0
Thanks for the info. I had heard about PowerPivot but had never seen it (your link is for PowerPivot). Very Interesting and I'm happy to see I can get it as an add-in to Excel 2010. I found the link for Power Query https://www.microsoft.com/en-us/download/details.aspx?id=39379 and installed that too. I agree it works for the URL I gave you and that will help for some of my data needs. However, when I tried to load a portfolio page I got the same error as all my other attempts and an interesting line at the top of the Web View pane that says "Table highlighting is disabled because this page uses Internet Explorer's Compatibility Mode." which may be why it is triggering the error.
Also, and I probably wasn't clear about this in my initial post, but the portfolios are only available with a login on Finance Yahoo and unlike the old web query (which of course no longer works) I can't log in on the web view. So even if the error message wasn't there, I don't think this would work for these pages or the ones on the other sites I need to get data from which are also behind a login. Thanks anyway.
I'm open to trying any other approaches if anyone thinks of any.
Thanks
 
Upvote 0

Forum statistics

Threads
1,213,487
Messages
6,113,941
Members
448,534
Latest member
benefuexx

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top