Website scraping

johnnyL

Well-known Member
Joined
Nov 7, 2011
Messages
4,546
Office Version
  1. 2007
Platform
  1. Windows
Hi All!

I am trying to come up with some code that will 'scrape' a couple of values from a website and save those values into a couple of Excel cells, and possibly autoupdate those cells whenever the website updates those values that were scraped.

The site that I am trying to scrape from is https://www.poloniex.com/exchange#btc_dgb

I am trying to Capture the top value listed for 'price' under the "SELL ORDERS", and the top value listed for 'price' under the "BUY ORDERS", which are located about half way down the page.

The problem that I am encountering is that, as the list of 'buy' and 'sell' orders change, the code to capture the values that I seek also change. :(

Sample code:
Code:
'
        Dim IE As New InternetExplorer
        Dim element As Object
'
'       Allow the Browser window, that we will be scraping values from, to be visible
        IE.Visible = True
'
        Application.StatusBar = "Loading website … https://www.poloniex.com/exchange#btc_dgb"        ' Update status bar to inform the user of what is occurring
'
'       Browser address that we will be scraping values from
        IE.navigate "https://www.poloniex.com/exchange#btc_dgb"
'
        Do While (IE.Busy Or IE.readyState <> READYSTATE_COMPLETE)
'           Allow mouse clicks and such while info is being scraped from internet ... Create a delay to allow the webpage to fully load before proceeding
            Application.Wait (Now + TimeValue("00:00:01"))                ' Delay for x seconds
            DoEvents
        Loop
'
        Dim Doc As HTMLDocument
        Set Doc = IE.document
'
        Application.StatusBar = "Gathering Data from website … https://www.poloniex.com/exchange#btc_dgb"    ' Update status bar to inform the user of what is occurring
'
'       Return SellPrice
        Range("poloniex_SellPrice_1").Value = Doc.getElementsByTagName("td")(3).innerText                ' <--- This line is InCorrect
'
'       Return BuyPrice
        Range("poloniex_BuyPrice_1").Value = Doc.getElementsByTagName("td")(2).innerText                 ' <--- This line is InCorrect
'
'-----------
'


Any help from the members here would be most greatly appreciated!
 

Excel Facts

Can Excel fill bagel flavors?
You can teach Excel a new custom list. Type the list in cells, File, Options, Advanced, Edit Custom Lists, Import, OK
Re: Requested Help with Website scraping

Try...

Code:
Range("poloniex_SellPrice_1").Value = Doc.getElementById("sellOrderBookTable").getElementsByTagName("tr")(1).cells(0).innerText

Range("poloniex_BuyPrice_1").Value = Doc.getElementById("buyOrderBookTable").getElementsByTagName("tr")(1).cells(0).innerText

Hope this helps!
 
Upvote 0
Re: Requested Help with Website scraping

Try...

Code:
Range("poloniex_SellPrice_1").Value = Doc.getElementById("sellOrderBookTable").getElementsByTagName("tr")(1).cells(0).innerText

Range("poloniex_BuyPrice_1").Value = Doc.getElementById("buyOrderBookTable").getElementsByTagName("tr")(1).cells(0).innerText

Hope this helps!

Domenic, Thank You So much for your Suggestion. In all of my googling, I never realized that combining was allowed. I will test your suggestion out in the near future, rest assured. Any thoughts on code for autoupdating cells? Thank you again for your input!
 
Last edited:
Upvote 0
Re: Requested Help with Website scraping

It looks like you can either continue accessing the data using the HTTP method that you're currently using, or you can use APIs. For the former, you can simply manually run the code as desired or you can automatically have your code run at set intervals using the OnTime method of the Application object (you can search Google for examples). For the latter, have a look at the following link...

https://docs.poloniex.com/#introduction
 
Upvote 0
Re: Requested Help with Website scraping

Domenic, Thank You So much for your Suggestion. In all of my googling, I never realized that combining was allowed. I will test your suggestion out in the near future, rest assured. Any thoughts on code for autoupdating cells? Thank you again for your input!

Domenic, sorry for the late response back.

I tested your above suggested code out today and it worked marvelously, after I introduced a delay into the code to allow it to load fully! I also had a chance to look at how it works, as I understand it, and I was able to adapt it to a few other websites that I was wanting to scrape from, again, Thanks to you for that !!!

There remains one other site currently that I am having difficulty with scraping data from. That website is https://www.livecoin.net/en/trading/DGB_BTC

I would be most grateful if you, or another member, would assist me in scraping the top value listed for 'price' under the "SELLING", and the top value listed for 'price' under the "BUYING" .

In the meantime I am going to check out the link you suggested in your last post to see if I can figure out the "autoupdating" portion of my question.

Again, Thank you so much for your time, and all of the other members here that take the time, to read my questions!
 
Upvote 0
Re: Requested Help with Website scraping

Take this with a grain of salt, but as for the "autoupdating" feature, what if you set a timer for your scrip to run ever 30 minutes (or whatever interval you prefer).

This would would ensure that you always have an updated value every 30 min.

You could take it a step further and add a condition: IF the newly scraped numbers are the same as whats currently in the worksheet, dont copy.

Ps. Im brand new, so use caution when taking my advise. If it did help you, dont forget to "Thank you for posting this" and or "Like this post" :biggrin:
 
Upvote 0
Re: Requested Help with Website scraping

As for the website https://www.livecoin.net/en/trading/DGB_BTC it could be blocking your request. Some websites dont like people scraping their information, and therefor will implement speed bumps to make it harder to scrape.

You could try reverse engineering the website in question, by understanding How to prevent crawlers, and scrapers on a website.

Ps. Im brand new, so use caution when taking my advise. If it did help you, dont forget to "Thank you for posting this" and or "Like this post" :biggrin:
 
Upvote 0
Re: Requested Help with Website scraping

Domenic, Thank you for that link, But I am sad to say that I would not know how to begin to use that information. That is above my knowledge/paygrade at this point in time. :(
 
Upvote 0
Re: Requested Help with Website scraping

Take this with a grain of salt, but as for the "autoupdating" feature, what if you set a timer for your scrip to run ever 30 minutes (or whatever interval you prefer).

This would would ensure that you always have an updated value every 30 min.

You could take it a step further and add a condition: IF the newly scraped numbers are the same as whats currently in the worksheet, dont copy.

Ps. Im brand new, so use caution when taking my advise. If it did help you, dont forget to "Thank you for posting this" and or "Like this post" :biggrin:

BrandonBerner, Thank you so much for your reply!

A timer is one approach, as you suggested, although it would need to be much quicker than you suggested. I was hoping that Excel would have a more elaborate approach. I have never coded a timer executed code before, but I think I could google that with relative ease. I have experience with delays and such, so I am assuming it would involve similar coding.

Thank you so much for your suggestion.

Edit: I would like to add, by the time all of the cells updated, I am sure that the initial timer set would have expired, so basically no need for a timer, as such, basically it would be constantly looping to update.
 
Last edited:
Upvote 0
Re: Requested Help with Website scraping

As for the website https://www.livecoin.net/en/trading/DGB_BTC it could be blocking your request. Some websites dont like people scraping their information, and therefor will implement speed bumps to make it harder to scrape.

You could try reverse engineering the website in question, by understanding How to prevent crawlers, and scrapers on a website.

Ps. Im brand new, so use caution when taking my advise. If it did help you, dont forget to "Thank you for posting this" and or "Like this post" :biggrin:

BrandonBerner, I don't believe that my requests are being blocked because I can scrape one of the values that I am seeking, I just can't seem to to scrape the second value that I seek from that site. :(
 
Upvote 0

Forum statistics

Threads
1,215,409
Messages
6,124,733
Members
449,185
Latest member
hopkinsr

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top