Website scraping

johnnyL

Well-known Member
Joined
Nov 7, 2011
Messages
4,546
Office Version
  1. 2007
Platform
  1. Windows
Hi All!

I am trying to come up with some code that will 'scrape' a couple of values from a website and save those values into a couple of Excel cells, and possibly autoupdate those cells whenever the website updates those values that were scraped.

The site that I am trying to scrape from is https://www.poloniex.com/exchange#btc_dgb

I am trying to Capture the top value listed for 'price' under the "SELL ORDERS", and the top value listed for 'price' under the "BUY ORDERS", which are located about half way down the page.

The problem that I am encountering is that, as the list of 'buy' and 'sell' orders change, the code to capture the values that I seek also change. :(

Sample code:
Code:
'
        Dim IE As New InternetExplorer
        Dim element As Object
'
'       Allow the Browser window, that we will be scraping values from, to be visible
        IE.Visible = True
'
        Application.StatusBar = "Loading website … https://www.poloniex.com/exchange#btc_dgb"        ' Update status bar to inform the user of what is occurring
'
'       Browser address that we will be scraping values from
        IE.navigate "https://www.poloniex.com/exchange#btc_dgb"
'
        Do While (IE.Busy Or IE.readyState <> READYSTATE_COMPLETE)
'           Allow mouse clicks and such while info is being scraped from internet ... Create a delay to allow the webpage to fully load before proceeding
            Application.Wait (Now + TimeValue("00:00:01"))                ' Delay for x seconds
            DoEvents
        Loop
'
        Dim Doc As HTMLDocument
        Set Doc = IE.document
'
        Application.StatusBar = "Gathering Data from website … https://www.poloniex.com/exchange#btc_dgb"    ' Update status bar to inform the user of what is occurring
'
'       Return SellPrice
        Range("poloniex_SellPrice_1").Value = Doc.getElementsByTagName("td")(3).innerText                ' <--- This line is InCorrect
'
'       Return BuyPrice
        Range("poloniex_BuyPrice_1").Value = Doc.getElementsByTagName("td")(2).innerText                 ' <--- This line is InCorrect
'
'-----------
'


Any help from the members here would be most greatly appreciated!
 
Re: Requested Help with Website scraping

BrandonBerner, Thank you so much for your reply!

A timer is one approach, as you suggested, although it would need to be much quicker than you suggested. I was hoping that Excel would have a more elaborate approach. I have never coded a timer executed code before, but I think I could google that with relative ease. I have experience with delays and such, so I am assuming it would involve similar coding.

Thank you so much for your suggestion.

What about:

1. Create a new Sheet in your workbook called (EX: Scrape Log) with 3 columns.
- Column A - BUY
- Column B - SELL
- Column C - Time Stamp

2. Scrape website for Top Buy|Sell on a loop.
- Every few seconds your Scrape Log will have another entry.

3. Create a table, and filter your log based on newest date to oldest date.
- Now you have a log which will always show you the current BUY and SELL on A2:B2


We could take it a step further, and assume you want to only display the most recent BUY and SELL price.

1. In another sheet, you could create a formula
-
HTML:
='Scrape Log'! A2:C2
- This will print out the most recent collected data only, and will auto update.
 
Upvote 0

Excel Facts

Fastest way to copy a worksheet?
Hold down the Ctrl key while dragging tab for Sheet1 to the right. Excel will make a copy of the worksheet.
Re: Requested Help with Website scraping

For your other website, try...

Code:
Range("A2").value = Doc.getElementsByClassName("orderBookList___2mrBk")(0).querySelector("div.orderBookLeft___1D43u").innerText

Range("B2").value =  Doc.getElementsByClassName("orderBookList___2mrBk")(1).querySelector("div.orderBookLeft___1D43u").innerText
 
Upvote 0
Solution
Re: Requested Help with Website scraping

For your other website, try...

Code:
Range("A2").value = Doc.getElementsByClassName("orderBookList___2mrBk")(0).querySelector("div.orderBookLeft___1D43u").innerText

Range("B2").value =  Doc.getElementsByClassName("orderBookList___2mrBk")(1).querySelector("div.orderBookLeft___1D43u").innerText

Domenic, Once again you are spot on perfect in assisting me! TY so much once again!
 
Upvote 0
Re: Requested Help with Website scraping

...

There remains one other site currently that I am having difficulty with scraping data from. That website is https://www.livecoin.net/en/trading/DGB_BTC

I would be most grateful if you, or another member, would assist me in scraping the top value listed for 'price' under the "SELLING", and the top value listed for 'price' under the "BUYING" .

...

It seems that I have ran into another problem with scraping data from that site. :(

Domenic's suggested code worked flawlessly to scrape the buy and sell prices from that site, but I, for the life of me, can't seem to figure out the code that is required to scrape the values to the right of both of those values, ie the top values under the 'amount' columns on that page. :( I have figured out the code needed to do this for the other pages that I requested help from. :) But this page is being difficult.

Another lifeline would be most appreciated.
 
Upvote 0
Re: Requested Help with Website scraping

WEEEEEEEE!!!! I finally figured it out! After messing with it some more today I finally got it.

Code:
Range("livecoin.net_SellPrice_2").Value = num(Doc.getElementsByClassName("orderBookList___2mrBk")(0).querySelector("span").innerText)

Again, Thank you so much for everyone that has participated in this thread, as well as others.
 
Upvote 0
Re: Requested Help with Website scraping

Hey Guys,
I'm new to this, but I was wondering if someone could assist me with the vba to extract the inner data and add the value to a cell?
Below is the html that I am trying to get the data from.

For Example, if I just wanted to get the "$8,153.00 / Remaining"

<td colspan="3">$10,000.00 / Contract<span style="margin-left:15px;">(In Plan Network)</span><br><span class="sectionHeader">Message</span>: EMBEDDED
</td>
<td colspan="3">$8,153.00 / Remaining<span style="margin-left:15px;">(In Plan Network)</span><br><span class="sectionHeader">Message</span>: EMBEDDED
</td>
<td colspan="3">$3,153.00 / Remaining<span style="margin-left:15px;">(In Plan Network)</span><br><span class="sectionHeader">Message</span>: NON-EMBEDDED
</td>
 
Upvote 0
Re: Requested Help with Website scraping

Hey Guys,
I'm new to this, but I was wondering if someone could assist me with the vba to extract the inner data and add the value to a cell?
Below is the html that I am trying to get the data from.

For Example, if I just wanted to get the "$8,153.00 / Remaining"

<td colspan="3">$10,000.00 / Contract<span style="margin-left:15px;">(In Plan Network)</span><br><span class="sectionHeader">Message</span>: EMBEDDED
</td>
<td colspan="3">$8,153.00 / Remaining<span style="margin-left:15px;">(In Plan Network)</span><br><span class="sectionHeader">Message</span>: EMBEDDED
</td>
<td colspan="3">$3,153.00 / Remaining<span style="margin-left:15px;">(In Plan Network)</span><br><span class="sectionHeader">Message</span>: NON-EMBEDDED
</td>

Got the webpage that you are trying to scrape that from?
 
Upvote 0

Forum statistics

Threads
1,214,798
Messages
6,121,630
Members
449,041
Latest member
Postman24

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top