Web Scrape Error : getElementsByClassName

Excel_Novice_123

New Member
Joined
Apr 2, 2021
Messages
2
Office Version
  1. 365
Platform
  1. Windows
Hi all,

I'd be really grateful if someone could help me with the below. I am trying to get some news links from the website in the code below. It used to work fine but has recently started failing, and I can't figure out why.

The code should extract the timestamps of the first 5 news articles into column A, and then the headline and the reference website for each of those 5 articles into columns B and C.

When I run the code, the first For Next loop gives the error "Object variable or With block variable not set".

If I skip over this the second For Next loop gives blank output and doesn't retrieve anything from the website.


VBA Code:
Sub Get_Web_Data()

    ' HTMLDocument requires a reference to "Microsoft HTML Object Library".
    Dim request As Object
    Dim response As String
    Dim html As New HTMLDocument
    Dim website As String
    Dim timeStamp As String
    Dim my_data As Object
    Dim elem As Object
    Dim link As Object
    Dim i As Long

    ' Fetch the raw page source with a synchronous GET request.
    website = "https://www.forexfactory.com/news"
    Set request = CreateObject("MSXML2.XMLHTTP")
    request.Open "GET", website, False
    request.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
    request.send
    response = StrConv(request.responseBody, vbUnicode)
    html.body.innerHTML = response

    ' Timestamps of the first 5 articles go into column A (rows 2 to 6).
    ' This is the loop that raises "Object variable or With block variable not set".
    For i = 1 To 5
        timeStamp = html.getElementsByClassName("flexposts__nowrap flexposts__time nowrap").Item(i - 1).innerText
        ActiveSheet.Cells(i + 1, 1).Value = Trim(timeStamp)
    Next

    ' Headline (column B) and link (column C) of the first 5 articles.
    Set my_data = html.getElementsByClassName("flexposts__title title")
    i = 1
    For Each elem In my_data
        Set link = elem.getElementsByTagName("a")(0)
        i = i + 1
        If i > 6 Then
            Exit For
        End If
        ActiveSheet.Cells(i, 3).Value = link.href
        ActiveSheet.Cells(i, 2).Value = link.innerText
    Next

End Sub
 

John_w

MrExcel MVP
Joined
Oct 15, 2007
Messages
6,754
Have you looked at the HTML returned by your request? It includes JavaScript code and data which prevents non-browser requests from seeing the web data. The HTML doesn't contain elements with the class name "flexposts__nowrap flexposts__time nowrap", hence the error.

The easiest solution is to automate a browser such as IE or Chrome to scrape the data. That is slower than XMLHTTP requests, but a browser runs the JavaScript automatically, giving you the rendered web data.
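
One way to look at the HTML returned by the request is sketched below. It is only an illustration, not a fix: it prints the status and the start of the response to the Immediate window and checks whether the class name from the original post appears anywhere in the text. The procedure name `Check_Response_For_Class` and the search string are assumptions made for this example.

```vba
Sub Check_Response_For_Class()
    ' Hypothetical diagnostic helper; the name is made up for this example.
    Dim request As Object
    Dim response As String

    Set request = CreateObject("MSXML2.XMLHTTP")
    request.Open "GET", "https://www.forexfactory.com/news", False
    request.send
    response = request.responseText

    ' Show what the server actually sent back (View > Immediate Window).
    Debug.Print "HTTP status: " & request.Status
    Debug.Print "Response length: " & Len(response)
    Debug.Print Left$(response, 500)

    ' If the class name is not in the raw text, getElementsByClassName finds
    ' nothing, and .Item(0) returns Nothing, which causes the error 91.
    If InStr(1, response, "flexposts__time", vbTextCompare) = 0 Then
        MsgBox "Class name not found in the response; " & _
               "the content is probably produced by JavaScript."
    Else
        MsgBox "Class name found in the response."
    End If
End Sub
```

If the message reports that the class name is missing, the parsing code is not at fault; the request simply never received the article markup.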
 

diddi

Well-known Member
Joined
May 20, 2004
Messages
3,301
Office Version
  1. 2010
Platform
  1. Windows
The site uses request cookies, which are not managed by XMLHTTP. It must have been recoded recently if it used to work :(
 

Excel_Novice_123

New Member
Joined
Apr 2, 2021
Messages
2
Office Version
  1. 365
Platform
  1. Windows
Thanks very much John_w and Diddi for the responses above, that's very helpful.

Please could I ask whether you have example code that shows how to scrape with IE or Chrome? Thanks
 

diddi

Well-known Member
Joined
May 20, 2004
Messages
3,301
Office Version
  1. 2010
Platform
  1. Windows

Do a search for posts concerning 'Selenium'; it is a browser automation tool that can drive Chrome. Not something I have experience with, though.
 

Worf

Well-known Member
Joined
Oct 30, 2011
Messages
4,129
A Selenium thread:

 

diddi

Well-known Member
Joined
May 20, 2004
Messages
3,301
Office Version
  1. 2010
Platform
  1. Windows

@Worf, is there a thread on installing Selenium for Win10? I had a try and I can't seem to see it in Excel.
 

Worf

Well-known Member
Joined
Oct 30, 2011
Messages
4,129
It is midnight and I am going offline now, will be back tomorrow...
 

Dan_W

Active Member
Joined
Jul 11, 2018
Messages
429
Office Version
  1. 365
Platform
  1. Windows
When I came across Selenium, I used a helpful guide that I can no longer find. The good news is that Wise Owl recently produced a video on getting started with Selenium (YouTube).
Basically, you need to:

1. Download the executable (see Releases)
2. Install SeleniumBasic
3. Download ChromeDriver (or whatever driver you need for whichever browser you want to use) - you need to make sure that the version of ChromeDriver matches your version of Chrome (v89 = v89).
4. Once the ChromeDriver.EXE file has downloaded, you need to help Selenium find it. According to Wise Owl (See here), you need to copy/move the ChromeDriver to the folder containing SeleniumBasic. I don't think that's the only way of doing it: according to the website, you just need to include the ChromeDriver location in your PATH environment variable (instructions).
5. You will need to add Selenium Basic to your project by adding it as a reference in the VBE.

I'd recommend looking at the VBScript examples provided in the package. Note that some of the examples use different drivers (for Firefox etc.), so don't be surprised if they don't work; you will need to adjust the code to make it work with your particular driver.
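
Once the steps above are done, the original macro could be adapted along the following lines. This is a rough sketch, not tested against the live site: it assumes the SeleniumBasic reference ("Selenium Type Library") is ticked in the VBE, that a matching ChromeDriver can be found, and that the class names from the original post still exist on the page. The procedure name `Get_Web_Data_Selenium` is made up for this example.

```vba
Sub Get_Web_Data_Selenium()
    ' Assumes SeleniumBasic is installed and referenced (Tools > References).
    Dim driver As New Selenium.ChromeDriver
    Dim elems As Object
    Dim elem As Object
    Dim r As Long

    driver.Get "https://www.forexfactory.com/news"

    ' Timestamps of the first 5 articles into column A (rows 2 to 6).
    ' The CSS selectors are derived from the class names in the original
    ' post and may no longer match the page.
    Set elems = driver.FindElementsByCss(".flexposts__time")
    r = 2
    For Each elem In elems
        If r > 6 Then Exit For
        ActiveSheet.Cells(r, 1).Value = Trim$(elem.Text)
        r = r + 1
    Next elem

    ' Headline (column B) and link (column C) of the first 5 articles.
    Set elems = driver.FindElementsByCss(".flexposts__title a")
    r = 2
    For Each elem In elems
        If r > 6 Then Exit For
        ActiveSheet.Cells(r, 2).Value = elem.Text
        ActiveSheet.Cells(r, 3).Value = elem.Attribute("href")
        r = r + 1
    Next elem

    driver.Quit
End Sub
```

Because the browser executes the page's JavaScript before the elements are queried, this approach sidesteps the empty response seen with XMLHTTP, at the cost of opening a Chrome window and running more slowly.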
 

diddi

Well-known Member
Joined
May 20, 2004
Messages
3,301
Office Version
  1. 2010
Platform
  1. Windows
Thx. I got the example running under Chrome.
 
