Extract / scrape DIV data from a website and copy it to Excel

Hi,
Here is the challenge.

In the past, page elements were more or less fixed and I was able to use my well-known QueryTables method with Workbooks.Open.

The website below has a div, "div.sidemeta", which contains the information I want to grab.
http://www.lynda.com/Web-Interactiv...ls page:1 s:relevance sa:true producttypeid:2


I have tried an alternative approach:

Sub Test()
    Dim IE As Object
    Dim IEdoc As Object

    Set IE = CreateObject("InternetExplorer.Application")
    IE.Navigate "http://www.website.com"
    IE.Visible = True

    'Wait for the page to finish loading
    Do While IE.Busy Or IE.ReadyState <> 4
        DoEvents
    Loop

    'Get data
    Set IEdoc = IE.document
    Debug.Print IEdoc.body.innerHTML

    IE.Quit
End Sub



The data I get from IEdoc.body.innerHTML does not show that particular DIV.
I have tried using getElementById, getNamedItem and getElementsByName without any luck.

Does anybody know how to do this?
Thanks in advance.
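
For reference, here is a minimal sketch of the kind of lookup I would expect to work, assuming the class name really is "sidemeta" and that the div is present in the loaded document rather than injected later by script:

Code:
Sub GetSideMeta()
    Dim IE As Object
    Dim divs As Object
    Dim i As Long

    Set IE = CreateObject("InternetExplorer.Application")
    IE.Navigate "http://www.website.com"   'placeholder URL
    IE.Visible = True

    'Wait for the page to finish loading
    Do While IE.Busy Or IE.ReadyState <> 4
        DoEvents
    Loop

    'Look through every DIV and print the ones whose class contains "sidemeta"
    Set divs = IE.document.getElementsByTagName("div")
    For i = 0 To divs.Length - 1
        If InStr(1, divs.Item(i).className, "sidemeta", vbTextCompare) > 0 Then
            Debug.Print divs.Item(i).innerText
        End If
    Next i

    IE.Quit
End Sub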
 

how about a different approach:
Code:
Sub Scrape()

    Dim doc As Object
    Dim x As Long

    Set doc = CreateObject("htmlfile")

    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", "http://www.lynda.com/Web-Interactive-CSS-tutorials/CSS-Fundamentals/80436-2.html?srchtrk=index%3A1%0Alinktypeid%3A2%0Aq%3Acss%20fundamentals%0Apage%3A1%0As%3Arelevance%0Asa%3Atrue%0Aproducttypeid%3A2"
        .send
        Do: DoEvents: Loop Until .readyState = 4
        doc.body.innerHTML = .responseText
        .abort
    End With

    With doc
        With .getElementsByTagName("dt")
            For x = 0 To .Length - 1
                Sheet1.Cells(x + 1, 1).Value = .Item(x).innerText
            Next x
        End With
        With .getElementsByTagName("dd")
            For x = 0 To .Length - 1
                Sheet1.Cells(x + 1, 2).Value = .Item(x).innerText
            Next x
        End With
    End With
End Sub
 
That code prints the HTML to the Immediate window, which has a limit on the amount of text it can hold.

There's a good chance you aren't seeing all of the HTML.
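
If you want to check the full HTML, a minimal sketch of one workaround is to write it to a text file instead of printing it to the Immediate window (the output path below is just an example):

Code:
Sub DumpHtml()
    Dim IE As Object
    Dim f As Integer

    Set IE = CreateObject("InternetExplorer.Application")
    IE.Navigate "http://www.website.com"   'placeholder URL
    IE.Visible = True

    'Wait for the page to finish loading
    Do While IE.Busy Or IE.ReadyState <> 4
        DoEvents
    Loop

    'Write the complete HTML to a file so nothing is cut off
    f = FreeFile
    Open "C:\Temp\page.html" For Output As #f   'example path - change as needed
    Print #f, IE.document.body.innerHTML
    Close #f

    IE.Quit
End Sub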
 
Many thanks.
I have tried it and an error popup window says:
"A runtime error has occurred. Do you wish to debug?"
"Line: 2"

Am I missing something here?
Regards


What line is highlighted when the code errors?

Do you have IE9 installed? It might be a browser thing
 
It seems to be something to do with my work PC; at home it was working.
To replicate that error, I only have to put a breakpoint on the .abort line.

I will try a loop and see if it works when scraping over 1,000 pages.

I am trying a similar thing but with a different output: instead of getting back standard HTML, a text file is returned. Using a similar technique, how do you get the data into columns?
This is the page:
www.google.com/finance/getprices?i=60&p=10d&f=d,o,h,l,c,v&df=cpct&q=AAPL


Many thanks again.
 
Hmm, that's odd.

I'm totally confused; that data from Google has nothing in common with the data you were looking for originally?
 
Hi,
Yes, it is a totally different set of data.
In the first example from Lynda the data was contained within a DIV; in the second example from Google it is just a text file with no HTML tags.
I just wonder how I could scrape the Google page and keep the results in Excel.
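
For what it's worth, here is a minimal sketch of one way the plain-text response could be pulled into columns, reusing the MSXML2.XMLHTTP request from the earlier code and assuming each line of the response is comma-separated (the header lines that getprices returns are simply written out as-is):

Code:
Sub ScrapeGooglePrices()
    Dim lines As Variant
    Dim fields As Variant
    Dim r As Long, c As Long

    'Fetch the plain-text response
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", "http://www.google.com/finance/getprices?i=60&p=10d&f=d,o,h,l,c,v&df=cpct&q=AAPL", False
        .send
        lines = Split(.responseText, vbLf)
    End With

    'Write each comma-separated line across a row
    For r = 0 To UBound(lines)
        fields = Split(lines(r), ",")
        For c = 0 To UBound(fields)
            Sheet1.Cells(r + 1, c + 1).Value = fields(c)
        Next c
    Next r
End Sub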
 