Web Extraction

nianchi111

Board Regular
Joined
Aug 24, 2007
Messages
184
Hi,

I have a macro which finds the link and opens and extract the data.I did this by getElementByID. The problem is in the website there is no ID it has only DIV Class. I'm unable to get the data.

Code:
 Sub Gooney_Goo_Goo()
Dim ie As Object
Dim i As Long
Set ie = CreateObject("InternetExplorer.Application")
With ie
.Visible = True 'Just for our test
.Navigate "http://www.ownersdirect.co.uk/search_results.asp?specific_location_vchar=tenerife&action.x=30&action.y=13&action=search&class=advanced&arrival_offset_int=-1"
Do While .readyState <> 4: DoEvents: Loop
Do While .busy: DoEvents: Loop

With .Document
For i = 0 To .Links.Length - 1
If InStrB(1, .Links(i).innerText, "Apartment in Playa De Las Americas, Tenerife") Then
ie.Navigate .Links(i).href
Do While ie.readyState <> 4: DoEvents: Loop
Do While ie.busy: DoEvents: Loop
Exit For
End If
Next i
End With

End With
If ie.Document.getElementById("content").innerText <> "" Then
  Dim lines As Variant, y As Integer
  lines = Split(ie.Document.getElementById("content").innerText, vbCrLf)
  For y = LBound(lines) To UBound(lines)
    ActiveSheet.Cells(1, y + 3) = lines(y)
  Next
End If
ie.Quit
End Sub

Data Required
Sleeps 2/4 (1 Bedroom)
£200
Property Ref C797
Dave Holcroft
Tel: (UK) + 44 (0)7939 002417
Website:http://www.ownersdirect.co.uk/canaries/C797.htm
 

Some videos you may like

Excel Facts

Which came first: VisiCalc or Lotus 1-2-3?
Dan Bricklin and Bob Frankston debuted VisiCalc in 1979 as a Visible Calculator. Lotus 1-2-3 debuted in the early 1980's, from Mitch Kapor.

Norie

Well-known Member
Joined
Apr 28, 2004
Messages
76,218
Office Version
  1. 365
Platform
  1. Windows
Take a look at getElementsByTagName and classname.

You can use the former to get all the DIVS and you can distinguish them with their classname.

By the way, there is an ID for the DIV that holds the list of properties.

Using that should help you narrow things down.
 

nianchi111

Board Regular
Joined
Aug 24, 2007
Messages
184
Hi,

Thank you so much for the reply.

I tried with Tagname and Classname both are not working.

Could you please let me know how to use the former to get all the DIVS.
 

Norie

Well-known Member
Joined
Apr 28, 2004
Messages
76,218
Office Version
  1. 365
Platform
  1. Windows

ADVERTISEMENT

What did you try?

Did you try the ID I mentioned?
 

nianchi111

Board Regular
Joined
Aug 24, 2007
Messages
184
Yes I tried with "outer_container" for the Name portion but for the Rates and Contact details I don't find the ID and tried with Tagname and classname it is not working.

Rich (BB code):
Sub Gooney_Goo_Goo()
Dim ie As Object
Dim i As Long
Set ie = CreateObject("InternetExplorer.Application")
With ie
.Visible = True 'Just for our test
.Navigate "http://www.ownersdirect.co.uk/search_results.asp?specific_location_vchar=tenerife&action.x=30&action.y=13&action=search&class=advanced&arrival_offset_int=-1"
Do While .readyState <> 4: DoEvents: Loop
Do While .busy: DoEvents: Loop

With .Document
For i = 0 To .Links.Length - 1
If InStrB(1, .Links(i).innerText, "Apartment in Playa De Las Americas, Tenerife") Then
ie.Navigate .Links(i).href
Do While ie.readyState <> 4: DoEvents: Loop
Do While ie.busy: DoEvents: Loop
Exit For
End If
Next i
End With
End With
If ie.Document.getElementById("outer_container").innerText <> "" Then
  Dim lines As Variant, y As Integer
  lines = Split(ie.Document.getElementById("outer_container").innerText, vbCrLf)
  For y = LBound(lines) To UBound(lines)
    ActiveSheet.Cells(1, y + 3) = lines(y)
  Next
End If
ie.Quit
End Sub
 

Norie

Well-known Member
Joined
Apr 28, 2004
Messages
76,218
Office Version
  1. 365
Platform
  1. Windows
That's not the ID I meant it's something like list of properties.

By the way, you should create references to the objects you are working with rather than using something like this.
Code:
lines = Split(ie.Document.getElementById("outer_container").innerText, vbCrLf

To do that start at the top by creating a reference to the document.
Code:
Set doc = ie.Document
Now when you can use doc whenever you need to refer to the document.

Then if you are working with an element/elements:
Code:
' Create reference to collection of all DIV elements in document
Set colAllDivs = doc.getElementsByTagName("DIV")

With a collection of elements, like DIVs above you can loop through them like this.
Code:
Set rng = Range("A1") ' just to illustrate
 
' loop through all DIVs and output to worksheet
For Each objDiv In colAllDivs
    rng.Value = objDiv.classname
    rng.Offset(,1).Value = objDiv.InnerText 
    Set rng = rng.Offset(1)
Next objDiv
 

Watch MrExcel Video

Forum statistics

Threads
1,122,719
Messages
5,597,736
Members
414,170
Latest member
Mdm

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Top