Excel VBA scraper - guidance

Anka

New Member
Joined
Oct 20, 2012
Messages
45
Office Version
  1. 2016
  2. 2010
Platform
  1. Windows
Hi all,

I think this is the most hilarious parser you've seen so far, but it's all I've been able to work out up to now.
Now I want to get the match date and time, but I can't...
If someone has the time to give me some directions, I thank them in advance.


And this is my code.
Code:
Sub ParseHTML2()
    ' Late-bound HTML document, so no library reference is required
    Dim htm As Object: Set htm = CreateObject("htmlfile")
    Dim ab As String, ag As String, ac As String, ad As String
    Dim ar As String, ae As String, af As String

    Sheets("Sheet2").Activate
    Range("A1:O1").ClearContents

    ' Download the match page synchronously and load it into the HTML document
    With CreateObject("msxml2.xmlhttp")
        .Open "GET", "http://www.betexplorer.com/soccer/netherlands/eerste-divisie-2014-2015/den-bosch-almere-city/I7kIHAQP/", False
        .send
        htm.body.innerHTML = .responseText
    End With

    'aa = htm.getElementsByTagName("a")(52).innerText   'ok
    'Sheets("Sheet2").Range("A1").Value = aa
    ab = htm.getElementsByTagName("div")(13).getElementsByTagName("li")(2).innerText ' ok
    Sheets("Sheet2").Range("A1").Value = ab

    ag = htm.getElementsByTagName("h1")(0).innerText  ' ok
    Sheets("Sheet2").Range("B1").Value = ag

    ac = htm.getElementsByTagName("h2")(0).innerText  ' ok
    Sheets("Sheet2").Range("D1").Value = ac
    ad = htm.getElementsByTagName("h2")(2).innerText  ' ok
    Sheets("Sheet2").Range("E1").Value = ad

    ar = htm.getElementById("match-date").innerText   ' ??? does not give the date and time
    Sheets("Sheet2").Range("C1").Value = ar

    ae = htm.getElementById("js-score").innerText     ' ok
    Sheets("Sheet2").Range("F1").Value = ae
    af = htm.getElementById("js-partial").innerText   ' ok
    Sheets("Sheet2").Range("G1").Value = af
End Sub
 


tonyyy

Well-known Member
Joined
Jun 24, 2015
Messages
1,784
Office Version
  1. 2010
Platform
  1. Windows
Anka,

You might try the following...

Code:
ar = htm.getElementById("match-date").getAttribute("data-dt")
Sheets("Sheet2").Range("C1").Value = ar
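The date and time appear to be stored in the element's data-dt attribute rather than in its visible text, which would explain why innerText does not return them. Here is a minimal sketch of reading an attribute this way. The sample markup, the sample value and the helper name ReadAttribute are made up for illustration and are not taken from the real page.

```vba
' Hypothetical helper: load an HTML fragment into a late-bound document
' and return one attribute of the element with the given id.
' Raises a run-time error if the element or the attribute is missing.
Function ReadAttribute(ByVal html As String, ByVal elementId As String, _
                       ByVal attrName As String) As String
    Dim htm As Object
    Set htm = CreateObject("htmlfile")
    htm.body.innerHTML = html
    ReadAttribute = htm.getElementById(elementId).getAttribute(attrName)
End Function

Sub DemoReadAttribute()
    ' Sample markup invented for illustration only
    Const SAMPLE As String = "<p id=""match-date"" data-dt=""8,5,2015,20,00""></p>"
    Debug.Print ReadAttribute(SAMPLE, "match-date", "data-dt")
End Sub
```

Run DemoReadAttribute and check the Immediate window (Ctrl+G) to see the raw attribute value.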
Cheers,

tonyyy
 

Anka

Tonyyy,

Thanks for your prompt response.
It's almost there.


Everything I've done so far has been to understand how this VBA works.
But I think I have another problem now.
In Sheet1, in the range D2:D & last row, I have several similar links.
What do I need to do to get the data from all the links in that range?
Thank you.
 

tonyyy

Code:
Sub ParseHTML2()
    ' Late-bound HTML document, so no library reference is required
    Dim htm As Object: Set htm = CreateObject("htmlfile")
    Dim ws As Worksheet: Set ws = Sheets("Sheet2")
    Dim ab As String, ag As String, ac As String, ad As String
    Dim ar As String, ae As String, af As String
    Dim arr As Variant
    Dim r As Range
    Dim i As Long

    i = 0
    ws.Range("A1").CurrentRegion.ClearContents

    ' Loop over every link in Sheet1, column D, from row 2 to the last used row
    With Sheets("Sheet1")
        For Each r In .Range("D2:D" & .Cells(.Rows.Count, 4).End(xlUp).Row)
            ' Download the page synchronously and load it into the HTML document
            With CreateObject("msxml2.xmlhttp")
                .Open "GET", r.Value, False
                .send
                htm.body.innerHTML = .responseText
            End With
            i = i + 1    ' next output row in Sheet2

            'aa = htm.getElementsByTagName("a")(52).innerText   'ok
            'ws.Range("A" & i).Value = aa
            ab = htm.getElementsByTagName("div")(13).getElementsByTagName("li")(2).innerText ' ok
            ws.Range("A" & i).Value = ab

            ag = htm.getElementsByTagName("h1")(0).innerText  ' ok
            ws.Range("B" & i).Value = ag

            ac = htm.getElementsByTagName("h2")(0).innerText  ' ok
            ws.Range("D" & i).Value = ac
            ad = htm.getElementsByTagName("h2")(2).innerText  ' ok
            ws.Range("E" & i).Value = ad

            ' data-dt holds comma-separated parts; split them and rebuild the text
            ar = htm.getElementById("match-date").getAttribute("data-dt")
            arr = Split(ar, ",")
            If CLng(arr(0)) < 10 Then arr(0) = "0" & arr(0)   ' pad the first part to two digits
            ws.Range("C" & i).Value = arr(0) & "." & arr(1) & "." & arr(2) & " - " & arr(3) & ":" & arr(4)

            ae = htm.getElementById("js-score").innerText     ' ok
            ws.Range("F" & i).Value = "'" & ae                ' leading apostrophe stores the score as text
            af = htm.getElementById("js-partial").innerText   ' ok
            ws.Range("G" & i).Value = af
        Next r
    End With
    ws.Columns.AutoFit
End Sub
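If you would rather have a real date in column C than a text string, the Split step could be wrapped in a small function. This is only a sketch: ParseMatchDate is a hypothetical name, and it assumes the data-dt value is always day,month,year,hour,minute in that order, which is what the Split code above implies but which I have not confirmed against the site.

```vba
' Hypothetical helper: turn a "day,month,year,hour,minute" string
' (the assumed layout of data-dt) into a VBA Date value.
Function ParseMatchDate(ByVal raw As String) As Date
    Dim parts() As String
    parts = Split(raw, ",")

    ' Fail loudly if the value does not have the five expected parts
    If UBound(parts) <> 4 Then
        Err.Raise Number:=vbObjectError + 513, _
                  Description:="Unexpected data-dt value: " & raw
    End If

    ParseMatchDate = DateSerial(CInt(parts(2)), CInt(parts(1)), CInt(parts(0))) _
                   + TimeSerial(CInt(parts(3)), CInt(parts(4)), 0)
End Function
```

Inside the loop you could then write Sheets("Sheet2").Range("C" & i).Value = ParseMatchDate(ar) and apply whatever date format you like to the column. That way the values can be sorted and filtered as dates.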
 

Anka

Yes, that's it! Nice work, Tonyyy!
Thank you very much.
 

tonyyy

You're welcome, Anka. Glad it worked out.
 