Scraping data from websites

SHARPY1

Board Regular
Joined
Oct 1, 2007
Messages
183
Hi,

A few years back somebody posted a macro to a betting website that enabled you to download results and fixtures data for any given day from a website called betexplorer.com
I have used this macro for a number of years and found it to be a great tool.
However the website has just gone through a big transformation and obviously something has changed, and the macro no longer works.
I'm guessing for a macro expert it will be a fairly easy fix, but it's beyond me.

If anybody thinks they may be able to help, please let me know.
It will be greatly appreciated

Many thanks
Richard
 
Who the devil wrote this code :oops: .

This is really a mess.

Does this whole code only scrape the soccer results from the provided URL or it does something else also ?
 
Upvote 0

Excel Facts

Highlight Duplicates
Home, Conditional Formatting, Highlight Cells, Duplicate records, OK to add pink formatting to any duplicates in selected range.
some betting expert on a betting forum somebody called Lunatism
it basically downloads betting odds etc too.
I'm only interested in the data tab though
to show as follows
DATE LEAGUE FIXTURE HOME SCORE AWAY SCORE
 
Upvote 0
yes, so for instance the 8th December

ArgentinaArgentina: Primera B Nacional
21:00Almagro - Atletico Parana 0:1 (0:0, 0:1)


AsiaAsia: AFF Suzuki Cup
13:00Thailand - Myanmar 4:0 (1:0, 3:0)


BoliviaBolivia: Liga de Futbol Prof
00:00Bolivar - Real Potosi 2:0 (0:0, 2:0)
01:30Oriente Petrolero - Sport Boys 2:1 (1:0, 1:1)
20:00Petrolero de Yacuiba - The Strongest 0:1 (0:0, 0:1)
23:00San Jose - Blooming 1:0 (0:0, 1:0)
 
Upvote 0
This was tricky. Took longer time than I expected.

Set Reference to Microsoft Internet Controls and Microsoft HTML object library from "VBE--->Tools-->>References" before running the code:


Code:
Sub Ombir_12Dec16()
Dim i           As Long
Dim j           As Long
Dim rw          As Long
Dim cl          As Long
Dim leagname    As String
Dim matchdate   As String
Dim output      As Variant
Dim Doc         As HTMLDocument
Dim ie          As InternetExplorer
Dim league      As HTMLTableSection
Dim leagues     As IHTMLElementCollection

Set ie = New InternetExplorer

With ie
    .Visible = True
    .Navigate "http://www.betexplorer.com/results/soccer/?year=2016&month=12&day=8"
    Do While .Busy Or .ReadyState <> 4: DoEvents: Loop
End With

Set Doc = ie.Document
Set leagues = Doc.getElementsByClassName("table-matches js-nrbanner-t")(0).getElementsByTagName("tbody")

ReDim output(1 To leagues.Length * 15, 1 To 9)

For Each league In leagues
    With league
        If .className <> "js-nrbanner-tbody h-display-none" Then
        
            leagname = Application.Clean(.Children(0).innerText)
            matchdate = Doc.getElementsByClassName("in-date-navigation__cal js-window-trigger")(0).innerText
            
             For rw = 1 To .Rows.Length - 1
             j = 0
                If .Rows(rw).className <> "js-newdate" Then
                    i = i + 1: j = j + 1
                    output(i, j) = leagname
                    output(i, j + 1) = matchdate
                    output(i, j + 2) = .Rows(rw).Cells(0).Children(0).innerText
                    output(i, j + 3) = .Rows(rw).Cells(0).Children(1).innerText
                    j = 5
                    For cl = 1 To .Rows(rw).Cells.Length - 1
                        output(i, j) = .Rows(rw).Cells(cl).innerText
                        j = j + 1
                    Next
                Else
                    matchdate = .Rows(rw).innerText
                End If
            Next
        End If
    End With
Next
ie.Quit
Range("A1:I1") = Array("League", "Date", "Fixture", "Team", "Col1", "Col2", "Col3", "Col4", "Col5")
Range("A1:I1").Interior.ThemeColor = xlThemeColorAccent1
Range("A2").Resize(UBound(output, 1), UBound(output, 2)) = output
With ActiveSheet.UsedRange
    .Columns.AutoFit
    .Borders.Weight = xlThin
    .WrapText = False
End With
End Sub

Regards,
Ombir
 
Upvote 0
That's brilliant, thanks so much for your time and effort, really is appreciated :)
Just a couple of tweaks if possible?
could the order be Date/League name/fixture/home team/away team/score/home score/away score
so example:
06.12.2016 / Albania: Super League / Flamurtari - KF Tirana / Flamurtari / KF Tirana / 1-0 / 1 / 0
score is currently displaying in a time format, so 1-0 is 01:00:00

If you are able to do these little tweaks, that would be brilliant.
Many many thanks
Richard
 
Upvote 0
Finally, I have added a run macro button to the spreadsheet.
Is there a way of adding a date box or something, so I can select a different date without keep going into the macro to change it?
 
Upvote 0
Try this. Modified as per your new requirements:

Code:
Option Explicit
Private Declare PtrSafe Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
Sub Ombir_12Dec16()
Dim i           As Long
Dim j           As Long
Dim rw          As Long
Dim cl          As Long
Dim url         As String
Dim mdate       As String
Dim leagname    As String
Dim matchdate   As String
Dim output()    As String
Dim Doc         As HTMLDocument
Dim ie          As InternetExplorer
Dim league      As HTMLTableSection
Dim leagues     As IHTMLElementCollection

Set ie = New InternetExplorer

mdate = InputBox("Enter Match date in dd/mm/yyyy format." & vbCrLf & vbCrLf & "For ex : 14/12/2016")

If Not IsDate(mdate) Then
    MsgBox "Incorrect Date"
    Exit Sub
End If

url = "http://www.betexplorer.com/results/soccer/?year=" & Split(mdate, "/")(2) & "&month=" & Split(mdate, "/")(1) & "&day=" & Split(mdate, "/")(0)

With ie
    .Visible = True
    .Navigate url
    Do While .Busy Or .ReadyState <> 4: DoEvents: Loop
End With

Set Doc = ie.Document
Set leagues = Doc.getElementsByClassName("table-matches js-nrbanner-t")(0).getElementsByTagName("tbody")

ReDim output(1 To leagues.Length * 15, 1 To 11)

For Each league In leagues
    With league
        If .className <> "js-nrbanner-tbody h-display-none" Then
        
            leagname = Application.Clean(.Children(0).innerText)
            matchdate = Doc.getElementsByClassName("in-date-navigation__cal js-window-trigger")(0).innerText
            
             For rw = 1 To .Rows.Length - 1
             j = 0
                If .Rows(rw).className <> "js-newdate" Then
                    i = i + 1: j = j + 1
                    output(i, j) = matchdate
                    output(i, j + 1) = leagname
                    output(i, j + 2) = .Rows(rw).Cells(0).Children(0).innerText
                    output(i, j + 3) = .Rows(rw).Cells(0).Children(1).innerText
                    output(i, j + 4) = Split(output(i, j + 3), "-")(0)
                    output(i, j + 5) = Split(output(i, j + 3), "-")(1)
                    j = 7
                    For cl = 1 To .Rows(rw).Cells.Length - 1
                        output(i, j) = .Rows(rw).Cells(cl).innerText
                        j = j + 1
                    Next
                Else
                    matchdate = .Rows(rw).innerText
                End If
            Next
        End If
    End With
Next
ie.Quit
Range("A1:K1") = Array("Date", "League", "Time", "Fixture", "Team1", "Team2", "Col1", "Col2", "Col3", "Col4", "Col5")
Range("A1:K1").Interior.ThemeColor = xlThemeColorAccent1
Range("A2").Resize(UBound(output, 1), UBound(output, 2)) = output
With ActiveSheet.UsedRange
    .Columns.AutoFit
    .Borders.Weight = xlThin
    .WrapText = False
End With
End Sub
 
Upvote 0
Ombir,
That's superb. thank you so much for your time, it truly is much appreciated.
Thanks again
Have a great week
Cheers
Richard
 
Upvote 0

Forum statistics

Threads
1,215,817
Messages
6,127,041
Members
449,356
Latest member
tstapleton67

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top