Results 1 to 7 of 7

Thread: Extracting data from website HTML using VBA

  1. #1
    New Member
    Join Date
    Oct 2016
    Posts
    36
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Extracting data from website HTML using VBA

    Hi all,

    I have literally no idea how to do this, but i have seen that it is possible. I am looking for a VBA script that extracts some data (in the

    From the following site i wish to extract two different pieces of data:

    https://www.betexplorer.com/soccer/s...eague-1/stats/

    I would like to extract Matches Played and Matches Remaining
    I would like to extract Home Goals and Away Goals (per match)

    In the HTML code i can see the data so i guess it must be possible to grab the data somehow.

    Could anyone help me?
    Last edited by mattadams84; Apr 26th, 2019 at 09:18 AM.

  2. #2
    New Member
    Join Date
    Oct 2016
    Posts
    36
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Extracting data from website HTML using VBA

    Here is the HTML code from the page that contains the data :

    HTML Code:
    <div class="box-overflow"><div class="box-overflow__in"><table class="table-main leaguestats"><tr><th>League Progress</th><th>Total</th><th>%</th></tr><tr><td>Matches played</td><td>198</td><td>100 %</td></tr><tr><td>Matches remaining</td><td>0</td><td>0 %</td></tr><tr><th>Outcomes</th><th>Total</th><th>%</th></tr><tr><td>Home team wins</td><td>78</td><td>39 %</td></tr><tr><td>Draws</td><td>55</td><td>28 %</td></tr><tr><td>Away team wins</td><td>65</td><td>33 %</td></tr><tr><th>Goals</th><th>Total</th><th>Per match</th></tr><tr><td>Goals scored</td><td>551</td><td>2.78</td></tr><tr><td>Home goals</td><td>287</td><td>1.45</td></tr><tr><td>Away goals</td><td>264</td><td>1.33</td></tr><tr><th>Over/Under 2.5 stats</th><th>Total</th><th>%</th></tr><tr><td>Over 2.5</td><td>108</td><td>55 %</td></tr><tr><td>Under 2.5</td><td>90</td><td>45 %</td></tr></table></div></div>
    Last edited by mattadams84; Apr 26th, 2019 at 09:50 AM.

  3. #3
    MrExcel MVP
    Join Date
    Mar 2004
    Location
    Canada
    Posts
    18,855
    Post Thanks / Like
    Mentioned
    35 Post(s)
    Tagged
    5 Thread(s)

    Default Re: Extracting data from website HTML using VBA

    The following macro will retrieve the stats from the specified URL, and place them in a newly created worksheet within the active workbook.

    Note that the code uses early binding, and that you will need to set a reference (Visual Basic Editor >> Tools >> References) to the following libraries...

    Code:
    1) Microsoft XML, v6.0
    2) Microsoft HTML Object Library
    Here's the code, which should be copied/pasted into a regular module (Visual Basic Editor >> Insert >> Module)...

    Code:
    Option Explicit
    
    Sub GetSoccerStats()
    
    
        'Set a reference (VBE > Tools > References) to the following libraries:
        '   1) Microsoft XML, v6.0
        '   2) Microsoft HTML Object Library
        
        Dim xmlReq As New MSXML2.XMLHTTP60
        Dim objDoc As New MSHTML.HTMLDocument
        Dim objTable As MSHTML.htmlTable
        Dim objTableRow As MSHTML.htmlTableRow
        Dim strURL As String
        Dim strResp As String
        Dim strText As String
        Dim rw As Long
        
        strURL = "https://www.betexplorer.com/soccer/south-korea/k-league-1/stats/"
        
        With xmlReq
            .Open "GET", strURL, False
            .send
            If .Status <> 200 Then
                MsgBox "Error " & .Status & ":  " & .statusText
                Exit Sub
            End If
            strResp = .responseText
        End With
        
        Worksheets.Add
        
        objDoc.body.innerHTML = strResp
        
        Set objTable = objDoc.getElementsByClassName("table-main leaguestats")(0)
        
        If Not objTable Is Nothing Then
            rw = 1
            For Each objTableRow In objTable.Rows
                strText = objTableRow.Cells(0).innerText
                Select Case strText
                    Case "Matches played", "Matches remaining", "Home goals", "Away goals"
                        Cells(rw, "a").Value = objTableRow.Cells(0).innerText
                        Cells(rw, "b").Value = objTableRow.Cells(1).innerText
                        Cells(rw, "c").Value = objTableRow.Cells(2).innerText
                        rw = rw + 1
                End Select
            Next objTableRow
            Columns("a").AutoFit
        End If
        
        Set xmlReq = Nothing
        Set objDoc = Nothing
        Set objTable = Nothing
        Set objTableRow = Nothing
    
    
    End Sub
    Hope this helps!

  4. #4
    New Member
    Join Date
    Oct 2016
    Posts
    36
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Extracting data from website HTML using VBA

    This works perfectly, thank you very much!

    I have a request: I actually want to run this macro for many different URLS... I have a worksheet already created that has a list of Football Leagues (in the rows), the columns then hold the data.

    You can find the file here : https://www.dropbox.com/s/77sol24sty...oals.xlsm?dl=0

    This is a file where i will add leagues as i go. Is it possible to adapt your code so that it can populate the columns in this sheet? I do not need to pull in the names of the data (matches remaining, home goals, away goals etc) i only need the figures. They would have to populate the columns as per the sheet. As you can see there are many leagues so it would need to loop through each row and then use the corresponding URL

    You will notice that there is a column that contains the word CURRENT. This is to indicate that it should use the URL in the Current URL column. If I change that column to LAST i would like it to use the URL in the Last URL column.

    For each league it will be different if use CURRENT or LAST.

    Any help is greatly appreciated. Many thanks again for your help!!!!

    Kind regards

  5. #5
    MrExcel MVP
    Join Date
    Mar 2004
    Location
    Canada
    Posts
    18,855
    Post Thanks / Like
    Mentioned
    35 Post(s)
    Tagged
    5 Thread(s)

    Default Re: Extracting data from website HTML using VBA

    You're very welcome, glad I could help. For your next question, through, please start a new thread and hopefully someone will be able to provide you with the help you need.

    Cheers!

  6. #6
    Board Regular
    Join Date
    Feb 2019
    Posts
    138
    Post Thanks / Like
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Extracting data from website HTML using VBA

    Hi Domenic,

    I have a very similar situation with the exception of my HTML(file:///C:/Users/HOME/Dropbox/DAILY_...S/DAY_END.HTML) has anywhere from 28-35 separate tables daily however the file name, location, table headers and structure are the same everyday. I just need to get it into a designated worksheet(TABLES) with a macro where my look ups in a separate worksheet can go to work and disseminate the information I am after. There are no specific rows or columns in the worksheet where it needs to go so I would say if it started at A1 that would be great. I will write a macro to delete all data on workbook open followed by an import of the data.

    Any help would be greatly appreciated.
    Thank You.

  7. #7
    New Member
    Join Date
    Aug 2019
    Posts
    1
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Extracting data from website HTML using VBA

    Hi,

    I have got basically the same issue, but just adapting the above code will not work. This is table I am trying to retrieve data from










    Thank you for the help

Some videos you may like

User Tag List

Tags for this Thread

Like this thread? Share it with others

Like this thread? Share it with others

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •