VBA Scraping from dynamic web page

Bruzio

Board Regular
Joined
Aug 20, 2020
Messages
85
Office Version
  1. 2016
Platform
  1. Windows
  2. Mobile
Hi everybody. I'm trying to download soccer results from Football results 21 august 2020, I can't as I'd like.
My question was posted on a italian forum, but we were unable to resolve.
I trust in your knowledge.


What do i need?

6 columns:

LEAGUE - START - HOME - AWAY - SCORE- STATUS

What are the problems?

1) Macro doesn't download all scheduled events.
2) The content of the rows isn't correct
2) The web page has a more results button that is always visible and clickable, and i need to load all hidden pages.

I expect this, see screen:

This is my current code:

VBA Code:
Sub fromweb()
    Dim IE As Object
    Dim Doc As Object
    Dim x As Object
    Dim i As Long
    Dim myArray As Variant
    Set IE = CreateObject("InternetExplorer.Application")
    Application.ScreenUpdating = False
    Cells.Clear
    Const myURL As String = "fctables.com/livescore/21_08_2020/"
    With IE
        .navigate myURL
        .Visible = True
        Do While .Busy: DoEvents: Loop
        Do While .readyState <> 4: DoEvents: Loop
        Application.Wait Now + TimeValue("0:00:10")
    End With
    myArray = Array("LEAGUE", "START", "HOME", "AWAY", "SCORE", "STATUS")
    With Range("A1:F1")
        .Value = myArray
    End With
    Set Doc = IE.Document
    For Each x In Doc.getElementsByClassName("league")
        i = i + 1: j = 0
        Cells(i + 1, 1) = Replace(Replace(x.innerText, "Table", ""), Chr(10), "")
        For Each y In Doc.getElementsByClassName("col-xs-8 col-sm-9 col-lg-10 truncate")
            Cells(i + 1, 1) = y.innerText
            Cells(i + 1, 2) = Doc.getElementsByClassName("date")(j).innerText
            Cells(i + 1, 3) = Doc.getElementsByClassName("home")(j).innerText
            Cells(i + 1, 4) = Doc.getElementsByClassName("away")(j).innerText
            Cells(i + 1, 5) = Doc.getElementsByClassName("score")(j).innerText
            Cells(i + 1, 6) = Doc.getElementsByClassName("status_name")(j).innerText
            j = j + 1: i = i + 1
        Next
    Next
    IE.Quit
    Set IE = Nothing
    Set Doc = Nothing
    Application.ScreenUpdating = True
End Sub

How to proceed correctly?
Thanks for the attention
 

Attachments

  • test.jpg
    test.jpg
    109.2 KB · Views: 32

smozgur

BatCoder
Joined
Feb 28, 2002
Messages
1,400
You know there is the main page and the odd pages that are opened for each match. That's where the odds data is retrieved from.

Unless the main page has no indication about the "there is no odd" situation for the matches, then the odd page will be opened anyway. So, unfortunately, the downloading time will not be different.

Hope it makes sense.
 

Excel Facts

Back into an answer in Excel
Use Data, What-If Analysis, Goal Seek to find the correct input cell value to reach a desired result

Bruzio

Board Regular
Joined
Aug 20, 2020
Messages
85
Office Version
  1. 2016
Platform
  1. Windows
  2. Mobile
Yes I understand, but If gameOdd.getElementsByClassName("name").Item(0).innerText = clsOddGroupItem.items.Item(k), if item is empty I would like to skip download.
 

Attachments

  • class.jpg
    class.jpg
    14.6 KB · Views: 3

smozgur

BatCoder
Joined
Feb 28, 2002
Messages
1,400
Yes I understand, but If gameOdd.getElementsByClassName("name").Item(0).innerText = clsOddGroupItem.items.Item(k), if item is empty I would like to skip download.
To clarify better: those elements are in the odds page, which means after the individual odd page is loaded. So, the download already happens to get there. It doesn't matter you make a further comparison or not. Only if there is an indication for "no odds" on the "main" page, then you can avoid downloading the odds page.
 

Watch MrExcel Video

Forum statistics

Threads
1,129,383
Messages
5,635,942
Members
416,889
Latest member
dhegs

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Top