Data Scraping

AndrewKent

Hi there folks,

I know this is possible, as I have done it with other websites (albeit my coding could be better). I am looking to scrape links from a webpage and place them in a list on a worksheet. So, for example, I would go to the following link:

http://msn.foxsports.com/foxsoccer/eredivisie/scores?week=1&timeframe=1

...and once there, I want to extract all the links shown as "Match Stats". An example of one looks like this:

http://msn.foxsports.com/foxsoccer/eredivisie/matchStats?gameId=2010080610370

I did this previously by extracting the page's source code and searching through it, but there has to be a smarter way to do this.

Kind regards,

Andy
 

Andy

Didn't I already post code for this?

I've definitely got it.

The only problem I can see with it is that all it does is get the link, nothing else.

Actually, when I check some of the extracted links, they seem to be for matches on or around 10th August last year.

eg Roda JC vs. Twente Enschede

Actually it seems to be the matches in the first link you posted.:eek:
 
Apologies, I couldn't remember if I had/hadn't posted this previously. I'll have a quick look...
 
Andy

Don't bother - I just looked at the code (I'd only run it before), and it's a bit of a mess.

Don't know what I was thinking - perhaps all those divs confused me.

I've partially redone it so it does actually use the Links collection as I've suggested.

It also now extracts the team names, and I think I can get the stats on that page when you click Expand.

That's just goals, cards and subs though - any use to you?

Anyway, here's what I've got now.

Still a bit of a mess but it works for me with that link.:)
Rich (BB code):
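Option Explicit

Sub GetStatsLinks()
    Dim IE As Object        ' late-bound InternetExplorer.Application
    Dim doc As Object       ' HTML document of the loaded page
    Dim lnk As Object       ' current link from the Links collection
    Dim rng As Range        ' first cell of the current output row
    Dim T As Long           ' team column offset: 0 = first team, 1 = second team
    Dim strTeam As String
    Dim pos As Long
    Dim slashPos As Long

    Set IE = CreateObject("InternetExplorer.Application")
    Set rng = Worksheets(2).Range("A1")

    With IE
        .Visible = True
        .navigate "http://msn.foxsports.com/foxsoccer/eredivisie/scores?week=1&timeframe=1"

        ' Wait until the page has finished loading (4 = READYSTATE_COMPLETE)
        Do While .Busy Or .ReadyState <> 4
            DoEvents
        Loop

        Set doc = .Document

        For Each lnk In doc.Links
            If lnk.className = "fsiLeagueDataLink" Then

                ' Team links: take the text after "teams" plus one separator,
                ' up to the next "/" (if there is one)
                pos = InStr(lnk.href, "teams")
                If pos > 0 Then
                    strTeam = Mid(lnk.href, pos + 6)
                    slashPos = InStr(strTeam, "/")
                    If slashPos > 0 Then strTeam = Left(strTeam, slashPos - 1)
                    rng.Offset(, T).Value = strTeam
                    T = T + 1
                    If T = 2 Then T = 0
                End If

                ' The Match Stats link goes in column C and ends the row
                If InStr(lnk.href, "matchStats") > 0 Then
                    rng.Offset(, 2).Value = lnk.href
                    Set rng = rng.Offset(1)
                End If

            End If
        Next lnk

        .Quit   ' close the browser window once the links are on the sheet
    End With

    Set IE = Nothing
End Sub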
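
If you want to check the team-name step on its own without opening a browser, something like the following might help. It is only a rough sketch: TeamFromHref and TestTeamFromHref are names I've made up for illustration, and the URLs in the test are invented examples, not real links from the scores page.

```vba
Option Explicit

' Hypothetical helper, not part of the macro above:
' returns the path segment that follows "teams/" in a link,
' or an empty string when the link is not a team link.
Function TeamFromHref(ByVal href As String) As String
    Dim pos As Long
    Dim slashPos As Long
    Dim rest As String

    pos = InStr(href, "teams/")
    If pos = 0 Then Exit Function

    ' Everything after "teams/"
    rest = Mid(href, pos + Len("teams/"))

    ' Cut at the next "/" if there is one
    slashPos = InStr(rest, "/")
    If slashPos > 0 Then rest = Left(rest, slashPos - 1)

    TeamFromHref = rest
End Function

Sub TestTeamFromHref()
    ' Made-up URLs, used only to exercise the string handling
    Debug.Print TeamFromHref("http://example.com/soccer/teams/ajax/roster")
    Debug.Print TeamFromHref("http://example.com/soccer/matchStats?gameId=1")
End Sub
```

Running TestTeamFromHref should print ajax for the first URL and an empty line for the second in the Immediate window. Keeping the string handling in its own function means it won't fall over if a team link happens to have no trailing "/".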
 