ExcelStudent00
New Member
- Joined
- Oct 13, 2016
- Messages
- 2
Hello everyone,
I've come here with no other options. Usually I'm able to figure out the solutions to the problems that I run into but this time I am truly stumped. If possible, please help me out.
I am trying to extract tables from the following url: Pittsburgh Steelers at New England Patriots - September 10th, 2015 | Pro-Football-Reference.com
I am able to successfully extract the Scoring, Game Info, Officials, Expected Points Summary, and Team Stats tables, but I am finding it impossible to access the "Passing, Rushing and Receiving" and "Defense" tables.
I can see that the design for these tables are different. They seem to be wrapped up or hidden, but I can't extract them via the following methods.
This is the code I'm using:
Certain tables I can grab by ID:
Certain tables I can grab by ClassName:
I have both references Microsoft HTML Object Library and Microsoft Internet Controls activated.
--------------
As a side-note, I first tried to access the tables via Query Tables. I did find success, but I didn't like how I had to do it. I couldn't access the table through a regular query, so I did it manually and recorded the macro.
Here's the code:
I didn't like how heavy this method was. It seems to be adding each individual query to the workbook, and then adding 3 worksheets for each table after doing a select all to extract each tables data. Is it possible to do this any cleaner or smoother?
I'd really appreciate the help, thanks.
I've come here with no other options. Usually I'm able to figure out the solutions to the problems that I run into but this time I am truly stumped. If possible, please help me out.
I am trying to extract tables from the following url: Pittsburgh Steelers at New England Patriots - September 10th, 2015 | Pro-Football-Reference.com
I am able to successfully extract the Scoring, Game Info, Officials, Expected Points Summary, and Team Stats tables, but I am finding it impossible to access the "Passing, Rushing and Receiving" and "Defense" tables.
I can see that the design for these tables are different. They seem to be wrapped up or hidden, but I can't extract them via the following methods.
This is the code I'm using:
Certain tables I can grab by ID:
Code:
[INDENT=2] Set elemCollection = Doc.getElementById(tableId)
For r = 0 To (elemCollection.Rows.Length - 1)
marker = 1
For C = 0 To (elemCollection.Rows(r).Cells.Length - 1)
ws2.Cells(r + 1 + LR, C + marker) = elemCollection.Rows(r).Cells(C).innerText
Next C
Next r
[/INDENT]
Certain tables I can grab by ClassName:
Code:
Set elemCollection = Doc.getElementsByClassName(tableId)
For Each objOne In elemCollection
For r = 0 To (objOne.Rows.Length - 1)
marker = 1
For C = 0 To (objOne.Rows(r).Cells.Length - 1)
ws2.Cells(r + 1 + LR, C + marker) = objOne.Rows(r).Cells(C).innerText
Next C
Next r
Next
I have both references Microsoft HTML Object Library and Microsoft Internet Controls activated.
--------------
As a side-note, I first tried to access the tables via Query Tables. I did find success, but I didn't like how I had to do it. I couldn't access the table through a regular query, so I did it manually and recorded the macro.
Here's the code:
Code:
Sub TestingWebQueryImport()
'
' TestingWebQueryImport Macro
'
'
Sheets.Add After:=ActiveSheet
ActiveWorkbook.Queries.Add name:="Defense Table", Formula:= _
"let" & Chr(13) & "" & Chr(10) & " Source = Web.Page(Web.Contents(""http://www.pro-football-reference.com/boxscores/201610060sfo.htm""))," & Chr(13) & "" & Chr(10) & " Data27 = Source{27}[Data]," & Chr(13) & "" & Chr(10) & " #""Changed Type"" = Table.TransformColumnTypes(Data27,{{""Player"", type text}, {""Tm"", type text}, {""Def Interceptions Int"", type text}, {""Def Interceptions Yds"", type text}, {""Def Interceptions TD"", type text}," & _
" {""Def Interceptions Lng"", type text}, {""Sacks & Tackles Sk"", type text}, {""Sacks & Tackles Tkl"", type text}, {""Sacks & Tackles Ast"", type text}, {""Fumbles FR"", type text}, {""Fumbles Yds"", type text}, {""Fumbles TD"", type text}, {""Fumbles FF"", type text}})" & Chr(13) & "" & Chr(10) & "in" & Chr(13) & "" & Chr(10) & " #""Changed Type"""
Sheets.Add After:=ActiveSheet
With ActiveSheet.ListObjects.Add(SourceType:=0, Source:= _
"OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=""Defense Table""" _
, destination:=Range("$A$1")).QueryTable
.CommandType = xlCmdSql
.CommandText = Array("SELECT * FROM [Defense Table]")
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.PreserveColumnInfo = True
.ListObject.DisplayName = "Defense_Table"
.Refresh BackgroundQuery:=False
End With
ActiveWorkbook.Queries.Add name:="Passing, Rushing, & Receiving Table", _
Formula:= _
"let" & Chr(13) & "" & Chr(10) & " Source = Web.Page(Web.Contents(""http://www.pro-football-reference.com/boxscores/201610060sfo.htm""))," & Chr(13) & "" & Chr(10) & " Data25 = Source{25}[Data]," & Chr(13) & "" & Chr(10) & " #""Changed Type"" = Table.TransformColumnTypes(Data25,{{""Player"", type text}, {""Tm"", type text}, {""Passing Cmp"", type text}, {""Passing Att"", type text}, {""Passing Yds"", type text}, {""Passing TD"", type text}," & _
" {""Passing Int"", type text}, {""Passing Sk"", type text}, {""Passing Yds2"", type text}, {""Passing Lng"", type text}, {""Passing Rate"", type text}, {""Rushing Att"", type text}, {""Rushing Yds"", type text}, {""Rushing TD"", type text}, {""Rushing Lng"", type text}, {""Receiving Tgt"", type text}, {""Receiving Rec"", type text}, {""Receiving Yds"", type text}, {" & _
"""Receiving TD"", type text}, {""Receiving Lng"", type text}, {""Fumbles Fmb"", type text}, {""Fumbles FL"", type text}})" & Chr(13) & "" & Chr(10) & "in" & Chr(13) & "" & Chr(10) & " #""Changed Type"""
Sheets.Add After:=ActiveSheet
With ActiveSheet.ListObjects.Add(SourceType:=0, Source:= _
"OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=""Passing, Rushing, & Receiving Table""" _
, destination:=Range("$A$1")).QueryTable
.CommandType = xlCmdSql
.CommandText = Array("SELECT * FROM [Passing, Rushing, & Receiving Table]")
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.PreserveColumnInfo = True
.ListObject.DisplayName = "Passing__Rushing____Receiving_Table"
.Refresh BackgroundQuery:=False
End With
ActiveWorkbook.Queries.Add name:="Scoring Table", Formula:= _
"let" & Chr(13) & "" & Chr(10) & " Source = Web.Page(Web.Contents(""http://www.pro-football-reference.com/boxscores/201610060sfo.htm""))," & Chr(13) & "" & Chr(10) & " Data16 = Source{16}[Data]," & Chr(13) & "" & Chr(10) & " #""Changed Type"" = Table.TransformColumnTypes(Data16,{{""Quarter"", Int64.Type}, {""Time"", type time}, {""Tm"", type text}, {""Detail"", type text}, {""ARI"", Int64.Type}, {""SFO"", Int64.Type}})" & Chr(13) & "" & Chr(10) & "in" & Chr(13) & "" & Chr(10) & " #""Changed Type"""
Sheets.Add After:=ActiveSheet
With ActiveSheet.ListObjects.Add(SourceType:=0, Source:= _
"OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=""Scoring Table""" _
, destination:=Range("$A$1")).QueryTable
.CommandType = xlCmdSql
.CommandText = Array("SELECT * FROM [Scoring Table]")
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.PreserveColumnInfo = True
.ListObject.DisplayName = "Scoring_Table"
.Refresh BackgroundQuery:=False
End With
ActiveWorkbook.Queries.Add name:="Team Stats Table", Formula:= _
"let" & Chr(13) & "" & Chr(10) & " Source = Web.Page(Web.Contents(""http://www.pro-football-reference.com/boxscores/201610060sfo.htm""))," & Chr(13) & "" & Chr(10) & " Data23 = Source{23}[Data]," & Chr(13) & "" & Chr(10) & " #""Changed Type"" = Table.TransformColumnTypes(Data23,{{"""", type text}, {""ARI"", type text}, {""SFO"", type text}})" & Chr(13) & "" & Chr(10) & "in" & Chr(13) & "" & Chr(10) & " #""Changed Type"""
Sheets.Add After:=ActiveSheet
With ActiveSheet.ListObjects.Add(SourceType:=0, Source:= _
"OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=""Team Stats Table""" _
, destination:=Range("$A$1")).QueryTable
.CommandType = xlCmdSql
.CommandText = Array("SELECT * FROM [Team Stats Table]")
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.PreserveColumnInfo = True
.ListObject.DisplayName = "Team_Stats_Table"
.Refresh BackgroundQuery:=False
End With
Selection.ListObject.QueryTable.Refresh BackgroundQuery:=False
Selection.ListObject.QueryTable.Refresh BackgroundQuery:=False
Selection.ListObject.QueryTable.Refresh BackgroundQuery:=False
Selection.ListObject.QueryTable.Refresh BackgroundQuery:=False
Range("I14").Select
End Sub
I didn't like how heavy this method was. It seems to be adding each individual query to the workbook, and then adding 3 worksheets for each table after doing a select all to extract each tables data. Is it possible to do this any cleaner or smoother?
I'd really appreciate the help, thanks.