Fast ways to grab table from HTML using VBA ?

JumboCactuar

Well-known Member
Joined
Nov 16, 2016
Messages
727
Office Version
  1. 365
Platform
  1. Windows
Hi,

i found this vba script which works super fast, however it only works if there is only 1 table on the website.

Code:
Sub test()
    Dim oDom As Object: Set oDom = CreateObject("htmlFile")
    Dim x As Long, y As Long
    Dim oRow As Object, oCell As Object
    Dim data
    
    y = 1: x = 1
    
    With CreateObject("msxml2.xmlhttp")
        .Open "GET", "https://ffrk.kongbakpao.com/character-usable-abilities/", False
        .Send
        oDom.body.innerHtml = .responseText
    End With
    
    With oDom.getelementsbytagname("table")(0)
        ReDim data(1 To .Rows.Length, 1 To .Rows(1).Cells.Length)
        For Each oRow In .Rows
            For Each oCell In oRow.Cells
                data(x, y) = oCell.innerText
                y = y + 1
            Next oCell
            y = 1
            x = x + 1
        Next oRow
    End With
    
    Sheets(1).Cells(1, 1).Resize(UBound(data), UBound(data, 2)).Value = data
End Sub

Now im looking for something similar which works for specific tables, i.e when you "get data from web" and it displays table 0, table 1 etc....

Does anyone know of a way? get data from web is pretty slow as it works differently to the above i think

Appreciate any help
 

Excel Facts

Add Bullets to Range
Select range. Press Ctrl+1. On Number tab, choose Custom. Type Alt+7 then space then @ sign (using 7 on numeric keypad)

JumboCactuar

Well-known Member
Joined
Nov 16, 2016
Messages
727
Office Version
  1. 365
Platform
  1. Windows
an example would be
Code:
https://en.wikipedia.org/wiki/List_of_FIFA_World_Cup_winners

contains multiple tables but i only want to pull data (values only) from table [Nations that won the World Cup] and the above vba doesnt work for this
 
Last edited:

Domenic

MrExcel MVP
Joined
Mar 10, 2004
Messages
19,500
I've modified your code so that it loops through each table, and checks for the specified caption, if one exists...

Code:
Sub test()
    Dim oDom As Object: Set oDom = CreateObject("htmlFile")
    Dim oTable As Object
    Dim x As Long, y As Long
    Dim oRow As Object, oCell As Object
    Dim data
    
    y = 1: x = 1
    
    With CreateObject("msxml2.xmlhttp")
        .Open "GET", "https://en.wikipedia.org/wiki/List_of_FIFA_World_Cup_winners", False
        .send
        oDom.body.innerHTML = .responseText
    End With
    
    For Each oTable In oDom.getElementsByTagName("table")
        With oTable.getElementsByTagName("caption")
            If .Length > 0 Then
                If RTrim(.Item(0).innerText) = "Nations that won the World Cup" Then
                    Exit For
                End If
            End If
        End With
    Next oTable
    
    If oTable Is Nothing Then
        MsgBox "Table not found!", vbExclamation
    Else
        With oTable
            ReDim data(1 To .Rows.Length, 1 To .Rows(1).Cells.Length)
            For Each oRow In .Rows
                For Each oCell In oRow.Cells
                    data(x, y) = oCell.innerText
                    y = y + 1
                Next oCell
                y = 1
                x = x + 1
            Next oRow
        End With
        Sheets(1).Cells(1, 1).Resize(UBound(data), UBound(data, 2)).Value = data
    End If
    
    Set oDom = Nothing
    Set oTable = Nothing
    Set oRow = Nothing
    Set oCell = Nothing
End Sub

Hope this helps!
 

JumboCactuar

Well-known Member
Joined
Nov 16, 2016
Messages
727
Office Version
  1. 365
Platform
  1. Windows

ADVERTISEMENT

I've modified your code so that it loops through each table, and checks for the specified caption, if one exists...

Code:
Sub test()
    Dim oDom As Object: Set oDom = CreateObject("htmlFile")
    Dim oTable As Object
    Dim x As Long, y As Long
    Dim oRow As Object, oCell As Object
    Dim data
    
    y = 1: x = 1
    
    With CreateObject("msxml2.xmlhttp")
        .Open "GET", "https://en.wikipedia.org/wiki/List_of_FIFA_World_Cup_winners", False
        .send
        oDom.body.innerHTML = .responseText
    End With
    
    For Each oTable In oDom.getElementsByTagName("table")
        With oTable.getElementsByTagName("caption")
            If .Length > 0 Then
                If RTrim(.Item(0).innerText) = "Nations that won the World Cup" Then
                    Exit For
                End If
            End If
        End With
    Next oTable
    
    If oTable Is Nothing Then
        MsgBox "Table not found!", vbExclamation
    Else
        With oTable
            ReDim data(1 To .Rows.Length, 1 To .Rows(1).Cells.Length)
            For Each oRow In .Rows
                For Each oCell In oRow.Cells
                    data(x, y) = oCell.innerText
                    y = y + 1
                Next oCell
                y = 1
                x = x + 1
            Next oRow
        End With
        Sheets(1).Cells(1, 1).Resize(UBound(data), UBound(data, 2)).Value = data
    End If
    
    Set oDom = Nothing
    Set oTable = Nothing
    Set oRow = Nothing
    Set oCell = Nothing
End Sub

Hope this helps!

Hi,
With this VBA the text from the table columns merge

Is there a way to retain original table formatting ?
 

JumboCactuar

Well-known Member
Joined
Nov 16, 2016
Messages
727
Office Version
  1. 365
Platform
  1. Windows

ADVERTISEMENT

I've modified your code so that it loops through each table, and checks for the specified caption, if one exists...

Code:
Sub test()
    Dim oDom As Object: Set oDom = CreateObject("htmlFile")
    Dim oTable As Object
    Dim x As Long, y As Long
    Dim oRow As Object, oCell As Object
    Dim data
    
    y = 1: x = 1
    
    With CreateObject("msxml2.xmlhttp")
        .Open "GET", "https://en.wikipedia.org/wiki/List_of_FIFA_World_Cup_winners", False
        .send
        oDom.body.innerHTML = .responseText
    End With
    
    For Each oTable In oDom.getElementsByTagName("table")
        With oTable.getElementsByTagName("caption")
            If .Length > 0 Then
                If RTrim(.Item(0).innerText) = "Nations that won the World Cup" Then
                    Exit For
                End If
            End If
        End With
    Next oTable
    
    If oTable Is Nothing Then
        MsgBox "Table not found!", vbExclamation
    Else
        With oTable
            ReDim data(1 To .Rows.Length, 1 To .Rows(1).Cells.Length)
            For Each oRow In .Rows
                For Each oCell In oRow.Cells
                    data(x, y) = oCell.innerText
                    y = y + 1
                Next oCell
                y = 1
                x = x + 1
            Next oRow
        End With
        Sheets(1).Cells(1, 1).Resize(UBound(data), UBound(data, 2)).Value = data
    End If
    
    Set oDom = Nothing
    Set oTable = Nothing
    Set oRow = Nothing
    Set oCell = Nothing
End Sub

Hope this helps!

I think you'll need to format the table your self as desired...

Ye thats the thing. Cells from the table get merged together so it's impossible to read or format
 

Domenic

MrExcel MVP
Joined
Mar 10, 2004
Messages
19,500
The code shouldn't format your worksheet at all. It simply copies the values to your worksheet. Maybe your sheet is already formatted?
 

JumboCactuar

Well-known Member
Joined
Nov 16, 2016
Messages
727
Office Version
  1. 365
Platform
  1. Windows
I've modified your code so that it loops through each table, and checks for the specified caption, if one exists...

Code:
Sub test()
    Dim oDom As Object: Set oDom = CreateObject("htmlFile")
    Dim oTable As Object
    Dim x As Long, y As Long
    Dim oRow As Object, oCell As Object
    Dim data
    
    y = 1: x = 1
    
    With CreateObject("msxml2.xmlhttp")
        .Open "GET", "https://en.wikipedia.org/wiki/List_of_FIFA_World_Cup_winners", False
        .send
        oDom.body.innerHTML = .responseText
    End With
    
    For Each oTable In oDom.getElementsByTagName("table")
        With oTable.getElementsByTagName("caption")
            If .Length > 0 Then
                If RTrim(.Item(0).innerText) = "Nations that won the World Cup" Then
                    Exit For
                End If
            End If
        End With
    Next oTable
    
    If oTable Is Nothing Then
        MsgBox "Table not found!", vbExclamation
    Else
        With oTable
            ReDim data(1 To .Rows.Length, 1 To .Rows(1).Cells.Length)
            For Each oRow In .Rows
                For Each oCell In oRow.Cells
                    data(x, y) = oCell.innerText
                    y = y + 1
                Next oCell
                y = 1
                x = x + 1
            Next oRow
        End With
        Sheets(1).Cells(1, 1).Resize(UBound(data), UBound(data, 2)).Value = data
    End If
    
    Set oDom = Nothing
    Set oTable = Nothing
    Set oRow = Nothing
    Set oCell = Nothing
End Sub

Hope this helps!

The code shouldn't format your worksheet at all. It simply copies the values to your worksheet. Maybe your sheet is already formatted?

Values is all I want but for example the html table has in column a "Tomato" and column B "Green"

It's putting all this in one cell "TomatoGreen"
 

Norie

Well-known Member
Joined
Apr 28, 2004
Messages
76,280
Office Version
  1. 365
Platform
  1. Windows
You'll need to look at the HTML behind the page to see how the table is structured and then change the code so that it handles that rather than only grabbing the innerText of each cell in the table.

PS Why aren't you using Get data from web... or perhaps PowerQuery (Get & Transform Data)?
 

Watch MrExcel Video

Forum statistics

Threads
1,129,569
Messages
5,637,109
Members
416,959
Latest member
Mohzein

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Top