Fast ways to grab table from HTML using VBA ?

JumboCactuar

Well-known Member
Joined
Nov 16, 2016
Messages
785
Office Version
  1. 365
Platform
  1. Windows
Hi,

i found this vba script which works super fast, however it only works if there is only 1 table on the website.

Code:
Sub test()
    Dim oDom As Object: Set oDom = CreateObject("htmlFile")
    Dim x As Long, y As Long
    Dim oRow As Object, oCell As Object
    Dim data
    
    y = 1: x = 1
    
    With CreateObject("msxml2.xmlhttp")
        .Open "GET", "https://ffrk.kongbakpao.com/character-usable-abilities/", False
        .Send
        oDom.body.innerHtml = .responseText
    End With
    
    With oDom.getelementsbytagname("table")(0)
        ReDim data(1 To .Rows.Length, 1 To .Rows(1).Cells.Length)
        For Each oRow In .Rows
            For Each oCell In oRow.Cells
                data(x, y) = oCell.innerText
                y = y + 1
            Next oCell
            y = 1
            x = x + 1
        Next oRow
    End With
    
    Sheets(1).Cells(1, 1).Resize(UBound(data), UBound(data, 2)).Value = data
End Sub

Now im looking for something similar which works for specific tables, i.e when you "get data from web" and it displays table 0, table 1 etc....

Does anyone know of a way? get data from web is pretty slow as it works differently to the above i think

Appreciate any help
 

Excel Facts

Who is Mr Spreadsheet?
Author John Walkenbach was Mr Spreadsheet until his retirement in June 2019.
an example would be
Code:
https://en.wikipedia.org/wiki/List_of_FIFA_World_Cup_winners

contains multiple tables but i only want to pull data (values only) from table [Nations that won the World Cup] and the above vba doesnt work for this
 
Last edited:
Upvote 0
I've modified your code so that it loops through each table, and checks for the specified caption, if one exists...

Code:
Sub test()
    Dim oDom As Object: Set oDom = CreateObject("htmlFile")
    Dim oTable As Object
    Dim x As Long, y As Long
    Dim oRow As Object, oCell As Object
    Dim data
    
    y = 1: x = 1
    
    With CreateObject("msxml2.xmlhttp")
        .Open "GET", "https://en.wikipedia.org/wiki/List_of_FIFA_World_Cup_winners", False
        .send
        oDom.body.innerHTML = .responseText
    End With
    
    For Each oTable In oDom.getElementsByTagName("table")
        With oTable.getElementsByTagName("caption")
            If .Length > 0 Then
                If RTrim(.Item(0).innerText) = "Nations that won the World Cup" Then
                    Exit For
                End If
            End If
        End With
    Next oTable
    
    If oTable Is Nothing Then
        MsgBox "Table not found!", vbExclamation
    Else
        With oTable
            ReDim data(1 To .Rows.Length, 1 To .Rows(1).Cells.Length)
            For Each oRow In .Rows
                For Each oCell In oRow.Cells
                    data(x, y) = oCell.innerText
                    y = y + 1
                Next oCell
                y = 1
                x = x + 1
            Next oRow
        End With
        Sheets(1).Cells(1, 1).Resize(UBound(data), UBound(data, 2)).Value = data
    End If
    
    Set oDom = Nothing
    Set oTable = Nothing
    Set oRow = Nothing
    Set oCell = Nothing
End Sub

Hope this helps!
 
Upvote 0
I've modified your code so that it loops through each table, and checks for the specified caption, if one exists...

Code:
Sub test()
    Dim oDom As Object: Set oDom = CreateObject("htmlFile")
    Dim oTable As Object
    Dim x As Long, y As Long
    Dim oRow As Object, oCell As Object
    Dim data
    
    y = 1: x = 1
    
    With CreateObject("msxml2.xmlhttp")
        .Open "GET", "https://en.wikipedia.org/wiki/List_of_FIFA_World_Cup_winners", False
        .send
        oDom.body.innerHTML = .responseText
    End With
    
    For Each oTable In oDom.getElementsByTagName("table")
        With oTable.getElementsByTagName("caption")
            If .Length > 0 Then
                If RTrim(.Item(0).innerText) = "Nations that won the World Cup" Then
                    Exit For
                End If
            End If
        End With
    Next oTable
    
    If oTable Is Nothing Then
        MsgBox "Table not found!", vbExclamation
    Else
        With oTable
            ReDim data(1 To .Rows.Length, 1 To .Rows(1).Cells.Length)
            For Each oRow In .Rows
                For Each oCell In oRow.Cells
                    data(x, y) = oCell.innerText
                    y = y + 1
                Next oCell
                y = 1
                x = x + 1
            Next oRow
        End With
        Sheets(1).Cells(1, 1).Resize(UBound(data), UBound(data, 2)).Value = data
    End If
    
    Set oDom = Nothing
    Set oTable = Nothing
    Set oRow = Nothing
    Set oCell = Nothing
End Sub

Hope this helps!

Hi,
With this VBA the text from the table columns merge

Is there a way to retain original table formatting ?
 
Upvote 0
I think you'll need to format the table your self as desired...
 
Upvote 0
I've modified your code so that it loops through each table, and checks for the specified caption, if one exists...

Code:
Sub test()
    Dim oDom As Object: Set oDom = CreateObject("htmlFile")
    Dim oTable As Object
    Dim x As Long, y As Long
    Dim oRow As Object, oCell As Object
    Dim data
    
    y = 1: x = 1
    
    With CreateObject("msxml2.xmlhttp")
        .Open "GET", "https://en.wikipedia.org/wiki/List_of_FIFA_World_Cup_winners", False
        .send
        oDom.body.innerHTML = .responseText
    End With
    
    For Each oTable In oDom.getElementsByTagName("table")
        With oTable.getElementsByTagName("caption")
            If .Length > 0 Then
                If RTrim(.Item(0).innerText) = "Nations that won the World Cup" Then
                    Exit For
                End If
            End If
        End With
    Next oTable
    
    If oTable Is Nothing Then
        MsgBox "Table not found!", vbExclamation
    Else
        With oTable
            ReDim data(1 To .Rows.Length, 1 To .Rows(1).Cells.Length)
            For Each oRow In .Rows
                For Each oCell In oRow.Cells
                    data(x, y) = oCell.innerText
                    y = y + 1
                Next oCell
                y = 1
                x = x + 1
            Next oRow
        End With
        Sheets(1).Cells(1, 1).Resize(UBound(data), UBound(data, 2)).Value = data
    End If
    
    Set oDom = Nothing
    Set oTable = Nothing
    Set oRow = Nothing
    Set oCell = Nothing
End Sub

Hope this helps!

I think you'll need to format the table your self as desired...

Ye thats the thing. Cells from the table get merged together so it's impossible to read or format
 
Upvote 0
The code shouldn't format your worksheet at all. It simply copies the values to your worksheet. Maybe your sheet is already formatted?
 
Upvote 0
I've modified your code so that it loops through each table, and checks for the specified caption, if one exists...

Code:
Sub test()
    Dim oDom As Object: Set oDom = CreateObject("htmlFile")
    Dim oTable As Object
    Dim x As Long, y As Long
    Dim oRow As Object, oCell As Object
    Dim data
    
    y = 1: x = 1
    
    With CreateObject("msxml2.xmlhttp")
        .Open "GET", "https://en.wikipedia.org/wiki/List_of_FIFA_World_Cup_winners", False
        .send
        oDom.body.innerHTML = .responseText
    End With
    
    For Each oTable In oDom.getElementsByTagName("table")
        With oTable.getElementsByTagName("caption")
            If .Length > 0 Then
                If RTrim(.Item(0).innerText) = "Nations that won the World Cup" Then
                    Exit For
                End If
            End If
        End With
    Next oTable
    
    If oTable Is Nothing Then
        MsgBox "Table not found!", vbExclamation
    Else
        With oTable
            ReDim data(1 To .Rows.Length, 1 To .Rows(1).Cells.Length)
            For Each oRow In .Rows
                For Each oCell In oRow.Cells
                    data(x, y) = oCell.innerText
                    y = y + 1
                Next oCell
                y = 1
                x = x + 1
            Next oRow
        End With
        Sheets(1).Cells(1, 1).Resize(UBound(data), UBound(data, 2)).Value = data
    End If
    
    Set oDom = Nothing
    Set oTable = Nothing
    Set oRow = Nothing
    Set oCell = Nothing
End Sub

Hope this helps!

The code shouldn't format your worksheet at all. It simply copies the values to your worksheet. Maybe your sheet is already formatted?

Values is all I want but for example the html table has in column a "Tomato" and column B "Green"

It's putting all this in one cell "TomatoGreen"
 
Upvote 0
You'll need to look at the HTML behind the page to see how the table is structured and then change the code so that it handles that rather than only grabbing the innerText of each cell in the table.

PS Why aren't you using Get data from web... or perhaps PowerQuery (Get & Transform Data)?
 
Upvote 0

Forum statistics

Threads
1,214,522
Messages
6,120,022
Members
448,939
Latest member
Leon Leenders

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top