Issue with scraping with VBA

Vbanoob98

Board Regular
Joined
Sep 13, 2019
Messages
128
Hello,

I am trying to get this text = onclick="window.open('DisplayCertificateOfAnalysis.aspx?docId=10542339')

HTML:
<!DOCTYPE html>

<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>

</title></head>
<body>
    <form method="post" action="xxxxxxxxxxxxxxxxxxxxxxx" id="form1">
<div class="aspNetHidden">
<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="v4NCd7O24jB3LLlPXWQuVZtDg5Ws60x2gVCnDaOMmjmp6H6gS2lZfjFFCey39c6ca8V1C4z8+mgOFo70bZIy7Vn97qYwbrHA2WZm5WpGYlp/Vjk00+y/HMBCjhPceYJWVyuwrHDU6Bz7PGELafSRIY8ZljWBzyOQJm9hrNvQJH00xMV6y/QN0EXsLDUzJ6q5Wqw/ugvOPaIx4RdC+fI7R1dw0u+bfhSC2yV+He6CAS01wVJ2CiZTqxdi1NDmAeRJ3ox+vxbqW6JLfnYPx/1+zaqX0w5HwBPamy15Z1oKBQcSQBSlO3Go9VL8k3IoEay0" />
</div>

<div class="aspNetHidden">

    <input type="hidden" name="__VIEWSTATEGENERATOR" id="__VIEWSTATEGENERATOR" value="17302BD3" />
    <input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="aXBIwKjzvwDcNrY3G3hMpeLlZSIfBo1c25obC6W/VmTJpCItQluPePQfshtTOFXc39zGG4P0T9iYljffBG0ADW++1JR43761sxDXpYjk8t/S3bHDp7wlx9sTBEhAFjIs" />
</div>
    <div>

        <table style="width:100%">
            <tr>
                <td style="width:25%">                </td>
                <td style="width:50%"> <B>Product:</B> xxxxxxxxxxxxxxxxxx (xxxxxxxxxxxxxx)&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<B>Batch:</B> 0152             </td>
                <td style="width:25%">                </td>
            </tr>
            <tr>
                <td style="width:25%">                </td>
                <td>
                    <table id="tblFiles" align="Center" style="border-width:1px;border-style:Solid;width:100%;">
    <tr style="background-color:#ADC6CD;border-color:White;">
        <th>File Name</th><th>File Type</th><th>File Size</th><th></th>
    </tr><tr align="center">
        <td>xxxxxxxxxx.pdf</td><td>PDF</td><td>IDnumber</td><td><input type="image" name="ctl02" src="images/PDF.png" onclick="window.open(&#39;DisplayCertificateOfAnalysis.aspx?docId=10542339&#39;); return false;" /></td>
    </tr>
</table>
                </td>
                <td style="width:25%">                </td>
            </tr>
        </table>
    </div>
    </form>
</body>
</html>

I have the following code that extracts the table from the website, but it only gets the visible table values and not the underlying data in the web source

VBA Code:
Sub get_table_web()
Dim ig As Object
Dim urlc As String
urlc = "xxxxxxxxx"
Set ig = GetObject("new:{D5E8041D-920F-45e9-B8FB-B1DEB82C6E5E}")
ig.Visible = True
ig.navigate urlc
Do While ig.Busy: DoEvents: Loop
Application.Wait (Now + TimeValue("0:00:05"))


Dim tb As HTMLTable
Set tb = ig.document.getElementById("tblFiles")
Dim rowcounter As Integer
Dim columncounter As Integer
rowcounter = 4
columncounter = 2
Dim tro As HTMLTableRow
Dim tdc As HTMLTableCell
Dim thu
Dim mys As Worksheet
Set mys = ThisWorkbook.Sheets("Financial_Futures")
For Each tro In tb.getElementsByTagName("tr")
For Each thu In tro.getElementsByTagName("th")
mys.Cells(rowcounter, columncounter).Value = thu.innerText
columncounter = columncounter + 1
Next thu
For Each tdc In tro.getElementsByTagName("td")
mys.Cells(rowcounter, columncounter).Value = tdc.innerText
columncounter = columncounter + 1
Next tdc
columncounter = 1
rowcounter = rowcounter + 1
Next tro
End Sub
 

Excel Facts

Why are there 1,048,576 rows in Excel?
The Excel team increased the size of the grid in 2007. There are 2^20 rows and 2^14 columns for a total of 17 billion cells.
Get the input element within the td and then its onclick attribute, something like this:
VBA Code:
Dim inputElement As HTMLInputElement

For Each tdc In tro.getElementsByTagName("td")
    Set inputElement = tdc.getElementsByTagName("input")(0)
    If Not inputElement Is Nothing Then
        Debug.Print inputElement.getAttribute("onclick")
    End If
    mys.Cells(rowcounter, columncounter).Value = tdc.innerText
    columncounter = columncounter + 1
Next tdc
 
Upvote 0

Forum statistics

Threads
1,215,039
Messages
6,122,799
Members
449,095
Latest member
m_smith_solihull

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top