VBA- Nested For Each Loop to Scrape HTML Data Table- If Error move to next cell

Kc7487

New Member
Joined
Mar 19, 2016
Messages
20
Hey Mr. Excel peeps!

I have been working for quite a while on a way to scrape data from an HTML table. I have been coming along further and further, but am stuck again.

Thanks to John_w, (Thanks John_w!), I have edited my code with the following code below. I was running across an error after a few rows, and I'm not entirely certain why. In any case, I wanted to experiment with simply skipping the cell causing the error, and then moving on to the next cell in the row, or next row if it's on the last cell of the row. So to do so, I added the on Error, Resume, and GoTo parts.....

Code:
ScrapingData:

 'On Error GoTo ErrorMessageBox
 
 
 Dim table As Object, tRow As Object, tCell As Object
    Dim rowIndex
    Set table = IE.document.getElementsByTagName("TABLE")(3)   '0 = 1st table, 1 = 2nd table, etc.
    

    For Each tRow In table.Rows
        
        For Each tCell In tRow.Cells
        
        'on error resume next might work here as well?
            On Error GoTo SKIPCELL
            
            Worksheets("Auto SSD Here").Range("A1").Offset(tRow.rowIndex, tCell.cellIndex).Value = tCell.innerText 'this line is erroring after doing like 3 rows
            
            GoTo NOERRORNEXT
            
SKIPCELL:
 Resume SKIPCELL2
SKIPCELL2:

NOERRORNEXT:
              
        Next tCell
    
    Next tRow


As this code stands, it is not working as expected. It usually works for a few rows out of the 500 or so, then freezes excel, and seems to eventually result in endless looping between "Next tCell" and "Resume SKIPCELL2" when I try stepping through.

Any idea what I might be doing wrong?
 

Excel Facts

Can a formula spear through sheets?
Use =SUM(January:December!E7) to sum E7 on all of the sheets from January through December
It would help with more context... can you tell us where your scraping data from?

Also your labels don't make an sense to me. Can you explain their purpose?
 
Upvote 0
Hi mrmmickle1 thanks for responding,

I would just link you to the site of course, but it is proprietary. It is an HTML data table with the TR tag for each row and a TD tag for each column in that row. The tricky thing is I cannot actually find an "inner text" tag? But the code I am using is definitely working at first, just erroring out after a few rows (out of the total 500 or so). I will post my entire VBA code below to give some context if it helps, (it's kind of long and some extraneous parts, sorry) but the part giving trouble is the part I posted at first. I have since edited that part a little more trying to get the code to work by skipping only the cells that cause an error and then moving on to the next cell in the row (or next row if the error cell is last). I don't know WHY it throws an error on these cells, but I was thinking maybe it is extra data or something not part of the table.....

In terms of labels, I am just trying to use them for the IFERROR GOTO stuff. And then I read you must use a RESUME to end the first error handler which then requires another label to goto....

To give a quick rundown of what I want it to do: To get to the table you must sign in (this is why I have to use data scraping and cannot simply use a web query, if I use a web query you have to sign in manually through the web query window, because VBA cannot pass the username and password to the form controls INSIDE OF the web query window.... I CAN make it sign in automatically in a separate window, but then the web query window is still not recognized as signed in). SO, the code first signs in, and then if there is an error, it passes on to the next part of scraping the data (because if it is signed in already, then the table appears instead of the sign in window). Once scrapping the data, I want the code to cycle through each row of the table, and cycle through each column in that row, and pass all of the values to excel (basically I just want to copy and paste exactly the same data into excel as is in the table, in the same positions, same cells, etc...)

By the way I also tried actually doing a copy paste thing using send keys but then a windows security box pops up like 39 times and requires clicking the box manually a bunch of times because VBA cannot control the box. I couldn't seem to disable this windows security box either.

You will see all of this nonsense commented away with my personal notes on why it's not working that way.....

Anyway here is the code...

Code:
Private Sub CommandButton1_Click()


' CREATE_IEAPP Macro
' Testing signing in creating an internet explorer application and passing username and pw to it
'

'THIS MACRO IS EDITED FROM IEAPP MACRO ,trying to pass values directly to the control boxes on the sign in form and not use send keys


    Dim i As Long
    Dim IE As Object
    Dim objElement As Object
    Dim objCollection As Object
    Dim shape As Excel.shape
Application.DisplayAlerts = False
Application.ScreenUpdating = False

 ' Statusbar
        Application.StatusBar = "MY CODE is loading. Please wait, will take a moment..."
    
    
    Worksheets("Auto SSD Here").Cells.ClearContents
    
    For Each shape In Worksheets("Auto SSD Here").Shapes
        shape.Delete
    Next
    
    
    ' Create InternetExplorer Object
    Set IE = CreateObject("InternetExplorer.Application")
 
    ' You can uncoment Next line To see form results
    IE.Visible = True
 
    ' Send the form data To URL As POST binary request
    IE.Navigate Sheets("Auto Todays SSD").Range("A1").Value 'Changed from MACRO WORKAROUND A13 due to Web Querry not working....trying to go straight to the SSD to login
 
 
    ' Wait while IE loading...
    Do While IE.Busy
        Application.Wait DateAdd("s", 1, Now)
    Loop
  
  'on error skip to scraping data because the error is likely that the internet explorer object is already signed in
  On Error GoTo ScrapingData
  
  
    
    ' Wait while IE loading...
    Do While IE.Busy
        Application.Wait DateAdd("s", 1, Now)
    Loop
             'the line below is working fine, take out the ' to login hard coded again, but trying to use a cell to pass the value to the login form, SEEMS TO WORK ON LINE BELOW IT
             'IE.Document.getElementByID("Ecom_User_ID").Value = "HARD CODED USERNAME GOES HERE" 'looks like username ID is the same as name, inspecting the source did not have an ID
              IE.document.getElementByID("Ecom_User_ID").Value = Range("B14")
    
      ' Wait while IE loading...
    Do While IE.Busy
        Application.Wait DateAdd("s", 1, Now)
    Loop
             'the line below is working fine, take out the ' to login hard coded again, but trying to use a cell to pass the value to the login form, SEEMS TO WORK ON LINE BELOW IT
             'IE.Document.getElementByID("password-password").Value = "HARD CODED PASSWORD GOES HERE" 'using get elements by ID here
              IE.document.getElementByID("password-password").Value = Range("B15")
             
      ' Wait while IE loading...
    Do While IE.Busy
        Application.Wait DateAdd("s", 1, Now)
    Loop
    
            IE.document.getElementByID("loginButton").Click
 
      ' Wait while IE loading...
    Do While IE.Busy
        Application.Wait DateAdd("s", 1, Now)
    Loop
 
 
 
 
 'THIS SCRAPING DATA PART IS WHAT IS GIVING ME TROUBLE
 'SCRAPING THE DATA USING A WEB QUERY DOES NOT SEEM TO WORK, BECAUSE THE WEB QUERY IS NOT RECOGNIZED AS LOGGED IN, EVEN IF EXCEL OPENED UP AN INTERNET EXPLORER OBJECT AND LOGGED IN A FEW SECONDS BEFORE.
 
 ' THEREFORE TRYING TO SCRAPE THE DATA INSIDE OF THE SAME INTERNET EXPLORER OBJECT WHICH EXCEL VBA USES TO LOG IN. USING COPY/PASTE DOES ACTUALLY USUALLY SEEM TO WORK (USING SEND KEYS) BUT UNFORTUNATELY
 'SOMETIMES INSTEAD OF COPYING AND PASTING THE SSD PART, IT WILL SOMEHOW SKIP THE COPY PORTION, AND SIMPLY PASTE WHATEVER WAS LAST COPIED TO THE CLIPBOARD IN CELL A1
 
 'Furthermore and even more unfortunately, a windows security dialog box pops up which requires clicking on cancel like 39 times before the SSD appears....
 
 'NOW TRYING TO USE A DIFFERENT CODE THAT DOES NOT UTILIZE THE CLIPBOARD OR SENDKEYS, AND COMMENTING AWAY ALL OF THIS TEMPORARILY.
 
'ScrapingData:
'
'Application.DisplayAlerts = False
'
'    SendKeys "^a"
'
'    Do While IE.Busy
'        Application.Wait DateAdd("s", 1, Now)
'    Loop
'
'   SendKeys "^c"
'
'    Do While IE.Busy
'        Application.Wait DateAdd("s", 1, Now)
'    Loop
'
'
'   Worksheets("Auto SSD Here").Cells.UnMerge
'   Worksheets("Auto SSD Here").Paste Destination:=Worksheets("Auto SSD Here").Range("A1")
'   Worksheets("Auto SSD Here").Cells.UnMerge
'
'    ' Clean up
'    'IE.Quit 'Closes the window,deactivate if you want to leave ssd window open to make sure pasted properly, etc...
'    Set IE = Nothing
'    Set objElement = Nothing
'    Set objCollection = Nothing
'
'    Application.StatusBar = ""
'    Application.DisplayAlerts = True
'
'Exit Sub
'ErrorMessageBox:
'    'MsgBox "Error, you are probably already signed in, OR the code is no longer working and needs attention, contact Me."





'This is the new code for scraping that does not use the clipboard and sendkeys, the portion above has all been commented INOP so that if needed it can be re-activated
'This code below has a problem too though, it only populates into the A column, all data is in one column.... inquiring on Mr Excel

'ScrapingData:
'
'With IE
'            x = .document.body.innertext
'            x = Replace(x, Chr(10), Chr(13))
'            x = Split(x, Chr(13))
'            Worksheets("Auto SSD Here").Range("A1").Resize(UBound(x)) = Application.Transpose(x) 'WHAT DO THE LAST 3 LINES DO???
'
'            .Quit
'        End With
'
'        'Cleaning up
'         Set IE = Nothing
'         Set objElement = Nothing
'         Set objCollection = Nothing
'         Application.StatusBar = ""
'         Application.DisplayAlerts = True
'
'
'The problem with this code above is that every line of the SSD populates into only 1 column. It does get all of the data, and put each row on a different row (but also scrapes notes).
         
 'Got this code below off of Mr Excel after inquiring
 
 
ScrapingData:

 'On Error GoTo ErrorMessageBox
 
 
 Dim table As Object, tRow As Object, tCell As Object
    Dim rowIndex
    Set table = IE.document.getElementsByTagName("TABLE")(3)   '0 = 1st table, 1 = 2nd table, etc.
    

    For Each tRow In table.Rows
        
        For Each tCell In tRow.Cells
        
        On Error Resume Next
        
                   'On Error GoTo SKIPCELL 'reactivate this if deactivate on error resume next
            
            Worksheets("Auto SSD Here").Range("A1").Offset(tRow.rowIndex, tCell.cellIndex).Value = tCell.innerText 'this line is erroring after doing like 3 rows
            
            On Error GoTo 0 'GET RID OF THIS LINE OF CODE ONCE DONE TRYING TO TROUBLESHOOT!!!!!!!!!!!!!!!!!!!!!!!
            
            'GoTo NOERRORNEXT 'reactivate this if deactivate on error resume next
            
'SKIPCELL: 'reactivate this if deactivate on error resume next
 'Resume SKIPCELL2 'reactivate this if deactivate on error resume next
'SKIPCELL2: 'reactivate this if deactivate on error resume next

'NOERRORNEXT: 'reactivate this if deactivate on error resume next
              
        Next tCell
    
    Next tRow
  
 
 
 
 ' Clean up
    'IE.Quit 'Closes the window,deactivate if you want to leave ssd window open to make sure pasted properly, etc...
    Set IE = Nothing
    Set objElement = Nothing
    Set objCollection = Nothing

    Application.StatusBar = ""
    Application.DisplayAlerts = True
    Application.ScreenUpdating = True
 
  Exit Sub
 
'ErrorMessageBox:
 '   MsgBox "Error, exit Excel completely and retry, if still throwing this error, coding needs attention, contact Me."
 
 ' Clean up
    'IE.Quit 'Closes the window,deactivate if you want to leave ssd window open to make sure pasted properly, etc...
    Set IE = Nothing
    Set objElement = Nothing
    Set objCollection = Nothing

    Application.StatusBar = ""
    Application.DisplayAlerts = True
    Application.ScreenUpdating = True
 
    End Sub
 
Last edited:
Upvote 0
To clarify however, the code as it stands now is still not allowing me to skip the error and try to continue the loop and see if I still get the full table I am looking for. I must not be doing error handling within the double loop properly because it still errors out and/or infinite loops and freezes excel no matter where I try placing the various different error handlers I've tried...
 
Upvote 0
FYI with that current code I posted, I am currently getting the error "Run-time error '451': Property let procedure not defined and property get procedure did not return an object" on the line

For Each tCell In tRow.Cells
 
Upvote 0
Go into the VBE and hit Ctrl + G to pull up the immediate window. Once that's done run your code. When you get the error....

Type this into the immediate window.... what does it return?
Code:
?tCell.innerText


May also want to check and make sure your data is table 3 as you indicate....
 
Upvote 0
Thank you for the remarks mrmmickle1! It was actually in table 4, but I did try numerous values with my old code and it did not work properly. BUT good news!

After working on this for a while. I got it to work. I will post the code here just in case anybody is looking for it. It actually signs into the website too, and then pulls the table. MAKE SURE IF YOU USE THIS CODE THAT YOU CHANGE THE USERNAME AND PASSWORD HTML NAMES AND SUBMIT BUTTON HTML NAME TO WHATEVER THE HTML CALLS THEM ON YOUR PAGE, VERY LIKELY IT WILL NOT BE THE SAME AS MINE. I ended up scraping ALL tables, and then use excel functions to filter down to what I need. This seemed to work better for my case. I believe (but am not certain at all) that I may have had other tables INSIDE OF the table I wanted? And possibly this was somehow causing an error? In any case, this worked for me.

Code:
Private Sub CommandButton1_Click()


    Dim IE As Object
    Dim doc As Object
    Dim strURL As String
    Dim USERNAMEinput As String
    Dim PASSWORDinput As String
    
Application.DisplayAlerts = False
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual
Application.EnableEvents = False

'On Error GoTo CleanUp 'going to clean up due to manual calculation still set and causes errors that break sheet with NA's until forced recalculation

' Statusbar
        Application.StatusBar = "CODE is loading. Please wait, this should take a minute or so..."

 Worksheets("Auto SSD Here").Cells.ClearContents


    strURL = Sheets("URL Reference").Range("A30").Value 'THIS SETS THE URL, SWITCH TO HARD CODED URL, OR SWITCH REFERENCE TO THE CELL THAT CONTINES YOUR URL

    Set IE = CreateObject("InternetExplorer.Application")
    With IE
         .Visible = False 'Make this true to see the internet explorer window excel is using, False to see nothing

        .navigate strURL
        Do Until .ReadyState = 4: DoEvents: Loop
        Do While .Busy: DoEvents: Loop
        
'Resume SignIn
SignIn:
            
            
            'on error skip to scraping data because the error is likely that the internet explorer object is already signed in
  On Error GoTo ScrapingData
  
  
  
    
    ' Wait while IE loading...
    Do While IE.Busy
        Application.Wait DateAdd("s", 1, Now)
    Loop
    
    
             'the line below is working fine, take out the ' to login hard coded again, but now using a user input box to pass the value to the login form
             'IE.Document.getElementByID("Ecom_User_ID").Value = "HARD CODED USERNAME GOES HERE"
              IE.document.getElementByID("Ecom_User_ID").Value = Range("A1")
              'above will try to input A1 in user ID, keeping it in simply because if it is already signed in, it will fail, and error to scraping data BUT CHANGE THIS FOR YOUR URL HTML, BECAUSE IT IS PROBABLY NOT CALLED ECOMUSERID FOR YOU
              
              
              USERNAMEinput = Application.InputBox("Enter your ID Number/User Name:", "Input Box Text", Type:=2)
               IE.document.getElementByID("Ecom_User_ID").Value = USERNAMEinput
              'CHANGE ABOVE FOR YOUR URL HTML
    
      ' Wait while IE loading...
    Do While IE.Busy
        Application.Wait DateAdd("s", 1, Now)
    Loop
    
              
              
              PASSWORDinput = Application.InputBox("Enter your Password:", "Input Box Text", Type:=2)
              IE.document.getElementByID("password-password").Value = PASSWORDinput
              'CHANGE ABOVE FOR YOUR URL HTML
              
             
      ' Wait while IE loading...
    Do While IE.Busy
        Application.Wait DateAdd("s", 1, Now)
    Loop
    
    
    
            IE.document.getElementByID("loginButton").Click
            'CHANGE ABOVE FOR YOUR URL HTML
            
            
 
      ' Wait while IE loading...
    Do While IE.Busy
        Application.Wait DateAdd("s", 1, Now)
    Loop
 
 

            
ScrapingData:
Resume ScrapingData1
ScrapingData1:
            
            
                Set doc = IE.document
                
                GetAllTables doc 'this calls the entire scraping macro below

                .Quit
            End With
            
            
            'THIS LINE ALERTS THE USER THAT LOGIN FAILED (IT CHECKS TO SEE IF ROW 8 IS BLANK BECAUSE ROW 8 SHOULD CONTAIN A TABLE TITLE)
'CHANGE THIS FOR YOUR CODE BECAUSE ROW 8 MAY NOT ALWAYS CONTAIN SOMETHING, BUT THIS WAS THE BEST WAY I COULD FIGURE OUT HOW TO ALERT THE USER THAT THE SIGN IN PROCESS FAILED AND SO SCRAPING WRONG INFO
            If ThisWorkbook.Worksheets("Auto SSD Here").Range("A8").Value = "" Then
                MsgBox "Your login credentials were incorrect, try logging in again. (If you are SURE you logged in correctly, then the code might need attention, contact ME.)"
            End If
            
CleanUp:
 
    ' Clean up
    Application.StatusBar = ""
    Application.DisplayAlerts = True
    Application.Calculation = xlCalculationAutomatic
    Application.EnableEvents = True
    Application.ScreenUpdating = True

            
            
            
            
            
            
        End Sub
        
        
        
        

        Sub GetAllTables(doc As Object)

             ' get all the tables from a webpage document, doc, and put them in a new worksheet

            Dim ws As Worksheet
            Dim rng As Range
            Dim tbl As Object
            Dim rw As Object
            Dim cl As Object
            Dim tabno As Long
            Dim nextrow As Long
            Dim I As Long
            
            
Application.DisplayAlerts = False
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual



            Set ws = Worksheets("Auto SSD Here")

            For Each tbl In doc.getElementsByTagName("TABLE")
                tabno = tabno + 1
                nextrow = nextrow + 1
                Set rng = ws.Range("B" & nextrow)
                rng.Offset(, -1) = "Table " & tabno
                For Each rw In tbl.Rows
                    For Each cl In rw.Cells
                        rng.Value = cl.outerText
                        Set rng = rng.Offset(, 1)
                        I = I + 1
                    Next cl
                    nextrow = nextrow + 1
                    Set rng = rng.Offset(1, -I)
                    I = 0
                'below I am trying to insert a status bar update for every new row
'CHANGE THIS TO SUIT YOUR SHEET SHEET THOUGH, BECAUSE THIS IS ASSUMING THERE ARE ABOUT 5-6 HUNDRED ROWS TO SCRAPE. I WASN'T SURE HOW TO MAKE THE MACRO COUNT THE ROWS, AND THEN COMPARE THE REAL ROWS
'AGAINST THE COMPLETED ROWS, SO INSTEAD IT IS AN ESTIMATE. MACRO MAY FINISH BEFORE REACHING 100% OR AFTER REACHING 120% OR SO IN MY CASE, BUT THIS IS OKAY FOR MY PURPOSES.
                Application.StatusBar = "Approx. " & nextrow / 5.5 & "% complete."

                
                Next rw
                
            Next tbl

            'ws.Cells.ClearFormats

        End Sub
 
Upvote 0

Forum statistics

Threads
1,214,601
Messages
6,120,460
Members
448,965
Latest member
grijken

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top