Thanks Thanks:  0
Likes Likes:  0
Page 1 of 2 12 LastLast
Results 1 to 10 of 16

Thread: Excel VBA to Sign Into and then Extract Data from a Website

  1. #1
    New Member
    Join Date
    Mar 2016
    Posts
    20
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Question Excel VBA to Sign Into and then Extract Data from a Website

    Hello Mr. Excel folks!

    I have made great strides on my attempts to automate logging into, and then extracting data, from a website.

    I have gotten all the way to the point where the data is indeed in excel! However, the data is all going in only one column, column A, and I cannot figure out how to fix this problem. I was hoping somebody could take a look at this code and maybe give me a hint as to what I need to do to make the table populate properly (each cell in a different cell).

    It is so close, but with the entire row (like 30 columns or so) populating into ONLY one column (column A) it just won't work for me.

    Please, any help would be greatly appreciated.

    I will post my current code below.


    Code:
    Private Sub CommandButton1_Click()
    
    
    ' CREATE_IEAPP Macro
    ' Testing signing in creating an internet explorer application and passing username and pw to it
    '
    
    'THIS MACRO IS EDITED FROM IEAPP MACRO, trying to pass values directly to the control boxes on the sign in form and not use send keys
    
    
        Dim i As Long
        Dim IE As Object
        Dim objElement As Object
        Dim objCollection As Object
    Application.DisplayAlerts = False
        
        Worksheets("Auto SSD Here").Cells.ClearContents
        
        
        ' Create InternetExplorer Object
        Set IE = CreateObject("InternetExplorer.Application")
     
        ' You can uncoment Next line To see form results
        IE.Visible = True
     
        ' Send the form data To URL As POST binary request
        IE.Navigate Sheets("Auto Todays SSD").Range("A1").Value 'Changed from MACRO WORKAROUND A13 due to Web Querry not working....trying to go straight to the SSD to login
     
        ' Statusbar
            Application.StatusBar = "My CODE is loading. Please wait..."
     
        ' Wait while IE loading...
        Do While IE.Busy
            Application.Wait DateAdd("s", 1, Now)
        Loop
      
      'on error scip to scraping data because the error is likely that the internet explorer object is already signed in
      On Error GoTo ScrapingData
      
      
        
        ' Wait while IE loading...
        Do While IE.Busy
            Application.Wait DateAdd("s", 1, Now)
        Loop
                 'the line below is working fine, take out the ' to login hard coded again, but trying to use a cell to pass the value to the login form, SEEMS TO WORK ON LINE BELOW IT
                 'IE.Document.getElementByID("Ecom_User_ID").Value = "HARD CODED USERNAME GOES HERE" 
                  IE.document.getElementByID("Ecom_User_ID").Value = Range("B14")
        
          ' Wait while IE loading...
        Do While IE.Busy
            Application.Wait DateAdd("s", 1, Now)
        Loop
                 'the line below is working fine, take out the ' to login hard coded again, but trying to use a cell to pass the value to the login form, SEEMS TO WORK ON LINE BELOW IT
                 'IE.Document.getElementByID("password-password").Value = "HARD CODED PASSWORD GOES HERE"
                  IE.document.getElementByID("password-password").Value = Range("B15")
                 
          ' Wait while IE loading...
        Do While IE.Busy
            Application.Wait DateAdd("s", 1, Now)
        Loop
        
                IE.document.getElementByID("loginButton").Click
     
          ' Wait while IE loading...
        Do While IE.Busy
            Application.Wait DateAdd("s", 1, Now)
        Loop
     
     
     
     
     'THIS SCRAPING DATA PART IS WHAT IS GIVING ME TROUBLE
     'SCRAPING THE DATA USING A WEB QUERY DOES NOT SEEM TO WORK, BECAUSE THE WEB QUERY IS NOT RECOGNIZED AS LOGGED IN, EVEN IF EXCEL OPENED UP AN INTERNET EXPLORER OBJECT AND LOGGED IN A FEW SECONDS BEFORE.
     
     ' THEREFORE TRYING TO SCRAPE THE DATA INSIDE OF THE SAME INTERNET EXPLORER OBJECT WHICH EXCEL VBA USES TO LOG IN. USING COPY/PASTE DOES ACTUALLY USUALLY SEEM TO WORK (USING SEND KEYS) BUT UNFORTUNATELY
     'SOMETIMES INSTEAD OF COPYING AND PASTING THE SSD PART, IT WILL SOMEHOW SKIP THE COPY PORTION, AND SIMPLY PASTE WHATEVER WAS LAST COPIED TO THE CLIPBOARD IN CELL A1
     
     'Furthermore and even more unfortunately, a windows security dialog box pops up which requires clicking on cancel like 39 times before the SSD appears....
     
     'NOW TRYING TO USE A DIFFERENT CODE THAT DOES NOT UTILIZE THE CLIPBOARD OR SENDKEYS, AND COMMENTING AWAY ALL OF THIS TEMPORARILY.
     
    'ScrapingData:
    '
    'Application.DisplayAlerts = False
    '
    '    SendKeys "^a"
    '
    '    Do While IE.Busy
    '        Application.Wait DateAdd("s", 1, Now)
    '    Loop
    '
    '    SendKeys "^c"
    '
    '    Do While IE.Busy
    '        Application.Wait DateAdd("s", 1, Now)
    '    Loop
    '
    '
    '   Worksheets("Auto SSD Here").Cells.UnMerge
    '   Worksheets("Auto SSD Here").Paste Destination:=Worksheets("Auto SSD Here").Range("A1")
    '   Worksheets("Auto SSD Here").Cells.UnMerge
    '
    '    ' Clean up
    '    'IE.Quit 'Closes the window,deactivate if you want to leave ssd window open to make sure pasted properly, etc...
    '    Set IE = Nothing
    '    Set objElement = Nothing
    '    Set objCollection = Nothing
    '
    '    Application.StatusBar = ""
    '    Application.DisplayAlerts = True
    '
    'Exit Sub
    'ErrorMessageBox:
    '    'MsgBox "Error, you are probably already signed in, OR the code is no longer working and needs attention, contact xxxxxxxxxxx."
    
    
    
    
    
    'This is the new code for scraping that does not use the clipboard and sendkeys, the portion above has all been commented INOP so that if needed it can be re-activated
    'This code below has a problem too though, it only populates into the A column, all data is in one column.... inquiring on Mr Excel
    
    ScrapingData:
    
    With IE
                x = .document.body.innertext
                x = Replace(x, Chr(10), Chr(13))
                x = Split(x, Chr(13))
                Worksheets("Auto SSD Here").Range("A1").Resize(UBound(x)) = Application.Transpose(x) 'WHAT DO THE LAST 3 LINES DO???
    
                .Quit
            End With
    
            'Cleaning up
             Set IE = Nothing
             Set objElement = Nothing
             Set objCollection = Nothing
             Application.StatusBar = ""
             Application.DisplayAlerts = True
    
    
    'The problem with this code above is that every line of the SSD populates into only 1 column. It does get all of the data, and put each row on a different row (but also scrapes notes).
             
             
             
             
             
             
        End Sub

  2. #2
    Board Regular
    Join Date
    Oct 2007
    Posts
    5,646
    Post Thanks / Like
    Mentioned
    6 Post(s)
    Tagged
    2 Thread(s)

    Default Re: Excel VBA to Sign Into and then Extract Data from a Website

    If the web data is in a HTML table element then you could extract the rows and columns like this:
    Code:
        Dim table As Object, tRow As Object, tCell As Object
        Dim rowIndex
        Set table = IE.document.getElementsByTagName("TABLE")(0)   '0 = 1st table, 1 = 2nd table, etc.
        For Each tRow In table.Rows
            For Each tCell In tRow.Cells
                Worksheets("Auto SSD Here").Range("A1").Offset(tRow.rowIndex, tCell.cellIndex).Value = tCell.innerText
            Next
        Next
    The above code assumes the data is in the first table. For a different table change the 0 in the ("TABLE")(0) line. You can determine the correct table by looking at the HTML source, or trial and error by changing the 0 to 1, 2, 3, etc.

  3. #3
    New Member
    Join Date
    Mar 2016
    Posts
    20
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Excel VBA to Sign Into and then Extract Data from a Website

    Thanks so much John_w!

    I imputed your code instead of the part of my code that was giving me trouble. This seems MUCH closer. Unfortunately though, I am still getting an error. It actually does pull 3 rows completely successfully with all 31 columns. Then it pulls a 4th row with only 22 columns, and the rest of the 500 or so rows it is not pulling up at all (I assume because the error is stopping it).

    It errors out on this line inside of the loop:
    Code:
    Worksheets("Auto SSD Here").Range("A1").Offset(tRow.rowIndex, tCell.cellIndex).Value = tCell.innerText
    Inside of the other line of code where you specify which table to grab, I tried it with the 0, with a 1, then a 2, then a 3. It seems 3 is the table I need. (1 and 2 were some header type things, 4,5,6 were extraneous tables that are NOT the missing rows, and not information I actually need. Maybe it would somehow help to just pull ALL tables, but somehow I doubt it.....)

    I don't know if this helps, but I did read the HTML source. I'm not great at reading this stuff, but it looks like when the coding gets to the table I want, there is a TON of lines of code. All of the lines of code look pretty much the same, except for the individual content of each individual cell. It looks as though each new row starts with a TR and then end with a /TR, and then every new column in that row starts with a TD, and then ends with a /TD. I would simply post exactly what page I'm trying to scrape, but of course it is proprietary.

    I was wondering John_w if you (or anybody else too for that matter?) might have some insight on this issue? Thanks!

  4. #4
    Board Regular
    Join Date
    Oct 2007
    Posts
    5,646
    Post Thanks / Like
    Mentioned
    6 Post(s)
    Tagged
    2 Thread(s)

    Default Re: Excel VBA to Sign Into and then Extract Data from a Website

    What is the error?

    Is the table being loaded/generated dynamically? It could be that the table is not completely loaded when the loop is running. In that case the code should wait until the table is completely loaded before looping through the rows and columns. However, without seeing the HTML I can't really help further.

    Quote Originally Posted by Kc7487 View Post
    It looks as though each new row starts with a TR and then end with a /TR, and then every new column in that row starts with a TD, and then ends with a /TD.
    That is a HTML table structure. TR tags are the rows and within each row there are TD tags for the columns.

  5. #5
    New Member
    Join Date
    Mar 2016
    Posts
    20
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Excel VBA to Sign Into and then Extract Data from a Website

    Oops! I thought I posted the error, sorry about that. The error which Excel is throwing on the line I specified earlier (...Worksheets("Auto SSD Here").Range("A1").Offset(tRow.rowIndex, tCell.cellIndex).Value = tCell.innerText...) is as follows:

    ..."Run-time error '1004': Application-defined or object-defined error"...

    In terms of the table being loaded/generated dynamically, I am uncertain of what this means exactly, however I do have the internet explorer object window visible, and can see the entirety of the table loads up and displays inside the window at the same time, (so I believe it should all be completely loaded while the loop is running???)

    I wanted to like to experiment with simply skipping the error, and trying to load whatever cells come next in the loop, but the coding I have been adding (On Error Resume Next) does not seem to fit this particular situation, it still errors the same way as before. Any idea what error handling procedure I could use to try and accomplish skipping the error to see if I still get the proper results?

    Or I suppose even better yet, any idea on how I might try to properly resolve the error with this additional information? Or any other information I should try to figure out that could help? (I realize of course it would be easier given the HTML code, my apologies, I know it makes it much more broad of an issue to solve, I wish I could just post it :-/).

  6. #6
    Board Regular
    Join Date
    Oct 2007
    Posts
    5,646
    Post Thanks / Like
    Mentioned
    6 Post(s)
    Tagged
    2 Thread(s)

    Default Re: Excel VBA to Sign Into and then Extract Data from a Website

    I don't know why that error would occur there. Perhaps you could check the values of tRow.rowIndex, tCell.cellIndex and tCell.innerText to see if they are sensible values.

    Sorry, but I can't help further without the URL and access to the page.

  7. #7
    New Member
    Join Date
    Mar 2016
    Posts
    20
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Excel VBA to Sign Into and then Extract Data from a Website

    Thank you for the remarks John_w, I have good news! After working on this for a while. I got it to work! I will post the code here just in case anybody is looking for it. It actually signs into the website too, and then pulls the table. MAKE SURE IF YOU USE THIS CODE THAT YOU CHANGE THE USERNAME AND PASSWORD HTML NAMES AND SUBMIT BUTTON HTML NAME TO WHATEVER THE HTML CALLS THEM ON YOUR PAGE, VERY LIKELY IT WILL NOT BE THE SAME AS MINE. I ended up scraping ALL tables, and then use excel functions to filter down to what I need. This seemed to work better for my case. I believe (but am not certain at all) that I may have had other tables INSIDE OF the table I wanted? And possibly this was somehow causing an error? In any case, this worked for me.

    Code:
    Private Sub CommandButton1_Click()
    
    
        Dim IE As Object
        Dim doc As Object
        Dim strURL As String
        Dim USERNAMEinput As String
        Dim PASSWORDinput As String
        
    Application.DisplayAlerts = False
    Application.ScreenUpdating = False
    Application.Calculation = xlCalculationManual
    Application.EnableEvents = False
    
    'On Error GoTo CleanUp 'going to clean up due to manual calculation still set and causes errors that break sheet with NA's until forced recalculation
    
    ' Statusbar
            Application.StatusBar = "CODE is loading. Please wait, this should take a minute or so..."
    
     Worksheets("Auto SSD Here").Cells.ClearContents
    
    
        strURL = Sheets("URL Reference").Range("A30").Value 'THIS SETS THE URL, SWITCH TO HARD CODED URL, OR SWITCH REFERENCE TO THE CELL THAT CONTINES YOUR URL
    
        Set IE = CreateObject("InternetExplorer.Application")
        With IE
             .Visible = False 'Make this true to see the internet explorer window excel is using, False to see nothing
    
            .navigate strURL
            Do Until .ReadyState = 4: DoEvents: Loop
            Do While .Busy: DoEvents: Loop
            
    'Resume SignIn
    SignIn:
                
                
                'on error skip to scraping data because the error is likely that the internet explorer object is already signed in
      On Error GoTo ScrapingData
      
      
      
        
        ' Wait while IE loading...
        Do While IE.Busy
            Application.Wait DateAdd("s", 1, Now)
        Loop
        
        
                 'the line below is working fine, take out the ' to login hard coded again, but now using a user input box to pass the value to the login form
                 'IE.Document.getElementByID("Ecom_User_ID").Value = "HARD CODED USERNAME GOES HERE"
                  IE.document.getElementByID("Ecom_User_ID").Value = Range("A1")
                  'above will try to input A1 in user ID, keeping it in simply because if it is already signed in, it will fail, and error to scraping data BUT CHANGE THIS FOR YOUR URL HTML, BECAUSE IT IS PROBABLY NOT CALLED ECOMUSERID FOR YOU
                  
                  
                  USERNAMEinput = Application.InputBox("Enter your ID Number/User Name:", "Input Box Text", Type:=2)
                   IE.document.getElementByID("Ecom_User_ID").Value = USERNAMEinput
                  'CHANGE ABOVE FOR YOUR URL HTML
        
          ' Wait while IE loading...
        Do While IE.Busy
            Application.Wait DateAdd("s", 1, Now)
        Loop
        
                  
                  
                  PASSWORDinput = Application.InputBox("Enter your Password:", "Input Box Text", Type:=2)
                  IE.document.getElementByID("password-password").Value = PASSWORDinput
                  'CHANGE ABOVE FOR YOUR URL HTML
                  
                 
          ' Wait while IE loading...
        Do While IE.Busy
            Application.Wait DateAdd("s", 1, Now)
        Loop
        
        
        
                IE.document.getElementByID("loginButton").Click
                'CHANGE ABOVE FOR YOUR URL HTML
                
                
     
          ' Wait while IE loading...
        Do While IE.Busy
            Application.Wait DateAdd("s", 1, Now)
        Loop
     
     
    
                
    ScrapingData:
    Resume ScrapingData1
    ScrapingData1:
                
                
                    Set doc = IE.document
                    
                    GetAllTables doc 'this calls the entire scraping macro below
    
                    .Quit
                End With
                
                
                'THIS LINE ALERTS THE USER THAT LOGIN FAILED (IT CHECKS TO SEE IF ROW 8 IS BLANK BECAUSE ROW 8 SHOULD CONTAIN A TABLE TITLE)
    'CHANGE THIS FOR YOUR CODE BECAUSE ROW 8 MAY NOT ALWAYS CONTAIN SOMETHING, BUT THIS WAS THE BEST WAY I COULD FIGURE OUT HOW TO ALERT THE USER THAT THE SIGN IN PROCESS FAILED AND SO SCRAPING WRONG INFO
                If ThisWorkbook.Worksheets("Auto SSD Here").Range("A8").Value = "" Then
                    MsgBox "Your login credentials were incorrect, try logging in again. (If you are SURE you logged in correctly, then the code might need attention, contact ME.)"
                End If
                
    CleanUp:
     
        ' Clean up
        Application.StatusBar = ""
        Application.DisplayAlerts = True
        Application.Calculation = xlCalculationAutomatic
        Application.EnableEvents = True
        Application.ScreenUpdating = True
    
                
                
                
                
                
                
            End Sub
            
            
            
            
    
            Sub GetAllTables(doc As Object)
    
                 ' get all the tables from a webpage document, doc, and put them in a new worksheet
    
                Dim ws As Worksheet
                Dim rng As Range
                Dim tbl As Object
                Dim rw As Object
                Dim cl As Object
                Dim tabno As Long
                Dim nextrow As Long
                Dim I As Long
                
                
    Application.DisplayAlerts = False
    Application.ScreenUpdating = False
    Application.Calculation = xlCalculationManual
    
    
    
                Set ws = Worksheets("Auto SSD Here")
    
                For Each tbl In doc.getElementsByTagName("TABLE")
                    tabno = tabno + 1
                    nextrow = nextrow + 1
                    Set rng = ws.Range("B" & nextrow)
                    rng.Offset(, -1) = "Table " & tabno
                    For Each rw In tbl.Rows
                        For Each cl In rw.Cells
                            rng.Value = cl.outerText
                            Set rng = rng.Offset(, 1)
                            I = I + 1
                        Next cl
                        nextrow = nextrow + 1
                        Set rng = rng.Offset(1, -I)
                        I = 0
                    'below I am trying to insert a status bar update for every new row
    'CHANGE THIS TO SUIT YOUR SHEET SHEET THOUGH, BECAUSE THIS IS ASSUMING THERE ARE ABOUT 5-6 HUNDRED ROWS TO SCRAPE. I WASN'T SURE HOW TO MAKE THE MACRO COUNT THE ROWS, AND THEN COMPARE THE REAL ROWS
    'AGAINST THE COMPLETED ROWS, SO INSTEAD IT IS AN ESTIMATE. MACRO MAY FINISH BEFORE REACHING 100% OR AFTER REACHING 120% OR SO IN MY CASE, BUT THIS IS OKAY FOR MY PURPOSES.
                    Application.StatusBar = "Approx. " & nextrow / 5.5 & "% complete."
    
                    
                    Next rw
                    
                Next tbl
    
                'ws.Cells.ClearFormats
    
            End Sub

  8. #8
    New Member
    Join Date
    Nov 2018
    Posts
    11
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Excel VBA to Sign Into and then Extract Data from a Website

    Quote Originally Posted by John_w View Post
    If the web data is in a HTML table element then you could extract the rows and columns like this:
    Code:
        Dim table As Object, tRow As Object, tCell As Object
        Dim rowIndex
        Set table = IE.document.getElementsByTagName("TABLE")(0)   '0 = 1st table, 1 = 2nd table, etc.
        For Each tRow In table.Rows
            For Each tCell In tRow.Cells
                Worksheets("Auto SSD Here").Range("A1").Offset(tRow.rowIndex, tCell.cellIndex).Value = tCell.innerText
            Next
        Next
    The above code assumes the data is in the first table. For a different table change the 0 in the ("TABLE")(0) line. You can determine the correct table by looking at the HTML source, or trial and error by changing the 0 to 1, 2, 3, etc.
    Hi John,

    Will like to check how should I chng the above coding so that I can get a specific row of the table.

    Please assist.

    Many thanks.

  9. #9
    Board Regular
    Join Date
    Oct 2007
    Posts
    5,646
    Post Thanks / Like
    Mentioned
    6 Post(s)
    Tagged
    2 Thread(s)

    Default Re: Excel VBA to Sign Into and then Extract Data from a Website

    Code:
        Set table = IE.document.getElementsByTagName("TABLE")(0)   '0 = 1st table, 1 = 2nd table, etc.
        Set tRow = table.Rows(0)  '0 = 1st row, 1 = 2nd row
        For Each tCell In tRow.Cells
            Worksheets("Auto SSD Here").Range("A1").Offset(0, tCell.cellIndex).Value = tCell.innerText
        Next

  10. #10
    New Member
    Join Date
    Nov 2018
    Posts
    11
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Excel VBA to Sign Into and then Extract Data from a Website

    Quote Originally Posted by John_w View Post
    Code:
        Set table = IE.document.getElementsByTagName("TABLE")(0)   '0 = 1st table, 1 = 2nd table, etc.
        Set tRow = table.Rows(0)  '0 = 1st row, 1 = 2nd row
        For Each tCell In tRow.Cells
            Worksheets("Auto SSD Here").Range("A1").Offset(0, tCell.cellIndex).Value = tCell.innerText
        Next
    Hi,

    will like to check how can I change the coding so as that I can a specific column of a table.

    Please assist.
    Thanks

Some videos you may like

User Tag List

Tags for this Thread

Like this thread? Share it with others

Like this thread? Share it with others

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •