Data Extraction From Website

drmingle

Board Regular
Joined
Oct 5, 2009
Messages
229
My objective: I want to be able to place a first and last name combination into code (in this example Alison Smith) and see if the person is recognized by the site. If "yes", I want to download the information to the active spreadsheet. If "no", then what ever the result (i.e. nothing found, etc) can populate on the spreadsheet.

The below code is pulling some info, but not all.

Any help would be appreciated.

Code:
Sub AZMD()
 
    Dim URL As String
    Dim IE As Object
    Dim lastName As String, firstName As String, start As Long
    Dim rowOffset As Long
 
    Sheet1.Cells.ClearContents
    rowOffset = 0
 
    URL = "[URL]http://azmd.gov/glsuiteweb/clients/azbom/public/WebVerificationSearch.aspx[/URL]"
 
    Set IE = CreateObject("InternetExplorer.Application")
 
    tbLastName = "smith"
    tbFirstName = "alison"
    start = 1
 
    While start < 101
        With IE
            .Visible = False
            .navigate URL & "?tbLastName=" & tbLastName & "&tbFirstName=" & tbFirstName & "&start=" & start
             While .Busy Or .readyState <> 4: DoEvents: Wend
            Extract_HTML_Table .document, 5, Sheet1.Range("A1").Offset(rowOffset, 0)
        End With
        start = start + 20  'Next 20 results
        rowOffset = rowOffset + 20
    Wend
End Sub
 
Private Sub Extract_HTML_Table(document As Object, tableNumber As Integer, destination As Range)
    'Extract data in rows and columns from a HTML table and put the data starting at the specified destination
 
    Dim tables As Object
    Dim table As Object
    Dim row As Object, cell As Object
    Dim nrow As Long, ncol As Long
 
    Set tables = document.getElementsByTagName("TABLE")
 
    If tableNumber <= tables.Length Then
 
        'Get the tableNumber'th table
 
        Set table = tables(tableNumber - 1)
        'Fill rows and columns starting at the destination range
 
        nrow = 0
        For Each row In table.Rows
            ncol = 0
            If row.RowIndex <> 0 Then    'ignore the first row because it contains the column headings
                For Each cell In row.Cells
                    'Debug.Print cell.innerText
                    destination.Offset(nrow, ncol).Value = cell.innerText
                    ncol = ncol + 1
                Next
                nrow = nrow + 1
            End If
        Next
 
    Else
 
        MsgBox "Unable to retrieve table number " & tableNumber & " because " & vbNewLine & _
            document.URL & " contains only " & tables.Length & " tables"
 
    End If
 
End Sub
 

Excel Facts

What is the fastest way to copy a formula?
If A2:A50000 contain data. Enter a formula in B2. Select B2. Double-click the Fill Handle and Excel will shoot the formula down to B50000.
John, I appreciate your due dilligence...

There is no malicious intent by posting here and there. I simply would like a solution to the problem I am faced with...

Do you have a solution for my posted problem?

Any help would be appreciated. What do you recommend in place of query string?
 
Upvote 0
I'm sure you have no malicious intent, but it's polite and helpful if you say that you've posted on another forum; no. 2 in the posting guidelines sticky: http://www.mrexcel.com/forum/showthread.php?t=127080

Try this. Make sure you follow the comment at the top of the code.
Code:
Option Explicit

'Needs references to Microsoft Internet Controls and Microsoft HTML Object Library.  Set these in Tools - References in the VB Editor.


Public Sub AZMD_Name_Search()

    Dim IE As InternetExplorer
    Dim URL As String
    Dim HTMLdoc As HTMLDocument
    Dim lastName As String, firstName As String
    
    URL = "http://azmd.gov/glsuiteweb/clients/azbom/public/WebVerificationSearch.aspx"
    lastName = "Smith"
    firstName = "Alison"
    
    Set IE = New InternetExplorer
    
    With IE
        .navigate URL
        .Visible = True
    
        'Wait for initial page to load
    
        While .Busy Or .readyState <> READYSTATE_COMPLETE: DoEvents: Wend
    
        Set HTMLdoc = .document
    End With
    
    'Populate form fields
    
    '< input name="tbLastName" type="text" value="alison" id="tbLastName" style="width:180px;" />
    '< input name="tbFirstName" type="text" value="smith" id="tbFirstName" style="width:180px;" />
    
    HTMLdoc.all.tbLastName.Value = lastName
    HTMLdoc.all.tbFirstName.Value = firstName
    
    'Click the Name Search button
    '< input type="submit" name="btnName" value="Name Search"
    'o_nclick="javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions("btnName", "",
    'true, "", "", false, false))" id="btnName" style="width:121px;" />
    
    HTMLdoc.all.btnName.Click
        
    With IE
        'Wait for new page to load
        While .Busy Or .readyState <> READYSTATE_COMPLETE: DoEvents: Wend
        
        Sheet1.Cells.ClearContents
        
        If .LocationName = "Profile" Then
            
            'Profile found - extract data
            
            Extract_Tables HTMLdoc, Sheet1.Range("A1")
        
        ElseIf .LocationName = "Licensee Search" Then
        
            'Original page, so no results found
            '< textarea name="tbErrorName" rows="2" cols="20" id="tbErrorName" style="color:Red;background-color:White;height:42px;width:100%;">No doctors found with those specifications.< /textarea>< /td>

            Sheet1.Range("A1").Value = HTMLdoc.all.tbErrorName.innerText

        End If
        
    End With
    
End Sub


Private Sub Extract_Tables(document As HTMLDocument, destination As Range)
    
    'Extract data in rows and columns from all HTML tables and put the data starting at the specified destination
    'This is similar to how a Web Query formats the retrieved data
 
    Dim tables As IHTMLElementCollection
    Dim table As HTMLTable
    Dim row As HTMLTableRow, cell As HTMLTableCell
    Dim rowOffset As Long, colOffset As Long
    Dim lines As Variant
    Dim i As Integer
    
    Set tables = document.getElementsByTagName("TABLE")
 
    rowOffset = 0
    For Each table In tables
        For Each row In table.Rows
            colOffset = 0
            For Each cell In row.Cells
                'Only extract cell text if this cell has no nested < td> elements, meaning it is the innermost cell element
                If cell.all.tags("TD").Length = 0 Then
                    'Put each HTML < br> 'line' in separate rows
                    lines = Split(cell.innerText, vbCrLf)
                    For i = 0 To UBound(lines)
                        destination.Offset(rowOffset + i, colOffset).Value = lines(i)
                    Next
                    colOffset = colOffset + 1
                End If
            Next
            'Increment row by the number of rows just written to
            rowOffset = rowOffset + i
        Next
    Next
         
End Sub
 
Upvote 0
As you requested I made the reference adjustment:
Microsoft Internet Controls and Microsoft HTML Object Library

I ran the code and got a message box with the following information...

Compile Error:
User-defined type not defined

It seems to be hanging up on the following line of code:
Code:
IE As InternetExplorer

Any ideas?
 
Upvote 0
That error means you haven't set the reference to Microsoft Internet Controls. Please check.
 
Upvote 0
Wonderful job...just what I needed

I had trouble with the Microsoft Internet controls.

I had to find it on my machine and load it...

For those following this thread, I found what I needed here:

C:\WINDOWS\system32\shdocvw.dll

John, thanks a million.
 
Upvote 0
Okay, glad it worked for you.

For completion you should really post in your tek-tips thread with a link to this thread saying that your question has been solved.
 
Upvote 0
I too am looking for a similar macro, would it be possible to get assistance please?
I am trying to create a macro which will interface with a website. I want it to select the value in a cell excel spreadsheet say "b2" insert it in the search box on this web page "http://www.ean-search.org/" then copy the 2 results shown beow the search box into 2 adjoining fields in the same spead sheet that the search value was obtained from. the Macro needs to allow all values in the same column to be searched in the same way, so needs to continue down the column. Kind regards Mike
 
Upvote 0

Forum statistics

Threads
1,215,853
Messages
6,127,328
Members
449,376
Latest member
karenmccabe

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top