extract info from website.

medic5678

Board Regular
Joined
Nov 13, 2018
Messages
67
Hi everyone. I need to extract information from a website. This is for an application for Department of Corrections. We need to enter inmate numbers, where we will get to this page (all public information).

To iterate through the inmate list, we simply need to change the address by changing the inmate number part of it (447711 in this particular case).

so, staff will enter an inmate numbers in column a on the spreadsheet, press a macro key and the macro will go to the website and extract the inmate names to column b, birthdates to c, etc.

The end result is a list of inmate numbers with names and other information, which we will print and use.

Thanks to all who reply!
 

Excel Facts

How to calculate loan payments in Excel?
Use the PMT function: =PMT(5%/12,60,-25000) is for a $25,000 loan, 5% annual interest, 60 month loan.
I didn't see any copyright notice on that site, and anyway is up to you comply with your law.

Going to your request:
-Suppose that you have in cell A2 and downward of the active sheet a list of Id, and you wish to import some of the information from the main table
-you create, for example on C1 to the right, the list of headers you want to extract, exactely as they appear on the web site (ex: "DC Number:"). You don't need to list all the headers available in the table, nor have they to be in the same sequence; but you may not leave empty cells between the headers.

Then you copy the following code in an new (empty) standard module of your vba:
Code:
Dim IE As Object  

Sub GetInmInfo()
Dim bUrl As String, I As Long, myTRs As Object, cTxt As String
Dim cHead As String, J As Long, myMatch, myTB0 As Range
'
Set myTB0 = Sheets("IdDetails").Range("C1")         '<<< The start if Headers to import
'
Range(myTB0.Offset(1, 0), myTB0.End(xlDown).Resize(, Range(myTB0, myTB0.End(xlToRight)).Columns.Count)).ClearContents
Range(myTB0.Offset(1, 0), myTB0.End(xlDown).Resize(, Range(myTB0, myTB0.End(xlToRight)).Columns.Count)).NumberFormat = "@"
bUrl = "Offender Information Search"
If IE Is Nothing Then Set IE = CreateObject("InternetExplorer.Application")
For I = 2 To Cells(Rows.Count, "A").End(xlUp).Row
    If Cells(I, 1) <> "" Then
        Call Accedi(Replace(bUrl, "###", Cells(I, 1).Value, , , vbTextCompare))
        Set myTRs = IE.document.getElementById("ctl00_ContentPlaceHolder1_tblProfile").getElementsByTagName("tr")
        On Error Resume Next
        For J = 0 To myTRs.Length               'read main table content:
            cHead = myTRs(J).getElementsByTagName("th")(0).innertext
            myMatch = Application.Match(cHead, Range(myTB0, myTB0.End(xlToRight)), False)
            If Not IsError(myMatch) Then
                cTxt = myTRs(J).getElementsByTagName("td")(0).innertext
                myTB0.Cells(I, myMatch) = cTxt
            End If
        Next J
        On Error GoTo 0
    End If
Next I
On Error Resume Next
IE.Quit
Set IE = Nothing
MsgBox ("Import Completed")
End Sub

Sub Accedi(ByVal myURL As String)
Dim myDoc, myF
'myURL = tWkSh.Cells(1, 1).Value             'L'url della pagina da accedere
If InStr(1, myURL, "http", vbTextCompare) = 1 Then
    If IE Is Nothing Then Set IE = CreateObject("InternetExplorer.Application")
    With IE
        .Visible = True         '... ie visibLE
        .Navigate myURL         '... go to url
'        Sleep 50
        mystart = Timer
        Do While .Busy: DoEvents:  Loop  'Attesa not busy
        Do While .ReadyState <> 4
            DoEvents
            If Timer > (mystart + 5) Then Exit Do
        Loop 'Attesa document
    End With
    mystart = Timer
    Do  ' additional wait
        DoEvents
        If Timer > mystart + 0.5 Or Timer < mystart Then Exit Do
    Loop
'    Debug.Print "Completato", Format(Now, "hh:mm:ss")
End If
End Sub

The line marked <<< need to be taylored to your need, with the sheet name and the starting cell of your headers

When your list of Id is ready in your active worksheet, then run Sub GetInmInfo

It will first clear the current content of the table, then will search each Id into using the site Search, and set each returned information under the proper header

A sample of my test worksheet:



Extracting more information is just a question of time and patience; and these are rare commodities these days…

Bye
 
Upvote 0
Wow. That was a much more comprehensive solution than I was thinking about. Thanks, you are indeed the master of Excel!
 
Upvote 0

Forum statistics

Threads
1,214,625
Messages
6,120,598
Members
448,973
Latest member
ksonnia

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top