Help With Scraping Emails VBA

fcrisantos

New Member
Joined
Sep 6, 2016
Messages
5
Hello All,

I am very new to this and I have been taking bits and pieced of codes that I have either found online or on youtube.

[+>[SCROOLL TO BOTTOM FOR QUESTION]<+]



I want to collect emails from this website: The State Bar of California Home Page

I have automated the process to click and search all the necessary items to get to the table ( Search by County)

There are two parts to the code:

Part 1:
Works well ( But, I do need to clean up the retuned URLs)

______________________________________________________________________
Sub extractTablesData()


Dim IE As Object, obj As Object
Dim myCounty As String
Dim t As Integer, r As Integer, c As Integer
Dim elemCollection As Object


Set IE = CreateObject("InternetExplorer.Application")


myCounty = InputBox("Enter County")


With IE


.Visible = True
.Navigate ("http://members.calbar.ca.gov/fal/MemberSearch/QuickSearch?FreeText=&x=60&y=3&SoundsLike=false#searchlink")




While IE.readystate <> 4
DoEvents
Wend


Set AllHyperLinks = IE.document.getElementsByTagName("A")


For Each hyper_link In AllHyperLinks

If hyper_link.innerText = "Advanced Search" Then
hyper_link.click
Exit For

End If

Next


For Each obj In IE.document.all.Item("County").Options
If obj.innerText = myCounty Then
obj.Selected = True
End If


Next obj




IE.document.getElementById("advSearch").click




Do While IE.busy: DoEvents: Loop




Dim html As htmlDocument
Dim Link As Object
Dim ElementCol As Object
Dim erow As Long


Application.ScreenUpdating = False
IE.Visible = False


Do While IE.readystate <> READYSTATE_COMPLETE
Application.StatusBar = "Loading Site..."
DoEvents
Loop


Set html = IE.document
Set ElementCol = html.getElementsByTagName("a")
For Each Link In ElementCol
erow = Worksheets("Sheet1").Cells(Rows.Count, 1).End(xlUp).Offset(1, 0).Row
Cells(erow, 1).Value = Link
Cells(erow, 1).Columns.AutoFit
Next




Application.StatusBar = ""
Application.ScreenUpdating = True

End With


Set IE = Nothing






End Sub

_______________________________________________________________________



Part 1 collects the member IDs and puts them into excel - part 2 will then collect the emails from the member IDs

This is where I need help

Part 2:




_______________________________________________________________________
Sub extractTablesData()
Dim IE As Object, obj As Object
Dim t As Integer, r As Integer, c As Integer
Dim elemCollection As Object




Set IE = CreateObject("InternetExplorer.Application")




With IE




.Navigate Sheets("Sheet1").Range("A1").Value
.Visible = True






While IE.readystate <> 4
DoEvents
Wend






ThisWorkbook.Sheets("Sheet2").Range("A1:K500").ClearContents


Set elemCollection = IE.document.getElementsByTagName("e-mail")


For t = 0 To (elemCollection.Length - 1)


For r = 0 To (elemCollection(t).Rows.Length - 1)
For c = 0 To (elemCollection(t).Rows(r).Cells.Length - 1)
ThisWorkbook.Worksheets(2).Cells(r + 1, c + 1) = elemCollection(t).Rows(r).Cells(c).innerText
Next c
Next r
Next t



End With


End Sub


________________________________________________________________________________________________

This is where the emails are difficult to extract - The Html of the site has over a dozen different email links- how is it

possible to extract the correct one and ignore the rest? Or is this measure the site admins use to prevent this? Thanks

in Advanced.


UHuigiSm4ki
 

Excel Facts

Whats the difference between CONCAT and CONCATENATE?
The newer CONCAT function can reference a range of cells. =CONCATENATE(A1,A2,A3,A4,A5) becomes =CONCAT(A1:A5)
Looks quite easy to me to identify the links you want - they all contain 'mailto' in the URL.

What exactly are you having problems with?
 
Upvote 0
Thank you for replying so quickly.

I'm having a problem getting the exact email below. When I try to extract, it gives me all of the emails pictures above.

and it looks like this in my excel cell:

uyegqkt@ayesifs.edutojir@shk.netjerry@amantelaw.comytitljgq@suhgg.edubwtb@caugmgwt.orgdbigfs@bfgonjo.orgfdtmollg@aaey.govupemtlo@dhcskif.netsnqhl@unenqeu.comqkfccddif@ingyndof.orgoirtrkj@gcyw.edumfgoir@ahjnhd.edukdrj@busjpfqn.orgibgeoido@wtbag.nethnhkucklb@kskuo.orgjpsqf@epyctja.comlsgwoly@wtfbnfgq.netnurdyeqbu@npejch.govpygi@ukftg.govrcroqn@ksah.gov

<tbody>
</tbody>

I just want the Jerry@----.com email
email%2Bquestion%2B2.png
 
Upvote 0

Forum statistics

Threads
1,214,798
Messages
6,121,635
Members
449,043
Latest member
farhansadik

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top