VBA Scrape Data from Website

quick_question

New Member
Joined
May 31, 2011
Messages
30
I am trying to scrape data from a website, in 2 Phases:

Phase 1:

The website is a directory that is only displaying 6 profiles per page. Because of this, there are many pages of information.

I want to go through each page (1, 2, 3, etc.) and copy the "link address" of each profile (the item in the green square, highlighted in yellow, in the image below). Each link address should be pasted into "Sheet2", starting at cell B2 and continuing down (B3, B4, etc.), always into the next available row in column B.
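The "next available row in column B" step could be sketched as follows. This is only a minimal sketch: the procedure name AppendLink is made up for illustration, and it assumes the workbook running the code contains a sheet named "Sheet2" with B1 either empty or holding a header.

```vba
Option Explicit

' Writes one link address into the next empty cell of column B on "Sheet2".
' AppendLink is a hypothetical helper name, not something from the site or thread.
Public Sub AppendLink(ByVal linkAddress As String)
    Dim ws As Worksheet
    Dim nextRow As Long

    Set ws = ThisWorkbook.Worksheets("Sheet2")

    ' Search upward from the bottom of column B for the last used cell.
    ' If the column is empty or only B1 is filled, this lands on row 1,
    ' so the first link is written to B2.
    nextRow = ws.Cells(ws.Rows.Count, "B").End(xlUp).Row + 1

    ws.Cells(nextRow, "B").Value = linkAddress
End Sub
```

Calling AppendLink once per link found on a page would then fill B2, B3, B4 and so on, regardless of how many pages have already been processed.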

1619104448321.png


Once all the profile URLs have been cataloged, move on to Phase 2...


Phase 2:

Use the link addresses collected in Sheet2 B2:B#### (each one is a profile page URL of the form https://directory....) to scrape each profile page for more information.

For the first profile, I want to extract (copy and paste) the following information into Sheet3, from left to right, beginning in column A of row 2.

Then cycle through the remaining profile URLs from Sheet2 (B3, B4, etc.) and extract the same information into rows 3, 4, 5, etc.
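The Phase 2 loop might look roughly like the sketch below. It only shows the row bookkeeping: ExtractFields is a hypothetical placeholder that returns just the URL, because the real extraction depends on the page's HTML, which is not shown here. The sheet names "Sheet2" and "Sheet3" are taken from the description above.

```vba
Option Explicit

' Reads each profile URL from Sheet2 column B and writes the extracted
' fields to the same row number on Sheet3, starting in column A.
Public Sub ScrapeProfiles()
    Dim wsLinks As Worksheet, wsOut As Worksheet
    Dim lastRow As Long, r As Long, c As Long
    Dim fields As Variant

    Set wsLinks = ThisWorkbook.Worksheets("Sheet2")
    Set wsOut = ThisWorkbook.Worksheets("Sheet3")

    ' Last used row in column B; if no links exist the loop below is skipped.
    lastRow = wsLinks.Cells(wsLinks.Rows.Count, "B").End(xlUp).Row

    For r = 2 To lastRow
        fields = ExtractFields(CStr(wsLinks.Cells(r, "B").Value))

        ' Spread the fields left to right: first field in A, second in B, ...
        For c = LBound(fields) To UBound(fields)
            wsOut.Cells(r, c - LBound(fields) + 1).Value = fields(c)
        Next c
    Next r
End Sub

' Hypothetical placeholder: a real version would load the page and return
' one array element per item to be copied. Here it returns only the URL.
Private Function ExtractFields(ByVal url As String) As Variant
    ExtractFields = Array(url)
End Function
```

Because the URL in Sheet2 row 2 feeds Sheet3 row 2, the two sheets stay aligned and a failed profile is easy to trace back to its link.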

I have blocked out only the information that would identify individuals: names, addresses and phone numbers.

1619106918936.png


Scroll down for more information...

1619107003238.png
 


diddi

Well-known Member
Joined
May 20, 2004
Messages
3,329
Office Version
  1. 2010
Platform
  1. Windows
I suspect you can completely bypass the first phase by POSTing the full URL ending with 999900001 and iterating the member number. It depends on the setup of the site, but you can either use something like XMLHTTP or Selenium. Without the URL I can't help further, as I need to tinker to get scrapes to work; the code does not just flow and work first time.
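As a rough illustration of iterating the member number with XMLHTTP, something along these lines could be a starting point. It is a sketch under several assumptions: BASE_URL is a made-up placeholder because the real address has not been posted, the upper bound of the loop is a guess, and it uses a plain GET request, which may or may not be what the site expects. If the page builds its content with JavaScript, the response text may not contain the data at all.

```vba
Option Explicit

' Hypothetical base URL: the real site address was not posted in this thread.
Private Const BASE_URL As String = "https://example.com/profile/"

' Requests one page per member number and logs the result on "Sheet3".
Public Sub FetchProfilesByNumber()
    Dim http As Object
    Dim ws As Worksheet
    Dim memberNo As Long
    Dim outRow As Long

    Set http = CreateObject("MSXML2.XMLHTTP.6.0")   ' late bound, no reference needed
    Set ws = ThisWorkbook.Worksheets("Sheet3")
    outRow = 2

    ' 999900001 is the starting number mentioned above; the end value is a guess.
    For memberNo = 999900001 To 999900010
        http.Open "GET", BASE_URL & CStr(memberNo), False   ' False = synchronous
        http.send

        ws.Cells(outRow, "A").Value = memberNo
        ws.Cells(outRow, "B").Value = http.Status            ' e.g. 200 or 404
        If http.Status = 200 Then
            ' Placeholder for real parsing: record only the size of the response.
            ws.Cells(outRow, "C").Value = Len(http.responseText)
        End If
        outRow = outRow + 1
    Next memberNo
End Sub
```

Logging the HTTP status first is a cheap way to find out which member numbers exist before writing any parsing code. Note that http.send raises a run-time error if the host cannot be reached, so real use would need error handling.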
 

Dan_W

Active Member
Joined
Jul 11, 2018
Messages
485
Office Version
  1. 365
Platform
  1. Windows
I agree with Diddi. I would also observe that this doesn't look straightforward, judging from the obscure class names that they've opted to use. I note from the code that there is a comment saying JavaScript needs to be enabled. Is that why you've opted for Internet Explorer?
 
