Automatically extract website data to a spreadsheet

Corried

Board Regular
Joined
Dec 19, 2019
Messages
217
Office Version
  1. 2019
Platform
  1. Windows
  2. Web
Hello, Thank you for sharing..

I am looking for courses on how to automatically extract data from a website to an excel spreadsheet.

The excel file contains fifty links. Each link has the same process, to create a stock market report for fifty companies. Use macro and chrome to generate data.

Can anyone show me a step by step process on how I can do this?

Please watch the video and Excel workbook below to learn more.



I hope you all understand.

Cheers
 

Excel Facts

Workdays for a market open Mon, Wed, Friday?
Yes! Use "0101011" for the weekend argument in NETWORKDAYS.INTL or WORKDAY.INTL. The 7 digits start on Monday. 1 means it is a weekend.
Next step will be to use CSS rather than XPath to retrieve the element.
 
Upvote 0
I did not test it, but try replacing that line with this one:

VBA Code:
Cells(i, 4) = driver.findelementbycss("#root > div:nth-child(1) > div > div.mainContentLayout-DS-EntryPoint1-1 > div > div > div:nth-child(3) > div > div.rowsContainer-DS-EntryPoint1-2 > div:nth-child(8) > div.factsRowValue-DS-EntryPoint1-2")
 
Upvote 0
I did not test it, but try replacing that line with this one:

VBA Code:
Cells(i, 4) = driver.findelementbycss("#root > div:nth-child(1) > div > div.mainContentLayout-DS-EntryPoint1-1 > div > div > div:nth-child(3) > div > div.rowsContainer-DS-EntryPoint1-2 > div:nth-child(8) > div.factsRowValue-DS-EntryPoint1-2")
I'm sorry Worf.

This is my error:
error 7.JPG


 
Upvote 0
I did not test it, but try replacing that line with this one:

VBA Code:
Cells(i, 4) = driver.findelementbycss("#root > div:nth-child(1) > div > div.mainContentLayout-DS-EntryPoint1-1 > div > div > div:nth-child(3) > div > div.rowsContainer-DS-EntryPoint1-2 > div:nth-child(8) > div.factsRowValue-DS-EntryPoint1-2")
Morning Wolf. Thanks for your patience...

I was wondering, what would have case the code to break on my side? I am using Excel 2019.

I am not sure if different versions may cause the problem. Then again. Also was wondering if you have any other coding I can use...

Thanks in advance
 
Upvote 0
I was working on this last night and then a power outage happened. :mad:

I will get back to it as soon as possible.
 
Upvote 0
I was working on this last night and then a power outage happened. :mad:

I will get back to it as soon as possible.
? Wow. I am sorry Wolf. It looks like you had a hard night ?. About a month ago. Our country had a total blackout for 5 hours. No, air condition, no fans, no t.v. pich black. Only thing remarkable was the beautiful stars across the sky. This reminded me how blessed I am to be part of this life God called me to live... funny, how we can receive blessings of the area we don't focus on "within"...
 
Upvote 0
I am travelling and using another computer. Please test the code below with the same company that worked for me:

res2.png


VBA Code:
Public driver As New ChromeDriver
Sub Money()
Dim pt As WebElement, key As New Selenium.Keys, s$
driver.get "https://www.msn.com/en-us/money/stockdetails/fi-bjqejc?duration=1D"
Application.Wait Now + TimeValue("0:00:06")
Set pt = driver.FindElementByXPath("//*[@id=""searchBox""]/input")
pt.Click
pt.SendKeys ActiveSheet.[c2]
Application.Wait Now + TimeValue("0:00:06")
DoEvents
Set pt = driver.FindElementByCss("#searchBox > span > span > svg")
pt.Click
Application.Wait Now + TimeValue("0:00:06")
DoEvents
[a2] = driver.URL
[b2] = driver.FindElementByCss("#root > div:nth-child(1) > div > div.mainContentLayout-DS-EntryPoint1-1 > div > div > div:nth-child(1) >" & _
" div > div.firstSection-DS-EntryPoint1-1 > div.quoteInfo-DS-EntryPoint1-1 > div.title-DS-EntryPoint1-1 > span.displayName-DS-EntryPoint1-1").Text
[d2] = driver.FindElementByXPath("//*[@id=""root""]/div[1]/div/div[5]/div/div/div[2]/div/div[2]/div[2]/div[8]/div[2]").Text
[e2] = driver.FindElementByCss("#root > div:nth-child(1) > div > div.mainContentLayout-DS-EntryPoint1-1 >" & _
" div > div > div:nth-child(1) > div > div.secondSection-DS-EntryPoint1-1 > div.priceInfo-DS-EntryPoint1-1 > div >" & _
" div.mainPrice.color_red-DS-EntryPoint1-1").Text
s = driver.FindElementByXPath("//*[@id=""root""]/div[1]/div/div[5]/div/div/div[2]/div/div[2]/div[1]/div[2]/div/div[3]/span[2]").Text
s = s & " / " & driver.FindElementByXPath("//*[@id=""root""]/div[1]/div/div[5]/div/div/div[2]/div/div[2]/div[1]/div[2]/div/div[3]/span[1]").Text
[h2] = s
[j2] = driver.FindElementByXPath("//*[@id=""root""]/div[1]/div/div[5]/div/div/div[2]/div/div[2]/div[2]/div[5]/div[2]").Text
' 5-year
driver.FindElementByXPath("//*[@id=""root""]/div[1]/div/div[5]/div/div/div[2]/div/div[1]/div[2]/div/div/button[8]").Click
Application.Wait Now + TimeValue("0:00:06")
[i2] = driver.FindElementByXPath("//*[@id=""root""]/div[1]/div/div[5]/div/div/div[2]/div/div[2]/div/table/tbody/tr[2]/td[12]/div").Text
' analysis
driver.FindElementByCss("#root > div:nth-child(1) > div > div.mainContentLayout-DS-EntryPoint1-1 > div > div" & _
" > div:nth-child(1) > div > div.secondSection-DS-EntryPoint1-1 > div.navigation-DS-EntryPoint1-1 > div > div > button:nth-child(5) > span").Click
Application.Wait Now + TimeValue("0:00:10")
driver.SwitchToNextWindow
s = driver.FindElementByXPath _
("//*[@id=""main""]/div[2]/div[2]/div[2]/div[3]/div/div/div[5]/div[1]/div[2]/div[1]/div[1]/div[1]/div[2]/div/div[2]/div/ul[2]/li[2]/p").Text
s = s & " / " & driver.FindElementByXPath _
("//*[@id=""main""]/div[2]/div[2]/div[2]/div[3]/div/div/div[5]/div[1]/div[2]/div[1]/div[1]/div[1]/div[2]/div/div[2]/div/ul[3]/li[2]/p").Text
[g2] = s
[L2] = driver.FindElementByXPath("//*[@id=""main""]/div[2]/div[2]/div[2]/div[3]/" & _
"div/div/div[5]/div[1]/div[2]/div[1]/div[1]/div[1]/div[2]/div/div[2]/div/ul[6]/li[2]/p").Text
[k2] = driver.FindElementByXPath _
("//*[@id=""main""]/div[2]/div[2]/div[2]/div[4]/div/div/div[5]/div[1]/div[2]/div[1]/div[1]/div/div/div/ul[1]/li[2]/span[1]/p").Text
' growth
Application.Wait Now + TimeValue("0:00:12")
'MsgBox driver.URL
Set pt = driver.FindElementByCss _
("#main > div.content-div.fullwidth.loaded > div.main-region.maincontainer.fullwidth.stckdtl" _
& " > div.dynaloadable > div:nth-child(4) > div > div > div.key-stats-area > ul > li:nth-child(2)")
pt.Click
Application.Wait Now + TimeValue("0:00:06")
[F2] = driver.FindElementByXPath("//*[@id=""main""]/div[2]/div[2]/div[2]/div[3]/" & _
"div/div/div[5]/div[1]/div[2]/div[2]/div[1]/div/div/div/ul[1]/li[2]/span[1]/p").Text
' company
driver.FindElementByXPath("//*[@id=""profile""]/a").Click
Application.Wait Now + TimeValue("0:00:06")
driver.FindElementByXPath("//*[@id=""main""]/div[2]/div[2]/div[2]/div/div[3]/div/div[2]/div[2]/div/button[1]").Click
Application.Wait Now + TimeValue("0:00:06")
[m2] = driver.FindElementByXPath("//*[@id=""main""]/div[2]/div[2]/div[2]/div/div[3]/div/div[2]/div[1]").Text
MsgBox "Success!"
End Sub
 
Upvote 0
I am travelling and using another computer. Please test the code below with the same company that worked for me:

View attachment 61704

VBA Code:
Public driver As New ChromeDriver
Sub Money()
Dim pt As WebElement, key As New Selenium.Keys, s$
driver.get "https://www.msn.com/en-us/money/stockdetails/fi-bjqejc?duration=1D"
Application.Wait Now + TimeValue("0:00:06")
Set pt = driver.FindElementByXPath("//*[@id=""searchBox""]/input")
pt.Click
pt.SendKeys ActiveSheet.[c2]
Application.Wait Now + TimeValue("0:00:06")
DoEvents
Set pt = driver.FindElementByCss("#searchBox > span > span > svg")
pt.Click
Application.Wait Now + TimeValue("0:00:06")
DoEvents
[a2] = driver.URL
[b2] = driver.FindElementByCss("#root > div:nth-child(1) > div > div.mainContentLayout-DS-EntryPoint1-1 > div > div > div:nth-child(1) >" & _
" div > div.firstSection-DS-EntryPoint1-1 > div.quoteInfo-DS-EntryPoint1-1 > div.title-DS-EntryPoint1-1 > span.displayName-DS-EntryPoint1-1").Text
[d2] = driver.FindElementByXPath("//*[@id=""root""]/div[1]/div/div[5]/div/div/div[2]/div/div[2]/div[2]/div[8]/div[2]").Text
[e2] = driver.FindElementByCss("#root > div:nth-child(1) > div > div.mainContentLayout-DS-EntryPoint1-1 >" & _
" div > div > div:nth-child(1) > div > div.secondSection-DS-EntryPoint1-1 > div.priceInfo-DS-EntryPoint1-1 > div >" & _
" div.mainPrice.color_red-DS-EntryPoint1-1").Text
s = driver.FindElementByXPath("//*[@id=""root""]/div[1]/div/div[5]/div/div/div[2]/div/div[2]/div[1]/div[2]/div/div[3]/span[2]").Text
s = s & " / " & driver.FindElementByXPath("//*[@id=""root""]/div[1]/div/div[5]/div/div/div[2]/div/div[2]/div[1]/div[2]/div/div[3]/span[1]").Text
[h2] = s
[j2] = driver.FindElementByXPath("//*[@id=""root""]/div[1]/div/div[5]/div/div/div[2]/div/div[2]/div[2]/div[5]/div[2]").Text
' 5-year
driver.FindElementByXPath("//*[@id=""root""]/div[1]/div/div[5]/div/div/div[2]/div/div[1]/div[2]/div/div/button[8]").Click
Application.Wait Now + TimeValue("0:00:06")
[i2] = driver.FindElementByXPath("//*[@id=""root""]/div[1]/div/div[5]/div/div/div[2]/div/div[2]/div/table/tbody/tr[2]/td[12]/div").Text
' analysis
driver.FindElementByCss("#root > div:nth-child(1) > div > div.mainContentLayout-DS-EntryPoint1-1 > div > div" & _
" > div:nth-child(1) > div > div.secondSection-DS-EntryPoint1-1 > div.navigation-DS-EntryPoint1-1 > div > div > button:nth-child(5) > span").Click
Application.Wait Now + TimeValue("0:00:10")
driver.SwitchToNextWindow
s = driver.FindElementByXPath _
("//*[@id=""main""]/div[2]/div[2]/div[2]/div[3]/div/div/div[5]/div[1]/div[2]/div[1]/div[1]/div[1]/div[2]/div/div[2]/div/ul[2]/li[2]/p").Text
s = s & " / " & driver.FindElementByXPath _
("//*[@id=""main""]/div[2]/div[2]/div[2]/div[3]/div/div/div[5]/div[1]/div[2]/div[1]/div[1]/div[1]/div[2]/div/div[2]/div/ul[3]/li[2]/p").Text
[g2] = s
[L2] = driver.FindElementByXPath("//*[@id=""main""]/div[2]/div[2]/div[2]/div[3]/" & _
"div/div/div[5]/div[1]/div[2]/div[1]/div[1]/div[1]/div[2]/div/div[2]/div/ul[6]/li[2]/p").Text
[k2] = driver.FindElementByXPath _
("//*[@id=""main""]/div[2]/div[2]/div[2]/div[4]/div/div/div[5]/div[1]/div[2]/div[1]/div[1]/div/div/div/ul[1]/li[2]/span[1]/p").Text
' growth
Application.Wait Now + TimeValue("0:00:12")
'MsgBox driver.URL
Set pt = driver.FindElementByCss _
("#main > div.content-div.fullwidth.loaded > div.main-region.maincontainer.fullwidth.stckdtl" _
& " > div.dynaloadable > div:nth-child(4) > div > div > div.key-stats-area > ul > li:nth-child(2)")
pt.Click
Application.Wait Now + TimeValue("0:00:06")
[F2] = driver.FindElementByXPath("//*[@id=""main""]/div[2]/div[2]/div[2]/div[3]/" & _
"div/div/div[5]/div[1]/div[2]/div[2]/div[1]/div/div/div/ul[1]/li[2]/span[1]/p").Text
' company
driver.FindElementByXPath("//*[@id=""profile""]/a").Click
Application.Wait Now + TimeValue("0:00:06")
driver.FindElementByXPath("//*[@id=""main""]/div[2]/div[2]/div[2]/div/div[3]/div/div[2]/div[2]/div/button[1]").Click
Application.Wait Now + TimeValue("0:00:06")
[m2] = driver.FindElementByXPath("//*[@id=""main""]/div[2]/div[2]/div[2]/div/div[3]/div/div[2]/div[1]").Text
MsgBox "Success!"
End Sub
Thanks Wolf, for reaching out. I really appreciate on you not giving up on me. I copied and pasted your code. The 1st section as before is plugged in... I am not sure what is the problem.

Please view video: error.webm

Thanks in advance
 
Upvote 0

Forum statistics

Threads
1,215,353
Messages
6,124,463
Members
449,163
Latest member
kshealy

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top