Data import from web: specific info

virtuosok

Board Regular
Joined
Sep 2, 2020
Messages
103
Office Version
  1. 2016
Platform
  1. Windows
Hi,
Is it possible to automatically pull the following bit from Fixed-Dose Trial in Early Parkinson's Disease (PD) - Full Text View - ClinicalTrials.gov

Locations
Show
Show 23 study locations


(This appears near the end of the page linked above.)

When I go the conventional route (Data > New Query > From Other Sources > From Web, then paste the link), I cannot see this element, so I cannot set up the import. I need the "23" in any form, as a standalone number or as part of a sentence. As long as Excel can automatically import anything containing "23", my case is resolved.
 

Saurabhj

Active Member
Joined
Jun 6, 2020
Messages
413
Office Version
  1. 365
  2. 2019
Platform
  1. Windows
Hi,

Please use the code below.

'This program requires references to the following in Tools -> References:
'1. Microsoft Internet Controls
'2. Microsoft HTML Object Library

VBA Code:
Sub scrape_link()
    'If any element is not available, the program will move to the next line.
    On Error Resume Next

    Dim HTMLDoc As HTMLDocument
    Dim ieBrowser As New InternetExplorer
    Dim reqdValue As String

    'Open and show Internet Explorer
    ieBrowser.Visible = True

    'Open the website in Internet Explorer
    ieBrowser.navigate "https://clinicaltrials.gov/ct2/show/study/NCT04201093"

    'Wait until the browser has finished loading; DoEvents keeps Excel responsive
    Do
        DoEvents
    Loop Until ieBrowser.readyState = READYSTATE_COMPLETE

    Set HTMLDoc = ieBrowser.document

    'A2 gets the full label text; A3 gets the characters from position 8 up to the next space
    reqdValue = HTMLDoc.getElementById("EXPAND_CONTROL-Locations").innerText
    ActiveSheet.Range("A2") = reqdValue
    ActiveSheet.Range("A3") = Mid(reqdValue, 8, WorksheetFunction.Find(" ", reqdValue, 8) - 8)

    Set ieBrowser = Nothing
    Set HTMLDoc = Nothing
End Sub
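
The Mid call above depends on the number starting at character 8 of the label text. If the text layout ever differs, a position-independent approach may be safer. The following is only a sketch: FirstNumber is a hypothetical helper name, not part of the macro above, and it simply returns the first run of digits it finds.

```vba
'Hypothetical helper: returns the first run of digits in txt,
'or an empty string when txt contains no digits.
Function FirstNumber(ByVal txt As String) As String
    Dim i As Long
    Dim ch As String
    Dim result As String

    For i = 1 To Len(txt)
        ch = Mid$(txt, i, 1)
        If ch Like "#" Then             '# matches any single digit
            result = result & ch
        ElseIf Len(result) > 0 Then
            Exit For                    'the digit run has ended
        End If
    Next i

    FirstNumber = result
End Function
```

With this in the same module, the A3 line could be written as ActiveSheet.Range("A3") = FirstNumber(reqdValue). For the text "Show 23 study locations" it returns "23".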
 

virtuosok

Thanks Saurabhj,
Can you walk me through this please, specifically the "Tools -> References" piece? I added the suggested code as I normally would for VBA, but that is probably not all that is needed.
 

Saurabhj

Hi,

Follow these steps:

1. In the VBA Editor, open the Tools menu and select References. (see image)
2. Tick Microsoft Internet Controls and Microsoft HTML Object Library. (see image)
3. Run the macro.
4. Cell A2 of the active sheet will display "Show 23 study locations".
5. Cell A3 of the active sheet will display 23. (see image)
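
If setting the references is a problem, late binding may work as an alternative. This is a sketch under assumptions, not a tested replacement: the name scrape_link_late_bound is made up for illustration, and because the READYSTATE_COMPLETE constant comes from the Microsoft Internet Controls library, its numeric value 4 is used directly.

```vba
Sub scrape_link_late_bound()
    'Late binding: no entries in Tools -> References are required
    Dim ieBrowser As Object
    Dim reqdValue As String

    Set ieBrowser = CreateObject("InternetExplorer.Application")
    ieBrowser.Visible = True
    ieBrowser.navigate "https://clinicaltrials.gov/ct2/show/study/NCT04201093"

    'Without the library reference, READYSTATE_COMPLETE is not defined,
    'so compare against its numeric value 4
    Do
        DoEvents
    Loop Until ieBrowser.readyState = 4

    'Same element id as in the macro above
    reqdValue = ieBrowser.document.getElementById("EXPAND_CONTROL-Locations").innerText
    ActiveSheet.Range("A2") = reqdValue

    ieBrowser.Quit
    Set ieBrowser = Nothing
End Sub
```

The trade-off is that late-bound objects give no IntelliSense and no compile-time checking, so typos in member names only surface when the macro runs.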
 

Attachments

  • Referencesforwebscraping.PNG
  • tools-references.PNG
  • output.PNG

virtuosok

Awesome, thanks!
Two follow-up questions to take this to the next level:
1. Can one macro look at 6 different pages and pull the same type of output from each? For example, I need not only the "23" from Fixed-Dose Trial in Early Parkinson's Disease (PD) - Full Text View - ClinicalTrials.gov but also the "32" from Flexible-Dose Trial in Early Parkinson's Disease (PD) - Full Text View - ClinicalTrials.gov (and so on, for 6 pages in total). Which lines should I add to the source code?
2. This one is entirely optional: can the IE query run behind the scenes, without opening a browser window?
 

Saurabhj

Hi,

Make the changes below:
1. Add the remaining site URLs the same way I added URL 1 and URL 2, which you provided.
2. To hide IE, I removed the line ieBrowser.Visible = True.

VBA Code:
'Declare variables at module level so both procedures can use them
Dim siteURL(1 To 6) As String
Dim sitenum As Integer

Sub showInfo()
    siteURL(1) = "https://clinicaltrials.gov/ct2/show/study/NCT04201093"
    siteURL(2) = "https://clinicaltrials.gov/ct2/show/NCT04223193?term=NCT04223193&draw=2&rank=1"
    'Fill in the remaining four URLs
    siteURL(3) = ""
    siteURL(4) = ""
    siteURL(5) = ""
    siteURL(6) = ""
    For sitenum = 1 To 6
        'Skip slots that have not been filled in yet
        If siteURL(sitenum) <> "" Then scrape_info
    Next sitenum
End Sub


VBA Code:
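Sub scrape_info()
    'If any element is not available, the program will move to the next line.
    On Error Resume Next
    Dim HTMLDoc As HTMLDocument
    Dim ieBrowser As New InternetExplorer
    Dim reqdValue As String

    'Open the website in Internet Explorer (the window stays hidden)
    ieBrowser.navigate siteURL(sitenum)

    'Wait until the browser has finished loading; DoEvents keeps Excel responsive
    Do
        DoEvents
    Loop Until ieBrowser.readyState = READYSTATE_COMPLETE

    Set HTMLDoc = ieBrowser.document

    'Write the number for this site to row sitenum of column A
    reqdValue = HTMLDoc.getElementById("EXPAND_CONTROL-Locations").innerText
    ActiveSheet.Range("A" & sitenum) = Mid(reqdValue, 8, WorksheetFunction.Find(" ", reqdValue, 8) - 8)

    'Close the hidden browser so IE processes do not pile up
    ieBrowser.Quit
    Set ieBrowser = Nothing
    Set HTMLDoc = Nothing
End Sub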
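
On the second question, another way to avoid a browser window altogether might be a plain HTTP request. The sketch below is untested against this site and rests on one assumption: that the "EXPAND_CONTROL-Locations" element is already present in the HTML the server returns, rather than being added later by script. FetchLocationsText is a hypothetical name used only for illustration.

```vba
'Hypothetical helper: downloads a page without opening a browser and
'returns the text of the Locations label, or "" if it cannot be found.
Function FetchLocationsText(ByVal url As String) As String
    Dim http As Object
    Dim doc As Object
    Dim el As Object

    Set http = CreateObject("MSXML2.XMLHTTP")
    http.Open "GET", url, False         'False = wait for the response
    http.send

    If http.Status = 200 Then
        'Parse the downloaded HTML in an in-memory document
        Set doc = CreateObject("htmlfile")
        doc.body.innerHTML = http.responseText

        'Assumption: the element exists in the raw HTML
        Set el = doc.getElementById("EXPAND_CONTROL-Locations")
        If Not el Is Nothing Then FetchLocationsText = el.innerText
    End If
End Function
```

It could be tried from the loop in showInfo, for example ActiveSheet.Range("B" & sitenum) = FetchLocationsText(siteURL(sitenum)). An empty result would suggest the assumption does not hold for that page.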
 
Solution
