Convert HTML doc to XML for XPath Parsing

manofcheese

New Member
Joined
Jan 5, 2019
Messages
8
I am trying to find a good way to convert an HTML document to XML so I can use an XPath expression for parsing. As a test, I'd like to get this XPath expression to pull information from https://www.whatismyip.com/ (What Is My IP):
XML:
//*[@id="post-7"]/div[1]/div[1]/div/ul/li[2]

So far this is what I have but I haven't been able to get the response converted to XML yet.

VBA Code:
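Sub xpathTest()
    Dim myURL As String
    Dim strPath As String
    Dim xml_obj As Object
    Dim myxml As Object      ' MSXML DOM document returned by responseXML
    Dim nodelist As Object   ' IXMLDOMNodeList, late bound
    Dim node As Object

    myURL = "https://www.whatismyip.com/"
    Set xml_obj = CreateObject("MSXML2.XMLHTTP")

    xml_obj.Open "GET", myURL, False
    xml_obj.Send

    ' responseXML is only filled in when the response parses as well-formed XML;
    ' for an ordinary HTML page it typically comes back empty, which is the problem here.
    Set myxml = xml_obj.responseXML

    ' Doubled quotes embed a literal " inside a VBA string.
    strPath = "//*[@id=""post-7""]/div[1]/div[1]/div/ul/li[2]"

    myxml.SetProperty "SelectionLanguage", "XPath"

    ' Query the document itself: DocumentElement is Nothing when nothing was parsed.
    Set nodelist = myxml.SelectNodes(strPath)

    Debug.Print "Found " & nodelist.Length & " Node(s)"

    ' Print the text of each matching node (none until the response is valid XML).
    For Each node In nodelist
        Debug.Print node.Text
    Next node
End Sub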
 

Well HTML is not XML...
I know that, but I also know that you can use XPath on HTML because the two are so similar. I've done it in Python, and I've used the XPath Helper Chrome extension to get things I need. I'm just trying to figure out a way to do this in VBA.
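
One possible route in VBA, offered as an untested sketch and not as a confirmed answer for this site: skip the XML conversion and load the response text into an MSHTML document, then walk it with the HTML DOM methods in place of XPath. The sub name HtmlDomTest and the variable names are made up for illustration. The lookup is also looser than the XPath above, since it takes the second li inside the first ul found anywhere under the element with id post-7. It assumes Windows with MSHTML available, and that the site returns the element in its static HTML to a non-browser request (content built by JavaScript will not be there).

```vba
Sub HtmlDomTest()
    Dim http As Object      ' MSXML2.XMLHTTP request, late bound
    Dim html As Object      ' MSHTML document ("htmlfile"), late bound
    Dim post As Object
    Dim lists As Object
    Dim items As Object

    Set http = CreateObject("MSXML2.XMLHTTP")
    http.Open "GET", "https://www.whatismyip.com/", False
    http.send

    ' Parse the response as HTML rather than XML.
    Set html = CreateObject("htmlfile")
    html.body.innerHTML = http.responseText

    ' Rough equivalent of //*[@id="post-7"]
    Set post = html.getElementById("post-7")
    If post Is Nothing Then
        Debug.Print "No element with id post-7 in the downloaded HTML"
        Exit Sub
    End If

    ' First ul anywhere under that element (looser than div[1]/div[1]/div/ul)
    Set lists = post.getElementsByTagName("ul")
    If lists.Length = 0 Then
        Debug.Print "No ul element under post-7"
        Exit Sub
    End If

    ' XPath li[2] is 1-based; DOM collections are 0-based, so use index 1.
    Set items = lists.Item(0).getElementsByTagName("li")
    If items.Length < 2 Then
        Debug.Print "Fewer than two li elements found"
        Exit Sub
    End If

    Debug.Print items.Item(1).innerText
End Sub
```

If nothing is found, printing Left(http.responseText, 500) to the Immediate window is a quick way to check what the server actually sent back.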
 