Extracting the text from a <a href= tag

ScooterNorm

Office Version: 2016
Platform: Windows
I load a web page using ServerXMLHTTP60, and the response contains the following HTML code.

...
<tbody id="docResultsTbody" class="tablesorterTbody">
<tr class="doc_row dataRowTR">
<td class="textCenter">
<input type="checkbox" class="resSelects" id="tmDoc1"></input>
</td>
<td class="">Feb. 26, 2024</td>
<td class="">
<a href="/documentviewer?caseId=sn97775647&amp;docId=ABN20240226074801&amp;linkId=1"
target="ABN20240226074801">ABANDONMENT-Notice</a>
</td>
<td class=""></td>
<td class="end">PDF</td>
<td class="hide">20240226074801</td>
</tr>
...

I then run `Set elems1 = htmlDoc.getElementsByClassName("doc_row dataRowTR")`

and the first entry returns the following innerHTML text:
<TD class=textCenter><INPUT id=tmDoc1 class=resSelects type=checkbox></INPUT> </TD>
<TD>Feb. 26, 2024</TD>
<TD><A href="about:/documentviewer?caseId=sn97775647&amp;docId=ABN20240226074801&amp;linkId=1" target=ABN20240226074801>ABANDONMENT-Notice</A> </TD>
<TD></TD>
<TD class=end>PDF</TD>
<TD class=hide>20240226074801</TD>

What call do I use next to get the text from the "<a href=" tag?

Then I need to format that text into the URL that opens the document, which looks like this:
https://tsdr.uspto.gov/documentviewer?caseId=sn97775647&docId=ABN20240226074801&linkId=1#docIndex=0&page=1

Thanks for all your help,
 

Try something like this:

VBA Code:
Debug.Print Replace(elems1(0).getElementsByTagName("A")(0).href, "about:", "https://tsdr.uspto.gov") & "#docIndex=0&page=1"
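
If every row is needed rather than just the first, the same idea can be wrapped in a loop. The following is only a rough sketch, not a tested solution: the procedure name `ListDocumentLinks` and the constants are made up for illustration, and it assumes the project has a reference to the Microsoft HTML Object Library and that `htmlDoc` has already been loaded with the response text as in the question.

```vba
' Sketch only. Assumes a reference to "Microsoft HTML Object Library"
' and an htmlDoc that already holds the downloaded page.
' ListDocumentLinks, BASE_URL and URL_SUFFIX are hypothetical names.
Sub ListDocumentLinks(ByVal htmlDoc As MSHTML.HTMLDocument)
    Const BASE_URL As String = "https://tsdr.uspto.gov"
    Const URL_SUFFIX As String = "#docIndex=0&page=1"

    Dim rows As MSHTML.IHTMLElementCollection
    Dim row As Object
    Dim anchors As Object
    Dim url As String
    Dim i As Long

    Set rows = htmlDoc.getElementsByClassName("doc_row dataRowTR")

    For i = 0 To rows.Length - 1
        Set row = rows.Item(i)
        Set anchors = row.getElementsByTagName("a")

        ' Skip rows that contain no link
        If anchors.Length > 0 Then
            ' The href comes back prefixed with "about:" because the
            ' document has no base URL, so swap that prefix for the site root
            url = Replace(anchors(0).href, "about:", BASE_URL, 1, 1) & URL_SUFFIX

            ' innerText is the visible link text, e.g. ABANDONMENT-Notice
            Debug.Print anchors(0).innerText, url
        End If
    Next i
End Sub
```

Calling `ListDocumentLinks htmlDoc` after the page has been parsed should print one line per row in the Immediate window. The `Length` check is there so rows without an `<a>` tag do not raise an error.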
 
Hey John,

Thank you very much for your suggestion. It worked great with one small tweak.

VBA Code:
Instead of:
Debug.Print Replace(elems1(0).getElementsByTagName("A")(0).href, "about:", _
    "https://tsdr.uspto.gov") & "#docIndex=0&page=1"

This worked:
Debug.Print Replace(elems1.getElementsByTagName("A")(0).href, "about:", _
    "https://tsdr.uspto.gov") & "#docIndex=0&page=1"

Thanks so much for your suggestion, this helped me solve it.
-Norm
 