How to extract XML from HTML, I think?

bubbapost

Board Regular
Joined
Mar 11, 2009
Messages
116
Hello,

I have a friend who is receiving a data dump in a format that is not easily imported into Excel. I think it's a mixture of HTML and XML. I would like to create a macro that will extract the XML from the other data.

Here is a sample:

<BODY style="BORDER-RIGHT: 0px; BORDER-TOP: 0px; BORDER-LEFT: 0px; BORDER-BOTTOM: 0px"><test this is the body of the text/BODY></BODY>'
'<BODY source_object_id= source_object_type=""><HEAD>
<META content="MSHTML 6.00.2900.5764" name=GENERATOR></HEAD>
<BODY style="BORDER-RIGHT: 0px; BORDER-TOP: 0px; BORDER-LEFT: 0px; BORDER-BOTTOM: 0px">.&nbsp;test this is text.</BODY></BODY>'
.1,'<BODY source_object_id="194713" source_object_type=""><HEAD>
<META content="MSHTML 6.00.2900.5803" name=GENERATOR></HEAD>
<BODY style="BORDER-RIGHT: 0px; BORDER-TOP: 0px; BORDER-LEFT: 0px; BORDER-BOTTOM: 0px">

Will anyone help me?

Thank you.
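
One possible approach, sketched in Python rather than as an Excel macro: treat each outer `<BODY source_object_id=...>` ... `</BODY>'` block as one record, pull out the id, strip the remaining markup, and write the result to a CSV file that Excel can open. The function names `extract_records` and `to_csv` are made up for illustration. The assumption that every record ends with `</BODY>` followed by a single quote is a guess based on the sample above, not something the post confirms.

```python
import csv
import html
import re

# Assumption: a record is an outer <BODY source_object_id=...> element whose
# closing </BODY> is followed by a single quote, as in the posted sample.
RECORD = re.compile(
    r'<BODY\s+source_object_id=("?)(\d*)\1[^>]*>'  # opening tag, id may be empty
    r"(.*?)</BODY>\s*'",                            # content up to the closing quote
    re.IGNORECASE | re.DOTALL,
)
TAG = re.compile(r"<[^>]*>")  # anything that looks like a tag


def extract_records(dump):
    """Return a list of (source_object_id, plain_text) pairs found in dump."""
    records = []
    for match in RECORD.finditer(dump):
        object_id = match.group(2)
        # Replace tags with spaces, decode entities such as &nbsp;,
        # then collapse runs of whitespace into single spaces.
        text = html.unescape(TAG.sub(" ", match.group(3)))
        text = " ".join(text.split())
        records.append((object_id, text))
    return records


def to_csv(records, handle):
    """Write the records to an open text file (or any file-like object)."""
    writer = csv.writer(handle)
    writer.writerow(["source_object_id", "text"])
    writer.writerows(records)
```

To use it, read the dump into a string, pass it to `extract_records`, and hand the result to `to_csv` together with a file opened with `newline=""`. Two caveats: the sample is not well-formed XML (unquoted attributes, `&nbsp;`), so a strict XML parser would likely reject it, which is why this sketch falls back on regular expressions. Also, a fragment such as `<test this is the body of the text/BODY>` looks like a tag to this code and would be dropped along with the other markup.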
 
