How to extract XML from HTML, I think?

bubbapost

Hello,

I have a friend who is receiving a data dump in a format that is not easily imported into Excel. I think it's a mixture of HTML and XML. I would like to create a macro that will extract the XML from the rest of the data.

Here is a sample:

<BODY style="BORDER-RIGHT: 0px; BORDER-TOP: 0px; BORDER-LEFT: 0px; BORDER-BOTTOM: 0px"><test this is the body of the text/BODY></BODY>'
'<BODY source_object_id= source_object_type=""><HEAD>
<META content="MSHTML 6.00.2900.5764" name=GENERATOR></HEAD>
<BODY style="BORDER-RIGHT: 0px; BORDER-TOP: 0px; BORDER-LEFT: 0px; BORDER-BOTTOM: 0px">.&nbsp;test this is text.</BODY></BODY>'
.1,'<BODY source_object_id="194713" source_object_type=""><HEAD>
<META content="MSHTML 6.00.2900.5803" name=GENERATOR></HEAD>
<BODY style="BORDER-RIGHT: 0px; BORDER-TOP: 0px; BORDER-LEFT: 0px; BORDER-BOTTOM: 0px">

Will anyone help me?

Thank you.
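
One possible way to approach this, sketched in Python rather than as an Excel macro: treat each outer `<BODY source_object_id=...>` tag as the start of a record, drop everything between angle brackets, and keep the remaining text along with the id. This is only a rough sketch under several assumptions. The function name `extract_records` is made up for illustration, the `style` attribute is shortened, the second record's text ("second record") is invented, and the real dump may need different rules. Note that anything inside angle brackets, such as `<test this is the body of the text/BODY>` in the sample, would be discarded as markup by this approach.

```python
import html
import re

# Opening tag of the outer wrapper; the id may be quoted, unquoted, or empty.
WRAPPER = re.compile(r'<BODY\s+source_object_id=\s*"?(\d*)"?[^>]*>', re.IGNORECASE)
# Anything between angle brackets is treated as markup and dropped.
TAG = re.compile(r'<[^>]*>')


def extract_records(dump):
    """Return a list of (source_object_id, text) pairs, one per wrapper tag."""
    starts = list(WRAPPER.finditer(dump))
    records = []
    for i, match in enumerate(starts):
        # A record runs from the end of its wrapper tag to the next wrapper,
        # or to the end of the input for the last record.
        end = starts[i + 1].start() if i + 1 < len(starts) else len(dump)
        chunk = dump[match.end():end]
        # Cut trailing delimiters (such as "'" or ".1,'") after the last tag.
        last_close = chunk.rfind('>')
        if last_close != -1:
            chunk = chunk[:last_close + 1]
        # Strip tags, decode entities like &nbsp;, then collapse whitespace.
        text = html.unescape(TAG.sub(' ', chunk))
        text = ' '.join(text.split())
        records.append((match.group(1), text))
    return records


# Hypothetical input modelled on the sample above (style shortened,
# second record's text invented for illustration).
sample = (
    "'<BODY source_object_id= source_object_type=\"\"><HEAD>\n"
    "<META content=\"MSHTML 6.00.2900.5764\" name=GENERATOR></HEAD>\n"
    "<BODY style=\"BORDER-RIGHT: 0px\">.&nbsp;test this is text.</BODY></BODY>'\n"
    ".1,'<BODY source_object_id=\"194713\" source_object_type=\"\"><HEAD>\n"
    "<META content=\"MSHTML 6.00.2900.5803\" name=GENERATOR></HEAD>\n"
    "<BODY style=\"BORDER-RIGHT: 0px\">second record</BODY></BODY>'\n"
)

for object_id, text in extract_records(sample):
    print(repr(object_id), repr(text))
```

The resulting pairs could then be written out with Python's `csv` module and opened in Excel. The same idea (find the wrapper tag, strip the remaining tags, decode entities) should carry over to a VBA macro, though the details would differ.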
 
