Removing HTML code from Excel cell

sarayewo

New Member
Joined
Apr 28, 2009
Messages
33
I have a report that comes out of one of our web-based systems in which I that contains user input. Whenever I get the report this particular cell contains the HTML code that was used to format its contents in the web system.

I need a code that I can execute to remove the HTML code and only leave the actual content... Would this be possible?

I tried simply using Find/Replace and replacing "<*>" with blanks, but I get an error message "Formula too long"... Any ideas?
 

Excel Facts

Show numbers in thousands?
Use a custom number format of #,##0,K. Each comma after the final 0 will divide the displayed number by another thousand
This is excellent, but when I run this in Excel 2007, it is replacing the text with XXXXXX for all data it fixes. I can see the values when I am in a specific cell, but when copy/paste it elsewhere, it's just XXXXX. How can I fix this?

You could just wrap the preceding code in the Replace VBA function and specify it to replace your chosen characters:

Rich (BB code):
Sub RemoveTags()
Dim r As Range
Selection.NumberFormat = "@"  'set cells to text numberformat
With CreateObject("vbscript.regexp")
  .Pattern = "\<.*?\>"
  .Global = True
  For Each r In Selection
    r.Value = Replace(.Replace(r.Value, ""),"&nbsp"," ")
  Next r
End With
End Sub

The amendment is in Red.
 
Upvote 0
Richard Schollar has made a huge contribution to my career :)

now, I'd like to see what other goodies that "CreateObject" can do -- is it calling a library in Vb, Vbscript?
Thanks a lot !! I had NO problems whatever with this routine, and then I replace all "nbsp;" and then I'm good to go.
 
Upvote 0
I found this thread really useful to sort out several tags in my Excel that I wanted removing. I have modified the original replace statement nesting several times for various & tags.

I have one final hurdle I just can't seem to get over because I don't know VBA really (I don't even get how that .pattern works), so am relying on these type of forums to piece it together. I don't really understand what the difference is between the replace and the .replace in the nested statements and don't know what order they process in.

My final problem that I want to solve is how to replace the <br> tag with ALT+Enter to put a new line. The original code works fine for getting rid of all the tags in one go, but before I get rid of them all I want to replace just the line breaks and insert and actual new line, then continue to remove the rest of the HTML tags and the & tags.

My replace looks like this (it's within a sub that uses different var names). I'm actually replacing the codes for & less than ; etc. but the preview shows them as < for the search string and the replacement string (I'm sure you get the idea).

Rich (BB code):
With CreateObject("vbscript.regexp")
  .Pattern = "\<.*?\>"
  .Global = True
  For Each DataRange In Selection
    DataRange.Value = Replace(Replace(Replace(Replace(Replace(.Replace(DataRange.Value, " "),"<br>", vbNewLine), "&amp", "&"), "<", "<"), ">", ">"), "@", "@")
'    DataRange.Value = Replace(Replace(Replace(Replace(.Replace(Replace("<br>", vbNewLine),DataRange.Value, " "), "&amp", "&"), "<", "<"), ">", ">"), "@", "@")  Next DataRange
End With

The commented out replace line didn't work, I was trying to see if swapping round the order of the original .replace " " with the replace <br> next to it would work but it didn't.

Please help me sort this out.
Thanks.

You could just wrap the preceding code in the Replace VBA function and specify it to replace your chosen characters:

Rich (BB code):
Sub RemoveTags()
Dim r As Range
Selection.NumberFormat = "@"  'set cells to text numberformat
With CreateObject("vbscript.regexp")
  .Pattern = "\<.*?\>"
  .Global = True
  For Each r In Selection
    r.Value = Replace(.Replace(r.Value, ""),"&nbsp"," ")
  Next r
End With
End Sub

The amendment is in Red.
 
Upvote 0

Forum statistics

Threads
1,214,978
Messages
6,122,545
Members
449,089
Latest member
davidcom

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top