How to delete text between two specific words in cell?

ea2146

New Member
Joined
Apr 9, 2012
Messages
30
Hello,

I am trying to create an excel function that will identify two specific words embedded in messages of varying lengths (that I scraped from a message board online) and delete all the text in between those two words.

Specifically, the message board that I scraped the data from allowed users to quote other users and then respond to those quoted messages. The quoted text is always presented in the following way:



PREVIOUS MESSAGE AUTHOR Username said:

Message written by the previous author

Click to expand...

Message written by the CURRENT MESSAGE AUTHOR


Here is what it looks like in excel:

User235235 said:


This is the text I need deleted.

Click to expand...


This is the text I want to save.
User8888986655 said:


This is the text I need deleted. Blah blah Blah blah Blah blah Blah blah

Click to expand...


This is the text I want to save. Blah blah Blah blah Blah blah
User2222222222 said:


This needs to be deleted.

Click to expand...


Save this please

<tbody>
</tbody>

To complicate things further, sometimes there are more than one quoted messages from a previous author and the words "said:" and "Click to expand..." appear more than once in that case.
Therefore, I need an excel function that will identify the words " said:" and "Click to expand..." and delete all the words in between those two phrases (as well as those two words).

Any suggestions would be greatly appreciated. Thanks so much in advance!!
 

Excel Facts

Whats the difference between CONCAT and CONCATENATE?
The newer CONCAT function can reference a range of cells. =CONCATENATE(A1,A2,A3,A4,A5) becomes =CONCAT(A1:A5)
with PowerQuery (Get&Transform):

rawraw.2
User235235 said:


This is the text I need deleted.

Click to expand...


This is the text I want to save.
This is the text I want to save.
User8888986655 said:


This is the text I need deleted. Blah blah Blah blah Blah blah Blah blah

Click to expand...


This is the text I want to save. Blah blah Blah blah Blah blah
This is the text I want to save. Blah blah Blah blah Blah blah
User2222222222 said:


This needs to be deleted.

Click to expand...


Save this please
Save this please

Code:
[SIZE=1]// Table1
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    Split = Table.SplitColumn(Source, "raw", Splitter.SplitTextByEachDelimiter({"#(lf)"}, QuoteStyle.Csv, true), {"raw.1", "raw.2"}),
    ROC = Table.SelectColumns(Split,{"raw.2"})
in
    ROC[/SIZE]
 
Last edited:
Upvote 0
Thank you very much! I'm sorry to be so ignorant, but can you say more about how I enter this code into PowerQuery to produce the end result in column 2? I understand that the Get&Transform commands are under the data tab, but I am not sure what to do from there. Thanks again!
 
Upvote 0
Formula version in case you need it:


Book1
AB
1User235235 said: This is the text I need deleted. Click to expand... This is the text I want to save.This is the text I want to save.
2User8888986655 said: This is the text I need deleted. Blah blah Blah blah Blah blah Blah blah Click to expand... This is the text I want to save. Blah blah Blah blah Blah blahThis is the text I want to save. Blah blah Blah blah Blah blah
3User2222222222 said:This needs to be deleted.Click to expand...Save this pleaseSave this please
4User2222222222 said: This needs to be deleted. Click to expand User 1456 Said: This needs to be deleted. Click to expand... Save this pleaseSave this please
Sheet3
Cell Formulas
RangeFormula
B1=TRIM(RIGHT(SUBSTITUTE(SUBSTITUTE(A1,CHAR(10),""),"Click to expand...",REPT(" ",LEN(A1))),LEN(A1)))
 
Last edited:
Upvote 0
Excellent! This is perfect. Thank you all very much for your helpful responses. Much appreciated!
 
Upvote 0
How about:

=TRIM(RIGHT(SUBSTITUTE(A2,"Click to expand...",REPT(" ",250)),250))
 
Upvote 0
how I enter this code into PowerQuery to produce the end result in column 2? I understand that the Get&Transform commands are under the data tab, but I am not sure what to do from there.

select your source data and use ctrl+t
change header to "raw"
check in name manage the name of the table
select any cell in this table and in Get&Transform find From Table (depend of the Excel version) and use it
it will open PowerQuery Editor, find Advanced Editor - open - paste code from the post (replace all there with the new code)
check table name and adjust if necessary then close PQ Editor
Close&Load to
1.Table
2.a) new worksheet
b) existing worksheet - select cell

remember that Power Query is case sensitive so raw is not the same as RAW or Raw
 
Last edited:
Upvote 0

Forum statistics

Threads
1,213,544
Messages
6,114,249
Members
448,556
Latest member
peterhess2002

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top