Deleting part of text based on criteria

stu222a

New Member
Joined
Jun 21, 2011
Messages
5
Hi

I was hoping some excel whizz may be able to help me with this one. Ive been reading this forum for the last month or so and picked up lots of really useful tips that are making my work easier; thank very much!

So now i have a specific question of my own...

I have a text file with lots of data, i have pasted it into excel so i can manipulate it, and am hoping there may be a formula or macro that can make this process quicker..

Basically i am looking to delete certain parts of the text. I cant really explain properly so i have provided an example of the text below:


sample of four lines, ive just pasted them straight into excel so each line falls within 1 cell (i could delimit but the data varies and they can contain more or less data).

<TABLE style="WIDTH: 1216pt; BORDER-COLLAPSE: collapse" border=0 cellSpacing=0 cellPadding=0 width=1621><COLGROUP><COL style="WIDTH: 1216pt; mso-width-source: userset; mso-width-alt: 59282" width=1621><TBODY><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: #ece9d8; BORDER-LEFT: #ece9d8; BACKGROUND-COLOR: transparent; WIDTH: 1216pt; HEIGHT: 15pt; BORDER-TOP: #ece9d8; BORDER-RIGHT: #ece9d8" height=20 width=1621>link 2991 2992 category 11 speed 30 mph force merge force across cost factor 0.800 junction visibility 30.00 m gap acceptance lane merge 2.00 gap acceptance lane cross 2.00 gap acceptance path cross 2.00 arc centre #5060 left </TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: #ece9d8; BORDER-LEFT: #ece9d8; BACKGROUND-COLOR: transparent; HEIGHT: 15pt; BORDER-TOP: #ece9d8; BORDER-RIGHT: #ece9d8" height=20>link 3152 3179 category 11 speed 20 mph width 8.6 m force merge force across gap acceptance look next arc centre #5410 </TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: #ece9d8; BORDER-LEFT: #ece9d8; BACKGROUND-COLOR: transparent; HEIGHT: 15pt; BORDER-TOP: #ece9d8; BORDER-RIGHT: #ece9d8" height=20>link 3219 3264 category 11 speed 20 mph width 8.6 m force merge force across junction visibility 30.00 m gap acceptance lane merge 0.00 gap acceptance lane cross 0.00 arc centre #5346 left </TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: #ece9d8; BORDER-LEFT: #ece9d8; BACKGROUND-COLOR: transparent; HEIGHT: 15pt; BORDER-TOP: #ece9d8; BORDER-RIGHT: #ece9d8" height=20>link 3222 3221 category 11 speed 25 mph force merge force across junction visibility 40.00 m gap acceptance lane merge 0.50 gap acceptance lane cross 0.50 arc centre #1828 left </TD></TR></TBODY></TABLE>
I am looking to remove any text that says the following:

"gap acceptance lane merge x.xx"
"gap acceptance lane cross x.xx"
"gap acceptance path cross x.xx"

I would use a find replace but the numbers proceeding the text vary. Additionally, there is another phrase often contained that included the text "gap acceptance" (i,e, gap acceptance look next) which i do not want to remove.

So the lines above would be converted to the following:

<TABLE style="WIDTH: 1216pt; BORDER-COLLAPSE: collapse" border=0 cellSpacing=0 cellPadding=0 width=1621><COLGROUP><COL style="WIDTH: 1216pt; mso-width-source: userset; mso-width-alt: 59282" width=1621><TBODY><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: #ece9d8; BORDER-LEFT: #ece9d8; BACKGROUND-COLOR: transparent; WIDTH: 1216pt; HEIGHT: 15pt; BORDER-TOP: #ece9d8; BORDER-RIGHT: #ece9d8" height=20 width=1621>link 2991 2992 category 11 speed 30 mph force merge force across cost factor 0.800 junction visibility 30.00 m arc centre #5060 left </TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: #ece9d8; BORDER-LEFT: #ece9d8; BACKGROUND-COLOR: transparent; HEIGHT: 15pt; BORDER-TOP: #ece9d8; BORDER-RIGHT: #ece9d8" height=20>link 3152 3179 category 11 speed 20 mph width 8.6 m force merge force across gap acceptance look next arc centre #5410 </TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: #ece9d8; BORDER-LEFT: #ece9d8; BACKGROUND-COLOR: transparent; HEIGHT: 15pt; BORDER-TOP: #ece9d8; BORDER-RIGHT: #ece9d8" height=20>link 3219 3264 category 11 speed 20 mph width 8.6 m force merge force across junction visibility 30.00 m arc centre #5346 left </TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: #ece9d8; BORDER-LEFT: #ece9d8; BACKGROUND-COLOR: transparent; HEIGHT: 15pt; BORDER-TOP: #ece9d8; BORDER-RIGHT: #ece9d8" height=20>link 3222 3221 category 11 speed 25 mph force merge force across junction visibility 40.00 m arc centre #1828 left </TD></TR></TBODY></TABLE>
The second part to my problem is that i am looking to replace any text that reads "junction visibility X.xx m" with "junction visibility 20.00 m" i.e. all junction visibility should be set at 20 m so in the four lines would then read..

<TABLE style="WIDTH: 1216pt; BORDER-COLLAPSE: collapse" border=0 cellSpacing=0 cellPadding=0 width=1621><COLGROUP><COL style="WIDTH: 1216pt; mso-width-source: userset; mso-width-alt: 59282" width=1621><TBODY><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: #ece9d8; BORDER-LEFT: #ece9d8; BACKGROUND-COLOR: transparent; WIDTH: 1216pt; HEIGHT: 15pt; BORDER-TOP: #ece9d8; BORDER-RIGHT: #ece9d8" height=20 width=1621>link 2991 2992 category 11 speed 30 mph force merge force across cost factor 0.800 junction visibility 20.00 m arc centre #5060 left </TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: #ece9d8; BORDER-LEFT: #ece9d8; BACKGROUND-COLOR: transparent; HEIGHT: 15pt; BORDER-TOP: #ece9d8; BORDER-RIGHT: #ece9d8" height=20>link 3152 3179 category 11 speed 20 mph width 8.6 m force merge force across gap acceptance look next arc centre #5410 </TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: #ece9d8; BORDER-LEFT: #ece9d8; BACKGROUND-COLOR: transparent; HEIGHT: 15pt; BORDER-TOP: #ece9d8; BORDER-RIGHT: #ece9d8" height=20>link 3219 3264 category 11 speed 20 mph width 8.6 m force merge force across junction visibility 20.00 m arc centre #5346 left </TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: #ece9d8; BORDER-LEFT: #ece9d8; BACKGROUND-COLOR: transparent; HEIGHT: 15pt; BORDER-TOP: #ece9d8; BORDER-RIGHT: #ece9d8" height=20>link 3222 3221 category 11 speed 25 mph force merge force across junction visibility 20.00 m arc centre #1828 left </TD></TR></TBODY></TABLE>

Any help would be much appreciated. And appologies if this is a stupid question

Stuart
 

Excel Facts

What is the last column in Excel?
Excel columns run from A to Z, AA to AZ, AAA to XFD. The last column is XFD.
Not stupid at all, and fun to solve without using a macro... try this.

1) Insert a blank row at the top
2) Data is in column A starting at A2
3) In B1 put a title of gap acceptance lane merge
4) In C1 put a title of gap acceptance lane cross
5) In D1 put a title of gap acceptance path cross
6) In E1 put a title of junction visibility

Now the formulas...

1) In B2 enter this formula:

=TRIM(IF(ISNUMBER(SEARCH(B$1,A2)),LEFT(A2, SEARCH(B$1,A2)-2) & MID(A2, SEARCH(B$1,A2)+LEN(B$1)+5,LEN(A2)), A2))

2) Copy that across the row into C2 and D2

3) In E2 enter this formula:

=TRIM(IF(ISNUMBER(SEARCH(E$1,D2)),LEFT(D2, SEARCH(E$1,D2)-2) & " junction visibility 20.00 " & MID(D2, SEARCH(E$1,D2)+LEN(E$1)+7,LEN(D2)), D2))

4) Now copy the cells B2:E2 down the whole data set.

The answers are now in column E.

To flatten the answers:

1) Highlight column E
2) Copy
3) Select column G and Edit > Paste Special > Values
 
Upvote 0
Thanks!!

That works perfectly thanks so much.

I think i was getting there with the following...

=REPLACE(A1,FIND((LEFT(RIGHT(A1,LEN(A1)-(FIND("gap acceptance lane merge",A1,1))+1),30)),A1),30,"")

=REPLACE(A1,FIND((LEFT(RIGHT(A1,LEN(A1)-(FIND("gap acceptance lane cross",A1,1))+1),31)),A1),31,"")

=REPLACE(A1,FIND((LEFT(RIGHT(A1,LEN(A1)-(FIND("gap acceptance path cross",A1,1))+1),30)),A1),30,"")

but it was returning errors where the find rerturned no value.

Just for future reference is there a way to combine several replace functions within one forumla?

Thanks for your help!
 
Upvote 0
Hi

Sorry to be a pain but ive noticed a problem

I can easily manually fix it as there are very few entries that have this specific problem

the problem occurs when the gap acceptance value is a negative and therefore the number of characters increases

e.g.

link 1118 2230w category 7 force merge force across gap acceptance lane merge -2.00 gap acceptance lane cross -2.00 gap acceptance path cross -2.00

this line is convertd to

<TABLE style="WIDTH: 768pt; BORDER-COLLAPSE: collapse" border=0 cellSpacing=0 cellPadding=0 width=1024><COLGROUP><COL style="WIDTH: 768pt; mso-width-source: userset; mso-width-alt: 37449" width=1024><TBODY><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: #ece9d8; BORDER-LEFT: #ece9d8; BACKGROUND-COLOR: transparent; WIDTH: 768pt; HEIGHT: 15pt; BORDER-TOP: #ece9d8; BORDER-RIGHT: #ece9d8" height=20 width=1024>link 1118 2230w category 7 force merge force across000 in column E so has the extra '000' at the end

another example is

link 2987 2980 category 11 force merge force across junction visibility 20.00 m000 arc centre #5032 left
Can this be avoided by revising the formula?
</TD></TR></TBODY></TABLE>
 
Upvote 0
Always try to send up a set of sample data that represents all the hurdles to overcome.

Try this as the base B2 formula:

=TRIM(IF(ISNUMBER(SEARCH(B$1,A2)),LEFT(A2, SEARCH(B$1,A2)-2) & " " & MID(A2, SEARCH(B$1,A2)+LEN(B$1)+6,LEN(A2)), A2))
 
Upvote 0

Forum statistics

Threads
1,224,616
Messages
6,179,912
Members
452,949
Latest member
beartooth91

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top