Regex to split substring

dmd123

New Member
Joined
Aug 16, 2016
Messages
5
Hi,

I am trying to figure out how to implement a regex solution to a data field and was looking for some help.

I have got a fair bit done using mid and search/find but I am still doing a lot of clean up to the data after extraction to compensate for the different way people record the same info. The regular expression I want to extract from is:

## hours @ £##.##

I have been using the @ sign to get the hours and the £ sign to get the cost. However....some people write hrs, others write hour or hours and some just hr. The hours may be part values as well, so 10.5 or 9.25 etc. For the price some will include the decimal place and others won't. The expression sits within a larger string and is never in the same place. I had tried to use find and replace to standardise the hours data but it starts to change existing text that includes some of the combinations - for example gilchrist becomes gilchourist, not ideal!!

The £ value also has issues, some people put a space after the £ sign whilst others incude no decimal point so £15 instead of £15.00 I have used Search within MID to get the starting point for extraction but because of the variation I often have values coming out with other aspects of the string that I don't want to see.

By using the @ sign I can get at the data either side of it. The main issue is extracting the hours. I thought I could use a regex to pick up on the variation and still pull through the numerical value for the hours irrespective of whether someone has input hr/hrs/hour/hours.

I was contemplating writing a series of formulas using mid and then choosing the one that returned the cleanest value However, when tweaking the formula manually this changes from returning 3 characters to returning eight. Whilst it works I get the feeling there is a better way and when I looked I saw regex and thought that might work. I am looking to get the hours in one cell in the row and the £value in another.

An example of the output would be that the expression 10 hrs @ £50.50 results in 10 in one cell and £50.50 in another.

Thanks for your help.
 
I have them all on one sheet at minute and it doesn't like it, going really slow.
Sounds like you might have a fair bit of data. My RegEx UDF would run considerably faster over larger data if constructed like below. However, this will still likely be no faster than Rick's UDF. The native worksheet formulas are faster than either of the UDFs. As you have suggested though, it really is a matter of choosing the method that gives the best results.
Code:
Public Const RXpattern As String = "(\d+\.?\d*)(\D+@\D+)(\d+\.?\d*)"

Function HrCost(S As String, HC As String) As Double
  Static RX As Object
  
  If RX Is Nothing Then Set RX = CreateObject("VBScript.RegExp")
  RX.Pattern = RXpattern
  HrCost = RX.Execute(S)(0).Submatches(IIf(UCase(HC) = "H", 0, 2))
End Function
 
Upvote 0

Excel Facts

Will the fill handle fill 1, 2, 3?
Yes! Type 1 in a cell. Hold down Ctrl while you drag the fill handle.

Forum statistics

Threads
1,216,101
Messages
6,128,841
Members
449,471
Latest member
lachbee

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top