Extract Text String with PQ

AllisterB

Board Regular
Joined
Feb 22, 2019
Messages
120
Office Version
  1. 365
Platform
  1. Windows
I have a column that includes text consisting of 7 consecutive non-blank characters in the format AAANNNN (Alpha A-Z and Numeric 0-9).
I want to extract this field the 7 characters and place the extracted text string in a new Column. If the AAANNNN pattern does not exist on a rowthen no output is made.

The current M code does not do this. I have also tried an approach using by "Add Columns By Example".

Does anyone have any ideas?

Thank You



By Example (revised.xlsx
FGH
2ID
3(ABC:FGH4567)
4(ABC:HJK4567
5ABC:GH3456
6ABC:A KL5678
7:YUI4567
8ABC:HJKO1234
9GHK1234
10:GHY5672))
11:GHY567))
12
13
14
15
16IDCustom.1Comment
17(ABC:FGH4567)should be FGH4567
18(ABC:HJK4567HJK4567OK
19ABC:GH3456Correct treatment as not AAA1234 format
20ABC:A KL5678Correct Output as it contains a blank
21:YUI4567YUI4567OK
22ABC:HJKO1234Correct treatment as not AAA1234 format
23GHK1234GHK1234OK
24:GHY5672))should be GHY5672
25:GHY567))Correct treatment as not AAA1234 format
Sheet1


Power Query:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", type text}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type", "Custom.1", each if Text.Length(List.Last(Text.Split([ID], ":")))=7 then  List.Last(Text.Split([ID], ":")) else null)
in
    #"Added Custom"
 
This worked fine - thank you

However in my real data if there is a space following th colon it doesn't work - so I have introduced a step to replace ": " with "".
 
Upvote 0

Excel Facts

What is =ROMAN(40) in Excel?
The Roman numeral for 40 is XL. Bill "MrExcel" Jelen's 40th book was called MrExcel XL.
another approach
IDResult
(ABC:FGH4567)FGH4567
(ABC:HJK4567HJK4567
ABC:GH3456
ABC:A KL5678
:YUI4567YUI4567
ABC:HJKO1234
GHK1234GHK1234
:GHY5672))GHY5672
:GHY567))
ABCD123
ABCHJFK1234)*BJFK1234

Power Query:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    IF1 = Table.AddColumn(Source, "Custom", each if Text.Contains([ID], " ") then null else if not Text.Contains([ID], ":") then ":"&[ID] else if not Text.Contains([ID], ")") then [ID]&")" else [ID]),
    ETBD = Table.TransformColumns(IF1, {{"Custom", each Text.BetweenDelimiters(_, ":", ")"), type text}}),
    IF2 = Table.AddColumn(ETBD, "Result", each if Text.Length(Text.Select([Custom], {"A".."Z"})) = 3 and Text.Length(Text.Select([Custom], {"0".."9"})) = 4 then [Custom] else if not Text.Contains([ID], ":") and Text.Length([Custom]) > 7 then Text.End([Custom], 7) else null),
    TSC = Table.SelectColumns(IF2,{"Result"})
in
    TSC
question is: with this example, the target string ( JKO1234 ) in 3-4 format should be extracted from the yellow cell?
 
Last edited:
Upvote 0
another approach
IDResult
(ABC:FGH4567)FGH4567
(ABC:HJK4567HJK4567
ABC:GH3456
ABC:A KL5678
:YUI4567YUI4567
ABC:HJKO1234
GHK1234GHK1234
:GHY5672))GHY5672
:GHY567))
ABCD123
ABCHJFK1234)*BJFK1234

Power Query:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    IF1 = Table.AddColumn(Source, "Custom", each if Text.Contains([ID], " ") then null else if not Text.Contains([ID], ":") then ":"&[ID] else if not Text.Contains([ID], ")") then [ID]&")" else [ID]),
    ETBD = Table.TransformColumns(IF1, {{"Custom", each Text.BetweenDelimiters(_, ":", ")"), type text}}),
    IF2 = Table.AddColumn(ETBD, "Result", each if Text.Length(Text.Select([Custom], {"A".."Z"})) = 3 and Text.Length(Text.Select([Custom], {"0".."9"})) = 4 then [Custom] else if not Text.Contains([ID], ":") and Text.Length([Custom]) > 7 then Text.End([Custom], 7) else null),
    TSC = Table.SelectColumns(IF2,{"Result"})
in
    TSC
question is: with this example, the target string ( JKO1234 ) in 3-4 format should be extracted from the yellow cell?

I have just returned from a week out of the office

Thank you for your reply - yes the JKO12354 would be the correct result


Allister
 
Upvote 0
Power Query:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    IF1 = Table.AddColumn(Source, "Custom", each if Text.Contains([ID], " ") then null else if not Text.Contains([ID], ":") then ":"&[ID] else if not Text.Contains([ID], ")") then [ID]&")" else [ID]),
    ETBD = Table.TransformColumns(IF1, {{"Custom", each Text.BetweenDelimiters(_, ":", ")"), type text}}),
    IF2 = Table.AddColumn(ETBD, "Result", each if Text.Length(Text.Select([Custom], {"A".."Z"})) = 3 and Text.Length(Text.Select([Custom], {"0".."9"})) = 4 then [Custom] else if not Text.Contains([ID], ":") and Text.Length([Custom]) > 7 then Text.End([Custom], 7) else if Text.Contains([ID], ":") and Text.Length([Custom]) >= 7 then Text.End([Custom], 7) else null ),
    TSC = Table.SelectColumns(IF2,{"Result"})
in
    TSC
IDResult
(ABC:FGH4567)FGH4567
(ABC:HJK4567HJK4567
ABC:GH3456
ABC:A KL5678
:YUI4567YUI4567
ABC:HJKO1234JKO1234
GHK1234GHK1234
:GHY5672))GHY5672
:GHY567))
ABCD123
ABCHJFK1234)*BJFK1234
 
Upvote 0

Forum statistics

Threads
1,216,014
Messages
6,128,287
Members
449,436
Latest member
blaineSpartan

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top