I'm trying to parse out text strings from a source that is not consistently formatted. These extractions will be inserted into separate cells.
In a "perfect scenario", the data source should be formatted in the following manner:
Cell C1
1. Business, Clark Kent 2. Solution, Diana Prince 3. Operational, Bruce Wayne 4a. Pilot, Hal Jordan 4b. Engineering, Barry Allen 4c. IT, Peter Parker 4d. Security, Tony Stark
The use of numbers, spaces and commas should always be constant. Blue text can be considered my markers with which to find things.
Sometimes the data comes over like this:
1. Business, Clark Kent 2. Solution, Diana Prince 3. Operational, 4a. , Hal Jordan 4b. , 4c. , 4d. ,
1. Business, Clark Kent 2. Solution, Diana Prince 3. Operational, Bruce Wayne 4a. , Hal Jordan 4b. , Barry Allen 4c. , Peter Parker 4d. , Tony Stark
Business, Clark Kent 2. Solution, Diana Prince 3. Operational, 4a. , Hal Jordan
I need to allow for these variances and extract the names and areas of expertise (the descriptions that are sometimes included just before the person's name like Engineering and Security above).
So using the "perfect scenario" above, it would look like this:
[TABLE="class: grid, width: 800"]
<tbody>[TR]
[TD]A
[/TD]
[TD]B
[/TD]
[TD]C
[/TD]
[TD]D
[/TD]
[TD]E
[/TD]
[TD]F
[/TD]
[TD]G
[/TD]
[/TR]
[TR]
[TD]Clark Kent
[/TD]
[TD]Diana Prince
[/TD]
[TD]Bruce Wayne
[/TD]
[TD]Pilot, Hal Jordan
[/TD]
[TD]Engineering, Barry Allen
[/TD]
[TD]IT, Peter Parker
[/TD]
[TD]Security, Tony Stark
[/TD]
[/TR]
</tbody>[/TABLE]
My formulas so far look like this, but I end up having to add special tweaks to cells E, F and G:
[TABLE="class: grid, width: 710"]
<tbody>[TR]
[TD]A
[/TD]
[TD]B
[/TD]
[TD]C
[/TD]
[TD]D
[/TD]
[TD]E
[/TD]
[TD]F
[/TD]
[TD]G
[/TD]
[/TR]
[TR]
[TD]=TRIM(MID($C1,14,FIND("2",$C1)-14))
[/TD]
[TD]=TRIM(MID($C1,FIND("n,",$C1)+2,FIND("3",$C1)-FIND("n,",$C1)-2))
[/TD]
[TD]=TRIM(MID($C1,FIND("l,",$C1)+2,FIND("4",$C1)-FIND("l,",$C1)-2))
[/TD]
[TD]=IF($G$4=TRUE,IF(LEFT(MID($C$1,FIND("4a",$C1)+4,(FIND("4b",$C1)-FIND("4a",$C1))-5),1)=",",MID($C$1,FIND("4a",$C1)+6,(FIND("4b",$C1)-FIND("4a",$C1))-7),MID($C$1,FIND("4a",$C1)+4,(FIND("4b",$C1)-FIND("4a",$C1))-5)),RIGHT($C1,LEN($C1)-FIND("4a",$C1)-5))
[/TD]
[TD]=IF($H$4=TRUE,IF(LEFT(MID($C$1,FIND("4b",$C1)+4,(FIND("4c",$C1)-FIND("4b",$C1))-5),1)=",",MID($C$1,FIND("4b",$C1)+6,(FIND("4c",$C1)-FIND("4b",$C1))-7),MID($C$1,FIND("4b",$C1)+4,(FIND("4c",$C1)-FIND("4b",$C1))-5)),RIGHT($C1,LEN($C1)-FIND("4b.",$C1,LEN($F7))-3))
[/TD]
[TD]Gets confusing at this point
[/TD]
[TD]Gets confusing at this point
[/TD]
[/TR]
</tbody>[/TABLE]
Any help would be very, very welcome.
In a "perfect scenario", the data source should be formatted in the following manner:
Cell C1
1. Business, Clark Kent 2. Solution, Diana Prince 3. Operational, Bruce Wayne 4a. Pilot, Hal Jordan 4b. Engineering, Barry Allen 4c. IT, Peter Parker 4d. Security, Tony Stark
The use of numbers, spaces and commas should always be constant. Blue text can be considered my markers with which to find things.
Sometimes the data comes over like this:
1. Business, Clark Kent 2. Solution, Diana Prince 3. Operational, 4a. , Hal Jordan 4b. , 4c. , 4d. ,
1. Business, Clark Kent 2. Solution, Diana Prince 3. Operational, Bruce Wayne 4a. , Hal Jordan 4b. , Barry Allen 4c. , Peter Parker 4d. , Tony Stark
Business, Clark Kent 2. Solution, Diana Prince 3. Operational, 4a. , Hal Jordan
I need to allow for these variances and extract the names and areas of expertise (the descriptions that are sometimes included just before the person's name like Engineering and Security above).
So using the "perfect scenario" above, it would look like this:
[TABLE="class: grid, width: 800"]
<tbody>[TR]
[TD]A
[/TD]
[TD]B
[/TD]
[TD]C
[/TD]
[TD]D
[/TD]
[TD]E
[/TD]
[TD]F
[/TD]
[TD]G
[/TD]
[/TR]
[TR]
[TD]Clark Kent
[/TD]
[TD]Diana Prince
[/TD]
[TD]Bruce Wayne
[/TD]
[TD]Pilot, Hal Jordan
[/TD]
[TD]Engineering, Barry Allen
[/TD]
[TD]IT, Peter Parker
[/TD]
[TD]Security, Tony Stark
[/TD]
[/TR]
</tbody>[/TABLE]
My formulas so far look like this, but I end up having to add special tweaks to cells E, F and G:
[TABLE="class: grid, width: 710"]
<tbody>[TR]
[TD]A
[/TD]
[TD]B
[/TD]
[TD]C
[/TD]
[TD]D
[/TD]
[TD]E
[/TD]
[TD]F
[/TD]
[TD]G
[/TD]
[/TR]
[TR]
[TD]=TRIM(MID($C1,14,FIND("2",$C1)-14))
[/TD]
[TD]=TRIM(MID($C1,FIND("n,",$C1)+2,FIND("3",$C1)-FIND("n,",$C1)-2))
[/TD]
[TD]=TRIM(MID($C1,FIND("l,",$C1)+2,FIND("4",$C1)-FIND("l,",$C1)-2))
[/TD]
[TD]=IF($G$4=TRUE,IF(LEFT(MID($C$1,FIND("4a",$C1)+4,(FIND("4b",$C1)-FIND("4a",$C1))-5),1)=",",MID($C$1,FIND("4a",$C1)+6,(FIND("4b",$C1)-FIND("4a",$C1))-7),MID($C$1,FIND("4a",$C1)+4,(FIND("4b",$C1)-FIND("4a",$C1))-5)),RIGHT($C1,LEN($C1)-FIND("4a",$C1)-5))
[/TD]
[TD]=IF($H$4=TRUE,IF(LEFT(MID($C$1,FIND("4b",$C1)+4,(FIND("4c",$C1)-FIND("4b",$C1))-5),1)=",",MID($C$1,FIND("4b",$C1)+6,(FIND("4c",$C1)-FIND("4b",$C1))-7),MID($C$1,FIND("4b",$C1)+4,(FIND("4c",$C1)-FIND("4b",$C1))-5)),RIGHT($C1,LEN($C1)-FIND("4b.",$C1,LEN($F7))-3))
[/TD]
[TD]Gets confusing at this point
[/TD]
[TD]Gets confusing at this point
[/TD]
[/TR]
</tbody>[/TABLE]
Any help would be very, very welcome.