Extract Multiple Sections of Text in a String, Between Two Repeating Words

blafarm

Board Regular
Joined
Oct 14, 2018
Messages
59
Hi All,

I have a bit of a challenge and was hoping for some guidance.

I have a very long string of text in a single cell and I need to extract multiple sections of text that appear between two words that repeat in the string.

For example, here is a very simplified version of the text string in Cell A1:

{"date": 5/7/19 headline:"GE Posts Profit" source:"CNBC"}{"date": 5/8/19 headline:"GE Dividend Shrink" source:"MSN"}{"date": 5/9/19 headline:"GE Bankrupt" source:"WSJ"}

I have had limited success with the following formula, however it only extracts the text between the first appearance of the words: "headline" and "source":

Formula: =MID(A1,SEARCH("headline",A1)+2,SEARCH("source:",A1)-SEARCH("headline",A1)-4)

Result: GE Posts Profit


This formula does not capture ALL of the headlines in the full string -- only the first headline.

My initial goal is to learn a formula that will extract ALL of the headlines within the string, and that will place the results in a succession of horizontal or vertical cells that either look like this:

GE Posts ProfitGE Dividends ShrinkGE Bankrupt

<tbody>
</tbody>

... or that look like this:

GE Posts Profit
GE Dividends Shrink
GE Bankrupt

<tbody>
</tbody>

I have some additional questions that are directly related to this inquiry, but I'm trying to start by keeping it simple.

Thanks very much in advance.

Cheers
 

Excel Facts

How to change case of text in Excel?
Use =UPPER() for upper case, =LOWER() for lower case, and =PROPER() for proper case. PROPER won't capitalize second c in Mccartney
Re: How to Extract Multiple Sections of Text in a String, Between Two Repeating Words

Will each successive row take on the same characteristics. To ensure a good solution, it would be beneficial to see a few more records to see if there is any consistency.
 
Upvote 0
Re: How to Extract Multiple Sections of Text in a String, Between Two Repeating Words

Hey, providing the string format follows the above rule try give this a try in cell A2:

=MID($A$1,SEARCH("|",SUBSTITUTE($A$1,"headline:","|",ROW()-1))+10,SEARCH("|",SUBSTITUTE($A$1,"source:","|",ROW()-1))-SEARCH("|",SUBSTITUTE($A$1,"headline:","|",ROW()-1))-12)

Copy this formula down to see successive headlines.
 
Last edited:
Upvote 0
Re: How to Extract Multiple Sections of Text in a String, Between Two Repeating Words

Hi,

Assuming data string in A1...

Use B1 formula copied across if you want results horizontal,
Use A2 formula copied down if you want results vertical:


Book1
ABCDE
1{"date": 5/7/19 headline:"GE Posts Profit" source:"CNBC"}{"date": 5/8/19 headline:"GE Dividend Shrink" source:"MSN"}{"date": 5/9/19 headline:"GE Bankrupt" source:"WSJ"}GE Posts ProfitGE Dividend ShrinkGE Bankrupt
2GE Posts Profit
3GE Dividend Shrink
4GE Bankrupt
5
Sheet661
Cell Formulas
RangeFormula
B1=TRIM(SUBSTITUTE(MID(SUBSTITUTE(SUBSTITUTE($A1,"headline:",REPT(" ",LEN($A1))),"source:",REPT(" ",LEN($A1))),(COLUMNS($B1:B1)*2-1)*LEN($A1),LEN($A1)),CHAR(34),""))
A2=TRIM(SUBSTITUTE(MID(SUBSTITUTE(SUBSTITUTE(A$1,"headline:",REPT(" ",LEN(A$1))),"source:",REPT(" ",LEN(A$1))),(ROWS(A$2:A2)*2-1)*LEN(A$1),LEN(A$1)),CHAR(34),""))
 
Upvote 0
Re: How to Extract Multiple Sections of Text in a String, Between Two Repeating Words

Will each successive row take on the same characteristics. To ensure a good solution, it would be beneficial to see a few more records to see if there is any consistency.

Thank you for your prompt response.

I was trying to keep it simple, but at the bottom of this post is a sample of the kind of text string I am working with.

As you can see, there are more metrics in the string than just the Date, Headline and Source that I presented in my simplified example.

In a perfect world, I would be able to extract the following plurality of metrics in the order they appear within the string:

datetime
headline
source
url

So the resulting column would look like this:

datetime (1st appearance)
headline (1st appearance)
source (1st appearance)
url (1st appearance)
datetime (2nd appearance)
headline (2nd appearance)
source (2nd appearance))
url (2nd appearance)
datetime (3rd appearance)
headline (3rd appearance)
source (3rd appearance)
url (3rd appearance)

However, I can imagine that might be a tall order -- so I thought I'd aim lower and try to extract singular metrics in one formula so that the result looks like this

headline (1st appearance)
headline (2nd appearance)
headline (3rd appearance)

It is worth mentioning that I have tried many times to use various delimiters to split/parse this text into a horizontal sequence cells.
However, there is no singular delimiter I can use that does not occasionally appear in the Headline or Summary, which then causes the cells to contain text fragments.
That's why I tried experimenting with SEARCH -- because I could conceivably extract text between words that would not be found in a Headline or Summary.

Sample Text String:

<style type="text/css"><!--td {border: 1px solid #ccc;}br {mso-data-placement:same-cell;}--></style>[{"datetime":1557328805000 headline:"GE CEO says weaker quarters likely after 'good start' to year" source:"Reuters" url:"https://cloud.iexapis.com/v1/news/article/1e7714c0-93f6-4921-af24-f3122a2e809e" summary:"General Electric Co Chief Executive Larry Culp said on Wednesday the company will likely have weaker quarters in the rest of the year after a surprisingly "good start" in the first quarter." related:"GE" image:"https://cloud.iexapis.com/v1/news/image/1e7714c0-93f6-4921-af24-f3122a2e809e" lang:"en" hasPaywall:false} {"datetime":1557307800000 headline:"GE Investors Gather to Elect Revamped Board Air Gripes" source:"The Wall Street Journal" url:"https://cloud.iexapis.com/v1/news/article/dfb8a33f-9194-493e-9b83-9709dc0d18a9" summary:"General Electric Chairman and CEO Larry Culp who joined the company just a year ago is expected to answer questions from investors at the shareholder meeting in Tarrytown N.Y." related:"GE" image:"https://cloud.iexapis.com/v1/news/image/dfb8a33f-9194-493e-9b83-9709dc0d18a9" lang:"en" hasPaywall:true} {"datetime":1557205509000 headline:"IndiGo is making plans for another aggressive push" source:"The Economic Times India" url:"https://cloud.iexapis.com/v1/news/article/6ffe2450-5e9f-4166-b5f6-a26a79f10908" summary:"by Anurag KotokyIndiGo is in talks with Airbus SE for another large plane order in a sign Asia’s biggest budget carrier has no intention of letting up on a blistering pace of expansion.The Indian airline which adds an aircraft to its fleet every week is in discussions to buy a longer-range version of the European planemaker’s newest narrow-body jet according to Chief Executive Officer Ronojoy Dutta.Existing orders will see IndiGo through the next two years Dutta said in an interview with Bloomberg News Editor-in-Chief John Micklethwait. After that the carrier is considering adding new planes including Airbus’s long-distance A321neo LR and the yet-to-be-released A321 XLR he said.“When we order we’ll order in bulk for sure ” Dutta said in New Delhi in his first interview since becoming CEO of the airline in January. “I can’t give you a tentative number but it will be large. We need longer range.”Buying BigThough Dutta didn’t specify the size of IndiGo’s next purchase it’s likely to be a multi-billion-dollar order based on the company’s history." related:"GE" image:"https://cloud.iexapis.com/v1/news/image/6ffe2450-5e9f-4166-b5f6-a26a79f10908" lang:"en" hasPaywall:false} {"datetime":1557085967000 headline:"General Electric: Larry Culp Is The Man (For The Job)" source:"Seeking Alpha" url:"https://cloud.iexapis.com/v1/news/article/87bdd7f9-1aec-4943-835f-7a436483cede" summary:"General Electric reported strong Q1 2019 results that beat the consensus bottom-line estimate. The stock is up big since General Electric reported Q1 2019 resul" related:"GE" image:"https://cloud.iexapis.com/v1/news/image/87bdd7f9-1aec-4943-835f-7a436483cede" lang:"en" hasPaywall:false} {"datetime":1556912727000 headline:"GE to build 500 MW coal power plant in Kosovo" source:"Seeking Alpha" url:"https://cloud.iexapis.com/v1/news/article/cf2c1bf4-2740-4f80-981b-7a68a8ec327f" summary:"A General Electric-led ([[GE]] +2.1%) consortium is selected to build and equip a new 500 MW coal-fired power plant in Kosovo. The new plant designed to m" related:"GE" image:"https://cloud.iexapis.com/v1/news/image/cf2c1bf4-2740-4f80-981b-7a68a8ec327f" lang:"en" hasPaywall:false} {"datetime":1556901660000 headline:"Good News! General Electric Burned Less Cash in Q1" source:"The Motley Fool" url:"https://cloud.iexapis.com/v1/news/article/492e3f89-1206-4547-a8a8-db086e773abb" summary:"Is there light at the end of the tunnel for the troubled industrial conglomerate?" related:"GE" image:"https://cloud.iexapis.com/v1/news/image/492e3f89-1206-4547-a8a8-db086e773abb" lang:"en" hasPaywall:false} {"datetime":1556893560000 headline:"The Zacks Analyst Blog Highlights: Apple Microsoft General Electric Ecolab and Celgene" source:"Zacks Investment Research" url:"https://cloud.iexapis.com/v1/news/article/739fd73b-1e73-4d85-8e8b-7ced62a4e5ea" summary:"The Zacks Analyst Blog Highlights: Apple Microsoft General Electric Ecolab and Celgene" related:"GE" image:"https://cloud.iexapis.com/v1/news/image/739fd73b-1e73-4d85-8e8b-7ced62a4e5ea" lang:"en" hasPaywall:false} {"datetime":1556889257000 headline:"GE to build Kosovo's new 500 MW coal power plant" source:"Reuters" url:"https://cloud.iexapis.com/v1/news/article/f5e2699b-b92e-4aed-aa90-02b557ce5652" summary:"Kosovo and London-listed power firm ContourGlobal said on Friday they had chosen a consortium of General Electric subsidiaries to build and equip a new 500 megawatt (MW)coal-fired power plant in the Balkan country." related:"GE" image:"https://cloud.iexapis.com/v1/news/image/f5e2699b-b92e-4aed-aa90-02b557ce5652" lang:"en" hasPaywall:false} {"datetime":1556877840000 headline:"Key Takeaways From General Electric's Earnings" source:"The Motley Fool" url:"https://cloud.iexapis.com/v1/news/article/356ca854-67f0-4e1b-a782-4f1986da535f" summary:"The company delivered a solid but unspectacular quarter but that might be just what investors need right now." related:"GE" image:"https://cloud.iexapis.com/v1/news/image/356ca854-67f0-4e1b-a782-4f1986da535f" lang:"en" hasPaywall:false} {"datetime":1556833186000 headline:"General Electric: Still A Mystery" source:"Seeking Alpha" url:"https://cloud.iexapis.com/v1/news/article/2a0c3a5c-32fc-4d40-b8e7-458792672cda" summary:"General Electric released its first quarter reports with a little bit of positive news and a little bit of negative news. As one analyst put it Did this change" related:"GE" image:"https://cloud.iexapis.com/v1/news/image/2a0c3a5c-32fc-4d40-b8e7-458792672cda" lang:"en" hasPaywall:false}]
 
Upvote 0
Re: How to Extract Multiple Sections of Text in a String, Between Two Repeating Words

Hey, providing the string format follows the above rule try give this a try in cell A2:

=MID($A$1,SEARCH("|",SUBSTITUTE($A$1,"headline:","|",ROW()-1))+10,SEARCH("|",SUBSTITUTE($A$1,"source:","|",ROW()-1))-SEARCH("|",SUBSTITUTE($A$1,"headline:","|",ROW()-1))-12)

Copy this formula down to see successive headlines.

Thank you for your prompt reply.

I tried your formula but am getting an error related to the requirement of the pipe argument "|"

That symbol does not appear in my text strings (please see my post above).

Thank you.
 
Upvote 0
Re: How to Extract Multiple Sections of Text in a String, Between Two Repeating Words

Hi,

Assuming data string in A1...

Use B1 formula copied across if you want results horizontal,
Use A2 formula copied down if you want results vertical:

ABCDE
1{"date": 5/7/19 headline:"GE Posts Profit" source:"CNBC"}{"date": 5/8/19 headline:"GE Dividend Shrink" source:"MSN"}{"date": 5/9/19 headline:"GE Bankrupt" source:"WSJ"}GE Posts ProfitGE Dividend ShrinkGE Bankrupt
2GE Posts Profit
3GE Dividend Shrink
4GE Bankrupt
5

<tbody>
</tbody>
Sheet661

Worksheet Formulas
CellFormula
B1=TRIM(SUBSTITUTE(MID(SUBSTITUTE(SUBSTITUTE($A1,"headline:",REPT(" ",LEN($A1))),"source:",REPT(" ",LEN($A1))),(COLUMNS($B1:B1)*2-1)*LEN($A1),LEN($A1)),CHAR(34),""))
A2=TRIM(SUBSTITUTE(MID(SUBSTITUTE(SUBSTITUTE(A$1,"headline:",REPT(" ",LEN(A$1))),"source:",REPT(" ",LEN(A$1))),(ROWS(A$2:A2)*2-1)*LEN(A$1),LEN(A$1)),CHAR(34),""))

<tbody>
</tbody>

<tbody>
</tbody>

Thank you for your prompt reply.

Your ROW formula works perfectly. Thank you very much!
I am having a bit of trouble with COLUMN version -- maybe because I cannot place your formula in Cell B1.
So I placed it in Cell A3 and modified the formula it as follows:

=TRIM(SUBSTITUTE(MID(SUBSTITUTE(SUBSTITUTE(A$1,"headline:",REPT(" ",LEN(A$1))),"source:",REPT(" ",LEN(A$1))),(COLUMNS($A3:A3)*2-1)*LEN(A$1),LEN(A$1)),CHAR(34),""))

When I do this, I do get the first headline in A3 -- but I get #NUM! errors for all the columns to the right.
This isn't a big problem, though, as the ROW version is probably more useful for my application anyway.

However, I do have two follow-up questions if you don't mind:

1. I am working in a Google Sheets environment and when I use your formula against a string that is 3,560 characters long -- it works perfectly.
However, when the string grows to 4,047 characters long, I get a #VALUE error.
And when I hover over a cell with that #VALUE error, Google Sheets gives me an Error message that consists of: "Text result of SUBSTITUTE is longer than the limit of 50000 characters."
Now, I've measured the string length using a variety of tools and I am quite certain this is breaking at only 4,047 characters.
Do you have any idea why this might be happening, and how to overcome it?

2. I posted a real sample of my text string in a post above and mentioned that, in a perfect world, I would like to be able to sequentially extract text segments in the string that exist between four different pairs of repeated boundary words.

Here are the pieces of text and the word pairs that bound them:

Extracted TextBeginning BoundaryEnding Boundary
1557328805000"datetime": headline:"
GE Posts Profitheadline:"" source:
CNBCsource:" url:"
www.cnbc.com url:"" summary:"

<tbody>
</tbody>

And in this perfect world, the resulting row would look like this:

1557328805000
GE Posts Profit
CNBC
www.cnbc.com
1657328805000
GE Dividends Shrink
MSN
www.msn.com

<tbody>
</tbody>
And so on ...

So, my question is: Is it possible to repeatedly extract those four text segments, that exist between those four different pairs of repeated boundary words?

Thank you once again.
 
Last edited:
Upvote 0
Re: How to Extract Multiple Sections of Text in a String, Between Two Repeating Words

UPDATE:

Well, I've solved all of my problems except this one:

1. I am working in a Google Sheets environment and when I use your formula against a string that is 3,560 characters long -- it works perfectly.
However, when the string grows to 4,047 characters long, I get a #VALUE error.
And when I hover over a cell with that #VALUE error, Google Sheets gives me an Error message that consists of: "Text result of SUBSTITUTE is longer than the limit of 50000 characters."
Now, I've measured the string length using a variety of tools and I am quite certain this is breaking at only 4,047 characters.
Do you have any idea why this might be happening, and how to overcome it?
 
Upvote 0
Re: How to Extract Multiple Sections of Text in a String, Between Two Repeating Words

Hi,

Again assuming your data string in A1, and assuming you start the extraction in A2, going downwards.

Use this updated formula in A2 copied down:


Book1
A
1[{"datetime":1557328805000 headline:"GE CEO says weaker quarters likely after 'good start' to year" source:"Reuters" url:"https://cloud.iexapis.com/v1/news/article/1e7714c0-93f6-4921-af24-f3122a2e809e" summary:"General Electric Co Chief Executive Larry Culp said on Wednesday the company will likely have weaker quarters in the rest of the year after a surprisingly "good start" in the first quarter." related:"GE" image:"https://cloud.iexapis.com/v1/news/image/1e7714c0-93f6-4921-af24-f3122a2e809e" lang:"en" hasPaywall:false} {"datetime":1557307800000 headline:"GE Investors Gather to Elect Revamped Board Air Gripes" source:"The Wall Street Journal" url:"https://cloud.iexapis.com/v1/news/article/dfb8a33f-9194-493e-9b83-9709dc0d18a9" summary:"General Electric Chairman and CEO Larry Culp who joined the company just a year ago is expected to answer questions from investors at the shareholder meeting in Tarrytown N.Y." related:"GE" image:"https://cloud.iexapis.com/v1/news/image/dfb8a33f-9194-493e-9b83-9709dc0d18a9" lang:"en" hasPaywall:true} {"datetime":1557205509000 headline:"IndiGo is making plans for another aggressive push" source:"The Economic Times India" url:"https://cloud.iexapis.com/v1/news/article/6ffe2450-5e9f-4166-b5f6-a26a79f10908" summary:"by Anurag KotokyIndiGo is in talks with Airbus SE for another large plane order in a sign Asias biggest budget carrier has no intention of letting up on a blistering pace of expansion.The Indian airline which adds an aircraft to its fleet every week is in discussions to buy a longer-range version of the European planemakers newest narrow-body jet according to Chief Executive Officer Ronojoy Dutta.Existing orders will see IndiGo through the next two years Dutta said in an interview with Bloomberg News Editor-in-Chief John Micklethwait. After that the carrier is considering adding new planes including Airbuss long-distance A321neo LR and the yet-to-be-released A321 XLR he said.When we order well order in bulk for sure Dutta said in New Delhi in his first interview since becoming CEO of the airline in January. I cant give you a tentative number but it will be large. We need longer range.Buying BigThough Dutta didnt specify the size of IndiGos next purchase its likely to be a multi-billion-dollar order based on the companys history." related:"GE" image:"https://cloud.iexapis.com/v1/news/image/6ffe2450-5e9f-4166-b5f6-a26a79f10908" lang:"en" hasPaywall:false} {"datetime":1557085967000 headline:"General Electric: Larry Culp Is The Man (For The Job)" source:"Seeking Alpha" url:"https://cloud.iexapis.com/v1/news/article/87bdd7f9-1aec-4943-835f-7a436483cede" summary:"General Electric reported strong Q1 2019 results that beat the consensus bottom-line estimate. The stock is up big since General Electric reported Q1 2019 resul" related:"GE" image:"https://cloud.iexapis.com/v1/news/image/87bdd7f9-1aec-4943-835f-7a436483cede" lang:"en" hasPaywall:false} {"datetime":1556912727000 headline:"GE to build 500 MW coal power plant in Kosovo" source:"Seeking Alpha" url:"https://cloud.iexapis.com/v1/news/article/cf2c1bf4-2740-4f80-981b-7a68a8ec327f" summary:"A General Electric-led ([[GE]] +2.1%) consortium is selected to build and equip a new 500 MW coal-fired power plant in Kosovo. The new plant designed to m" related:"GE" image:"https://cloud.iexapis.com/v1/news/image/cf2c1bf4-2740-4f80-981b-7a68a8ec327f" lang:"en" hasPaywall:false} {"datetime":1556901660000 headline:"Good News! General Electric Burned Less Cash in Q1" source:"The Motley Fool" url:"https://cloud.iexapis.com/v1/news/article/492e3f89-1206-4547-a8a8-db086e773abb" summary:"Is there light at the end of the tunnel for the troubled industrial conglomerate?" related:"GE" image:"https://cloud.iexapis.com/v1/news/image/492e3f89-1206-4547-a8a8-db086e773abb" lang:"en" hasPaywall:false} {"datetime":1556893560000 headline:"The Zacks Analyst Blog Highlights: Apple Microsoft General Electric Ecolab and Celgene" source:"Zacks Investment Research" url:"https://cloud.iexapis.com/v1/news/article/739fd73b-1e73-4d85-8e8b-7ced62a4e5ea" summary:"The Zacks Analyst Blog Highlights: Apple Microsoft General Electric Ecolab and Celgene" related:"GE" image:"https://cloud.iexapis.com/v1/news/image/739fd73b-1e73-4d85-8e8b-7ced62a4e5ea" lang:"en" hasPaywall:false} {"datetime":1556889257000 headline:"GE to build Kosovo's new 500 MW coal power plant" source:"Reuters" url:"https://cloud.iexapis.com/v1/news/article/f5e2699b-b92e-4aed-aa90-02b557ce5652" summary:"Kosovo and London-listed power firm ContourGlobal said on Friday they had chosen a consortium of General Electric subsidiaries to build and equip a new 500 megawatt (MW)coal-fired power plant in the Balkan country." related:"GE" image:"https://cloud.iexapis.com/v1/news/image/f5e2699b-b92e-4aed-aa90-02b557ce5652" lang:"en" hasPaywall:false} {"datetime":1556877840000 headline:"Key Takeaways From General Electric's Earnings" source:"The Motley Fool" url:"https://cloud.iexapis.com/v1/news/article/356ca854-67f0-4e1b-a782-4f1986da535f" summary:"The company delivered a solid but unspectacular quarter but that might be just what investors need right now." related:"GE" image:"https://cloud.iexapis.com/v1/news/image/356ca854-67f0-4e1b-a782-4f1986da535f" lang:"en" hasPaywall:false} {"datetime":1556833186000 headline:"General Electric: Still A Mystery" source:"Seeking Alpha" url:"https://cloud.iexapis.com/v1/news/article/2a0c3a5c-32fc-4d40-b8e7-458792672cda" summary:"General Electric released its first quarter reports with a little bit of positive news and a little bit of negative news. As one analyst put it Did this change" related:"GE" image:"https://cloud.iexapis.com/v1/news/image/2a0c3a5c-32fc-4d40-b8e7-458792672cda" lang:"en" hasPaywall:false}]
21557328805000
3GE CEO says weaker quarters likely after 'good start' to year
4Reuters
5https//cloud.iexapis.com/v1/news/article/1e7714c0-93f6-4921-af24-f3122a2e809e
61557307800000
7GE Investors Gather to Elect Revamped Board Air Gripes
8The Wall Street Journal
9https//cloud.iexapis.com/v1/news/article/dfb8a33f-9194-493e-9b83-9709dc0d18a9
101557205509000
11IndiGo is making plans for another aggressive push
12The Economic Times India
13https//cloud.iexapis.com/v1/news/article/6ffe2450-5e9f-4166-b5f6-a26a79f10908
141557085967000
15General Electric Larry Culp Is The Man (For The Job)
16Seeking Alpha
17https//cloud.iexapis.com/v1/news/article/87bdd7f9-1aec-4943-835f-7a436483cede
181556912727000
19GE to build 500 MW coal power plant in Kosovo
20Seeking Alpha
21https//cloud.iexapis.com/v1/news/article/cf2c1bf4-2740-4f80-981b-7a68a8ec327f
221556901660000
23Good News! General Electric Burned Less Cash in Q1
24The Motley Fool
25https//cloud.iexapis.com/v1/news/article/492e3f89-1206-4547-a8a8-db086e773abb
261556893560000
27The Zacks Analyst Blog Highlights Apple Microsoft General Electric Ecolab and Celgene
28Zacks Investment Research
29https//cloud.iexapis.com/v1/news/article/739fd73b-1e73-4d85-8e8b-7ced62a4e5ea
301556889257000
31GE to build Kosovo's new 500 MW coal power plant
32Reuters
33https//cloud.iexapis.com/v1/news/article/f5e2699b-b92e-4aed-aa90-02b557ce5652
341556877840000
35Key Takeaways From General Electric's Earnings
36The Motley Fool
37https//cloud.iexapis.com/v1/news/article/356ca854-67f0-4e1b-a782-4f1986da535f
381556833186000
39General Electric Still A Mystery
40Seeking Alpha
41https//cloud.iexapis.com/v1/news/article/2a0c3a5c-32fc-4d40-b8e7-458792672cda
42
Sheet661 (2)
Cell Formulas
RangeFormula
A2=SUBSTITUTE(SUBSTITUTE(TRIM(MID(SUBSTITUTE(SUBSTITUTE(A$1,CHOOSE(ROUND(MOD(ROWS(A$2:A2)-0.5,4),0),"datetime","headline:","source:","url:"),REPT(" ",LEN(A$1)),INT((ROWS(A$2:A2)+3)/4)),CHOOSE(ROUND(MOD(ROWS(A$2:A2)-0.5,4),0),"headline:","source:","url:","summary:"),REPT(" ",LEN(A$1)),INT((ROWS(A$2:A2)+3)/4)),LEN(A$1),LEN(A$1))),CHAR(34),""),":","")
 
Last edited:
Upvote 0
Re: How to Extract Multiple Sections of Text in a String, Between Two Repeating Words

Hi,

Again assuming your data string in A1, and assuming you start the extraction in A2, going downwards.

Use this updated formula in A2 copied down:

A
1[{"datetime":1557328805000 headline:"GE CEO says weaker quarters likely after 'good start' to year" source:"Reuters" url:"https://cloud.iexapis.com/v1/news/article/1e7714c0-93f6-4921-af24-f3122a2e809e" summary:"General Electric Co Chief Executive Larry Culp said on Wednesday the company will likely have weaker quarters in the rest of the year after a surprisingly "good start" in the first quarter." related:"GE" image:"https://cloud.iexapis.com/v1/news/image/1e7714c0-93f6-4921-af24-f3122a2e809e" lang:"en" hasPaywall:false} {"datetime":1557307800000 headline:"GE Investors Gather to Elect Revamped Board Air Gripes" source:"The Wall Street Journal" url:"https://cloud.iexapis.com/v1/news/article/dfb8a33f-9194-493e-9b83-9709dc0d18a9" summary:"General Electric Chairman and CEO Larry Culp who joined the company just a year ago is expected to answer questions from investors at the shareholder meeting in Tarrytown N.Y." related:"GE" image:"https://cloud.iexapis.com/v1/news/image/dfb8a33f-9194-493e-9b83-9709dc0d18a9" lang:"en" hasPaywall:true} {"datetime":1557205509000 headline:"IndiGo is making plans for another aggressive push" source:"The Economic Times India" url:"https://cloud.iexapis.com/v1/news/article/6ffe2450-5e9f-4166-b5f6-a26a79f10908" summary:"by Anurag KotokyIndiGo is in talks with Airbus SE for another large plane order in a sign Asia’s biggest budget carrier has no intention of letting up on a blistering pace of expansion.The Indian airline which adds an aircraft to its fleet every week is in discussions to buy a longer-range version of the European planemaker’s newest narrow-body jet according to Chief Executive Officer Ronojoy Dutta.Existing orders will see IndiGo through the next two years Dutta said in an interview with Bloomberg News Editor-in-Chief John Micklethwait. After that the carrier is considering adding new planes including Airbus’s long-distance A321neo LR and the yet-to-be-released A321 XLR he said.“When we order we’ll order in bulk for sure ” Dutta said in New Delhi in his first interview since becoming CEO of the airline in January. “I can’t give you a tentative number but it will be large. We need longer range.”Buying BigThough Dutta didn’t specify the size of IndiGo’s next purchase it’s likely to be a multi-billion-dollar order based on the company’s history." related:"GE" image:"https://cloud.iexapis.com/v1/news/image/6ffe2450-5e9f-4166-b5f6-a26a79f10908" lang:"en" hasPaywall:false} {"datetime":1557085967000 headline:"General Electric: Larry Culp Is The Man (For The Job)" source:"Seeking Alpha" url:"https://cloud.iexapis.com/v1/news/article/87bdd7f9-1aec-4943-835f-7a436483cede" summary:"General Electric reported strong Q1 2019 results that beat the consensus bottom-line estimate. The stock is up big since General Electric reported Q1 2019 resul" related:"GE" image:"https://cloud.iexapis.com/v1/news/image/87bdd7f9-1aec-4943-835f-7a436483cede" lang:"en" hasPaywall:false} {"datetime":1556912727000 headline:"GE to build 500 MW coal power plant in Kosovo" source:"Seeking Alpha" url:"https://cloud.iexapis.com/v1/news/article/cf2c1bf4-2740-4f80-981b-7a68a8ec327f" summary:"A General Electric-led ([[GE]] +2.1%) consortium is selected to build and equip a new 500 MW coal-fired power plant in Kosovo. The new plant designed to m" related:"GE" image:"https://cloud.iexapis.com/v1/news/image/cf2c1bf4-2740-4f80-981b-7a68a8ec327f" lang:"en" hasPaywall:false} {"datetime":1556901660000 headline:"Good News! General Electric Burned Less Cash in Q1" source:"The Motley Fool" url:"https://cloud.iexapis.com/v1/news/article/492e3f89-1206-4547-a8a8-db086e773abb" summary:"Is there light at the end of the tunnel for the troubled industrial conglomerate?" related:"GE" image:"https://cloud.iexapis.com/v1/news/image/492e3f89-1206-4547-a8a8-db086e773abb" lang:"en" hasPaywall:false} {"datetime":1556893560000 headline:"The Zacks Analyst Blog Highlights: Apple Microsoft General Electric Ecolab and Celgene" source:"Zacks Investment Research" url:"https://cloud.iexapis.com/v1/news/article/739fd73b-1e73-4d85-8e8b-7ced62a4e5ea" summary:"The Zacks Analyst Blog Highlights: Apple Microsoft General Electric Ecolab and Celgene" related:"GE" image:"https://cloud.iexapis.com/v1/news/image/739fd73b-1e73-4d85-8e8b-7ced62a4e5ea" lang:"en" hasPaywall:false} {"datetime":1556889257000 headline:"GE to build Kosovo's new 500 MW coal power plant" source:"Reuters" url:"https://cloud.iexapis.com/v1/news/article/f5e2699b-b92e-4aed-aa90-02b557ce5652" summary:"Kosovo and London-listed power firm ContourGlobal said on Friday they had chosen a consortium of General Electric subsidiaries to build and equip a new 500 megawatt (MW)coal-fired power plant in the Balkan country." related:"GE" image:"https://cloud.iexapis.com/v1/news/image/f5e2699b-b92e-4aed-aa90-02b557ce5652" lang:"en" hasPaywall:false} {"datetime":1556877840000 headline:"Key Takeaways From General Electric's Earnings" source:"The Motley Fool" url:"https://cloud.iexapis.com/v1/news/article/356ca854-67f0-4e1b-a782-4f1986da535f" summary:"The company delivered a solid but unspectacular quarter but that might be just what investors need right now." related:"GE" image:"https://cloud.iexapis.com/v1/news/image/356ca854-67f0-4e1b-a782-4f1986da535f" lang:"en" hasPaywall:false} {"datetime":1556833186000 headline:"General Electric: Still A Mystery" source:"Seeking Alpha" url:"https://cloud.iexapis.com/v1/news/article/2a0c3a5c-32fc-4d40-b8e7-458792672cda" summary:"General Electric released its first quarter reports with a little bit of positive news and a little bit of negative news. As one analyst put it Did this change" related:"GE" image:"https://cloud.iexapis.com/v1/news/image/2a0c3a5c-32fc-4d40-b8e7-458792672cda" lang:"en" hasPaywall:false}]
21557328805000
3GE CEO says weaker quarters likely after 'good start' to year
4Reuters
5https//cloud.iexapis.com/v1/news/article/1e7714c0-93f6-4921-af24-f3122a2e809e
61557307800000
7GE Investors Gather to Elect Revamped Board Air Gripes
8The Wall Street Journal
9https//cloud.iexapis.com/v1/news/article/dfb8a33f-9194-493e-9b83-9709dc0d18a9
101557205509000
11IndiGo is making plans for another aggressive push
12The Economic Times India
13https//cloud.iexapis.com/v1/news/article/6ffe2450-5e9f-4166-b5f6-a26a79f10908
141557085967000
15General Electric Larry Culp Is The Man (For The Job)
16Seeking Alpha
17https//cloud.iexapis.com/v1/news/article/87bdd7f9-1aec-4943-835f-7a436483cede
181556912727000
19GE to build 500 MW coal power plant in Kosovo
20Seeking Alpha
21https//cloud.iexapis.com/v1/news/article/cf2c1bf4-2740-4f80-981b-7a68a8ec327f
221556901660000
23Good News! General Electric Burned Less Cash in Q1
24The Motley Fool
25https//cloud.iexapis.com/v1/news/article/492e3f89-1206-4547-a8a8-db086e773abb
261556893560000
27The Zacks Analyst Blog Highlights Apple Microsoft General Electric Ecolab and Celgene
28Zacks Investment Research
29https//cloud.iexapis.com/v1/news/article/739fd73b-1e73-4d85-8e8b-7ced62a4e5ea
301556889257000
31GE to build Kosovo's new 500 MW coal power plant
32Reuters
33https//cloud.iexapis.com/v1/news/article/f5e2699b-b92e-4aed-aa90-02b557ce5652
341556877840000
35Key Takeaways From General Electric's Earnings
36The Motley Fool
37https//cloud.iexapis.com/v1/news/article/356ca854-67f0-4e1b-a782-4f1986da535f
381556833186000
39General Electric Still A Mystery
40Seeking Alpha
41https//cloud.iexapis.com/v1/news/article/2a0c3a5c-32fc-4d40-b8e7-458792672cda
42

<colgroup><col style="width: 25pxpx"><col></colgroup><thead>
</thead><tbody>
</tbody>
Sheet661 (2)

Worksheet Formulas
CellFormula
A2=SUBSTITUTE(SUBSTITUTE(TRIM(MID(SUBSTITUTE(SUBSTITUTE(A$1,CHOOSE(ROUND(MOD(ROWS(A$2:A2)-0.5,4),0),"datetime","headline:","source:","url:"),REPT(" ",LEN(A$1)),INT((ROWS(A$2:A2)+3)/4)),CHOOSE(ROUND(MOD(ROWS(A$2:A2)-0.5,4),0),"headline:","source:","url:","summary:"),REPT(" ",LEN(A$1)),INT((ROWS(A$2:A2)+3)/4)),LEN(A$1),LEN(A$1))),CHAR(34),""),":","")

<thead>
</thead><tbody>
</tbody>

<tbody>
</tbody>

Thank you very much jtakw!

You are clearly very talented -- and very generous with your time.

Any thoughts on the "Text result of SUBSTITUTE is longer than the limit of 50000 characters" error described above?

Thanks again.
 
Upvote 0

Forum statistics

Threads
1,213,482
Messages
6,113,913
Members
448,532
Latest member
9Kimo3

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top