April 14, 2017 - by Bill Jelen
Some of the questions that come in are pretty difficult. Today, we have a column of cells. Each cell has some words, then a date, then some more words. The goal is to pull the date portion of that text to a new column. This is a dueling episode with ideas from Bill and Mike.
- Bill's super wide approach:
- Put all 12 months in separate columns
- Use the FIND function to see if this month is in the original text
- To find the minimum starting position, use =AGGREGATE(5,6,…
- A few extra formulas to look for a number 2 or 3 positions before the month
- Mike's approach:
- Use SEARCH instead of FIND. Find is case-sensitive, Search is not.
- Create an function argument array operation by specifying B13:B24 as Find_Text.
- The formula returns #VALUE! Error, but if you press F2, F9, you will see that it is returning an array.
- The first 13 functions in AGGREGATE can not handle an array, but functions 14-19 can handle an array.
- 5=MIN and 15=SMALL(,1) are similar, but SMALL(,1) will work with an array.
- LOOKUP, SUMPRODUCT, CHISQ.TEST, INDEX, and AGGREGATE can handle function array arguments without Ctrl + Shift + Enter
- Mike was smarter by looking to see if 2 characters before the Start is a number, and then grabbing 3 characters before. The extra space is eliminated by the TRIM()
- To get the Title, use SUBSTITUTE function to get rid of the Date text in column C
Bill Jelen: Hey, welcome back. It's time for another Dueling Excel Podcast. I'm Bill Jelen from MrExcel. I'll be joined in by Mike Girvin from Excel Is Fun.
This is our Duel #170: Finding Dates
Hey, welcome back everyone. I had such a great question here and I couldn't solve it. At least I couldn't solve it easily so I went out to Mike Girvin and I said, “Mike, hey, do you have a way to do this?” He said, “Yes, I have a way to do it. Let's do a Duel.”
So, someone on YouTube sent in this data and every single cell in general has something like a document title followed by a date. They wanted to break this data into the document title: what it is, what the thing is and then what the date is. But the dates are completely evil. Like here, it's 20 of January; but down here, there's things where the date might be after the cell, April 9th. Alright, and no matter which way it is we want to find it. And sometimes there's two dates and this is just completely horrible and that it is such a just a mixed-up situation of dates and as possible, do not even have a date show up, alright. So, here's my attempt. Out on the right-hand side, I'm going to put the things I'm looking for. What I really like here is they never abbreviated the month name. I really, really appreciate that. So type in January and I'll drag out here to December like that, and for each cell I want to know can we find =FIND that January. So I'm going to press F4 one, two times to lock it down to just a row, in the text over there in Column A, like that. I'll press F4 one, two, three times to lock it down to the column, alright. And here, it's telling us that January is found in Position 32 and for the other 11 months, it's going to tell us that it's not found at all. In other words, we're getting the Value error now. What I need to do there is I need to find, I need to find the minimum value ignoring all of the value errors. So, unhide this little formula here =AGGREGATE and let's build this just from scratch, =AGGREGATE, what we want is the MIN so that's the number 5, and then Ignore the error values number 6 comma and then all of these cells from January through December. And what that's going to tell us is that's going to tell us where the month happens. And in this case, we're going to get 0, say the month doesn't happen at all.
Alright now, let's unhide the rest of this. So, to handle the situation where here we have 20 January or 1 November, I said the first thing I'm going to do is I'm going to look at where that month starts and go back two cells, two cells, two characters, two characters. And see if that is a number, not so. That's my column here called, Adjust2. Adjust2. And here's what we're going to do. I'm going to say, take the MID of A2 starting it the where in G2-2 for a length of 1, add 0 to it and ask, is that a number or not? Alright, so is that a number. And then, we'll also look for the situation where it is a 2-digit date, so 20th of January. So that is called Adjust3, go back 3 characters from the where. So there's the Where, go back three characters for length of 1, add 0 to it, and see if that's a number, alright? Then we're going to adjust and the Adjusted Where says IF. IF it’s this weird case was 0, we're just going to put a really large value 999; otherwise, we're going to go from G2 and either go back 3, if Adjust3 is True or go back 2 if Adjust2 is True, or if none of those are True, the Where is going to be where the month starts. Alright, now that we know that that Adjusted Where, we'll double click to copy that down. Well, hey now, it's really easy. We're just going to - for the Title, we're going to say take the left of A2, how many characters we want. We want D2-1 because that's the -1 is to get rid of the space at the end. Although I guess the TRIM is also getting rid of the space at the end.
And then for the date, we're going to use the MID. MID for- MID of A2 starting at the Adjusted Where in D2 and go out 50 or whatever along you think it could possibly be, and then the TRIM function, and we will double click to copy that down.
Alright now, the reason I reached out to Mike is I said, I wonder if there's a way that I could replace these 12 columns with a single form, actually these 13 columns with a single form. Is there some way that I could do this using an Array formula? And Mike, of course, wrote that great book, Ctrl+Shift+Enter, on Array formulas. And I tried a few different things and in my mind, there was no way that it could be done. Alright, but you know, let's go ask the expert. So Mike, let's see what you have.
Mike Girvin: Thanks, MrExcel. Hey, and speaking of expert, this was pretty expertly done. You used FIND, AGGREGATE, ISNUMBER(MID. Now, when you sent this question over to me, I went ahead and solved it and it is amazing how similar my solution is to yours.
Alright, I'm going to go over to this sheet here. I'm going to start with figuring out where the start position in this text string is for each particular month. Now the way I'm going to do it is I'm going to, hey, use this SEARCH function. Now, you used FIND, I use SEARCH. Actually probably FIND is better in this situation because FIND is case-sensitive, SEARCH is not. Now normally what we do with either FIND or SEARCH, I say, hey, go FIND, January, comma within this larger text string, that's how we normally use SEARCH Ctrl+Enter, and it counts on its finger: one, two, three, four, five. It says the 32nd character is where it found January.
Now, instead of doing it in many cells across the columns, I'm going to hit F2, come up here and the FIND_TEXT. Notice we gave it 1 item, SEARCH gave us 1 answer. But if I highlight the entire column of month names, now instead of a single item I put many items in there. This is a Function Argument. We're putting an array of items in and so that means we're doing a Function Argument Array operation. Any time you do that, you tell the function, hey, give me 12 answers, 1 for each month. Now this will deliver an array so if I try to Enter this and copy down it's not going to work.
Well, let's go down to any particular cell, F2 and then F9 to look that yes, in fact, it is delivering an array, and look at that. It looks like I F2 up here, forgot to lock it. So I'm going to click on that and F4, Ctrl+Enter, double click and send it down, F2, F9. There we go, that's that array. There's exactly 12 answers and there's the 34 and the 55. Now, from this array, since we always want the actual first month, not the second month, we want whatever the MIN is because those are number of characters in from the left. So I'm going to click Escape, come up to the top F2. I'm going to use the AGGREGATE function. Hey, we would like to use AGGREGATE 5 but no matter how hard you try if you have an array operation and we do here, if you try to put any function 1 to 13, it just doesn't work. But no problem, we have SMALL, so number 15 comma. Any one of those functions 14 to 19, they understand array operations. And once you select 14 or above, this is the screen tip you're working off, not this bottom one with the references. Alright, comma.
The second options here we want to Ignore errors, comma. That number 6 will then tell AGGREGATE to look through here and ignore the errors. It will only see the numbers. And this is one of five functions in Excel: LOOKUP some product, CHI SQUARE TEST, AGGREGATE, and INDEX that actually have a special argument that can handle Array operations without doing any special key stroke. So there is the Array, comma, and then for K we simply put A1. That's our way of getting them in. Close parentheses, Ctrl+Enter, double click and send it down. And so that tells us the position where it found the first month name from this list.
Now, we'll deal with the NUM error at the very end in our final formula. Now, we are going to have to take these and notice that sometimes there's a number before the month and sometimes, like down here in December, there is not. So I'm going to do the same thing MrExcel did. I'm going to go back two characters and check whether it is a letter or in this case a number =MID, there’s the text, comma, the starting position. Well, I want to start at 32 in this case and -2 to go back to and comma. I get exactly one character. Now, if I close parenthesis MID LEFT RIGHT they all deliver text, double click and send it down and we want to check if it's a number. So watch this, the whole column is highlighted. Active cell at the top, I'm going to hit F2. We could do any Math operation to convert text numbers back to number so I'm going to add 0, Ctrl+Enter to populate this edited formula down through the column. Ctrl+Enter. Now, we can ask the question: Is the returned item a number? F2. So now I say ISNUMBER, close parenthesis, Ctrl+Enter. So now we have a pattern of Trues and Falses. Now, remember we need to get the starting position and for 32 we're definitely going to have to subtract 3 and start at that 20 but notice down here, we don't want to subtract any. So our logical test if I hit F2, that will simply be put into the IF Logical Test Argument. If that comes out True comma then I want to jump back 3 comma. Otherwise I want to jump back 0, close parenthesis, Ctrl+Enter to populate that all the way down. Now we can take this number and subtract the number over here to give us our starting position. Active cell at the top F2, I'm putting this inside of MID. There's the text, comma. And can you believe it? All of this to get the start number. So I'm going to click on that B2 and subtract our IF, come to the end comma and I'm just going to put a big number in here, 100, some big number big enough to get all the way to the end, close parenthesis and Ctrl+Enter to populate that all the way down. It looks like we have some extra spaces and that makes sense because right here we went back three, so no problem. Active cell at the top, F2, I'm going to use the haircut function, the diet function. No, the TRIM function to remove extra spaces except for single spaces between words. Come to the end, close parenthesis, Ctrl+Enter to populate that all the way down.
Now, I have the date, oh, except for the NUM. Now, I could come to the top and use IF error but then it would run all of these plus that cell right there and for a small data set, it doesn't matter at all; but, with the goal of efficiency, I'm going to say IF(ISNUMBER and I'm going to click on that cell, that way close parenthesis, comma. The trigger for whether we run the formula is only based on that instead of the entire formula. If that comes out True, we want to run the formula, comma. Otherwise, double quote double quote. That zero link text string will show nothing. Ctrl+Enter, double click and send it down. And now, all we need to do is get the Title. Well, I already have the text that I don't want in here so I'm going to use the SUBSTITUTE function. SUBSTITUTE, there's the text, comma. The old text, it's that right there, comma, the new text. Hey, I want to take that and SUBSTITUTE in nothing. There's our zero link text string, Ctrl+Enter, double click and send it down.
Now, I'm going to come over here to column B, right click, Hide and there we go. Alright, throw it back to MrExcel.
Bill Jelen: Hey, Mike, that is brilliant and I know exactly, exactly where I went wrong. Right here in row 12 when the formula returned the #VALUE error, you pressed F2, F9 to see that it's returning an array. When I got the #VALUE error, I just swore a little bit and said, why isn't this working? Never thought of pressing F2, F9, alright. Also, like that, of course, MIN and SMALL(,1) are the same but the difference is SMALL(,1) will work with an array in the AGGREGATE function. That was a beautiful, beautiful trick. And then, I went through that whole hassle to look at 2 characters before and 3 characters before. You were smart enough to say, “Hey, we're going to go 2 characters before and if so, go back 3 characters.” Worst case you get a space for that extra space and eliminated by the TRIM. And then the cherry on top, using SUBSTITUTE function to get rid of the Date text in column C. What a brilliant, brilliant way to go, alright.
So, I want to thank everyone for stopping by. We'll see you next time for another Dueling Excel Podcast from MrExcel and Excel Is Fun.
Download the sample file here: Duel180.xlsm
Title Photo: dimitrisvetsikas1969 / Pixabay