Hello everyone,
Big fan of the Mr. Excel forum, and first time posting. Thanks in advance for any advice!
A small nonprofit that I am helping gave me a dataset that tracks their donations and I am trying to clean it up for them so we can run analytics. However, the dates for each row (1 row per donation) are entered in a variety of numeric and text formats. Below is a sample of the data I have and how I will need it at the end. I prefer formulas, as I don't know VBA, but am willing to follow VBA instructions if provided. I am using Excel from the Microsoft 365 Pro Plus suite.
Also please note, that I have been discussing this over on the ExcelForum.com with partial success (link below). I will copy over formulas that got partway to the end goal outputs.
How to clean up and format very messy date data (mix of text and date)
Here is a sample of my data, where for our reference the first row are the Excel column letters and first column are the Excel rows.
Column
A) a unique identifier number (not shown in table below) so I can merge my changes back to my master dataset.
B) Date: my original date data
C) IsText: If the date is text (true) or numeric (false)
D) ShouldBe: The date in A, as an Excel date formatted here as DD-MM-YY for readability.
E) ShouldBe2: Any non-date text in A is extracted into this column
F) FixDate: Best attempt so far to convert the date, written in unformatted Excel date
G) FixDate2: FixDate reformatted to compare result
H) FixNonDate: Any non-date extraction attempts thus far into a new column.
I) Comments
For dates that are not complete (e.g. missing MM or DD), I don't mind replacing the missing values with 01.
Some of the original dates have date and non-date information. I need these in extracted into separate columns, as shows below. In a few cases, the solution formulas so far extract only part of the non-date text. If we cannot fully fix that formula, that is ok; I can manually find and fix these few instances.
<tbody>
</tbody>
Thanks to AlKey and FlameRetired over on the ExcelForum.com thread. FlameRetired's formulas got me the closest to what I need:
Formula to extract and fix the date
Formula to extract the non-date text
Thanks again for the time and consideration!
Big fan of the Mr. Excel forum, and first time posting. Thanks in advance for any advice!
A small nonprofit that I am helping gave me a dataset that tracks their donations and I am trying to clean it up for them so we can run analytics. However, the dates for each row (1 row per donation) are entered in a variety of numeric and text formats. Below is a sample of the data I have and how I will need it at the end. I prefer formulas, as I don't know VBA, but am willing to follow VBA instructions if provided. I am using Excel from the Microsoft 365 Pro Plus suite.
Also please note, that I have been discussing this over on the ExcelForum.com with partial success (link below). I will copy over formulas that got partway to the end goal outputs.
How to clean up and format very messy date data (mix of text and date)
Here is a sample of my data, where for our reference the first row are the Excel column letters and first column are the Excel rows.
Column
A) a unique identifier number (not shown in table below) so I can merge my changes back to my master dataset.
B) Date: my original date data
C) IsText: If the date is text (true) or numeric (false)
D) ShouldBe: The date in A, as an Excel date formatted here as DD-MM-YY for readability.
E) ShouldBe2: Any non-date text in A is extracted into this column
F) FixDate: Best attempt so far to convert the date, written in unformatted Excel date
G) FixDate2: FixDate reformatted to compare result
H) FixNonDate: Any non-date extraction attempts thus far into a new column.
I) Comments
For dates that are not complete (e.g. missing MM or DD), I don't mind replacing the missing values with 01.
Some of the original dates have date and non-date information. I need these in extracted into separate columns, as shows below. In a few cases, the solution formulas so far extract only part of the non-date text. If we cannot fully fix that formula, that is ok; I can manually find and fix these few instances.
B | C | D | E | F | G | H | I | |
1 | Date | IsText | ShouldBe | NonDate | FixDate | FixDate2 | FixNonDate | Comments |
2 | 04-10-16 | False | 10-Apr-16 | 38994 | 4-Oct-16 | FixDate2 MM & DD flipped. | ||
3 | 09-16-2004 | True | 16-Sep-04 | 42614 | 1-Sep-16 | FixDate2 DD as year, YY dropped, and 1 as new DD. | ||
4 | Nov. ‘02 | True | 01-Nov-02 | 37561 | 1-Nov-02 | This works. | ||
5 | Jan/Feb 2011 | True | 01-Feb-11 | 40575 | 1-Feb-11 | This works. | ||
6 | 8/12/04 (scholarship) | True | 12-Aug-04 | (scholarship) | 38329 | 8-Dec-04 | (scholarship) | FixDate2 MM & DD flipped. FixNonDate column ok. |
7 | Paid | True | Paid | Col E to G are empty (""). Previous solutions had these as #N/A Paid should appear in FixNonDate with FixDate empty (""). | ||||
8 | 03/27/06 | False | 27-Mar-06 | 46447 | 1-Mar-27 | FixDate2 DD as year, YY dropped, and 1 as new DD. | ||
9 | 02/22/04 | True | 22-Feb-04 | 44593 | 1-Feb-22 | FixDate2 DD as year, YY dropped, and 1 as new DD | ||
10 | 2005 | False | 01-Jan-05 | 38353 | 1-Jan-05 | This works. | ||
11 | Deposited in Bratislava in 2001 | True | 01-Jan-01 | 2001 | 23-Jun-05 | in Bratislava in 2001 | Found YY in Date, but entered it as the unformatted Excel date. Partial extraction of non-date text into FixNonDate. |
<tbody>
</tbody>
Thanks to AlKey and FlameRetired over on the ExcelForum.com thread. FlameRetired's formulas got me the closest to what I need:
Formula to extract and fix the date
Code:
=IFERROR(IF(ISNUMBER(B2)*(LEN(B2)=4),DATE(B2,1,1),IF(LEN(B2)>1,IFERROR(LOOKUP(1E+306,--LEFT(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(B2,".",""),CHAR(145),""),"//","/"),"?",""),"(",""),")",""),"$",""),ROW($1:$30))),LOOKUP(1E+306,--RIGHT(B2,ROW($1:$30)))),"")),"")
Formula to extract the non-date text
Code:
=IFERROR(IF(LEN(MID(B2,FIND(" ",B2)+1,99))>4,MID(B2,FIND(" ",B2)+1,99),""),"")
Thanks again for the time and consideration!