Consolidate Data, Combine Text, and Calculate Age

VtQuest

New Member
Joined
Oct 13, 2020
Messages
24
Office Version
  1. 365
Platform
  1. Windows
I am working with a spreadsheet regarding housing for homeless people. The people are represented by ID numbers. There are many duplicates. I want to sum SOME of the values associated with the IDs while also combining text notes. The values I wish to sum are Number of Nights. I wish to combine the text in the Causes column and in the Housing Authorized column (separately). I’d also like to calculate age from birthdates, and number of calls based on call date. (Call date is not listed in the first example.)

I have familiarity with Excel in its simple form, not at this level. (I’m sure, to some people, this is pretty simple, but not to me - yet.) So, I’m not familiar with many formula terms or query tools. Please respond with this in mind.

The first image is a sample of the data, with identifying information obscured. The second image is of the desired result.

Thank you for your help.
Sample Data.PNG
Result Desired.PNG
 

Excel Facts

Fastest way to copy a worksheet?
Hold down the Ctrl key while dragging tab for Sheet1 to the right. Excel will make a copy of the worksheet.
Hi *VtQuest,
Welcome to the forum.

As you are using Excel 365, there is a new function called TEXT JOIN which would be ideal to concatenate the text columns you need. Unfortunately, I do not have that version of Excel, so I can only offer a UDF (User Defined Function) as an alternative.

I was reluctant to respond to this thread too quickly as I was hoping another board member with 365 would provide a solution.

I think assisting the homeless is a fantastic thing to do so I’m keen that you should get a suitable solution. But that is likely to come from another board member who has the TEXT JOIN function on their version of Excel.
In the meantime, I can offer a UDF solution along with a few other formula.
Credit to Aladin Akyurek for the UDF.

I have uploaded a UDF version of the file to Dropbox.
Forum Q.xlsm

A few notes:
Column N header says “Date” but under that heading are the numbers 5, 2 & 4 which are clearly not dates. Is the column N intended to be the number of occurrences? I have assumed it is and therefore counted the occurrences of each ID number.

To view the UDF on the downloaded file, hold your cursor over the Sheet1 page tab, right click on the page tab and select “View Code”, the UDF is in “Module1”. This UDF is used for the formula in columns P,Q & R

In the formula in columns P, Q & R I have added “CHAR(10)” which acts as a word wrap to keep each concatenated set of words on separate lines. I’m not sure if you wanted this.

Because the text in columns “P, Q & R” are, may I say, a bit wordy. May I suggest the text is substituted with a shortened code. There are some suggested examples on the second page of the downloaded file. If we continue to use the long text words, then the row height will grow as the amount of text increases. Using a shortened code will assist in keeping the row height consistent.

Hope this helps
Paul
 
Upvote 0
Hi Paul,

Thank you so much for your help and suggestions. We do help the homeless by connecting them with referrals to services. We also provide after-hours/weekend/holiday intake for the state agency that provides motel housing for homeless folks. What I'm working on is a annual (fiscal year) report on the data we collect.

The sample sheet I included in my query is only a fraction of the data -- the actual sheet runs to over 4,000 rows.

Unfortunately, my lack of knowledge is such that I need a bit more basic information:
- Should I set up the formula section on a separate page from the data sheet?
- Should I copy and paste the formulas from the View Code page? Or from the response sheet you gave me above?
- I'm not sure how to do that. How do I reference the source data?
- I guess I just need to know what are my next steps.
I'm sure I'll have other questions...

I appreciate your patience and your willingness to help me.

Many thanks,
Beth
 
Upvote 0
Hi
I assume you may prefer to have the report on a different page tab. I have set it up for separate pages and adjust the formula range to 5000, just so you have enough room for the expected data.

I have changed the page tab names to “Data” and “Report”
When we move the report to separate page tab, the formula has to be adjusted to point to Sheet1 which is now called “Data”
So typically, this formula for ”# of Nights” was
=SUMIFS(INDEX($G$2:$G$23,0),$A$2:$A$23,$B2)

Has now become:-
=SUMIFS(INDEX(Data!$G$2:$G$23,0),Data!$A$2:$A$23,$B2)

In other words, I have referenced the Data page by adding Data! In front of the range, so the formula knows which page to look at.

If you want to transfer the formula and code to a new workbook, that can be done but you will have to create a new module and paste the UDF (User Defined Function) into the new module. If you would like to do this, let me know and I can explain in more detail.

Also, you need to be aware that some of the formula are array formula, they can not be entered by clicking the enter key. You will need to press Ctrl Shift Enter. If you do this correctly, there will be curly brackets at the start and end of the formula, you cannot enter these brackets manually, it is a feature specific to array formula.

I think the easier route would be for me to set it up and then you can copy the source data to the new workbook. I have used the same header names but I have a feeling you have provided an abridged version and there may well be more columns in the real sheet, if this is the case, please let me know so we can get it set up correctly.

Revised file
Forum Q - Rev1.xlsm

cheers
Paul
 
Upvote 0
Hi Paul,

Thank you, again, for your help. I'm following your logic about the formula, but not quite sure I understand how the repeated IDs will be consolidated.
I hesitate to paste it here, as this is confidential information.

Take Care,
Beth
 
Last edited by a moderator:
Upvote 0
Hi Beth,
Just picking up on your comment
“not quite sure I understand how the repeated IDs will be consolidated.”

One way to check what is happening with the consolidation is to look at ID number 0587 on the workbook called “Forum Q–Rev1.xlsm”
The Report page, column C indicate there are 7 occurrences. – You can count them on the Data page.
Column E is displaying: Chronically homeless & Eviction without cause & Kicked out by family/friend
Column E does not show repeated items. The actual occurrences are:

Chronically homeless occurs 4 times
Eviction without cause occurs 1 time
Kicked out by family/friend occurs 2 times
Total = 7

Try changing some of the words on the Data page in one of the rows for ID 0587 and with each change you make, take a look at the report page to see how it is being consolidated.
I’m not sure if this addresses your question but I hope so.

Regarding the confidential nature of the data. I totally understand that you cannot send confidential data. I only need to check the formulas in the file I sent to you are actually referencing the correct columns on the Data page. In other words, are the columns on the Data page in the file I sent to you the same as the columns in your real file?

Cheers
Paul.
 
Upvote 0
Hi Paul,

You're correct - there are more columns than shown in my sample page. Here are the actual columns:

Client IDTransaction DateCaller Date of Birth (for ACCESS use only)Cause of homelessnessChoose ALL of the criteria that applies to client (if ANY)Group/individual TypeHousing authorized?How Many Children? (enter 1,2,3 or more)LocationNumber of nights housed (enter 1,2,3,4 or 5)

I also like the short codes you've used - I'll replace the existing phrases with those.

In terms of the explanation of how the data associated with the ID numbers are consolidated, what do I enter into the left column (yellow cells)? Do I copy and paste the entire list of ID numbers, repeated or not? If I'm not to paste the entire list in, how do I get a simplified list of the ID numbers to input into the left column? I hope this question makes sense.

Finally, is there a way to count the number of times each cause of homelessness is listed? And the number of times each Housing Authorized decision is listed?

Thank you so much!

Beth
 
Upvote 0
Hi Beth,
The new columns show that some of the fields the formula are looking at will need to shift over. I will adjust them later tonight.

Question.
If I am to “count the number of times each cause of homelessness is listed? And the number of times each Housing Authorized decision is listed?”
Is this a count per ID number or a total of all of them in the whole report?
If it is for the whole report, I can add a header row for totals. If it is per ID number, I will need to put my thinking cap on.

Regarding the way you enter the data into the yellow cells
Option 1 = Type the ID number in the yellow cell individually – Way too boring!
Option 2 = Copy and paste unique numbers – follow the method below:-

On the Data page, copy all the ID numbers from row 2 to as far down as they go. You only need to copy the single column.
Paste this column of data elsewhere (a new page, just so they are out of the way)
On the new page, select the range again (unless they are already selected)
Go to the Data Ribbon and about half way across click on the “Remove Duplicates”
In the pop up box, tick the column number where your data is and click OK
Another box will appear telling you how many duplicates were removed

You now have a unique list of ID numbers with all the duplicates removed.

At this stage it would be prudent to sort the list so the ID numbers are in order, low to high or high to low, whichever you prefer.
You can then copy and paste them into the yellow cells on the report page. If required, you can re-colour the yellow cells because pasting the data will overwrite the yellow colour.
The yellow cells were to act as an Aide-mémoire, to help avoid accidentally over typing the formula, you don’t really need them to be coloured.

Cheers
Paul
 
Upvote 0
Hi Paul,

Good questions: I was hoping to count those numbers (# causes, #Auth response) for each ID, but if it's too much of a hassle, then it doesn't have to be in there.

Thanks for the instructions to remove duplicates. Now, I remember having done that years ago.

Many thanks, as always!

Beth
 
Upvote 0
Hi Paul,

Another question: Is it possible to remove all but the name of the town from the Location Column?
Here is typical text in that column: "AHS District Office (3 letter code): Jdo; county: Caledonia; town: St Johnsbury"
I know how to find and replace, but am not sure how to replace all but town. I'm sure it's pretty simple, but not within my experience. Maybe there's a good basic Excel book you can suggest?

Best Regards,
Beth
 
Upvote 0

Forum statistics

Threads
1,215,446
Messages
6,124,900
Members
449,194
Latest member
JayEggleton

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top