Working with a formula that pulls data from multiple columns

Dclark015

New Member
Joined
Mar 6, 2023
Messages
17
Office Version
  1. 365
Platform
  1. Windows
Good Afternoon,

I am working on a spreadsheet and could use some help. I have the formulas I need down however there is a lot of data I need to pull and was hoping there is a way to make it easier. The formula I am currently using is =IF('TEST Template'!A6<>"",'TEST Template'!$A$2,"") It does pull from 2 sheets. I am trying to build it so that I can just copy it down the column by hitting the green dot. This works however it's taking way more time than I thought it would to capture everything I need because of my needs.

If possible I need the A6 row to only carry to 360 and then start over as B6 and so forth until I hit AH. For the second item I need to keep the $A$2 locked however I also need the A to change a B when the first A changes to a B and keep that through the 360. If I can just get the first letters as stated I can just go in every 360 and change the second letter as needed.

Any help is great and thanks.
 

Excel Facts

Move date out one month or year
Use =EDATE(A2,1) for one month later. Use EDATE(A2,12) for one year later.
The formula requires only a minor edit in the last step, since the information sought is already created in the correct form by the existing formula, but not displayed. To display it, we use the HSTACK function in the last step. As I reviewed each step to offer some documentation, I noted one thing that should be addressed: I hardwired a "4" in the "header" formula, and that is now changed to
header,DROP(CHOOSEROWS(ary,2),,DaysBeginAryCol-1), where DaysBeginAryCol-1 replaces the 4. This step drops the first several columns, leaving only the "days" columns. Here is a very brief sample to show the revised formual (note that I've inserted some line breaks to make each step more obvious:
MrExcel_20231024 (version 1) (version 1).xlsx
ABCD
1Scheduled DateHUB NameChild NodeMissed Amps
210/23HUB LocationVAALD09D0A1
310/23HUB LocationVAALD0210A1
410/23HUB LocationVAALD0100D1
TEST Daily Template
Cell Formulas
RangeFormula
A2:D26A2=LET(ary,'TEST Template'!A1:O21,DaysBeginAryCol,5, data,FILTER(IF(ISNUMBER(ary),ary,ary&""),INDEX(ary,,1)<>""), numcols,COLUMNS(data), NumDayCols,numcols-DaysBeginAryCol+1, dleft,TAKE(data,,1), dright,TAKE(data,,-NumDayCols), data2,HSTACK(dleft,dright), locrows,LET(a,SEQUENCE(ROWS(data2))*(TAKE(data2,,1)="Node"),FILTER(a,a<>0)-1), locary,MAKEARRAY(ROWS(data2),1,LAMBDA(r,c,IF(ISNUMBER(MATCH(r,locrows,0)),INDEX(data2,r,c),""))), idx,SCAN("",locary,LAMBDA(acc,val,IF(val="",acc,+val))), fintbl,HSTACK(idx,data2), fintbl2,FILTER(fintbl,NOT(ISNUMBER(MATCH(SEQUENCE(ROWS(fintbl)),locrows,0)))), fintbl3,FILTER(fintbl2,CHOOSECOLS(fintbl2,2)<>"Node"), data3,TAKE(fintbl3,,-NumDayCols), data3scol,TOCOL(data3,,TRUE), locnode,TAKE(fintbl3,,2), expln,CHOOSEROWS(locnode,MOD(SEQUENCE(ROWS(locnode)*NumDayCols,,0),ROWS(locnode))+1), header,DROP(CHOOSEROWS(ary,2),,DaysBeginAryCol-1), datecol,TOCOL(IFNA(EXPAND(header,ROWS(data3)),header),,TRUE), final,HSTACK(datecol,expln), FILTER(HSTACK(final,data3scol),N(data3scol)<>0))
Dynamic array formulas.


Documentation isn't supported within these formulas, and as they grow in complexity, that becomes a problem. Here is an expanded version of the formula with comments about each step:
Comments added on LET formula used for Dclark015

LET(
ary,'TEST Template'!A1:O21, // assign data range to variable ary

DaysBeginAryCol,5, // assign the column number of ary where the "days" data begin to variable DaysBeginAryCol

data,FILTER(IF(ISNUMBER(ary),ary,ary&""),INDEX(ary,,1)<>""), // condition the ary array by forcing blanks (which FILTER was considering to be 0's) to be blanks, and then remove rows where the 1st column (Nodes) is empty...and assign this new trimmed down array to variable data

numcols,COLUMNS(data), // assign number of columns in data array to numcols

NumDayCols,numcols-DaysBeginAryCol+1, // determine number of columns that contain "days" information and assign that value to variable NumDayCols

dleft,TAKE(data,,1), // assign 1st column of "data" array (the consolidated version of the original source "ary") to variable dleft (the left side of the "data" array

dright,TAKE(data,,-NumDayCols), // from the right side of the "data" array, take only the columns containing "days" information and assign those columns (which form a multi-column array) to variable dright (essentially the right side of the "data" array

data2,HSTACK(dleft,dright), // combine dleft and dright to form a new data array called data2 (this is identical to the "data" array with a few columns ommitted)

locrows,LET(a,SEQUENCE(ROWS(data2))*(TAKE(data2,,1)="Node"),FILTER(a,a<>0)-1), // create a vertical array called "a" that consists of either 0's or the array row position (of array data2) where the word "Node" appears in column1 of data2...and then feed this array "a" into a FILTER function to remove the 0's, returning a smaller array called "locrows" that shows the row locations in data2 where the Hub Location should be found (assumes the Hub Location always appears in the row before the word "Node" in the original source table ary)

locary,MAKEARRAY(ROWS(data2),1,LAMBDA(r,c,IF(ISNUMBER(MATCH(r,locrows,0)),INDEX(data2,r,c),""))), // a little tricky here...this creates a new array called "locary" that is generated by feeding a sequence of array row numbers for the data2 array into a LAMBDA function. The array is to be 1 column wide. For each array row number, we determine if the array row number can be found anywhere within the small "locrows" array, and if that is TRUE, then we assign the corresponding value from the "data2" array to this position in the "locary" array, and if the evaluation results in FALSE, then a blank is assigned to the "locary" array position. Essentially this step creates a single column array that contains either blanks or the name of the Hub Location in the same position where it appears in array "data2".

idx,SCAN("",locary,LAMBDA(acc,val,IF(val="",acc,+val))), // the "locary" array is fed into a SCAN function to create a new array called "idx". SCAN performs an element by element operation via a LAMBDA function. Here the LAMBDA begins with a blank assigned to its accumulator variable, and if a "locary" array element is blank, then the accumulator value is assigned to that position in the "idx" array, otherwise the value of the "locary" element is assigned to the "idx" array position. This step acts like a "Fill Down" operation so that the "idx" array will look like the "locary" array except where blanks appear in the "locary" array will now have the Hub location.

fintbl,HSTACK(idx,data2), // the data2 array is horizontally appended to the "idx" array (a single column array containing Hub Locations), essentially creating a new first column showing the Hub Location for each row...although this array still has some unneeded rows that need to be removed

fintbl2,FILTER(fintbl,NOT(ISNUMBER(MATCH(SEQUENCE(ROWS(fintbl)),locrows,0)))), // this removes the rows from the "fintbl" array that correspond to the original Hub Location rows (the ones shown in "locrows", which are no longer needed since that information has been transformed into the first column of the "fintbl" array)

fintbl3,FILTER(fintbl2,CHOOSECOLS(fintbl2,2)<>"Node"), // this removes the rows from the "fintbl2" array where the word "Node" appears in the 2nd column of "fintbl2"...again these rows are no longer needed, as they were only used intiallly to determine where we should find the Hub Locations

data3,TAKE(fintbl3,,-NumDayCols), // similar to how the "dright" array was formed above, this takes all of the right side columns containing "days" information from the transformed and consolidated "fintbl3" array

data3scol,TOCOL(data3,,TRUE), // converts the multiple columns of the "data3" array (all of the "days" information) and stacks them into a single column, preserving blank entries

locnode,TAKE(fintbl3,,2), // similar to how the "dleft" array was formed above, this takes the two leftmost columns (containing Hub Locations and Nodes) of the "fintbl3" array...so this step and "data3" just above split "fintbl3" into two separate arrays. Just above, "data3" was transformed into a single column stacked array, so it now becomes necessary to stack this "locnode" on top of itself multiple times to correspond to information in "data3scol"

expln,CHOOSEROWS(locnode,MOD(SEQUENCE(ROWS(locnode)*NumDayCols,,0),ROWS(locnode))+1), // as alluded to in the previous step, this step will expand the Hub Location-Node array ("locnode") multiple times. Starting inside this formula, SEQUENCE creates a vertical array whose length is equal to the length of "locnode" multiplied by NumDayCols, and the sequence begins at 0. If we then MOD this sequence with number of rows in "locnode", we'll get a long array consisting of repeating sequences {0;1;2;...;ROWS(locnode)-1}. And if we then add 1 to this result, we'll generate a long array of repeating sequences {1;2;3;...ROWS(locnode)}, which are the array row numbers of "locnode" that correspond to the information in "data3scol". CHOOSEROWS is used to perform this row extraction from "locnode", to form the "expln" array whose rows correspond to the rows in "data3scol".

header,DROP(CHOOSEROWS(ary,2),,DaysBeginAryCol-1), // returns to the original "ary" data array and extracts the 2nd row, and then drops the columns to the left of the 1st column where "days" information begins, leaving a horizontal array of dates (assumes the dates appear in row 2 of "ary")

datecol,TOCOL(IFNA(EXPAND(header,ROWS(data3)),header),,TRUE), // just like a scheme was needed for repeating the Hub Location-Node rows, a similar scheme is needed to repeat the date headers so that they will correspond to the single column data3scol array (the "days" information). EXPAND creates an array with the date "header" as the first row and then adds more rows for a total number of ROWS(data3) rows, which means the array length matches the length of the data3 array. By default, #N/A errors are introduced when the expanded array length exceeds that of the array to expand. This behavior is exploited by trapping it with an IFNA function to replace the error with the corresponding element from the date "header" array. The result is a date "header" array repeated for each row in "data3". We then convert this array of data headers into a single stacked column, taking each column from left to right and stacking it below the one to its left (same effective stacking order used when "expln" was formed)

final,HSTACK(datecol,expln), // combines the date single column array "datecol" with "expln" to create a three column array of Date, Hub Location, and Node that correspond to every row in "date3scol"

finfltr,FILTER(final,N(data3scol)<>0), // the original final FILTER operates on the "final" three-column array of Date, Hub Location, and Node and retains only those where "data3scol" is not 0. But recall that "data3scol" is the information found within the "days" columns. So here it is used only to determine whether a row of information (Date/Hub/Node) should be displayed.

If you want to see the "days" information, then this last step of the formula is modified. So before we feed the "final" array into FILTER, we'll first HSTACK "final" with "data3scol" and then perform the FILTER on the 4-column array...so the new final step will be...
finfltr,FILTER(HSTACK(final,data3scol),N(data3scol)<>0),

finfltr)
 
Upvote 1
Solution
You’re welcome…I’m happy to help.
 
Upvote 1
Starting with some notional data in the 'TEST Template' sheet, such as this:
MrExcel_20231024.xlsx
ABCD
1
2FV in A2FV in B2FV in C2FV in D2
3
4
5
6a6b6c6
7a7c7d7
8b8d8
9a9b9
10b10c10
11a11c11d11
12b12c12
TEST Template

The following creates a new array based on the conditional you've described, replacing any values with the fixed values in the corresponding columns while retaining blanks. It's not clear to me what you want to do with the result. I'm assuming you want to stack each of the array columns to create a single long column, but let me know if that is not correct.
MrExcel_20231024.xlsx
A
1Results
2FV in A2
3FV in A2
4
5FV in A2
6
7FV in A2
8
9FV in B2
10
11FV in B2
12FV in B2
13FV in B2
14
15FV in B2
16FV in C2
17FV in C2
18
19
20FV in C2
21FV in C2
22FV in C2
23
24FV in D2
25FV in D2
26
27
28FV in D2
29
Dclark015
Cell Formulas
RangeFormula
A2:A29A2=LET(ary,'TEST Template'!A6:D12,fv,'TEST Template'!A2:D2,cnum,COLUMNS(ary),rnum,ROWS(ary),aryfv,MAKEARRAY(rnum,cnum,LAMBDA(r,c,INDEX(fv,1,c))),arymap,MAP(ary,aryfv,LAMBDA(a,b,IF(a<>"",b,""))),TOCOL(arymap,,TRUE))
Dynamic array formulas.
 
Upvote 0
Thanks for looking into this. I'll add some pictures and more information as best I can. I'm not able to install XL2BB because of work PC configurations. I get an error saying this can't be used in protected mode and the store is locked out as well.

The information on this sheet (Named TEST Template) is where I am pulling the data from and then adding it to the second sheet (named TEST Daily Template) The data I am trying to pull is if E6(Sixth Cell under Monday that has a 1) has a number in it then it will pull the date from E2(First Date under Monday). This will return the date on A2 on the sheet "Test Daily Template".

The issue that I am having is that the Hub locations and cells below them go to row 400. Having to copy it down and then going to each line after about 20 is becoming a very long process. The picture below only shows a 5-day set of columns however the dates will actually cover the fiscal month 22-21st minus weekends. Thus causing me to create code/formulas for all of those cells as well to go onto that second page. Once the formula is completed I would like to also add something to hide a line if it was blank so that any other cell filled would flow to the top.

Yesterday while working on this I am also working on other Ideas on how to build (maybe a pivot) to get my information easier and more automated. My issue is the person who created the First sheet is on vacation and I won't be able to see how he got the data to it and from where or if it was all just cut and pasted from other files.

This is a lot of figuring out. If it just sounds too complicated I'm good with that answer also and I will work on getting it from another location or use a pivot that one of my managers had suggested.


SHEET 1(TEST Template)
OP TicketDayMondayTuesdayWednesdayThursdayFriday
Date10/2310/2410/2510/2610/27
Max Quota
HUB LocationPassives
Node
VAALD09D0Ay111
VAALD0210Ay111
VAALD0100Dy11
VAALD02M0Cy11
VAALD0520Cy33
OP TicketDayMondayTuesdayWednesdayThursdayFriday
Date10/2310/2410/2510/2610/27
Max Quota
Different HubPassives
Node
MDFKD0570D11
MDMED02S0C1
MDDSD06V0A1
MDDSD02R0CY22
MDFKD03J0AY22


SHEET 2 (TEST Daily Template) Where data is being collected.
Scheduled DateBusiness Partner NamePrimary Node ReferenceHubCityChild NodeTier ClassificationTask Type
10/23/2023GEXHUB LocationVAALD09D0A2A Proactive Digital
10/23/2023GEXNodeVAALD0210A2A Proactive Digital
 
Upvote 0
Thanks for providing some clarification. What distinguishes the upper set of source data from the lower one? I see "HUB Location" and "Node" to the upper left of the 1st group, and "Different Hub" and "Node" to the upper left of the 2nd group. Is any of that information to be carried through and applied to the entire set below it? I ask because in the Results table, there is a column called "Primary Node Reference" that appears to use the 1st label (HUB Location) for the 1st item in the 1st set of data, but then the 2nd label (Node) is used for the 2nd item in the 1st set of data. I'm trying to understand where the other information in the Results table comes from (GEX, node information, 2A Proactive Digital, etc.).

The approach I described above could be easily adjusted to remap one table with the fixed value substitutions, but your results table seems to be pulling other information into it...and the approach I described doesn't do that. Depending on your answer to the question above, it might be easier to port the data over to Power Query and transform it there to create a final results table that would then be written back onto an Excel worksheet...a VBA solution would work too. But please post back to explain where the other information comes from.
 
Upvote 0
You're Welcome and thanks for continuing to look at this...

The source data in Hub location and Different Hub are geological areas. Such as in California or Colorado, the "node" name will go into the "Child Node" location on the second sheet. When a "node" is listed as needing work done in it then a number 1 will be placed under the date the work will be completed and the formula pulls that date and adds it under "Scheduled Date".

I'm trying to understand where the other information in the Results table comes from (GEX, node information, 2A Proactive Digital, etc.).

For this sheet, all work is done by GEX. I have a formula built so when the date shows under "Scheduled Date" then it puts in GEX under "Business Partner Name". I also have the same type of formula for the "Tier Classification" column.
For the cells under "HUB" I have that built to pull the "Hub" location. The column with "primary node reference" will be left blank.

There are a total of 25 "Hub Locations" below the two I listed as well. Each one has a total of 20 spots to use for the nodes.
 
Upvote 0
Okay...I was thinking about trying to form much of the output table at once. I'm assuming:
  1. the "nodes" are the text strings like VAALD09D0A, etc.
  2. that a given node appears only one time in the worksheet
  3. you want the results table to be a continuous one, where any nodes that have scheduled work will appear one after the other, regardless of whether those nodes are found in the 1st subtable, 2nd subtable, etc.
  4. that any value greater than 0 in the subtables means that work is scheduled for that day, therefore the node and its associated HUB Name should be included in the summary
  5. that the word "Node" appears above each subtable and that the HUB Name for the subtable appears immediately above the word "Node" (this is used to identify where the subtable begins so that the HUB Name can be filled down for each item (node) in the subtable).
The following assumes your data layout looks like the sample below (please check, as I noted some cell spacing issues--I think--in your post #4).
Then a single long formula transforms the input range (the "ary" variable which you'll adjust to your actual range reference) into a three column output summarizing Scheduled Date, HUB Name, and Child Node. This is probably better done in Power Query, but I was curious about using some of the newer Excel functions to essentially unpivot your source data and fill down information common to each subtable.
MrExcel_20231024.xlsx
ABCDEFGHI
1OP TicketDayMondayTuesdayWednesdayThursdayFriday
2Date10/2310/2410/2510/2610/27
3Max Quota
4HUB LocationPassives
5Node
6VAALD09D0Ay111
7VAALD0210Ay111
8VAALD0100Dy11
9VAALD02M0Cy11
10VAALD0520Cy33
11
12OP TicketDayMondayTuesdayWednesdayThursdayFriday
13Date10/2310/2410/2510/2610/27
14Max Quota
15Different HubPassives
16Node
17MDFKD0570D11
18MDMED02S0C1
19MDDSD06V0A1
20MDDSD02R0CY22
21MDFKD03J0AY
TEST Template

MrExcel_20231024.xlsx
ABC
1Scheduled DateHUB NameChild Node
210/23HUB LocationVAALD09D0A
310/23HUB LocationVAALD0210A
410/23HUB LocationVAALD0100D
510/23HUB LocationVAALD02M0C
610/23HUB LocationVAALD0520C
710/23Different HubMDFKD0570D
810/23Different HubMDMED02S0C
910/23Different HubMDDSD06V0A
1010/23Different HubMDDSD02R0C
1110/24HUB LocationVAALD09D0A
1210/24HUB LocationVAALD0210A
1310/24Different HubMDDSD02R0C
1410/25HUB LocationVAALD09D0A
1510/25HUB LocationVAALD0210A
1610/25HUB LocationVAALD0100D
1710/26HUB LocationVAALD0520C
1810/27HUB LocationVAALD02M0C
1910/27Different HubMDFKD0570D
TEST Daily Template
Cell Formulas
RangeFormula
A2:C19A2=LET(ary,'TEST Template'!A1:I21, data,FILTER(IF(ISNUMBER(ary),ary,ary&""),INDEX(ary,,1)<>""), data2,CHOOSECOLS(data,1,5,6,7,8,9), locrows,LET(a,SEQUENCE(ROWS(data2))*(TAKE(data2,,1)="Node"),FILTER(a,a<>0)-1), locary,MAKEARRAY(ROWS(data2),1,LAMBDA(r,c,IF(ISNUMBER(MATCH(r,locrows,0)),INDEX(data2,r,c),""))), idx,SCAN("",locary,LAMBDA(acc,val,IF(val="",acc,+val))), fintbl,HSTACK(idx,data2), fintbl2,FILTER(fintbl,NOT(ISNUMBER(MATCH(SEQUENCE(ROWS(fintbl)),locrows,0)))), fintbl3,FILTER(fintbl2,CHOOSECOLS(fintbl2,2)<>"Node"), data3,TAKE(fintbl3,,-5),data3scol,TOCOL(data3,,TRUE), locnode,TAKE(fintbl3,,2),expln,CHOOSEROWS(locnode,MOD(SEQUENCE(ROWS(locnode)*5,,0),ROWS(locnode))+1), header,DROP(CHOOSEROWS(ary,2),,4),datecol,TOCOL(IFNA(EXPAND(header,ROWS(data3)),header),,TRUE),final,HSTACK(datecol,expln),FILTER(final,N(data3scol)<>0))
Dynamic array formulas.
 
Upvote 0
This worked perfectly!! Thank you. I am now able to pull all of the data which goes to I337. I made the change and it worked but if I go past I ( could go to AH but not needed) it breaks it. I'm not sure about the "ARY" you said I could change for my range reference. However, it works great for what is needed and now I can easily turn that data into a pivot or extrapolate any of the A, B, and C columns into another sheet to make it fit the format that is being used.

Thank you again your help is greatly appreciated.
 
Upvote 0

Forum statistics

Threads
1,215,722
Messages
6,126,464
Members
449,315
Latest member
misterzim

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top