Sum and Count Numbers

mark r

New Member
Joined
Dec 31, 2011
Messages
4
Hi All,

Looking for an Excel solution (I have Excel 2010-Mac) to the following problem:

I have a list of 20,000 numbers (deciphering bioinformatics data) and would like to have a formula that would be able to perform a Sum to a specific number and then tell me how many cells were used to reach that Sum.

For a better understanding, lets say that my list of numbers is arranged from high-to-low (this is how it would normally be set-up):

10,000
3,000
500
500
300
250
250
200
175
150
50
10
1
1
1

And I specified that I want a Sum = 14,100 (but not greater than). How would I write a formula to both calculate a sum (using numbers from high-to-low) and then also tell me how many of the numbers were used to reach that sum. The Sum does not have to perfectly equal the number I specify, just as close as possible without being greater than the specified number. In this case, the formula should report back that 4 cells were used to reach a Sum of 14,000.

Also, a real bonus would be if I could sum an entire set of numbers and then be able to specify the decile percentages (say by 10% increments) and then have a sum function that Sum to each of the decile percentages and tell me how many numbers were used to reach that decile number.

For example: let's say that I have a total of 10,000 individual numbers that when Sum = 300,000 and that I want to set-up deciles by 10% increments so that the Top 90% =30K, Top 80% = 60K, Top 70% = 90K, etc. How would I write a formula that would tell me how many of the cells were required to reach (but no exceed) from my ranked high-to-low list of 10,000 numbers the desired number, i.e., 30K, 60K, 90K, etc.

Any help would be greatly appreciated.

If this works, I would certainly include you as an acknowledgement in any publication of the scientific data.

Thanks!

Mark
 

Excel Facts

Will the fill handle fill 1, 2, 3?
Yes! Type 1 in a cell. Hold down Ctrl while you drag the fill handle.
Well, here's part of your solution.

With your posted data in A1:A15
and
C1: a target value....14,100

This regular formula returns the max count of consecutive cells that, when summed, approach or equal that value...but do not exceed it.
Code:
C2: =MATCH($C$1,INDEX(SUBTOTAL(9,OFFSET(A1,,,1,ROW(A1:A20)-2)),0),1)

Using your data, that formula returns: 4
The sum of 10,000+3,000+500+500 equals 14,000.
The next cell contains 300, which would cause the total to exceed 14,100

Oh, and this formula returns the sum through that 4th cell
Code:
D2: =SUM(A1:INDEX(A:A,C2))
which, in this case, is: 14,000

Does that help?
 
Last edited:
Upvote 0
Hi Ron,

Happy New Year and Thank You for the quick turn-around and solution!!!! That formula should work perfect for calculating a lot of bioinformatics data. I'm an immunologist (not a bioinformatician) and so have a "basic" understanding of the power of Excel. Most "biologists" use Excel to graph-out data, but rarely have training or an understanding of the power of Excel in data processing outside the standard cookie-cutter functions and statistical tests. I'm holding true to my offer on acknowledgement in any publications from the data set using the formula and Excel assistance. Please feel free to send me your proper credits to my email address:

bmtlabs (at) gmail.com

Thanks

Mark
 
Upvote 0
This gives a different answer.
8 list entries sum to 14,063.

(Ignore the underscores and -- below.)
your list is in A2:A16, columns B and E are blank.
The target value of 14,100 is in D1
C2 has the formula =(A2<=D1)*A2, dragged down
D2 has the formula = D1-C2, also dragged down.

The result in F1 =SUM(C:C) = 14,063 is the sum of A2,A3,A4,A5,A12,A13,A14,A15
In F2, =COUNTIF(C:C,">0") = 8 is the count of list items that make up that sum.

- List - -- target= 14,100 -- 14,063
10,000 -- _10,000 _4,100 -- 9
_3,000 -- __3,000 _1,100 --
___500 -- ___500 ___600 --
___500 -- ___500 ___100 --
___300 -- _____0 ___100 --
___250 -- _____0 ___100 --
___250 -- _____0 ___100 --
___200 -- _____0 ___100 --
___175 -- _____0 ___100 --
___150 -- _____0 ___100 --
____50 -- ____50 ____50 --
____10 -- ____10 ____40 --
_____1 -- _____1 ____39 --
_____1 -- _____1 ____38 --
_____1 -- _____1 ____37 --

(BTW, are you using Excel 2011 on a Mac or have you installed Windows on your Mac?)
 
Last edited:
Upvote 0
The count of terms is 9.
The formula results in 9.
But my fingers typed the wrong digit twice above.
 
Upvote 0
Hi mikerickson,

Thank you for the reply; sorry for the mis-type, I am using Excel 2011 on Mac OS 10.7. I see how your approach works, but what I am interested in is obtaining a number based on arranging the data from highest-to-lowest and once the target number is reached (based on going from high-to-low) to report the number of cells required to reach that target number.

Ron, quick question for you. I have entered your formula into Excel and am able to repeat the example of 14,100, but if I place the value of 5,000 in-place of 3,000, the read out is still 4 cells (when it should only be 1) to reach (being less than or equal to 14,100) the target value of 14,100.

Below is a slice of my real data set. As a side note, I am looking at genomic copy number of a specific gene in human T cells. The data set is 22,000 individual/unique data points (or gene copies) that sums to 351,950 total copies (the same amino acid gene sequence is expressed multiple times in the sequence, hence, referred to as Copy Number ... I have a particular gene sequence of interest that is expressed 39 times in the data set, and so on).

Dataset (A1:A29):

39006
17944
7627
6797
1823
1765
981
679
677
629
586
580
526
522
483
463
456
412
357
355
355
343
334
328
326
323
319
313
257

From this slice of the data set, if I apply your formula (below) in C2 and request a target number of 77,928 (in C1), which is the Sum of the first 10 numbers, I still get a read-out of 4. Is there a reference in the formula that I am missing?

=MATCH($C$1,INDEX(SUBTOTAL(9,OFFSET(A1,,,1,ROW(A1:A20)-2)),0),1)

Without question, I greatly appreciate any assistance. :--)

Thanks and Happy 2012

Mark
 
Upvote 0
Here you go:
Code:
C2: =MATCH($C$1,INDEX(SUBTOTAL(9,OFFSET(A1,,,ROW(A1:A1000))),0),1)

The formula I originally posted had a typo.
The OFFSET function has 5 arguments:
OFFSET(range_reference, rows_to_offset, cols_to_offset, height, width)

In the original formula:
=MATCH($C$1,INDEX(SUBTOTAL(9,OFFSET(A1,,,1,ROW(A1:A20)-2)),0),1)

I had set the height argument at 1
and was calculating a variable width: ROW(A1:A20)-2


The new formula corrects that by calculating a variable height.
I omitted the 5th argument because we are using the same width as the referenced range...1 col.

BTW...I wrapped the SUBTOTAL function in an INDEX function so we could commit the formula just using ENTER. Without the INDEX function it would have been an array formula that required using CTRL+SHIFT+ENTER.
 
Upvote 0

Forum statistics

Threads
1,215,639
Messages
6,125,968
Members
449,276
Latest member
surendra75

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top