Challenging Filtering vs excel freezing issue

ihavequestions256

New Member
Joined
Sep 13, 2017
Messages
7
Hello!

I tried a prior thread which alas did not fully work out (more on this below), but I am still struggling so here goes.

I have this large dataset for analysis that has several columns (subject, age, gender, protein, peptide, intensity). I've just made a minor representative subset below. I'm trying to sort/filter the data such that only the proteins with more than one level in the peptide column are shown. Below for example if I set the limit as 2 peptides or above, then protein ABC would be excluded as there's only one peptide listed (though I would like to be able to freely choose my threshold).

ProteinPeptideIntensity
XYZXYZ STRPZW24
XYZXYZ SQERT24.3
ABCABC DEZFEF25
QRSQRS FLEMP23
QRSQRS BESOE23.5

<tbody>
</tbody>


I'm able to make a pivot table which shows me how many unique peptides there are, but I'm lost from that point. The dataset is so large there are literally 1000 proteins that have only one peptide, and I'm sure there's a smarter way to do this than for me to deselect each of these in the data "sort" tool. In my prior thread someone was very helpful and suggested I use the "countif" function to count the number of rows for each protein and then sort based off that.

I had initially thought it worked, but then realized when I tried to sort after using that conditional function, excel just freezes on me (~300,000 rows x 6 columns of data to sort with a conditional function). Excel automatically jumped to using 4 processors to no avail.

Any extra suggestions would be greatly appreciated!
 

Excel Facts

Format cells as time
Select range and press Ctrl+Shift+2 to format cells as time. (Shift 2 is the @ sign).
ProteinPeptideIntensitycount2< < <YOUR CRITERIA
XYZXYZ STRPZW24212
XYZXYZ SQERT24.321
ABCABC DEZFEF251
QRSQRS FLEMP23215
QRSQRS BESOE23.521THE LAST COLUMN INSERTS ROW NUMBER
EFGEFG XXXYYZ231IF A 1 IS ON TOP OF ANOTHER 1
VWXVWX AABBCC23.13
VWXVWX BBCCDD23.23
VWXVWX CCDDEE23.33
15#####QRS#####=IF(ISERROR(LARGE($F$2:$F$10,A15)),"",LARGE($F$2:$F$10,A15))
22XYZ
3
4
5
6
7
8
9
10
QRS from
=IF(B15="","",OFFSET($F$1,MATCH(B15,$F$2:$F$10,0),-5))
if you set 3 or more you will have to play around with the row number
only appearing for a 1 sitting on two 1's etc

<colgroup><col><col><col span="13"></colgroup><tbody>
</tbody>
 
Upvote 0
I hope this helps. I don't usually work with them, but I somewhere learned to filter the pivot table using cells adjacent to the table.

Using a pivot table with 'Protein' in the Rows field and 'Count of Peptide' in the Values field (Grand Totals off):

Select the cell immediately above the 'Count of Peptide' header and also the cell with the header. If the header is in B3, you want to select B2:B3. With B2 as the active cell, bring up the right-click menu. Select Filter >> Filter by Selected Cell's Value. The pivot table may disappear as the filter button appears in B2.

You can now filter the rows in the pivot table. You'll notice the header cell, 'Count of Peptide', appears as one of the check boxes when you make your filtering choices.
 
Last edited:
Upvote 0
Thank you for your help. Could you breakdown what's going on in the cells a bit more for me? I'm not quite getting what's going on here.
 
Upvote 0
I hope this helps. I don't usually work with them, but I somewhere learned to filter the pivot table using cells adjacent to the table.

Using a pivot table with 'Protein' in the Rows field and 'Count of Peptide' in the Values field (Grand Totals off):

Select the cell immediately above the 'Count of Peptide' header and also the cell with the header. If the header is in B3, you want to select B2:B3. With B2 as the active cell, bring up the right-click menu. Select Filter >> Filter by Selected Cell's Value. The pivot table may disappear as the filter button appears in B2.

You can now filter the rows in the pivot table. You'll notice the header cell, 'Count of Peptide', appears as one of the check boxes when you make your filtering choices.

Thanks @thisoldman for your suggestion. This I feel comes close, but how do I go from the now filtered pivot table to similarly filtered source data (now without any proteins with only 1 peptide) with all the columns, not just the protein and peptides? (subject, age, gender, condition as well)
 
Upvote 0
ProteinPeptideIntensitycount2< < <YOUR CRITERIA
XYZXYZ STRPZW24212
XYZXYZ SQERT24.321
ABCABC DEZFEF251
QRSQRS FLEMP23215
QRSQRS BESOE23.521THE LAST COLUMN INSERTS ROW NUMBER
EFGEFG XXXYYZ231IF A 1 IS ON TOP OF ANOTHER 1
VWXVWX AABBCC23.13
VWXVWX BBCCDD23.23
VWXVWX CCDDEE23.33
15#####QRS#####=IF(ISERROR(LARGE($F$2:$F$10,A15)),"",LARGE($F$2:$F$10,A15))
22XYZ
3
4
5
6
7
8
9
10
QRS from
=IF(B15="","",OFFSET($F$1,MATCH(B15,$F$2:$F$10,0),-5))
if you set 3 or more you will have to play around with the row number
only appearing for a 1 sitting on two 1's etc
D2 (under count)
=COUNTIF($A$2:$A$10,A2)
E2 (under your criteria choice[2])
=IF(D2=$E$1,1,"")

<colgroup><col><col><col span="12"></colgroup><tbody>
</tbody>
 
Upvote 0

Forum statistics

Threads
1,215,949
Messages
6,127,892
Members
449,411
Latest member
AppellatePerson

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top