Data sorting with multiple subtypes

ZenZilla

New Member
Joined
Jun 26, 2015
Messages
46
Hi all, here's my problem:

I am pulling information from a raw data workbook into a statistical analysis workbook. Each data entry is being sorted based on one of many criteria and because the criteria I need to use to separate the entries have multiple subtypes within each type I have run into a problem. For example: book5564 English 09. I am not interested in the subtypes, just the book5564. What I have been doing so far is placing all the matching entries into one worksheet and having separate worksheets for each type, rather than all into one as I would like to. And now you'll think why not just use if statements. Well, yes I am..but again this poses a problem if I throw them all into one worksheet as some entry names contain the names of others, for example book5 and book5564 so I cant just say =sumif('data'!A:A,*book5*) as it will return false information. I could just put the name of the book verbatim but some of the entries I am working with have multiple language subtypes spanning 20 years...working with 300,000 entries this method is infeasible. In addition to this I would like the data entry name (book5564) to be a cell reference to the book name column ie =sumif('data'!A:A,B10).

Perhaps I am missing an obvious solution to this problem but I have tried everything I could think of. Thanks for any help!
 
I am sorting the data based on its name, and from there running analysis on it. An example name would be : Book10 E 15. Where E is a language code and 15 is a year code. So the primary descriptor is book10, and the subtypes are described by the language and year codes. For my analysis I need to group all data entries together that are of the same type using only the primary descriptor (in this case, book10). I am not interested in looking at the subtypes.

Currently what I've been doing to properly analyze the data is separate each book type into its own worksheet, ie book10 has its own worksheet for all of its entries and book1022 has its own worksheet etc etc. This however is taking a significant amount of time and I would like to be able to simplify this process. The reason this is an issue is that I am using referential formulas in my analysis that rely on the primary descriptor. I cannot just use *book10* as that reference as it will pull false information due to other book types containing book10 in their names ie book1022. I could type the full name into the formulas (book10 E 15) but that would take even longer than separating each into their own worksheet.

I see two possible solutions to this, im just not sure how to achieve them:

1) I continue to separate each book type into separate worksheets, but I write a macro that auto populates my statistical analysis worksheet with the new data every time I add a new worksheet.

2) I find a way to properly reference each book type, allowing me to leave all the raw data in one worksheet.
 
Upvote 0

Excel Facts

Remove leading & trailing spaces
Save as CSV to remove all leading and trailing spaces. It is faster than using TRIM().
book10D13
book102E14
book1020F15
book10E14
book102F15
book1020F15
book10D13DEF
book102E14book10527015
book1020D15book102154245
book10E14book1020151575
book102D15
book1020F15
book10D13
book102E14is this near what you want ?
book1020F15
book10E14I did not use year as year but as totals to demonstrate principle
book102F15
book1020F15
book10D13
book10E14
book1020E15
book10E14
book102F15
book10F15

<tbody>
</tbody>
 
Upvote 0
bookcodeyearsaleshelper
book10D1537book10D
book102E1454book102E
book1020F1422book1020F
book10E1418book10E
book102F1566book102F
book1020F1544book1020F
book10D1437book10D131415
book102E1454book102Ebook1020D2200
book1020D1322book1020Dbook1020E0022
book10E1418book10Ebook1020F022154
book102D1566book102Dbook102D0066
book1020F1544book1020Fbook102E01620
book10D1337book10Dbook102F00198
book102E1454book102Ebook10D743737
book1020F1522book1020Fbook10E01260
book10E1418book10Ebook10F0044
book102F1566book102F
book1020F1544book1020F
book10D1337book10D
book10E1454book10Ebook and code bundled together
book1020E1522book1020E
book10E1418book10E
book102F1566book102F
book10F1544book10F

<colgroup><col span="4"><col><col><col><col span="4"></colgroup><tbody>
</tbody>
 
Upvote 0
I am only interested in conducting the analysis on the main type, the subtypes do not matter at all to me. The pictures I linked in post #5 show the structure of the raw data and the analysis. My main problem is how I can deal with the multiple subtypes when using references to conduct analysis.
 
Upvote 0
Not really. Sorting the data itself is not my issue. Perhaps I am not explaining my problem very well. Let me try and restate my issue a different way:

Currently I am taking all my raw data and separating it into separate worksheets based on book type. I would like to keep it all together in one worksheet however. But the reason I have been separating it is that I cannot find a way to deal with the subtypes of the book types in my referential formulas. So by separating each type into its own worksheet I can just use for example =sum('Book10'!L:L) rather than having it all in one worksheet and use =sumif('Data'!K:K,*Book10*,L:L) which will return false information. I could use AND or OR arguments to nest all the subtypes but that would take far far far longer than simply putting each book type into separate worksheets.
 
Upvote 0
Essentially I see two ways to solve my problem: I can find a different way to reference the book types. Maybe excel has a different referencing function I am not aware of, or I can write a macro? Or I could continue to separate the book types into separate worksheets, but write a macro that auto fills my analysis worksheet every time a new data worksheet is added.
 
Upvote 0
I found a solution! Sort of. Using the INDIRECT formula I can kind of halfassed automate my analysis table. Much better than where I was before, but not quite what I wanted. It will work for now though, I appreciate all the help!
 
Upvote 0
From what I could see of your images, why not a PivotTable? Are you familiar with Pivot Tables?
 
Upvote 0
book533
book5134
book5235
book51036
book51537
book5438
book567839how many
book 3340book512921
book4441book51132
book9942book52135
book543book510138
book5144book515141
book5245book54144
book51046book5678147
book51547
book5448
book567849
book 3350
book4451
book9952this distinguishes between 5,51,510 etc
book553
book5154what is it you want other than this ?
book5255
book51056
book51557
book5458formula to count how many book5????????
book567859
book 3360=SUMPRODUCT((LEFT(A1:A30,5)="book5")*1)
book4461
book9962

<colgroup><col width="64" span="12" style="width:48pt"> </colgroup><tbody>
</tbody>
 
Upvote 0

Forum statistics

Threads
1,215,003
Messages
6,122,655
Members
449,091
Latest member
peppernaut

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top