Data sorting with multiple subtypes

ZenZilla

New Member
Joined
Jun 26, 2015
Messages
46
Hi all, here's my problem:

I am pulling information from a raw data workbook into a statistical analysis workbook. Each data entry is being sorted based on one of many criteria and because the criteria I need to use to separate the entries have multiple subtypes within each type I have run into a problem. For example: book5564 English 09. I am not interested in the subtypes, just the book5564. What I have been doing so far is placing all the matching entries into one worksheet and having separate worksheets for each type, rather than all into one as I would like to. And now you'll think why not just use if statements. Well, yes I am..but again this poses a problem if I throw them all into one worksheet as some entry names contain the names of others, for example book5 and book5564 so I cant just say =sumif('data'!A:A,*book5*) as it will return false information. I could just put the name of the book verbatim but some of the entries I am working with have multiple language subtypes spanning 20 years...working with 300,000 entries this method is infeasible. In addition to this I would like the data entry name (book5564) to be a cell reference to the book name column ie =sumif('data'!A:A,B10).

Perhaps I am missing an obvious solution to this problem but I have tried everything I could think of. Thanks for any help!
 

Excel Facts

Enter current date or time
Ctrl+: enters current time. Ctrl+; enters current date. Use Ctrl+: Ctrl+; Enter for current date & time.
show us 5 rows of your raw data and whjat you want the analysis to look like (pictures better than words)
 
Upvote 0
Unfortunately due to the nature of the data I cannot share it, however let me try and give a simple explanation of it. The raw data itself is composed of approx. 300,000 entries (rows) with 26 characteristics (columns). My analysis table that I have built is entirely referential. It has the item name, and then pulls all the required information about data entries that share that name. ie book5 was ordered 34,000 times, 12,000 of which were orders >99 composing X% of overall orders. etc etc.

My problem is that I dont want to have to sort each data type into its own worksheet before analysis as this will add hours and hours of work to the project. I want to have just two worksheets in the workbook, Analysis overview and Raw Data. But I cannot see a way around it due to the multiple subtypes within the naming of the data entries. As I said above the names contain multiple languages spanning 20 years. So an example name would be book5 E 09. Where E is a language code and 09 is a date code.
 
Upvote 0
In the future please don't create duplicate threads on the same topic, but stick to your Original Thread.

If you feel like a thread has fallen too far back to be noticed you can simply reply to it with "Bump" as the message, which will move it back to the top.

I'm leaving this one open, but closed your earlier, original thread.

As for helping you out, if you follow the HTML Maker link in my sig you can post screen shots of what you have now and what your desired outcome is. I know you can't post the real data (which we discourage anyway), but see if you can mock up a sample.

If not, then the logical thing to me would be to parse the sub-type code into an additional column. Then you just use Pivot Tables for your summaries.
 
Upvote 0
Ah my apologies, I know some forums frown on bumping so I though I'd just make a new thread.

Here are a couple of images that give a basic picture of the raw data and the analysis. Keep in mind there are about 300,000 entries, with 26 columns of information and the analysis has about 17 columns as well. All the raw data keeps the same format.

What the raw data looks like:
jUFu8GD.png

EDIT: the new material numbers vary from subtypes, I mistakenly made them the same here.

and what the analysis looks like:
7EB7owM.png


What i've been doing so far is sorting each type into their own worksheets based on material number and then grabbing the stats referencing that worksheet. What I'd like to do is toss all the raw data into one worksheet and reference it that way.
 
Last edited:
Upvote 0
Currently I am taking all the raw data and separating it based on its main type, ie book10. So all the book10 types are all together. Reason being is I can then run all my formulas very easily ie =sumif('Book10Data'!D:D). This data separation is taking a very long time due to the amount of data present. But I run into roadblocks if I try and keep all the data together like I mentioned above. Some of the book names contain the names of other books ie book10 and book1022 So I cant just reference it by using its name if it has subtypes ie =sumifs('Data'!D:D,'Data'!B:B,*Book10*,'Data'!D:D, ">" & C1) as this will pull all book10 data and book1022 data and so on with any other books which contain the name.

I could just type the name of each book and its many subtypes into the formulas but that would take even longer than separating the data as im dealing with multiple languages and 20 years worth of subtypes
 
Upvote 0
I'm not sure what else you're asking for? I gave examples of my Data and the Analysis above your post and a description of my sorting issues in the first post and post #7.
 
Upvote 0
you are very familiar with your data and requirements - I cannot understand what you want from your description, list all your column headers in raw data and column headers in sorted data and see if we can progress from there, plus explain what you mean by subtype
 
Upvote 0

Forum statistics

Threads
1,214,987
Messages
6,122,614
Members
449,090
Latest member
vivek chauhan

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top