macro for stratified random sampling?

notwhoiam

New Member
Joined
Nov 29, 2004
Messages
30
Hi,

Does anyone know how to get Excel to pull a stratified random sample?

For example, let's say my dataset contains records/measurements from sixteen groups of buffalo, representing nine entirely different populations across the US. The data is organised into 4 columns ("Population", "Family size", "Adult weight", & "Density") and 16 rows.
Now, what I'd want to do is to generate a subsample by randomly pulling ONE record per population (=9). Then I'd want to repeat that 99 times to get 100 randomly chosen subsamples, each containing just 9 (not 16!) records.

Now I suppose I could use labels to identify the various populations but I worry that it'd get really complicated once I apply the macro to my real (& much more complex) dataset, especially since I'd be wanting each subsample to contain data from ALL the columns (i.e. each of the 9 records per subsample would still include "Fam Size", Weight" & "Density" estimates).

Understand what I'm needing to do? Unfortunately, the usual answers (simple random sampling between X & Y, or using a random number generator) won't work b/c I need everything to be stratified by population. I'd also prefer to avoid having to run random sampling functions/macros for EACH population individually.

I know this can be done via S+ & R, but I'd really rather stick to excel if at all possible.
Any suggestions you can give will be enthusiastically & gratefully appreciated!! :pray:
Thanks!!
 
This was such an interesting Problem I had to give it a shot in VBA. (even though I'm a relative newcomer to this world.)

- The code generated 10000 Data Sets of 75 populations from a list of 1125 observations. (75 Populations x max of 15 observations each) in 17 minutes.

- The lists are not random. (only because I'm not sure how to generate more than one Random number in VBA. (I was hoping someone could help me out with that part.) (For now each set just chooses the first instance of a population every time.)

- The Sets are stacked vertically on top of one another. (eg. Population 1 of Data Set 2 is directly below Population 75 of data set 1) When Row 30000 is reached, The Next Sets are Written Shifted Horizontally automatically. Thus all 10,000 Data Sets are on one Worksheet! Why not closer to 65,536 you ask? Because excel gave an error (overflow I think) when it reached row 32768. (Which is exactly half of Excel's 65536 limit...interesting) So I changed it to 30000 instead and it works fine.

- When saved with the 10,000 Data Sets the file is 27 MB.

- 8 variables x 5 weighting factors to go. I'm not sure what these factors are or how they affect the samples, but I could easily see expanding my DataSetsArr by two dimensions to house All of the Data Sets. (though I think Excel might a memory fit, if we tried to do it all as one array.) It might be more efficient to to work with the 10,000 Data Sets in Chunks.

- In any case, I think, in principle at least, that it is possible to include those factors by storing all the data across 40 workbooks, and accessing them one at a time.

- A crude lower limit of how long long it would take to generate all 400000 data sets: 20 min/Per * 8 * 5 = 800 minutes or just over half a day. (on my 1.7 gh celeron lap top.)
- This does not consider any added time for generating the random #'s
- This does not include any time necessary to calculate/manipulate Data with the 8 variables x 5 weighting factors.

I put the original sample on a worksheet called Data Pool:
Buffalo 10000.xls
ABCDEFGHIJK
1Pop#InstancePopulationNameFamilySizeAdultWeightDensityPopulationPop#251124
281Population6852452.6Population75116
3281Population4841987.4Population74215
461Population7062066.2Population73323
Data Pool


Columns H, I Contain the names of the 75 populations, and assigns each one a number from 1 to 75.

Column J Counts the number of instances of that Population in the Data Pool.

J1 finds the maximum number of instances of a Population. This number is the value of iMaxObservations used to set up the DataPoolArr.

K1 contains the number of Observations, iNumObservations, used in setting up the RawDataArr.

Column B Counts the number of instances of that Population up to that point in the Data Pool.

Column A uses VLookUp to find the Population number for the Population of that entry.

After the Code takes in the information from the Data Pool, it creates Data Sets, then writes them to the sheet called, "Subsamples"
 
Upvote 0

Excel Facts

Add Bullets to Range
Select range. Press Ctrl+1. On Number tab, choose Custom. Type Alt+7 then space then @ sign (using 7 on numeric keypad)
Here's how the code works:

RawDataArr: Gets all observations from Data Pool Worksheet.
DataPoolArr: Stores all observations, By Population, then by Instance.

DataSetsArray: For each Data Set, it chooses the First instance of a Population in the DataPool and stores all fields for that instance of that population.

The major missing ingredient is that I want the DataSetArray to choose a Random instance for each population.

(Of Note: The first field of the instance element for each poplation, That is DataSetsArr(Population #, iMaxNumObservations, 1) = # instances of that population.

Application.Calculation = xlCalculationManual
Code:
Application.ScreenUpdating = False
Dim i As Integer, j As Integer, iDummy As Integer, iCount As Integer, iRow As Integer, iCol As Integer
Dim iNumObservations As Integer, iMaxObservations As Integer, iNumPopulations As Integer, iRandom As Integer
Dim iNumFields As Integer, iNumDataSets As Integer, k As Integer, iReachBottom As Integer, iSetsThisReach As Integer
Dim StartTime As Date, EndTime As Date
StartTime = Timer
iNumObservations = Range("K1")
iMaxObservations = Worksheets("Data Pool").Range("J1").Value
iNumDataSets = 10000
iNumPopulations = 75
iNumFields = 6
ReDim RawDataArr(1 To iNumObservations, 1 To iNumFields) As Variant
ReDim DataPoolArr(1 To iNumPopulations, 1 To iMaxObservations + 1, 1 To iNumFields) As Variant
ReDim DataSetsArr(1 To iNumDataSets, 1 To iNumPopulations, 1 To iNumFields)

'Pulls the Raw Data into an Array
  RawDataArr = Range("A2:J" & iNumObservations) ' & Chr(iNumFields + 95) & iNumObservations)

'Creates a Data Pool Array similar to the Data Pool Chart in my post
  For i = 1 To iNumPopulations
    DataPoolArr(i, iMaxObservations + 1, 1) = 1 'last element of Data Pool used to count the instance of this Population
  Next i

  For i = 1 To iNumObservations - 1 'Go through each observation
   For k = 1 To iNumFields  'Go through each field
'Store Observation by Population(Pop # really), by instance, store each field
    DataPoolArr(RawDataArr(i, 1), RawDataArr(i, 2), k) = RawDataArr(i, k)
    DataPoolArr(RawDataArr(i, 1), iMaxObservations + 1, 1) = RawDataArr(i, 2)
   Next k
  Next i
  
'Store Data Sets
For i = 1 To iNumDataSets
 For j = 1 To iNumPopulations
  iRandom = 1 'need to come up with a good random # Generator to Generate a Number
               'between 1 and the # of instances of this Population
  For k = 1 To iNumFields
    DataSetsArr(i, j, k) = DataPoolArr(j, iRandom, k)
  Next k
 Next j
Next i

iReachBottom = 1   'used to keep track of how many times the data has extended past row 30000
iSetsThisReach = 0  'used in determing iRow (counts the number of Data Sets in this particular "column" of Data
                    'that is, how many data sets have been stacked on top of one another
For i = 1 To iNumDataSets
'  If ((iNumPopulations + 1) * (i + 1)) - (iNumPopulations * iReachBottom - 1) > 30000 Then
   If iRow > 30000 Then
    iDummy = iReachBottom
    iReachBottom = iDummy + 1
    iSetsThisReach = 0
'    MsgBox "Reached the Bottom for the " & iReachBottom - 1 & " time."
  End If
  iDummy = iSetsThisReach
  iSetsThisReach = iDummy + 1
  
  For j = 1 To iNumPopulations
    For k = 1 To iNumFields
      iRow = j + (iNumPopulations) * (iSetsThisReach - 1)
      iCol = k + (iReachBottom - 1) * iNumFields
      Worksheets("Subsamples").Cells(iRow, iCol) = DataSetsArr(i, j, k)
    Next k
  Next j
Next i
EndTime = Timer
MsgBox Format((StartTime - EndTime), "mm:ss")
MsgBox DataSetsArr(1, 5, 3)
Application.Calculation = xlCalculationManual
Application.ScreenUpdating = True
 
Upvote 0
I've modified my code from before and I think does an awful lot of what you want it to do.

- It now chooses the instances randomly.
- It now calculates 5 x 8 more Data samples, creating a separate workbook for each combination called, "Weight Factor w - Variable v" where each workbook contains iNumDataSets of Subsamples stored on Worksheet "Subsamples".
- For now the Weight Factors and Variables just allow for more Subsamples, but in the future you might want to use the variables as a way to manipulate the first subsample 40 times, or otherwise process the data.
- I ran the code (again, on my 1.7 GH Celeron XP Lap Top) for 5 x 8, 100 subsamples.
- This operation created 40 workbooks of 444 k each and it took 8.4 minutes.
- I believe that if we increased the number of Subsamples to 10000 both of these numbers would increase by a factor of 100, meaning the the whole process would take about 14 hours and would produce 40 44.4 MB files. (1.8 GB total) But the process is completely automated.

Code:
Application.Calculation = xlCalculationManual
Application.ScreenUpdating = False
Dim i As Integer, j As Integer, iDummy As Integer, iCount As Integer, iRow As Integer, iCol As Integer
Dim iNumObservations As Integer, iMaxObservations As Integer, iNumPopulations As Integer
Dim iNumFields As Integer, iNumDataSets As Integer, k As Integer, iReachBottom As Integer, iSetsThisReach As Integer
Dim StartTime As Date, EndTime As Date, iNumVariables As Integer, iNumWeightFactors As Integer
Dim w As Integer, v As Integer, sWeightVar As String
StartTime = Timer
iNumObservations = Range("K1")
iMaxObservations = Worksheets("Data Pool").Range("J1").Value
iNumVariables = 8
iNumWeightFactors = 5
iNumDataSets = 100
iNumPopulations = 75
iNumFields = 6
ReDim RawDataArr(1 To iNumObservations, 1 To iNumFields) As Variant
ReDim DataPoolArr(1 To iNumPopulations, 1 To iMaxObservations + 1, 1 To iNumFields) As Variant
'Pulls the Raw Data into an Array
  RawDataArr = Range("A2:J" & iNumObservations) ' & Chr(iNumFields + 95) & iNumObservations)
'Creates a Data Pool Array similar to the Data Pool Chart in my post
  For i = 1 To iNumPopulations
    DataPoolArr(i, iMaxObservations + 1, 1) = 1 'last element of Data Pool used to count the instance of this Population
  Next i
  For i = 1 To iNumObservations - 1 'Go through each observation
   For k = 1 To iNumFields  'Go through each field
'Store Observation by Population(Pop # really), by instance, store each field
    DataPoolArr(RawDataArr(i, 1), RawDataArr(i, 2), k) = RawDataArr(i, k)
    DataPoolArr(RawDataArr(i, 1), iMaxObservations + 1, 1) = RawDataArr(i, 2)
   Next k
  Next i
'***************************************************************************
' Creates a Subsample (I've called it DataSetArr) Then Writes the Random Samples to a Workbook called
' Weight Factor w - Variable v
For w = 1 To iNumWeightFactors
For v = 1 To iNumVariables
  ReDim DataSetsArr(1 To iNumDataSets, 1 To iNumPopulations, 1 To iNumFields)
  sWeightVar = "Weight Factor " & w & " - " & "Variable " & v
  Set newBook = Workbooks.Add
       With newBook
           .Title = sWeightVar
           .Subject = sWeightVar
           .SaveAs Filename:=sWeightVar & ".xls"
       End With
  Set NewSheet = Worksheets.Add  'the new sheet is named after the acct number
       With NewSheet
          NewSheet.Name = "Subsamples"
       End With
'Store Data Sets
For i = 1 To iNumDataSets
 For j = 1 To iNumPopulations
'  iRandom = 1 'need to come up with a good random # Generator to Generate a Number
               'between 1 and the # of instances of this Population
  iRandom = Int(DataPoolArr(j, iMaxObservations + 1, 1) * Rnd + 1)
  For k = 1 To iNumFields
    DataSetsArr(i, j, k) = DataPoolArr(j, iRandom, k)
  Next k
 Next j
Next i

iReachBottom = 1   'used to keep track of how many times the data has been recorded to row 65000
iSetsThisReach = 0  'used in determing iRow
For i = 1 To iNumDataSets
   If iRow > 30000 Then
    iDummy = iReachBottom
    iReachBottom = iDummy + 1
    iSetsThisReach = 0
  End If
  iDummy = iSetsThisReach
  iSetsThisReach = iDummy + 1
  For j = 1 To iNumPopulations
    For k = 1 To iNumFields
      iRow = j + (iNumPopulations) * (iSetsThisReach - 1)
      iCol = k + (iReachBottom - 1) * iNumFields
      Worksheets("Subsamples").Cells(iRow, iCol) = DataSetsArr(i, j, k)
    Next k
  Next j
Next i
ActiveWorkbook.Save
ActiveWorkbook.Close
'****************************************************************************
Next v
Next w

EndTime = Timer
MsgBox "This Run Took " & Format(((EndTime - StartTime) / 60), "###0.00") & " minutes to complete"
Application.Calculation = xlCalculationManual
Application.ScreenUpdating = True

If you try this code, please let me know how it works. I am quite curious how close this code comes to meeting your needs.
 
Upvote 0
WOW! Thanks PA HS Teacher!
I'm super busy today but will definitely give your code a try this evening.
Am very excited to see if it works.. Will let you know how it goes.
 
Upvote 0
Well,
I ran the code for al 8 x 40 x 10000 and I got 40 39.8 MB Workbooks in 4 hours! (I think it ran faster because I made sure that in my mock population data each population appeared no more than 15 times.
- I also had to change a line that keeps track of the number of instances in each population. The change is below.
- You could split up the work on different machines. E.g. cut the run time by a factor of 5 if you ran 5 machines.

- Machine 1 w from 1 to 1
- Machine 2 w from 2 to 2
etc.

Slight Code Modifcation:

Rich (BB code):
'Creates a Data Pool Array similar to the Data Pool Chart in my post
  For i = 1 To iNumPopulations
    DataPoolArr(i, iMaxObservations + 1, 1) = 1 'last element of Data Pool used to count the instance of this Population
  Next i
  For i = 1 To iNumObservations - 1 'Go through each observation
   For k = 1 To iNumFields  'Go through each field
'Store Observation by Population(Pop # really), by instance, store each field
    DataPoolArr(RawDataArr(i, 1), RawDataArr(i, 2), k) = RawDataArr(i, k)
    If RawDataArr(i, 2) > DataPoolArr(RawDataArr(i, 1), iMaxObservations + 1, 1) Then
       DataPoolArr(RawDataArr(i, 1), iMaxObservations + 1, 1) = RawDataArr(i, 2)
    End If
   Next k
  Next i
 
Upvote 0
Hi,

Well I wasn't able to get to the code last night so I've been playing around with it today - but haven't had much luck so far. I think I'm missing a few steps or formats/labels.. ?

Here's what my worksheet looks like:
forum idea #2 - VBA macro attempt....xls
ABCDEFGHIJK
1Pop#InstancePopNameFamilySizeAdultWeightDensityPopNamePop#40
211252452.6211
3213.541987.43.521
431462066.2431
541532543.4544
642542434.5751
743522625.6861
Data Pool


As you can see, I've simply copied the example you provided but changed the numbers of the populations so they're noticeably different from the contents of columns B & C. Ideally I'd also love to add an extra column so I can ID the observations themselves, not just track them with respect to the population they belong to. For example, imagine that each population is a family (Smith, Jones, Baker etc) and that each observation is a separate family member (Tom, Sue, Jimmy, etc - assuming that there aren't any repeated first names within OR between families). So what your examples have all contained is the equivalent of "Smith_1, Smith_2, Jones_2, etc". But what I'm planning to add is a column of first names (=observations) only so I can figure out who's data I'm looking at without having to look at the contents & order of observations in each population. Anyways that's not a huge issue - it's just something I'm putting off until the coding works and I feel confident enough to add columns & then change the code to suit the new layout.

Now, as for the code itself, I'm a little lost. I taught myself how to write & interpret simple VBA codes earlier this summer but then went on to learn coding for three other very different programs (Blossom, R, & S-Plus) & so have forgotten most of the VBA rules. Given that frustrating complication, I'm not now sure whether your current code requires that I label the various columns & ranges (as was necessary for your initial solution). From what I can tell, I think the new code automatically names its own ranges & worksheets so I don't have to. Correct?

But worst of all, and regardless of whether I label or not, I can't seem to get the macro to run.
I think I entered the code into VB correctly: I simply copied it out of your message and pasted it into VBA under 'VBA Project (ForumSolution#2.xls) > Sheet 2(Data Pool). I had to add an extra line: "Private Sub ForumIdea2(ByVal Target As Range)" so the program would recognise it as a new code. I also tried using "Sub ForumIdea2()" but that didn't work either. (of course I also included "Sub end" at the end of the script).
Have I screwed up or forgotten a step so far?

Assuming that's all ok, I tried to run the macro but got a variety of error messages. At first it complained of "Type mismatch (Error 13)" but once I took out the color=red formatting, it seemed much happier ... although I don't know why.
Now it's moved on to saying "Subscript out of range (Run-Time Error 9)", but once again I'm stumped. I don't know exactly which PART of the code or the way I'm applying it is causing the problem. This is all the more annoying b/c I'm sure that once I sort this problem out VB announce another error elsewhere in the code... Arg. Do you have any suggestions?? You've clearly gotten it to work, so what am *I* doing wrong?

Also I think you've over-complicated the issues of weighting & multiple variables, and as a result wrote a more complex code than is really necessary... Basically the plan is that I'm going to weight some of the columns/cells in my dataset by multiplying their contents with one of four pre-determined weighting schemes. That's a step I can definitely do via a simple excel function - no need to incorporate it into the VBA code. In fact, the only reason I mentioned it before was because I was worried about the added time needed to reformat your initial (non-VBA) sheet so it would accept all 10000 random sumbsamples ON TOP OF the time that would already be needed to format & analyse (separately) the various versions of my dataset (i.e. without weights, with weighting schemes 1-4, using diferent combinations of my variables as needed, etc etc).
You see, I don't mind waiting 4h per 'subsampling session' (=10,000 subsamples pulled) or having to run 20 sessions in all.. I just don't want to have to pull them all by hand or to set each one up from scratch separately :eek:
From what I can tell (but chances are I'm wrong - esp since I can't even get this to work), your code might be more complicated ...& therefore slow... than is really necessary. But I'll get a better idea of the output & steps needed once I can get this silly thing to work for me. Sigh.

As for the multiple variables, I've just learned that those will probably be whittled down so I'll only need to get data for a few variables per subsample - not for all 8 variables every time (sorry for the change - but I'm forced follow the whims of a statistician who keeps changing his mind).

So what does this all mean? Well, it doesn't mean much until I can get the code to run & can see if does what I need it to. That being the case, I'd obviously REALLY appreciate any info or ideas about how you got your macro to work &/or about where I've screwed up.

I wish computers were easier to understand. I've been working on the stats, formatting, & coding of this analysis for the last THREE months and am getting soooo fed up with it all.

Thanks for all of your help so far PA HS Teacher - you've been amazing!
 
Upvote 0
I have a version here which is more user friendly. I believe it is ready as is to accomodate your data and any number of fields you desire. (such as unique observation names.) You are limited only by how much your computer can handle. For a copy of this worksheet private message me for the actual file.

Otherwise I will post the latest version Now:
There are two sheets,

The second Sheet called:Data Pool
It is where you put your data, as much as you want and as many fields as you want, as long as you copy Cell B2 on the Data Pool Sheet and Paste it down at least as far as you have data.

The Second Sheet is called: Population Table
It is where you control the input parameters.
(for a single Workbook of 10,000 subsamples, set iNumVariables and iNumWeightFactors to 1, and set iNumDataSets to 10,000)
(This takes me about 7 minutes to run on my machine)
(Note: In this version of the code, the sheet starts a New Sheet within the workbook, rather than moving a set of columns over, when it reaches the bottom of a worksheet)

The Range Named Populations is Column C of the Data Pool Sheet, and must contain the population name for each observation.
Random Buffalo with var and weight 12-4-04.xls
ABCDEFG
1PopulationNamePop##InstancesofPopulationButtonForGeneratingSamples,Putabuttonhereandgiveitthemostrecentpostedcode
2Population75115
3Population74215iNumObservations1105
4Population73314iMaxNumObservations15
5Population72415iNumVariables2
6Population71515iNumWeightFactors1
7Population70615iNumDataSets10000
8Population69715iNumPopulations75
9Population68814iNumFields7
10Population67915
11Population661015***AddasmuchDataintotheDataPoolSheetasdesired
12Population651115***MakesureyoucopyColumnBofDataPoolasfardownasyouhavedata
13Population641215***MakesureallpopulationsarelistedinthePopulationTable
14Population631315***Alldataisstoredinworkbookssavedatthesamelevelasthisfile
15Population621415***LetmeknowwhenyouwanttodowiththeWeightingFactorsandVariables
16Population611514***IwasthinkingofusinganarrayofWeightingFactorsinthecodesuchas
17Population601614WeightFactorValue
18Population59171511.05
19Population58181520.87
20Population57191530.73
21Population56201542.7
22Population55211551.34
23Population542215
24Population532314*Youcouldsavealotoftimewiththeweightingfactors,bydoingitwithcode
Population Table
 
Upvote 0
The data Pool Sheet looks as follows. Feel free to add as many fields as you desire. But you MUST Copy A2 and B2 and paste them down as far as you have data.

The named Range, PopTable refers to columns A B and C on the Population Table Sheet.
Random Buffalo with var and weight 12-4-04.xls
ABCDEFGHIJ
1Pop#InstancePopulationNameFamilySizeAdultWeightDensityObservationNameUserDefined2UserDefined3etc.
2371Population3911032Bob
3711Population5110517Sally
4581Population18110518Joe
5331Population43110613John
Data Pool


On the Population Table Sheet, add a command button.
If you double click on it, and paste the following code, this should all work.

Code:
Private Sub CommandButton1_Click()
Application.Calculation = xlCalculationManual
Application.ScreenUpdating = False
Dim i As Integer, j As Integer, iDummy As Integer, iCount As Integer, iRow As Integer, iCol As Integer
Dim iNumObservations As Integer, iMaxObservations As Integer, iNumPopulations As Integer
Dim iNumFields As Integer, iNumDataSets As Integer, k As Integer, iReachBottom As Integer, iSetsThisReach As Integer
Dim StartTime As Date, EndTime As Date, iNumVariables As Integer, iNumWeightFactors As Integer
Dim w As Integer, v As Integer, sWeightVar As String, sSheetName As String, sRawDataRange As String
StartTime = Timer
'Initializes Variables: Make sure you understand this section**************************
iNumObservations = Worksheets("Population Table").Range("F3")
iMaxObservations = Worksheets("Population Table").Range("F4")
iNumVariables = Worksheets("Population Table").Range("F5")
iNumWeightFactors = Worksheets("Population Table").Range("F6")
iNumDataSets = Worksheets("Population Table").Range("F7")
iNumPopulations = Worksheets("Population Table").Range("F8")
iNumFields = Worksheets("Population Table").Range("F9")
'Pulls RawData into an array, then puts that data into an DataPoolArray from which
'Observations will be drawn at random for each population
'**************************************************************************************
ReDim RawDataArr(1 To iNumObservations, 1 To iNumFields) As Variant
ReDim DataPoolArr(1 To iNumPopulations, 1 To iMaxObservations + 1, 1 To iNumFields) As Variant
'Pulls the Raw Data into an Array
For i = 1 To iNumObservations
 For j = 1 To iNumFields
  RawDataArr(i, j) = Worksheets("Data Pool").Cells(i + 1, j)
 Next j
Next i
'Creates a Data Pool Array similar to the Data Pool Chart in my post
  For i = 1 To iNumPopulations
    DataPoolArr(i, iMaxObservations + 1, 1) = 1 'last element of Data Pool used to count the instance of this Population
  Next i
  For i = 1 To iNumObservations - 1 'Go through each observation
   For k = 1 To iNumFields  'Go through each field
'Store Observation by Population(Pop # really), by instance, store each field
    DataPoolArr(RawDataArr(i, 1), RawDataArr(i, 2), k) = RawDataArr(i, k)
    If RawDataArr(i, 2) > DataPoolArr(RawDataArr(i, 1), iMaxObservations + 1, 1) Then
       DataPoolArr(RawDataArr(i, 1), iMaxObservations + 1, 1) = RawDataArr(i, 2)
    End If
   Next k
  Next i
'***************************************************************************
' Creates a Subsample (I've called it DataSetArr) Then Writes the Random Samples to a Workbook called
' Weight Factor w - Variable v
For w = 1 To iNumWeightFactors
For v = 1 To iNumVariables
  ReDim DataSetsArr(1 To iNumDataSets, 1 To iNumPopulations, 1 To iNumFields)
  sWeightVar = "Weight Factor " & w & " - " & "Variable " & v
  Set newBook = Workbooks.Add
       With newBook
           .Title = sWeightVar
           .Subject = sWeightVar
           .SaveAs Filename:=sWeightVar & ".xls"
       End With
  Set NewSheet = Worksheets.Add
       With NewSheet
          NewSheet.Name = "1"
       End With
'Store Data Sets
For i = 1 To iNumDataSets
 For j = 1 To iNumPopulations
  iRandom = Int(DataPoolArr(j, iMaxObservations + 1, 1) * Rnd + 1)
  For k = 1 To iNumFields
    DataSetsArr(i, j, k) = DataPoolArr(j, iRandom, k)
  Next k
 Next j
Next i

iReachBottom = 1   'used to keep track of how many times the data has been recorded to row 65000
iSetsThisReach = 0  'used in determing iRow
For i = 1 To iNumDataSets
   If iRow > 30000 Then
    iDummy = iReachBottom
    iReachBottom = iDummy + 1
    iSetsThisReach = 0
       Set NewSheet2 = Worksheets.Add
       With NewSheet2
          NewSheet2.Name = i
       End With
      End If
  iDummy = iSetsThisReach
  iSetsThisReach = iDummy + 1
  For j = 1 To iNumPopulations
    For k = 1 To iNumFields
      iRow = j + (iNumPopulations) * (iSetsThisReach - 1)
      iCol = k
      ActiveSheet.Cells(iRow, iCol) = DataSetsArr(i, j, k)
    Next k
  Next j
Next i
ActiveWorkbook.Save
ActiveWorkbook.Close
'****************************************************************************
Next v
Next w

EndTime = Timer 'displays time it took to complete
MsgBox "This Run Took " & Format(((EndTime - StartTime) / 60), "###0.00") & " minutes to complete"
Application.Calculation = xlCalculationManual
Application.ScreenUpdating = True
End Sub

Let me know if this is any easier to follow. Otherwise PM me with your e-mail address and I'll send you the file.
 
Upvote 0
Well, it almoooost works...

I've PM'd you about this but thought a post might be faster.

From what I can tell, the macro should work but it's getting stuck (Subscript out of range - Error 9) while trying to run+debug the following line:
Code:
'Store Observation by Population(Pop # really), by instance, store each field
    DataPoolArr(RawDataArr(i, 1), RawDataArr(i, 2), k) = RawDataArr(i, k)
It seems to think that all of the RawDataArr are empty, although I'm not sure why. As I mentioned in the PM, the hyperlinks to your most recently posted 'Population Table' don't work - so that might be part of the problem. But that can't explain everything since I was careful to copy & paste the cell contents -as posted- into my own sheet - meaning that there ARE values in the cells (tho' they aren' necessarily correct) and excel SHOULD still be able to run the macro despite the missing hyperlinks. Or at least that's my interpretation of the code... (?)

Any thoughts? Perhaps sending the file itself would be the best plan since I could see all of the functions, codes & labels and know that they really DO work -at least at YOUR end-...

Once again, you're a miracle-worker. Thx!!

:pray:
 
Upvote 0
Not sure why you'e getting that error. I'm reposting with the hyperlinks working. (and I've sent you the file directly)

If it doesn't from the file itself that it is beyond my understanding.
Random Buffalo with var and weight 12-5-04.xls
ABCDEFG
1PopulationNamePop##InstancesofPopulationButtonForGeneratingSamples,Putabuttonhereandgiveitthemostrecentpostedcode
2Population75115
3Population74215iNumObservations1105
4Population73314iMaxNumObservations15
5Population72415iNumVariables1
6Population71515iNumWeightFactors1
7Population70615iNumDataSets1500
8Population69715iNumPopulations75
9Population68814iNumFields7
Population Table


The range Populations refers to Column C of the Data Pool Sheet

[Edit] By the way, what are the values of i and k when you get the error? (hold your mouse over the i and k and their values should show on the screen)
 
Upvote 0

Forum statistics

Threads
1,214,827
Messages
6,121,818
Members
449,049
Latest member
cybersurfer5000

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top