Random Sample with source being Text File

Jose Miguel

Board Regular
Joined
Feb 3, 2005
Messages
99
Hello All,

This is my problem. I need a random sample from a draw. I would normally do this all in Excel, but some of the initial draw populations are well over the Excel row limit, so my source file is a text file. Is there any way of writing a macro in Excel that will look for that text file and produce a random sample? Thanks.

Jose
 

Excel Facts

Save Often
If you start asking yourself if now is a good time to save your Excel workbook, the answer is Yes
Hi Jose,

If by "random draw" you mean randomly selecting and reading a particular record from a file, and the records are all the same length (in bytes), I recommend you open the file in VBA using an Open statement with the Random access mode option. Then your code will be able to access any record in the file directly ("randomly") using the Get # statement. So, for example, if your file contains 500,000 records you can randomly select and read a record in the range 1-500,000 like this:

Dim L As Long
Dim RecData As String*8
L = Rnd() * 500000 + 1
Get #1, L, RecData

where RecData is a string that is the length of the record (or could be any other data type that has the correct length), L is the random record number, #1 is the file number assigned by the Open statement.

Damon
 
Upvote 0
The text file may have from one to three columns (tab delimited or comma delimited). Columns such as "Cust Name" and "Acct Num" that will have varying lengths but where the field itself has a max number. Basically is it possible to write code that will randomly pick a row (or rows) from a tab delimted text file (don't know how to define a row's end)? Does this make any sense? Thanks.

Jose
 
Upvote 0
Hi again Jose,

If the file does not have fixed length records, in can still be done, but very inefficiently. This is because with variable record lengths the file must be read sequentially, which means that to read, for example, record 80000 your code would have to read records 1-79999 first.

You mentioned that all the fields have a maximum length. If these fields could always be written to the file using the maximum length (padding each field with spaces out to the maximum length), then the file could be read very efficiently in random mode as I described before.

A way around the ineffiency of the sequential (non-padded fields) approach would be to write a simple macro that would read this file in sequential mode and write it out in fixed-length binary mode. This binary file could then be the one that is read in random mode for the random draws. Such a macro would be very easy to write.

Damon
 
Upvote 0
Hi Jose

Another option is to read the file into the memory. It will not be as efficient as an approach like Damon's, where you first transform the source into a file with a convenient format. However, depending on the size of your file, it may be comfortable enough as the processing becomes very quick.

It will help to understand which would be a good approach if you post some details:
- typical number of records
- maximum size of a record
- typical size of a sample (number of records)
- Number of fields we already know: 1-3
- are you talking about one source file that you will use over and over again or will you be receiving new source files?
- in this last case how many samples do you typically extract from each source file?

PGC
 
Upvote 0

Forum statistics

Threads
1,214,804
Messages
6,121,652
Members
449,045
Latest member
Marcus05

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top