Random Sample with source being Text File

Jose Miguel

Board Regular
Joined
Feb 3, 2005
Messages
99
Hello All,

This is my problem. I need a random sample from a draw. I would normally do this all in Excel, but some of the initial draw populations are well over the Excel row limit, so my source file is a text file. Is there any way of writing a macro in Excel that will look for that text file and produce a random sample? Thanks.

Jose
 

Some videos you may like

Excel Facts

Back into an answer in Excel
Use Data, What-If Analysis, Goal Seek to find the correct input cell value to reach a desired result

Damon Ostrander

MrExcel MVP
Joined
Feb 17, 2002
Messages
4,239
Hi Jose,

If by "random draw" you mean randomly selecting and reading a particular record from a file, and the records are all the same length (in bytes), I recommend you open the file in VBA using an Open statement with the Random access mode option. Then your code will be able to access any record in the file directly ("randomly") using the Get # statement. So, for example, if your file contains 500,000 records you can randomly select and read a record in the range 1-500,000 like this:

Dim L As Long
Dim RecData As String*8
L = Rnd() * 500000 + 1
Get #1, L, RecData

where RecData is a string that is the length of the record (or could be any other data type that has the correct length), L is the random record number, #1 is the file number assigned by the Open statement.

Damon
 

Jose Miguel

Board Regular
Joined
Feb 3, 2005
Messages
99
The text file may have from one to three columns (tab delimited or comma delimited). Columns such as "Cust Name" and "Acct Num" that will have varying lengths but where the field itself has a max number. Basically is it possible to write code that will randomly pick a row (or rows) from a tab delimted text file (don't know how to define a row's end)? Does this make any sense? Thanks.

Jose
 

Damon Ostrander

MrExcel MVP
Joined
Feb 17, 2002
Messages
4,239
Hi again Jose,

If the file does not have fixed length records, in can still be done, but very inefficiently. This is because with variable record lengths the file must be read sequentially, which means that to read, for example, record 80000 your code would have to read records 1-79999 first.

You mentioned that all the fields have a maximum length. If these fields could always be written to the file using the maximum length (padding each field with spaces out to the maximum length), then the file could be read very efficiently in random mode as I described before.

A way around the ineffiency of the sequential (non-padded fields) approach would be to write a simple macro that would read this file in sequential mode and write it out in fixed-length binary mode. This binary file could then be the one that is read in random mode for the random draws. Such a macro would be very easy to write.

Damon
 

pgc01

MrExcel MVP
Joined
Apr 25, 2006
Messages
19,881
Hi Jose

Another option is to read the file into the memory. It will not be as efficient as an approach like Damon's, where you first transform the source into a file with a convenient format. However, depending on the size of your file, it may be comfortable enough as the processing becomes very quick.

It will help to understand which would be a good approach if you post some details:
- typical number of records
- maximum size of a record
- typical size of a sample (number of records)
- Number of fields we already know: 1-3
- are you talking about one source file that you will use over and over again or will you be receiving new source files?
- in this last case how many samples do you typically extract from each source file?

PGC
 

Watch MrExcel Video

Forum statistics

Threads
1,122,538
Messages
5,596,750
Members
414,097
Latest member
FaeFen

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Top