Sampling from a Text File

kparadise

Hello,

I have 10 text files. Each contains a different number of records, anywhere from 100,000 to 6,000,000.

I need to select 16,000 random records from each of the 10 text files.

For the text files with under 1,000,000 records, I can easily load them into Excel, add a RAND() column, and then sort to pull my sample. But how would I do this for the text files that contain more records than the Excel row limit?

OR

Is there another, easier way that I am not aware of?
 

Hello,

In the case of more than 1 million lines, either Power Query or the following function could help:

Code:
Function ReadTxtFile(ByRef strPath As String) As String
    Dim intFF As Long
  
    intFF = FreeFile()
    Open strPath For Binary As #intFF
        ReadTxtFile = Space$(LOF(intFF))
        Get #intFF, , ReadTxtFile
    Close #intFF
End Function

The next step is to apply the Split function with the line-break character (vbCrLf, vbLf, or vbCr, depending on the file). Sampling by line is then possible on the resulting array.
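
For illustration, the split-and-sample step might look something like the sketch below. SampleLines, its parameters, and the hasHeader switch are made-up names, not part of the function above. It picks lines without repeats using a partial Fisher-Yates shuffle and VBA's built-in Rnd. Note that the whole file is held in memory (more than once, because Replace makes copies), so this is only a sketch for files that fit comfortably in RAM.

```vba
Option Explicit

' Hypothetical helper: returns up to n randomly chosen lines
' from fileText, without repeats.
Function SampleLines(ByVal fileText As String, ByVal n As Long, _
                     Optional ByVal hasHeader As Boolean = False) As String()
    Dim recs() As String, result() As String
    Dim firstIdx As Long, lastIdx As Long
    Dim i As Long, j As Long
    Dim tmp As String

    ' Normalize CR+LF and lone CR to LF so one Split covers all three cases.
    fileText = Replace(fileText, vbCrLf, vbLf)
    fileText = Replace(fileText, vbCr, vbLf)
    recs = Split(fileText, vbLf)

    If hasHeader Then firstIdx = 1     ' skip the header line if present
    lastIdx = UBound(recs)
    ' Drop the empty element left behind by a trailing line break.
    If lastIdx >= 0 Then
        If Len(recs(lastIdx)) = 0 Then lastIdx = lastIdx - 1
    End If

    ' Cannot sample more lines than the file contains.
    If n > lastIdx - firstIdx + 1 Then n = lastIdx - firstIdx + 1
    If n < 1 Then Exit Function        ' nothing to sample

    ReDim result(0 To n - 1)
    Randomize                          ' seed Rnd from the system timer

    ' Partial Fisher-Yates shuffle: only the first n positions are shuffled.
    For i = firstIdx To firstIdx + n - 1
        j = i + Int(Rnd * (lastIdx - i + 1))   ' random index from i to lastIdx
        tmp = recs(i): recs(i) = recs(j): recs(j) = tmp
        result(i - firstIdx) = recs(i)
    Next i

    SampleLines = result
End Function
```

It could then be called as, for example, SampleLines(ReadTxtFile("C:\Data\file1.txt"), 16000), where the path is only a placeholder.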

I do not know the exact limitations (they probably depend on available RAM, perhaps 4-6 GB?), so please report back whether it works.

regards
 
I am sorry, but I do not know where to enter this code.

I have zero knowledge of this type of execution.

More information: the text files contain two columns, with headers. In my sample files, I will need the header (or not) and both columns of data. I do not see the number 16,000, which is my sample size, anywhere in that code.
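
One possible way to cover these points is sketched below. It does not load the whole file: it reads one line at a time and keeps a running "reservoir" of sampleSize records (reservoir sampling), so memory use stays small even for 6,000,000 records. SampleTextFile, RunSample, and the file paths are made-up names for illustration. The sketch assumes one header line, Windows-style line endings (Line Input does not split on a lone LF), and that blank lines, if any, may be counted as records.

```vba
Option Explicit

' Hypothetical macro: writes the header plus a random sample of
' sampleSize records from inPath to outPath.
Sub SampleTextFile(ByVal inPath As String, ByVal outPath As String, _
                   ByVal sampleSize As Long)
    Dim inFF As Integer, outFF As Integer
    Dim header As String, rec As String
    Dim reservoir() As String
    Dim seen As Long, slot As Long, kept As Long, i As Long

    If sampleSize < 1 Then Exit Sub
    ReDim reservoir(0 To sampleSize - 1)
    Randomize                              ' seed Rnd from the system timer

    inFF = FreeFile()
    Open inPath For Input As #inFF
    If Not EOF(inFF) Then Line Input #inFF, header   ' assumed header line

    Do While Not EOF(inFF)
        Line Input #inFF, rec
        seen = seen + 1
        If seen <= sampleSize Then
            reservoir(seen - 1) = rec      ' fill the reservoir first
        Else
            ' Keep the new record with probability sampleSize / seen,
            ' overwriting a randomly chosen slot.
            slot = Int(Rnd * seen)
            If slot < sampleSize Then reservoir(slot) = rec
        End If
    Loop
    Close #inFF

    ' If the file has fewer records than requested, write them all.
    kept = sampleSize
    If seen < kept Then kept = seen

    outFF = FreeFile()
    Open outPath For Output As #outFF
    Print #outFF, header
    For i = 0 To kept - 1
        Print #outFF, reservoir(i)
    Next i
    Close #outFF
End Sub

' Example call; the paths are placeholders, and 16000 is the sample size.
Sub RunSample()
    SampleTextFile "C:\Data\file1.txt", "C:\Data\file1_sample.txt", 16000
End Sub
```

To enter the code, press Alt+F11 in Excel to open the VBA editor, choose Insert > Module, and paste it into the empty module. Then run RunSample with F5 (or Alt+F8 from the worksheet), once per file with the paths changed. Each output file is small enough to open in Excel directly.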
 