Recommendations about searching through thousands of text files


New Member
Apr 3, 2013
I have written two versions of an Excel 2010 macro (v1.6 and v2.0) that pull data from multiple directories. There about 60,000 total text files with about 20,000 of them being of interest to my macro. There are main directories that the macro searches, and all three of them have multiple sub-directories that need to be searched.

Initially, both macro versions recursively search each of the 3 directories and save the file names so that later, a separate function can mine data out of each file path that was found. Each file name has a serial number that the macro is trying to find. There are 60 total serial numbers.

Macro v1.6 takes one of the 60 serial numbers and then recursively searches all 3 directory locations using the following piece of code (but adapted for vba) which comes from the recursive search example here: HOW TO: Recursively Search Directories by Using FileSystemObject. It then repeats the search for each of the 59 remaining serial numbers (it recursively searches the 3 directories a total of 60 times).

[COLOR=#333333][FONT=Consolas]FileName = Dir(fso.BuildPath(fld.Path, sFile), vbNormal Or [/FONT][/COLOR][COLOR=#333333][FONT=Consolas]vbHidden Or vbSystem Or vbReadOnly)[/FONT][/COLOR]
   While Len(FileName) <> 0      FindFile = FindFile + FileLen(fso.BuildPath(fld.Path, FileName))      nFiles = nFiles + 1      List1.AddItem fso.BuildPath(fld.Path, FileName)  ' Load ListBox      FileName = Dir()  ' Get next file      DoEvents Wend
Macro v2.0 recursively loads all file paths from each of the 3 directories into a blank worksheet and then performs a search on the file name of each file path that was just loaded into the worksheet to separate the 20,000 or so desired file paths for the later data mine function. It uses the Like comparison to compare each file name with all 60 serial numbers (all 60 serial numbers compared with the file name of each row in the worksheet - see the following piece of code).This is what imports all the file paths in v2.0:
[COLOR=#222222][FONT=Verdana]Private Function FindFile(ByVal SearchDirectory As String, OutputSheet As String, OutputRow As Long, TotalDirectoriesFound As Long, TotalFileCount As Long) As Currency[/FONT][/COLOR]    Dim SubFld As Folder
    Dim FileName As String
    Dim File As File
    Dim TempPath() As String
    Set fld = fso.GetFolder(SearchDirectory)
    For Each File In fld.Files
        DoEvents                                    'allow other processes to execute
        TotalFileCount = TotalFileCount + 1
        TempPath = Split(File, "\")
        Worksheets(OutputSheet).Cells(OutputRow, 1).Value = TempPath(UBound(TempPath))
        Worksheets(OutputSheet).Cells(OutputRow, 2).Value = File
        OutputRow = OutputRow + 1
        Application.StatusBar = "Elapsed Time: " & Format(Now - StartTime, "hh:mm:ss") & " Searching Directories... " & TotalFileCount & " files found so far"
    Next File
    TotalDirectoriesFound = TotalDirectoriesFound + 1
    If fld.SubFolders.Count > 0 Then
        For Each SubFld In fld.SubFolders
            FindFile SubFld.Path, OutputSheet, OutputRow, TotalDirectoriesFound, TotalFileCount
    End If
    Set SubFld = Nothing

[COLOR=#222222][FONT=Verdana]End Function[/FONT][/COLOR]
And this is what searches the imported file names/paths for the desired serial numbers:
[COLOR=#222222][FONT=Verdana]OutputRow = 2[/FONT][/COLOR]    InputRow = 1
        For Count = 0 To UBound(SerialNumbers)
            DoEvents                    'allow other processes to execute
            SearchString = FileNameSearch1 & SerialNumbers(Count) & FileNameSearch2     'this is the file name to match during searching (FileNameSearch1 and 2 are set in the userform)
            If Worksheets(InputSheet).Cells(InputRow, InputFileNameColumn).Value Like SearchString Then
                Worksheets(OutputSheet).Cells(OutputRow, OutputFilePathColumn).Value = Worksheets(InputSheet).Cells(InputRow, InputFilePathColumn).Value    'store file path
                Worksheets(OutputSheet).Cells(OutputRow, OutputFileNameColumn).Value = Worksheets(InputSheet).Cells(InputRow, InputFileNameColumn).Value    'store file name
                Worksheets(OutputSheet).Cells(OutputRow, SNColumn).Value = SerialNumbers(Count)             'print the serial number to output sheet
                OutputRow = OutputRow + 1
            End If
        Next Count
        Application.StatusBar = "Elapsed Time: " & Format(Now - StartTime, "hh:mm:ss") & " Finding files... " & InputRow & " of " & TotalFileCount & " file names searched"
        InputRow = InputRow + 1

[COLOR=#222222][FONT=Verdana]    Loop Until Len(Worksheets(InputSheet).Cells(InputRow, 1).Value) = 0[/FONT][/COLOR]
I was expecting macro v2.0 to execute much faster than v1.6. My reasoning was that v1.6 had to go through to the network directories 60 times, while the v2.0 only went through the network directories once. After running the macros, I was surprised to find that v1.6 actually executes faster than v2.0. Does anyone know why this is true? Does the microsoft example somehow use the indexing system that's built into windows or something?Any tips about speeding up this macro are welcome as well. Although combining the searching and data mining would probably decrease the execution time, I would prefer to keep the file searching and the data mining in two separate functions because it makes it much easier to reuse my code for other things.Thanks,Adrian

Excel Facts

What is the fastest way to copy a formula?
If A2:A50000 contain data. Enter a formula in B2. Select B2. Double-click the Fill Handle and Excel will shoot the formula down to B50000.
Sorry about that, it said it timed out the first time and I decided to change the name for the second try.

This is the better one to keep since the code blocks didn't work correctly in the second post.
Last edited:
Upvote 0
You're CODE tags appear to have gone a little haywire on this post as well unfortunately. For me at least the code has separate lines joined onto single lines.

Hopefully though I can still offer a potential solution. I've written a function that finds all the paths for a specific file extension and it also searches subfolders.
This may be faster as this function doesn't use the file system object and doesn't recursively call itself.

Once you get the paths out of it you can then use the Like comparison you've done previously to filter the list further.

Option Explicit

Sub findAllTxtFiles()
    Dim fileColl As Collection
    Dim i As Integer
    Const initPath = "C:\Temp"
    Const fileExt = ".txt"
    Set fileColl = fnGetAllFilesForExt(initPath, fileExt, True)
    For i = 1 To fileColl.Count
        Debug.Print fileColl(i)
        'Do Something with the paths returned
    Next i
End Sub

Function fnGetAllFilesForExt(initialPath As String, Optional fileExt As String = "", Optional bSearchSubFolders As Boolean = False) As Collection
    'Returns a collection object containing all the file names with the extension fileExt
    'Searches through: the folder supplied by initialFilePath
    'You can optionally search through subfolders as well by specifying True for bSearchSubFolders
    'E.g. If initialFilePath = "C:\SomeFolder"
    '       And fileExt = ".png"
    'Then this function will return full file paths to all png files in SomeFolder
    'and any subfolders that SomeFolder contains
    'If fileExt is omitted or is a null/zero length string, e.g. "",
    'then files for all extensions will be returned
    'If the path name  cannot be found then this function will return a blank collection
    'A blank collection will also be returned if no files are found for the extension supplied
    'Test the return in the sub or function that calls this by using:
    'If returnedCollection.Count = 0 then
    '   'Do something here as no files were found
    '   'E.g.
    '   Exit Sub
    'End If
    Dim foldersColl As Collection
    Dim filesColl As Collection
    Dim s As String
    Dim fullPath As String
    Set foldersColl = New Collection
    Set filesColl = New Collection
    'Set the initial folder
    If Right(initialPath, 1) = "\" Then
        foldersColl.Add initialPath
        foldersColl.Add initialPath & "\"
    End If
    'Loop through the current folder
    'If subfolders are found, add them to foldersColl
    'If a file with the correct extension is found, add them to filesColl
        s = Dir(foldersColl(1), vbDirectory)
            On Error Resume Next
            If (GetAttr(foldersColl(1) & s) And vbDirectory) = vbDirectory Then
                If Left(s, 1) <> "." And bSearchSubFolders Then
                    foldersColl.Add foldersColl(1) & s & "\"
                End If
                If fileExt = "" Then
                    filesColl.Add foldersColl(1) & s
                    If Right(s, Len(fileExt)) = fileExt Then
                        filesColl.Add foldersColl(1) & s
                    End If
                End If
            End If
            s = Dir()
            On Error GoTo 0
        Loop Until s = ""
        foldersColl.Remove 1
    Loop Until foldersColl.Count = 0
    Set fnGetAllFilesForExt = filesColl
End Function

About your point that V1.6 ran faster than V2.0, I think this is simply because V2.0 had to output the file paths to a worksheet. Every time the output happened Excel would probably have had to recalculate the entire sheet to account for any potential formula changes, formatting changes etc. Just a guess :)
Upvote 0

Forum statistics

Latest member

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back