Extract information from subfolders

Rafaelete

New Member
Joined
Feb 2, 2022
Messages
14
Office Version
  1. 2007
Platform
  1. Windows
Good afternoon,
I´m trying to extract information from multiple word documents that are in subfolders inside a folder. I have been able to extract information from the words that are inside the first folder, but I can´t find a way to make it work for the words that are inside the subfolders.
They are all inside the first folder:
1643815590577.png

Here I can extract the information from the docx document. But can´t access the words inside the folders 1034 and 1064.

The following code shows the part of the macro that extracts the information from the folder. I need it to work for the subfolders:



Dim FolderName As String
Dim FileName As String
Dim NewWordFile As New Word.Application
Dim NewDoc As New Word.Document

Dim wdDoc As Object
Dim wdFileName As Variant
Dim TableNo As Integer 'table number in Word
Dim iRow As Long 'row index in Excel
Dim iCol As Integer 'column index in Excel


FolderName = "C:\Users\rriveragarrido\Desktop\Proyectos\Proyecto solaris (endesa) (PROPIO)\prueba macros\ZZZ\try\"
FileName = Dir(FolderName & "*.docx")


'Comineza el loop para que corra en cada una de las narrativas.

Do While FileName <> ""


Set wdDoc = GetObject(FolderName & FileName)
With wdDoc



'Este check sirve para identificar si no hay ninguna tabla en el word. En caso de ser así, te devuelve un mensaje indicándotelo.
TableNo = wdDoc.Tables.Count
If TableNo = 0 Then
MsgBox "This document contains no tables"
End If


'FROM HERE IT STARTS TO EXTRACT THE INFORMATION FROM ALL THE WORDS LOCATED IN THE FOLDER.
'Nombra la fila en la que va a empezar el código:
RowOutputNo = Range("O1").End(xlDown).Row
ColOutputNo = 15


'Pasa por cada una de las tablas que hay en la narrativa. En cada narrativa hay varias tablas, por lo que es importante determinar en cada narrativa cuantas hay.
For tbBegin = 1 To TableNo


'Copia la información de cada una de las celdas de word a excel.
With .Tables(tbBegin)

For RowNo = 1 To .Rows.Count
For ColNo = 2 To .Columns.Count
Cells(RowOutputNo, ColOutputNo) = WorksheetFunction.Clean(.cell(RowNo, ColNo).Range.Text)
Next ColNo
RowOutputNo = RowOutputNo + 1
Next RowNo

End With
RowOutputNo = RowOutputNo

Next tbBegin

End With


FileName = Dir()

'Sort the table to select next narrative

Range("Table1").Sort Key1:=Range("O1"), Order1:=xlDescending, Header:=xlYes
Range("Table1[[#Headers],[Responsable Actividad]]").Select
Selection.End(xlDown).Select
ActiveCell.Offset(rowOffset:=1, columnOffset:=0).Activate



Loop



Thank you in advanced.
 

Excel Facts

Best way to learn Power Query?
Read M is for (Data) Monkey book by Ken Puls and Miguel Escobar. It is the complete guide to Power Query.
Welcome to the forum.

Try this code and see if it works. I don't have sample data, so I can't test it out, but I think it should work (or be close). It fills an array with all of the docx file paths in the folder and subfolders. Then, the array is used in the looping.
VBA Code:
Dim WordFiles() As String
Dim fCount As Integer

Sub GetWordData()
    Dim FolderName As String
    Dim filename As String
    Dim NewWordFile As New Word.Application
    Dim NewDoc As New Word.Document
    Dim wdDoc As Object
    Dim wdFileName As Variant
    Dim TableNo As Integer 'table number in Word
    Dim iRow As Long 'row index in Excel
    Dim iCol As Integer 'column index in Excel
    Dim i As Integer
  
    FolderName = "C:\Users\rriveragarrido\Desktop\Proyectos\Proyecto solaris (endesa) (PROPIO)\prueba macros\ZZZ\try\"
    fCount = -1
    LoopAllSubFolders FolderName
  
    'Comineza el loop para que corra en cada una de las narrativas.
    For i = 0 To fCount
        Set wdDoc = GetObject(WordFiles(i))
        With wdDoc
        'Este check sirve para identificar si no hay ninguna tabla en el word. En caso de ser así, te devuelve un mensaje indicándotelo.
            TableNo = wdDoc.Tables.Count
            If TableNo = 0 Then
                MsgBox "This document contains no tables"
            End If
            'FROM HERE IT STARTS TO EXTRACT THE INFORMATION FROM ALL THE WORDS LOCATED IN THE FOLDER.
            'Nombra la fila en la que va a empezar el código:
            RowOutputNo = Range("O1").End(xlDown).Row
            ColOutputNo = 15
          
            'Pasa por cada una de las tablas que hay en la narrativa. En cada narrativa hay varias tablas, por lo que es importante determinar en cada narrativa cuantas hay.
            For tbBegin = 1 To TableNo
            'Copia la información de cada una de las celdas de word a excel.
                With .Tables(tbBegin)
                    For RowNo = 1 To .Rows.Count
                        For ColNo = 2 To .Columns.Count
                            Cells(RowOutputNo, ColOutputNo) = WorksheetFunction.Clean(.cell(RowNo, ColNo).Range.Text)
                        Next ColNo
                        RowOutputNo = RowOutputNo + 1
                    Next RowNo
                End With
                RowOutputNo = RowOutputNo
            Next tbBegin
        End With
      
        'Sort the table to select next narrative
        Range("Table1").Sort Key1:=Range("O1"), Order1:=xlDescending, Header:=xlYes
        Range("Table1[[#Headers],[Responsable Actividad]]").Select
        Selection.End(xlDown).Select
        ActiveCell.Offset(rowOffset:=1, columnOffset:=0).Activate
    Next
End Sub

Sub LoopAllSubFolders(ByVal folderPath As String)
    Dim filename As String
    Dim fullFilePath As String
    Dim numFolders As Long
    Dim folders() As String
    Dim i As Long
  
    If Right(folderPath, 1) <> "\" Then folderPath = folderPath & "\"
    filename = Dir(folderPath & "*.*", vbDirectory)
    While Len(filename) <> 0
        If Left(filename, 1) <> "." Then
            fullFilePath = folderPath & filename
            If (GetAttr(fullFilePath) And vbDirectory) = vbDirectory Then
                ReDim Preserve folders(0 To numFolders) As String
                folders(numFolders) = fullFilePath
                numFolders = numFolders + 1
            Else
                If Mid(filename, InStrRev(filename, "."), 5) = ".docx" Then
                    fCount = fCount + 1
                    ReDim Preserve WordFiles(fCount)
                    WordFiles(fCount) = folderPath & filename
                End If
            End If
        End If
        filename = Dir()
    Wend
  
    For i = 0 To numFolders - 1
        LoopAllSubFolders folders(i)
    Next i
End Sub
 
Last edited:
Upvote 0
Updated code to get rid of the global variables.
VBA Code:
Sub GetWordData()
    Dim FolderName As String
    Dim filename As String
    Dim NewWordFile As New Word.Application
    Dim NewDoc As New Word.Document
    Dim wdDoc As Object
    Dim wdFileName As Variant
    Dim TableNo As Integer 'table number in Word
    Dim iRow As Long 'row index in Excel
    Dim iCol As Integer 'column index in Excel
    Dim i As Integer
    Dim WordFiles() As String

    FolderName = "C:\Users\rriveragarrido\Desktop\Proyectos\Proyecto solaris (endesa) (PROPIO)\prueba macros\ZZZ\try\"
    If Not LoopAllSubFolders(WordFiles, FolderName, "*.docx") Then
        MsgBox "No files found."
        Exit Sub
    End If
    
    'Comineza el loop para que corra en cada una de las narrativas.
    For i = 0 To UBound(WordFiles)
        Set wdDoc = GetObject(WordFiles(i))
        With wdDoc
        'Este check sirve para identificar si no hay ninguna tabla en el word. En caso de ser así, te devuelve un mensaje indicándotelo.
            TableNo = wdDoc.Tables.Count
            If TableNo = 0 Then
                MsgBox "This document contains no tables"
            End If
            'FROM HERE IT STARTS TO EXTRACT THE INFORMATION FROM ALL THE WORDS LOCATED IN THE FOLDER.
            'Nombra la fila en la que va a empezar el código:
            RowOutputNo = Range("O1").End(xlDown).Row
            ColOutputNo = 15

            'Pasa por cada una de las tablas que hay en la narrativa. En cada narrativa hay varias tablas, por lo que es importante determinar en cada narrativa cuantas hay.
            For tbBegin = 1 To TableNo
            'Copia la información de cada una de las celdas de word a excel.
                With .Tables(tbBegin)
                    For RowNo = 1 To .Rows.Count
                        For ColNo = 2 To .Columns.Count
                            Cells(RowOutputNo, ColOutputNo) = WorksheetFunction.Clean(.cell(RowNo, ColNo).Range.Text)
                        Next ColNo
                        RowOutputNo = RowOutputNo + 1
                    Next RowNo
                End With
                RowOutputNo = RowOutputNo
            Next tbBegin
        End With

        'Sort the table to select next narrative
        Range("Table1").Sort Key1:=Range("O1"), Order1:=xlDescending, Header:=xlYes
        Range("Table1[[#Headers],[Responsable Actividad]]").Select
        Selection.End(xlDown).Select
        ActiveCell.Offset(rowOffset:=1, columnOffset:=0).Activate
    Next
End Sub

Function LoopAllSubFolders(ByRef FoundFiles() As String, _
    ByVal folderPath As String, _
    Optional fileFilter As String = "*", _
    Optional fCount As Integer = -1, _
    Optional firstRun As Boolean = True) As Boolean
    
    Dim filename As String
    Dim fullFilePath As String
    Dim numFolders As Long
    Dim folders() As String
    Dim i As Long
    Dim fileExt As String
    
    If fileFilter = "" Then
        If MsgBox("File filter cannot be a null string." & vbCrLf & vbCrLf & _
          "Do you want to find all files (*.*)?", vbYesNo, "File Filter") = vbYes Then
            fileFilter = "*.*"
        Else
            Exit Function
        End If
    End If
    If Right(folderPath, 1) <> "\" Then folderPath = folderPath & "\"
    filename = Dir(folderPath & "*.*", vbDirectory)
    While Len(filename) <> 0
        If Left(filename, 1) <> "." Then
            fullFilePath = folderPath & filename
            If (GetAttr(fullFilePath) And vbDirectory) = vbDirectory Then
                ReDim Preserve folders(0 To numFolders) As String
                folders(numFolders) = fullFilePath
                numFolders = numFolders + 1
            ElseIf filename Like fileFilter Then
                fCount = fCount + 1
                ReDim Preserve FoundFiles(fCount)
                FoundFiles(fCount) = folderPath & filename
            End If
        End If
        filename = Dir()
    Wend
    
    For i = 0 To numFolders - 1
        LoopAllSubFolders FoundFiles, folders(i), fileFilter, fCount, False
    Next i
    If fCount > -1 Then LoopAllSubFolders = True
End Function
 
Upvote 0
Updated code to get rid of the global variables.
VBA Code:
Sub GetWordData()
    Dim FolderName As String
    Dim filename As String
    Dim NewWordFile As New Word.Application
    Dim NewDoc As New Word.Document
    Dim wdDoc As Object
    Dim wdFileName As Variant
    Dim TableNo As Integer 'table number in Word
    Dim iRow As Long 'row index in Excel
    Dim iCol As Integer 'column index in Excel
    Dim i As Integer
    Dim WordFiles() As String

    FolderName = "C:\Users\rriveragarrido\Desktop\Proyectos\Proyecto solaris (endesa) (PROPIO)\prueba macros\ZZZ\try\"
    If Not LoopAllSubFolders(WordFiles, FolderName, "*.docx") Then
        MsgBox "No files found."
        Exit Sub
    End If
   
    'Comineza el loop para que corra en cada una de las narrativas.
    For i = 0 To UBound(WordFiles)
        Set wdDoc = GetObject(WordFiles(i))
        With wdDoc
        'Este check sirve para identificar si no hay ninguna tabla en el word. En caso de ser así, te devuelve un mensaje indicándotelo.
            TableNo = wdDoc.Tables.Count
            If TableNo = 0 Then
                MsgBox "This document contains no tables"
            End If
            'FROM HERE IT STARTS TO EXTRACT THE INFORMATION FROM ALL THE WORDS LOCATED IN THE FOLDER.
            'Nombra la fila en la que va a empezar el código:
            RowOutputNo = Range("O1").End(xlDown).Row
            ColOutputNo = 15

            'Pasa por cada una de las tablas que hay en la narrativa. En cada narrativa hay varias tablas, por lo que es importante determinar en cada narrativa cuantas hay.
            For tbBegin = 1 To TableNo
            'Copia la información de cada una de las celdas de word a excel.
                With .Tables(tbBegin)
                    For RowNo = 1 To .Rows.Count
                        For ColNo = 2 To .Columns.Count
                            Cells(RowOutputNo, ColOutputNo) = WorksheetFunction.Clean(.cell(RowNo, ColNo).Range.Text)
                        Next ColNo
                        RowOutputNo = RowOutputNo + 1
                    Next RowNo
                End With
                RowOutputNo = RowOutputNo
            Next tbBegin
        End With

        'Sort the table to select next narrative
        Range("Table1").Sort Key1:=Range("O1"), Order1:=xlDescending, Header:=xlYes
        Range("Table1[[#Headers],[Responsable Actividad]]").Select
        Selection.End(xlDown).Select
        ActiveCell.Offset(rowOffset:=1, columnOffset:=0).Activate
    Next
End Sub

Function LoopAllSubFolders(ByRef FoundFiles() As String, _
    ByVal folderPath As String, _
    Optional fileFilter As String = "*", _
    Optional fCount As Integer = -1, _
    Optional firstRun As Boolean = True) As Boolean
   
    Dim filename As String
    Dim fullFilePath As String
    Dim numFolders As Long
    Dim folders() As String
    Dim i As Long
    Dim fileExt As String
   
    If fileFilter = "" Then
        If MsgBox("File filter cannot be a null string." & vbCrLf & vbCrLf & _
          "Do you want to find all files (*.*)?", vbYesNo, "File Filter") = vbYes Then
            fileFilter = "*.*"
        Else
            Exit Function
        End If
    End If
    If Right(folderPath, 1) <> "\" Then folderPath = folderPath & "\"
    filename = Dir(folderPath & "*.*", vbDirectory)
    While Len(filename) <> 0
        If Left(filename, 1) <> "." Then
            fullFilePath = folderPath & filename
            If (GetAttr(fullFilePath) And vbDirectory) = vbDirectory Then
                ReDim Preserve folders(0 To numFolders) As String
                folders(numFolders) = fullFilePath
                numFolders = numFolders + 1
            ElseIf filename Like fileFilter Then
                fCount = fCount + 1
                ReDim Preserve FoundFiles(fCount)
                FoundFiles(fCount) = folderPath & filename
            End If
        End If
        filename = Dir()
    Wend
   
    For i = 0 To numFolders - 1
        LoopAllSubFolders FoundFiles, folders(i), fileFilter, fCount, False
    Next i
    If fCount > -1 Then LoopAllSubFolders = True
End Function
Good morning,

Thanks a lot! It works perfectly!
 
Upvote 0
Updated code to get rid of the global variables.
VBA Code:
Sub GetWordData()
    Dim FolderName As String
    Dim filename As String
    Dim NewWordFile As New Word.Application
    Dim NewDoc As New Word.Document
    Dim wdDoc As Object
    Dim wdFileName As Variant
    Dim TableNo As Integer 'table number in Word
    Dim iRow As Long 'row index in Excel
    Dim iCol As Integer 'column index in Excel
    Dim i As Integer
    Dim WordFiles() As String

    FolderName = "C:\Users\rriveragarrido\Desktop\Proyectos\Proyecto solaris (endesa) (PROPIO)\prueba macros\ZZZ\try\"
    If Not LoopAllSubFolders(WordFiles, FolderName, "*.docx") Then
        MsgBox "No files found."
        Exit Sub
    End If
   
    'Comineza el loop para que corra en cada una de las narrativas.
    For i = 0 To UBound(WordFiles)
        Set wdDoc = GetObject(WordFiles(i))
        With wdDoc
        'Este check sirve para identificar si no hay ninguna tabla en el word. En caso de ser así, te devuelve un mensaje indicándotelo.
            TableNo = wdDoc.Tables.Count
            If TableNo = 0 Then
                MsgBox "This document contains no tables"
            End If
            'FROM HERE IT STARTS TO EXTRACT THE INFORMATION FROM ALL THE WORDS LOCATED IN THE FOLDER.
            'Nombra la fila en la que va a empezar el código:
            RowOutputNo = Range("O1").End(xlDown).Row
            ColOutputNo = 15

            'Pasa por cada una de las tablas que hay en la narrativa. En cada narrativa hay varias tablas, por lo que es importante determinar en cada narrativa cuantas hay.
            For tbBegin = 1 To TableNo
            'Copia la información de cada una de las celdas de word a excel.
                With .Tables(tbBegin)
                    For RowNo = 1 To .Rows.Count
                        For ColNo = 2 To .Columns.Count
                            Cells(RowOutputNo, ColOutputNo) = WorksheetFunction.Clean(.cell(RowNo, ColNo).Range.Text)
                        Next ColNo
                        RowOutputNo = RowOutputNo + 1
                    Next RowNo
                End With
                RowOutputNo = RowOutputNo
            Next tbBegin
        End With

        'Sort the table to select next narrative
        Range("Table1").Sort Key1:=Range("O1"), Order1:=xlDescending, Header:=xlYes
        Range("Table1[[#Headers],[Responsable Actividad]]").Select
        Selection.End(xlDown).Select
        ActiveCell.Offset(rowOffset:=1, columnOffset:=0).Activate
    Next
End Sub

Function LoopAllSubFolders(ByRef FoundFiles() As String, _
    ByVal folderPath As String, _
    Optional fileFilter As String = "*", _
    Optional fCount As Integer = -1, _
    Optional firstRun As Boolean = True) As Boolean
   
    Dim filename As String
    Dim fullFilePath As String
    Dim numFolders As Long
    Dim folders() As String
    Dim i As Long
    Dim fileExt As String
   
    If fileFilter = "" Then
        If MsgBox("File filter cannot be a null string." & vbCrLf & vbCrLf & _
          "Do you want to find all files (*.*)?", vbYesNo, "File Filter") = vbYes Then
            fileFilter = "*.*"
        Else
            Exit Function
        End If
    End If
    If Right(folderPath, 1) <> "\" Then folderPath = folderPath & "\"
    filename = Dir(folderPath & "*.*", vbDirectory)
    While Len(filename) <> 0
        If Left(filename, 1) <> "." Then
            fullFilePath = folderPath & filename
            If (GetAttr(fullFilePath) And vbDirectory) = vbDirectory Then
                ReDim Preserve folders(0 To numFolders) As String
                folders(numFolders) = fullFilePath
                numFolders = numFolders + 1
            ElseIf filename Like fileFilter Then
                fCount = fCount + 1
                ReDim Preserve FoundFiles(fCount)
                FoundFiles(fCount) = folderPath & filename
            End If
        End If
        filename = Dir()
    Wend
   
    For i = 0 To numFolders - 1
        LoopAllSubFolders FoundFiles, folders(i), fileFilter, fCount, False
    Next i
    If fCount > -1 Then LoopAllSubFolders = True
End Function
Hello, is there a way to make it find all words with an extension of "*.docx" and "*.doc" as well?
 
Upvote 0

Forum statistics

Threads
1,214,947
Messages
6,122,413
Members
449,082
Latest member
tish101

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top