structured data export word to excel via vba - optimisation opinion

Jtufplata

Board Regular
Joined
Jun 24, 2013
Messages
51
Hello everyone,
i would like your opinion on the following:

I need to extract data from several word documents (about 80 pages each). These documents have somewhat of a standard formatting that enables the structured extraction. Example:
  1. word in upper case in document will go to column B
  2. words with alignment type = right with go to column D etc...

Each line/paragraph in word doc will have unique row in excel. Initially the whole extraction would take about 20 mins, and after some experimentation, I managed to reduce processing time to 45-60 seconds.

this is the loop used to read the word document, and I wanted to know what could be faster than this (other loop, or other way to read/write?)

used loop to read the document
Code:
For Each p In word_doc.Paragraphs
        w_h_cor = p.Range.HighlightColorIndex
        w_font_cor = p.Range.Font.ColorIndex 
        w_bold = p.Range.Font.Bold 
        w_list = p.Range.ListFormat.ListType 
        w_length = Len(p.Range.Text)
        w_descr = p.Range.Text
        w_align = p.Alignment
        Call import_export(w_h_cor, w_font_cor, w_bold, w_list, w_length, w_descr, w_align, i)
        i = i + 1
    Next p

and this is how i write it into excel and structure the data:

Code:
Public Sub import_export(ByVal var_h_cor As Integer, _
                    ByVal var_font_cor As Integer, _
                    ByVal var_bold As Long, _
                    ByVal var_list As Integer, _
                    ByVal var_length As Long, _
                    ByVal var_descr As String, _
                    ByVal var_align As Integer, _
                    ByVal var_i As Long)
ReDim v_array(0 To ubound_import_export, 1 To 5, 1 To 3) ' ubound_import_export is the number of lines in doc
Dim w_tipo, w_cor As Integer
If var_length > 2 Then  'to ignore certain lines 'Or var_list <> 2
    If var_descr = UCase(var_descr) Then
        w_tipo = 1    'type 1
    End If
    If (var_bold <> -1 Or var_bold = 0) And var_descr <> UCase(var_descr) And var_list <> 2 Then
        w_tipo = 3    'type 3
    End If
    If (var_bold = -1 Or var_bold <> 0) And var_descr <> UCase(var_descr) And var_list <> 2 And var_align <> wdAlignParagraphRight Then
        w_tipo = 4    'type 4
    End If
    If (var_font_cor = wdRed Or var_h_cor = wdRed) Or var_font_cor = 6 Or var_h_cor = 6 Then  'Or var_font_cor = 11 Or var_h_cor = 11
        w_cor = 1   'red
        Else: w_cor = 3 
    End If
    If (var_font_cor <> wdRed And var_font_cor <> wdAuto And var_font_cor <> wdBlack) Or _
    var_h_cor <> wdRed And var_h_cor <> wdAuto And var_h_cor <> wdBlack Then
        w_cor = 2 'green
        'Else: w_cor = 3 
    End If
    If var_align = wdAlignParagraphRight Then
        w_tipo = 2    'juizo
    End If
    If var_list = 2 And var_align <> wdAlignParagraphRight Then  
        w_tipo = 5 ' type 5
    End If
   v_array(var_i, w_tipo, w_cor) = Trim(WorksheetFunction.Clean(var_descr))
   ActiveWorkbook.Worksheets("PADRAO").Cells(var_i - cnt_null, w_tipo * 2) = v_array(var_i, w_tipo, w_cor)
   Call atribui_cor(w_cor, var_i - cnt_null, w_tipo)
     Else: cnt_null = cnt_null + 1
End If


End Sub

to migrate colors in word doc to excel i have:

Code:
Sub atribui_cor(ByVal var_cor As Integer, _
                ByVal var_row As Long, _
                ByVal var_col As Integer)
    Select Case var_cor
        Case 1
            ActiveWorkbook.Worksheets("PADRAO").Cells(var_row, var_col * 2).Interior.Color = vbRed
        Case 2
            ActiveWorkbook.Worksheets("PADRAO").Cells(var_row, var_col * 2).Interior.Color = vbGreen
    End Select
End Sub

thank you all
 

Excel Facts

Remove leading & trailing spaces
Save as CSV to remove all leading and trailing spaces. It is faster than using TRIM().

Forum statistics

Threads
1,224,225
Messages
6,177,273
Members
452,765
Latest member
Erka Gizli

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top