Extract Word paragraphs

NdNoviceHlp

Well-known Member
Joined
Nov 9, 2002
Messages
3,637
From XL, I'm trying to search a summary document for a keyword, then copy the paragraph containing the keyword to another document.
I want to be able to do this for a variable amount of paragraphs. Currently, the following code finds the keyword in summary.doc (once) and then copies it to test.doc It's a start. I can't seem to find a way to select and copy the paragraph (even once). The keyword is always on line1 and the paragraph is always 4 lines if that helps? I'll appreciate any suggestions. Thanks. Dave
Code:
Private Sub CommandButton1_Click()
Dim Wdapp As Object, Wdapp2 As Object
Dim MyData As DataObject, Bigstring As String
'finds and extracts paragraphs from summary.doc to test.doc
'identifies paragraphs by key word in 1st line
'copies paragraph to clipboard then transfer to test.doc
'variable # of paragraphs to copy/transfer
On Error GoTo Evlmsg1
Set Wdapp = CreateObject("Word.application")
    Wdapp.ChangeFileOpenDirectory "c:\records\"
    Wdapp.documents.Open Filename:="summary.doc"
    With Wdapp.activedocument
     .Range(0, .Characters.Count).Select
    End With
 'keyword "seed" eg
 With Wdapp.Selection.Find
    .ClearFormatting
    .Execute FindText:="seed", Format:=False, Forward:=True
    '***line below errors
    '.Expand Unit:=wdParagraph
 End With
With Wdapp.Selection
.Range.Copy
End With
Wdapp.activedocument.Close savechanges:=True
Returntocode:
On Error GoTo 0
Wdapp.Quit
Set Wdapp = Nothing
Set MyData = New DataObject
    MyData.GetFromClipboard
    Bigstring = MyData.GetText(1)
On Error GoTo Evlmsg2
Set Wdapp2 = CreateObject("Word.Application")
    Wdapp2.ChangeFileOpenDirectory "c:\"
    Wdapp2.documents.Open Filename:="test.doc"
'use to initial clear test.doc
    With Wdapp2.activedocument
    .Range(0, .Characters.Count).Delete
    End With
With Wdapp2.activedocument
    .content.insertafter Bigstring
    End With
Wdapp2.activedocument.Close savechanges:=True
Wdapp2.Quit
Set Wdapp2 = Nothing
Exit Sub
Evlmsg1:
MsgBox "error1"
GoTo Returntocode
Evlmsg2:
On Error GoTo 0
Wdapp2.Quit
Set Wdapp2 = Nothing
MsgBox "Error2"
End Sub
 
This is much better and perhaps more readable. I got rid of the reference to the Word library by inserting numeric values for the Word code that XL didn't like (Thanks Jon!). The clipboard use was unneccessary and was removed. I re-worked the loop and I added a couple of functions to ensure the routine works even if either the source or transfer document is open. I use it to retrieve supporting documentation from Word documents for data totals stored in XL. This works only for 1 line paragraphs! I thought it might be useful for others, so here's the post. Dave
Code:
Sub myFind()
'Standard module
Dim Wapp As Object, Wapp2 As Object, Bigstring As String
Dim Message$, Title$, Default$, myTopic$, myData$
Dim Myrange As Variant, Myrange2 As Variant, Adjust As Integer
Dim ThisParaLoc As Integer, LastParaLoc As Integer, FirstParaloc As Integer
Dim Sourcefile As String, Transferfile As String

'Finds and extracts variable # of paragraphs containing keyword
'Transfer keyword paragraph along with next 4 paragraphs...
'from scource file to bigstring then to transfer file
'functions use file copies to manage file errors
'***Paragraphs are 1 line only

'Get keyword from user
Message = "Enter a topic to find:" ' Set prompt.
Title = "Get Topic!" ' Set title.
Default = "fertilizer" ' Set default.
' Display message, title, and default value.
myTopic = InputBox(Message, Title, Default)


Sourcefile = "c:\records\summary.doc" 'source document path
Transferfile = "c:\test.doc" 'search results transfer document path

'copy scource file to temp file & error check
If NoFileError(Sourcefile) Then
Exit Sub
End If

'Open source temp file and search
Set Wapp = CreateObject("Word.Application")
Wapp.documents.Open Filename:=Sourcefile & FreeFile(), ReadOnly:=True

ThisParaLoc = 0 'found paragraph line#
Bigstring = vbNullString 'combines paragraphs

'find last paragraph
Wapp.activedocument.Select
LastParaLoc = Wapp.Selection.Paragraphs.Count 'last paragraph(line#)

'Find keyword(myTopic) in the source WORD document
'loop to find all keywords in doc.
Do While ThisParaLoc < LastParaLoc
On Error GoTo Err2
Set Myrange2 = Wapp.activedocument.Paragraphs(ThisParaLoc + 1).Range
Myrange2.SetRange Start:=Myrange2.Start, _
    End:=Wapp.activedocument.Paragraphs(LastParaLoc).Range.End
Myrange2.Select
With Wapp.Selection.Find
.Text = myTopic
.Forward = True
.Execute

'expand keyword selection to whole paragraph(line) selection
'add next 4 paragaphs(lines) to selection
'adjust for pages and selection on more than 1 page
If .Found = True Then
MsgBox "Searching data WORD document for: [ " & myTopic & " ]!"
On Error GoTo Err3
.Parent.Expand Unit:=4
If Wapp.Selection.Information(3) > 1 Then
If Wapp.Selection.Information(10) <> 46 Then
Adjust = Wapp.Selection.Information(3) * 46 - 46
Else
Adjust = (Wapp.Selection.Information(3) - 1) * 46 - 46
End If
End If
FirstParaloc = Wapp.Selection.Information(10) + Adjust
Set Myrange = Wapp.activedocument.Paragraphs(FirstParaloc).Range
Myrange.SetRange Start:=Myrange.Start, _
    End:=Wapp.activedocument.Paragraphs(FirstParaloc + 4).Range.End
ThisParaLoc = FirstParaloc + 4
Myrange.Select
myData = Wapp.Selection.Text
MsgBox myData
Else
Exit Do
End If
End With

'store in bigstring
 Bigstring = Bigstring + myData
Loop

'transfer file to temp file & error check
If NoFileError(Transferfile) Then
GoTo RetErr2
End If

'clear transfer temp file
'transfer from bigstring to transfer temp file
On Error GoTo Err4
Set Wapp2 = CreateObject("Word.Application")
Wapp2.documents.Open Filename:=Transferfile & FreeFile(), ReadOnly:=False
Wapp2.activedocument.Select
With Wapp2.activedocument
    .Range(0, .Characters.Count).Delete
    .content.insertafter Bigstring
    End With
Wapp2.activedocument.Close savechanges:=True
Wapp2.Quit
Set Wapp2 = Nothing

'temp transfer file to real tranfer file
BackToReal Transferfile, False 'filepath,source doc

'handle errors
On Error GoTo Err5
RetErr2:
RetErr3:
RetErr4:
Wapp.Quit
Set Wapp = Nothing
Set mydata2 = Nothing
Set mydata3 = Nothing

'temp source file to real file
BackToReal Sourcefile, True 'filepath,source doc

GoTo myEnd
Err2:
MsgBox "Error2: Could not Find Topic: [ " & myTopic & " ]"
GoTo RetErr2
Err3:
MsgBox "Error3: Could not get paragraph from the data's WORD document!"
GoTo RetErr3
Err4:
MsgBox "Error4: Could not Paste paragraph from the data's WORD document!"
GoTo RetErr4
Err5:
MsgBox "Error5: Could not Clean up!"
On Error GoTo 0
'End tag!
myEnd:
End Sub
Function NoFileError(Flpath As String) As Boolean
'check if file exists. Copy to temp file from real file
Dim fs As Object
Set fs = CreateObject("Scripting.FileSystemObject")
If fs.fileexists(Flpath) Then
fs.copyfile Flpath, Flpath & FreeFile()
NoFileError = False
Else
MsgBox "Error: This file does not exist: " & Flpath
NoFileError = True
End If
Set fs = Nothing
End Function
Function BackToReal(Retpath As String, Flscource As Boolean) As Boolean
'if real source file open continue (leave source file open)
'if real transfer file open, close it and continue
'copy temp file to real file
Dim fs As Object, Objwordapp As Object, d As Variant
Set fs = CreateObject("Scripting.FileSystemObject")
On Error GoTo Errcode
fs.copyfile Retpath & FreeFile(), Retpath
Kill Retpath & FreeFile()
Set fs = Nothing
Exit Function
Errcode:
On Error GoTo 0
If Flscource Then
MsgBox "Transfer proceeding. This source file remains open:  " & Retpath
Else
MsgBox "Close all Word Docs. Transfer proceeding. This transfer file was open:  " & Retpath
Set Objwordapp = GetObject(, "word.application")
With Objwordapp
.Application.Quit
End With
Set Objwordapp = Nothing
fs.copyfile Retpath & FreeFile(), Retpath
End If
Kill Retpath & FreeFile()
Set fs = Nothing
End Function
 
Upvote 0

Excel Facts

Ambidextrous Undo
Undo last command with Ctrl+Z or Alt+Backspace. If you use the Undo icon in the QAT, open the drop-down arrow to undo up to 100 steps.
Hope you guys (ndNoviceHlp and Joe Was) are still watching this one. Considering the code you are using, maybe you can help me with this Problem:

I use Excel as data input to create a word document using find/replace.

Once completed I need to make sure that certain sections in word are not split with an auto page break. I can write code to know the start text and the ending text of a section (since I put it on there with VB).

But how can I determine that a page break sits in the middle of my section?

Hope this is emough detail and that I explain it in sufficient detail
 
Upvote 0
I used this section of code to deal with pagebreaks...
Code:
If Wapp.Selection.Information(3) > 1 Then
If Wapp.Selection.Information(10) <> 46 Then
Adjust = Wapp.Selection.Information(3) * 46 - 46
Else
Adjust = (Wapp.Selection.Information(3) - 1) * 46 - 46
End If
End If
FirstParaloc = Wapp.Selection.Information(10) + Adjust
Set Myrange = Wapp.activedocument.Paragraphs(FirstParaloc).Range

This is actually...
Code:
If Wapp.Selection.Information(wdActiveEndPageNumber) > 1 Then
If Wapp.Selection.Information(wdFirstCharacterLineNumber) <> 46 Then
Adjust = Wapp.Selection.Information(wdActiveEndPageNumber) * 46 - 46
Else
Adjust = (Wapp.Selection.Information(wdActiveEndPageNumber) - 1) * 46 - 46
End If
End If
FirstParaloc = Wapp.Selection.Information(wdFirstCharacterLineNumber) + Adjust
Set Myrange = Wapp.activedocument.Paragraphs(FirstParaloc).Range

I added this because the page break was causing failure in this routine. Why or how it works I'm not quite sure right now. Perhaps Joe can explain? I do remember that the equation only worked the way it is even though the "-46" part looks silly. This routine finds & extract x5 one line paragraphs and places them grouped together in a new doc ie. you can enter the paras without concern for pagebreaks then extract your information grouped without pagebreaks into a new doc. I'm not sure if this helps. Dave
 
Upvote 0
Interesting possibility, it would potentialy cleanup my code significantly

I added your code (changed Wapp to oWordApp to match my code.

Here is some of my code

global oWordApp as object
Set oWordApp = CreateObject("Word.Application")
' then I open the document

I get "Object variable or with block variable not set" in the first line of your code

Any ideas? Will show more code is needed.....
 
Upvote 0
My mistake, I found the problem with my code in the definition of the word object.

My next wuestion is: What do I do in my document to "group" things together, else how does it know what to consider a "paragraph" ?
 
Upvote 0
The posted code finds the search item and then the selection is expanded to include the whole para of the search item which then provides a para idendification number which is then used to set a range containing the found para + 4 more paras. This new group of paras is then set and selected and its text is converted to a string variable (myData). This group is then combined with other found groups(bigstring) which makes the new doc. For eg. if you replace the code posted with the following, only the para with the search item forms a group(mydata). These single groups are then re-grouped to form the doc(bigstring) ie the new doc contains all of the single paras which contained the search item.
Code:
Myrange.SetRange Start:=Myrange.Start, _
    End:=Wapp.activedocument.Paragraphs(FirstParaloc).Range.End
ThisParaLoc = FirstParaloc ' + 4

I thought if you knew how many paragraphs/lines you were extracting(mydata), you could then manage their combination in the new doc(bigstring). As for your questions, the paras are assigned a number which can be referred to. How you want to group them is probably a function of how they were inputted to begin with ie. If they were added to the search doc in groups of 5 lines (like the posted eg) you would prabably want to extract them in groups of 5 lines. Dave
 
Upvote 0
Here is what I discovered

I included both codes in the program

In Word I select the lines that I want to keep together and go to Format / paragraph - select tab Line and Pags Breaks and I checked

Widow/Orphan Control
Keep lines together
Keep with next

IT WORKS

Plan to experiment with you code if I really need both or if one will do it

Will keep you posted. You should consider yourself a genius.....

Thanks much
 
Upvote 0
It seems that there has been a fair bit of interest in this post since its' original posting so I thought I would post a new and improved version. Firstly, let me apologize to those whom had difficulties using the previous version. Unknown to myself at the time of the original posting, Word pagination has to be turned on in order for the code to work. Also, the code was MS version specific. I believe that I have remedied these imperfections with the following code which also provides a search for 2 keywords within a specified range. Dave

To use the following code:
Code:
Sub Callfunction()
Sfile = "c:\records\summary.doc" 'search document path
Tfile = "c:\test.doc" 'search results document path
Mytop2 = 2004 ' 2nd search word. Set to 1 if only 1 word
MyTop1 = "seed" '1st search word
TParas = 5 '# of paras to extract includes para where Mytop1 found

If Searchwords(TParas, Mytop2, MyTop1, Sfile, Tfile) Then
MsgBox "Transfer complete"
Else
MsgBox "Transfer not successful"
End If
End Sub

New code:
Code:
Option Explicit
Public TParas As Integer, MyTop1$, Mytop2$, Sfile As String, Tfile As String
 
Public Function Searchwords(TotParas As Integer, MyTopic2$, Mytopic1$, _
    Sourcefile As String, Transferfile As String) As Boolean
     
     'Extract and transfer range between Word docs. Standard module code
     'searches sourcefile for 1 or 2 words in a range defined by...
     'mytopic1(para 1/line 1) add totparas - 1 ie. both mytopic1 & mytopic2...
     'must be within number of paras/lines defined by TotParas to be found
     '(use 1 for mytopic2 if only 1 word search in range)
     'copies found range to transferfile
     'uses file copies to prevent/correct errors
     '** paragraphs are 1 line only
    
    Dim Wapp As Object, Wapp2 As Object, Bigstring As String
    Dim myData$, TemP As String, PagFlag As Boolean
    Dim Myrange As Variant, Myrange2 As Variant, Adjust As Integer
    Dim ThisParaLoc As Integer, LastParaLoc As Integer, FirstParaloc As Integer
     
    Searchwords = True
     
     'copy scource file to temp file & error check
    If NoFileError(Sourcefile) Then
        Exit Function
    End If
     
    On Error GoTo RetErr
     'Open source temp file and search
    Set Wapp = CreateObject("Word.Application")
    TemP = Left(Sourcefile, Len(Sourcefile) - 4) & "T.doc"
    Wapp.documents.Open Filename:=TemP, ReadOnly:=True
     
    ThisParaLoc = 0 'found paragraph line#
    Bigstring = vbNullString 'combines paragraphs
     
      'turn on pagination
    PagFlag = False
    If Wapp.Options.Pagination = False Then
        Wapp.Options.Pagination = True
        PagFlag = True
    End If
     
     'find last paragraph
    Wapp.ActiveDocument.Select
    LastParaLoc = Wapp.Selection.Paragraphs.Count 'last paragraph(line#)
     
     'Find keyword(myTopic) in the source WORD document
     'loop to find all keywords in doc.
    On Error GoTo RetErr2
    Do While ThisParaLoc < LastParaLoc
        Set Myrange2 = Wapp.ActiveDocument.Paragraphs(ThisParaLoc + 1).Range
        Myrange2.SetRange Start:=Myrange2.Start, _
        End:=Wapp.ActiveDocument.Paragraphs(LastParaLoc).Range.End
        Myrange2.Select
        With Wapp.Selection.Find
            .Text = Mytopic1
            .Forward = True
            .Execute
             
            If .Found = True Then
                On Error GoTo RetErr3
                
                'if > 1 page
                If Wapp.Selection.Information(3) > 1 Then
                    Adjust = Wapp.Selection.Information(3) * 46 - 46
                End If
                 
                 'expand keyword selection to whole paragraph(line) selection
                .Parent.Expand Unit:=4
                
                'expange range to include total # of paragraphs/lines
                FirstParaloc = Wapp.Selection.Information(10) + Adjust
                Set Myrange = Wapp.ActiveDocument.Paragraphs(FirstParaloc).Range
                Myrange.SetRange Start:=Myrange.Start, _
                End:=Wapp.ActiveDocument.Paragraphs(FirstParaloc + (TotParas - 1)).Range.End
                ThisParaLoc = FirstParaloc + (TotParas - 1)
                Myrange.Select
                 
                 'find 2nd keyword
                If MyTopic2 <> 1 Then
                    Myrange.Select
                    With Wapp.Selection.Find
                        .Text = MyTopic2
                        .Forward = True
                        .Execute
                    End With
                     
                    If .Found = True Then
                        Myrange.Select
                        myData = Wapp.Selection.Text
                    Else
                        GoTo Below
                    End If
                     
                End If
                 
                myData = Wapp.Selection.Text
            Else
                Exit Do
            End If
             
        End With
         
         'store in bigstring
        Bigstring = Bigstring + myData
        MsgBox Bigstring 'this line optional ie. comment out
Below:
    Loop
     
    On Error GoTo RetErr4
     'transfer file to temp file & error check
    NoFileError (Transferfile)
     
     'temp transfer file to real tranfer file (errorcheck/correct)
    BackToReal Transferfile, False
     
     'transfer from bigstring to transfer file
    Set Wapp2 = CreateObject("Word.Application")
    Wapp2.documents.Open Filename:=Transferfile, ReadOnly:=False
    Wapp2.ActiveDocument.Select
    With Wapp2.ActiveDocument
        .Range(0, .Characters.Count).Delete
        .content.insertafter Bigstring
    End With
    Wapp2.ActiveDocument.Close savechanges:=True
    Wapp2.Quit
    Set Wapp2 = Nothing
     
    Wapp.Quit
    Set Wapp = Nothing
     
     'temp source file to real file
    BackToReal Sourcefile, True
    Exit Function
     
     'handle errors
    Searchwords = False
RetErr:     On Error GoTo 0: MsgBox "Source Doc Error": Wapp.Quit: _
    Set Wapp = Nothing: Kill TemP: GoTo Erbelow
RetErr2:     On Error GoTo 0: MsgBox "Search Error": Wapp.Quit: _
    Set Wapp = Nothing: Kill TemP: GoTo Erbelow
RetErr3:     On Error GoTo 0: MsgBox "Range Creation Error": Wapp.Quit: _
    Set Wapp = Nothing: Kill TemP: GoTo Erbelow
RetErr4:     On Error GoTo 0: MsgBox "Transfer Doc Error": Wapp2.Quit: _
    Set Wapp2 = Nothing: Kill TemP: Kill Left(Transferfile, Len(Transferfile) - 4) _
    & "T.doc"
Erbelow: If PagFlag Then
        Wapp.Options.Pagination = False
    End If
End Function
 
Function NoFileError(Flpath As String) As Boolean
     'check if file exists. Copy to temp file from real file
    Dim fs As Object, TemP3 As String
    Set fs = CreateObject("Scripting.FileSystemObject")
    If fs.fileexists(Flpath) Then
        TemP3 = Left(Flpath, Len(Flpath) - 4) & "T.doc"
        fs.copyfile Flpath, TemP3
        NoFileError = False
    Else
        On Error GoTo 0
        MsgBox "Error: This file does not exist: " & Flpath
        NoFileError = True
    End If
    Set fs = Nothing
End Function
 
Function BackToReal(Retpath As String, FlScource As Boolean)
     'if real source file open continue (leave source file open)
     'if real transfer file open, close it and continue
     'copy temp file back to real file & kill temp file
    Dim fs As Object, Objwordapp As Object, d As Variant, TemP2 As String
    TemP2 = Left(Retpath, Len(Retpath) - 4) & "T.doc"
    Set fs = CreateObject("Scripting.FileSystemObject")
    On Error GoTo Errcode
    fs.copyfile TemP2, Retpath
    Set fs = Nothing
    Kill TemP2
    Exit Function
     
Errcode:
    On Error GoTo 0
    If FlScource Then
        MsgBox "Transfer proceeding. This source file remains open:  " & Retpath
    Else
        MsgBox "Close all Word Docs. Transfer proceeding. This transfer file was open:  " & Retpath
        Set Objwordapp = GetObject(, "word.application")
        With Objwordapp
            .Application.Quit
        End With
        Set Objwordapp = Nothing
        fs.copyfile TemP2, Retpath
    End If
    Set fs = Nothing
    Kill TemP2
End Function
 
Upvote 0

Forum statistics

Threads
1,216,075
Messages
6,128,662
Members
449,462
Latest member
Chislobog

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top