Retreive data from Word

waterdog15

New Member
Joined
Dec 27, 2009
Messages
15
How do I retrieve text from a word document that is alway is a specific place or after a certain string? Someone else created a series of word documents that all start like this:


Date: _________ Name: _________
Reason: __________ Zone: _________

and so on. These documents originally were meant to be printed and then hand written, so it is not any type of table or form but all of the documents have the same layout. We eventually went high tech and started typing in the information in the word document. So what I am attempting to do is retreive the data from the Name "field", the Reason "field" and the Zone "field".

I am thinking I could have Excel open the document, then look for the text "Name:" and then return the next 25 characters. Then do the same for "Reason:" and then "Zone:". I can open the documents but do not know how to do this.

So my question is, is excel capable of searching a document for a string and then returning the next 25 characters?

Thanks for any suggestions
 
Last edited:

Excel Facts

Best way to learn Power Query?
Read M is for (Data) Monkey book by Ken Puls and Miguel Escobar. It is the complete guide to Power Query.
Hi waterdog15,

The answer is definitely yes, Excel can do this. But it would be a bit easier (no search required) if you were to replace the 25-character form fields with true form fields. Then you can give these fields names and have Excel simply access them by the Form Field object name. Form fields also give you advantage of validation, e.g., you could then ensure that a valid date is entered into the Date field, and a valid zone (whatever that is) is entered into the Zone field.

Damon
 
Upvote 0
I plan to do that going forward (and I will need to know how to do that as well) but what can I do with the old forms that are already done?

Thank-you,
Paul
 
Upvote 0
Okay Paul, here is some code:

Code:
Sub GetFieldText()

   'Demo of getting text from fields such as "Name:_____________"
   
   Dim WdDoc      As Object
   Dim ChStart    As Integer
   Dim ChEnd      As Integer
   Dim StrName    As String   'The resulting string we want to find
   
   Set WdDoc = GetObject("c:\MyOldForm.docx")
   
   With WdDoc
      If .Range Like "*Name:*" Then
         'match found
         'Look for text between "Name:" and "Reason:"
         ChStart = InStr(1, .Range, "Name:") + 6
         ChEnd = InStr(ChStart, .Range, "Reason:") - 2
         StrName = Trim(.Range(Start:=ChStart, End:=ChEnd))
         Range("B4") = "Name:"
         Range("C4") = StrName
      End If
   End With
   
End Sub

This code looks for the text between "Name:" and "Reason:" and ignores the two characters before "Reason" assuming they are carriage return-linefeed, or other characters to be ignored. It puts the result in the cell C4 of the active worksheet. Notice that I named the Word file "MyOldForm.docx". You should change this to your document name (would be .doc instead of .docx if created in a version of Word prior to 2007). Also you should replace my C:/ path with the actual path to your Word document.

The efficient way of doing this would have been to use the Find method of the Word Range object, but this is very unwieldy to use from VBA, so I used a brute force method: a combination of the Like operator (to determine if the document contains the "Name:" string and the InStr function (to determine where the "Name" and "Reason" strings start). This should be adequate for a simple form.

I hope you find this helpful.

Keep Excelling.

Damon
 
Upvote 0
I modified the code as shown below but instead of subtracting 15 from where it found "Lockout Supervisor (if required):" I changed it to add 20 to the starting point. The problem is if the lockout coordinators name is less than 20 characters, it will take the next characters in the document. For example, if my document looked like this:

Lockout Coordinator: John Doe
Lockout Supervisor: Jimmy Doe
The cell would contain:

John DoeLockout Sup <THE p small a shows also cell square<>between the e and the L (I think it is
the carrage return symbol)

If I put an underscore after the name, it would look like:
John Doe_________

Is there a way to have it stop at the underscore or 20 characters?

Thanks again,
Paul


Sub GetFieldText()
'Demo of getting text from after fields such as "Lockout Coordinator:"

Dim WdDoc As Object
Dim ChStart As Integer
Dim ChEnd As Integer
Dim StrName As String 'The resulting string we want to find

Set WdDoc = GetObject("c:\test.doc")

With WdDoc
If .Range Like "*Lockout Coordinator:*" Then
'match found
'Look for text after "Lockout Coordinator:"
ChStart = InStr(1, .Range, "Lockout Coordinator:") + 20
ChEnd = InStr(1, .Range, "Lockout Coordinator:") + 40
StrName = Trim(.Range(Start:=ChStart, End:=ChEnd))
Range("B4") = "Lockout Coordinator:"
Range("C4") = StrName
End If
End With

End Sub
 
Upvote 0
Hi again Paul,

One way would be to have the InStr function look for the first underscore character after the start of the name, then back up one. I think this should do it:

ChEnd = InStr(ChStart+1, .Range, "_") - 1

Damon
 
Upvote 0
Thanks Damon!

It worked perfect with 2 of the 3 items. For some reason it doesn't stop at the first _ on the one field that looks like:

mixing T (zone 3)________________

but I changed it to following and it works great!

ChEnd2 = InStr(ChStart2 + 1, .Range, ")")


By the way, when I import items from the actual table, I lose the hyphens. Do you have any idea why?

Thanks again!
Paul
 
Upvote 0

Forum statistics

Threads
1,214,978
Messages
6,122,545
Members
449,089
Latest member
davidcom

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top