Application sometimes fails when accessing Word

WT Cline

New Member
Joined
May 30, 2018
Messages
14
Hi all - yes I hate that "sometimes" word too. But in this case, I cannot find any good reason for this failure. Must be said upfront - the application I describe below works perfectly....until it doesn't. So, the code itself is ok. But clearly the code is doing something that it eventually doesn't like, and I get errors. I am using the latest and greatest Office 365 ProPlus versions.

My application loops through a text file, reading/pulling various bits of information, to be written to a spreadsheet. Part of that information is a Word document file name. My program needs to read the Word file for more specific information. So, my code calls a function that:
  1. creates a FileSystemObject (Set FSO = CreateObject("Scripting.FileSystemObject"))
  2. creates a Word app (Set wdapp = CreateObject("Word.Application"))
  3. Using the FSO object, checks to see if the folder and file exist, if so,
  4. opens the Word file via wdapp
  5. reads/captures the specific data
  6. closes the Word file
  7. sets the file and Word app objects to "nothing"
  8. returns to the calling subroutine.
Within a single .txt file, this function will be called up anywhere from 1 to 90 times. The function is calling UNIQUE word docs, by the way, not the same Word doc over and over.

When the application is finished scanning through the .txt file, it goes to the next .txt file and repeats this process. There are 60 .txt files this application reads in this way. The whole point of the program is to go through all 60 .txt files and print the results onto a large spreadsheet. In doing so, the code will open/read/close 100s and 100s of Word docs. As I have said, it is all working....until it "dies" on me.

This works great for somewhere around 100-150 "loops", over a couple of the .txt files. Side note: reading through the .txt file is alarmingly FAST! But it slows way down in this function, apparently during the object creation, checking for folders and files, and opening of the Word file (a couple of seconds at least). That is probably not an issue, but I'm throwing it out there in case anybody thinks opening a Word file (these are small files, 1 - 3 pages, minimal text) shouldn't be so slow.

But never mind speed for the moment. While running this program, at some point, it will seem as if the program has stalled...and finally I will get an error, always within this Function. The errors are around not being able to open the file, or that Word has a problem and can't open the file, or that Office has an issue, or that maybe the file is corrupt. The files are not corrupt. I have checked them; they all open and read just fine, as expected.

So, I am thinking, might there be a limit as to how many times, in short succession, I can open/close a Word file? I know, doesn't make sense to me either...

Before you ask, no, it does NOT fail on the same file each time. I have stopped at this error, gone to the actual file, opened it - no problem. And, I can start and stop my application using other text files, and therefore it will be checking a whole different set of Word docs, and the error will still occur. The files are always where they should be, appear to be available to open, and not locked (I check for that possibility in the code).

Based on this description, anybody have any ideas what I might look for, or other coding methods to avoid the constant createobject/open/read/close cycle? For example, is there a way to pass the objects via the Function, so I can create only one instance of the FileSystemObject and Word.app in the main sub, then use them many times within the function? not sure I know how to do that... but....willing to learn! Open to any ideas.

Many thanks
Terry
 

WT Cline

Thanks Paul - yes, that was the solution for me: moving object creation and opening the Word file to my main module BEFORE calling the function that formerly created the object and opened the Word document. Sped everything up (reduced processing time by about 75%) and the application no longer mysteriously crashes. Thanks.
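
For anyone finding this thread later, the change described above might look roughly like the sketch below. It is not the actual code from this project: the names ProcessAllFiles and ReadWordData, the sample paths, and the "first paragraph" extraction are all placeholders made up for illustration. The point is only that the FileSystemObject and the Word application are created once in the main routine and handed to the function as arguments.

```vba
Option Explicit

' Hypothetical main routine: creates the shared objects once, then reuses them.
Sub ProcessAllFiles()
    Dim fso As Object, wdApp As Object
    Dim docPaths As Variant, i As Long

    Set fso = CreateObject("Scripting.FileSystemObject")
    Set wdApp = CreateObject("Word.Application")
    wdApp.Visible = False

    ' Placeholder list; in the real program these names come from the .txt files.
    docPaths = Array("C:\Docs\Sample1.docx", "C:\Docs\Sample2.docx")

    On Error GoTo CleanUp
    For i = LBound(docPaths) To UBound(docPaths)
        Debug.Print ReadWordData(wdApp, fso, CStr(docPaths(i)))
    Next i

CleanUp:
    If Err.Number <> 0 Then Debug.Print "Error " & Err.Number & ": " & Err.Description
    ' Quit Word exactly once, however the loop ended.
    wdApp.Quit False
    Set wdApp = Nothing
    Set fso = Nothing
End Sub

' Hypothetical helper: receives the existing objects instead of creating its own.
Function ReadWordData(ByVal wdApp As Object, ByVal fso As Object, _
                      ByVal docPath As String) As String
    Dim wdDoc As Object

    ' Returns an empty string when the file is missing.
    If Not fso.FileExists(docPath) Then Exit Function

    Set wdDoc = wdApp.Documents.Open(FileName:=docPath, ReadOnly:=True, _
                                     AddToRecentFiles:=False)
    ReadWordData = wdDoc.Paragraphs(1).Range.Text   ' stand-in for the real extraction
    wdDoc.Close SaveChanges:=False
    Set wdDoc = Nothing
End Function
```

Note the explicit wdApp.Quit at the end. As far as I know, setting the variable to Nothing does not by itself shut down a Word instance started with CreateObject, so a version that creates Word on every call and never quits it may leave hidden WINWORD.EXE processes piling up. That may or may not have been what caused the failures here, but it is worth checking in Task Manager.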
 

Zack Barresse

MrExcel MVP
Joined
Dec 9, 2003
Messages
10,881
Office Version
  1. 365
  2. 2010
Platform
  1. Windows
  2. Mobile
  3. Web
Great information Terry, this is very helpful. And as always Paul, thank you.

Some things to think about, and some questions:

Information:
  • ADO Streams can be re-used.
  • Public variables can be helpful, although not as robust as Properties.
  • Classes can contain pretty much anything you want.
Questions:
  • Do you need to evaluate each line individually or can you loop through each found instance of your criteria? (Not sure how many you have.)
  • Do you need to evaluate your criteria in a Word document in that specific iteration, or can you make a list and do them all at once at some later point?
These are more workflow questions. I'm not saying it's not good, but these are the types of questions I would pose before putting a process(es) together.

For classes, while it's a little difficult for me to evaluate the need for them here, you most likely could use them. The question I would ask myself is, if I were to make this an object, what kind of properties or methods would I give it? It could be something as simple as storing multiple pieces of information. For example, if you wanted to store more than just a file path to a Word document, and also capture key words, paragraphs/data, or other relevant information, you could make a class to contain that data. I'm not sure it's prudent here to create one; I'm just giving an example of how you could.

I'm glad it's working much better for you. If there's anything else we can help with, don't hesitate to post back. (y)
 

WT Cline

Thanks, Zack, for your continued interest. Quick update: the code is running great right now (well, that's relative, since I am not sure whether other methods, using classes and properties and just better coding, would be even more efficient). I DID move the ADO Stream code so it is only called once; I am sure that is an improvement. I can tell you it takes about 33 minutes to run through 60 HTML files (I have them stored as Unicode-enabled .txt files...I probably could have done the same thing just directly as HTML files...), and I find a total of 2298 PDF files in these 60 HTML "pages".

While evaluating each .txt file I find the specific section of the HTML where a readable/downloadable PDF lives, and then find such values as the category, the downloadable link, the name of the document and a part number of the document. The HTML code for all of this is - thank goodness! - consistent. That is, once I know how to find one document, I can keep looping through the txt file, looking for more occurrences. And as part of each "loop" through the txt file, I am also opening a corresponding Word file to get even more data. At any rate, each time I find this information, I drop the values into an Excel spreadsheet for additional analysis. And when I get to the end of the txt file, I simply loop to the next txt file and start the whole process again.

So my biggest question might be - while I suppose I could set up a class for the data I am seeking, how (or is) that advantageous over simply looping through code that finds what I need, without any class created? I still need the code that takes each line to see if what I am looking for is somewhere in that line from the txt file. If not, go to the next line, if so, dig deeper and extract the info I am looking for, doing some validation at the same time.

Answering the two questions:
1) Do you need to evaluate each line individually... I start by assuming I do NOT know where in the txt file I will start finding the info I am looking for. So, yes, I literally built a For loop that starts at line one and goes sequentially through the array that contains all of the lines of data. Many lines are blank and many have no useful info, so in that sense I am wasting loops. But when I do find some data, I stop and run the data through other functions to try to extract what I am looking for. Does that answer the question? I don't really know what you mean by "found instance" of my criteria.
2) Do you need to evaluate your criteria in a Word document in that specific iteration... Currently I am programming in a very linear way, i.e. "Find this thing...once found, then go find this other thing, now go to Word to find some related info, then come back and find another thing, ok, I have everything I need, reset and start looking for the NEXT set of data..."

I suppose I COULD find my data, stick it in some other variable/array, and process it all later, once everything else is found. But I just thought then I'd be needing more code to match up the results of the Word search with the records it was related to. Either way, I have to open up a number of Word docs, or sometimes the same word doc again, to get other information. Long story there! So I just figured, why make it messier (and hope it doesn't slow things down), let's just get the data, complete the record, and go to the next record. I know "hope" is not a great programming technique!
 

Zack Barresse

There are really several distinct processes to your project. Look in text files, upon condition look in Word files, look for other matching conditions. As far as using a class, I'd say it's "meh" for this. Since you've never used them, I'd probably not recommend using them now. Probably best to learn them on something scoped a bit smaller.

This is interesting though...
WT Cline said:
"Either way, I have to open up a number of Word docs, or sometimes the same word doc again, to get other information."

Now this is a good reason to store the data in your txt iteration somewhere for use later. If there is a chance at utilizing the same Word file more than once, I would most certainly do something different. The goal would be to isolate all Word documents needed, along with a list of everything you want done for that document. This would make sense in a class, although you could also use a Collection, or both. I love intellisense and custom creation, and since Collections give no intellisense, using them with a class makes sense to me.
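
To make the "store it once, use it later" idea a bit more concrete, here is a minimal sketch using a Scripting.Dictionary keyed by file path. It assumes the data pulled from a given Word file does not change during the run. The names docCache and GetDocData are invented for this example, and reading the first paragraph is only a stand-in for whatever is really extracted.

```vba
Option Explicit

' Hypothetical cache: key = full path of the Word file, value = text pulled from it.
Private docCache As Object

Function GetDocData(ByVal wdApp As Object, ByVal docPath As String) As String
    Dim wdDoc As Object

    If docCache Is Nothing Then
        Set docCache = CreateObject("Scripting.Dictionary")
        docCache.CompareMode = vbTextCompare   ' Windows paths are not case-sensitive
    End If

    If Not docCache.Exists(docPath) Then
        ' First request for this file: open it once and remember the result.
        Set wdDoc = wdApp.Documents.Open(FileName:=docPath, ReadOnly:=True, _
                                         AddToRecentFiles:=False)
        docCache.Add docPath, wdDoc.Paragraphs(1).Range.Text   ' stand-in for the real extraction
        wdDoc.Close SaveChanges:=False
    End If

    ' Every later request for the same path is answered from memory.
    GetDocData = docCache(docPath)
End Function
```

With something like this, the second and later requests for the same path never touch Word at all. If more than one value per document is needed, the Dictionary item could be an instance of a small class instead of a plain string.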

All of that being said, it would take a little time to implement in your code. I don't have time tonight as I'm between jobs at the moment, but I'd be happy to tackle this tomorrow. Since we're getting pretty well involved in your process, would it be possible for us to see the entire routine(s)?
 

WT Cline


Sure! if you're THAT crazy, I will certainly feed your wonderful obsession! lol!

What I would probably do is send you an actual txt file so you can see exactly what I am dealing with. I can also send you an actual WORD file so that I can explain a bit more clearly why I sometimes will open the same Word file later. In fact, I am just now realizing it WILL happen MANY times as I walk through evaluating the files. There are actually two reasons....but the main one is that each txt file is for a different country or language. And we publish the same document file (a PDF) to many countries. So, yeah, my current method WILL open the same Word file as we go through all of the txt files.

Since I have been stuck focusing on performing my evaluations in a linear fashion, I see now that I am introducing far more repetition than I was at first considering! Yikes! And you are saying, if we put this information in a single place, once, and reference it anytime we need it, we save all of that useless opening and closing of the file. Doh! (as I smack my forehead!). Can/should we take this offline?
 

Macropod

Retired Moderator
Joined
Aug 27, 2007
Messages
3,490
WT Cline said:
"What I would probably do is send you an actual txt file so you can see exactly what I am dealing with. I can also send you an actual WORD file so that I can explain a bit more clearly why I sometimes will open the same Word file later. In fact, I am just now realizing it WILL happen MANY times as I walk through evaluating the files. There are actually two reasons....but the main one is that each txt file is for a different country or language. And we publish the same document file (a PDF) to many countries. So, yeah, my current method WILL open the same Word file as we go through all of the txt files.

Since I have been stuck focusing on performing my evaluations in a linear fashion, I see now that I am introducing far more repetition than I was at first considering! Yikes! And you are saying, if we put this information in a single place, once, and reference it anytime we need it, we save all of that useless opening and closing of the file."

There is no need to open/close the same Word document multiple times; you'd do better to open each one once, set a reference to it, do whatever progressive updating is required (you can update the referenced document however many times you need), then close it once all the processing is done.
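
A rough sketch of that open-once/close-once pattern, assuming a Word application object (wdApp) already exists in the calling code. The names openDocs, GetOpenDoc and CloseAllDocs are made up for illustration and are not from the original project.

```vba
Option Explicit

' Hypothetical store of open documents: key = full path, value = Word Document object.
Private openDocs As Object

' Returns the already-open document for docPath, opening it only on first use.
Function GetOpenDoc(ByVal wdApp As Object, ByVal docPath As String) As Object
    Dim wdDoc As Object

    If openDocs Is Nothing Then
        Set openDocs = CreateObject("Scripting.Dictionary")
        openDocs.CompareMode = vbTextCompare   ' treat paths as case-insensitive
    End If

    If Not openDocs.Exists(docPath) Then
        Set wdDoc = wdApp.Documents.Open(FileName:=docPath, AddToRecentFiles:=False)
        openDocs.Add docPath, wdDoc
    End If

    Set GetOpenDoc = openDocs(docPath)
End Function

' Call once, after all processing is finished.
Sub CloseAllDocs(ByVal keepEdits As Boolean)
    Const wdDoNotSaveChanges As Long = 0   ' values of Word's WdSaveOptions constants,
    Const wdSaveChanges As Long = -1       ' declared here because Word is late-bound
    Dim docKey As Variant

    If openDocs Is Nothing Then Exit Sub
    For Each docKey In openDocs.Keys
        openDocs(docKey).Close SaveChanges:=IIf(keepEdits, wdSaveChanges, wdDoNotSaveChanges)
    Next docKey
    openDocs.RemoveAll
End Sub
```

Each document is then fetched with Set doc = GetOpenDoc(wdApp, somePath) as often as needed, and CloseAllDocs is called a single time at the end. Holding a large number of documents open at once will cost memory, so with hundreds of files it may be worth closing them in batches.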

WT Cline said:
"Can/should we take this offline?"

That would be against the board rules.
 

WT Cline

Not intending to break any rules, though I am confused as to why? I asked the original questions, Zack kindly replied, and after several long conversations on this forum, I asked about taking this offline; he is not, and has not, tried to sell me or hoodwink me into any nefarious activity. He is not aggressive or otherwise acting badly in any manner whatsoever, nor has he somehow encouraged me to ask. Quite the opposite is true. He's been amazing. Highly professional. And I would like to share some information with him that I am less interested in sharing with the whole community (the actual files I am working on, which will dramatically further the understanding of the application). So I am just wondering what the board rules are intended to achieve in this case? Thanks.
 

Macropod

I'm not suggesting anything has been amiss re Zack's actions; just pointing out what the rules say (see Message Board Rules) without providing commentary on their merits. One might observe, though, that taking discussions off-forum both deprives others of further insight into the problem and denies them the opportunity to contribute to the discussion.
 
