Posted: 19 May 2017 6:54 EDT Last activity: 16 Oct 2018 12:03 EDT
Reading content of a MS Word file
The requirement is to find text in a Word file and save it in a context variable so it can be used in later automations. I have already added the microsoftWord connector to the toolbox and managed to open an MS Word file, but I don't know how to read the content of the file.
I'd appreciate any suggestions.
You need to understand a little of the Word Object Model to work with a document. The connector exposes a property named WordDocument, which is your starting point: it is a wrapper around the actual document. In the example I show below, we next get the Paragraphs collection from the document. As we loop through each Paragraph, we compare its Range.Text (the text of the paragraph) with your search phrase. This should get you started. There are several other ways to do this, but this approach gives you smaller chunks of text to compare and return.
Edit: change the start value of the loop to 1 and update the Limit value to count + 1 (Word collections are 1-based).
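The loop described above can be sketched outside the tool to make the index bounds concrete. This is only an illustration in Python, not the automation or a script for the connector: a plain list stands in for the document's Paragraphs collection, the function name `find_paragraph` is hypothetical, and a case-insensitive "contains" check is assumed for the comparison. Word returns each paragraph's Range.Text with a trailing paragraph mark (`\r`), so the sketch strips it before returning the match.

```python
def find_paragraph(paragraphs, phrase):
    """Return (1-based index, text) of the first paragraph containing phrase, or None.

    `paragraphs` is a plain list of strings standing in for the text of each
    paragraph in the document (an assumption made for this sketch).
    """
    count = len(paragraphs)
    # Word collections are 1-based: start at 1 and stop before count + 1.
    for i in range(1, count + 1):
        text = paragraphs[i - 1]  # stand-in for reading paragraph i's Range.Text
        if phrase.lower() in text.lower():
            # Drop the trailing paragraph mark before handing the text back.
            return i, text.rstrip("\r")
    return None


# Hypothetical sample data for illustration only.
sample = ["Title\r", "Invoice number: 42\r", "Footer\r"]
match = find_paragraph(sample, "invoice")
```

Starting the loop at 0, or stopping at count instead of count + 1, would either fail on the first lookup or skip the last paragraph, which is what the edit above corrects.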
This will most likely require a script, which lets you perform complex operations on a Word document. Here is an example of a script that reads tables from a document.
First, add a Script component to your solution. You will need to add the Word interop DLL as a reference to the script: right-click the Script component, select "Edit Reference", and choose the Microsoft.Office.Interop.Word reference from the GAC.
Now create a script: either double-click the Script component, or right-click it and select "Edit Script".
Click the green + button at the top of the Script editor to add a new script.
Click the Validate button to validate the script. Now click the green + button again to add a second script.
Enter the following values in the specified fields:
Parameters: Microsoft.Office.Interop.Word.Document worddoc, out string message, out Hashtable header, out System.Data.DataTable instructors, out System.Data.DataTable students
My automation can be done as a script as well, which will execute faster on long documents. You will need to add the reference as Mike demonstrated. Here is the script and the automation.