How do I best "neutralize" a complexly formatted Word document by using macros?

Last Modified: 2012-05-12
The last two translation projects I've received from my customer were Word documents with complex formatting. The complex formatting is probably there because the original documents were first scanned and then run through OCR.

I'm using SDL Trados 2007 Freelance (Translator's Workbench) to translate these Word documents directly in MS Word itself, with Translator's Workbench as an add-on loaded through an added template: TRADOS8.dot.

Usually, the actual translation works fairly well: I open a Translation Unit (TU) in the Word document and translate, row by row. A TU can be a single word, a phrase or a whole sentence. The TU opens in two color-highlighted fields: one for the source-language TU, another for the target-language TU. After I've translated a TU, Translator's Workbench adds markers around the whole TU, which contain information about the formatting.

The problem comes when I've finished the whole translation and try to save it as a monolingual Word document (containing only the translated text) so that I can submit it to my customer. Various problems occur. Yesterday, for example, I received the error message "This file cannot be processed as TTX because it was saved as a bilingual document in Word." (This is illogical, because there was no reason for it to be processed as TTX: I never set that up in Translator's Workbench, nor did I choose any such option.) In short, I can only save the document as bilingual, not as a monolingual Word document. I need the monolingual document because that is the end product: the finished translation. So something goes wrong when the complex formatting has to be carried over from the bilingual document into a monolingual one.

I am convinced the problem is related to the fact that the original document was scanned and OCR'd, which created lots of complicated formatting in the Word document. Translator's Workbench then got confused by this formatting and couldn't save the document as a monolingual file.

My customer found a solution yesterday, though: she simply used a macro in Word and was able to save the Word document as a monolingual file.
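
The macro itself is not shown here. As a rough illustration of what such a cleanup step does, here is a minimal Python sketch. It assumes the bilingual text uses Trados-style segment delimiters of the form `{0>source<}NN{>target<0}` (NN being the match percentage), and it works on plain text only; a real Word macro would also have to deal with the hidden-text formatting of the markers. The function name `to_monolingual` and the sample string are made up for the example.

```python
import re

# Assumed layout of one bilingual segment (an assumption, not taken from the document):
#   {0>source text<}NN{>target text<0}
# where NN is the match percentage stored between the two halves.
SEGMENT = re.compile(r"\{0>(.*?)<\}\d+\{>(.*?)<0\}", re.DOTALL)


def to_monolingual(bilingual_text: str) -> str:
    """Replace every bilingual segment with its target half only.

    Text outside any segment (for example, untranslated passages)
    is left exactly as it is.
    """
    return SEGMENT.sub(lambda match: match.group(2), bilingual_text)


# Hypothetical sample with two translated segments.
sample = "{0>Bonjour le monde.<}100{>Hello world.<0} {0>Page 2<}87{>Page 2<0}"
print(to_monolingual(sample))
```

Because the pattern is non-greedy, each segment is matched separately, so several segments on one line are handled one at a time.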

So I wonder how I should handle this. I will continue to get more scanned Word documents (and also scanned PDF documents) that have been OCR'd and whose formatting is therefore very difficult to come to terms with.

Could I use software like LaTeX (or some other desktop publishing software)? Alternatively, I have a license for ABBYY FineReader Pro 9.0. Should I run the OCR again myself, if that would help? Or how can I use macros in MS Word to "neutralize" complex formatting?
