How do I best "neutralize" a complexly formatted Word document using macros?
The last two translation projects I've received from my customer have been Word documents with complex formatting. The formatting is probably complex because the original documents were first scanned and then OCRed.
I'm using SDL Trados 2007 Freelance (Translator's Workbench) to translate these Word documents directly in MS Word itself, with Translator's Workbench running as an add-on through an added template, TRADOS8.dot.
Usually, the actual translation works fairly well: I open a Translation Unit (TU) in the Word document and translate, row by row. A TU can be a single word, a phrase or a whole sentence. The TU opens in two color-highlighted fields: one for the source-language TU, another for the target-language TU. After I've translated a TU, Translator's Workbench adds markers around the whole TU, which contain information about the formatting.
The problem comes when I'm done with the whole translation and try to save it as a monolingual Word document (containing only the translated text) so that I can submit it to my customer. Various problems occur; yesterday, for example, I received the error message "This file cannot be processed as TTX because it was saved as a bilingual document in Word." (This is illogical, because there was no reason for it to be processed as TTX: I never changed any settings in Translator's Workbench for that, nor did I choose any such option.) In short, I can't save the document as a monolingual Word document, only as a bilingual one. I need the monolingual Word document because that is the end product: the finished translation. So something goes wrong when the complex formatting has to be carried over from the bilingual document into a monolingual one.
I am convinced the problem is related to the fact that the original document was scanned and OCRed, which created lots of complicated formatting in the Word document. Translator's Workbench then got confused by this formatting and couldn't save the document as a monolingual file.
My customer found a solution yesterday, though: she just used a macro in Word and was able to save the Word document as a monolingual file.
So I wonder how I should handle this. I will continue to get scanned Word documents (and also scanned PDF documents) that have been OCRed, and their formatting is therefore very difficult to come to grips with.
Could I use software like LaTeX (or another desktop publishing program)? Alternatively, I have a license for ABBYY FineReader Pro 9.0. Would it help if I OCRed the documents again myself? Or how can I use macros in MS Word to "neutralize" complex formatting?