Repairing corrupted files - Excel, Word, PDF and docs

sidewaysguy
I am working on a time-critical task with some lawyers at one of our clients, exporting certain emails from PST files. Some of the attached files are corrupted. My task is to try, on a best-effort basis, to repair PDF, DOCX, XLSX and JPEG files. If anyone knows of any software they have used in the past to repair such files, please send any info my way. Thank you.

Error examples:

XLSX: "Excel cannot open the file ... because the file format or file extension is not valid. Verify that the file has not been corrupted and that the file extension matches the format of the file."

PDF: "There was an error opening this document. The file is damaged and could not be repaired."
The PDF files are old (around 2015); however, I am able to open other PDF attachments from the same email, so the age of the files should not be the cause.

Image file (JPG): "It appears that we don't support this file format." (using Windows image viewer)
Joseph O'Loughlin, Systems Administrator

Commented:
Exclude files too small for the relevant format.

LibreOffice makes a good stab at opening Office files and often succeeds in recovering corrupted ones.

IrfanView is a good image utility; for example, it is clever enough to recognise wrong file extensions.

Another way of recovering Word documents is to open a new empty document, use Insert > Object > Text from File, and save under a new name.
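
For a large batch, the size and wrong-extension checks above could be scripted before opening anything by hand. This is only a rough sketch in Python: the `triage` function, the `MIN_BYTES` cut-off and the label strings are made up for illustration, and the signature table covers just the formats mentioned in this thread. Run it on copies, not the originals.

```python
from pathlib import Path

# Leading "magic" bytes for the formats mentioned in this thread.
SIGNATURES = [
    (b"%PDF-", "pdf"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"PK\x03\x04", "zip (docx/xlsx container)"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "ole2 (legacy doc/xls)"),
    (b"\x89PNG\r\n\x1a\n", "png"),
]

MIN_BYTES = 100  # hypothetical cut-off; tune it for your data


def triage(path):
    """Return a rough label for one file based on size and leading bytes."""
    p = Path(path)
    if p.stat().st_size < MIN_BYTES:
        return "too small"
    with p.open("rb") as fh:
        head = fh.read(16)
    for magic, label in SIGNATURES:
        if head.startswith(magic):
            return label
    return "unknown"
```

A file named `scan.jpg` that comes back as `pdf` only needs its extension fixed; one that comes back as `unknown` or `too small` is a candidate for a repair tool or for re-exporting from the PST.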
Noah, Hardware Tester and Debugger

Commented:
Hi there, you listed quite a range of different file types. Hence, I would recommend looking at Repair Toolbox products as a good start :)

All of the products are free and you may refer to the link below for a list of their software.
https://www.repairtoolbox.com/download.html
I know this sounds like a lecture, but the mention of lawyers puts me on alert:
"I am working ... with some lawyers at one of our clients"
If any of these emails or documents, or data from them, are going to potentially be used in a criminal or civil court, tribunal, etc, then you could be expected to vouch for the accuracy and integrity of the data and possibly explain or demonstrate how that data was recovered from seemingly corrupt files.  That would really be a job for a forensic data recovery expert.

My first thought about the DOCX and XLSX files is whether they may actually be DOC and XLS files.  An easy way to test is to rename COPIES of the files and change the extensions to ZIP.  The default Office 2007 and onwards file types with the X at the end of the extension are simply ZIP files containing a number of "layout" files in XML and other formats plus resource files like embedded images.  If it doesn't unzip, then the file is the pre-2007 Office type or it is very corrupted.
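
The same ZIP test can be done without renaming anything, which may be handy for a whole folder of files. Here is a minimal sketch using Python's standard `zipfile` module; the function name `check_ooxml` and the returned strings are just illustrative. It relies on the fact that a DOCX or XLSX package has a `[Content_Types].xml` entry at the root of the archive.

```python
import zipfile
import zlib


def check_ooxml(path):
    """Classify a .docx/.xlsx candidate without renaming it to .zip."""
    if not zipfile.is_zipfile(path):
        # Either a pre-2007 DOC/XLS file or a badly damaged one.
        return "not a zip"
    try:
        with zipfile.ZipFile(path) as zf:
            # testzip() returns the first member that fails its CRC check,
            # or None if every member reads back cleanly.
            bad = zf.testzip()
            if bad is not None:
                return "damaged member: " + bad
            if "[Content_Types].xml" not in zf.namelist():
                return "zip, but not an Office package"
    except (zipfile.BadZipFile, zlib.error, EOFError):
        return "damaged zip"
    return "ok"
```

A result of `damaged member: ...` at least tells you which part of the package is broken. If only an embedded image is bad, the main document XML may still be recoverable from the archive.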

Another issue I encounter a lot is where files are copied from one computer to another and they are automatically "flagged" by the insertion of a tiny metadata marker that would make the file properties show the "unblock" option under the "General" tab.  This flag is what shows an "always ask before opening" popup with a checkbox on double-clicking a file.  Unchecking it removes the metadata flag in the same way as clicking the "unblock" button in the file properties and then clicking "apply".  With this flag in place in a file sometimes the default application isn't able to open the file on a double-click.  This flag is only supported on drives formatted as NTFS, so if they are copied via a FAT32 formatted drive (like a USB stick), the metadata is removed.  Go through each of the files and "unblock" them, then see what happens when you try to open them.
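
If there are too many files to unblock by hand, it can be scripted. On NTFS the flag is stored in an alternate data stream named `Zone.Identifier` attached to the file, and deleting that stream is, as far as I know, what the "unblock" button does. A small sketch in Python, assuming Windows and an NTFS volume (the function name `unblock` is made up here):

```python
import os


def unblock(path):
    """Delete the Zone.Identifier stream; return True if one was removed."""
    try:
        # "file.pdf:Zone.Identifier" addresses the alternate data stream,
        # not the file itself, so the document content is left untouched.
        os.remove(str(path) + ":Zone.Identifier")
        return True
    except FileNotFoundError:
        # No flag on this file, nothing to do.
        return False
```

PowerShell's `Unblock-File` cmdlet does the same job if you would rather not use Python.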

Image files like JPGs contain a main image and, in many cases, an embedded thumbnail image.  If the thumbnail image is corrupt or missing, a simple image viewer like Windows Image Viewer can halt and wrongly infer that the image is corrupt.  As mentioned by Joseph above, IrfanView will usually open image files even if they have the wrong extension or have missing thumbnails.
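
To tell a viewer quirk from a genuinely truncated JPG, you can look at the markers at each end of the file: a JPEG stream starts with the bytes FF D8 (start of image) and ends with FF D9 (end of image). A rough sketch in Python, where `jpeg_markers` is just an illustrative name; note that a file with extra data appended after the end marker would be reported as not ending cleanly, so treat a negative result as a hint only.

```python
def jpeg_markers(path):
    """Report whether a file starts with SOI and ends with EOI."""
    with open(path, "rb") as fh:
        data = fh.read()
    starts_ok = data.startswith(b"\xff\xd8")
    # Some writers pad the file after EOI, so ignore trailing NUL bytes.
    ends_ok = data.rstrip(b"\x00").endswith(b"\xff\xd9")
    return starts_ok, ends_ok
```

A file that starts correctly but has no end marker was probably cut short during export, and no viewer will be able to show all of it.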

How are you trying to open the PDFs?  Adobe Acrobat Viewer or using the built-in PDF viewing ability of a browser?

I assume that you have extracted the email messages to *.msg files and have then opened them in Outlook and saved out the attachments as standalone files?  If you are trying to open them directly from the attachment line in Outlook, this may be the issue.
