HEX info for docx files

My brother has an external drive full of docx documents that are corrupted. According to him, his office got hit by a trojan. I've tried 5 of the recovery programs online. None of them have worked. I was chatting with a support tech for one of the programs and he said the document header was corrupted. After a little more research I downloaded HxD and opened the file to view the hex. Can someone tell me what I should be looking for to fix the document header so I can open them? Unfortunately he does not have a backup for these files and they are pretty much every case he's ever worked on. Any help would be greatly appreciated.

The first thing to be aware of is that .docx files are in fact zipped. With an uncorrupted file it should be possible to change the extension to .zip and to unzip it into its component files.

The first two bytes of a zip file should be "PK".
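A quick way to run that check over many files is a short script. The sketch below is only an illustration: the helper names are made up, and the assumption that a Word file contains a part called "word/document.xml" is mine (it is where Word normally stores the body text). It builds a tiny stand-in archive in memory rather than touching real files.

```python
import io
import zipfile


def looks_like_docx(data: bytes) -> bool:
    """True if the bytes start with "PK" and open as a ZIP holding the usual Word parts."""
    if not data.startswith(b"PK"):
        return False
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            names = archive.namelist()
    except zipfile.BadZipFile:
        return False
    # Assumption: Word normally stores the body in word/document.xml.
    return "[Content_Types].xml" in names and "word/document.xml" in names


def make_fake_docx() -> bytes:
    """Build a tiny stand-in archive (hypothetical content, not a real Word file)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("[Content_Types].xml", "<Types/>")
        archive.writestr("word/document.xml", "<document/>")
    return buffer.getvalue()


good = make_fake_docx()
damaged = b"\x29\xf6\x01\xbb" + good[4:]  # same archive with the PK signature overwritten

print(looks_like_docx(good))     # True
print(looks_like_docx(damaged))  # False
```

For files on disk, the same function can be fed the result of `open(path, "rb").read()`.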
snieves (Author) commented:
I tried that already. When I try to open the zip in WinRar, I get an error. Please see attached.
Zip files actually store their "header" information (the central directory) at the end of the file. There is only a small indicator at the beginning: the first two characters are the signature PK. A zip utility then searches backwards from the end of the file for another PK signature, the end-of-central-directory record, which sits within roughly the last 64 KB (it is 22 bytes long plus a comment of up to 65,535 bytes) and points to the directory. The full format is in PKWare's APPNOTE.txt (but as Microsoft has had their hands on these files, you can bet they've added something proprietary!).

A nice simple zip repair tool will be able to recover information from the zip and they don't come much nicer than DiskInternals' Zip Repair: http://www.diskinternals.com/zip-repair/

PKWare APPNOTE.txt: http://www.pkware.com/documents/casestudies/APPNOTE.TXT
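To see what a zip utility does with the tail of the file, here is a minimal sketch that looks for the end-of-central-directory record (signature PK 05 06) and reads the directory offset from it. The function name is hypothetical and the field layout is my reading of APPNOTE.txt; it ignores Zip64 and multi-disk archives, and a stray signature inside an archive comment could fool it.

```python
import io
import struct
import zipfile

EOCD_SIGNATURE = b"PK\x05\x06"
EOCD_FIXED_SIZE = 22        # bytes in the record before the optional comment
MAX_COMMENT_SIZE = 0xFFFF   # the comment length is a 16-bit field


def find_end_of_central_directory(data: bytes):
    """Return (offset, entry_count, directory_offset), or None if no record is found."""
    search_start = max(0, len(data) - EOCD_FIXED_SIZE - MAX_COMMENT_SIZE)
    offset = data.rfind(EOCD_SIGNATURE, search_start)
    if offset == -1 or offset + EOCD_FIXED_SIZE > len(data):
        return None
    # Fields: signature, disk number, directory start disk, entries on this disk,
    # total entries, directory size, directory offset, comment length.
    fields = struct.unpack_from("<4sHHHHIIH", data, offset)
    return offset, fields[4], fields[6]


# Build a one-file archive in memory so the sketch runs without any input file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("hello.txt", "hello")
sample = buffer.getvalue()

offset, entries, directory_offset = find_end_of_central_directory(sample)
print(entries)                                                       # 1
print(sample[directory_offset:directory_offset + 4] == b"PK\x01\x02")  # True
print(find_end_of_central_directory(b"\x29\xf6\x01\xbb" * 64))       # None
```

If this returns None for a damaged file, the directory at the end is gone too, and a repair tool has to rebuild it from whatever local headers survive.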

btan (Exec Consultant) commented:
Specific to DOCX: see this for the files that are required inside the ZIP.
The documentation on OOXML may provide a guide to analysing a DOCX file.
It also helps to look at other, known-good DOCX files as a reference.

You can also try OffVis, S2 Services, Object Fix Zip or DiskInternals ZIP Repair. See this:
Article - http://www.techradar.com/news/software/applications/9-ways-to-recover-a-corrupt-microsoft-office-file-712429/2
S2 Service - http://legacy.s2services.com/corrupt.htm
snieves (Author) commented:
I tried wordfix, officerecovery.com, and cimaware's office fix. None of them worked. I have tried changing the docx file extension to .zip but it will not open either. I have attached an example corrupted file. I read through the PKware page. I am using HxD and can see the consistencies in data for files that work but I don't know where I would insert the updated HEX in the file that does not work. Am I screwed?
Your example file is not one that I recognise as ever having been a Zip file, and neither do any of the tools I have (Windows and Linux): there is no PK signature anywhere in the file and none of the tell-tale blank (zero-byte) areas that MS docs have in their headers. For MS Office docs, the first file is always "[Content_Types].xml", which you can see in the examples.

I suspect the file has been overwritten by garbage and your content is lost.
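The same triage can be scripted for a whole drive. The sketch below counts PK signatures and zero bytes and estimates the Shannon entropy of the data; the function names are hypothetical. Treat the entropy figure with care: deflate-compressed data inside a healthy docx is also close to 8 bits per byte, so a high value only means "compressed, encrypted or random", and it is the missing signatures and missing zero padding that separate a damaged file from a good one.

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Bits of entropy per byte: 0.0 for constant data, 8.0 for perfectly uniform bytes."""
    if not data:
        return 0.0
    total = len(data)
    return sum(-(n / total) * math.log2(n / total) for n in Counter(data).values())


def triage(data: bytes) -> dict:
    """Collect the clues described above for one file's bytes."""
    return {
        "starts_with_pk": data.startswith(b"PK"),
        "pk_signatures": data.count(b"PK"),
        "zero_bytes": data.count(b"\x00"),
        "entropy_bits_per_byte": round(shannon_entropy(data), 2),
    }


# Stand-ins for real files: a zero-padded header region and random-looking bytes.
zero_padded = b"PK\x03\x04" + bytes(508)
random_like = os.urandom(4096)

print(triage(zero_padded))
print(triage(random_like))
```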

Example Doc1.docx (* line means repeat of previous):
00000000  50 4b 03 04 14 00 06 00  08 00 00 00 21 00 dd fc  |PK..........!...|
00000010  95 37 66 01 00 00 20 05  00 00 13 00 08 02 5b 43  |.7f... .......[C|
00000020  6f 6e 74 65 6e 74 5f 54  79 70 65 73 5d 2e 78 6d  |ontent_Types].xm|
00000030  6c 20 a2 04 02 28 a0 00  02 00 00 00 00 00 00 00  |l ...(..........|
00000040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000230  00 00 00 00 00 00 00 00  00 b4 54 cb 6e c2 30 10  |..........T.n.0.|
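The first 49 bytes of the Doc1.docx dump above can be decoded by hand or with a few lines of Python. This is a sketch that assumes the local file header layout from APPNOTE.txt (a 30-byte fixed part followed by the file name and the extra field); the dictionary keys are my own labels.

```python
import struct

# Offsets 0x00 to 0x30 of the Doc1.docx dump above.
HEADER = bytes.fromhex(
    "50 4b 03 04 14 00 06 00 08 00 00 00 21 00 dd fc "
    "95 37 66 01 00 00 20 05 00 00 13 00 08 02 5b 43 "
    "6f 6e 74 65 6e 74 5f 54 79 70 65 73 5d 2e 78 6d "
    "6c"
)

FIXED_SIZE = 30  # size of the local file header before the file name


def parse_local_header(data: bytes) -> dict:
    """Decode one ZIP local file header (little-endian fields)."""
    (signature, version, flags, method, mod_time, mod_date,
     crc32, compressed_size, uncompressed_size,
     name_length, extra_length) = struct.unpack_from("<4sHHHHHIIIHH", data, 0)
    if signature != b"PK\x03\x04":
        raise ValueError("not a local file header")
    name = data[FIXED_SIZE:FIXED_SIZE + name_length].decode("ascii")
    return {
        "method": method,                      # 8 means deflate
        "crc32": crc32,
        "compressed_size": compressed_size,
        "uncompressed_size": uncompressed_size,
        "name": name,
        "extra_length": extra_length,
        "data_offset": FIXED_SIZE + name_length + extra_length,
    }


header = parse_local_header(HEADER)
print(header["name"])               # [Content_Types].xml
print(header["method"])             # 8
print(header["compressed_size"])    # 358
print(header["uncompressed_size"])  # 1312
print(hex(header["data_offset"]))   # 0x239
```

The 520-byte extra field is the zero-filled area in the dump, and the computed data offset 0x239 is exactly where the non-zero bytes resume on the 00000230 line.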


Example Book1.xlsx:
00000000  50 4b 03 04 14 00 06 00  08 00 00 00 21 00 58 56  |PK..........!.XV|
00000010  c6 8f 60 01 00 00 18 05  00 00 13 00 da 01 5b 43  |..`...........[C|
00000020  6f 6e 74 65 6e 74 5f 54  79 70 65 73 5d 2e 78 6d  |ontent_Types].xm|
00000030  6c 20 a2 d6 01 28 a0 00  02 00 00 00 00 00 00 00  |l ...(..........|
00000040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000200  00 00 00 00 00 00 00 00  00 00 00 cc 94 4d 4e c3  |.............MN.|


Your sample:
00000000  29 f6 01 bb 08 1c 31 d6  6f 28 ee af 68 4d 4b 5b  |).....1.o(..hMK[|
00000010  54 4d ed dc 1b b6 96 90  99 f4 d8 e8 44 8f 3b f0  |TM..........D.;.|
00000020  ff 7a c4 c5 f0 1a df 87  fa 96 ae 5e 35 41 fe d4  |.z.........^5A..|
00000030  5b ec ab ec 32 a3 39 84  54 40 73 68 ba 03 ec 6b  |[...2.9.T@sh...k|
00000040  c5 60 a5 5d b6 7e 9c ca  fa 3e 63 6a 83 29 26 32  |.`.].~...>cj.)&2|
00000050  bb 8b d5 2f a9 eb 61 47  4b be 9a 02 9c f5 67 03  |.../..aGK.....g.|
00000060  bb b6 4f 6f 46 3e a3 01  00 5c 14 11 38 5d 46 08  |..OoF>...\..8]F.|
00000070  a8 0f 77 a4 10 a3 0c 87  49 43 0d 52 74 0f c4 0f  |..w.....IC.Rt...|
00000080  e4 8d ac a4 ab 43 6c 67  4a cf 57 83 d4 e7 fa 26  |.....ClgJ.W....&|
00000090  f4 48 99 29 80 9d 4d dc  e0 6e 54 9b d3 b1 ea d5  |.H.)..M..nT.....|
000000a0  09 fa 6c e3 f9 77 80 dc  85 1e af 54 68 cf a4 ee  |..l..w.....Th...|
000000b0  08 c3 b3 58 22 29 1d 98  29 15 5a 2c 70 a0 79 83  |...X")..).Z,p.y.|
000000c0  67 dc 1f cb 59 e7 86 50  b5 05 0b 04 60 72 0b 1d  |g...Y..P....`r..|
000000d0  45 d9 59 ed 94 d1 03 f6  d1 a7 40 84 50 46 b7 a0  |E.Y.......@.PF..|
000000e0  76 0d bd 21 fd 85 1e 55  32 eb a3 2c 62 82 ad b0  |v..!...U2..,b...|
000000f0  d1 8e 0f 29 a2 3c e2 bb  42 c9 8a aa d7 e8 86 d7  |...).<..B.......|
00000100  4b e4 d8 37 58 b2 6c 73  92 a6 8f 55 2c 70 28 e0  |K..7X.ls...U,p(.|
00000110  75 49 9f 78 43 a3 75 ce  20 5a 3f 6a c3 95 b2 02  |uI.xC.u. Z?j....|
00000120  69 ab 71 3a bd 83 9a 4e  1f 27 af 2e 29 6d 2d 77  |i.q:...N.'..)m-w|
00000130  2f 90 ff 0e 7d d4 0d e7  4b ab d3 e2 b1 97 4a dd  |/...}...K.....J.|
00000140  96 f5 4f f6 4e 65 5d c8  62 5f 37 58 80 dc 08 6e  |..O.Ne].b_7X...n|
00000150  a9 70 db a1 0e 7e 45 52  e8 d8 f8 10 f1 da d7 c7  |.p...~ER........|
00000160  a2 7e 2e 9b b0 95 a4 1b  08 ab a5 80 59 18 13 b7  |.~..........Y...|
00000170  31 4d 71 2e 29 5f fd 90  1d 1b 90 eb 8c 74 e3 c7  |1Mq.)_.......t..|
00000180  fb e0 88 05 d3 43 9b 90  4b ba 6b 01 bc 1e ea 20  |.....C..K.k.... |
00000190  3f de 4b 85 c2 7d cb 89  60 e1 48 86 e8 77 77 67  |?.K..}..`.H..wwg|
000001a0  ae b6 82 e2 e4 a6 30 bd  e9 bf 43 aa 59 52 5b 71  |......0...C.YR[q|
000001b0  e2 34 f9 e1 bb 8b 4b e3  01 0b 57 b3 62 c4 f5 db  |.4....K...W.b...|
000001c0  20 37 6f cf 51 c7 37 05  09 83 9a 1a b7 36 f5 8f  | 7o.Q.7......6..|
000001d0  17 fe 39 27 5f c4 e8 36  db 63 14 0b 6d fb a3 5a  |..9'_..6.c..m..Z|
000001e0  0f f0 eb 05 8d 25 32 ab  0d 3b 5e f1 28 e7 fa 65  |.....%2..;^.(..e|
000001f0  5a 2e 57 14 2c 40 02 5b  27 78 b7 38 a0 d5 25 14  |Z.W.,@.['x.8..%.|
00000200  43 56 e7 90 32 61 53 89  39 c8 7b e0 c5 79 02 51  |CV..2aS.9.{..y.Q|


Some sample random zip files (non-MS Office) for comparison:
00000000  50 4b 03 04 14 00 00 00  08 00 37 83 b0 42 c2 bf  |PK........7..B..|
00000010  ec 66 53 08 00 00 3d 16  00 00 0a 00 00 00 30 78  |.fS...=.......0x|
00000020  30 34 30 39 2e 69 6e 69  d5 58 59 6f 13 31 10 7e  |0409.ini.XYo.1.~|


This one has directory entries at the beginning so you can see the PK signature repetition (PK signatures are always four bytes: the first pair is always PK and the second pair indicates the type of record):
00000000  50 4b 03 04 14 00 00 00  00 00 79 72 b5 42 00 00  |PK........yr.B..|
00000010  00 00 00 00 00 00 00 00  00 00 09 00 00 00 43 53  |..............CS|
00000020  43 6f 6d 6d 6f 6e 2f 50  4b 03 04 14 00 00 00 00  |Common/PK.......|
00000030  00 92 72 b5 42 00 00 00  00 00 00 00 00 00 00 00  |..r.B...........|
00000040  00 11 00 00 00 43 53 43  6f 6d 6d 6f 6e 2f 43 6c  |.....CSCommon/Cl|
00000050  61 73 73 65 73 2f 50 4b  03 04 0a 00 00 00 00 00  |asses/PK........|
00000060  79 72 b5 42 02 a5 1c 5b  32 ae 00 00 32 ae 00 00  |yr.B...[2...2...|
00000070  24 00 00 00 43 53 43 6f  6d 6d 6f 6e 2f 43 6c 61  |$...CSCommon/Cla|
00000080  73 73 65 73 2f 63 73 63  6f 6d 6d 6f 6e 63 6c 61  |sses/cscommoncla|
00000090  73 73 65 73 2e 46 58 50  fe f2 ee 36 67 3e 21 05  |sses.FXP...6g>!.|


Just another random standard zip example:
00000000  50 4b 03 04 14 00 02 00  08 00 a0 92 49 32 c4 f6  |PK..........I2..|
00000010  d0 bc 18 1c 00 00 73 5d  00 00 0b 00 00 00 4c 69  |......s]......Li|
00000020  63 65 6e 73 65 2e 74 78  74 d5 5c 4b 6f dc c6 96  |cense.txt.\Ko...|
00000030  de 07 c8 7f 28 08 18 44  0d b4 db b1 9c 38 b9 ce  |....(..D.....8..|


snieves (Author) commented:
So there's no way to rebuild the PKs or blank areas? Is there any standard to the structure? Can I paste my sample into a different one? Or should I just call it a day and move on? Thank you very much for your help with this.
btan (Exec Consultant) commented:
I thought this carving tool might be of interest.

Technically, the OOXML files are compressed with the deflate method. When they are assembled into a zip file (but with the extension docx, xlsx or pptx), the deflated XML files are organized according to the zip format. That means there will usually be four kinds of record in the final zip file:

1. ZIP local file header. Start signature = 0x504b0304
2. ZIP data descriptor. Start signature = 0x504b0708
3. ZIP central directory file header. Start signature = 0x504b0102
4. ZIP end of central directory record. Start signature = 0x504b0506
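As a rough illustration, the four signatures listed above can be searched for directly in a file's bytes. The sketch below is my own, with made-up names; it builds a small two-file archive in memory to scan. On a real damaged file, treat isolated hits with suspicion, because a four-byte pattern can also turn up by chance inside compressed data.

```python
import io
import re
import zipfile
from collections import Counter

RECORD_TYPES = {
    b"PK\x03\x04": "local file header",
    b"PK\x07\x08": "data descriptor",
    b"PK\x01\x02": "central directory file header",
    b"PK\x05\x06": "end of central directory record",
}
SIGNATURE_PATTERN = re.compile(b"|".join(re.escape(sig) for sig in RECORD_TYPES))


def scan_records(data: bytes):
    """Return (offset, record type) for every ZIP signature found in the data."""
    return [(m.start(), RECORD_TYPES[m.group()]) for m in SIGNATURE_PATTERN.finditer(data)]


# Build a two-file archive with fixed timestamps so the sketch is repeatable.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    for name, text in (("a.txt", "alpha"), ("b.txt", "beta")):
        info = zipfile.ZipInfo(name, date_time=(2013, 10, 13, 0, 0, 0))
        archive.writestr(info, text)

records = scan_records(buffer.getvalue())
for offset, kind in records:
    print(f"{offset:08x}  {kind}")
print(Counter(kind for _, kind in records))
```

A healthy archive shows local file headers first, then one central directory entry per file, then a single end record. A file with none of these anywhere, like the sample in this thread, has nothing left to rebuild from.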

At a minimum, the structure of the docx needs to be rebuilt before Word can open it, especially "Document.xml.rels", and that means manual reconstruction.
@snieves: You have the standard for the Zip structure - PKWare have always owned and retained the copyright on Zip files and how they are formed: their APPNote.txt is the standard.  As breadtan mentions, MS have some documents indicating their perspective on the standard and you'll note the quoted signatures match those of the hex dumps I made.

What was the trojan/virus that your brother was subjected to: perhaps the ultimate answer is in what that does.  As mentioned, however, what you have in that sample looks like garbage and is therefore unrecoverable.
Errata: The Zip format is in the public domain (not copyright).  PKWare's APPNote.txt is still the de facto standard. :)

@snieves: if the virus that you had was one of the current trend of ransomware you may still be able to decrypt them.

Ransomware: http://en.wikipedia.org/wiki/Ransomware_%28malware%29
btan (Exec Consultant) commented:
Brute forcing the recovery may not be worth the effort if the value of the data is much lower than the cost of recovering it. I agree with Barthax. It is also often advisable to rebuild the system rather than rely on AV software saying it has cleaned the system up; that depends on your risk appetite.
snieves (Author) commented:
Unfortunately I have to try. These files are case files from the past few years. My brother is a cop and this was his "backup" drive. I will find out what virus they got hit with. Apparently they called in a contractor to fix the Chief's files but could not fix them for everyone else because of budget issues. I will attempt to rebuild the files. Thank you all for your help with this!!!
snieves (Author) commented:
Ok, apparently he got hit with CryptoLocker and ever since it was cleaned off the computer, the files have been unreadable. Does this change things? Is it possible to decrypt them?
btan (Exec Consultant) commented:
CryptoLocker uses asymmetric encryption with public and private keys (RSA). It is not easy to break, and attempting it is neither advisable nor worth the effort. In practice, this means most victims end up paying for the key if they want all their files back immediately.

Supposedly "Panda Ransomware Decrypt" should help, though from what you shared it has not in your case. You may want to retry it
@ http://malwarefixes.com/remove-cyber-locker-ransomware/
snieves (Author) commented:
I tried the panda decrypt without any luck. Like I mentioned before, these are case files for a cop so I have to keep trying. Any other ideas? I also tried using Emsisoft's DECRYPTER and that didn't work. Do you think these files are still encrypted or were they decrypted and the doc header information removed?
btan (Exec Consultant) commented:
If the files are RSA encrypted, most likely each file's symmetric key is itself encrypted with the RSA public key, so it can only be recovered with the RSA private key, although I understand the key may also be hardcoded in some cases. The private key is saved on a remote server, and it is required to recover the keys used to encrypt the individual files. In other words, each file key is different and there is no practical way to break this.

I am wondering whether the key is embedded within the encrypted doc. That can be tough to reverse engineer when there is no real hint to go on. I am not sure whether the tools below can give any hints, but we are already down to nothing to lose...

File Investigator Tools - http://www.fid3.com/products/fi-tools
-Detects Encrypted Files, including TrueCrypt
-Displays all NTFS Alternate Data Streams & Security File Usernames
-Displays metadata extracted from many of the supported file types*

From the forum: it states that decryption is unlikely and that restoring from backup is the only option.

Are there any tools that can be used to decrypt the encrypted files?

Unfortunately at this time there is no way to retrieve the key used to encrypt your files. Brute forcing the encryption key is realistically not possible due to the length of time required to break the key. Any decryption tools that have been released by various companies will not work with this infection. The only method you have of restoring your files is from a backup, or if you have System Restore, through the Shadow Volume copies that are created every time a system restore is performed. More information about how to restore your files via Shadow Volume Copies can be found in the next section.

If you do not have System Restore enabled on your computer or reliable backups, then you will need to pay the ransom in order to get your files back. Please note that there have been cases when people have paid the ransom and the decryption did not work for whatever reason. Furthermore, if you do not pay the ransom within the allotted time, the Cryptolocker decryption tool will be removed from your system and make it much more difficult, if not impossible, to restore your files.

How to generate a list of files that have been encrypted

If you wish to generate a list of files that have been encrypted, you can download this tool:


Unfortunately, once the encryption of the data is complete, decryption is not feasible. To obtain the file-specific AES key to decrypt a file, you need the private RSA key corresponding to the RSA public key generated for the victim's system by the command and control server. However, this key never leaves the command and control server, putting it out of reach of everyone except the attacker.
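To make the key-wrapping idea concrete, here is a toy sketch using textbook RSA with tiny primes. Everything in it is an illustrative assumption: real CryptoLocker keys are 2048 bits with proper padding, the numbers below are the well-known classroom example, and the "file key" is just a small integer standing in for an AES key. It shows that the public key is enough to wrap the file key, that only the private exponent unwraps it, and that the private exponent is recoverable here only because the modulus is small enough to factor instantly.

```python
import math

# Toy textbook RSA: tiny primes, no padding. For illustration only.
p, q = 61, 53
n = p * q                           # public modulus (3233)
e = 17                              # public exponent
d = pow(e, -1, (p - 1) * (q - 1))   # private exponent (needs Python 3.8+)


def wrap_key(file_key: int) -> int:
    """What the malware can do with only the public key (n, e)."""
    return pow(file_key, e, n)


def unwrap_key(wrapped: int, private_exponent: int) -> int:
    """What only the holder of the private exponent can do."""
    return pow(wrapped, private_exponent, n)


def recover_private_exponent(modulus: int, public_exponent: int) -> int:
    """Factor the modulus by trial division; feasible only because it is tiny."""
    for candidate in range(2, math.isqrt(modulus) + 1):
        if modulus % candidate == 0:
            other = modulus // candidate
            return pow(public_exponent, -1, (candidate - 1) * (other - 1))
    raise ValueError("modulus is prime")


file_key = 65                       # stand-in for a per-file AES key
wrapped = wrap_key(file_key)
print(unwrap_key(wrapped, d))                                # 65
print(unwrap_key(wrapped, recover_private_exponent(n, e)))   # 65
```

With a 2048-bit modulus the factoring step is the part nobody can carry out, which is why the files stay locked without the attacker's private key.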

Unique 2048 bit RSA public key created by the version of the malware, a numeric id, the systems network name, a group id as well as the language of the system.
Different 256 bit AES key created for each targeted file.

[2048 bit RSA public key]{256 bit AES key}ENCRYPTED FILE{256 bit AES key}[2048 bit RSA public key]
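For a sense of scale on brute forcing a 256-bit AES key, a back-of-the-envelope calculation helps. The guessing rate below is an assumption picked to be generous to the attacker, not a measured figure.

```python
KEY_BITS = 256
GUESSES_PER_SECOND = 10**12          # assumed attacker speed, deliberately generous
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def expected_years(key_bits: int, guesses_per_second: int) -> float:
    """Average time to find a key by exhaustive search (half the keyspace)."""
    expected_guesses = 2 ** (key_bits - 1)
    return expected_guesses / (guesses_per_second * SECONDS_PER_YEAR)


years = expected_years(KEY_BITS, GUESSES_PER_SECOND)
print(f"{years:.1e} years")          # roughly 1.8e+57 years
```

Since every file has its own AES key, that search would have to be repeated for each file.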

ECRYPT II Yearly Report on Algorithms and Keysizes (2010-2011) (30th June 2011) at http://www.ecrypt.eu.org/documents/D.SPA.17.pdf
To crack DES keys: Chapter 5, Determining Symmetric Key Size. Page 15 or 27/122.
To crack RSA keys: Chapter 6, Determining Equivalent Asymmetric Key Size. Page 25 or 37/122.

The Mathematics of the RSA Public-Key Cryptosystem, Burt Kaliski, RSA Laboratories. http://www.mathaware.org/mam/06/Kaliski.pdf
ABOUT THE AUTHOR: Dr Burt Kaliski is a computer scientist whose involvement with the security industry has been through the company that Ronald Rivest, Adi Shamir and Leonard Adleman started in 1982 to commercialize the RSA encryption algorithm that they had invented. At the time, Kaliski had just started his undergraduate degree at MIT. Professor Rivest was the advisor of his bachelor's, master's and doctoral theses, all of which were about cryptography. When Kaliski finished his graduate work, Rivest asked him to consider joining the company, then called RSA Data Security. In 1989, he became employed as the company's first full-time scientist. He is currently chief scientist at RSA Laboratories and vice president of research for RSA Security.


Deliberately flawed? RSA Security tells customers to drop NSA-related encryption algorithm. http://rt.com/usa/nsa-weak-cryptography-rsa-110/

An encryption algorithm with a suspected NSA-designed backdoor has been declared insecure by the developer after years of extensive use by customers worldwide, including the US federal agencies and government entities.
Major US computer security company RSA Security, a division of EMC, has privately warned thousands of its customers on Thursday to immediately discontinue using all versions of the company's BSAFE toolkit and Data Protection Manager (DPM), both using the Dual_EC_DRBG (Dual Elliptic Curve Deterministic Random Bit Generator) algorithm to protect sensitive data.

btan (Exec Consultant) commented:
I did see some people say the below, but I am not sure of the exact details of what was done.


Nikolay Shmakov on October 13, 2013 at 10:57 pm said:
Now to the good news. I have found the way to decrypt files after Cryptolocker has done its modifications. :) It renamed the files but there was no encryption set. So you should be able to restore them by renaming the extension of the tmp file.
btan (Exec Consultant) commented:
To be more specific, the keys encrypting the files are the public keys. It is not going to be practical to derive the corresponding private key...
snieves (Author) commented:
Well thought out and thorough.