snieves


HEX info for docx files

My brother has an external drive full of docx documents that are corrupted. According to him, his office got hit by a trojan. I've tried 5 of the recovery programs online. None of them have worked. I was chatting with a support tech for one of the programs and he said the document header was corrupted. After a little more research I downloaded HxD and opened the file to view the hex. Can someone tell me what I should be looking for to fix the document header so I can open them? Unfortunately he does not have a backup for these files and they are pretty much every case he's ever worked on. Any help would be greatly appreciated.
GrahamSkan

The first thing to be aware of is that .docx files are in fact zipped. With an uncorrupted file it should be possible to change the extension to .zip and to unzip it into its component files.

The first two bytes of a zip file should be "PK".
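If you would rather script this check than rename files by hand, here is a minimal Python sketch using only the standard library. The function names are made up for illustration; it simply tests for the "PK" prefix and then asks Python's zipfile module whether it can read the archive.

```python
import zipfile


def looks_like_zip(path):
    """Return True if the file starts with the two bytes 'PK'."""
    with open(path, "rb") as f:
        return f.read(2) == b"PK"


def list_docx_parts(path):
    """Return the member names if the file is a readable zip, else None."""
    if not zipfile.is_zipfile(path):
        return None
    with zipfile.ZipFile(path) as zf:
        return zf.namelist()
```

On a healthy .docx, list_docx_parts should return names such as "[Content_Types].xml"; on a file that is not a zip at all it returns None.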
snieves

ASKER

I tried that already. When I try to open the zip in WinRar, I get an error. Please see attached.
Capture.GIF
As it happens, zip files store their main "header" information at the end of the file. There is only a tiny indicator at the beginning: the first two characters are the signature PK. A zip utility then searches backwards from the end of the file (roughly the last 64 KB, because the trailing comment can be up to 65535 bytes long) for the PK signature that marks the end of central directory record, which in turn points to the directory itself. The full format is in PKWare's APPNOTE.txt (but as Microsoft has had their hands on these files, you can bet they've added something proprietary!).
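To make the "look at the end of the file" idea concrete, here is a rough Python sketch that searches the tail of a file for the end of central directory record (signature PK 05 06) and decodes its fixed 22-byte part as laid out in APPNOTE.txt. The function name and the returned dictionary keys are my own choices, and the sketch ignores Zip64 archives.

```python
import struct

EOCD_SIG = b"PK\x05\x06"
EOCD_MIN = 22          # size of the fixed part of the record
MAX_COMMENT = 0xFFFF   # the comment length field is 16 bits


def find_eocd(path):
    """Locate and decode the end of central directory record, or return None."""
    with open(path, "rb") as f:
        f.seek(0, 2)                      # jump to the end to learn the size
        size = f.tell()
        tail_len = min(size, EOCD_MIN + MAX_COMMENT)
        f.seek(size - tail_len)
        tail = f.read(tail_len)

    pos = tail.rfind(EOCD_SIG)            # last occurrence wins
    if pos < 0 or pos + EOCD_MIN > len(tail):
        return None

    # signature, disk no., disk with directory, entries on this disk,
    # total entries, directory size, directory offset, comment length
    (_sig, _disk, _cd_disk, _n_disk, n_total,
     cd_size, cd_offset, _comment_len) = struct.unpack(
        "<4sHHHHIIH", tail[pos:pos + EOCD_MIN])

    return {
        "eocd_offset": size - tail_len + pos,
        "entries": n_total,
        "cd_size": cd_size,
        "cd_offset": cd_offset,
    }
```

If this returns None for a file, there is no directory record for a repair tool to work from, and recovery would have to rely on the local file headers scattered through the file instead.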

A nice, simple zip repair tool should be able to recover information from the zip, and they don't come much nicer than DiskInternals' ZIP Repair: http://www.diskinternals.com/zip-repair/

PKWare APPNOTE.txt: http://www.pkware.com/documents/casestudies/APPNOTE.TXT
Specific to DOCX: see the link below for the files a DOCX ZIP needs to contain.
The OOXML documentation may also serve as a guide to analysing a DOCX file.
It is also worth comparing against other, intact DOCX files as a reference.
http://www.forensicswiki.org/wiki/DOCX

You can also try OffVis, S2 Services, Object Fix Zip or DiskInternals ZIP Repair. See the following:
Article - http://www.techradar.com/news/software/applications/9-ways-to-recover-a-corrupt-microsoft-office-file-712429/2
S2 Service - http://legacy.s2services.com/corrupt.htm
snieves

ASKER

I tried wordfix, officerecovery.com, and cimaware's office fix. None of them worked. I have tried changing the docx file extension to .zip but it will not open either. I have attached an example corrupted file. I read through the PKware page. I am using HxD and can see the consistencies in data for files that work but I don't know where I would insert the updated HEX in the file that does not work. Am I screwed?
12-00922-found-property.docx
Your example file is not one that I recognise as ever having been a Zip file, and neither does the array of tools I have (Windows and Linux): there is no PK signature anywhere in the file, and there are none of the tell-tale blank (zero-byte) areas in the headers that MS docs have. For MS Office docs, the first file is always "[Content_Types].xml", which you can see in the examples.

I suspect the file has been overwritten by garbage and your content is lost.
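The same checks can be run over a whole drive with a short script. The Python sketch below is only an illustration (the function names, dictionary keys and the 1 MB sample size are arbitrary choices of mine): it reports whether a file starts with PK, whether a local file header signature appears anywhere in the sample, whether there is a run of zero bytes, and the byte entropy of the sample.

```python
import math
from collections import Counter


def shannon_entropy(data):
    """Entropy of a bytes object in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    n = len(data)
    counts = Counter(data)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def triage(path, sample_size=1 << 20):
    """Collect quick indicators of whether a file was ever a zip/docx."""
    with open(path, "rb") as f:
        data = f.read(sample_size)
    return {
        "starts_with_pk": data[:2] == b"PK",
        "pk_local_header_found": b"PK\x03\x04" in data,
        "has_zero_run": b"\x00" * 16 in data,
        "entropy_bits_per_byte": shannon_entropy(data),
    }
```

Entropy on its own proves little, because deflate-compressed data also sits close to 8 bits per byte. The stronger signal is the combination: no PK signature, no zero-byte runs and near-maximal entropy from the very first byte is what overwritten or encrypted content tends to look like.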

Example Doc1.docx (* line means repeat of previous):
00000000  50 4b 03 04 14 00 06 00  08 00 00 00 21 00 dd fc  |PK..........!...|
00000010  95 37 66 01 00 00 20 05  00 00 13 00 08 02 5b 43  |.7f... .......[C|
00000020  6f 6e 74 65 6e 74 5f 54  79 70 65 73 5d 2e 78 6d  |ontent_Types].xm|
00000030  6c 20 a2 04 02 28 a0 00  02 00 00 00 00 00 00 00  |l ...(..........|
00000040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000230  00 00 00 00 00 00 00 00  00 b4 54 cb 6e c2 30 10  |..........T.n.0.|



Example Book1.xlsx:
00000000  50 4b 03 04 14 00 06 00  08 00 00 00 21 00 58 56  |PK..........!.XV|
00000010  c6 8f 60 01 00 00 18 05  00 00 13 00 da 01 5b 43  |..`...........[C|
00000020  6f 6e 74 65 6e 74 5f 54  79 70 65 73 5d 2e 78 6d  |ontent_Types].xm|
00000030  6c 20 a2 d6 01 28 a0 00  02 00 00 00 00 00 00 00  |l ...(..........|
00000040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000200  00 00 00 00 00 00 00 00  00 00 00 cc 94 4d 4e c3  |.............MN.|



Your sample:
00000000  29 f6 01 bb 08 1c 31 d6  6f 28 ee af 68 4d 4b 5b  |).....1.o(..hMK[|
00000010  54 4d ed dc 1b b6 96 90  99 f4 d8 e8 44 8f 3b f0  |TM..........D.;.|
00000020  ff 7a c4 c5 f0 1a df 87  fa 96 ae 5e 35 41 fe d4  |.z.........^5A..|
00000030  5b ec ab ec 32 a3 39 84  54 40 73 68 ba 03 ec 6b  |[...2.9.T@sh...k|
00000040  c5 60 a5 5d b6 7e 9c ca  fa 3e 63 6a 83 29 26 32  |.`.].~...>cj.)&2|
00000050  bb 8b d5 2f a9 eb 61 47  4b be 9a 02 9c f5 67 03  |.../..aGK.....g.|
00000060  bb b6 4f 6f 46 3e a3 01  00 5c 14 11 38 5d 46 08  |..OoF>...\..8]F.|
00000070  a8 0f 77 a4 10 a3 0c 87  49 43 0d 52 74 0f c4 0f  |..w.....IC.Rt...|
00000080  e4 8d ac a4 ab 43 6c 67  4a cf 57 83 d4 e7 fa 26  |.....ClgJ.W....&|
00000090  f4 48 99 29 80 9d 4d dc  e0 6e 54 9b d3 b1 ea d5  |.H.)..M..nT.....|
000000a0  09 fa 6c e3 f9 77 80 dc  85 1e af 54 68 cf a4 ee  |..l..w.....Th...|
000000b0  08 c3 b3 58 22 29 1d 98  29 15 5a 2c 70 a0 79 83  |...X")..).Z,p.y.|
000000c0  67 dc 1f cb 59 e7 86 50  b5 05 0b 04 60 72 0b 1d  |g...Y..P....`r..|
000000d0  45 d9 59 ed 94 d1 03 f6  d1 a7 40 84 50 46 b7 a0  |E.Y.......@.PF..|
000000e0  76 0d bd 21 fd 85 1e 55  32 eb a3 2c 62 82 ad b0  |v..!...U2..,b...|
000000f0  d1 8e 0f 29 a2 3c e2 bb  42 c9 8a aa d7 e8 86 d7  |...).<..B.......|
00000100  4b e4 d8 37 58 b2 6c 73  92 a6 8f 55 2c 70 28 e0  |K..7X.ls...U,p(.|
00000110  75 49 9f 78 43 a3 75 ce  20 5a 3f 6a c3 95 b2 02  |uI.xC.u. Z?j....|
00000120  69 ab 71 3a bd 83 9a 4e  1f 27 af 2e 29 6d 2d 77  |i.q:...N.'..)m-w|
00000130  2f 90 ff 0e 7d d4 0d e7  4b ab d3 e2 b1 97 4a dd  |/...}...K.....J.|
00000140  96 f5 4f f6 4e 65 5d c8  62 5f 37 58 80 dc 08 6e  |..O.Ne].b_7X...n|
00000150  a9 70 db a1 0e 7e 45 52  e8 d8 f8 10 f1 da d7 c7  |.p...~ER........|
00000160  a2 7e 2e 9b b0 95 a4 1b  08 ab a5 80 59 18 13 b7  |.~..........Y...|
00000170  31 4d 71 2e 29 5f fd 90  1d 1b 90 eb 8c 74 e3 c7  |1Mq.)_.......t..|
00000180  fb e0 88 05 d3 43 9b 90  4b ba 6b 01 bc 1e ea 20  |.....C..K.k.... |
00000190  3f de 4b 85 c2 7d cb 89  60 e1 48 86 e8 77 77 67  |?.K..}..`.H..wwg|
000001a0  ae b6 82 e2 e4 a6 30 bd  e9 bf 43 aa 59 52 5b 71  |......0...C.YR[q|
000001b0  e2 34 f9 e1 bb 8b 4b e3  01 0b 57 b3 62 c4 f5 db  |.4....K...W.b...|
000001c0  20 37 6f cf 51 c7 37 05  09 83 9a 1a b7 36 f5 8f  | 7o.Q.7......6..|
000001d0  17 fe 39 27 5f c4 e8 36  db 63 14 0b 6d fb a3 5a  |..9'_..6.c..m..Z|
000001e0  0f f0 eb 05 8d 25 32 ab  0d 3b 5e f1 28 e7 fa 65  |.....%2..;^.(..e|
000001f0  5a 2e 57 14 2c 40 02 5b  27 78 b7 38 a0 d5 25 14  |Z.W.,@.['x.8..%.|
00000200  43 56 e7 90 32 61 53 89  39 c8 7b e0 c5 79 02 51  |CV..2aS.9.{..y.Q|



Some sample random zip files (non-MS Office) for comparison:
00000000  50 4b 03 04 14 00 00 00  08 00 37 83 b0 42 c2 bf  |PK........7..B..|
00000010  ec 66 53 08 00 00 3d 16  00 00 0a 00 00 00 30 78  |.fS...=.......0x|
00000020  30 34 30 39 2e 69 6e 69  d5 58 59 6f 13 31 10 7e  |0409.ini.XYo.1.~|



This one has directory indicators at the beginning, so you can see the PK signature repetition (PK signatures are always four bytes: the second pair indicates the type of record, but the first pair is always PK):
00000000  50 4b 03 04 14 00 00 00  00 00 79 72 b5 42 00 00  |PK........yr.B..|
00000010  00 00 00 00 00 00 00 00  00 00 09 00 00 00 43 53  |..............CS|
00000020  43 6f 6d 6d 6f 6e 2f 50  4b 03 04 14 00 00 00 00  |Common/PK.......|
00000030  00 92 72 b5 42 00 00 00  00 00 00 00 00 00 00 00  |..r.B...........|
00000040  00 11 00 00 00 43 53 43  6f 6d 6d 6f 6e 2f 43 6c  |.....CSCommon/Cl|
00000050  61 73 73 65 73 2f 50 4b  03 04 0a 00 00 00 00 00  |asses/PK........|
00000060  79 72 b5 42 02 a5 1c 5b  32 ae 00 00 32 ae 00 00  |yr.B...[2...2...|
00000070  24 00 00 00 43 53 43 6f  6d 6d 6f 6e 2f 43 6c 61  |$...CSCommon/Cla|
00000080  73 73 65 73 2f 63 73 63  6f 6d 6d 6f 6e 63 6c 61  |sses/cscommoncla|
00000090  73 73 65 73 2e 46 58 50  fe f2 ee 36 67 3e 21 05  |sses.FXP...6g>!.|
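For reference, the first record in the dump above can be decoded with a few lines of Python. This is a sketch based on the 30-byte local file header layout in APPNOTE.txt; the function name and dictionary keys are mine, and the sample bytes are copied from the first 39 bytes of the dump.

```python
import struct

# signature, version needed, flags, method, mod time, mod date,
# CRC-32, compressed size, uncompressed size, name length, extra length
LOCAL_HEADER = struct.Struct("<4sHHHHHIIIHH")  # 30 bytes in total


def parse_local_header(buf, offset=0):
    """Decode one zip local file header starting at the given offset."""
    (sig, version, flags, method, _mtime, _mdate,
     crc, csize, usize, name_len, extra_len) = LOCAL_HEADER.unpack_from(buf, offset)
    if sig != b"PK\x03\x04":
        raise ValueError("no local file header signature at offset %d" % offset)
    name_start = offset + LOCAL_HEADER.size
    name = buf[name_start:name_start + name_len].decode("cp437")
    return {
        "name": name,
        "version_needed": version,
        "flags": flags,
        "method": method,          # 0 = stored, 8 = deflate
        "crc32": crc,
        "compressed_size": csize,
        "uncompressed_size": usize,
        # Where the next record starts (valid when the sizes are filled in)
        "next_offset": name_start + name_len + extra_len + csize,
    }


sample = bytes.fromhex(
    "50 4b 03 04 14 00 00 00 00 00 79 72 b5 42 00 00 "
    "00 00 00 00 00 00 00 00 00 00 09 00 00 00 43 53 "
    "43 6f 6d 6d 6f 6e 2f"
)
header = parse_local_header(sample)
```

Here header["name"] comes out as "CSCommon/" and header["next_offset"] as 39 (0x27), which is exactly where the second PK signature sits in the dump.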



Just another random standard zip example:
00000000  50 4b 03 04 14 00 02 00  08 00 a0 92 49 32 c4 f6  |PK..........I2..|
00000010  d0 bc 18 1c 00 00 73 5d  00 00 0b 00 00 00 4c 69  |......s]......Li|
00000020  63 65 6e 73 65 2e 74 78  74 d5 5c 4b 6f dc c6 96  |cense.txt.\Ko...|
00000030  de 07 c8 7f 28 08 18 44  0d b4 db b1 9c 38 b9 ce  |....(..D.....8..|


snieves

ASKER

So there's no way to rebuild the PKs or blank areas? Is there any standard to the structure? Can I paste my sample into a different one? Or should I just call it a day and move on? Thank you very much for your help with this.
I thought this carving tool might be of interest:
http://www.forensicfocus.com/Forums/viewtopic/t=7814/

Technically, the OOXML files are compressed with the deflate method. When they are assembled into a zip file (but with the extension docx, xlsx or pptx), the deflated XML files are organized according to the zip format. That means there will usually be 4 different record types in the final zip file:

1. ZIP local file header. Start signature = 0x504b0304
2. ZIP data descriptor. Start signature = 0x504b0708
3. ZIP central directory file header. Start signature = 0x504b0102
4. ZIP end of central directory record. Start signature = 0x504b0506
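The four signatures listed above can be searched for directly. Below is a minimal Python sketch (the names SIGNATURES and scan_signatures are just illustrative) that walks a byte string and reports every offset where one of them occurs. Note that compressed data can contain these byte patterns by chance, so a hit is a hint rather than proof.

```python
SIGNATURES = {
    b"PK\x03\x04": "local file header",
    b"PK\x07\x08": "data descriptor",
    b"PK\x01\x02": "central directory file header",
    b"PK\x05\x06": "end of central directory record",
}


def scan_signatures(data):
    """Return a list of (offset, record type) for every zip signature found."""
    hits = []
    pos = data.find(b"PK")
    while pos != -1:
        kind = SIGNATURES.get(data[pos:pos + 4])
        if kind is not None:
            hits.append((pos, kind))
        pos = data.find(b"PK", pos + 1)
    return hits
```

Typical use would be scan_signatures(open(path, "rb").read()). A healthy docx should give a local file header at offset 0 and an end of central directory record near the end; an empty list means there is nothing zip-like left to rebuild from.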

At the least, the structure of the docx needs to be rebuilt before Word can open it, especially "Document.xml.rels", and that means manual reconstruction:
http://msdn.microsoft.com/en-us/library/bb266220(v=office.12).aspx
http://social.msdn.microsoft.com/Forums/office/en-US/d52e8532-fc0f-42ce-a40c-55811511d800/how-to-add-header-and-footer-to-docx-file-using-ooxml-format?forum=oxmlsdk
@snieves: You have the standard for the Zip structure - PKWare have always owned and retained the copyright on Zip files and how they are formed: their APPNote.txt is the standard.  As breadtan mentions, MS have some documents indicating their perspective on the standard and you'll note the quoted signatures match those of the hex dumps I made.

What was the trojan/virus that your brother was subjected to? Perhaps the ultimate answer is in what it does. As mentioned, however, what you have in that sample looks like garbage and is therefore unrecoverable.
Errata: The Zip format is in the public domain (not copyright).  PKWare's APPNote.txt is still the de facto standard. :)

@snieves: if the virus that you had was one of the current trend of ransomware you may still be able to decrypt them.

Ransomware: http://en.wikipedia.org/wiki/Ransomware_%28malware%29
Brute forcing may not be worth it if the value of the data is much lower than the effort needed to recover it. I agree with Barthax. It is also often advisable to rebuild the system rather than rely on the AV reporting that it has been cleaned up; it depends on your risk appetite.
snieves

ASKER

Unfortunately I have to try. These files are case files from the past few years. My brother is a cop and this was his "backup" drive. I will find out what virus they got hit with. Apparently they called in a contractor to fix the Chief's files but could not fix them for everyone else because of budget issues. I will attempt to rebuild the files. Thank you all for your help with this!!!
snieves

ASKER

Ok, apparently he got hit with cryptolocker and ever since it was cleaned off the computer, the files have been unreadable. Does this change things? Is it possible to de-crypt them?
CryptoLocker uses asymmetric encryption with public and private keys (RSA). It is not easy to break, and attempting it is not advisable or worth the effort. In practice this means most victims may end up paying for the key if they want to retrieve all their files immediately.

Supposedly the "Panda Ransomware Decrypt" tool should help, though possibly not in your case given what you have shared. It may be worth a try:
@ http://malwarefixes.com/remove-cyber-locker-ransomware/
snieves

ASKER

I tried the Panda decrypt tool without any luck. Like I mentioned before, these are case files for a cop so I have to keep trying. Any other ideas? I also tried using Emsisoft's DECRYPTER and that didn't work. Do you think these files are still encrypted, or were they decrypted and the doc header information removed?
ASKER CERTIFIED SOLUTION
btan

I did see one commenter report success (quoted below), but I am not sure of the exact details of what was done.

http://blog.malwarebytes.org/intelligence/2013/10/cryptolocker-ransomware-what-you-need-to-know/

Nikolay Shmakov on October 13, 2013 at 10:57 pm said:
Now to the good news. I have found the way to decrypt files after Cryptolocker has done its modifications. :) It renamed the files but there was no encryption set. So you should be able to restore them by renaming the extension of the tmp file.
To be more specific, the keys that encrypt the files are public keys, and it is not going to be practical to derive the corresponding private key.
snieves

ASKER

Well thought out and thorough.