Solved

CheckSum of file in Memory before saving

Posted on 2011-02-21
Last Modified: 2012-06-22
I have a file, and I want to verify that it is saved without corruption when it is written.
How can I get the MD5 checksum of the file while it is still in memory?
I know I can open the file and check it once it's saved, using the following.
Thanks
protected string GetMD5HashFromFile(string fileName)
{
  // "using" closes the file even if hashing throws
  using (FileStream file = new FileStream(fileName, FileMode.Open))
  using (MD5 md5 = new MD5CryptoServiceProvider())
  {
    byte[] retVal = md5.ComputeHash(file);
    // Base64 keeps every hash byte; ASCII decoding turns bytes above 0x7F into '?'
    return Convert.ToBase64String(retVal);
  }
}

Question by:JElster
 
LVL 10

Expert Comment

by:Jacco
ID: 34943457
You can do the save with a MemoryStream

Just capture the contents, from wherever you get them, in a MemoryStream instead of saving them as a file. Check the MD5, then copy the MemoryStream to a FileStream if the MD5 is OK.

I added a routine I often use to copy one stream to another. (You could use it to copy a MemoryStream to a FileStream.)
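public static void CopyStream(Stream src, Stream dst)
{
  byte[] buf = new byte[4096];
  int rd;
  // Read returns 0 at the end of the stream
  while ((rd = src.Read(buf, 0, buf.Length)) > 0)
  {
    dst.Write(buf, 0, rd);
  }
  // Rewind the destination only when it supports seeking
  if (dst.CanSeek)
    dst.Position = 0;
}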
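
The buffer-then-verify idea described in this comment could be sketched roughly as follows. This is only an illustration under assumptions: the class name VerifiedSave, the method SaveIfHashMatches, and the idea that a known-good Base64 MD5 string is available from the data's source are all hypothetical and not part of the thread.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class VerifiedSave
{
    // Hypothetical helper: hashes the buffered content and writes it to disk
    // only when the hash equals the expected value.
    public static bool SaveIfHashMatches(MemoryStream content, string expectedBase64Md5, string path)
    {
        content.Position = 0; // hash from the start of the buffer
        string actual;
        using (MD5 md5 = MD5.Create())
        {
            actual = Convert.ToBase64String(md5.ComputeHash(content));
        }

        if (actual != expectedBase64Md5)
            return false; // nothing is written when the hash does not match

        using (FileStream file = new FileStream(path, FileMode.Create))
        {
            // WriteTo copies the whole buffer regardless of Position
            content.WriteTo(file);
        }
        return true;
    }
}
```

MemoryStream.WriteTo is used here instead of a manual copy loop; a routine like CopyStream would do the same job for streams that are not MemoryStreams.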


0
 
LVL 10

Expert Comment

by:Jacco
ID: 34943475
Typo: You can do the same with a MemoryStream
0
 
LVL 1

Author Comment

by:JElster
ID: 34944463
How do I get the file into a MemoryStream? And when I save, how do I copy the MemoryStream to the FileStream?
Thanks
0
 
LVL 33

Expert Comment

by:Todd Gerbert
ID: 34946579
You can modify the method you already have to accept a parameter of type Stream instead of a string fileName, then you can pass in any kind of stream.

string md5Hash = GetMD5Hash(new FileStream("C:\\test.txt", FileMode.Open));

Or, assuming you have the contents of the file in a byte array: string md5Hash = GetMD5Hash(new MemoryStream(fileBytes));

protected static string GetMD5Hash(Stream stream)
{
	using (MD5 md5 = new MD5CryptoServiceProvider())
	{
		byte[] retVal = md5.ComputeHash(stream);
		// Base64 preserves every hash byte; ASCII decoding would turn bytes above 0x7F into '?'
		return Convert.ToBase64String(retVal);
	}
}
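
A note on turning the hash bytes into a string: decoding arbitrary bytes as ASCII is lossy, because every byte above 0x7F comes out as '?', so two different hashes can produce the same text. A small sketch of the difference (the class name HashText and its two methods are made up for illustration):

```csharp
using System;
using System.Text;

public static class HashText
{
    // Lossy: any byte above 0x7F is replaced with '?'
    public static string AsAscii(byte[] bytes)
    {
        return Encoding.ASCII.GetString(bytes);
    }

    // Lossless: the original bytes can be recovered with Convert.FromBase64String
    public static string AsBase64(byte[] bytes)
    {
        return Convert.ToBase64String(bytes);
    }
}
```

For example, the single bytes 0x80 and 0xFF both decode to "?" through AsAscii, while AsBase64 gives "gA==" and "/w==" respectively, so only the Base64 form is safe to compare.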


0
 
LVL 33

Expert Comment

by:Todd Gerbert
ID: 34946616
Or you can make overloads so you can call string md5Hash = GetMD5Hash("C:\\test.txt"); or string md5Hash = GetMD5Hash(fileBytes);

protected static string GetMD5Hash(string fileName)
{
	// "using" closes the file once it has been hashed
	using (FileStream file = new FileStream(fileName, FileMode.Open))
	{
		return GetMD5Hash(file);
	}
}

protected static string GetMD5Hash(byte[] content)
{
	return GetMD5Hash(new MemoryStream(content));
}

protected static string GetMD5Hash(Stream stream)
{
	using (MD5 md5 = new MD5CryptoServiceProvider())
	{
		byte[] retVal = md5.ComputeHash(stream);
		// Base64 preserves every hash byte; ASCII decoding would turn bytes above 0x7F into '?'
		return Convert.ToBase64String(retVal);
	}
}


0
 
LVL 33

Expert Comment

by:Todd Gerbert
ID: 34946660
Or, you can also just write (assuming fileContent is a byte array)

string md5Hash = Convert.ToBase64String(new MD5CryptoServiceProvider().ComputeHash(fileContent));
0
 
LVL 1

Author Comment

by:JElster
ID: 34947852
How do I know if the file content is a byte array?
The file is a PDF file.
Thanks
0
 
LVL 10

Expert Comment

by:Jacco
ID: 34949105
>> How do I know if the file content is a byte array?

Depends where you get the data from. Is it transmitted over the internet?
var client = new WebClient();
var buf = client.DownloadData("http://yoururl");
var mem = new MemoryStream(buf);

MD5 md5 = new MD5CryptoServiceProvider();
byte[] retVal = md5.ComputeHash(mem);

// Needed: ComputeHash leaves the stream positioned at its end
mem.Position = 0;
using (var file = new FileStream("yourfile", FileMode.Create))
{
  CopyStream(mem, file); // use the routine from my previous comment
}
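
On the question of whether the stream has to be rewound before copying: ComputeHash reads from the current position to the end of the stream, so a read that follows it finds nothing unless Position is reset. A minimal sketch to confirm this (PositionDemo and BytesReadableAfterHash are hypothetical names used only for illustration):

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class PositionDemo
{
    // Returns how many bytes can be read right after hashing,
    // with or without rewinding the stream first.
    public static int BytesReadableAfterHash(byte[] data, bool rewind)
    {
        using (MemoryStream mem = new MemoryStream(data))
        using (MD5 md5 = MD5.Create())
        {
            md5.ComputeHash(mem); // consumes the stream up to its end
            if (rewind)
                mem.Position = 0;

            byte[] buf = new byte[data.Length];
            return mem.Read(buf, 0, buf.Length);
        }
    }
}
```

With a 4-byte input, the call returns 0 when rewind is false and 4 when it is true, which is why a copy made after hashing comes out empty without the reset.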


0
 
LVL 10

Expert Comment

by:Jacco
ID: 34949108
var buf = File.ReadAllBytes("yourfile");

if it's from disk
0
 
LVL 1

Author Comment

by:JElster
ID: 34951106
The file is in memory... it's an opened PDF. How do I get the byte[]?
0
 
LVL 33

Expert Comment

by:Todd Gerbert
ID: 34951515
How are you opening the file now?
0
 
LVL 1

Author Comment

by:JElster
ID: 34951545
I open the file using a FileStream, but then the file is modified. I need to check it before it gets written back or saved. Thanks
0
 
LVL 33

Expert Comment

by:Todd Gerbert
ID: 34951575
You can pass a FileStream to the method I posted earlier.
0
 
LVL 1

Author Comment

by:JElster
ID: 34951698
I don't understand. The file is in memory and it has been modified. How do I pass that to a FileStream?
0
 
LVL 33

Expert Comment

by:Todd Gerbert
ID: 34951857
You just said you already have the PDF as a FileStream - pass that FileStream to the GetMD5Hash method.

You can post the code you have so far if you're still stuck.
0
 
LVL 10

Expert Comment

by:Jacco
ID: 34952066
After the PDF is modified (how?) you have to save it to a stream or file first.

Jacco
0
 
LVL 33

Accepted Solution

by:
Todd Gerbert earned 500 total points
ID: 34952409
Here's some imaginary code that shows how your code might go.

I should point out that an MD5 checksum doesn't do anything in and of itself; you need something to compare it to. For example, websites sometimes provide a checksum for a file you download. After downloading, I can calculate the checksum myself and see whether it equals the one the website provided. If they differ, I know an error occurred while downloading.

To that end, this sample code saves the PDF stream and then makes sure the checksum of the saved file matches the one that was calculated for the in-memory version. In reality, though, I'm not sure this sort of check makes sense.

private void YourCurrentMethod()
{
  // You open the PDF as a MemoryStream
  MemoryStream pdfStream = new MemoryStream(File.ReadAllBytes("C:\\test.pdf"));

  // You modify the PDF, by passing the MemoryStream to some
  // method in a PDF library that does the modification
  MakeChangesToPdf(pdfStream);

  // Calculate the checksum of the in-memory version of the file
  string md5Hash = GetMD5Hash(pdfStream);

  // Save the modified PDF to disk. ToArray copies the whole stream regardless
  // of Position, which GetMD5Hash has left at the end of the stream
  File.WriteAllBytes("C:\\modifiedPdf.pdf", pdfStream.ToArray());

  // Check if checksum matches
  if (GetMD5Hash("C:\\modifiedPdf.pdf") == md5Hash)
  {
    // They match
  }
}

protected static string GetMD5Hash(string fileName)
{
	// "using" closes the file once it has been hashed
	using (FileStream file = new FileStream(fileName, FileMode.Open))
	{
		return GetMD5Hash(file);
	}
}

protected static string GetMD5Hash(byte[] content)
{
	return GetMD5Hash(new MemoryStream(content));
}

protected static string GetMD5Hash(Stream stream)
{
	if (stream.CanSeek)
		stream.Position = 0;
	// Base64 preserves every hash byte; ASCII decoding would turn bytes above 0x7F into '?'
	return Convert.ToBase64String(new MD5CryptoServiceProvider().ComputeHash(stream));
}
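
The download scenario mentioned at the top of this comment, comparing a computed checksum against one a website publishes, might look something like the sketch below. It assumes the site publishes the MD5 as a hexadecimal string, which is common but not stated in the thread; ChecksumCheck and MatchesPublishedMd5 are hypothetical names.

```csharp
using System;
using System.Security.Cryptography;

public static class ChecksumCheck
{
    // Compares the MD5 of the downloaded bytes with a published hex checksum.
    public static bool MatchesPublishedMd5(byte[] content, string publishedHex)
    {
        using (MD5 md5 = MD5.Create())
        {
            // BitConverter.ToString yields "D4-1D-8C-..."; drop the dashes to get plain hex
            string actual = BitConverter.ToString(md5.ComputeHash(content)).Replace("-", "");
            // Published checksums vary in letter case, so compare case-insensitively
            return string.Equals(actual, publishedHex.Trim(), StringComparison.OrdinalIgnoreCase);
        }
    }
}
```

The point is the same as in the code above: the checksum is only useful because there is a second, independently obtained value to compare it with.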


0
 
LVL 1

Author Comment

by:JElster
ID: 34953018
The PDF is in memory.. it's open... How do I get the 'In memory' version?  Isn't your code reading a file?
thanks
0
 
LVL 33

Expert Comment

by:Todd Gerbert
ID: 34953146
The PDF is in memory

What does that mean?  You said above (http:#a34951545) that you were opening the PDF as/from a FileStream.

You will need to post what code you have to go any further with this question, I think.
0
 
LVL 1

Author Comment

by:JElster
ID: 34953243
I'll open another question.
Your code works. I just need to check the file when it's modified in memory, before it is saved.
Thanks
0
 
LVL 33

Expert Comment

by:Todd Gerbert
ID: 34953270
need to check the file when it's modified in memory. before being saved...

The snippet I posted does just that.  Post your code here and we'll have a look.
0
 
LVL 1

Author Comment

by:JElster
ID: 34953387
I'm trying to use your code. The PDF is opened in memory using a PDF program. It is an object in memory, and it is not serializable. It's modified by the program. I want to get the byte[] before it is written, so I can then check whether it was written correctly.
0
 
LVL 33

Expert Comment

by:Todd Gerbert
ID: 34953503
Okay, what PDF program was it opened with?

And by "an object in memory" do you mean you declare a PDF object in your code, perhaps like below?
PDFDocument pdf = new PDFDocument();
pdf.CreateFromWidgetOrSomeOtherMethod();
etc...

0
 
LVL 1

Author Comment

by:JElster
ID: 34953593
yes.... it's a third party program doing that.. thx
0
 
LVL 33

Expert Comment

by:Todd Gerbert
ID: 34953706
I gathered.  Which third party program?
0
 
LVL 1

Author Comment

by:JElster
ID: 34953738
Tall Components
0
 
LVL 33

Expert Comment

by:Todd Gerbert
ID: 34953771
Getting closer. ;)

PDFControls.Net, WebToPdf.Net, PDFRasterizer.Net, TallPDF.Net, PDFKit.Net, PDFWebViewer.Net, PDFA.Net or PDFThumbnail.Net?
0
 
LVL 33

Expert Comment

by:Todd Gerbert
ID: 34953790
This would go much faster if you could just post your code...
0
 
LVL 1

Author Comment

by:JElster
ID: 34953821
PDFControls.Net...

I don't have any working code...

I open the PDF and it gets loaded into Document object which is not serializable...
I want to check the file before I call Write  - which saves it
0
 
LVL 33

Expert Comment

by:Todd Gerbert
ID: 34954198
This is based on the professional edition, but I don't have a license so I can't test it.

In this example "pdfDocument" is an object whose type is TallComponents.PDF.Document.

// I have no idea how you're opening or creating the PDF
// so I've left that step out

// Change the PDF
pdfDocument.DocumentInfo.Author = "Me!";

// Calculate the checksum of the in-memory document
MemoryStream inMemoryPdf = new MemoryStream();
pdfDocument.Write(inMemoryPdf);
string inMemoryChecksum = GetMD5Hash(inMemoryPdf);

// Write the changed PDF to disk; "using" closes the file so it can be reopened below
using (FileStream output = new FileStream("C:\\test.pdf", FileMode.CreateNew))
{
	pdfDocument.Write(output);
}

// Calc disk file checksum
string onDiskChecksum = GetMD5Hash("C:\\test.pdf");

if (inMemoryChecksum == onDiskChecksum)
	MessageBox.Show("Success");
else
	MessageBox.Show("Failure");


0
 
LVL 1

Author Comment

by:JElster
ID: 34954593
I never get Success...  must be some other stuff that is written to the pdf.
???????????
0
 
LVL 1

Author Comment

by:JElster
ID: 34954631
The memory stream lengths are the SAME!
0
 
LVL 33

Expert Comment

by:Todd Gerbert
ID: 34956110
Apparently, every time you call Document.Write() the bytes written are slightly different. So the bytes you get when you call pdfDocument.Write() to put the contents into a memory stream differ from the bytes written when you use pdfDocument.Write() to put them in a disk file, and I would expect the checksums of two different sets of bytes to differ.

The solution is to call pdfDocument.Write() just once, to a MemoryStream. Then you can get the MD5 hash for the memory stream, write the MemoryStream to disk (instead of using pdfDocument.Write), and get the MD5 hash for the disk file you just created.

This seems to work (note that GetMD5Hash here returns a Base64 string representing the bytes of the hash, which, unlike an ASCII-decoded string, preserves every byte):
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Text;
using System.Windows.Forms;
using TallComponents.PDF;
using System.IO;
using System.Security.Cryptography;

namespace WindowsFormsApplication1
{
	public partial class Form1 : Form
	{
		MD5CryptoServiceProvider md5Provider = new MD5CryptoServiceProvider();

		public Form1()
		{
			InitializeComponent();
		}

		private void btnSave_Click(object sender, EventArgs e)
		{
			Document pdf = new Document();

			// Open a test PDF
			pdf.Open("C:\\a.pdf");

			// Change it
			pdf.DocumentInfo.Author = "Santa Claus";
			
			// Write contents of Document to a MemoryStream
			MemoryStream pdfStream = new MemoryStream();
			pdf.Write(pdfStream);

			// Get checksum of MemoryStream
			string inMemoryChecksum = GetMD5Hash(pdfStream);

			// Write the changed PDF to disk
			byte[] buffer = new byte[pdfStream.Length];
			pdfStream.Position = 0;
			pdfStream.Read(buffer, 0, buffer.Length);
			File.WriteAllBytes("C:\\test.pdf", buffer);

			// Get checksum of disk file
			string diskFileChecksum = GetMD5Hash("C:\\test.pdf");

			if (inMemoryChecksum == diskFileChecksum)
				MessageBox.Show("Success");
			else
				MessageBox.Show("Failure");
		}

		private string GetMD5Hash(string fileName)
		{
			using (FileStream file = new FileStream(fileName, FileMode.Open))
			{
				return GetMD5Hash(file);
			}
		}

		private string GetMD5Hash(byte[] content)
		{
			return GetMD5Hash(new MemoryStream(content));
		}

		private string GetMD5Hash(Stream stream)
		{
			if (stream.CanSeek)
				stream.Position = 0;
			return Convert.ToBase64String(new MD5CryptoServiceProvider().ComputeHash(stream));
		}



	}
}
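
The same write-once pattern can be pulled out into a helper that does not depend on any PDF library. The sketch below is an assumption-laden illustration, not part of the tested form code: WriteOnce and WriteAndVerify are hypothetical names, and the caller is expected to pass in the bytes of a stream that was written exactly once (for example pdfStream.ToArray()).

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class WriteOnce
{
    // Hashes the in-memory bytes, writes those same bytes to disk,
    // then hashes the file and reports whether the two hashes agree.
    public static bool WriteAndVerify(byte[] content, string path)
    {
        byte[] expected;
        byte[] actual;

        using (MD5 md5 = MD5.Create())
        {
            expected = md5.ComputeHash(content);
        }

        File.WriteAllBytes(path, content);

        using (MD5 md5 = MD5.Create())
        using (FileStream file = File.OpenRead(path))
        {
            actual = md5.ComputeHash(file);
        }

        // Base64 strings compare the full hash without losing any bytes
        return Convert.ToBase64String(expected) == Convert.ToBase64String(actual);
    }
}
```

One caveat worth keeping in mind: a file read back immediately after writing is often served from the operating system's cache, so a match confirms the bytes handed to the file system rather than what physically reached the disk.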


0
 
LVL 1

Author Comment

by:JElster
ID: 34956670
Thanks again!
0
