
CheckSum of file in Memory before saving

I have a file, and I want to verify that it is saved without corruption when it is written.
How can I get the MD5 checksum of the file while it is still in memory?
I know I can open the file and check it once it's saved, with the following code.
Thanks
protected string GetMD5HashFromFile(string fileName)
{
  // "using" closes the file even if hashing throws
  using (FileStream file = new FileStream(fileName, FileMode.Open))
  {
    MD5 md5 = new MD5CryptoServiceProvider();
    byte[] retVal = md5.ComputeHash(file);
    ASCIIEncoding enc = new ASCIIEncoding();
    return enc.GetString(retVal);
  }
}

Asked by: JElster
 
JaccoCommented:
You can do the save with a MemoryStream

Just capture the contents, from wherever you get them, in a MemoryStream instead of saving them to a file; check the MD5, then copy the MemoryStream to a FileStream if the MD5 is OK.

I added a routine I often use to copy one stream to another. (You could use it to copy a MemoryStream to a FileStream.)
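public static void CopyStream(Stream src, Stream dst)
{
  byte[] buf = new byte[4096];
  int rd;
  // Read until the source is exhausted, writing each chunk to the destination
  while ((rd = src.Read(buf, 0, buf.Length)) > 0)
  {
    dst.Write(buf, 0, rd);
  }
  // Rewind the destination so it can be read from the start (seekable streams only)
  if (dst.CanSeek)
    dst.Position = 0;
}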
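
On .NET Framework 4.0 and later, the built-in Stream.CopyTo should do the same job as the routine above. A minimal sketch, assuming the data is already in a MemoryStream; the names StreamCopyExample and SaveToFile are made up for illustration:

```csharp
using System.IO;

public static class StreamCopyExample
{
    // Writes the full contents of an in-memory stream to a file on disk.
    public static void SaveToFile(MemoryStream source, string path)
    {
        // CopyTo starts at the current position, so rewind first.
        source.Position = 0;
        using (FileStream file = new FileStream(path, FileMode.Create, FileAccess.Write))
        {
            source.CopyTo(file);
        }
    }
}
```

The using block closes the file once the copy finishes, so the file can be reopened afterwards to verify it.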

 
JaccoCommented:
Typo: You can do the same with a MemoryStream
 
JElsterAuthor Commented:
How do I get the file into a MemoryStream? Then, when I save, how do I copy the MemoryStream to the FileStream?
thanks
 
Todd GerbertIT ConsultantCommented:
You can modify the method you already have to accept a parameter of type Stream instead of a string fileName, then you can pass in any kind of stream.

string md5Hash = GetMD5Hash(new FileStream("C:\\test.txt", FileMode.Open));

Or, assuming you have the contents of the file in a byte array: string md5Hash = GetMD5Hash(new MemoryStream(fileBytes));

protected static string GetMD5Hash(Stream stream)
{
	MD5 md5 = new MD5CryptoServiceProvider();
	byte[] retVal = md5.ComputeHash(stream);
	return Encoding.ASCII.GetString(retVal);
}
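
One caveat with the method above: Encoding.ASCII.GetString is lossy for hash bytes above 0x7F (they decode to '?'), so two different hashes can end up as the same string. A sketch of an alternative that returns a hexadecimal string instead; the class name Md5Hex is made up, and rewinding seekable streams before hashing is an addition of this sketch, not part of the method above:

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class Md5Hex
{
    // Returns the MD5 hash of the stream as a lowercase hex string.
    public static string GetMD5Hash(Stream stream)
    {
        // Hash from the start of the stream when the stream allows seeking.
        if (stream.CanSeek)
            stream.Position = 0;

        using (MD5 md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(stream);
            // BitConverter.ToString yields "D4-1D-8C-..."; strip the dashes.
            return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
        }
    }
}
```

With this version an empty stream hashes to d41d8cd98f00b204e9800998ecf8427e, which is the standard MD5 value for empty input.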

 
Todd GerbertIT ConsultantCommented:
Or you can make overloads so you can call string md5Hash = GetMD5Hash("C:\\test.txt"); or string md5Hash = GetMD5Hash(fileBytes);

protected static string GetMD5Hash(string fileName)
{
	// Dispose the FileStream so the file is not left open
	using (FileStream file = new FileStream(fileName, FileMode.Open))
	{
		return GetMD5Hash(file);
	}
}

protected static string GetMD5Hash(byte[] content)
{
	return GetMD5Hash(new MemoryStream(content));
}

protected static string GetMD5Hash(Stream stream)
{
	MD5 md5 = new MD5CryptoServiceProvider();
	byte[] retVal = md5.ComputeHash(stream);
	return Encoding.ASCII.GetString(retVal);
}

 
Todd GerbertIT ConsultantCommented:
Or, you can also just write (assuming fileContent is a byte array)

string md5Hash = Encoding.ASCII.GetString(new MD5CryptoServiceProvider().ComputeHash(fileContent));
 
JElsterAuthor Commented:
How do I know if the file content is a byte array?
The file is a PDF file.
thanks
 
JaccoCommented:
>> How do I know if the file content is a byte array?

Depends where you get the data from. Is it transmitted over the internet?
var client = new WebClient();
var buf = client.DownloadData("http://yoururl");
var mem = new MemoryStream(buf);

MD5 md5 = new MD5CryptoServiceProvider();
byte[] retVal = md5.ComputeHash(mem);

mem.Position = 0; // needed: ComputeHash leaves the stream positioned at its end
var file = new FileStream("yourfile", FileMode.Create);
CopyStream(mem, file); // use routine of my previous comment
file.Close();

 
JaccoCommented:
var buf = File.ReadAllBytes("yourfile");

if it's from disk
 
JElsterAuthor Commented:
The file is in memory... it's an opened PDF... how do I get the byte[]?
 
Todd GerbertIT ConsultantCommented:
How are you opening the file now?
 
JElsterAuthor Commented:
I open the file using a FileStream.. but then the file is modified.. I need to check it before it gets written back or saved... thx
 
Todd GerbertIT ConsultantCommented:
You can pass a FileStream to the method I posted earlier.
 
JElsterAuthor Commented:
Don't understand.. the file is in memory.. it's modified.. how do I pass that to a fileStream?
 
Todd GerbertIT ConsultantCommented:
You just said you already have the PDF as a FileStream - pass that FileStream to the GetMD5Hash method.

You can post the code you have so far if you're still stuck.
 
JaccoCommented:
After the PDF is modified (how?) you have to save it to a stream or file first.

Jacco
 
Todd GerbertIT ConsultantCommented:
Here's some imaginary code that shows how your code might go.

I should point out that an MD5 checksum doesn't really do anything in and of itself; you need something to compare it to. For example, websites sometimes provide a checksum for a download. After I download the file I can calculate the checksum myself and see whether it equals the one the website provided; if they differ, I know an error occurred while downloading.

To that end, this sample code saves the PDF stream and then makes sure the checksum of the saved file matches the one calculated for the in-memory version. Although in reality I'm not sure this sort of check makes sense.

private void YourCurrentMethod()
{
  // You open the PDF as a MemoryStream
  MemoryStream pdfStream = new MemoryStream(File.ReadAllBytes("C:\\test.pdf"));

  // You modify the PDF, by passing the MemoryStream to some
  // method in a PDF library that does the modification
  MakeChangesToPdf(pdfStream);

  // Calculate the checksum of the in-memory version of the file
  string md5Hash = GetMD5Hash(pdfStream);

  // Save the modified PDF to disk. GetMD5Hash left the stream at its end,
  // so rewind before reading.
  byte[] buffer = new byte[pdfStream.Length];
  pdfStream.Position = 0;
  pdfStream.Read(buffer, 0, buffer.Length);
  File.WriteAllBytes("C:\\modifiedPdf.pdf", buffer);

  // Check if checksum matches
  if (GetMD5Hash("C:\\modifiedPdf.pdf") == md5Hash)
  {
    // They match
  }
}

protected static string GetMD5Hash(string fileName)
{
	// Dispose the FileStream so the file is not left open
	using (FileStream file = new FileStream(fileName, FileMode.Open))
	{
		return GetMD5Hash(file);
	}
}

protected static string GetMD5Hash(byte[] content)
{
	return GetMD5Hash(new MemoryStream(content));
}

protected static string GetMD5Hash(Stream stream)
{
	if (stream.CanSeek)
		stream.Position = 0;
	return Encoding.ASCII.GetString(new MD5CryptoServiceProvider().ComputeHash(stream));
} 

 
JElsterAuthor Commented:
The PDF is in memory.. it's open... How do I get the 'In memory' version?  Isn't your code reading a file?
thanks
 
Todd GerbertIT ConsultantCommented:
The PDF is in memory

What does that mean? You said above that you were opening the PDF as/from a FileStream.

You will need to post what code you have to go any further with this question, I think.
 
JElsterAuthor Commented:
I'll open another question.
Your code works.. I just need to check the file when it's modified in memory. before being saved...
thanks
 
Todd GerbertIT ConsultantCommented:
need to check the file when it's modified in memory. before being saved...

The snippet I posted does just that.  Post your code here and we'll have a look.
 
JElsterAuthor Commented:
I'm trying to use your code... It's opened in memory using a PDF program. The PDF is an object in memory; it is not serializable. It's modified by the program... I want to get the byte[] before it is written, so I can then check if it was written correctly.
 
Todd GerbertIT ConsultantCommented:
Okay, what PDF program was it opened with?

And by "an object in memory" do you mean you declare a PDF object in your code, perhaps like below?
PDFDocument pdf = new PDFDocument();
pdf.CreateFromWidgetOrSomeOtherMethod();
etc...

 
JElsterAuthor Commented:
yes.... it's a third party program doing that.. thx
 
Todd GerbertIT ConsultantCommented:
I gathered.  Which third party program?
 
JElsterAuthor Commented:
Tall Components
 
Todd GerbertIT ConsultantCommented:
Getting closer. ;)

PDFControls.Net, WebToPdf.Net, PDFRasterizer.Net, TallPDF.Net, PDFKit.Net, PDFWebViewer.Net, PDFA.Net or PDFThumbnail.Net?
 
Todd GerbertIT ConsultantCommented:
This would go much faster if you could just post your code...
 
JElsterAuthor Commented:
PDFControls.Net...

I don't have any working code...

I open the PDF and it gets loaded into a Document object, which is not serializable...
I want to check the file before I call Write, which saves it.
 
Todd GerbertIT ConsultantCommented:
This is based on the professional edition, but I don't have a license so I can't test it.

In this example "pdfDocument" is an object whose type is TallComponents.PDF.Document.

// I have no idea how you're opening or creating the PDF
// so I've left that step out

// Change the PDF
pdfDocument.DocumentInfo.Author = "Me!";

// Calculate the checksum of the in-memory document
MemoryStream inMemoryPdf = new MemoryStream();
pdfDocument.Write(inMemoryPdf);
string inMemoryChecksum = GetMD5Hash(inMemoryPdf);

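// Write the changed PDF to disk; dispose the FileStream so the data is
// flushed and the file is closed before it is reopened for hashing
using (FileStream output = new FileStream("C:\\test.pdf", FileMode.CreateNew))
{
	pdfDocument.Write(output);
}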

// Calc disk file checksum
string onDiskChecksum = GetMD5Hash("C:\\test.pdf");

if (inMemoryChecksum == onDiskChecksum)
	MessageBox.Show("Success");
else
	MessageBox.Show("Failure");

 
JElsterAuthor Commented:
I never get Success...  must be some other stuff that is written to the pdf.
???????????
 
JElsterAuthor Commented:
The memory stream lengths are the SAME!
 
Todd GerbertIT ConsultantCommented:
Apparently, every time you call Document.Write() the bytes written are slightly different. So the bytes you get when you call pdfDocument.Write() to put the contents into a memory stream differ from the bytes you get when you use pdfDocument.Write() to put them in a disk file (and I would expect the checksums of two different sets of bytes to be different).

The solution is to call pdfDocument.Write() just once, to a MemoryStream. Then you can get the MD5 hash for the memory stream, write the MemoryStream to disk (instead of using pdfDocument.Write), and get the MD5 hash for the disk file you just created.

This seems to work (note that I changed GetMD5Hash to return a Base64 string representing the bytes of the hash):
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Text;
using System.Windows.Forms;
using TallComponents.PDF;
using System.IO;
using System.Security.Cryptography;

namespace WindowsFormsApplication1
{
	public partial class Form1 : Form
	{
		MD5CryptoServiceProvider md5Provider = new MD5CryptoServiceProvider();

		public Form1()
		{
			InitializeComponent();
		}

		private void btnSave_Click(object sender, EventArgs e)
		{
			Document pdf = new Document();

			// Open a test PDF
			pdf.Open("C:\\a.pdf");

			// Change it
			pdf.DocumentInfo.Author = "Santa Claus";
			
			// Write contents of Document to a MemoryStream
			MemoryStream pdfStream = new MemoryStream();
			pdf.Write(pdfStream);

			// Get checksum of MemoryStream
			string inMemoryChecksum = GetMD5Hash(pdfStream);

			// Write the changed PDF to disk
			byte[] buffer = new byte[pdfStream.Length];
			pdfStream.Position = 0;
			pdfStream.Read(buffer, 0, buffer.Length);
			File.WriteAllBytes("C:\\test.pdf", buffer);

			// Get checksum of disk file
			string diskFileChecksum = GetMD5Hash("C:\\test.pdf");

			if (inMemoryChecksum == diskFileChecksum)
				MessageBox.Show("Success");
			else
				MessageBox.Show("Failure");
		}

		private string GetMD5Hash(string fileName)
		{
			using (FileStream file = new FileStream(fileName, FileMode.Open))
			{
				return GetMD5Hash(file);
			}
		}

		private string GetMD5Hash(byte[] content)
		{
			return GetMD5Hash(new MemoryStream(content));
		}

		private string GetMD5Hash(Stream stream)
		{
			if (stream.CanSeek)
				stream.Position = 0;
			return Convert.ToBase64String(new MD5CryptoServiceProvider().ComputeHash(stream));
		}



	}
}
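
The same write-once pattern, reduced to a sketch with no PDF library: the bytes are produced once, hashed in memory, written to disk, and the file is then hashed again. The names WriteVerifier and WriteAndVerify are hypothetical; with the code above, the byte array would come from the MemoryStream that the document was written to.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class WriteVerifier
{
    // Writes content to path, then re-reads the file and compares MD5 hashes.
    // Returns true when the bytes on disk hash to the same value as the bytes in memory.
    public static bool WriteAndVerify(byte[] content, string path)
    {
        using (MD5 md5 = MD5.Create())
        {
            // Hash the in-memory bytes before anything touches the disk.
            string inMemory = Convert.ToBase64String(md5.ComputeHash(content));

            File.WriteAllBytes(path, content);

            // Reopen the file read-only and hash what was actually stored.
            string onDisk;
            using (FileStream file = new FileStream(path, FileMode.Open, FileAccess.Read))
            {
                onDisk = Convert.ToBase64String(md5.ComputeHash(file));
            }
            return inMemory == onDisk;
        }
    }
}
```

A file read back immediately after writing may be served from the operating system's cache, so a check like this mainly catches truncated or failed writes.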

 
JElsterAuthor Commented:
Thanks again!
Question has a verified solution.
