Could you point a PHP class that could give a file signature based on its content?

Posted on 2016-09-03
Last Modified: 2016-09-04
Hi Experts

Could you point a PHP class that could give a file signature based on its content?

My purpose is to check when a file is uploaded if something has been changed on it from the first time
it was uploaded.

Thanks in advance.
Question by:Eduardo Fuerte
LVL 55

Assisted Solution

by:Julian Hansen
Julian Hansen earned 100 total points
ID: 41782838
Why not do

$hash = md5(file_get_contents($filename));

Open in new window

LVL 109

Accepted Solution

Ray Paseur earned 200 total points
ID: 41782898
I think we covered md5() before, right?  You don't need a class -- PHP has a built-in function

MD5 strings are identical for identical inputs.  So if you read file#1 and make the md5() hash, then read file#2 and make the md5() hash, you can compare these hashes to see if the files are the same.  If the hashes match the files are the same.

Of course, you could just compare the contents of the files, too.  So why would anyone compare only the md5() hash?  

One reason could be that the files are so large that you can only get one of them into memory at a time.  You would read one into a string variable, make the md5() hash, unset the string variable to free the memory, then read the second file, make the md5() hash and compare the hash strings.  Thus the md5 hash becomes a proxy for the contents of the files.

A more mainstream and probably more frequent use of md5() is in securing data communications.  Consider a data transport problem where the recipient does not have access to the original file, and would like to know that the data had not been damaged or tampered with.  The sender would create the md5() hash from a known "salt" string appended to the information payload.  The recipient would take the payload, add the salt and create another md5() string.  If the strings match, the data is intact.

Author Comment

by:Eduardo Fuerte
ID: 41782903
That time I couldn't read all your posts with the attention it desires!
I'm in a hurry, having to attend a dificult test that have to be sent until next monday 08 o'clock in the morning!  So it will be carefully read after.

I guess what you point is a adequated solution - just a .pdf that when updated produces this warning
but the file is uploaded and the "signature" is ok.

Free Tool: Postgres Monitoring System

A PHP and Perl based system to collect and display usage statistics from PostgreSQL databases.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

LVL 29

Assisted Solution

by:Olaf Doschke
Olaf Doschke earned 200 total points
ID: 41782905
You should be exact on what you really want, because the term signature points in two totally different directions.

A file signature is a valid term more commonly known as a file checksum, any checksum or hash algorithm qualifies for this need, even those disqualifying for password hashing still are good enough for file checksums.

But there also is the topic of signing files, not only about the file being untouched and/or completely transferred, but also about identifying who signed a file, eg when you sign a pdf for upload to revenue office/service.

If this is about what Ray pointed back to, its hashing a file. Other reasoning why you don't only profit from hashing over comparing full files is not only about their size, once you know and store a hash, you can compare it to future uploads, also for faster computation and still checking the completeness of file uploads, you might hash just the first 1KB and last 1KB of a file, so your memory consumtion is less and the hash is computed faster. The most common upload errors are double upload and incomplete upload. For these two cases only taking a partial file hash is sufficient.

Bye, Olaf.

Author Comment

by:Eduardo Fuerte
ID: 41782907


I'm not sure I completelly understand what you posted.
But since it's not a "prodution" algorithm - is just to attend a test,  I guess  Julian solution it's sufficient by now, ok?
LVL 29

Expert Comment

by:Olaf Doschke
ID: 41782917
On the assumption you don't want to sign for knowing the authentic source of the file (eg the vendor of a software), an md5 is sufficient to check the file is same or differs.

Bye, Olaf.

Author Closing Comment

by:Eduardo Fuerte
ID: 41783620
Thank you for so qualified assistance!

Featured Post

Free Tool: Path Explorer

An intuitive utility to help find the CSS path to UI elements on a webpage. These paths are used frequently in a variety of front-end development and QA automation tasks.

One of a set of tools we're offering as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
How do I fix this UPDATE error? 7 32
What's wrong with this PDO query? 5 27
Moving from Mcrypt to OpenSSL 18 45
jQuery force form POST 7 48
Introduction HTML checkboxes provide the perfect way for a web developer to receive client input when the client's options might be none, one or many.  But the PHP code for processing the checkboxes can be confusing at first.  What if a checkbox is…
Nothing in an HTTP request can be trusted, including HTTP headers and form data.  A form token is a tool that can be used to guard against request forgeries (CSRF).  This article shows an improved approach to form tokens, making it more difficult to…
Explain concepts important to validation of email addresses with regular expressions. Applies to most languages/tools that uses regular expressions. Consider email address RFCs: Look at HTML5 form input element (with type=email) regex pattern: T…
The viewer will learn how to count occurrences of each item in an array.

829 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question