Solved

What is the probability of two different strings having the same MD5 hash?

Posted on 2009-05-13
Last Modified: 2012-05-06
Hi,

I am planning to create my own query cache. I will be looking up the query results from the cache using an MD5 hash of the query string. There will be tens of thousands of queries stored in the cache. What is the likelihood of two different query strings having the same MD5 hash?

I ask because, to my knowledge, MySQL's native query cache looks up cached results using the full query string instead of a hash of the query string (which would be faster and use less storage), and I wondered whether this was due to potential hash collisions.

Thanks
Question by:tomp_gl

Assisted Solution

by:racek
racek earned 50 total points
No problem, but before computing the MD5 you need to normalize the query:
- collapse all double spaces into single spaces
- change everything to capitals
- replace variables with ? or similar
- maybe replace table and column aliases with the full table names :-) because different programmers use different aliases
- change LEFT JOIN to LEFT OUTER JOIN, because different programmers write joins differently
etc.
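
A minimal Python sketch of that kind of normalization, for illustration only. The names `canonicalize` and `cache_key` are hypothetical, and the sketch makes two assumptions worth stating: it leaves quoted string literals untouched (in a result cache, queries that differ only in their values must still get different keys), and it treats identifiers as case-insensitive. It does not attempt alias rewriting.

```python
import hashlib
import re

# Matches single-quoted SQL string literals, including backslash escapes.
_LITERAL = re.compile(r"('(?:[^'\\]|\\.)*')")

def canonicalize(query: str) -> str:
    """Hypothetical normalizer: collapse whitespace, uppercase, unify LEFT JOIN."""
    parts = _LITERAL.split(query.strip())
    out = []
    for i, part in enumerate(parts):
        if i % 2 == 1:
            # Odd indices are the captured quoted literals: keep them verbatim.
            out.append(part)
        else:
            part = re.sub(r"\s+", " ", part).upper()
            part = re.sub(r"\bLEFT JOIN\b", "LEFT OUTER JOIN", part)
            out.append(part)
    return "".join(out)

def cache_key(query: str) -> str:
    """MD5 hex digest of the canonical form, used as the cache lookup key."""
    return hashlib.md5(canonicalize(query).encode("utf-8")).hexdigest()
```

With this, `cache_key("select *  from t left join u on t.id = u.id")` and `cache_key("SELECT * FROM t LEFT OUTER JOIN u ON t.id = u.id")` produce the same key, while queries with different literal values do not.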


Accepted Solution

by:ozo
ozo earned 150 total points
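With 10,000 queries, approximately 10,000^2 / 2^128, or about 3 × 10^-31. (The usual birthday approximation is half that, n^2 / 2^129, but the order of magnitude is the same.)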
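
To check the order of magnitude yourself, here is a rough Python sketch using the standard birthday approximation p ≈ 1 - exp(-n(n-1) / (2 · 2^128)). It assumes MD5 output is uniformly distributed over 128 bits for non-adversarial inputs; the function name `collision_probability` is made up for this example.

```python
import math

def collision_probability(n: int, bits: int = 128) -> float:
    """Approximate chance that at least two of n random hashes collide."""
    pairs = n * (n - 1) // 2          # number of distinct pairs of queries
    expected = pairs / 2**bits        # expected number of colliding pairs
    # expm1 keeps precision for tiny values where 1 - exp(-x) would round to 0.
    return -math.expm1(-expected)

if __name__ == "__main__":
    for n in (10_000, 50_000, 1_000_000):
        print(n, collision_probability(n))
```

For 10,000 queries the result is on the order of 10^-31, which is why accidental collisions can be ignored in practice.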

Assisted Solution

by:dportas
dportas earned 50 total points
The risk of accidental MD5 collisions is vanishingly small. You don't need to worry about it. There is a possible risk of deliberately constructed collisions, which may present a security risk in some unusual circumstances.

As racek suggests, for your scheme to be effective you'll probably have to use some canonical form of the query rather than its raw format.

Assisted Solution

by:racek
racek earned 50 total points
Another thing: MySQL stores the query text including comments, so comments would also need to be stripped before hashing. For example:

SELECT *
FROM yourtable  /* changed 2009-01-05 */
Where ...;
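
A naive Python sketch of stripping such comments before hashing. The helper names `strip_block_comments` and `comment_insensitive_key` are hypothetical. It assumes `/* ... */` never appears inside a string literal, and it deliberately leaves MySQL's `/*! ... */` form alone, since the server executes the contents of those.

```python
import hashlib
import re

# Block comments, excluding the executable /*! ... */ form.
_BLOCK_COMMENT = re.compile(r"/\*(?!!).*?\*/", re.DOTALL)

def strip_block_comments(query: str) -> str:
    """Replace each /* ... */ comment with a single space."""
    return _BLOCK_COMMENT.sub(" ", query)

def comment_insensitive_key(query: str) -> str:
    """MD5 key that ignores block comments and whitespace differences."""
    text = strip_block_comments(query)
    text = re.sub(r"\s+", " ", text).strip()
    return hashlib.md5(text.encode("utf-8")).hexdigest()
```

Two copies of a query that differ only in a `/* changed ... */` note then map to the same cache key.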