Solved

What is the probability of two different strings having the same MD5 hash?

Posted on 2009-05-13
Last Modified: 2012-05-06
Hi,

I am planning to create my own query cache. I will be looking up the query results from the cache using an MD5 hash of the query string. There will be tens of thousands of queries stored in the cache. What is the likelihood of two different query strings having the same MD5 hash?

The reason I ask is because to my knowledge MySQL's native query cache looks up cached results using the full query string, instead of using a hash of the query string (which would be faster and use less storage), and I wondered if this was due to potential hash conflicts.

Thanks
Question by:tomp_gl
4 Comments
 
LVL 14

Assisted Solution

by:racek
racek earned 150 total points
ID: 24373331
No problem, but before computing the MD5 you need to normalize the query text:
- collapse runs of spaces into a single space
- convert everything to capitals
- replace variables with ? or similar
- maybe replace table and column aliases with the full table names :-) because different programmers use different aliases
- rewrite LEFT JOIN as LEFT OUTER JOIN, because different programmers write the same join in different ways
etc.
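
A minimal sketch of the first two steps above, followed by the MD5 lookup key, might look like this. The function names `normalize_query` and `cache_key` are hypothetical, and the sketch assumes literals have already been replaced with placeholders, since upper-casing would otherwise change the meaning of quoted strings.

```python
import hashlib
import re


def normalize_query(sql: str) -> str:
    """Collapse whitespace runs and upper-case the query text.

    Assumption: literals were already replaced by '?' placeholders,
    so upper-casing does not alter any quoted values.
    """
    collapsed = re.sub(r"\s+", " ", sql.strip())
    return collapsed.upper()


def cache_key(sql: str) -> str:
    """Return the MD5 hex digest of the normalized query (32 hex characters)."""
    return hashlib.md5(normalize_query(sql).encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Two spellings of the same query map to the same key.
    print(cache_key("select  *  from orders where id = ?"))
    print(cache_key("SELECT * FROM orders WHERE id = ?"))
```

Alias expansion and JOIN rewriting are not attempted here; doing them reliably would need a real SQL parser rather than regular expressions.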

 
LVL 84

Accepted Solution

by:ozo
ozo earned 450 total points
ID: 24373336
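With 10,000 queries, approximately 10,000^2 / 2^128, i.e. about 3 x 10^-31. (The usual birthday approximation, n(n-1)/2 pairs divided by 2^128, gives half that, roughly 1.5 x 10^-31; either way it is negligible.)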
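
To put a number on that estimate, here is a small sketch using the birthday approximation p ≈ 1 - exp(-n(n-1)/2 / 2^bits). It assumes MD5 output behaves like a uniformly random 128-bit value for accidental (non-adversarial) inputs; the function name `collision_probability` is just illustrative.

```python
import math


def collision_probability(n: int, bits: int = 128) -> float:
    """Approximate chance that at least two of n random hashes collide.

    Uses the birthday approximation 1 - exp(-pairs / 2**bits), where
    pairs = n * (n - 1) / 2. expm1 keeps precision for tiny results.
    """
    pairs = n * (n - 1) / 2
    return -math.expm1(-pairs / 2.0 ** bits)


if __name__ == "__main__":
    for n in (10_000, 100_000, 1_000_000):
        print(n, collision_probability(n))
```

Even at a million cached queries the result stays far below anything that could be observed in practice.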
 
LVL 22

Assisted Solution

by:dportas
dportas earned 150 total points
ID: 24373796
The risk of accidental MD5 collisions is vanishingly small. You don't need to worry about it. There is a possible risk of deliberately constructed collisions, which may present a security risk in some unusual circumstances.

As racek suggests, for your scheme to be effective you'll probably have to use some canonical form of the query rather than its raw format.
 
LVL 14

Assisted Solution

by:racek
racek earned 150 total points
ID: 24374830
Another thing: MySQL stores the query text including any comments, for example:

SELECT *
FROM yourtable  /* changed 2009-01-05 */
Where ...;
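
If comments like this should not produce separate cache entries, one option is to strip them before hashing. The sketch below is only an illustration (the helper name `strip_block_comments` is hypothetical): it removes `/* ... */` blocks with a regular expression, so it would also remove such text inside string literals, and it would discard MySQL's `/*! ... */` version-specific comments and `/*+ ... */` optimizer hints, which do affect execution.

```python
import re

# Non-greedy match of a /* ... */ block, allowed to span multiple lines.
_BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)


def strip_block_comments(sql: str) -> str:
    """Remove /* ... */ comments and collapse the leftover whitespace.

    Caveat: this is a plain regex, not a SQL parser, so it does not
    recognize string literals or MySQL's executable comment forms.
    """
    without_comments = _BLOCK_COMMENT.sub(" ", sql)
    return " ".join(without_comments.split())


if __name__ == "__main__":
    query = "SELECT *\nFROM yourtable  /* changed 2009-01-05 */\nWHERE id = 1;"
    print(strip_block_comments(query))
```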