rubixcube5
asked on
How do I create a URL shortener that generates unique URL's that are not sequential (so that the next URL cannot be guessed from any existing URL) ?
For every file that is saved to disk a database record is created that contains the filename (and other fields). I need to generate a unique filename for each file. The URL containing the filename is posted on the internet. I want to generate unique (and preferably short) URL's that are not sequential so that the next URL cannot be guessed based on any existing URL. I could get the id from the database row and hash it in base62 (lowecase and uppercase letters and numbers) but this would generate sequential URL's allowing filenames for other files to be easily guessed. What do you suggest?
http://en.wikipedia.org/wiki/Cryptographic_hash_function#File_or_data_identifier
ASKER
Which algorithm do you suggest and how can I encode and decode in Java?
Will the hash generated be unique with no clashes?
Will the hash generated be unique with no clashes?
you should decide on a URL length adequate to hold all possible id's
you can then choose an encryption algorithm that handles id's of that length.
If the length you choose is short enough, you may be able to implement the encryption with a lookup table.
you can then choose an encryption algorithm that handles id's of that length.
If the length you choose is short enough, you may be able to implement the encryption with a lookup table.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Actually, you can use the method suggested by gremwell above using randomly generated values, It may not require you to change the database schema. You can just add an extra table to the database where you store the mappings of the file names and the randomly generated values.
As far as checking the already present values is concerned, here is what you can do:
Generate maximum random numbers in one go and store them in the table. Now, when you have a filename just map it to the next random number in the table which is not yet assigned to any filename. This way you will not have to check if the random number already exists, it won't. All are unique.
Random number generation can be done by generating a sequence and then using shuffle for that collection.
Randomization will help because here you actually have no algorithm that anybody can guess!
As far as checking the already present values is concerned, here is what you can do:
Generate maximum random numbers in one go and store them in the table. Now, when you have a filename just map it to the next random number in the table which is not yet assigned to any filename. This way you will not have to check if the random number already exists, it won't. All are unique.
Random number generation can be done by generating a sequence and then using shuffle for that collection.
Randomization will help because here you actually have no algorithm that anybody can guess!
In order to meet the non-guessability requirement, you should us a a cryptographically strong random number generator for this