Go Premium for a chance to win a PS4. Enter to Win

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 1010
  • Last Modified:

How do I create a URL shortener that generates unique URL's that are not sequential (so that the next URL cannot be guessed from any existing URL) ?

For every file that is saved to disk a database record is created that contains the filename (and other fields). I need to generate a unique filename for each file. The URL containing the filename is posted on the internet. I want to generate unique (and preferably short) URL's that are not sequential so that the next URL cannot be guessed based on any existing URL. I could get the id from the database row and hash it in base62 (lowecase and uppercase letters and numbers) but this would generate sequential URL's allowing filenames for other files to be easily guessed. What do you suggest?
0
rubixcube5
Asked:
rubixcube5
1 Solution
 
rubixcube5Author Commented:
Which algorithm do you suggest and how can I encode and decode in Java?
Will the hash generated be unique with no clashes?
0
 
ozoCommented:
you should decide on a URL length adequate to hold all possible id's
you can then choose an encryption algorithm that handles id's of that length.
If the length you choose is short enough, you may be able to implement the encryption with a lookup table.
0
Important Lessons on Recovering from Petya

In their most recent webinar, Skyport Systems explores ways to isolate and protect critical databases to keep the core of your company safe from harm.

 
gremwellCommented:
If you can store short file names in the database, you don't have to use hashes or encryption. Randomly generated values will do equally well and you can choose them to be as long or as short as you like. The values should use characters allowed in URLs, you can get some inspiration from the implementation of generateKey() here http://rocky-says.blogspot.com/2010/04/java-code-url-shortener.html. However, you need to modify the function to check if the generated value exists in the database already (the example keeps in-memory list of keys). The example I refer to generates strings 8 characters in length using 62-symbol alphabet. It gives you 62^8 (~200000000000000) combination, which is a lot.

If you can't change the database schema, you have to use an encryption. Encryption algorithms based on fixed lookup tables (alphabet substitution) are weak in many ways http://en.wikipedia.org/wiki/Substitution_cipher. The best way to protect confidentiality of variable-length data is to use a stream cipher, with random initialization vector. However, this approach is substantially more complicated than the first one and likely to generate longer URLs.
0
 
bhupeshchawdaCommented:
Actually, you can use the method suggested by gremwell above using randomly generated values, It may not require you to change the database schema. You can just add an extra table to the database where you store the mappings of the file names and the randomly generated values.
As far as checking the already present values is concerned, here is what you can do:
Generate maximum random numbers in one go and store them in the table. Now, when you have a filename just map it to the next random number in the table which is not yet assigned to any filename. This way you will not have to check if the random number already exists, it won't. All are unique.
Random number generation can be done by generating a sequence and then using shuffle for that collection.
Randomization will help because here you actually have no algorithm that anybody can guess!
0
 
ozoCommented:
In order to meet the non-guessability requirement, you should us a a cryptographically strong random number generator for this
0

Featured Post

How to Use the Help Bell

Need to boost the visibility of your question for solutions? Use the Experts Exchange Help Bell to confirm priority levels and contact subject-matter experts for question attention.  Check out this how-to article for more information.

Tackle projects and never again get stuck behind a technical roadblock.
Join Now