Solved

How do I create a URL shortener that generates unique URL's that are not sequential (so that the next URL cannot be guessed from any existing URL) ?

Posted on 2010-09-16
6
994 Views
Last Modified: 2012-05-10
For every file that is saved to disk a database record is created that contains the filename (and other fields). I need to generate a unique filename for each file. The URL containing the filename is posted on the internet. I want to generate unique (and preferably short) URL's that are not sequential so that the next URL cannot be guessed based on any existing URL. I could get the id from the database row and hash it in base62 (lowecase and uppercase letters and numbers) but this would generate sequential URL's allowing filenames for other files to be easily guessed. What do you suggest?
0
Comment
Question by:rubixcube5
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
6 Comments
 
LVL 84

Expert Comment

by:ozo
ID: 33690139
0
 

Author Comment

by:rubixcube5
ID: 33690347
Which algorithm do you suggest and how can I encode and decode in Java?
Will the hash generated be unique with no clashes?
0
 
LVL 84

Expert Comment

by:ozo
ID: 33690495
you should decide on a URL length adequate to hold all possible id's
you can then choose an encryption algorithm that handles id's of that length.
If the length you choose is short enough, you may be able to implement the encryption with a lookup table.
0
Free Tool: ZipGrep

ZipGrep is a utility that can list and search zip (.war, .ear, .jar, etc) archives for text patterns, without the need to extract the archive's contents.

One of a set of tools we're offering as a way to say thank you for being a part of the community.

 
LVL 3

Accepted Solution

by:
gremwell earned 125 total points
ID: 33693265
If you can store short file names in the database, you don't have to use hashes or encryption. Randomly generated values will do equally well and you can choose them to be as long or as short as you like. The values should use characters allowed in URLs, you can get some inspiration from the implementation of generateKey() here http://rocky-says.blogspot.com/2010/04/java-code-url-shortener.html. However, you need to modify the function to check if the generated value exists in the database already (the example keeps in-memory list of keys). The example I refer to generates strings 8 characters in length using 62-symbol alphabet. It gives you 62^8 (~200000000000000) combination, which is a lot.

If you can't change the database schema, you have to use an encryption. Encryption algorithms based on fixed lookup tables (alphabet substitution) are weak in many ways http://en.wikipedia.org/wiki/Substitution_cipher. The best way to protect confidentiality of variable-length data is to use a stream cipher, with random initialization vector. However, this approach is substantially more complicated than the first one and likely to generate longer URLs.
0
 

Expert Comment

by:bhupeshchawda
ID: 33699797
Actually, you can use the method suggested by gremwell above using randomly generated values, It may not require you to change the database schema. You can just add an extra table to the database where you store the mappings of the file names and the randomly generated values.
As far as checking the already present values is concerned, here is what you can do:
Generate maximum random numbers in one go and store them in the table. Now, when you have a filename just map it to the next random number in the table which is not yet assigned to any filename. This way you will not have to check if the random number already exists, it won't. All are unique.
Random number generation can be done by generating a sequence and then using shuffle for that collection.
Randomization will help because here you actually have no algorithm that anybody can guess!
0
 
LVL 84

Expert Comment

by:ozo
ID: 33699815
In order to meet the non-guessability requirement, you should us a a cryptographically strong random number generator for this
0

Featured Post

[Webinar] Code, Load, and Grow

Managing multiple websites, servers, applications, and security on a daily basis? Join us for a webinar on May 25th to learn how to simplify administration and management of virtual hosts for IT admins, create a secure environment, and deploy code more effectively and frequently.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

One of Google's most recent algorithm changes affecting local searches is entitled "The Pigeon Update." This update has dramatically enhanced search inquires for the keyword "Yelp." Google searches with the word "Yelp" included will now yield Yelp a…
Iteration: Iteration is repetition of a process. A student who goes to school repeats the process of going to school everyday until graduation. We go to grocery store at least once or twice a month to buy products. We repeat this process every mont…
I've attached the XLSM Excel spreadsheet I used in the video and also text files containing the macros used below. https://filedb.experts-exchange.com/incoming/2017/03_w12/1151775/Permutations.txt https://filedb.experts-exchange.com/incoming/201…
Exchange organizations may use the Journaling Agent of the Transport Service to archive messages going through Exchange. However, if the Transport Service is integrated with some email content management application (such as an antispam), the admini…

739 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question