Still celebrating National IT Professionals Day with 3 months of free Premium Membership. Use Code ITDAY17

x
?
Solved

How do I create a URL shortener that generates unique URL's that are not sequential (so that the next URL cannot be guessed from any existing URL) ?

Posted on 2010-09-16
6
Medium Priority
?
1,007 Views
Last Modified: 2012-05-10
For every file that is saved to disk a database record is created that contains the filename (and other fields). I need to generate a unique filename for each file. The URL containing the filename is posted on the internet. I want to generate unique (and preferably short) URL's that are not sequential so that the next URL cannot be guessed based on any existing URL. I could get the id from the database row and hash it in base62 (lowecase and uppercase letters and numbers) but this would generate sequential URL's allowing filenames for other files to be easily guessed. What do you suggest?
0
Comment
Question by:rubixcube5
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
6 Comments
 
LVL 84

Expert Comment

by:ozo
ID: 33690139
0
 

Author Comment

by:rubixcube5
ID: 33690347
Which algorithm do you suggest and how can I encode and decode in Java?
Will the hash generated be unique with no clashes?
0
 
LVL 84

Expert Comment

by:ozo
ID: 33690495
you should decide on a URL length adequate to hold all possible id's
you can then choose an encryption algorithm that handles id's of that length.
If the length you choose is short enough, you may be able to implement the encryption with a lookup table.
0
Will your db performance match your db growth?

In Percona’s white paper “Performance at Scale: Keeping Your Database on Its Toes,” we take a high-level approach to what you need to think about when planning for database scalability.

 
LVL 3

Accepted Solution

by:
gremwell earned 500 total points
ID: 33693265
If you can store short file names in the database, you don't have to use hashes or encryption. Randomly generated values will do equally well and you can choose them to be as long or as short as you like. The values should use characters allowed in URLs, you can get some inspiration from the implementation of generateKey() here http://rocky-says.blogspot.com/2010/04/java-code-url-shortener.html. However, you need to modify the function to check if the generated value exists in the database already (the example keeps in-memory list of keys). The example I refer to generates strings 8 characters in length using 62-symbol alphabet. It gives you 62^8 (~200000000000000) combination, which is a lot.

If you can't change the database schema, you have to use an encryption. Encryption algorithms based on fixed lookup tables (alphabet substitution) are weak in many ways http://en.wikipedia.org/wiki/Substitution_cipher. The best way to protect confidentiality of variable-length data is to use a stream cipher, with random initialization vector. However, this approach is substantially more complicated than the first one and likely to generate longer URLs.
0
 

Expert Comment

by:bhupeshchawda
ID: 33699797
Actually, you can use the method suggested by gremwell above using randomly generated values, It may not require you to change the database schema. You can just add an extra table to the database where you store the mappings of the file names and the randomly generated values.
As far as checking the already present values is concerned, here is what you can do:
Generate maximum random numbers in one go and store them in the table. Now, when you have a filename just map it to the next random number in the table which is not yet assigned to any filename. This way you will not have to check if the random number already exists, it won't. All are unique.
Random number generation can be done by generating a sequence and then using shuffle for that collection.
Randomization will help because here you actually have no algorithm that anybody can guess!
0
 
LVL 84

Expert Comment

by:ozo
ID: 33699815
In order to meet the non-guessability requirement, you should us a a cryptographically strong random number generator for this
0

Featured Post

On Demand Webinar - Networking for the Cloud Era

This webinar discusses:
-Common barriers companies experience when moving to the cloud
-How SD-WAN changes the way we look at networks
-Best practices customers should employ moving forward with cloud migration
-What happens behind the scenes of SteelConnect’s one-click button

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Article by: Nadia
Suppose you use Uber application as a rider and you request a ride to go from one place to another. Your driver just arrived at the parking lot of your place. The only thing you know about the ride is the license plate number. How do you find your U…
Prime numbers are natural numbers greater than 1 that have only two divisors (the number itself and 1). By “divisible” we mean dividend % divisor = 0 (% indicates MODULAR. It gives the reminder of a division operation). We’ll follow multiple approac…
I've attached the XLSM Excel spreadsheet I used in the video and also text files containing the macros used below. https://filedb.experts-exchange.com/incoming/2017/03_w12/1151775/Permutations.txt https://filedb.experts-exchange.com/incoming/201…
In this video, Percona Solutions Engineer Barrett Chambers discusses some of the basic syntax differences between MySQL and MongoDB. To learn more check out our webinar on MongoDB administration for MySQL DBA: https://www.percona.com/resources/we…

722 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question