fmisa

asked on

Which algorithm to use for account number/string obfuscation/encryption

Hi All,

Any thoughts on common/best practice for account number obfuscation ?

I've looked at classical character substitution/transposition algorithms, XOR (Base64) based approaches, TDES public/private key and one-way MD5 message digest approach as well......

I have account numbers - alphanumeric - that I would like to obfuscate (ideally one-way).
My only requirements -- are that the:
* cipher text is the same number of characters as the "plain text" (i.e. MD5 too long, transposition might work but I'm not sure about uniqueness)
* cipher text remains alphanumeric -- no =!@#$%& etc. symbols introduced into the cipher (i.e. XOR type not really helpful)
* cipher text does not need to be as strong as TDES but should be a little harder to break than Base64 or simple character substitution/shift/mapping
* cipher text is unique -- so that audit and history information based on the obfuscated cipher text is not corrupted by the algorithm producing the same result for two different account numbers.

Any thoughts would be greatly appreciated...
Thanks
Frank
ASKER CERTIFIED SOLUTION
moorhouselondon
ozo
How about something like RC4 mod 62 instead of mod 256?
Or modulo 36 if case is ignored.
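
As a rough illustration of the "RC4 mod 62" idea, here is a minimal sketch. It assumes a 62-character alphabet (0-9, A-Z, a-z) and adds each RC4 keystream byte, reduced modulo 62, to the character's index. The function names and the alphabet ordering are illustrative assumptions, not something specified in this thread. Reducing a byte modulo 62 is slightly biased (256 is not a multiple of 62), which should be acceptable for obfuscation but not for real security.

```python
import string

# Assumed 62-symbol alphabet: digits, then upper case, then lower case.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def rc4_keystream(key: bytes):
    """Yield RC4 keystream bytes (standard key scheduling, then generation)."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    i = j = 0
    while True:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        yield s[(s[i] + s[j]) % 256]


def rc4_mod62(text: str, key: bytes, decrypt: bool = False) -> str:
    """Shift each alphanumeric character by a keystream value modulo 62."""
    sign = -1 if decrypt else 1
    stream = rc4_keystream(key)
    out = []
    for ch in text:
        k = next(stream) % len(ALPHABET)
        # ALPHABET.index raises ValueError for non-alphanumeric input.
        out.append(ALPHABET[(ALPHABET.index(ch) + sign * k) % len(ALPHABET)])
    return "".join(out)
```

With a fixed key, the output has the same length as the input and stays alphanumeric, and calling `rc4_mod62(cipher, key, decrypt=True)` recovers the original. For the case-insensitive variant, the same sketch would use a 36-character alphabet.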
fmisa

ASKER

Thanks very much for the thoughts....
I've opted to use a: Vigenère cipher
See:
     http://en.wikipedia.org/wiki/Vigen%C3%A8re_Cipher
     http://www.csci.csusb.edu/public/crypto/game/VigenereCipher_Overview.php

I just want to obfuscate from casual eyes.....
I'm not trying to hide from someone capable of using frequency analysis -- or other cracking tools/approaches.....
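
For reference, a Vigenère cipher extended to alphanumeric account numbers might look like the sketch below. The classical cipher works on 26 letters; the 62-character alphabet and the function name here are assumptions made for this example.

```python
import string

# Assumed 62-symbol alphabet: digits, then upper case, then lower case.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Shift each character by the matching key character, modulo 62."""
    sign = -1 if decrypt else 1
    out = []
    for i, ch in enumerate(text):
        # The key repeats when it is shorter than the text.
        shift = ALPHABET.index(key[i % len(key)])
        out.append(ALPHABET[(ALPHABET.index(ch) + sign * shift) % len(ALPHABET)])
    return "".join(out)


# The key "1" shifts every character by one position, wrapping at the end.
print(vigenere("AZ9z", "1"))  # BaA0
```

The cipher text has the same length as the plain text and uses only characters from the same alphabet.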

The one worry I have about using stream ciphers -- like RC4 or UUENCODE ? -- is uniqueness of cipherText an issue ?
When randomized cipher keys/seeds are brought into the picture -- is it not possible for two different plainText strings to be encoded to the same cipherText at different times/dates ? Depending on what keys were being used at that time ?

I need to generate/prove uniqueness of cipherText algorithm -- so I'm using a very simple approach.....
The strength of TDES/MD5 type "security" is not critical for me....

I would consider some hashing algorithm/message digest approach -- if you can suggest something.....
Provided the algorithm is simple and cipherText output can be tailored in terms of length, character use....
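
One way a digest could be tailored in length and character set is sketched below: take a keyed hash, convert it to base 62, and keep as many characters as needed. HMAC-SHA-256 and the name `hashed_token` are assumptions chosen for this example (the thread only mentions MD5). The catch is that truncating a digest makes collisions possible, so this approach does not guarantee the uniqueness asked for above.

```python
import hashlib
import hmac
import string

ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def hashed_token(account: str, key: bytes, length: int) -> str:
    """One-way, alphanumeric token of the requested length (not collision-free)."""
    if not 1 <= length <= 42:
        # A 256-bit digest only carries about 42 full base-62 digits.
        raise ValueError("length must be between 1 and 42")
    digest = hmac.new(key, account.encode("utf-8"), hashlib.sha256).digest()
    n = int.from_bytes(digest, "big")
    chars = []
    for _ in range(length):
        n, r = divmod(n, len(ALPHABET))
        chars.append(ALPHABET[r])
    return "".join(chars)


# With only 2 characters there are 62 * 62 = 3844 possible tokens,
# so 4000 distinct inputs must produce at least one collision.
tokens = {hashed_token(str(i), b"secret", 2) for i in range(4000)}
print(len(tokens) < 4000)  # True
```

For realistic lengths collisions are much rarer, but they can never be ruled out the way they can with a reversible cipher.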

Thanks

UUENCODE would be unique: it is reversible and it is lossless.  How you get to the UUENCODE step is a different matter.  Using XOR to get to that point is fine because you are flipping bits one-to-one against a cipher stream.  Decoding it back is a case of running the same cipher stream again and XOR'ing, which is entirely reversible.
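
A small sketch of the XOR + UUENCODE round trip, using Python's `binascii` module for the uuencoding step. The helper names and the repeating-key XOR are assumptions for illustration.

```python
import binascii
from itertools import cycle


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the repeating key; applying it twice undoes it."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def encode(account: str, key: bytes) -> bytes:
    # b2a_uu handles at most 45 bytes per call, enough for an account number.
    return binascii.b2a_uu(xor_bytes(account.encode("ascii"), key))


def decode(token: bytes, key: bytes) -> str:
    return xor_bytes(binascii.a2b_uu(token), key).decode("ascii")


token = encode("AB12cd34", b"k3y")
print(decode(token, b"k3y"))  # AB12cd34
```

The round trip is lossless, but the uuencoded token is longer than the input and can contain punctuation, so it does not meet the same-length and alphanumeric-only requirements from the original question.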

Ozo will answer for RC4, I have no knowledge of that one.  From his last comment though, you are going to lose case sensitivity with mod 36.
Sorry.  It is possible for there to be two different strings encoded to the same ciphertext.  However, to retrieve the original plaintext, you will need to know what the cipherkey was at the time you encoded the plaintext.  If this is not known then it cannot be decoded, and becomes a trap-door type encryption (similar to md5).  I have assumed that the ciphertext would be reproducible when decoding the text.  In which case I stand by my earlier comment.
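
To make the point about changing keys concrete, here is a small sketch using a simple per-position shift over an assumed 62-character alphabet. The key values and account numbers are made up for the example.

```python
import string

ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def shift_encrypt(text: str, shifts: list[int]) -> str:
    """Add a per-position shift (the key) to each character, modulo 62."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + shifts[i % len(shifts)]) % len(ALPHABET)]
        for i, ch in enumerate(text)
    )


key_one = [1, 0, 0, 0]  # hypothetical key in use at one time
key_two = [0, 0, 0, 0]  # hypothetical key in use at another time

# Different plain texts under different keys can give the same cipher text.
print(shift_encrypt("1000", key_one) == shift_encrypt("2000", key_two))  # True

# Under one fixed key, the same two plain texts stay distinct.
print(shift_encrypt("1000", key_one) == shift_encrypt("2000", key_one))  # False
```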
fmisa

ASKER

I'm getting a little confused.....
Let's say I go with one of the following three options we've talked about (all with a constant/fixed cipher "key"):
1) XOR  + UUENCODE
2) RC4
3) Vigenère cipher....

** How can I verify/confirm that given a constant key -- no two different strings (composed of [a-zA-Z0-9]) will "encrypt" to the same cipher text ?
I don't want to find out months later that my chosen algorithm maps two different customer account numbers/strings to the same cipher text.
The cipher text is being used to track customer history -- and I'm hearing mixed things from you both on the "uniqueness" of the cipher text of each of these methods based on the same/fixed cipher key ?

Options 1 & 3 - in particular - interest me most - because they are easy to implement and understand.
Am I safe going with one of these ? Given "strength" of encryption from cracking techniques is not critical for me.....
Just a) obfuscate and b) ensure uniqueness of cipher text -- is all I need....
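
One way to check uniqueness empirically is to encrypt every possible string of a short length under the fixed key and look for duplicates. The sketch below does this for a Vigenère-style shift over an assumed 62-character alphabet; the function names and the key "K3y" are illustrative. An exhaustive check is only feasible for short lengths (62^n strings), so it is a sanity check on the implementation rather than a proof.

```python
import itertools
import string

ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def vigenere_encrypt(text: str, key: str) -> str:
    """Shift each character by the matching key character, modulo 62."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + ALPHABET.index(key[i % len(key)])) % 62]
        for i, ch in enumerate(text)
    )


def is_collision_free(encrypt, length: int) -> bool:
    """Encrypt every alphanumeric string of the given length; fail on a repeat."""
    seen = set()
    for chars in itertools.product(ALPHABET, repeat=length):
        cipher = encrypt("".join(chars))
        if cipher in seen:
            return False
        seen.add(cipher)
    return True


print(is_collision_free(lambda s: vigenere_encrypt(s, "K3y"), 2))  # True
print(is_collision_free(str.lower, 2))  # False: "A" and "a" collide
```

The second call shows what a failure looks like: any lossy step, such as folding case, breaks uniqueness.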

Thanks
fmisa

ASKER

Let me rephrase....

Are "collisions" possible using Vigenère cipher or XOR ?
OR are cipher duplicates/collisions only experienced with hashing/MD5 type algorithms ??

Thanks
Are "collisions" possible using Vigenère cipher or XOR ?  
Not with a fixed key.
Otherwise, decoding would be ambiguous.
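
The argument can be written out as follows, as a sketch assuming an alphabet of size m (for example m = 62), plain-text symbols p_i, cipher-text symbols c_i and fixed key symbols k_i; the notation is introduced here and is not from the thread.

```latex
\begin{align*}
\text{Vigen\`ere:}\quad & c_i = (p_i + k_i) \bmod m, & p_i &= (c_i - k_i) \bmod m \\
\text{XOR:}\quad        & c_i = p_i \oplus k_i,      & p_i &= c_i \oplus k_i
\end{align*}

In both cases a decryption function $D_k$ exists with $D_k(E_k(p)) = p$ for
every plain text $p$. So if $E_k(p) = E_k(p')$ for the same key $k$, then
\[
  p = D_k(E_k(p)) = D_k(E_k(p')) = p' .
\]
```

In words: two different plain texts cannot share a cipher text under the same key, because decryption would then have to return both.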
fmisa

ASKER

Thanks very much.....
I didn't actually use any of your suggestions -- but the discussion has really helped me.
I'll stick with Vigenère cipher -- with a long, fixed cipher key.

I'd like to split the points between you both -- ozo & moorhouselondon -- for your great thoughts and help.....

Do I need a separate post to "admin" -- or is this post enough to have site-admin split the points for me at a later date ?

Take Care...