Which algorithm to use for account number/string obfuscation/encryption

Hi All,

Any thoughts on common/best practice for account number obfuscation ?

I've looked at classical character substitution/transposition algorithms, XOR-based approaches (with Base64 encoding), TDES, and the one-way MD5 message digest approach as well.

I have account numbers - alphanumeric - that I would like to obfuscate (ideally one-way).
My only requirements are that the:
* cipher text is the same number of characters as the "plain text" (i.e. MD5 is too long; transposition might work but I'm not sure about uniqueness)
* cipher text remains alphanumeric -- no =!@#$%& etc. symbols introduced into the cipher text (i.e. XOR-type approaches are not really helpful)
* cipher text does not need to be as strong as TDES, but should be a little harder to break than Base64 or simple character substitution/shift/mapping
* cipher text is unique -- so that audits and history info based on the obfuscated cipher text are not corrupted by the algorithm producing the same result for two different account numbers.

Any thoughts would be greatly appreciated...
Thanks
Frank
fmisaAsked:

moorhouselondonCommented:
You could do a UUENCODE on !@#$%& type text, turning it into alphanumeric.  So XOR might be ok with this wrapper tacked on.  Not terribly secure though.

ozoCommented:
How about something like RC4 mod 62 instead of mod 256?
ozoCommented:
Or modulo 36 if case is ignored.
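One possible reading of the "RC4 mod 62" suggestion, as a minimal sketch: generate the standard RC4 keystream and add each keystream byte to the character's index in a 62-symbol alphabet, modulo 62. The names `ALPHABET`, `obfuscate` and `deobfuscate` are hypothetical, and this interpretation is an assumption (the suggestion could also mean running RC4 over a 62-element state). Reducing a byte modulo 62 is slightly biased, and reusing one fixed key means every account number sees the same keystream, so this is obfuscation rather than strong encryption.

```python
import string

# Assumed alphabet: digits, then upper case, then lower case (62 symbols).
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def rc4_keystream(key: bytes):
    """Yield RC4 keystream bytes (standard key scheduling + output loop)."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    i = j = 0
    while True:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        yield s[(s[i] + s[j]) % 256]


def obfuscate(text: str, key: bytes) -> str:
    """Add keystream bytes to alphabet indices modulo 62; length is preserved."""
    ks = rc4_keystream(key)
    return "".join(ALPHABET[(INDEX[ch] + next(ks)) % 62] for ch in text)


def deobfuscate(text: str, key: bytes) -> str:
    """Subtract the same keystream to recover the original string."""
    ks = rc4_keystream(key)
    return "".join(ALPHABET[(INDEX[ch] - next(ks)) % 62] for ch in text)
```

For the case-insensitive variant, the same idea would apply with a 36-symbol alphabet and `% 36`.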
fmisaAuthor Commented:
Thanks very much for the thoughts....
I've opted to use a: Vigenère cipher
See:
     http://en.wikipedia.org/wiki/Vigen%C3%A8re_Cipher
     http://www.csci.csusb.edu/public/crypto/game/VigenereCipher_Overview.php

I just want to obfuscate from casual eyes.....
I'm not trying to hide from someone capable of using frequency analysis -- or other cracking tools/approaches.....
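For illustration only, here is a minimal sketch of a Vigenère-style shift over the alphanumeric set [0-9A-Za-z], which keeps the output the same length as the input and free of symbols. The alphabet ordering, the function name `vigenere` and the requirement that the key is itself alphanumeric are assumptions made for this example, not details from the post.

```python
import string

# Assumed alphabet ordering: 0-9, A-Z, a-z (62 symbols).
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Shift each character by the matching key character, modulo 62.

    Raises ValueError if text or key contains a non-alphanumeric character.
    """
    sign = -1 if decrypt else 1
    out = []
    for pos, ch in enumerate(text):
        shift = ALPHABET.index(key[pos % len(key)])
        out.append(ALPHABET[(ALPHABET.index(ch) + sign * shift) % len(ALPHABET)])
    return "".join(out)
```

With this ordering, `vigenere("Az9", "1")` shifts every character by one position and wraps `z` around to `0`; calling it again with `decrypt=True` and the same key restores the input.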

The one worry I have about using stream ciphers -- like RC4 or UUENCODE ? -- is uniqueness of cipherText an issue ?
When randomized cipher keys/seeds are brought into the picture -- is it not possible for two different plainText strings to be encoded to the same cipherText at different times/dates ? Depending on what keys were being used at that time ?

I need to generate/prove uniqueness of cipherText algorithm -- so I'm using a very simple approach.....
The strength of TDES/MD5 type "security" is not critical for me....

I would consider some hashing algorithm/message digest approach -- if you can suggest something.....
Provided the algorithm is simple and cipherText output can be tailored in terms of length, character use....

Thanks

moorhouselondonCommented:
UUENCODE would be unique - it is reversible and it is lossless.  How you get to the UUENCODING is a different matter.  Using XOR to get to that point is fine because you are flipping bits one-to-one against a Cipher stream.  Decoding it back is a case of running the Cipher string again and XOR'ing, which is entirely reversible.

Ozo will answer for RC4, I have no knowledge of that one.  From his last comment though, you are going to lose case sensitivity with mod 36.
moorhouselondonCommented:
Sorry.  It is possible for there to be two different strings encoded to the same ciphertext.  However, to retrieve the original plaintext, you will need to know what the cipherkey was at the time you encoded the plaintext.  If this is not known then it cannot be decoded, and becomes a trap-door type encryption (similar to md5).  I have assumed that the ciphertext would be reproducible when decoding the text.  In which case I stand by my earlier comment.
ozoCommented:
uuencode expands the string, so 3 characters in become 4 characters out.
Given the key (or sufficient analysis) you can recover the RC4 plaintext from the cipher text,
so the cipher text is unique for a fixed key.
If the key can change, then different key/plaintext combinations can produce collisions in Vigenère cipher text too.
If you want to use times/dates as part of the plaintext, then the cipher text will map to unique times/dates for a fixed key.
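The 3-in, 4-out expansion can be seen with Python's standard `binascii.b2a_uu`, as a small sketch. The helper name `uu_body` is hypothetical; it strips the leading length character and the trailing newline that a uuencoded line carries, leaving only the encoded data characters.

```python
import binascii


def uu_body(data: bytes) -> str:
    """Return only the encoded characters of one uuencoded line.

    b2a_uu emits: one length character, the encoded data, then a newline.
    """
    line = binascii.b2a_uu(data)
    return line[1:-1].decode("ascii")


three = uu_body(b"abc")      # 3 input bytes
six = uu_body(b"AB12CD")     # 6 input bytes
print(len(three), len(six))  # 4 and 8 encoded characters
```

The encoded characters are drawn from a range of printable ASCII that includes punctuation, so the output is generally longer than the input and not purely alphanumeric.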
fmisaAuthor Commented:
I'm getting a little confused.....
Let's say I go with one of the following three options we've talked about (all with a constant/fixed cipher "key"):
1) XOR  + UUENCODE
2) RC4
3) Vigenère cipher....

** How can I verify/confirm that given a constant key -- no two different strings (composed of [a-zA-Z0-9]) will "encrypt" to the same cipher text ?
I don't want to find out months later that my chosen algorithm results in two customer account numbers/strings producing the same cipher text.
The cipher text is being used to track customer history -- and I'm hearing mixed things from you both on the "uniqueness" of the cipher text of each of these methods based on the same/fixed cipher key ?

Options 1 & 3 - in particular - interest me most - because they are easy to implement and understand.
Am I safe going with one of these ? Given "strength" of encryption from cracking techniques is not critical for me.....
Just a) obfuscate and b) ensure uniqueness of cipher text -- is all I need....

Thanks
fmisaAuthor Commented:
Let me rephrase....

Are "collisions" possible using Vigenère cipher or XOR ?
OR are cipher duplicates/collisions only experienced with hashing/MD5 type algorithms ??

Thanks
ozoCommented:
Are "collisions" possible using Vigenère cipher or XOR ?  
Not with a fixed key.
Otherwise, decoding would be ambiguous.
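One way to convince yourself of this is a brute-force check over every short input, sketched below. The helpers `shift_encode`, `xor_encode` and `count_collisions` and the example keys are hypothetical. An exhaustive run only covers the lengths you test; the general argument is that, for a fixed key, each position applies an invertible map (a shift modulo 62, or XOR with a fixed byte), so two different inputs of the same length cannot give the same output.

```python
import itertools
import string

ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def shift_encode(text: str, key: str) -> str:
    """Vigenère-style shift over the 62-symbol alphabet with a fixed key."""
    n = len(ALPHABET)
    return "".join(
        ALPHABET[(ALPHABET.index(c) + ALPHABET.index(key[i % len(key)])) % n]
        for i, c in enumerate(text)
    )


def xor_encode(text: str, key: bytes) -> bytes:
    """XOR each input byte with the matching byte of a fixed key."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(text.encode("ascii")))


def count_collisions(encode, length: int) -> int:
    """Encode every alphanumeric string of this length; count duplicate outputs."""
    seen = set()
    total = 0
    for chars in itertools.product(ALPHABET, repeat=length):
        seen.add(encode("".join(chars)))
        total += 1
    return total - len(seen)


print(count_collisions(lambda t: shift_encode(t, "k3Y"), 2))      # fixed-key shift
print(count_collisions(lambda t: xor_encode(t, b"\x5a\x13"), 2))  # fixed-key XOR
print(count_collisions(lambda t: t.upper(), 2))                   # lossy: case is dropped
```

The third call is there to show that the checker does report collisions for a mapping that discards information, such as ignoring case.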
fmisaAuthor Commented:
Thanks very much.....
I didn't actually use any of your suggestions -- but the discussion has really helped me.
I'll stick with Vigenère cipher -- with a long, fixed cipher key.

I'd like to split the points between you both -- ozo & moorhouselondon -- for your great thoughts and help.....

Do I need a separate post to "admin" -- or is this post enough to have the site admin split the points for me at a later date?

Take Care...
