Solved

preg_match question

Posted on 2010-11-22
Last Modified: 2012-05-10
OK, I am doing a security upgrade for my site and I have a few questions.

What I want to do is limit usernames to letters A-Z, a-z and numbers 0-9. Is this a good idea, or should I allow more (like - and _ and whitespace in between, using trim to strip it from the beginning and end)? I was thinking about using preg_match, but I don't know how to write regular expressions.

Also, which characters (if any) should be disallowed in passwords? Right now I'm just stripping ' , but I will do md5 or custom encryption, so any char could be there... let me know what would be most logical. Also, should I really do md5 (no way to re-send passwords, only reset), or should I rather do some other custom encryption?

And lastly, is there any quick way to validate email pattern (universal for all email types) using preg_match?

Thanks
Question by:GVNPublic123
 

Author Comment

by:GVNPublic123
ID: 34191194
ok found expression /[\w ]/ that does all illegal chars.
 

Author Comment

by:GVNPublic123
ID: 34191224
ok for emails I found on some JS tutorials site:
  var emailPattern = /^[a-zA-Z0-9._-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}$/;  

Will that pattern work in ALL email cases properly? Even if email is fake I dont care, I just want correct pattern (I know there can be further validation done, I just dont care because emails are manually confirmed by users via activation link).
 

Author Comment

by:GVNPublic123
ID: 34191263
or this one:
if(!preg_match("/^( [a-zA-Z0-9] )+( [a-zA-Z0-9\._-] )*@( [a-zA-Z0-9_-] )+( [a-zA-Z0-9\._-] +)+$/" , $email))
 
LVL 34

Expert Comment

by:gr8gonzo
ID: 34191349
First things first - learning regular expressions will benefit you in more than just this project. They're not very hard to learn, either. There's a lot you CAN learn, but most people use just the basics, and those are easy to master.

http://www.regular-expressions.info/tutorial.html

Regarding the security side of things, I would probably allow all alphanumeric characters, as well as @, underscores _, dashes -, and periods ., since some people like to use their email addresses as their usernames. A regular expression for this would be:

[a-zA-Z0-9@_\.\-]

That expression would successfully match against any lower or upper case letter, any number, plus the aforementioned special characters. The brackets [ ] surrounding it simply indicate the possible range of characters.

When your form submits, you can use preg_replace to strip out any character that does not fall in that range. The syntax is basically:

$new_value = preg_replace("/REGULAR EXPRESSION HERE/","REPLACE WITH THIS",$original_value);

So to strip out all but the wanted characters in your username field:
$username = $_POST["username"];
$username = preg_replace("/[a-zA-Z0-9@_\.\-]/","",$username);

Regarding passwords, if you're doing one-way encryption with MD5, there's no need to restrict any characters. I would definitely avoid custom encryption if you're asking the question. The only people that should do custom encryption are those that are comfortable enough with that specific type of coding to not ask the question, in my opinion.

You may want to consider a different hash, though, like SHA-1 or something. While MD5 is mostly safe, there ARE some ways of brute-forcing it faster than other hashes. If this isn't super-secret stuff, then I wouldn't worry about it, but if there's any chance that you'll undergo an extremely strict audit or be under the supervision of anyone paranoid, then you might as well use SHA-1.

Regarding email matching, the pattern I gave you should be valid for any major email character, so:

$isValidEmail = preg_match("/[a-zA-Z0-9@_\.\-]+/",$emailAddressToBeChecked);

That should "validate" about 90% of your addresses. It doesn't validate the structure, so it would also pass addresses like .@@@abc1, which is clearly not valid. You can get a little more traditional with something like:

$isValidEmail = preg_match("/[a-zA-Z0-9_\.\-]+@[a-zA-Z0-9_\.\-]+/",$emailAddressToBeChecked);

That will ensure that there's an @ symbol between characters, but it'll still work for email addresses like bob@bob (which is technically valid).

That said, if someone is going to go out of their way to type a phony address in, there's not much you can do in terms of validation. You could always try to do MX lookups to check the actual DNS records and such, but that, too, will only get you so far. I would go with the second piece of code and simply acknowledge that you will have people that won't want to give their real address and will provide a phony one that matches the validation expression.
 

Author Comment

by:GVNPublic123
ID: 34191764
well once I choose crypt method its over, 650 people will have passwords irreversibly encrypted...and that would be a no-go to restore ever, so I gotta go with reliable system.

I've read that both MD5 and SHA1 are not so good as bruteforce attacks can be done fast on them. Any suggestions?
 
LVL 35

Expert Comment

by:Terry Woods
ID: 34191797
Brute force attacks can only be done quickly if you allow them to be. For example, you could allow only 3 login attempts every 5 minutes from each IP address, or for each login id (preferably both). You could also set up a warning notification where you get emailed if there's an unusually high number of failed logins within a particular time.... etc.
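
The "3 login attempts every 5 minutes" idea above could be sketched roughly as follows. This is only an illustration: the LoginThrottle class, its method names, and the in-memory storage are all assumptions made for the example, not something from this thread, and a real site would keep the attempt log in a database or cache.

```php
<?php
// Hypothetical in-memory throttle, for illustration only. A real site would
// store attempts in a database or cache so they survive between requests.
class LoginThrottle
{
    private $attempts = [];   // key (IP or login id) => list of attempt timestamps
    private $maxAttempts;
    private $windowSeconds;

    public function __construct(int $maxAttempts = 3, int $windowSeconds = 300)
    {
        $this->maxAttempts = $maxAttempts;
        $this->windowSeconds = $windowSeconds;
    }

    // Returns true if an attempt for $key is allowed at time $now (Unix seconds)
    // and records it; returns false once the limit for the window is reached.
    public function allow(string $key, int $now): bool
    {
        $cutoff = $now - $this->windowSeconds;
        $recent = [];
        foreach ($this->attempts[$key] ?? [] as $t) {
            if ($t > $cutoff) {
                $recent[] = $t; // keep only attempts inside the window
            }
        }
        $allowed = count($recent) < $this->maxAttempts;
        if ($allowed) {
            $recent[] = $now;
        }
        $this->attempts[$key] = $recent;
        return $allowed;
    }
}

$throttle = new LoginThrottle();
$ip = '203.0.113.7'; // placeholder address from the documentation range
foreach ([0, 10, 20, 30] as $t) {
    echo $t, ': ', $throttle->allow($ip, $t) ? 'allowed' : 'blocked', "\n";
}
```

Calling allow() once with the IP address and once with the login id would cover the "preferably both" case.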
 

Author Comment

by:GVNPublic123
ID: 34191859
yeah, but what if someone hacks the MySQL server and gets the database....
 
LVL 35

Expert Comment

by:Terry Woods
ID: 34191887
A correction to gr8gonzo's code:

$username = $_POST["username"];
$username = preg_replace("/[^a-zA-Z0-9@_\.\-]/","",$username);

Prior to adding the ^ character, it strips out all the wanted characters, not the unwanted ones.
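
To see the effect of the ^ inside the brackets side by side, here is a small sketch. The function names are made up for the example; the character class is the one used in this thread.

```php
<?php
// Hypothetical helper names; the character class is the one from the thread.

// With the leading ^ the class is negated, so every character that is NOT
// in the allowed set gets stripped.
function sanitize_username(string $raw): string
{
    return preg_replace('/[^a-zA-Z0-9@_.\-]/', '', $raw);
}

// Without the ^ the same call removes the allowed characters instead.
function strip_allowed_chars(string $raw): string
{
    return preg_replace('/[a-zA-Z0-9@_.\-]/', '', $raw);
}

echo sanitize_username('bob <b>smith</b>!'), "\n";   // bobbsmithb
echo strip_allowed_chars('bob <b>smith</b>!'), "\n"; // " <></>!" (leading space)
```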
 
LVL 35

Expert Comment

by:Terry Woods
ID: 34191899
A 2nd correction:

$isValidEmail = preg_match("/[a-zA-Z0-9@_\.\-]+/",$emailAddressToBeChecked);

Needs to be:

$isValidEmail = preg_match("/^[a-zA-Z0-9@_\.\-]+$/",$emailAddressToBeChecked);

Without the ^ and $ characters, a match will be found if just one valid character is found (and there could be any number of invalid ones).
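
A quick way to check the difference, as a sketch (the variable names are just for illustration). One extra detail that is standard PCRE behaviour rather than something raised in this thread: $ also matches just before a trailing newline, so \z (or the D modifier) is stricter if that matters to you.

```php
<?php
$unanchored = '/[a-zA-Z0-9@_.\-]+/';     // matches if ANY run of valid chars exists
$anchored   = '/^[a-zA-Z0-9@_.\-]+$/';   // whole string must consist of valid chars
$strict     = '/^[a-zA-Z0-9@_.\-]+\z/';  // same, but rejects a trailing newline too

$input = 'bob<script>';
var_dump(preg_match($unanchored, $input)); // int(1), "bob" alone is enough
var_dump(preg_match($anchored, $input));   // int(0), "<" and ">" are not allowed

var_dump(preg_match($anchored, "bob\n"));  // int(1), $ matches before the final newline
var_dump(preg_match($strict, "bob\n"));    // int(0)
```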
 
LVL 35

Expert Comment

by:Terry Woods
ID: 34191903
So this also:
$isValidEmail = preg_match("/[a-zA-Z0-9_\.\-]+@[a-zA-Z0-9_\.\-]+/",$emailAddressToBeChecked);

Should be:
$isValidEmail = preg_match("/^[a-zA-Z0-9_\.\-]+@[a-zA-Z0-9_\.\-]+$/",$emailAddressToBeChecked);
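
For comparison, here is a sketch that puts the anchored pattern next to PHP's built-in filter_var() with FILTER_VALIDATE_EMAIL. The built-in filter is not something suggested in this thread; it is swapped in here as an alternative to a hand-written pattern. The two wrapper function names are made up for the example.

```php
<?php
// Hypothetical wrapper names. The regex is the anchored pattern from this
// thread; filter_var() is PHP's built-in validator, shown as an alternative.
function is_valid_email_thread(string $email): bool
{
    return preg_match('/^[a-zA-Z0-9_.\-]+@[a-zA-Z0-9_.\-]+$/', $email) === 1;
}

function is_valid_email_builtin(string $email): bool
{
    // filter_var() returns the address on success and false on failure.
    return filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
}

foreach (['bob@example.com', '.@@@abc1', 'bob@bob'] as $candidate) {
    printf(
        "%-16s regex=%s builtin=%s\n",
        $candidate,
        var_export(is_valid_email_thread($candidate), true),
        var_export(is_valid_email_builtin($candidate), true)
    );
}
```

Either way this only checks the shape of the address; the activation link is still what proves the mailbox exists.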

Author Comment

by:GVNPublic123
ID: 34191964
ok, is using hash('sha256', $saltedpass) any better? Is it better than MD5 or SHA1 in terms of speed vs security, or does it just not matter? If SHA256 slows down brute-forcing significantly, then it wouldn't be worth brute-forcing in the first place, as my site is no big deal.
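
For reference, the hash('sha256', $saltedpass) idea could look roughly like this. It is a sketch under assumptions: the function names, the per-user random salt, and the choice to prepend the salt are all made up for the example (random_bytes() needs PHP 7 or later, hash_equals() PHP 5.6 or later).

```php
<?php
// Hypothetical helpers; names and the salt-prepending scheme are assumptions.

// One random salt per user, stored next to the hash in the users table.
function make_salt(): string
{
    return bin2hex(random_bytes(16)); // 32 hex characters
}

// Salted SHA-256, returned as 64 hex characters (md5 would give 32).
function hash_password(string $password, string $salt): string
{
    return hash('sha256', $salt . $password);
}

// Recompute the hash at login and compare in constant time.
function verify_password(string $password, string $salt, string $storedHash): bool
{
    return hash_equals($storedHash, hash_password($password, $salt));
}

$salt = make_salt();
$stored = hash_password('Myp4ssw*', $salt);
var_dump(strlen($stored));                             // int(64)
var_dump(verify_password('Myp4ssw*', $salt, $stored)); // bool(true)
var_dump(verify_password('wrong', $salt, $stored));    // bool(false)
```

PHP 5.5 and later also ship password_hash() and password_verify(), which generate and store the salt for you.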
 
LVL 34

Accepted Solution

by:gr8gonzo
ID: 34192111
Thanks, Terry. Your corrections are... well... correct. Sometimes I just tap things out in a hurry and don't get a chance to actually test them. :)

I will disagree with one thing though:
"Brute force attacks can only be done quickly..."

Just to expand on this, I don't think "quickly" is the right word. Even offline brute-forcing takes a significant amount of time without some serious hardware. Keep in mind that normally the hacker does not have direct access to the hash itself, so trying to brute-force the login screen will take an equally long time no matter what the encryption is on the backend. Trying to brute-force something over HTTP requests is almost impossible, no matter the hardware. At a certain point, the server can only handle so many simultaneous requests, and it usually takes millions if not billions or more tries to successfully brute-force an unknown length, blind string, so any network latency will slow that down to something that's almost always not worth the hacker's time.

If the hacker were to gain access to the database itself, then he/she could perform an offline brute-force attack, but that would still take a significant amount of time. That said, MD5 hash collisions are simply more frequent than SHA-1, and are susceptible to some techniques (ala rainbow tables), so if any one-way hash were used, I would simply suggest the one that would take longer to crack.

At a certain point, you're simply trying to make the hacker decide that the time it takes to crack the password is not worth the prize.

GVN - all that said, if someone were to get the database, chances are that you would be screwed beyond whatever security a one-way hash provides. Still, it's good practice to have one-way passwords (and encrypt any other sensitive data) to limit the potential abuse by a corrupt employee (current or future). That's primarily why you want to secure the data inside the database - it's not as much to keep people from hacking your login form.

There are far better/easier ways to get into a system than by brute-forcing a password or a hash that someone probably doesn't have access to anyway. Your biggest weakness will always be insecure coding practices. You will have an employee that doesn't remember to sanitize a variable before using it in a query, and a hacker will be able to use that to his/her advantage.

In other words, a hacker doesn't need to crack a password if he can run queries on your database. Here's an example of how I hacked into an administrator account about 5 years ago (this was a legitimate scenario where an account was locked and the administrator had left the company without giving the passwords and was unreachable):

I was faced with a similar situation where the administrator password was encrypted, but I could run a query against the database using a hole in a piece of code that he had written (a basic SQL injection attack). Instead of waiting 2+ weeks for a server to brute-force the password, I signed up for an account myself. The system generated a password and sent it to me. All I had to do was find out MY password hash, then run an update query to set the admin's account to use the same hash. Presto, my newly-generated password was now also the admin's password. I never did find out what his password was, nor did I care, because now I could get in and take care of business. No brute-forcing required - it was all simply due to a hole in the code, and that will always be the part that you should pay most attention to.

Being able to run a query on a database is way more dangerous than almost any other security flaw. It's more dangerous than actually having read-only access to the entire database (e.g. finding a database dump).

Through code vulnerabilities both in PHP and also in Javascript (cross-site scripting), you can easily let a hacker gain access to your entire system. I wrote two articles on these topics:

http://www.experts-exchange.com/Programming/Project_Management/Security/A_1263-5-Steps-to-Securing-Your-Web-Application.html

http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/PHP_Databases/A_686-PHP-Prevent-SQL-Injection.html

Securing your entire application can sometimes be a hassle, but if you don't do it, a break-in is just a matter of time.
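
As a rough illustration of the "sanitize a variable before using it in a query" point, here is a sketch using a PDO prepared statement. Everything specific in it is an assumption made for the example: the users table, the find_user() helper, and the in-memory SQLite database (which needs the pdo_sqlite extension); the same prepare/execute pattern applies to a MySQL connection.

```php
<?php
// Hypothetical schema and helper, using an in-memory SQLite database so the
// example is self-contained.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, password_hash TEXT)');
$pdo->exec("INSERT INTO users (username, password_hash) VALUES ('admin', 'not-a-real-hash')");

// The username is passed as a bound parameter, never concatenated into the
// SQL string, so quotes in the input cannot change the query.
function find_user(PDO $pdo, string $username)
{
    $stmt = $pdo->prepare('SELECT id, username FROM users WHERE username = ?');
    $stmt->execute([$username]);
    return $stmt->fetch(PDO::FETCH_ASSOC); // row as array, or false if none
}

$malicious = "' OR '1'='1";
var_dump(find_user($pdo, $malicious)); // bool(false): treated as a literal name
var_dump(find_user($pdo, 'admin'));    // the admin row
```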
 
LVL 34

Expert Comment

by:gr8gonzo
ID: 34192146
@GVN - the speed difference between MD5 and SHA1 doesn't really matter. The brute-force bottleneck will be the network. You just can't try thousands and thousands of combinations every second over a web server connection. Again, the internal one-way hashing is more about protecting your data from internal threats than anything, IMHO.
 

Author Comment

by:GVNPublic123
ID: 34192155
so what should I do to encrypt passwords? What do you suggest? MD5, SHA1 or SHA256 are what I've been looking over...

Also, hash vs hash_hmac? Or it doesnt matter @ all and I can do simple salted md5 and be happy? Its certainly better than poorly encrypted as I have now.
 
LVL 34

Expert Comment

by:gr8gonzo
ID: 34192217
If you're protecting any sort of financial data, be paranoid about everything. Use the strong security option when you can - a form of SHA, and use hash_hmac if it's available to you. Also make sure you encrypt anything sensitive like credit card numbers. Always pretend or assume that you WILL have an employee that will try to use your data to his/her advantage, so encrypt whatever pieces of information they could abuse. If you need to be able to decrypt the data, then use an SSL certificate to encrypt the data before storing it (you don't need to buy an expensive certificate for internal use - you can generate and use your own free certificate, referred to as a "self-signed certificate").

You won't be able to completely eliminate all potential avenues of attack from an internal source, but the point is to slow them down as much as possible.
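
On the hash vs hash_hmac part, a minimal sketch of what the hash_hmac variant might look like. The key value, the function name, and where the key lives are assumptions for the example; the idea is that the key stays in server configuration rather than in the database.

```php
<?php
// Hypothetical server-side key, kept outside the database (for example in a
// config file). The value here is a placeholder, not a recommendation.
$serverKey = 'replace-with-a-long-random-secret';

// Keyed hash of salt + password. Without the key, a copy of the database
// alone is not enough to recompute or test candidate hashes.
function hmac_password(string $password, string $salt, string $key): string
{
    return hash_hmac('sha256', $salt . $password, $key);
}

$hash = hmac_password('Myp4ssw*', 'per-user-salt', $serverKey);
var_dump(strlen($hash)); // int(64)
var_dump($hash === hmac_password('Myp4ssw*', 'per-user-salt', 'a-different-key')); // bool(false)
```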
 

Author Comment

by:GVNPublic123
ID: 34192237
no, I only have basic members website and importance of user accounts is not very high as hacking their accounts cannot make much damage.
 
LVL 34

Expert Comment

by:gr8gonzo
ID: 34192301
Then md5 should probably be fine for your purposes. Keep in mind that most hackers are after the administrator account, not the other people's accounts, so just think about what a successful break-in to the administrator could mean in terms of any ripple effects. Would they be able to get into other systems or have any special privileges that could lead to a different, more damaging break-in? You don't have to answer these questions here - it's just for your own contemplation.
 

Author Comment

by:GVNPublic123
ID: 34192305
I made up my mind, I will go with salted md5 cuz its only 32 chars hash and wont take much space in database and besides its just members site, noone gives a damn...
 

Author Comment

by:GVNPublic123
ID: 34192312
besides, I have registration/login queries now using real_escape_string and stuff so...sql injection isn't likely I guess.
 
LVL 5

Expert Comment

by:innotionent
ID: 34196898
To be secure most government entities use a password of greater than 8 characters.
Those characters must include at least 1 upper case character, one number and one non letter or non numerical character.
For example: Myp4ssw*

For extra security, use a salt with your md5, so even if they guess the password they still have to guess the salt.
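
The password rules above could be checked with something like the sketch below. The function name is made up, and the length check is an assumption: it uses "at least 8 characters" so that the example password Myp4ssw* (which is exactly 8 characters) passes; change it to > 8 if the rule is meant strictly.

```php
<?php
// Hypothetical policy check. Assumes "at least 8 characters" so that the
// example password from the comment above is accepted.
function password_meets_policy(string $password): bool
{
    return strlen($password) >= 8
        && preg_match('/[A-Z]/', $password) === 1          // one upper case letter
        && preg_match('/[0-9]/', $password) === 1          // one number
        && preg_match('/[^a-zA-Z0-9]/', $password) === 1;  // one other character
}

var_dump(password_meets_policy('Myp4ssw*')); // bool(true)
var_dump(password_meets_policy('password')); // bool(false)
```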
