Mike asked:

Need Help Creating Regex Expressions

Greetings Experts,

I am new to writing regular expressions and need help creating some search patterns for keywords (e.g., HR, SSN, Social Security Numbers). Can somebody help me with this task?
Bill Prew

Take a look at this; it talks a bit about regex in general and then has some examples you might be able to use or build on. Come back with specific questions about the specific data patterns you want to find.



»bp
ASKER CERTIFIED SOLUTION
Giovanni
This solution is only available to members.

Also be aware that there are several families of regex: the plain old kind, a later and more generalized kind, and then Perl-style regex. Each has a slightly larger set of options than the one before it.
Hi Mike,
In addition to the "several families" of regex that noci mentioned, there are also differences depending on the programming language. For example, I do a lot of coding in AutoHotkey, which has its own flavor of RegEx documented in its Quick Reference, as well as in the documentation for its RegExMatch and RegExReplace functions. You may also find this AutoHotkey RegEx tutorial helpful. So, I encourage you to look for documentation and tutorials for the language in which you're planning to program.

Regards, Joe
I often use the following web-based tool to test and explore regular expressions. It lets you quickly try different things with sample data and breaks the regex apart so you can see what it's doing. It also has some reference info on the various tokens available in most regex implementations.



»bp
Mike,

Please post some representative text and tell us what you need to capture/parse.
Mike

ASKER

Thanks for the responses, everybody. What I am looking to detect is something like the following, using some kind of literal text (as referenced in RegexBuddy):

Financial|financial|Vender|vender|SSN|ssn|Username|username|Password|password|Credit|credit

I was hoping to find something simpler that can match both uppercase and lowercase characters.
Depending on the regex engine you're using, you can usually make the comparisons case-insensitive. By default, comparisons in the PowerShell regex operations are case-insensitive.
So are you just trying to find the text "SSN", for example, in files, or are you actually looking for social security numbers stored in files, that is, for things that match a format like xxx-xx-xxxx?
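
If it turns out to be the second case, here is a minimal sketch in Python of matching the xxx-xx-xxxx shape. The function name and sample text are made up for illustration, and the pattern only checks the layout of the digits, not whether the number is a valid SSN.

```python
import re

# Assumed layout: three digits, two digits, four digits, joined by hyphens.
# The \b anchors keep the pattern from matching inside a longer run of digits.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def find_ssn_like(text: str) -> list[str]:
    """Return every substring shaped like xxx-xx-xxxx (hypothetical helper)."""
    return SSN_PATTERN.findall(text)

sample = "Employee 123-45-6789 called from 555-867-5309."
print(find_ssn_like(sample))  # ['123-45-6789']
```

The phone number is skipped because its middle group has three digits rather than two.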


»bp
Mike

ASKER

Yes, those are examples of the keywords I am looking for, along with others like Salary, Salaries, Net-worth, and Networth. I am just using keywords to improve my chances of finding PII on my company's network.
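
For keyword variants like these, one pattern can usually cover several spellings. A rough sketch in Python, assuming a case-insensitive search is acceptable; the pattern and the helper name are illustrative, not a complete PII keyword list.

```python
import re

# salar(?:y|ies) covers Salary/Salaries; net-?worth covers Net-worth/Networth.
# re.IGNORECASE makes upper- and lowercase variants match alike.
KEYWORDS = re.compile(r"salar(?:y|ies)|net-?worth", re.IGNORECASE)

def keyword_hits(text: str) -> list[str]:
    """Return each keyword occurrence as it appears in the text (hypothetical helper)."""
    return [m.group(0) for m in KEYWORDS.finditer(text)]

print(keyword_hits("Salaries and Net-worth; salary, NETWORTH"))
# ['Salaries', 'Net-worth', 'salary', 'NETWORTH']
```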
Okay, then as mentioned, just use something like your pattern and do the search in case-insensitive mode.

financial|vender|ssn|username|password|credit

Keep in mind that, as you have it, you will also find those literals embedded in other words, so for example "credits" would be a match.
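
If whole-word matches are what you want, most engines support a word-boundary token (`\b`). A small sketch in Python comparing the two behaviors; the keyword list and sample sentence are assumptions for illustration.

```python
import re

# Illustrative keyword list; adjust it to the terms you actually care about.
KEYWORDS = ["financial", "vendor", "ssn", "username", "password", "credit"]
alternation = "|".join(re.escape(k) for k in KEYWORDS)

# Case-insensitive, matches the literal anywhere, even inside a longer word.
loose = re.compile(alternation, re.IGNORECASE)
# Same keywords, but \b on both sides requires a whole-word match.
strict = re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)

text = "Credits were issued; the Password and SSN fields are empty."
print(loose.findall(text))   # ['Credit', 'Password', 'SSN']
print(strict.findall(text))  # ['Password', 'SSN']
```

The strict version trades recall for precision: it no longer flags "Credits", so add plural forms explicitly if you need them.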

Where will you be using this regex pattern, in what language or toolset?


»bp
And if there is no case-insensitive mode available (classic regex, for example), you can use:

[Ff]inancial|[Ss][Ss][Nn]|....

etc.
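
Writing those brackets by hand gets tedious, so the pattern can be generated. A minimal sketch in Python, assuming plain ASCII keywords; unlike the example above, this hypothetical helper brackets every letter, so it also matches mixed case such as "cReDiT".

```python
import re

def bracket_case(word: str) -> str:
    """Turn 'ssn' into '[Ss][Ss][Nn]'; non-letters are escaped and kept as-is."""
    parts = []
    for ch in word:
        if ch.isalpha():
            parts.append(f"[{ch.upper()}{ch.lower()}]")
        else:
            parts.append(re.escape(ch))
    return "".join(parts)

print(bracket_case("ssn"))  # [Ss][Ss][Nn]

# Join the expanded keywords into one alternation; no case-insensitive flag needed.
pattern = "|".join(bracket_case(w) for w in ["ssn", "credit"])
print(re.findall(pattern, "SSN, Credit, cReDiT"))  # ['SSN', 'Credit', 'cReDiT']
```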
Many anti-virus programs have this feature.  It may save you some coding.
I'd recommend the free tool Expresso for RegEx development. I use it regularly.
Mike

ASKER

The software worked. Thanks for the help.