A regular expression is a pattern language that we use to edit strings or to retrieve substrings that meet specific rules from a text. A regular expression can be applied to a set of string variables.
Many RegEx engines are available, and they differ in syntax and in how they compile patterns. Perl 5 is the most popular syntax, and it runs on an NFA engine. There are three main types of engines: traditional NFA, POSIX NFA and DFA. Please see the references section at the end of the article for detailed information.
Regular expressions are hard to explain in words and look frightening. But if you have the patience and courage to jump in, this is one of the most useful and fun languages you may ever learn. So, here are the most commonly used special characters, with examples.
Special Characters Used in Regular Expressions
Matches the pattern between the parentheses, or logically groups patterns or characters together.
exchange in experts
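As a minimal sketch of grouping, the snippet below uses Python's re module, which follows a Perl-like syntax. Python is an assumption made for illustration, and the sample strings are hypothetical; other engines may behave differently.

```python
import re

# Parentheses group "ab" so that "+" repeats the whole group, not just "b".
grouped = re.fullmatch(r"(ab)+", "ababab")

# Parentheses also capture what they matched, so it can be retrieved later.
captured = re.search(r"(\w+)@(\w+)", "user@example")
user, host = captured.group(1), captured.group(2)

print(grouped is not None, user, host)
```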
The "dot" matches a single character. Note that it does not match line breaks unless the engine is operating in single line mode.
Match: experts, experts
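The dot and the single line mode can be tried out with a short sketch in Python (an assumption; the article does not name a language). In Python's re module the single line mode is requested with the re.DOTALL flag, and the sample text here is made up.

```python
import re

text = "expert\nexperts"

# By default "." matches any single character except a line break.
default_hits = re.findall(r"t.e", text)

# With re.DOTALL (Python's name for single line mode) "." also matches "\n".
dotall_hits = re.findall(r"t.e", text, re.DOTALL)

# "." stands for exactly one character, whatever it is.
single = re.findall(r"exp.rt", "expert expart")

print(default_hits, dotall_hits, single)
```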
This matches zero or more occurrences of the character before it. For example:
This matches the character before it zero or one time, so the result is returned with or without that character.
Match: expert, expert
Match: exper, exper
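The difference between the "zero or more" and the "with or without" quantifiers can be sketched as follows, assuming the usual Perl-style symbols "*" and "?" and using Python's re module for illustration. The word list is hypothetical.

```python
import re

words = "exper expert expertt experts"

# "t*" accepts zero or more "t" characters after "exper".
star = re.findall(r"\bexpert*\b", words)

# "t?" accepts at most one "t" after "exper".
optional = re.findall(r"\bexpert?\b", words)

print(star)
print(optional)
```

The "\b" anchors keep the match to whole words, which is why "experts" is rejected by both patterns.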
This character defines a range of characters in a character class. It also specifies a hyphen if placed immediately after the opening "[". If you want either "-" or "]" itself to be a member of a class, put it at the start of the list (possibly after a "^"), or escape it with a backslash. "-" is also taken literally when it is at the end of the list, just before the closing "]". The following all specify the same class of three characters: [-az] , [az-] , and [a\-z] . All are different from [a-z] , which specifies a class containing twenty-six characters, even on EBCDIC-based character sets. Also, if you try to use the character classes \w , \W , \s, \S , \d , or \D as endpoints of a range, the "-" is understood literally.
Match: All numbers between 1 and 9
Match: All lowercase letters from a to z
Match: All uppercase characters from A to Z
This defines a character class, which matches any one of the characters enclosed.
The "backslash" is the escape character: it removes the special meaning of the special character that follows it.
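The ranges and the literal hyphen placements described above can be checked with a small sketch. It assumes Python's re module, and the sample string is invented for the demonstration.

```python
import re

sample = "a-z B7 q0"

digits = re.findall(r"[1-9]", sample)   # range of digits from 1 to 9
lower = re.findall(r"[a-z]", sample)    # range of lowercase letters
upper = re.findall(r"[A-Z]", sample)    # range of uppercase letters

# Three spellings of the same class of three characters: "a", "z" and "-".
hyphen_sets = [re.findall(p, sample) for p in (r"[-az]", r"[az-]", r"[a\-z]")]

print(digits, lower, upper)
print(hyphen_sets)
```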
Match: ^ or (
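A short sketch of escaping, assuming Python's re module and a made-up input string: without the backslash, "^" would be an anchor and "(" would open a group.

```python
import re

text = "2^3 = (8)"

# "\^" and "\(" match the literal characters "^" and "(".
literals = re.findall(r"\^|\(", text)

print(literals)
```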
"\d", "\w" and "\s" characters
These match digits, word characters (letters, digits, underscores) and whitespace (tabs, spaces, line breaks), respectively.
"\D", "\W" and "\S" characters
The negated versions of the above.
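The shorthand classes and one of their negated versions can be compared side by side in this sketch. It assumes Python's re module, and the sample text is hypothetical.

```python
import re

text = "Room 42, floor_3"

digits = re.findall(r"\d+", text)       # runs of digits
words = re.findall(r"\w+", text)        # runs of letters, digits, underscores
spaces = re.findall(r"\s", text)        # single whitespace characters
non_digits = re.findall(r"\D+", text)   # runs of anything that is not a digit

print(digits, words, spaces, non_digits)
```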
This matches a backspace character when used inside a character class.
Matches at the position between a word character (anything matched by \w) and a non-word character (anything matched by [^\w] or \W) as well as at the start and/or end of the string if the first and/or last characters in the string are word characters.
s in expert
Matches at the position between two word characters (the position between \w\w) as well as at the position between two non-word characters (\W\W).
x in e
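The word boundary, its negation, and the backspace meaning inside a character class can be sketched as below. This assumes the usual "\b" and "\B" notation and Python's re module; the sample strings are invented.

```python
import re

text = "expert experts"

# "\b" on both sides: only the standalone word "expert" qualifies.
whole = re.findall(r"\bexpert\b", text)

# "\B" on both sides: "per" must sit strictly inside a word.
inside = re.findall(r"\Bper\B", text)

# Inside a character class, "\b" means the backspace character instead.
backspace = re.search(r"[\b]", "a\bc")

print(whole, inside, backspace is not None)
```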
The OR character matches either the expression on its left side or the expression on its right side.
Match: x or rts in e
Match: x or r in e
It can also be used to combine expression patterns:
Match: Numbers from 1 to 29
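Alternation can be sketched in Python as follows. The pattern for the numbers from 1 to 29 is an assumption written for this illustration, not necessarily the expression the article originally showed.

```python
import re

# Simple alternation: either "x" or "rts".
simple = re.findall(r"x|rts", "experts")

# Hypothetical pattern combining two sub-patterns:
# 10 to 29 ("1" or "2" followed by any digit) OR 1 to 9 (a single digit).
one_to_29 = re.compile(r"[12][0-9]|[1-9]")
accepted = [n for n in range(0, 40) if one_to_29.fullmatch(str(n))]

print(simple)
print(accepted)
```

Using fullmatch keeps the check strict, so "30" is not accepted just because it starts with a valid digit.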
Matches the characters between "\Q" and "\E", suppressing the meaning of special characters.
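Not every engine supports "\Q" and "\E". Python's re module, used here for illustration, does not; its closest equivalent is re.escape, which backslash-escapes the special characters in a string. The sketch below shows that substitute technique, with a made-up literal.

```python
import re

literal = "1+1=2 (really?)"

# re.escape plays the role of wrapping the text in \Q...\E:
# "+", "(", ")" and "?" lose their special meaning.
pattern = re.escape(literal)
found = re.search(pattern, "Is 1+1=2 (really?) true")

print(found.group(0))
```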
Now let's put these into use to understand it better:
Building a Date Expression
Let's build a date expression that will catch the dates in a text in the format of
Before using RegEx, I recommend deciding which engine to use and reading the resources about that engine, because engines differ in how they interpret the special characters. Please review the references below if you want to dive deeper into regular expressions.