Regular Expressions

A regular expression ("regex") is a sequence of characters that defines a search pattern, mainly for use in pattern matching with strings, or string matching, i.e. "find and replace"-like operations. Regular expression processors are found in several search engines, search and replace dialogs of several word processors and text editors, and in the command lines of text processing utilities, such as sed and AWK. Many programming languages provide regular expression capabilities, some built-in, for example Perl, JavaScript, Ruby, AWK, and Tcl, and others via a standard library, for example .NET languages, Java, Python and C++ (since C++11). Most other languages offer regular expressions via a library.

I need a simple regular expression that will find a specific list of numeric values. For example:

LineByID
------------
3000
1000
3001
9991
8888

I need a regular expression that would only find values "3000" and "1000".
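
One possible approach, sketched in Python (the question does not name a language, so that choice is an assumption): an anchored alternation that accepts only the listed values. The names WANTED and line_by_id are made up for illustration.

```python
import re

# Anchored alternation: the whole value must be exactly 3000 or 1000.
WANTED = re.compile(r"^(?:3000|1000)$")

# Sample values copied from the question.
line_by_id = ["3000", "1000", "3001", "9991", "8888"]

matches = [value for value in line_by_id if WANTED.match(value)]
print(matches)
```

The ^ and $ anchors are what stop a longer value such as 30001 from matching.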
0
Form field validation for a date requirement to be: Current date on or after 10/01/2017 - cold fusion form

A new requirement for a form. I need to detect the date a form is being submitted to be the current date on or after 10/01/2017 in a cold fusion form.

The date won't be placed into the form field/s but I need to detect the date. This will be implemented on 10/01/2017 fyi. I'll need to set the code up to test say for today's date then once working set it up to detect 10/01/2017 for when it goes live.

Thanks for any help. I'll start researching the topic as I've never tried to work this type of code before. Current code for one of the fields is:

<cfset errorMsg = "">
<cfif NOT refind("^[DdFfGgHhNnVv][A-Za-z0-9]{1,21}$", form.Contract_Number) AND NOT refind("^70[Zz]0[a-zA-Z0-9]{2}\d{2}[a-zA-Z][a-zA-Z0-9]{8}$", form.Contract_Number)>
    <cfset errorMsg = "The Contract Number is required and can contain no more than 22 alphanumeric characters.,">
</cfif>
<cfset session.invoiceDataContr.errorString = errorMsg>

0
The Task: To try and recognise people's names (mainly last names) in any Office files (Outlook, Word or Excel). Basically I can reduce all these items to just a text string. The reason I would like to do this is that I want to add an item to the Office context menu when a name is right-clicked.

My research pretty much looks like this with respect to Regex;

StackOverflow Forum Question
StackOverflow question
StackOverflow question

I have also found this library and have set up an example with one of the regular expressions here.

Here is a link to my regex101 code with text.. You can see that I need to fix this for German Umlauts.

Reading up on human name recognition and Regex there are some people that do not think it is a great idea. I have therefore thought of the following set of steps which still involves Regex but should not miss any names.

1. Create a simple Regex that collects all words which start with a capital, are longer than 2 characters and only have alpha characters
2. …
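
Step 1 above could be sketched roughly as follows in Python (an assumption; the poster's own code is not shown). The character classes list the German umlauts and ß explicitly, and the sample sentence is invented for illustration. This only collects candidate words; it does not decide whether they are names.

```python
import re

# A capital letter followed by at least two more letters (so 3+ characters),
# with German umlauts and the sharp s included by hand.
CANDIDATE = re.compile(r"\b[A-ZÄÖÜ][A-Za-zÄÖÜäöüß]{2,}\b")

# Invented sample text, not taken from the question.
sample = "Herr Müller met Dr Al and Jürgen Groß in Ulm"

candidates = CANDIDATE.findall(sample)
print(candidates)
```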
0
I am looking for a regex that will catch things like;

gJ sKR Bow HRsT HRT BO KeT

In other words all 2, 3 or 4 letter characters where the word contains at least one capital.  Thus normal words such as

this, cat, ball etc

would not be found due to the capital rule.

I do not know regex, but after doing a little bit of reading I think the pattern below matches 2 to 4 alpha characters. However, it allows them to be all lower case.
\b[A-Za-z]{2,4}\b

How can I change the above so that at least (meaning it could be more than one) one of the characters should be a capital.

Any regex will get me started but I am working in a .net environment.
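
A possible starting point is to add a lookahead that requires a capital somewhere inside the word. The sketch below is in Python rather than .NET (an assumption made to keep the example self-contained); it uses only a lookahead and word boundaries, which .NET's regex engine also supports.

```python
import re

# (?=[a-z]*[A-Z]) looks ahead from the start of the word and demands an
# uppercase letter after zero or more lowercase letters, so an all-lowercase
# word fails before the 2-4 letter body is even tried.
SHORT_WITH_CAPITAL = re.compile(r"\b(?=[a-z]*[A-Z])[A-Za-z]{2,4}\b")

# Sample text assembled from the examples in the question.
sample = "gJ sKR Bow HRsT HRT BO KeT this, cat, ball etc"

found = SHORT_WITH_CAPITAL.findall(sample)
print(found)
```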
0
I have a string of indeterminate length from which I wish to remove the Computer section using a regular expression.

1afrsComputer
3frs878Computer

Can anyone help please?
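
A minimal sketch in Python (the language is an assumption, and strip_computer is a made-up helper name), assuming "Computer" always sits at the very end of the string:

```python
import re

def strip_computer(value: str) -> str:
    """Remove a trailing 'Computer' and leave everything before it."""
    return re.sub(r"Computer$", "", value)

print(strip_computer("1afrsComputer"))
print(strip_computer("3frs878Computer"))
```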
0
Hello,

I have a case where I have lots of HTML pages with <a name="[x]"> tags without any closing </a> tags.

Is there a regex that will find those <a tags so I can remove them?

Thanks in advance,

Steve
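
Something along these lines might work, sketched in Python (an assumption; the sample HTML is invented). It only targets tags of the exact form <a name="...">, so anchors carrying other attributes such as href are left alone. Regexes on HTML tend to be fragile, so it is worth testing on a copy of the pages first.

```python
import re

# An opening <a> tag whose only attribute is name="...".
OPEN_ANCHOR = re.compile(r'<a\s+name="[^"]*"\s*>', re.IGNORECASE)

# Invented sample markup.
html = '<p><a name="top">Intro</p><a name="s2">More <a href="x.htm">link</a>'

cleaned = OPEN_ANCHOR.sub("", html)
print(cleaned)
```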
0
Hello,

1. Is it possible to write a regex that searches for a group of tags and finds the text regardless of whether or not a line break is there or not?

For example:

Can one regular expression find these two text samples:

<center>
<p><table bgColor="#e2dcc5" border="1" cellPadding="5" cellSpacing="0" width="475">
<tbody>
<tr>
<td>

and

<center><p><table bgColor="#e2dcc5" border="1" cellPadding="5" cellSpacing="0" width="475"><tbody>
<tr><td>

Thanks.
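
One way this could be done is to put \s* between the tags, since \s matches spaces, tabs and line breaks alike. A sketch in Python (an assumption; the two samples are copied from the question):

```python
import re

# \s* between tags tolerates any mix of line breaks and spaces, or none.
TAG_RUN = re.compile(
    r"<center>\s*<p>\s*<table\b[^>]*>\s*<tbody>\s*<tr>\s*<td>",
    re.IGNORECASE,
)

multi_line = (
    '<center>\n<p><table bgColor="#e2dcc5" border="1" cellPadding="5" '
    'cellSpacing="0" width="475">\n<tbody>\n<tr>\n<td>'
)
one_line = (
    '<center><p><table bgColor="#e2dcc5" border="1" cellPadding="5" '
    'cellSpacing="0" width="475"><tbody>\n<tr><td>'
)

print(bool(TAG_RUN.search(multi_line)), bool(TAG_RUN.search(one_line)))
```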
0
Hi Team,

I have a text file containing pipe-separated string values. The issue is that some strings contain quoted text within the string, as shown below. I am looking for a regular expression or some other solution to remove those inner quotes, as shown in the desired text below.

Existing: "This string is ok."|"This is an example with a "C" double quoted grade in middle."|"Next line"
Desired: "This string is ok."|"This is an example with a C double quoted grade in middle."|"Next line"

Looking forward to hearing from you.
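
A possible sketch in Python (the language is an assumption): treat a quote as "inner" when it has a non-pipe character on both sides. This assumes that field-delimiting quotes always touch a pipe or the start or end of the line, and that no field legitimately contains a pipe.

```python
import re

# A double quote with a non-pipe character immediately before and after it.
# Quotes at the start or end of the line have no neighbour on one side,
# so the lookarounds fail there and those quotes are kept.
INNER_QUOTE = re.compile(r'(?<=[^|])"(?=[^|])')

existing = '"This string is ok."|"This is an example with a "C" double quoted grade in middle."|"Next line"'

desired = INNER_QUOTE.sub("", existing)
print(desired)
```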
0
I need to extract email body from a MIME file using java regex. However, I would not be able to use any java library. I need a regex itself. If anyone knows, please let me know.
Thanks.
Shams
0
Hello,

I've got some HTML that looks like:

            <center>
                                                                        <table bgcolor="#E2DCC5" border="1" cellpadding="5" cellspacing="0" width="450">
                                                                              <tr>
                                                                                    <td>
                                                                                          <img align="left" alt="Tip" border="0" height="16" src="/global/images/icons/tip.gif" width="41">You might find it helpful to practice your presentation and get feedback on how well you did. The tool <img align="absbottom" alt="Tool" src="/global/Images/icons/tool.gif"><a href="/tools/improving_your_presentations.asp">Improving Your Presentations</a> provides some additional tips on making presentations and contains an observer feedback form to get targeted feedback on your presentation.
                                                                                    </td>
                                                                              </tr>
                                                                        </table>
                                                                  </center>

I'd like it to be changed to:
<div class="box-highlight box-content-round"><p><a class="float-left button-blue-dark button-round"><i class="fa fa-lightbulb-o fa-2x"></i> Tip</a>
You might find it helpful to practice your presentation and get feedback on how well you did. The tool <img align="absbottom" alt="Tool" src="/global/Images/icons/tool.gif"><a href="/tools/improving_your_presentations.asp">Improving Your Presentations</a> provides some additional tips on making presentations and contains an observer feedback form to get targeted feedback on your presentation.
</div>

Essentially it would look for the "tip.gif" and remove the table tags and add the <divs>

Is this doable?
0
Hello,

I need a regular expression to find an HTML tag that contains a piece of text.

For example:

If I want to find any HTML tag that contains the text "tool", like:

 <img src="../global/Images/icons/tool.gif" alt="Tool" align="absbottom">

It would find the whole tag including its parameters.
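
A rough sketch in Python (an assumption; the surrounding sentence in the sample is invented): match from < to > without crossing another angle bracket, and require the text somewhere in between.

```python
import re

# [^<>]* keeps the match inside a single tag; IGNORECASE also catches "Tool".
TAG_WITH_TEXT = re.compile(r"<[^<>]*tool[^<>]*>", re.IGNORECASE)

html = (
    'The tool <img src="../global/Images/icons/tool.gif" alt="Tool" '
    'align="absbottom"> helps <b>you</b>.'
)

tags = TAG_WITH_TEXT.findall(html)
print(tags)
```

The word "tool" in the running text is ignored because it is not between angle brackets.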
0
I have a string AAABBCCCDEEEFF. I want to find out if any character in this string repeats only 2 times. For example, AA or BB or CC etc. Also, how can I replace that char sequence in one regex? What is the regex to identify this in Java?
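
One way to approach this, sketched in Python rather than Java (an assumption made for a compact example): match every run of a repeated character with a backreference, then keep only the runs whose length is exactly two. The helper names are made up.

```python
import re

# (.)\1* matches one character followed by any further copies of it,
# i.e. a complete run such as AAA, BB or D.
RUN = re.compile(r"(.)\1*")

def runs_of_exactly_two(text):
    """Return the runs that are exactly two characters long."""
    return [m.group() for m in RUN.finditer(text) if len(m.group()) == 2]

def replace_runs_of_two(text, replacement):
    """Replace only the two-character runs; leave other runs untouched."""
    return RUN.sub(
        lambda m: replacement if len(m.group()) == 2 else m.group(), text
    )

print(runs_of_exactly_two("AAABBCCCDEEEFF"))
print(replace_runs_of_two("AAABBCCCDEEEFF", "-"))
```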
0
I have a bulk file that has multiple HL7 messages contained in the file.  Each message block starts with MSH segment.  However the batch file has index id numbers prepended to the string.  I need to remove the 10 characters before the 'MSH' and replace with blank.

I am using Notepad++ as the editor and I think this can be done possibly with regex but I don't know how to accomplish the task.  I have no regex experience, nor programming/scripting.

Thanks in advance for the assistance.

Current data example:
0000000764MSH|^~\&|RELAYHEALTH|RWJSL|RELAYHEALTH|RWJSL|20170622085225||ADT^A14|WIEH52WIJQQ7Q6JQVA7E|P|2.5
EVN|A14|20170622085225
...
0000000924MSH|^~\&|RELAYHEALTH|RWJSL|RELAYHEALTH|RWJSL|20170622091618||ADT^A14|TQSJB2CG9GTZH8SY8HWB|P|2.5
EVN|A14|20170622091618
...
0000000742MSH|^~\&|RELAYHEALTH|RWJSL|RELAYHEALTH|RWJSL|20170622091619||ADT^A14|EF0S0NGHIZLQGAYAMPT4|P|2.5
EVN|A14|20170622091619

Desired output would be:
MSH|^~\&|RELAYHEALTH|RWJSL|RELAYHEALTH|RWJSL|20170622085225||ADT^A14|WIEH52WIJQQ7Q6JQVA7E|P|2.5
EVN|A14|20170622085225
...
MSH|^~\&|RELAYHEALTH|RWJSL|RELAYHEALTH|RWJSL|20170622091618||ADT^A14|TQSJB2CG9GTZH8SY8HWB|P|2.5
EVN|A14|20170622091618
...
MSH|^~\&|RELAYHEALTH|RWJSL|RELAYHEALTH|RWJSL|20170622091619||ADT^A14|EF0S0NGHIZLQGAYAMPT4|P|2.5
EVN|A14|20170622091619
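
A pattern along the lines of ^.{10}(?=MSH\|) might do this: ten characters at the start of a line, but only when MSH| follows. The sketch below tries it in Python (an assumption, with shortened sample lines); the same pattern should be usable in the Notepad++ Replace dialog with the Regular expression search mode and an empty replacement, though that is untested here.

```python
import re

# Ten characters at the start of a line, only if "MSH|" comes right after.
# The lookahead leaves "MSH|" itself in place.
PREFIX = re.compile(r"^.{10}(?=MSH\|)", re.MULTILINE)

# Shortened sample lines based on the data in the question.
batch = "\n".join([
    r"0000000764MSH|^~\&|RELAYHEALTH|RWJSL",
    r"EVN|A14|20170622085225",
    r"0000000924MSH|^~\&|RELAYHEALTH|RWJSL",
    r"EVN|A14|20170622091618",
])

cleaned = PREFIX.sub("", batch)
print(cleaned)
```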
0
I want to keep all the lines that start with a + or a - but delete all other
lines in a file. I see a ready recipe for ^[+-]. that I could use to delete those
lines. But I want to keep them and delete all the others. What's the mojo?
Thank you.
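
One option is to invert the test with a negative lookahead, so the pattern matches the lines that do not start with + or -. A sketch in Python (the tool is not named in the question, so the language and the sample text are assumptions):

```python
import re

# ^(?![+-]) succeeds only at the start of a line that does NOT begin with
# + or -; .*\n? then swallows the rest of that line and its line break.
UNWANTED_LINE = re.compile(r"^(?![+-]).*\n?", re.MULTILINE)

# Invented sample text.
text = "+added line\ncontext line\n-removed line\n indented line\n"

kept = UNWANTED_LINE.sub("", text)
print(kept)
```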
0
Hi Experts, how can I make a regex pass (ignore) the line below? I'm trying to achieve this in the Jenkins log parser plugin.

# Should pass
chmod: cannot access './config/temp_bkp_24072016/resources': Permission denied

# Should fail
chmod: cannot access './config/resources_bkp_24072016/resources': Permission denied

# Regex rule with which I'm trying to handle the above two scenarios
warning /^(?!.*(temp_bkp))(.*Permission denied.*)$/


Thanks in advance
0
I have a situation where I need to be able to always highlight 3rd section or column of a string. Can anyone suggest the appropriate syntax?

For example in the below strings bird would be highlighted.

dog asds-sds-assa bird y

dinosaur nkj-as bird
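
Assuming the columns are separated by whitespace, a capture group after two skipped columns is one possible approach. Sketched in Python (an assumption; the tool doing the highlighting is not named):

```python
import re

# Skip two whitespace-separated columns, then capture the third.
THIRD_COLUMN = re.compile(r"^\S+\s+\S+\s+(\S+)")

def third_column(line):
    match = THIRD_COLUMN.search(line)
    return match.group(1) if match else None

print(third_column("dog asds-sds-assa bird y"))
print(third_column("dinosaur nkj-as bird"))
```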
0
I have a regex function like this

^[0-9]\\d?-\\d{7}+;+([0-9]\\d?-\\d{7})*$

But I have a long text field where I enter multiple tax id numbers with semicolon. How to modify or allow this feature. Sometimes I only enter one tax id. How to handle this
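
A possible rewrite is one ID followed by zero or more ";ID" repeats, with an optional trailing semicolon. The sketch is in Python (an assumption) and takes the ID format from the original pattern as one or two digits, a hyphen, then seven digits; whether a trailing semicolon should be accepted is also an assumption.

```python
import re

# One tax ID, then any number of ";ID" repeats, then an optional final ";".
TAX_IDS = re.compile(r"^[0-9]{1,2}-[0-9]{7}(?:;[0-9]{1,2}-[0-9]{7})*;?$")

def is_valid(value):
    return TAX_IDS.match(value) is not None

print(is_valid("12-3456789"))
print(is_valid("12-3456789;98-7654321"))
print(is_valid("12-3456789;;98-7654321"))
```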
0
is it possible to set up a regex to validate each position in a string of characters in a form field?

I have a requirement for a contract number field in a Cold Fusion form. Requirements are:

•      Positions 1-6 will be the  70Z0XX where XX is the contracting office code IE: 23.
•      Positions 7-8 will be the two digit Fiscal Year IE: 17.
•      Position 9 will be the one character instrument code IE: C, D, F.
•      Positions 10-17 will be agency assigned number.
•      New Example: 70Z02317D00000001

Is it possible to set up an expression at different points w/in the string of characters? I've never done that...only length requirements and forcing it to start w/ either a number or a letter. thanks for any help.
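
This looks doable, since a regex is read left to right and each piece can describe one range of positions. A sketch in Python (an assumption, not ColdFusion), with the pieces mapped to the listed positions; treating the office code and the agency-assigned number as alphanumeric is also an assumption.

```python
import re

CONTRACT = re.compile(
    r"^70Z0[A-Za-z0-9]{2}"   # positions 1-6: 70Z0 plus two-character office code
    r"[0-9]{2}"              # positions 7-8: two-digit fiscal year
    r"[A-Za-z]"              # position 9: one-character instrument code
    r"[A-Za-z0-9]{8}$"       # positions 10-17: agency assigned number
)

def is_contract_number(value):
    return CONTRACT.match(value) is not None

print(is_contract_number("70Z02317D00000001"))
```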
0
Hi,

I am in need of some assistance with a character search in any string for the following characters

-
[
]
*
!

Any and all help would be very much appreciated.

Thanks

Simon
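
A character class is probably the simplest route. In the Python sketch below (the language is an assumption), the hyphen is placed first so it is read literally, and the square brackets are escaped.

```python
import re

# Matches any one of: - [ ] * !
SPECIALS = re.compile(r"[-\[\]*!]")

def has_special(text):
    return SPECIALS.search(text) is not None

print(has_special("plain text"))
print(SPECIALS.findall("x[1]*!"))
```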
0
I am trying to gather all of the characters following the final space in a string but am struggling with the syntax. Can anyone help please? The last set of characters will always be alphanumeric but will be of differing characters and amount of characters.

 In the below example I would only want to highlight cat.  

 dog xxx-klsfkd-sdf-sdf cat


 And in the next one aaaaaaaaaaaaaddddddddddddddd

 heht ooijl-nanhjhsh aaaaaaaaaaaaaddddddddddddddd


however if the final character in the string is single I wish to ignore it e.g.

 heht ooijl-nanhjhsh aaaaaaaaaaaaaddddddddddddddd 0

would still highlight aaaaaaaaaaaaaddddddddddddddd


 Many Thanks
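
One possible pattern captures a run of at least two non-space characters that is followed, optionally, by a single trailing character and then the end of the string. Sketched in Python (an assumption; last_word is a made-up helper name):

```python
import re

# Group 1: the last run of 2+ non-space characters.
# (?:\s+\S)? allows one lone trailing character, which is not captured.
LAST_WORD = re.compile(r"(\S{2,})(?:\s+\S)?\s*$")

def last_word(text):
    match = LAST_WORD.search(text)
    return match.group(1) if match else None

print(last_word("dog xxx-klsfkd-sdf-sdf cat"))
print(last_word("heht ooijl-nanhjhsh aaaaaaaaaaaaaddddddddddddddd 0"))
```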
0
I am trying to gather all of the characters following the final space in a string but am struggling with the syntax. Can anyone help please? The last set of characters will always be alphanumeric but will be of differing characters and amount of characters.

In the below example I would only want to highlight cat.  

dog xxx-klsfkd-sdf-sdf cat


And in the next one aaaaaaaaaaaaaddddddddddddddd

heht ooijl-nanhjhsh aaaaaaaaaaaaaddddddddddddddd


Many Thanks
0
Hi All,

Background:
I have written a script to grab the most common words in a page (with tags stripped etc.) It mostly works, however there is an occasional occurrence of the following happening: helloThisIsAnExampleOfTheAnomoly.

This occurs while grabbing certain HTML via a cURL based function, stripping tags and counting word frequency. It mostly appears to occur in menus and widgets.

What I'm looking for is an elegant/efficient solution to pop/push/unset values in the array with the values split.

To expand:
preg_replace('/(?<! )(?<!^)[A-Z]/',' $0', $words)

I'm using the above regular expressions to essentially split the values based on uppercase values occurring mid string/array element.

To summarise:
$array is currently something like this: ("This", "is", "okay", "this", "IsNotOkay")
What I want:
$array is going to look something like this ("This", "is", "okay", "this", "Is", "Not", "Okay")

Don't worry too much about the repeat values as I am utilising a "stop words" array to rid the ones I would not like to keep.

I've not got it working nicely yet so thought I'd turn to you for your expert input.

Thanks in advance.
Chris
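
One way to get the flattened array is to insert a space at every lowercase-to-uppercase boundary and then split each element. The sketch below is in Python rather than PHP (an assumption made to keep it self-contained, and split_camel is a made-up name); the lookarounds are standard PCRE syntax, so a similar pattern may carry over to preg_split.

```python
import re

# A zero-width position between a lowercase letter and an uppercase letter.
BOUNDARY = re.compile(r"(?<=[a-z])(?=[A-Z])")

def split_camel(words):
    """Return a flat list with run-together words split at case boundaries."""
    result = []
    for word in words:
        result.extend(BOUNDARY.sub(" ", word).split())
    return result

print(split_camel(["This", "is", "okay", "this", "IsNotOkay"]))
```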
0
I am looking to replace the value between two spaces in a string with a backslash. Can someone assist with the syntax.

E.G. I want

RED 123456789 White

to become

RED\White

Thanks
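
A minimal sketch in Python (an assumption; collapse_middle is a made-up name): match the first run of whitespace, the value after it and the whitespace that follows, and replace the lot with one backslash. A function is used as the replacement to sidestep backslash escaping in the replacement string.

```python
import re

def collapse_middle(text):
    """Replace ' value ' (the first space-delimited value) with a backslash."""
    return re.sub(r"\s+\S+\s+", lambda match: "\\", text, count=1)

print(collapse_middle("RED 123456789 White"))
```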
0
I am looking for a solution that uses CFSCRIPT to remove all text that is surrounded by '<style' and '/style>' including removing the style tags. Can someone show me how I can use something like ReReplaceNoCase to solve this.

For example: if I have text that looks like:

section a <style>here is style to remove</style>section b <style>more style to remove</style>section c

I need the function to return:
section a section b section c

Thank you.
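
The pattern itself could look like the one below, tried out here in Python rather than CFSCRIPT (an assumption). It relies on a lazy quantifier so that each style block is removed separately; whether the same pattern carries over to ReReplaceNoCase unchanged is something to verify.

```python
import re

# .*? is lazy, so the match stops at the first closing tag instead of
# running on to the last one. DOTALL lets a style block span several lines.
STYLE_BLOCK = re.compile(r"<style\b[^>]*>.*?</style>", re.IGNORECASE | re.DOTALL)

text = (
    "section a <style>here is style to remove</style>"
    "section b <style>more style to remove</style>section c"
)

result = STYLE_BLOCK.sub("", text)
print(result)
```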
0
I have a regular expression that does exactly what I want it to do but I don’t understand how it works.  I'm running a bash shell under macOS Sierra.

Here’s the command: sed -e 's/.*\"\(.*\)\"/\1/'

Here’s what it’s applied to: | "IOPlatformSerialNumber" = "QP93505K0TM"

which returns this: QP93505K0TM

which is exactly what I want.

The pattern match has two parts:

s/.*\"   and \(.*\)\"

If I run just part one, it returns the entire string, which I expect because I read the first part to mean: “Match any string that ends with a quote”.

Part two always grabs the last part of the string.  I tested this:

“abc” “def” returns def

“abc” “def” “ghi” returns ghi

“abc” “def” “ghi” “jol” returns jol

My question is why does the expression enclosed in () always refer to the last part of the string?

Thanks.
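
The likely explanation is greediness: the leading .* grabs as much as it can and only gives characters back until the rest of the pattern fits, so the group ends up with the last quoted value. This can be seen with the same pattern in Python (an assumption; sed writes the group as \( \) while Python writes it as ( )):

```python
import re

line = '| "IOPlatformSerialNumber" = "QP93505K0TM"'

# Same idea as the sed command: replace the whole match with group 1.
serial = re.sub(r'.*"(.*)"', r"\1", line)

# Capturing the leading .* shows how much it swallowed.
greedy = re.match(r'(.*)"(.*)"', line)

# With a lazy leading .*? the group starts at the FIRST quote instead.
lazy = re.match(r'(.*?)"(.*)"', line)

print(serial)
print(repr(greedy.group(1)))
print(repr(lazy.group(2)))
```

The greedy prefix ends just before the opening quote of the last value, which is why the group always holds the last part of the string.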
0
