Regular Expressions

A regular expression ("regex") is a sequence of characters that defines a search pattern, used mainly for pattern matching against strings in "find and replace"-style operations. Regular expression processors are built into several search engines, the search-and-replace dialogs of many word processors and text editors, and command-line text-processing utilities such as sed and AWK. Many programming languages provide regular expressions, some built in (for example Perl, JavaScript, Ruby, AWK, and Tcl) and others through a standard library (for example the .NET languages, Java, Python, and C++ since C++11). Most other languages offer regular expressions through a library.

I have a column of data in which I need to find and replace values. For example, a cell contains "arjun + suresh + rajesh +marco", and each name is unique and has to be replaced.

That is, arjun should be replaced with 03 arjun, suresh with 02 suresh, rajesh with 08 rajesh, and marco with 06 marco.

After replacement the output will be "03 arjun +02 suresh +08 rajesh + 06 marco".

Once the values have been replaced this way for the entire column, the values in each cell should be sorted in ascending order by number: "02 suresh+03 arjun +06 marco +08 rajesh".
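A minimal Python sketch of the replace-then-sort step, assuming the cell text can be processed outside the spreadsheet and that the name-to-number table from the example is complete. `PREFIXES` and `replace_and_sort` are hypothetical names, not anything from the question.

```python
import re

# Hypothetical lookup table, copied from the example in the question.
PREFIXES = {"arjun": "03", "suresh": "02", "rajesh": "08", "marco": "06"}

def replace_and_sort(cell: str) -> str:
    # Split on "+" and swallow any whitespace around it.
    names = [n for n in re.split(r"\s*\+\s*", cell.strip()) if n]
    # Prefix each known name; unknown names are left unchanged.
    tagged = [f"{PREFIXES[n]} {n}" if n in PREFIXES else n for n in names]
    # Two-digit prefixes sort correctly as plain strings.
    return " + ".join(sorted(tagged))

print(replace_and_sort("arjun + suresh + rajesh +marco"))
# 02 suresh + 03 arjun + 06 marco + 08 rajesh
```

The output uses a uniform " + " separator instead of reproducing the uneven spacing in the example.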

Given the code below:

using System;
using System.Text;				
using System.Text.RegularExpressions;

public class Program
{
	public static void Main()
	{
		string Sentance  =@"pest, irritant, nag, nuisance, Colloq pain, pain in the neck or Brit taboo arse or US taboo ass; Slang US nudge";
		string[] Colloq = new string[] { "Brit", "US", "Australian", "Canadian", "New Zealand" };
        string[] Labels = new string[] { "Colloq", "Slang", "Taboo", "Archaic", "Old-fashioned" };
		
		string[] Words = Sentance.Split(',');
		foreach (var word in Words)
		{
			if (Array.Exists(Labels, label => word.Contains(label)))
			{
				// Everything up to the next ';' belongs to that Label.
				// I need to capture the Label, the Colloq and the word, treating 'or' as a separator between words.
			}
			else{
			AlternateWords alternateWords = new AlternateWords()
			{
				AlertnateWord = word.Trim()
			};
			thesauri.alternateWords.Add(alternateWords);
			}
		}

    }
	
}

I need:
AlertnateWord pest
AlertnateWord irritant
AlertnateWord nag
AlertnateWord nuisance

AlertnateWord Colloq pain
AlertnateWord Colloq pain in the neck
AlertnateWord Colloq Brit taboo arse
AlertnateWord Colloq US taboo ass
AlertnateWord Slang US nudge

Sorry, I don't know how else to explain this.
Attached is a section from "BitBake User's Manual" that uses inline Python variable expansion to set variables.
Please explain in detail how the DATE variable is set.
python.PNG
Hi, I need some help refining my regex.

I need to split the example on the numbers, e.g. ". 2 ", but I also need to split on "--n. 5 " and keep the "n.".
Note: I'm first splitting on ". 1 " because the preceding Lexical doesn't have '--'.

What I'm getting is every number, which I don't need!
I do need the Lexical, if there is one, and the sentence.

using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public class Example
{
    public static void Main()
    {
        Regex SentanceSpit = new Regex(@"\.(\s+\d+\s+)|\.--([a-z]+\.)\s+\d+\s+");
        string Line = @"abandon v. 1 give up or over, yield, surrender, leave, cede, let go, deliver (up), turn over, relinquish: I can see no reason why we should abandon the house to thieves and vandals. 2 depart from, leave, desert, quit, go away from: The order was given to abandon ship. 3 desert, forsake, jilt, walk out on: He even abandoned his fianc,e. 4 give up, renounce; discontinue, forgo, drop, desist, abstain from: She abandoned cigarettes and whisky after the doctor's warning.--n. 5 recklessness, intemperance, wantonness, lack of restraint, unrestraint: He behaved with wild abandon after he received the inheritance.";
        // Output strings
		string Term;
		string Lexical; // not every example has a different Lexical
        string[] WordsExample;
        string[] Words;
        string Example;
		string[] FirstSecond = Regex.Split(Line, @"\s1\s");
		if (FirstSecond.Length ==2)
		{
			string 

Using MS SQL Server and regex matching, how do you handle optional (zero or one instances) of a character?

models
-----------
H55N6800UK
H55NU8700UK
H60NEC5600UK

SELECT model FROM models WHERE (model LIKE 'H[0-9][0-9]N[0-9][0-9][0-9][0-9]UK') OR (model LIKE 'H[0-9][0-9]NU[0-9][0-9][0-9][0-9]UK') OR (model LIKE 'H[0-9][0-9]NEC[0-9][0-9][0-9][0-9]UK')

The SELECT statement above returns the 3 rows but uses an OR with three different regex expressions.  What I would like to do is use a single regex expression where the U or EC in the 5th character position are optionally matched.
Ideally a single regex expression to handle this would be something like:
 'H[0-9][0-9]N Optional U OR Optional EC   [0-9][0-9][0-9][0-9]UK'

The matching needs to be relatively tight and just using the % wildcard would accidentally match more items in the database than desired.

In practice, I will have one table of models to match (tens of thousands of records) and a second table of model-related data that includes the regex match in one column.  Then I'll simply join the tables in the form: SELECT * FROM MODELS LEFT OUTER JOIN MODELINFO ON MODELS.MODEL LIKE MODELINFO.REGEX.  There are already a huge number of permutations of the regex matches needed to cover my model data.  Inability to use an optional character in the regex will add a lot of work and double up a lot of rows.
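For comparison, here is how the optional part looks in a real regex engine. T-SQL LIKE only understands %, _, [ ] and [^ ], with no quantifier or alternation, so the optional U/EC cannot be written as a single LIKE pattern. The Python sketch below is an illustration of the pattern itself, not of SQL Server; the fourth model in the list is a made-up non-matching value.

```python
import re

# (?:U|EC)? matches "U", "EC", or nothing at all.
MODEL = re.compile(r"H\d{2}N(?:U|EC)?\d{4}UK")

# The first three models come from the question; the last one is hypothetical.
models = ["H55N6800UK", "H55NU8700UK", "H60NEC5600UK", "H55NX6800UK"]

# fullmatch keeps the match tight: the whole value must fit the pattern.
matched = [m for m in models if MODEL.fullmatch(m)]
print(matched)
```

Getting the same behaviour inside SQL Server would need either the OR of several LIKE patterns already shown or some regex facility outside plain LIKE.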
I would like to prevent certain characters from being entered anything except &,(){}$%^!",

I thought this would work:
^(?=.*[A-Z])(?=.*[a-z])(?=.*[\d])(?=.*[\!\£\$\%\^\@\#\~])

Any help is appreciated.
I have a string that follows this pattern:  CT201945681-3012AMC  

I would like to keep all of the leading alphabetic characters and strip the first two digits of the year.
The hyphen and any alphabetic characters after the hyphen should also be removed.

So the example above would become:  CT19456813012
The tricky part is the 20 from the year, so I would like to do this with a single regex if possible.

Thank You.
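One way to do this in a single substitution, sketched in Python. It assumes every value has the layout letters, four-digit year, more digits, hyphen, digits, optional trailing letters; `compact` is a hypothetical helper name.

```python
import re

# Group 1: leading letters. The next two digits (the century) are matched
# but not captured. Group 2: remaining digits before the hyphen.
# Group 3: digits after the hyphen. Trailing letters are matched and dropped.
PATTERN = re.compile(r"^([A-Za-z]+)\d{2}(\d+)-(\d+)[A-Za-z]*$")

def compact(code: str) -> str:
    # Values that do not fit the assumed layout are returned unchanged.
    return PATTERN.sub(r"\1\2\3", code.strip())

print(compact("CT201945681-3012AMC"))  # CT19456813012
```

The pattern drops whatever the first two digits are, so it does not check that they are actually "20".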
What is the proper JavaScript regex to cover these hosts?
  • apply.essexcredit.com
  • apply-uat.essexcredit.com
  • apply-qa.essexcredit.com

Thanks!
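A possible pattern for exactly these three hosts, checked here in Python. The pattern uses only syntax that JavaScript also supports, so it should carry over to a JavaScript regex literal unchanged, though that is untested here. `is_apply_host` is a hypothetical name.

```python
import re

# "apply", then an optional "-uat" or "-qa", then the fixed domain.
# Dots are escaped so they match a literal "." only.
HOST = re.compile(r"^apply(?:-(?:uat|qa))?\.essexcredit\.com$")

def is_apply_host(host: str) -> bool:
    return HOST.match(host) is not None

for h in ["apply.essexcredit.com", "apply-uat.essexcredit.com",
          "apply-qa.essexcredit.com", "apply-dev.essexcredit.com"]:
    print(h, is_apply_host(h))
```

The anchors matter: without ^ and $ the pattern would also accept longer host names that merely contain one of these strings.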
I have this:
ModelRangeFound: InStr([mODEL RANGE],[forms]![Form1].[word].[value])>0

It fails because it picks up the search word anywhere in the text.

I was thinking of using a regex for the search term, but how do I tell it to accept the found word only if it is the first or second word?

"RG125 bike year a6" should fail, because it finds a6 at the end.
"A6 quattro s line" is valid.

How do I get the regex to find the search string but only return true if the search word is at the beginning of the sentence?

I don't know how to write that regex pattern.
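A sketch of the "first or second word only" test in Python. It assumes words are separated by whitespace and that matching should ignore case; `starts_with_word` is a hypothetical name, and the pattern would still need to be wired into whatever regex object the database front end offers.

```python
import re

def starts_with_word(text: str, word: str) -> bool:
    # ^\s*          start of text, ignoring leading spaces
    # (?:\S+\s+)?   optionally skip exactly one leading word
    # word          the search term, escaped so it is taken literally
    # (?!\S)        the term must end at whitespace or end of text
    pattern = r"^\s*(?:\S+\s+)?" + re.escape(word) + r"(?!\S)"
    return re.search(pattern, text, re.IGNORECASE) is not None

print(starts_with_word("RG125 bike year a6", "a6"))  # False
print(starts_with_word("A6 quattro s line", "a6"))   # True
```

To accept the term as the first word only, drop the optional `(?:\S+\s+)?` part.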
I have the following regular expression which is in some VB.net code I am maintaining. I know that it is used for formatting amounts of money.
Can someone interpret exactly what the regular expression represents as far as digits, decimal places etc. or whatever?

^([0-9]+|[0-9]{1,3}(,[0-9]{3})*)(\.[0-9]{1,4})?$


In VB.net :
Dim myMoneyRegEx As String ="^([0-9]+|[0-9]{1,3}(,[0-9]{3})*)(\.[0-9]{1,4})?$"
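Reading the pattern piece by piece: the whole string must be an integer part that is either plain digits with no commas, or one to three digits followed by any number of comma-plus-three-digit groups, optionally followed by a decimal point and one to four decimal digits. The Python sketch below runs the same pattern over a few sample strings chosen for illustration; none of them come from the original code.

```python
import re

# Same pattern as in the VB.NET string above.
MONEY = re.compile(r"^([0-9]+|[0-9]{1,3}(,[0-9]{3})*)(\.[0-9]{1,4})?$")

# Illustrative inputs only.
samples = ["1234", "1,234,567", "1,234.5", "0.1234",
           "12,34", "1.23456", ".50", "1,2345"]

results = {s: bool(MONEY.match(s)) for s in samples}
for s, ok in results.items():
    print(f"{s!r:12} {ok}")
```

The first four samples match; the last four fail because of a bad comma group, a fifth decimal digit, a missing integer part, and a four-digit group after a comma, respectively.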

I have the following regex (see onkeyup) that automatically inserts a slash (/) in between the 2-digit month, 2-digit day and 4-digit year, but now I want my users to only enter the 2-digit month and 2-digit year. Could someone please help me modify the below regex to achieve this?

Enter MM/YY: <input type="text" id="myDate" name="myDate" maxlength="10" style="width:80px;" onkeyup="this.value=this.value.replace(/^(\d\d)(\d)$/g,'$1/$2').replace(/^(\d\d\/\d\d)(\d+)$/g,'$1/$2').replace(/[^\d\/]/g,'')" />

Many thanks in advance.
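The formatting logic for MM/YY could look like the following, sketched in Python instead of the inline JavaScript so it can be tested in isolation. It assumes the field should hold at most four digits; `format_mm_yy` is a hypothetical name.

```python
import re

def format_mm_yy(raw: str) -> str:
    # Drop everything that is not a digit and keep at most four digits.
    digits = re.sub(r"\D", "", raw)[:4]
    # Once a third digit is present, put a slash after the first two.
    return re.sub(r"^(\d{2})(\d{1,2})$", r"\1/\2", digits)

for typed in ["1", "12", "123", "1225", "12/25", "12a25x"]:
    print(repr(typed), "->", repr(format_mm_yy(typed)))
```

Compared with the original chain of replaces, the second replace (the one that adds the slash before the year) is no longer needed, and the input's maxlength would presumably shrink to 5.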
This version of the expression extracts the email address from a string of characters contained within cell E2 and replaces those characters with the result.

I also need versions of the expression that will extract the first name and one to extract the last name.

The email address can possibly be defined as any string of characters that are separated by an @ symbol with no spaces and surrounded entirely by either a (parenthesis) or <chevrons>.

This should work in a google sheet such as this one here https://docs.google.com/spreadsheets/d/1rQ5QC6Ipr5kkBDuNnIMY05Z09Q4q7lqZ52vd0bfCvtY/edit#gid=1895941459

=Regexextract(E2,"[A-z0-9._%+-]+@[A-z0-9.-]+\.[A-z]{2,4}")

I need to add validation to a text input on a web app form to prevent a user from entering a "x" character in a phone extension text field. The web app form allows to use RegEx validation.

I've tried a few different expressions like /^((?!x).)*$/gi and what I'm finding, is that if I enter "x12345", the expression will let me know an "x" was entered, however, if I only remove the "x" and validate again, the site still thinks there is an "x" in the value. If I completely remove all characters, it will then validate there is not an "x".

To help understand whether the issue is with the web app, what is the best example to use for testing? I need to make sure the validation checks for an upper- or lower-case x anywhere in the value, but especially as the very first character, since that is how users will enter an x. They will enter "x 12345", "x12345", "X 12345", or "X12345" (the number of digits can vary).
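A simple reference check to test against, sketched in Python: the value is valid when it contains no x in either case. `has_no_x` is a hypothetical name. One possible explanation for the stale result, which cannot be confirmed from the description alone, is the g flag: in JavaScript a regex object with g remembers where the last successful test() ended, so reusing it can make the next test start mid-string. Trying the expression without g would show whether that is the cause.

```python
import re

# Zero or more characters, none of which is x or X.
NO_X = re.compile(r"[^xX]*")

def has_no_x(value: str) -> bool:
    # fullmatch requires the whole value to fit, so no anchors are needed.
    return NO_X.fullmatch(value) is not None

for v in ["x12345", "x 12345", "X 12345", "X12345", "12x45", "12345", ""]:
    print(repr(v), has_no_x(v))
```

Only the last two values are accepted; whether an empty extension should be valid is a separate decision for the form.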
This is a learning exercise for me. It has no practical value other than for learning about regex expressions.
Here is a program with several regex patterns.
import re

a = r"stuff<offset>1234</offset><length>78</length>stuff <offset>1000134</offset><length>5678</length>stuff...<offset>11234</offset><length>5678</length>stuff"
r  = re.compile(r"<offset>([^<]+)</offset><length>([^<]+)</length>")
matches = r.findall(a)   #instantiate our matches variable
print(matches)
print(max(r.findall(a), key = lambda x: int(x[0])))
print("***** done with original working example using <>")
print("")


b = r"stuff[offset]1234[/offset](length)78(/length)stuff [offset]1000134[/offset](length)5678(/length)stuff...[offset]11234[/offset](length)5678(/length)stuff"
rb0 = re.compile(r"[offset]([^\[]+)[/offset][length]([^\[]+)[/length]")
rb1 = re.compile(r"\[offset\]([^\[]+)\[/offset\]\[length\]([^\[]+)\[/length\]")
rb2 = re.compile(r"\[offset\]([^[]+)\[\/offset\]\[length\]([^[]+)\[\/length\]")
rb3 = re.compile(r"\[offset\](\d+)\[\/offset\]\(length\)(\d+)\(\/length\)")

bmatches0 = rb0.findall(b)
bmatches1 = rb1.findall(b)
bmatches2 = rb2.findall(b)
bmatches3 = rb3.findall(b)

print("bmatches0: ",bmatches0)
print("bmatches1: ",bmatches1)
print("bmatches2: ",bmatches2)
print("bmatches3: ",bmatches3)

print(max(rb0.findall(b), key = lambda x: ([0])))

print(max(rb3.findall(b), key = lambda x: int(x[0])))

Here is the output. Output lines 5, 6, 7, and 9 are wrong. Can you please explain why each pattern produced its result?
[('1234', '78'), ('1000134', '5678'), ('11234', '5678')]
('1000134', '5678')
***** done with original working example using <>

bmatches0:  [('ffset](length)78(/leng', ')s'), ('ffset](length)5678(/leng', ')s'), ('ffset](length)5678(/leng', ')s')]
bmatches1:  []
bmatches2:  []
bmatches3:  [('1234', '78'), ('1000134', '5678'), ('11234', '5678')]
('ffset](length)78(/leng', ')s')
('1000134', '5678')

Thanks for the help.
Paul
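A short sketch that isolates the two effects behind the unexpected lines, using a shortened copy of the test string. Unescaped square brackets form a character class, which is why rb0 matches odd fragments, and rb1 and rb2 look for [length] in square brackets while the data has (length) in parentheses, so they find nothing. Line 9 differs because the key `lambda x: ([0])` returns the same list for every item, so max() simply returns the first match.

```python
import re

# Unescaped, [offset] is a character class: ONE character out of o, f, s, e, t.
class_hits = re.findall(r"[offset]", "[offset]")
print(class_hits)

# Escaped, \[offset\] matches the literal tag including its brackets.
literal_hits = re.findall(r"\[offset\]", "[offset]")
print(literal_hits)

b = "[offset]1234[/offset](length)78(/length)"

# rb1/rb2 style: expects [length]...[/length], but the data uses parentheses.
square = re.findall(r"\[offset\](\d+)\[/offset\]\[length\](\d+)\[/length\]", b)
print(square)

# rb3 style: the parentheses are escaped, so they match the data literally.
paren = re.findall(r"\[offset\](\d+)\[/offset\]\(length\)(\d+)\(/length\)", b)
print(paren)
```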
I have a text string like this:
"stuff<offset>1234</offset><length>78</length>stuff <offset>1000134</offset><length>5678</length>stuff...<offset>11234</offset><length>5678</length>stuff"

My goal is to find the largest offset value and the corresponding length value.

I know I can write a loop searching for each "<offset>" and extract the value. I was wondering if in python3.7, there is a non-loop approach. (I can  use existing xml parsing code, but this seems simple enough to just use the text string.)

Thanks,
Paul
Not sure if this is possible, but is there a way for a regular expression to remove everything in a string except for certain words?

Data             Result
---------------  ------
1.2 GPM  Test    GPH
LPH   999        LPH
ZZ gps (test)    gps
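It is usually easier to extract the wanted word than to remove everything else. A Python sketch, assuming a fixed whitelist; the list below is a guess built from the sample rows, and `keep_word` is a hypothetical name. Note that the first sample row shows GPM in the data but GPH in the result, which plain extraction cannot produce, so that row may contain a typo.

```python
import re

# Hypothetical whitelist; the question does not give the full list.
KEEP = re.compile(r"\b(?:GPM|GPH|LPH|gps)\b")

def keep_word(text: str) -> str:
    # Return the first whitelisted word, or an empty string if none is found.
    m = KEEP.search(text)
    return m.group(0) if m else ""

for row in ["LPH   999", "ZZ gps (test)", "no units here"]:
    print(repr(row), "->", repr(keep_word(row)))
```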
I'm using PHP and have a problem: when users create a list, they add characters they shouldn't. Here is the list I have collected:

. ; / ? ! " @ $ ()

How can I make sure none of these enter the database, in one go instead of multiple find-and-replaces?
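A single character class can remove all of them in one pass. The sketch below is in Python for easy testing; the same class should work as the pattern for PHP's preg_replace once it is wrapped in suitable delimiters. `strip_forbidden` is a hypothetical name.

```python
import re

# Inside [...] these characters are literal, so none of them needs escaping.
FORBIDDEN = re.compile(r'[.;/?!"@$()]')

def strip_forbidden(text: str) -> str:
    return FORBIDDEN.sub("", text)

print(strip_forbidden('milk (2%); e.g. "bread" @ $3?!'))
```

Stripping characters like this tidies the input but is not a substitute for parameterised queries when writing to the database.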
I have to add the 10.x.x.0/24 subnet in regex format in an ADFS claim rule.

Is the format below correct?


\b10\.x\.x\.([1-9]|[1-9][0-9]|1[0-9][0-9]|2[0-5][0-9])\b
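Two things stand out in the expression as posted: the literal x only matches the letter x, so the real octets have to be substituted, and the last-octet part excludes 0 while accepting 256 to 259 (2[0-5][0-9]). The Python sketch below shows a corrected last octet for an assumed example subnet 10.1.2.0/24; the octets 1 and 2 are placeholders, and how the pattern is embedded in the claim rule is not covered here.

```python
import re

# One octet, 0-255: 250-255, 200-249, 100-199, then 0-99.
OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])"

# Assumed example subnet 10.1.2.0/24; replace 1 and 2 with the real octets.
SUBNET = re.compile(r"^10\.1\.2\." + OCTET + r"$")

def in_subnet(ip: str) -> bool:
    return SUBNET.match(ip) is not None

for ip in ["10.1.2.0", "10.1.2.99", "10.1.2.255", "10.1.2.256", "10.1.3.4"]:
    print(ip, in_subnet(ip))
```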
Need help with a regular expression that can extract data between ()

Data              Extract
----------------  -------
1.2 (3.4L)        3.4L
100 (3.4G/Max)    3.4G/MAX
Z (ABC) V         ABC
ZZZZZ             Nothing
(X ...MMM         Nothing, because there is no closing bracket
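A sketch in Python, assuming only the first parenthesised group in each value is wanted and that groups are not nested. `between_parens` is a hypothetical name. The text comes back exactly as written, so the second row yields 3.4G/Max, not the upper-cased form shown in the sample.

```python
import re

# \( and \) are literal parentheses; the group captures everything between
# them that is not itself a parenthesis.
PAREN = re.compile(r"\(([^()]*)\)")

def between_parens(text: str):
    m = PAREN.search(text)
    return m.group(1) if m else None

for row in ["1.2 (3.4L)", "100 (3.4G/Max)", "Z (ABC) V", "ZZZZZ", "(X ...MMM"]:
    print(repr(row), "->", between_parens(row))
```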

Hi

I have inherited a regex pattern to validate mobile phone numbers.

/^(\+44\s?7\d{3}|\(?07\d{3}\)?)\s?\d{3}\s?\d{3}$/g;

I need to change it to validate an 8-digit number starting with 2.

Can anyone help, please?
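If the new rule really is just "eight digits, the first of which is 2", with no spaces, brackets, or country prefix allowed, the pattern becomes much shorter. That reading is an assumption; the Python sketch below checks it, and `is_valid_number` is a hypothetical name.

```python
import re

# A literal 2 followed by exactly seven more digits, nothing else.
EIGHT_DIGITS = re.compile(r"^2[0-9]{7}$")

def is_valid_number(value: str) -> bool:
    return EIGHT_DIGITS.match(value) is not None

for v in ["21234567", "31234567", "2123456", "212345678", "2123 4567"]:
    print(repr(v), is_valid_number(v))
```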
Hi experts. I could not write a regex for the following values.
How do I write a regex that covers the following lines?
12345
1a4a5
12a45
12abc
12abc
12ab5
12aa5
12AB5

thanks
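The question does not state the rule behind the sample lines, so the sketch below is only a guess: every sample is exactly five characters long and made of digits and letters. Under that assumption a Python check could look like this; `is_five_alnum` is a hypothetical name.

```python
import re

# Assumed rule: exactly five ASCII letters or digits.
FIVE_ALNUM = re.compile(r"^[0-9A-Za-z]{5}$")

def is_five_alnum(line: str) -> bool:
    return FIVE_ALNUM.match(line) is not None

samples = ["12345", "1a4a5", "12a45", "12abc", "12ab5", "12aa5", "12AB5"]
print(all(is_five_alnum(s) for s in samples))
```

If the real rule is tighter, for example that the first character must be a digit, the character class at that position would need to be narrowed.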
I need a regular expression that will extract numbers and "/", ".".

Data Example    Results
------------    -------
3.66            3.66
1-1/34          1-3/4
N/A             [null]
Test 3.55       3.55
5               5
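A Python sketch, assuming a result must start with a digit and may then continue with digits, ".", "/" and, going by the second sample row, "-". `extract_number` is a hypothetical name. The value is returned as written, so the second row yields 1-1/34.

```python
import re

# One digit, then any run of digits, dots, slashes, or hyphens.
NUMBER = re.compile(r"\d[\d./-]*")

def extract_number(text: str):
    m = NUMBER.search(text)
    return m.group(0) if m else None

for row in ["3.66", "1-1/34", "N/A", "Test 3.55", "5"]:
    print(repr(row), "->", extract_number(row))
```

Requiring a leading digit is what keeps "N/A" from producing a bare "/". Trailing punctuation directly after a number, such as a sentence-ending dot, would be included in the match.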
I need a regular expression to test for mm/dd/yyyy dates at the beginning of lines in a form field, as in the example below:
·07/29/2019 event 3
·07/08/2019-event 2
·05/29/2019: event

Users can enter as many dates and details as they want, but I need to verify the date format.

Users may also enter only one date, as below.

07/29/2019 event

Here's what I have (that doesn't work):

var dateString = /^\d{1,2}(\-|\/|\.)\d{1,2}\1\d{4}$/

But I don't know how to test (or ignore) the bullet character. I've tried .  I've tried \·  

But none of my efforts find the lines below valid (though I want them to be accepted):

·07/29/2019 event 3
·07/08/2019-event 2
·05/29/2019: event

Here's the context in my javascript:  else if(!dateString.test(theForm.EI16.value)) {alert("Please use mm/dd/yyyy format for all dates.");theForm.EI16.focus();return false;}

Who can help?
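Two things in the posted pattern reject the sample lines: nothing allows for the leading bullet, and the trailing $ requires the line to end right after the year. The Python sketch below makes the bullet optional, drops the $, and checks every non-empty line of the field. It validates the format only, not whether the date exists; `all_lines_valid` is a hypothetical name.

```python
import re

# Optional bullet and spaces, then m/d/yyyy with the SAME separator twice
# (\1 refers back to the first separator). (?!\d) stops a five-digit year.
DATE_LINE = re.compile(r"^·?\s*\d{1,2}([-/.])\d{1,2}\1\d{4}(?!\d)")

def all_lines_valid(field_value: str) -> bool:
    # Ignore blank lines; require at least one line with a date.
    lines = [ln for ln in field_value.splitlines() if ln.strip()]
    return bool(lines) and all(DATE_LINE.match(ln) for ln in lines)

entry = "·07/29/2019 event 3\n·07/08/2019-event 2\n·05/29/2019: event"
print(all_lines_valid(entry))
print(all_lines_valid("07/29/2019 event"))
print(all_lines_valid("event on 07/29/2019"))
```

In JavaScript the per-line check could be done by splitting the field value on newlines and testing each line with the same pattern.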
I have a file with many rows.
I want to delete some lines from this file.
I am using the vi editor.
I wrote a simple command, but it only deletes a single line.
I have to run it again every time, which is a waste of time.
How do I delete all the rows I want to delete in one go?

MEDULA_ESEVKTAMAM_TANI_PK,--rows I want to delete
  CREATE UNIQUE INDEX "HASTANE"."MEDULA_ESEVKTAMAM_TANI_PK" ON "HASTANE"."MEDULA_ESEVKTAMAM_TANI" ("SIRA_NO", DOSYA_NO, PROTOKOL_NO, "TEDAVI_ICD")
  PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  TABLESPACE USERS,

MEDULA_ESEVKBILDIR_TANI_PK,--rows I want to delete
  CREATE UNIQUE INDEX "HASTANE"."MEDULA_ESEVKBILDIR_TANI_PK" ON "HASTANE"."MEDULA_ESEVKBILDIR_TANI" ("SIRA_NO", DOSYA_NO, PROTOKOL_NO, "SEVK_ICD")
  PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  TABLESPACE USERS,

MAKBUZ_D_NO,--rows I want to delete
  CREATE INDEX "HASTANE"."MAKBUZ_D_NO" ON "HASTANE"."MAKBUZMAIN" (DOSYA_NO, PROTOKOL_NO)
  PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE
I need to enter IPs in regex format in my claim rule on my ADFS server.

Is there any way to do this?