regex match one number (or character) but not more than one

I am trying to write a regex that will match a single digit preceded by a $ sign (for example, $5). The reason I am doing this is to learn to write custom rules to catch spam.

As a specific test example, I would like to match something like "sign up now for only $5" in the subject of an email. The wording is irrelevant and obviously can change. I already know how to match subjects, so I am not asking about SpamAssassin or SpamAssassin rule types, only regex matching.

In order to help do some tests, I created an example test file like this.

$
$5
$55
$555
$5555
5

I am using the one line perl 'script' listed below to do my tests:

 perl -ne 'if (/\$\d/) {print "$&\n";}' < testfile

My goal is to match the line that contains "$5", and no other line.

After testing, my regex matches each line containing a $ followed by a 5, no matter how many 5s are on the line. I have tried the following regexes:

\d
[\d]
\d{1}

and similar, but nothing works: each one matches every line containing a $ followed by a 5. I did a little research and found that, since \d matches a single digit, the [] and {1} are unneeded. My initial thought was that a pattern with a single \d would match only the line that contains $5. However, it matches the $5 on each line containing a $ followed by a 5, no matter how many 5s are on the line. Note: it doesn't match the line having only a $ or only a 5, which makes sense after thinking about it.

But how can I match only the line containing $5  and no other line?

(I do not want to match $55 or $555 etc. And, per my initial subject example, the $5 can be surrounded by unknown words, characters etc., as email subject lines can vary after all.)


Thanks in advance.
camstutzAsked:
 
ozoCommented:
If file2 contains
$5

then the \D in /\$\d\D/ matches the newline following the "5", so $& contains that newline. print "$&\n" therefore prints the matched newline and then the explicit \n, which produces the extra blank line.
/[^\d]/ is equivalent to /\D/.

/\b\d\b/ matches only the digit, so $& does not contain any of the non-word characters that may surround it.
 
NVITCommented:
Does \b\d\b work?
 
Jeff DarlingDeveloper AnalystCommented:
I'm assuming you only care about matches as long as it is one dollar sign followed by 1 digit.


(\$[0-9]{1}[^0-9])


http://regexr.com/3acf4
 
Jeff DarlingDeveloper AnalystCommented:
You are correct, {1} is not needed.

(\$[0-9][^0-9])

 
Dan CraciunIT ConsultantCommented:
Or, shorter: \$\d\D
Uppercase classes are negated.

HTH,
Dan
 
camstutzAuthor Commented:
First, thank you everyone for posting suggestions. NewVillageIT: I did try that, and it didn't seem to work. Jeff: I tried \$\d[^\d], but that didn't seem to work either. However, I didn't use the parentheses (I think it is called a character class, if I remember right :) ). Jeff and Dan, I will try both of your suggestions.
 
camstutzAuthor Commented:
Hello everyone, here are the test results.

/(/(\&\d[^\d])/   returned nothing.

/\&\d\D/  returns nothing.

/\&[0-9][^0-9]/  returns nothing.
 
camstutzAuthor Commented:
Oh, and also...
/(\&[0-9][^0-9])/ returned nothing.
 
camstutzAuthor Commented:
I am wondering if it has to do with using Perl versus using SpamAssassin searches?
 
ozoCommented:
For a line of text containing exactly one $ sign and one digit, with nothing preceding or following:
/^\$\d$/

If anything other than a digit is allowed to follow the one digit:
/^\$\d(?!\d)/
 
camstutzAuthor Commented:
Thanks ozo, however that won't work. In my opening post, I mentioned that the $5 is surrounded by random words that would make up an email subject. If I use your example, it will return nothing unless the email subject is *only* $5.
 
ozoCommented:
To allow anything before the $ sign, and anything other than another digit after the single digit:
/\$\d(?!\d)/
 
camstutzAuthor Commented:
Thanks ozo, that worked. However, I owe an apology. I discovered a typo (user error) when trying the other suggestions: I had a & where a $ was supposed to be. No wonder it was printing nothing. I went back and tried the word boundary and, unless I did something wrong again, it printed three lines of 5's and stripped off the word boundary.

However, I do have a question. Many of the other examples print the correct line, but also seem to add a blank line at the end. Can someone explain this to me, or am I doing something wrong again?

This isn't my first regex, but I am still pretty new to them.
 
ozoCommented:
A "\n" in your print statement will print a newline character.
 
camstutzAuthor Commented:
Ozo, please forgive my ignorance, but when I use the regex you gave me, /\$\d(?!\d)/, this way:

perl -ne 'if (/\$\d(?!\d)/) {print "$&\n";}' < file2

it doesn't print the second "blank line", even though the print statement still contains the "\n".
 
ozoCommented:
What regex, used in what way, seems to add a blank line?
And what regex, used in what way, printed three lines of 5's and stripped off the word boundary?
 
camstutzAuthor Commented:
These print a second blank line. Please trust that I copied this exactly, except for changing the username and host name. The empty line is what the command output produced, not me hitting enter to separate the commands.

user@host:~ # perl -ne 'if (/\$\d\D/) {print "$&\n";}' < file2
$5

user@host:~ # perl -ne 'if (/\$\d[^\d]/) {print "$&\n";}' < file2
$5

user@host:~ # perl -ne 'if (/\$\d[^\d]/) {print "$&\n";}' < file2
$5

user@host:~ # perl -ne 'if (/\$\d\D/) {print "$&\n";}' < file2
$5

======================Just a line separation I added to this post for separation===========================
This is the word boundary:

user@host:~ # perl -ne 'if (/\b\d\b/) {print "$&\n";}' < file2
5
5
user@host:~ #

This is the second regex you mentioned, ozo:

user@host:~ # perl -ne 'if (/\$\d(?!\d)/) {print "$&\n";}' < file2
$5
user@host:~ #
 
Jeff DarlingDeveloper AnalystCommented:
This is why I like to use a group.

try this

perl -ne 'if (/(\$\d)\D/) {print "---\nYes \[$1"."]\n---\n";}else{print "---\nNo \[$1"."]\n---\n";}' < file2

 
ozoCommented:
A failing match does not set $1, so the else branch, print "---\nNo \[$1"."]\n---\n", may not be very meaningful.
(Also, the \[ seems unnecessary and the "." seems superfluous.)

\D requires a non-digit following the digit.
When -n reads from a file, there will usually be a newline at the end of $_, but it may be missing from the last line, in which case /(\$\d)\D/ cannot match a "$5" at the very end of that line.
 
camstutzAuthor Commented:
ozo, while it seems that \D matches a newline (from my tests), I read that it doesn't match white space (tabs, spaces, etc.). Do you have a link that says exactly what it does match? I usually just see "non-digit", and in my preliminary studying I was just thinking of ASCII letters (A-Z or a-z) as an example. But that is my limited experience with regex.
 
camstutzAuthor Commented:
I think I found it from a previous post on rexegg: http://www.rexegg.com/regex-quickstart.html

[\d\D]      One character that is a digit or a non-digit      [\d\D]+      Any characters, including new lines, which the regular dot doesn't match
 
ozoCommented:
\D matches any character that \d does not.  equivalent to [^\d]
 
camstutzAuthor Commented:
and not white space... (according to what I read) ... though I should try it for myself
 
ozoCommented:
whitespace is not [0-9], so it will not match \d and it will match \D
 
käµfm³d 👽Commented:
"and not white space... (according to what I read)"
Perhaps if you post what you read? Every engine that I've used which defines \D defines it as anything not a digit, which includes whitespace.
 
ozoCommented:
Perhaps you were thinking of \S, which matches non-whitespace characters (like [^\s])