Please help decipher regular expression.

Posted on 2006-04-04
Last Modified: 2012-05-05
Please be as detailed as possible.  
rex.Pattern = "^.*(;|(<|%3[Cc])[a-zA-Z]).*$"

A developer who has since left the company wrote a page with this in it and didn't comment it.  I'd like to place a comment above it so anyone else can read what it's looking for.


Thanks.
Question by:skipper68

Author Comment

by:skipper68
ID: 16371804
Also, is there a reverse regex program anywhere?

Expert Comment

by:Geert Bormans
ID: 16372131
Hi skipper68,
> "^.*(;|(<|%3[Cc])[a-zA-Z]).*$"

^ = line-start
.* = zero or more '.', which is any character, so this is zero or more of any character
| = or sign, alternation
() = grouping parentheses
%3 = content of a variable ?, likely something you picked up before
[Cc] = character class, 'C' or 'c'
[a-zA-Z] = character class, any lower case or upper case alphabetic character
$ = line-end


so this is a single line,
filled with any characters up to a ';' or a sequence (< or content of variable3) followed by a C or c
all that followed by one alphabetic character and again a bunch of any characters up to line end

I am only not sure about the %3, that is a language dependent thing... is it Perl?

Cheers!

Expert Comment

by:Geert Bormans
ID: 16372143
skipper68,
> Also, is there a reverse regex program anywhere?

not that I am aware of... what would it reverse to in your mind?

Author Comment

by:skipper68
ID: 16373903
I've been doing a little research on my own and I think I'm even more confused now...

Before I read your post, I thought it was looking like this.

^.
Starting with a dot

*(.....)
Matching any criteria between the parentheses, separated by the pipe (|) symbol

;
No semicolons

<
No Less than signs

%3[Cc])[a-zA-Z])
No 3 capital letters in a row (ie. AAA)

.
ending with a dot

*$
Containing a DollarSign

If any of these holds true, the expression returns the count of the number of violations.

Can someone confirm or clarify?

Expert Comment

by:Geert Bormans
ID: 16374156
what is the programming language?

in most languages
^ is a positional pattern for start of sentence
. is any character
meaning that a dot needs to be escaped like this \.

you have negations everywhere in your explanations
I don't see any negation sign in the regex
so I think you are far off... unless this is a weird prog language I have never done regex in...

cheers

Author Comment

by:skipper68
ID: 16374246
It's in an asp page set up like this.

I've been talking to a few people here and it's supposed to be a request.querystring and request.form check for specific characters.


Set rex = New regexp

rex.Pattern = "^.*(;|(<|%3[Cc])[a-zA-Z]).*$"

Set colMatches = rex.Execute(Request.QueryString)

If colMatches.Count > 0 Then
      Response.Write "A potentially dangerous Request.QueryString value was detected from the client."
      Response.End
End If

Set colMatches = rex.Execute(Request.Form)

If colMatches.Count > 0 Then
      Response.Write "A potentially dangerous Request.Form value was detected from the client."
      Response.End
End If
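
A side note on the %3[Cc] part, offered as a sketch rather than anything taken from the original page: %3C is the percent-encoded (URL-encoded) form of "<", so the pattern appears to catch both the literal character and its encoded form. As far as I know, Request.QueryString and Request.Form called without a key return the raw, still-encoded string, which would explain why the encoded form is checked too. The snippet uses Python's standard urllib.parse purely to illustrate the encoding; it is not part of the ASP page.

```python
from urllib.parse import quote, unquote

# "<" percent-encodes to "%3C". Hex digits in an escape are
# case-insensitive, so "%3c" decodes to the same character,
# which is why the pattern uses [Cc].
encoded = quote("<")
decoded_upper = unquote("%3C")
decoded_lower = unquote("%3c")

print(encoded, decoded_upper, decoded_lower)
```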

Accepted Solution

by:
TimYates earned 200 total points
ID: 16380457
It looks to me like it is checking for strings such as:

    hello ; there

or

    hello <there

or

    hello %3Cthere

which it considers "potentially dangerous".

I guess this is due to the chance that it could be an SQL or JavaScript injection attempt...

Tim

Expert Comment

by:ysre
ID: 16396362
I second what TimYates said :)

Ys

Author Comment

by:skipper68
ID: 16400468
Accepted prematurely....

rex.Pattern = "^.*(;|(<|%3[Cc])[a-zA-Z]).*$"

1. Starts with a dot
2. Cannot contain semi-colon, less than sign, or %3

Does the [a-zA-Z] mean that it can only contain lowercase and uppercase letters?  Numbers work

What does the dot at the end mean?

What does the *$ mean?

Expert Comment

by:Geert Bormans
ID: 16400574
.* means any character zero or more times
$ means end of line or end of inputstream
[a-zA-Z] means a or b or c or d.... or z or A or B or C or... or Z (it is a character class)
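
To make the "numbers work" observation concrete, here is a small check, written in Python's re module rather than VBScript, on the assumption that both engines treat these particular constructs the same way for input without line breaks. The helper name is_flagged and the sample strings are made up for illustration. The point is that [a-zA-Z] only constrains the single character right after "<" or "%3C"; it says nothing about the rest of the input.

```python
import re

# The pattern from the question, unchanged.
PATTERN = re.compile(r"^.*(;|(<|%3[Cc])[a-zA-Z]).*$")

def is_flagged(text: str) -> bool:
    """Return True when the pattern finds a match in text."""
    return PATTERN.search(text) is not None

# Hypothetical sample inputs, not taken from the original page.
samples = ["12345", "abc123", "a < b", "<1", "<b>", "x%3cscript", "a;b"]
results = {s: is_flagged(s) for s in samples}

for s, flagged in results.items():
    print(f"{s!r}: {flagged}")
```

Digits and plain letters pass untouched; only a semicolon, or "<" / "%3C" immediately followed by a letter, triggers a match.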

Expert Comment

by:TimYates
ID: 16400630
So it means

  <any chars>  

followed by

  ;

OR

  < or %3C or %3c followed by any letter ( eg:  <A or %3cA or %3CB )

followed by

  <any chars>  

Tim
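
The breakdown above can be written out as a commented pattern, which may also serve as a starting point for the comment skipper68 wants to put above the line. This is a sketch in Python (re.VERBOSE permits comments inside a pattern), assuming the constructs behave the same as in VBScript's RegExp for input without line breaks. The names ORIGINAL, COMMENTED and CORE and the sample strings are illustrative only. CORE drops the leading ^.* and trailing .*$, which should not change a simple yes/no check on single-line input.

```python
import re

ORIGINAL = re.compile(r"^.*(;|(<|%3[Cc])[a-zA-Z]).*$")

# Same pattern spelled out; whitespace and "#" comments are ignored.
COMMENTED = re.compile(
    r"""
    ^ .*                # start of input, then any characters
    (
        ;               # a semicolon
      |                 # OR
        ( < | %3[Cc] )  # a less-than sign, literal or URL-encoded
        [a-zA-Z]        # followed immediately by one letter
    )
    .* $                # any characters up to the end of input
    """,
    re.VERBOSE,
)

# Only the part that decides whether there is a match.
CORE = re.compile(r";|(<|%3[Cc])[a-zA-Z]")

samples = [
    "hello ; there",
    "hello < there",
    "hello <there",
    "hello %3C there",
    "hello %3Cthere",
    "plain text",
]

hits = {
    text: [bool(p.search(text)) for p in (ORIGINAL, COMMENTED, CORE)]
    for text in samples
}

for text, row in hits.items():
    print(f"{text!r}: {row}")
```

Note that "<" or "%3C" followed by a space is not flagged, because the pattern requires a letter directly after it.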

Author Comment

by:skipper68
ID: 16417061
Wonderful.  Thank you to all.