[2 days left] What’s wrong with your cloud strategy? Learn why multicloud solutions matter with Nimble Storage.Register Now

x
?
Solved

Regular Expression Needed

Posted on 2006-11-29
6
Medium Priority
?
198 Views
Last Modified: 2010-03-05
I am looking for the Perl Regular Expression to strip part of an HTTP request out.

Basically take the following request examples:

1) GET /index.html HTTP/1.1
2) GET /dir/test.asp?param=value HTTP/1.1
3) POST /dir/dir/dir/test.php HTTP/1.1
4) GET /dir/index.js HTTP/1.1
5) POST /dir/dir/post.asp?param=value HTTP/1.1
6) GET /dir/images/index.jpg HTTP/1.1
7) GET / HTTP/1.1
8) GET /test/script HTTP/1.1


I would like a regular expression that would give me the actual page name. For each one I would like the following to be in Group 1.

1) index.html
2) test.asp
3) test.php
4) index.js
5) post.asp
6) index.jpg
7) /
8) script

I don't care about the number or names of directories unless there is no specific resource name.
0
Comment
Question by:mikedgibson
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 2
  • 2
  • 2
6 Comments
 
LVL 84

Expert Comment

by:ozo
ID: 18039785
Is the
1)
2)
part of the request example and the output you want?
0
 
LVL 2

Expert Comment

by:jingks03
ID: 18039863
# the following regex works for your test data.  The only thing is the single '/' is returned as a blank.
# i personally don't know how to return the '/' in that case with only a regex, if I used some if () else() then i could

# if you DO NOT want the 1), 2) etc...
if (m#/([\w\.]*)(?:\?[\w\=])?\sHTTP/1.1#) { print "$1\n"; }

# if you do want them
if (m#^(\d+\)\s).*/([\w\.]*)(?:\?[\w\=])?\sHTTP/1.1#) { print "$1$2\n"; }
0
 
LVL 2

Author Comment

by:mikedgibson
ID: 18039894
No I do not want the 1) and 2) included I was just showing that that output corresponded to the matching input.
0
Concerto Cloud for Software Providers & ISVs

Can Concerto Cloud Services help you focus on evolving your application offerings, while delivering the best cloud experience to your customers? From DevOps to revenue models and customer support, the answer is yes!

Learn how Concerto can help you.

 
LVL 2

Expert Comment

by:jingks03
ID: 18039930
in the case of mine, you may have to edit the [\w\=] to accept whatever is acceptable in "param=value"
0
 
LVL 2

Author Comment

by:mikedgibson
ID: 18040107
It doesn't need to be just a regex .. If you ned to use if() else () then that is fine as well
0
 
LVL 84

Accepted Solution

by:
ozo earned 1000 total points
ID: 18040116
while( <DATA> ){
    print "$1\n" if m#(?=/)(?:\S*/)?([^?\s]+)[\s?]#
}
__DATA__
GET /index.html HTTP/1.1
GET /dir/test.asp?param=value HTTP/1.1
POST /dir/dir/dir/test.php HTTP/1.1
GET /dir/index.js HTTP/1.1
POST /dir/dir/post.asp?param=value HTTP/1.1
GET /dir/images/index.jpg HTTP/1.1
GET / HTTP/1.1
GET /test/script HTTP/1.1
0

Featured Post

Hire Technology Freelancers with Gigs

Work with freelancers specializing in everything from database administration to programming, who have proven themselves as experts in their field. Hire the best, collaborate easily, pay securely, and get projects done right.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

I've just discovered very important differences between Windows an Unix formats in Perl,at least 5.xx.. MOST IMPORTANT: Use Unix file format while saving Your script. otherwise it will have ^M s or smth likely weird in the EOL, Then DO NOT use m…
A year or so back I was asked to have a play with MongoDB; within half an hour I had downloaded (http://www.mongodb.org/downloads),  installed and started the daemon, and had a console window open. After an hour or two of playing at the command …
Explain concepts important to validation of email addresses with regular expressions. Applies to most languages/tools that uses regular expressions. Consider email address RFCs: Look at HTML5 form input element (with type=email) regex pattern: T…
Six Sigma Control Plans

656 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question