Solved

Regular Expression Needed

Posted on 2006-11-29
Last Modified: 2010-03-05
I am looking for a Perl regular expression that extracts part of an HTTP request line.

Basically take the following request examples:

1) GET /index.html HTTP/1.1
2) GET /dir/test.asp?param=value HTTP/1.1
3) POST /dir/dir/dir/test.php HTTP/1.1
4) GET /dir/index.js HTTP/1.1
5) POST /dir/dir/post.asp?param=value HTTP/1.1
6) GET /dir/images/index.jpg HTTP/1.1
7) GET / HTTP/1.1
8) GET /test/script HTTP/1.1


I would like a regular expression that gives me the actual page name. For each request, I would like the following to end up in capture group 1.

1) index.html
2) test.asp
3) test.php
4) index.js
5) post.asp
6) index.jpg
7) /
8) script

I don't care about the number or names of directories unless there is no specific resource name.
Question by:mikedgibson
6 Comments
 
LVL 84

Expert Comment

by:ozo
ID: 18039785
Are the "1)", "2)" prefixes part of the request examples and of the output you want?
 
LVL 2

Expert Comment

by:jingks03
ID: 18039863
# The following regex works for your test data. The only catch is that a lone '/' comes back as an empty string.
# I personally don't know how to return the '/' in that case with a regex alone; with an if () else () I could.

# If you DO NOT want the 1), 2), etc.:
if (m#/([\w.]*)(?:\?[\w=]*)?\sHTTP/1\.1#) { print "$1\n"; }

# If you do want them:
if (m#^(\d+\)\s).*/([\w.]*)(?:\?[\w=]*)?\sHTTP/1\.1#) { print "$1$2\n"; }
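
For the lone '/' case, here is a minimal sketch of the if/else idea mentioned in the comments. The helper name page_name is hypothetical, and the pattern is a variant that uses [\w=]* so the query string may be any length. It assumes the request line always ends in " HTTP/1.1".

```perl
use strict;
use warnings;

# Hypothetical helper: returns the page name, or "/" when the path
# has no resource name after its last slash.
sub page_name {
    my ($request) = @_;
    if ($request =~ m#/([\w.]*)(?:\?[\w=]*)?\sHTTP/1\.1#) {
        if (length $1) {
            return $1;     # e.g. "index.html"
        }
        else {
            return '/';    # e.g. "GET / HTTP/1.1"
        }
    }
    return;                # not a request line this pattern understands
}

print page_name($_), "\n" for (
    'GET /index.html HTTP/1.1',
    'GET /dir/test.asp?param=value HTTP/1.1',
    'GET / HTTP/1.1',
);
```

This should print index.html, test.asp and / on separate lines.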
 
LVL 2

Author Comment

by:mikedgibson
ID: 18039894
No, I do not want the 1) and 2) included. I was just showing which output corresponds to which input.
 
LVL 2

Expert Comment

by:jingks03
ID: 18039930
In the case of mine, you may have to edit the [\w\=] class to accept whatever is acceptable in "param=value".
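
One way to widen that class, as a rough sketch only: treat everything from the "?" up to the next whitespace as the query string, using \S*. That is an assumption about what your requests look like, and the second test request below is made up for illustration.

```perl
use strict;
use warnings;

# Assumption: everything between "?" and the next whitespace is query string.
my $re = qr#/([\w.]*)(?:\?\S*)?\sHTTP/1\.1#;

for my $request (
    'GET /dir/test.asp?param=value HTTP/1.1',
    'GET /dir/test.asp?a=1&b=two%20words HTTP/1.1',    # made-up example
) {
    print "$1\n" if $request =~ $re;
}
```

Both requests should print test.asp. The page name itself is still limited to word characters and dots, so a name containing a hyphen would need that class widened too.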
 
LVL 2

Author Comment

by:mikedgibson
ID: 18040107
It doesn't need to be just a regex. If you need to use if () else (), that is fine as well.
 
LVL 84

Accepted Solution

by:
ozo earned 250 total points
ID: 18040116
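while( <DATA> ){
    # (?=/)       start at the first "/" of the path
    # (?:\S*/)?   optionally skip everything up to the last "/"
    # ([^?\s]+)   capture the name, stopping at "?" or whitespace
    print "$1\n" if m#(?=/)(?:\S*/)?([^?\s]+)[\s?]#;
}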
__DATA__
GET /index.html HTTP/1.1
GET /dir/test.asp?param=value HTTP/1.1
POST /dir/dir/dir/test.php HTTP/1.1
GET /dir/index.js HTTP/1.1
POST /dir/dir/post.asp?param=value HTTP/1.1
GET /dir/images/index.jpg HTTP/1.1
GET / HTTP/1.1
GET /test/script HTTP/1.1
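
To check the pattern against the outputs requested in the question, a small table-driven sketch like the following may help. The variable names $page_re and @cases are illustrative only. The pattern is the one from the script above.

```perl
use strict;
use warnings;

# The pattern from the script above, compiled once for reuse.
my $page_re = qr#(?=/)(?:\S*/)?([^?\s]+)[\s?]#;

# Request line => page name asked for in the question.
my @cases = (
    [ 'GET /index.html HTTP/1.1'                   => 'index.html' ],
    [ 'GET /dir/test.asp?param=value HTTP/1.1'     => 'test.asp'   ],
    [ 'POST /dir/dir/dir/test.php HTTP/1.1'        => 'test.php'   ],
    [ 'GET /dir/index.js HTTP/1.1'                 => 'index.js'   ],
    [ 'POST /dir/dir/post.asp?param=value HTTP/1.1' => 'post.asp'  ],
    [ 'GET /dir/images/index.jpg HTTP/1.1'         => 'index.jpg'  ],
    [ 'GET / HTTP/1.1'                             => '/'          ],
    [ 'GET /test/script HTTP/1.1'                  => 'script'     ],
);

for my $case (@cases) {
    my ($request, $expected) = @$case;
    my ($got) = $request =~ $page_re;    # list context returns capture group 1
    $got = '(no match)' unless defined $got;
    printf "%-4s %s\n", ($got eq $expected ? 'ok' : 'FAIL'), $request;
}
```

One thing to watch: because \S* is greedy, a query string that itself contains a slash (say ?u=/x/y) would move the capture to whatever follows the last slash. None of the sample requests do this.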

Question has a verified solution.