Regexp for url address

Jan Vojtech Vanicek
Hi programmers,
I have a question about Perl regular expressions.

For example, I have this URL:
http://google.com/search?q=regexp

I need an expression that is true if the data contains "http://google." and does not contain "regexp".

Is it possible to do that somehow?
Most Valuable Expert 2011
Top Expert 2015

Commented:
Based on what you described, you should be able to use a negative lookahead.

e.g.

$input =~ m/http:\/\/google\.(?!.*regexp)/;
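
A minimal sketch of how this negative lookahead behaves. It is written in Python only as a convenient test bed, on the assumption that the lookahead syntax used here means the same thing in Python's `re` module as in Perl. The helper name `fires` and the sample URLs are made up for illustration.

```python
import re

# Same pattern as above, without the Perl m/.../ delimiters.
# (?!.*regexp) rejects the match if "regexp" occurs anywhere after "http://google."
PATTERN = re.compile(r"http://google\.(?!.*regexp)")

def fires(url):
    """True when the URL contains 'http://google.' with no 'regexp' after it."""
    return PATTERN.search(url) is not None

print(fires("http://google.com/search?q=regexp"))  # False
print(fires("http://google.com/search?q=perl"))    # True
print(fires("http://example.com/search?q=perl"))   # False
```

Note that the lookahead only inspects text after "http://google.", which is enough for the URL in the question.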


So if I understand your question correctly, you're looking for URLs that start with http://google. but do not contain a particular search parameter value? Is this for search string injection or something?

You should be able to accomplish the above stated goals with the following regexp:

/http:\/\/google\..*(?<!regexp)/


If you're wanting to do something more advanced, then we'll need to know more about what you're trying to do exactly, perhaps with some code snippets. :-)
Jan Vojtech Vanicek, IT Specialist

Author

Commented:
I'm trying to set up a rule in my firewall. Its content filter supports regular expressions based on Perl.

I need a rule that fires an action when I'm accessing some site, e.g. Google, and a certain string is not in the query.

Most Valuable Expert 2011
Top Expert 2015

Commented:
Derek Jensen's pattern won't work because it roughly checks that the string does not end with "regexp", not that it does not contain such.
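
A rough illustration of that point, again using Python's `re` module as a stand-in for a Perl-style engine (an assumption; the firewall's engine may differ). The function names and sample URLs are hypothetical. When the pattern must cover the whole string, the lookbehind only rejects URLs that end in "regexp". When the pattern may match a substring, backtracking lets it match almost anything.

```python
import re

# The lookbehind pattern from the earlier post, without delimiters.
LOOKBEHIND = re.compile(r"http://google\..*(?<!regexp)")

def whole_string_match(url):
    """Match only if the pattern consumes the entire URL."""
    return LOOKBEHIND.fullmatch(url) is not None

def unanchored_match(url):
    """Match if the pattern is found anywhere in the URL."""
    return LOOKBEHIND.search(url) is not None

# Ends with "regexp": rejected when the whole string must match.
print(whole_string_match("http://google.com/search?q=regexp"))        # False
# Contains "regexp" but does not end with it: accepted anyway.
print(whole_string_match("http://google.com/search?q=regexp&hl=en"))  # True
# Unanchored, .* gives back one character and the lookbehind succeeds.
print(unanchored_match("http://google.com/search?q=regexp"))          # True
```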
Jan Vojtech Vanicek, IT Specialist

Author

Commented:
Maybe I implemented it badly, but none of the solutions is working :-( and I also can't debug it; it simply doesn't work.
Most Valuable Expert 2011
Top Expert 2015
Commented:
What I gave you above was Perl code. I thought that's what you were after when I read your original question. The part of that code that is just the pattern is this:

http:\/\/google\.(?!.*regexp)


@kaufmed,
Indeed you are correct. I should've trusted my gut & posted what I assumed he was asking for, as it was much closer to what he actually wanted than what I ended up posting. Let me see if I can duplicate it...

https?:\/\/(www\.)?google\.com\/(.+?)(\?|#)(?=.*regexp)


Ok so not exactly the same, but I think I like this version better, as it's more flexible. :-)

So in the (?=.*regexp) portion of the regular expression, you would put what you're looking for, or not looking for, as the case may be, in place of "regexp". If you want a negated search (the string must not be present), simply replace

(?=.*

with

(?!.*
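
To make the swap concrete, here is a small sketch that builds both variants from the same prefix. Python's `re` module is used as an assumed stand-in for the Perl-style engine, and the names `BASE`, `CONTAINS` and `LACKS` are illustrative only.

```python
import re

# Shared prefix from the pattern above, without escaping the slashes.
BASE = r"https?://(www\.)?google\.com/(.+?)(\?|#)"

CONTAINS = re.compile(BASE + r"(?=.*regexp)")  # "regexp" must follow the ? or #
LACKS = re.compile(BASE + r"(?!.*regexp)")     # "regexp" must not follow

with_term = "http://google.com/search?q=regexp"
without_term = "https://www.google.com/search?q=perl"

print(CONTAINS.search(with_term) is not None)     # True
print(LACKS.search(with_term) is not None)        # False
print(CONTAINS.search(without_term) is not None)  # False
print(LACKS.search(without_term) is not None)     # True
```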
use 2 rules

google.com.*allowedstuff -> ALLOW
google.com -> YOUR_SPECIFIC_ACTION

negative lookaheads should work. maybe your firewall does not handle them
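
The two-rule idea above can be sketched as an ordered rule list where the first matching rule wins. This assumes the firewall evaluates rules top to bottom and stops at the first hit, which is not confirmed anywhere in this thread. The `decide` function and the `NO_MATCH` default are hypothetical, and the dot in "google.com" is escaped here so it matches only a literal dot.

```python
import re

# Ordered rules: the more specific ALLOW rule must come first.
RULES = [
    (re.compile(r"google\.com.*allowedstuff"), "ALLOW"),
    (re.compile(r"google\.com"), "YOUR_SPECIFIC_ACTION"),
]

def decide(url, default="NO_MATCH"):
    """Return the action of the first rule whose pattern is found in the URL."""
    for pattern, action in RULES:
        if pattern.search(url):
            return action
    return default

print(decide("http://google.com/search?q=allowedstuff"))  # ALLOW
print(decide("http://google.com/search?q=other"))         # YOUR_SPECIFIC_ACTION
print(decide("http://example.com/"))                      # NO_MATCH
```

This avoids lookaheads entirely, which may help if the firewall's engine does not support them.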
Jan Vojtech Vanicek, IT Specialist

Author

Commented:
Sorry, this feature is poorly supported in my firewall. I haven't managed to get even simpler regexps running, so I can't tell who is right.
try simple eregs with and without delimiters ( "/.*/" vs ".*" )

note that your eregs may be anchored (i.e. implicitly match the whole string, so when you type "stuff.*otherstuff", the firewall treats it as "^stuff.*otherstuff$" rather than ".*stuff.*otherstuff.*")
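
A small sketch of the difference between the two behaviours. Python's `re.fullmatch` stands in for an engine that implicitly anchors patterns, and `re.search` for one that does not. Which of the two the firewall does is unknown, and the function names and sample text are made up.

```python
import re

RULE = r"stuff.*otherstuff"
TEXT = "some stuff and otherstuff here"

def implicit_anchor_match(pattern, text):
    """Model an engine that treats the pattern as ^pattern$."""
    return re.fullmatch(pattern, text) is not None

def substring_match(pattern, text):
    """Model an engine that looks for the pattern anywhere in the text."""
    return re.search(pattern, text) is not None

print(implicit_anchor_match(RULE, TEXT))                # False
print(substring_match(RULE, TEXT))                      # True
# Padding with .* makes the anchored engine behave like the substring one.
print(implicit_anchor_match(".*" + RULE + ".*", TEXT))  # True
```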

start with the simplest possible ereg such as ".*" and make it gradually more complex

i have no idea what firewall you are using, but there should be some documentation

good luck
