Solved

Regexp for url address

Posted on 2014-03-05
Last Modified: 2014-04-06
Hi programmers,
I have a question about Perl regular expressions.

Take this URL, for example:
http://google.com/search?q=regexp

I need an expression that matches if the data contains "http://google." and does not contain "regexp".

Is it possible to do that somehow?
Question by:Vanikcz
11 Comments
 
LVL 74

Expert Comment

by:käµfm³d 👽
ID: 39907161
Based on what you described, you should be able to use a negative lookahead.

e.g.

$input =~ m/http:\/\/google\.(?!.*regexp)/;
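
A quick way to check this pattern outside the firewall is a small script. The sketch below uses Python rather than Perl, on the assumption that any engine with Perl-style lookaheads behaves the same way here; the function name `fires` and the sample URLs are made up for illustration.

```python
import re

# Same pattern as above, without Perl's escaped slashes.
# (?!.*regexp) fails if "regexp" appears anywhere after "http://google."
PATTERN = re.compile(r"http://google\.(?!.*regexp)")

def fires(url: str) -> bool:
    """Return True when url contains "http://google." with no "regexp" after it."""
    return PATTERN.search(url) is not None

print(fires("http://google.com/search?q=regexp"))  # False
print(fires("http://google.com/search?q=perl"))    # True
print(fires("http://example.com/"))                # False
```

Note that the lookahead only inspects the text after "http://google."; a "regexp" that appears earlier in the data would not be noticed.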

 
LVL 9

Expert Comment

by:Derek Jensen
ID: 39907197
So if I understand your question correctly, you're looking for URLs that start with http://google but do not contain a particular search parameter value? Is this for search string injection or something?

You should be able to accomplish the above stated goals with the following regexp:

/http:\/\/google\..*(?<!regexp)/

If you're wanting to do something more advanced, then we'll need to know more about what you're trying to do exactly, perhaps with some code snippets. :-)
 
LVL 6

Author Comment

by:Vanikcz
ID: 39907217
I'm trying to set up a rule in my firewall. Its content filter supports Perl-based regular expressions.

I need a rule that fires an action when I'm accessing some site, e.g. Google, and a certain string isn't in the query.
 
LVL 74

Expert Comment

by:käµfm³d 👽
ID: 39907277
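Derek Jensen's pattern won't work: the lookbehind only tests the text immediately before the point where ".*" stops (roughly, that the string does not end with "regexp"), not that the string does not contain it. And because ".*" can back off by one character, even a string that ends with "regexp" still matches.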
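
To see the difference between the two patterns, here is a minimal sketch in Python (an assumption; the thread is about Perl-style patterns, and Python's `re` handles these two the same way). The variable names are illustrative only.

```python
import re

# Lookbehind version from the earlier comment, and the lookahead version.
lookbehind = re.compile(r"http://google\..*(?<!regexp)")
lookahead = re.compile(r"http://google\.(?!.*regexp)")

url = "http://google.com/search?q=regexp"

# ".*" first runs to the end, the lookbehind fails there, so ".*" gives
# back one character and the lookbehind then succeeds.
m = lookbehind.search(url)
print(m.group(0) if m else None)  # http://google.com/search?q=regex

# The lookahead scans everything after "http://google." and rejects the URL.
print(lookahead.search(url))      # None
```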
 
LVL 6

Author Comment

by:Vanikcz
ID: 39907330
Maybe I implemented it badly, but none of the solutions is working :-( and I also can't debug it; it simply doesn't work.
 
LVL 6

Author Comment

by:Vanikcz
ID: 39907338
 
LVL 74

Assisted Solution

by:käµfm³d 👽
käµfm³d   👽 earned 250 total points
ID: 39907350
What I gave you above was Perl code. I thought that's what you were after when I read your original question. The part of that code that is just the pattern is this:

http:\/\/google\.(?!.*regexp)

 
LVL 9

Accepted Solution

by:Derek Jensen
Derek Jensen earned 250 total points
ID: 39907748
@kaufmed,
Indeed you are correct. I should've trusted my gut & posted what I assumed he was asking for, as it was much closer to what he actually wanted than what I ended up posting. Let me see if I can duplicate it...

https?:\/\/(www\.)?google\.com\/(.+?)(\?|#)(?=.*regexp)

Ok so not exactly the same, but I think I like this version better, as it's more flexible. :-)

So in the (?=.*regexp) portion of the regular expression, you would put what you're looking for, or not looking for, as the case may be, in place of "regexp". If you want a negated search (match only when the string is absent), simply replace

(?=.*

with

(?!.*
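
As a rough illustration of the swap described above, the sketch below builds both variants of the pattern in Python (an assumption; the lookahead syntax is the same as in Perl). The helper `build` and the sample URLs are hypothetical.

```python
import re

def build(term: str, negate: bool = False):
    """Compile the pattern above with a positive or negative lookahead for term."""
    look = "(?!.*" if negate else "(?=.*"
    return re.compile(
        r"https?://(www\.)?google\.com/(.+?)(\?|#)" + look + re.escape(term) + ")"
    )

has_term = build("regexp")
lacks_term = build("regexp", negate=True)

print(bool(has_term.search("http://google.com/search?q=regexp")))    # True
print(bool(lacks_term.search("http://google.com/search?q=regexp")))  # False
print(bool(lacks_term.search("https://www.google.com/search?q=perl")))  # True
```

Two caveats follow from how the pattern is written. It only matches URLs that have a "?" or "#" after the path, so a bare "http://google.com/search" matches neither variant. And because "(.+?)" may grow past the first "?", the negated variant still matches a URL such as "http://google.com/search?q=regexp#frag", where nothing after the "#" contains the term.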
 
LVL 26

Expert Comment

by:skullnobrains
ID: 39914464
use 2 rules

google.com.*allowedstuff -> ALLOW
google.com -> YOUR_SPECIFIC_ACTION

negative lookaheads should work. maybe your firewall does not handle them
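
The two-rule idea can be sketched as follows, assuming the firewall evaluates rules top to bottom and stops at the first match (that ordering is an assumption, so check the firewall's documentation). The rule table, the `decide` function and the "NO_MATCH" default are illustrative; the dot in "google.com" is escaped here so it only matches a literal dot.

```python
import re

# Ordered rules: the more specific "allowed" rule must come first.
RULES = [
    (re.compile(r"google\.com.*allowedstuff"), "ALLOW"),
    (re.compile(r"google\.com"), "YOUR_SPECIFIC_ACTION"),
]

def decide(url: str, default: str = "NO_MATCH") -> str:
    """Return the action of the first rule whose pattern is found in url."""
    for pattern, action in RULES:
        if pattern.search(url):
            return action
    return default

print(decide("http://google.com/search?q=allowedstuff"))  # ALLOW
print(decide("http://google.com/search?q=other"))         # YOUR_SPECIFIC_ACTION
print(decide("http://example.com/"))                      # NO_MATCH
```

This avoids lookaheads entirely, which helps if the firewall's regex engine does not support them.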
 
LVL 6

Author Comment

by:Vanikcz
ID: 39981076
Sorry, this feature is poorly supported in my firewall; I haven't got even simpler regexps running. So I can't tell who is right.
 
LVL 26

Expert Comment

by:skullnobrains
ID: 39981115
try simple eregs with and without delimiters ( "/.*/" VS ".*" )

note that your eregs may be anchored (i.e. implicitly match the whole string, so when you type "stuff.*otherstuff", the firewall treats it as "^stuff.*otherstuff$" rather than ".*stuff.*otherstuff.*")

start with the simplest possible ereg such as ".*" and make it gradually more complex
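
The effect of implicit whole-string matching can be reproduced in Python, where `re.search` looks for the pattern anywhere and `re.fullmatch` requires it to cover the entire input. This is only an analogy for how a firewall might behave; the sample request line and function names are made up.

```python
import re

data = "GET http://google.com/search?q=perl HTTP/1.1"
pattern = r"google.*perl"

def found_anywhere(p: str, s: str) -> bool:
    """Unanchored: the pattern may match any part of s."""
    return re.search(p, s) is not None

def matches_whole(p: str, s: str) -> bool:
    """Implicitly anchored: the pattern must cover all of s."""
    return re.fullmatch(p, s) is not None

print(found_anywhere(pattern, data))               # True
print(matches_whole(pattern, data))                # False
print(matches_whole(".*" + pattern + ".*", data))  # True
```

If the firewall behaves like the second case, wrapping the pattern in ".*" on both sides is the usual workaround.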

i have no idea what firewall you are using, but there should be some documentation

good luck
