Solved

Regular Expression for filtering redundant directory notations

Posted on 2003-11-24
Last Modified: 2010-03-04
Hi,

I am coding a little script that navigates through a standard file system. Because the path of the new directory is submitted in the URL, I need a security mechanism to make sure that nobody can fake the URL to reach a directory above the one set as the script's base directory.

I need a regular expression that filters any ../-style notation from a directory path, so that redundant directory notations are impossible!

I tried the following Perl regular expression (it's PHP code, but it should be no problem for Perl coders to understand ;-):

preg_replace( "|/(.*)/../|U", "/", $cur_dir );

The problem is that the expression fails on notations that contain more than one /.., e.g. testbasedir/../../..

Question by:WebFerret
13 Comments
 

Expert Comment

by:Alex2k_developer
Try this (Perl):
$cur_dir="/../../../dir";
$cur_dir=~s/\/..//i;

Now the variable $cur_dir contains "/dir". In this code, every "/.." is replaced with "".
This is how I defend myself against hackers, too.
 

Expert Comment

by:Alex2k_developer
I'm sorry.
The valid code is:
$cur_dir="/../../../dir";
$cur_dir=~s/\/\.\.//g;     # "g" instead of "i"; dots escaped so they match a literal "."
 

Author Comment

by:WebFerret
Damn!!! I only need the expression, not the Perl code! ;-)

Sorry, but your solution does not work with PHP's preg_xxx functions. =~ s seems to be Perl syntax, not part of a Perl regular expression?!

222 points if you give me a single common Perl regular expression that works with PHP's preg_replace() function (PHP uses Perl-compatible regular expressions)!
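To illustrate the distinction raised here: s/.../.../g is Perl's substitution operator, while the text between the delimiters is the regular expression itself, which can be handed to any engine that accepts Perl-style syntax. A minimal sketch in Python (used only as a neutral stand-in, not the PHP the question asks about); the function name is hypothetical and the dots are escaped so they match literal periods:

```python
import re

# The pattern alone: a slash followed by two literal dots.
# In Perl it sits inside s/.../.../g; in PHP it would be the first
# argument of preg_replace(), wrapped in delimiters.
PATTERN = r"/\.\."

def strip_parent_refs(path: str) -> str:
    """Delete every '/..' from the path (no real path resolution)."""
    return re.sub(PATTERN, "", path)

print(strip_parent_refs("/../../../dir"))
```

Note that this only deletes the ".." segments; it does not cancel them against the directories that precede them.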
 
LVL 51

Expert Comment

by:ahoffmann
s#\.\./##g
 

Author Comment

by:WebFerret
Your regular expression only reduces all repeating /.. to one /.. It would shorten

   /part1/part2/../..

to

   /part1/part2/..

But I don't want to cut off the end of the string; I want to eliminate the redundant directory notations from the path. I want the "true" path, which in this example has to be:

/    (base directory)

because /part1/part2 and /../.. cancel each other out!

I hope it is now clearer what I want! Is there still someone who can help me!? :-)
 
LVL 51

Expert Comment

by:ahoffmann
What you want is not a regex but a regex substitution.
What do you expect for: /part1/part2/../../xx/../../..
 
LVL 3

Expert Comment

by:terageek
# Get rid of "/./" to handle /./..
$cur_dir =~ s{/\./}{/}g;

# Get rid of "//"
$cur_dir =~ s{//}{/}g;

# Simplify /dir/../ patterns to /
my $temp = "";
while ($temp ne $cur_dir) {
    $temp = $cur_dir;
    $cur_dir =~ s{/[^./][^/]*/\.\.(/|$)}{/};    # Handle most cases (including one-character names)
    $cur_dir =~ s{/\.[^./][^/]*/\.\.(/|$)}{/};  # Handle directories starting with "."
    $cur_dir =~ s{/\.\.[^/]+/\.\.(/|$)}{/};     # Handle the last few cases with dirs starting with ".."
}

# Clean up a trailing / that may have been added
$cur_dir =~ s{/$}{};

Since you need to backtrack after each replacement, you can't use the "g" modifier; you must use a loop instead.
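The loop-until-nothing-changes idea can be sketched with a single pattern. This is an illustration in Python rather than Perl, and the names PAIR and collapse_pairs are hypothetical; a negative look-ahead keeps "." and ".." from being treated as a directory name, which replaces the three separate substitutions:

```python
import re

# One "/name/.." pair, where name is any segment other than "." or "..".
# (?!\.{1,2}(?:/|$)) rejects "." and ".." as the directory name.
PAIR = re.compile(r"/(?!\.{1,2}(?:/|$))[^/]+/\.\.(?=/|$)")

def collapse_pairs(path: str) -> str:
    """Remove one 'dir/..' pair per pass until nothing changes."""
    while True:
        reduced = PAIR.sub("", path, count=1)
        if reduced == path:
            return path
        path = reduced

print(collapse_pairs("/a/b/../c"))
```

Any ".." segments still present in the result had no directory left to cancel against, so a caller would presumably treat them as an attempt to climb above the base directory.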
 

Author Comment

by:WebFerret
@ahoffmann:

I expect a function that turns

/part1/part2/../../xx/../../..

into the result path

"/"

or ε (the empty string!)

(The deepest allowed path is the base directory "/", which is not necessarily the root directory of the file system. In my case it is the directory from which the script is called.)
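The behaviour described here (cancel each ".." against the directory before it, and never climb above the base) does not strictly need a regular expression. A minimal sketch in Python, assuming "/"-separated paths and a hypothetical function name normalize; it is an alternative technique, not one proposed in this thread:

```python
def normalize(path: str) -> str:
    """Resolve '.' and '..' segments, never climbing above the base '/'."""
    stack = []
    for segment in path.split("/"):
        if segment in ("", "."):
            continue              # empty (from '//') or current-dir segment
        if segment == "..":
            if stack:
                stack.pop()       # cancel the previous directory
            # else: already at the base; ignore the attempt to go higher
        else:
            stack.append(segment)
    return "/" + "/".join(stack)

print(normalize("/part1/part2/../../xx/../../.."))
```

Silently ignoring the surplus ".." is one design choice; rejecting such a request outright would be an equally reasonable policy.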
 
LVL 28

Expert Comment

by:FishMonger
>> deepest allowed path is the base directory "/" - not necessarily the root directory

I don't know how PHP "reads" path notation, but on Unix systems / is the root directory and ./ is the current working directory (or "base directory").

So far, each of the solutions removes the ../ relative path notation but may leave you with a well-formed path to a directory that does not exist. Also, nothing has been said about the user entering an absolute path that is higher up the directory tree than the previous path. Sticking with the relative path problem (as outlined in your last post), you could do this (using Perl syntax, since I don't know PHP):

$cur_dir = './' if ($cur_dir =~ /\.\./);

or this:

$cur_dir =~ s#^.*?\.\./.*$#./#;

If you need to test for absolute paths, then we'll need to approach this from a slightly different angle.
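One way to cover absolute paths as well is to resolve the requested path first and then check that the result still lies inside the base directory. A rough sketch in Python under that assumption (the function name is_inside_base is hypothetical, and this is not something the thread itself proposes):

```python
import os

def is_inside_base(base_dir: str, requested: str) -> bool:
    """True if `requested`, resolved relative to base_dir, stays inside it."""
    base = os.path.realpath(base_dir)
    # os.path.join discards `base` when `requested` is absolute, so an
    # absolute request is resolved as-is and then compared against base.
    target = os.path.realpath(os.path.join(base, requested))
    return os.path.commonpath([base, target]) == base
```

Because the check runs on the resolved path, it also catches symbolic links that point outside the base directory, which a purely textual filter cannot see.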
 
LVL 2

Expert Comment

by:icrf
In Perl-speak, I'd use something like this:

my $path = '/part1/part2/../../xx/../../../';
1 while($path =~ s#(?:[^/]+/)?\.\./##);

And in PHP, it'd look more like this:

$path = '/part1/part2/../../xx/../../';
// Remove one "dir/../" (or a bare "../") per pass until the path stops changing
while ($path !== ($tmp = preg_replace('#(?:[^/]+/)?\.\./#', '', $path, 1)))
        $path = $tmp;

And people say Perl isn't beautiful. :) If you have a trailing / on the path, it leaves you with that, otherwise, it ends with the empty string.
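For readers who want to experiment with this pattern, here is a sketch of the same loop ported to Python (a stand-in for the Perl and PHP versions; STEP and reduce_path are hypothetical names). It runs the pattern on one path with a trailing slash and one without:

```python
import re

# The pattern from the post: an optional "segment/" followed by "../"
STEP = re.compile(r"(?:[^/]+/)?\.\./")

def reduce_path(path: str) -> str:
    """Apply one substitution at a time until the path stops changing."""
    while True:
        reduced = STEP.sub("", path, count=1)
        if reduced == path:
            return path
        path = reduced

print(reduce_path("/part1/part2/../../xx/../../../"))  # trailing slash
print(reduce_path("/part1/part2/../../xx/../../.."))   # no trailing slash
```

In this port the first call prints "/" and the second prints "/..": because the pattern requires a slash after the two dots, a final ".." without a trailing slash is left in place.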
 
LVL 51

Expert Comment

by:ahoffmann
icrf, I came up with a similar regex:
  while (s#[^/.]+[/]+\.\./##){}

but it suffers from the same problem as your suggestion; try it with
   $path = '/part1/part2/../../xx/../../..';

I think I need to go to bed with J. Friedl's regular expressions book and a few beers ...
There is probably something better using Perl's look-ahead; more tomorrow ..
 
LVL 51

Accepted Solution

by:
ahoffmann earned 222 total points
Perl goes here:
   while (s#(?:[^/]+[/]+)+\.\./?##){}

Hopefully it's similar in PHP; see icrf's suggestion.
 

Author Comment

by:WebFerret
Okay...
ahoffmann's preg expression works perfectly! :-)
Thank you to icrf too!

The working PHP solution for eliminating redundant directories in paths is:

$path = '/part1/part2/../../xx/../../..';
while ($path !== ($tmp = preg_replace('#(?:[^/]+[/]+)+\.\./?#', '', $path, 1))) $path = $tmp;
echo $path;
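As a way to test the accepted pattern on further inputs, the loop can be ported to Python (a stand-in sketch, not the PHP above; ACCEPTED and reduce_path are hypothetical names). The second input, /a/b/../c, is an extra case that does not appear in the thread:

```python
import re

# The accepted pattern: one or more "segment/" groups, then ".." and an optional "/"
ACCEPTED = re.compile(r"(?:[^/]+[/]+)+\.\./?")

def reduce_path(path: str) -> str:
    """Apply one substitution at a time until the path stops changing."""
    while True:
        reduced = ACCEPTED.sub("", path, count=1)
        if reduced == path:
            return path
        path = reduced

print(reduce_path("/part1/part2/../../xx/../../.."))
print(reduce_path("/a/b/../c"))
```

In this port the first call prints "/", as wanted. The second prints "/c" rather than "/a/c", because the repeated group consumes every directory before the "..", not just the nearest one. The result stays inside the base directory, but anyone who needs the exact resolved path may want to check such cases before relying on the pattern.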