Regulary Expression for filtering redundant directory notations

Hi,

i am coding a little script that navigates through a standard file system. As I submit the path of the new directory I need a security mechanism to make sure that nobody can fake the url indexing a directory that is higher than the directory set as base directory for the script.

I need a regulary expression that filters any ../ - style notation from directory path. so that redundant directory notations are impossible!

I tried the following Perl Regulary Expression: (it's php code but should be no problem to understand for Perl coders ;-)

preg_replace( "|/(.*)/../|U", "/", $cur_dir );

The problem is that the expression fails notations that have more than one /.. e.g:  testbasedir/../../..

WebFerretAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Alex2k_developerCommented:
Try so (perl):
$cur_dir="/../../../dir";
$cur_dir=~s/\/..//i;

Now, variable $cur_dir contain "/dir". In this part of a code all "/.." are replaced on "".
Thus I too am defended from hackers.
0
Alex2k_developerCommented:
I'm sorry.
Valid code is:
$cur_dir="/../../../dir";
$cur_dir=~s/\/..//g;     # "g" instead of "i"
0
WebFerretAuthor Commented:
Damn!!! I only need the expression, not the Perl code! ;-)

Sorry, but your solution does not work in php PREG_xxx-command. ~s seems to be PERL but not part of a PERL regulary expression?!

222 if you give me a single common Perl RegExpr that works with php preg_replace( )-function (PHP uses the PERL module)!
0
Cloud Class® Course: Python 3 Fundamentals

This course will teach participants about installing and configuring Python, syntax, importing, statements, types, strings, booleans, files, lists, tuples, comprehensions, functions, and classes.

ahoffmannCommented:
s#\.\./##g
0
WebFerretAuthor Commented:
Your regulary expression only reduces all repeating /.. to one /.. It would shorten

   /part1/part2/../..

to

   /part1/part2/..

But I don't want to cut the end of the string but kill the redundant directories notations from the path. I want the "true" path in this example it has to be:

/    (base directory)

as /part1/part2 and /../.. are compensating each other!

hope now it is now more clearly what I want! Is there still someone who can help me!? :-)
0
ahoffmannCommented:
what you want is not a regex but a regex-substitution:
 what do you expect for: /part1/part2/../../xx/../../..
0
terageekCommented:
#Get rid of "/./"  to handle /./..
$cur_dir =~ s{/\./}{/}g;

#Get rid of "//"
$cur_dir =~ s{//}{/}g;

#Simplify /dir/../ patterns to /
my ($temp) = "";
while ($temp ne $cur_dir) {
    $temp = $cur_dir;
    $cur_dir =~ s{/[^./][^/]+/\.\.(/|$)}{/};  # Handle most cases
    $cur_dir =~ s{/\.[^.][^/]+/\.\.(/|$)}{/}; # Handle most cases with directories starting with "."
    $cur_dir =~ s{/\.\.[^/]+/\/.\.(/|$)}{/};   # Handle the last few cases with dirs starting with ".."
}

# Clean up a trailing / that may have been added
$cur_dir =~ s{/$}{};

Since you need to backtrace after each replacement, you can't use a "g" modifyer, but instead you must use a loop.
0
WebFerretAuthor Commented:
@ahoffmann:

expecting function that generates from

/part1/part2/../../xx/../../..

the result path is

"/"

or ε (empty string!)

(deepest allowed path is the base directory "/" - not nescessarily the root directory of the filesystem but the deepest allowed directory. In my case it is the directory from which the script is called.)
0
FishMongerCommented:
>> deepest allowed path is the base directory "/" - not nescessarily the root directory

I don't know how php "reads" path notation, but on unix systems / is the root dir and ./ is the current working directory (or "base directory").

So far, each of the solutions removes the ../ relative path notation but may leave you with a properly formatted but a non existing directory path.   Also nothing has been mentioned about the user inputting an absolute path that is higher up the dir tree than the previous path.  Following along with the relative path problem (as outlined in your last post), you could do this (using Perl syntax since I don't know php):

$cur_dir = './' if ($cur_dir =~ /\.\./);

or this:

$cur_dir =~ s#^.*?\.\./.*$#./#;

If you need to test for absolute paths, then we'll need to approach this from a slightly different angle.
0
icrfCommented:
In Perl-speak, I'd use something like this:

my $path = '/part1/part2/../../xx/../../../';
1 while($path =~ s#(?:[^/]+/)?\.\./##);

And it php, it'd look more like this:

$path = '/part1/part2/../../xx/../../';
while($path != ($tmp = preg_replace( "#(?:[^/]+/)?\.\./#", "", $path, 1)))
        $path = $tmp;

And people say Perl isn't beautiful. :) If you have a trailing / on the path, it leaves you with that, otherwise, it ends with the empty string.
0
ahoffmannCommented:
icrf, I came up with a similar regex:
  while (s#[^/.]+[/]+\.\./##){}

but it suffers from the same problem as your sugestion, use
   $path = '/part1/part2/../../xx/../../..';

Think I need to go to bed with J.Friedl's regular ex. and a few beers ...
Probably there is something better with perl's look-ahead tomorrow ..
0
ahoffmannCommented:
perl goes here:
   while (s#(?:[^/]+[/]+)+\.\./?##){}

hopefully its similar in php, see icrf's suggestion
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
WebFerretAuthor Commented:
Okay...
ahoffmann's preg-expression works perfectly! :-)
Thank you to icrf too!

working php solution for eliminating redundant directories in pathes is:

$path = '/part1/part2/../../xx/../../..';
while($path != ($tmp = preg_replace( "#(?:[^/]+[/]+)+\.\./?#", "", $path, 1))) $path = $tmp;
echo $path;
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Perl

From novice to tech pro — start learning today.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.