Regular expression for filtering redundant directory notations

Posted on 2003-11-24
Medium Priority
Last Modified: 2010-03-04

I am coding a little script that navigates through a standard file system. When the path of the new directory is submitted, I need a security mechanism to make sure that nobody can fake the URL to reach a directory above the one set as the base directory for the script.

I need a regular expression that filters any ../-style notation from a directory path, so that redundant directory notations are impossible!

I tried the following Perl regular expression (it's PHP code, but it should be no problem for Perl coders to understand ;-)):

preg_replace( "|/(.*)/../|U", "/", $cur_dir );

The problem is that the expression fails on notations that contain more than one /.., e.g.: testbasedir/../../..

Question by: WebFerret

Expert Comment

ID: 9809376
Try this (Perl):

Now the variable $cur_dir contains "/dir". This part of the code replaces every "/.." with "".
That is how I protect myself from hackers, too.

Expert Comment

ID: 9809382
I'm sorry.
The valid code is:
$cur_dir =~ s/\/\.\.//g;     # "g" instead of "i"; dots escaped so only a literal "/.." matches

Author Comment

ID: 9809622
Damn!!! I only need the expression, not the Perl code! ;-)

Sorry, but your solution does not work with PHP's preg_xxx functions. =~ s seems to be Perl syntax, but not part of a Perl regular expression?!

222 points if you give me a single plain Perl regex that works with PHP's preg_replace() function (PHP's preg functions use Perl-compatible regular expressions)!

Expert Comment

ID: 9809875

Author Comment

ID: 9810251
Your regular expression only reduces all repeated /.. to a single /.. It would shorten




But I don't want to cut off the end of the string; I want to eliminate the redundant directory notations from the path. I want the "true" path, which in this example has to be:

/    (base directory)

because /part1/part2 and /../.. cancel each other out!

I hope it is now clearer what I want! Is there still someone who can help me? :-)

Expert Comment

ID: 9810924
What you want is not a regex but a regex substitution.
 What do you expect for: /part1/part2/../../xx/../../..

Expert Comment

ID: 9812432
# Get rid of "/./" to handle /./..
$cur_dir =~ s{/\./}{/}g;

# Get rid of "//" (and longer runs of slashes)
$cur_dir =~ s{//+}{/}g;

# Simplify /dir/../ patterns to /
my $temp = "";
while ($temp ne $cur_dir) {
    $temp = $cur_dir;
    $cur_dir =~ s{/[^./][^/]*/\.\.(/|$)}{/};    # Handle most cases (including one-character names)
    $cur_dir =~ s{/\.[^./][^/]*/\.\.(/|$)}{/};  # Handle most cases with directories starting with "."
    $cur_dir =~ s{/\.\.[^/]+/\.\.(/|$)}{/};     # Handle the last few cases with dirs starting with ".."
}

# Clean up a trailing / that may have been added
$cur_dir =~ s{/$}{};

Since you need to rescan the string after each replacement, you can't rely on the "g" modifier; you must use a loop instead.
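To see why one global pass is not enough, here is a minimal sketch in Python (not the Perl from this comment; the pattern and the helper names single_pass and until_stable are made up for illustration). Removing one "/dir/.." pair can bring a new pair together, so the substitution has to be repeated until the string stops changing.

```python
import re

# Assumption for this sketch: a "pair" is one directory name that is not
# "." or "..", followed by a whole ".." segment.
PAIR = re.compile(r"/(?!\.{1,2}(?:/|$))[^/]+/\.\.(?=/|$)")


def single_pass(path: str) -> str:
    """Remove every pair found in ONE left-to-right scan (like the g modifier)."""
    return PAIR.sub("", path)


def until_stable(path: str) -> str:
    """Remove one pair at a time and rescan until nothing changes."""
    while True:
        reduced = PAIR.sub("", path, count=1)
        if reduced == path:
            return path
        path = reduced


print(repr(single_pass("/a/b/../..")))   # '/a/..'  (the outer pair is missed)
print(repr(until_stable("/a/b/../..")))  # ''       (both pairs are removed)
```

Any ".." that is still at the front after the loop points above the base directory, so the caller still has to reject or drop it.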

Author Comment

ID: 9812968

I am expecting a function that generates from


the result path is


or ε (empty string!)

(The deepest allowed path is the base directory "/" - not necessarily the root directory of the filesystem, but the deepest allowed directory. In my case it is the directory from which the script is called.)
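The behaviour described here (every ".." cancels the directory before it, and nothing may climb above the base "/") can also be had without any regex by walking the path segments with a stack. A rough sketch in Python, not PHP; the function name normalize_under_base is hypothetical, and it assumes "/" is the only separator:

```python
def normalize_under_base(path: str) -> str:
    """Resolve "." and ".." segments without ever climbing above the base "/"."""
    stack = []
    for segment in path.split("/"):
        if segment in ("", "."):
            continue            # skip empty segments (from "//") and "."
        if segment == "..":
            if stack:
                stack.pop()     # cancel the previous directory
            # otherwise we are already at the base, so the ".." is dropped
        else:
            stack.append(segment)
    return "/" + "/".join(stack)


print(normalize_under_base("/part1/part2/../.."))              # /
print(normalize_under_base("/part1/part2/../../xx/../../.."))  # /
print(normalize_under_base("/a/./b//c/../d"))                  # /a/b/d
```

The same loop should port to PHP with explode() and array_pop(). Dropping a surplus ".." silently is a design choice; rejecting the request instead is just as reasonable.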

Expert Comment

ID: 9819626
>> deepest allowed path is the base directory "/" - not necessarily the root directory

I don't know how PHP "reads" path notation, but on Unix systems / is the root directory and ./ is the current working directory (or "base directory").

So far, each of the solutions removes the ../ relative path notation but may leave you with a properly formatted path to a directory that does not exist. Also, nothing has been said about the user entering an absolute path that points higher up the directory tree than the previous path. Sticking with the relative path problem (as outlined in your last post), you could do this (using Perl syntax, since I don't know PHP):

$cur_dir = './' if ($cur_dir =~ /\.\./);

or this:

$cur_dir =~ s#^.*?\.\./.*$#./#;

If you need to test for absolute paths, then we'll need to approach this from a slightly different angle.
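One possible angle for the absolute-path case, sketched in Python rather than Perl or PHP: resolve the requested path against the base directory and then check that the result is still inside the base. The function name is_inside_base and the example base /srv/app/files are assumptions made up for this sketch, and it assumes a POSIX-style filesystem.

```python
import os


def is_inside_base(base_dir: str, requested: str) -> bool:
    """Return True if `requested`, resolved against `base_dir`, stays inside it."""
    base = os.path.realpath(base_dir)
    # os.path.join discards `base` when `requested` is absolute, so an
    # absolute path such as "/etc/passwd" is checked as itself.
    target = os.path.realpath(os.path.join(base, requested))
    # commonpath compares whole path components, so "/srv/app/files2"
    # is not mistaken for a child of "/srv/app/files".
    return os.path.commonpath([base, target]) == base


BASE = "/srv/app/files"  # hypothetical base directory
for candidate in ("docs/../images", "../../etc", "/etc/passwd"):
    print(candidate, is_inside_base(BASE, candidate))
```

Because realpath also follows symbolic links, this checks where the path really ends up, which a pure string rewrite cannot do. PHP offers a realpath() function as well, though it returns false for paths that do not exist, so a port would have to handle that case.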

Expert Comment

ID: 9819774
In Perl-speak, I'd use something like this:

my $path = '/part1/part2/../../xx/../../../';
1 while($path =~ s#(?:[^/]+/)?\.\./##);

And in PHP, it'd look more like this:

$path = '/part1/part2/../../xx/../../';
while($path !== ($tmp = preg_replace( "#(?:[^/]+/)?\.\./#", "", $path, 1)))
        $path = $tmp;

And people say Perl isn't beautiful. :) If the path has a trailing /, you are left with just that; otherwise, you end up with the empty string.

Expert Comment

ID: 9819920
icrf, I came up with a similar regex:
  while (s#[^/.]+[/]+\.\./##){}

but it suffers from the same problem as your suggestion; try it with
   $path = '/part1/part2/../../xx/../../..';

Think I need to go to bed with J.Friedl's regular ex. and a few beers ...
Probably there is something better with perl's look-ahead tomorrow ..

Accepted Solution

ahoffmann earned 888 total points
ID: 9820012
Perl goes here:
   while (s#(?:[^/]+[/]+)+\.\./?##){}

Hopefully it's similar in PHP; see icrf's suggestion.

Author Comment

ID: 9837410
ahoffmann's preg-expression works perfectly! :-)
Thank you to icrf too!

The working PHP solution for eliminating redundant directories in paths is:

$path = '/part1/part2/../../xx/../../..';
while($path !== ($tmp = preg_replace( "#(?:[^/]+[/]+)+\.\./?#", "", $path, 1))) $path = $tmp;
echo $path;
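For anyone who wants to probe this pattern on more inputs, here is a small Python harness (a sketch, not the PHP above; the helper name collapse is made up, and re.sub with count=1 stands in for preg_replace with a limit of 1). It suggests two things worth checking against your own cases: the greedy (?:...)+ group swallows every directory in front of a "..", and a leading "/.." followed by a name is left alone.

```python
import re

# The accepted pattern, copied unchanged.
ACCEPTED = re.compile(r"(?:[^/]+[/]+)+\.\./?")


def collapse(path: str) -> str:
    """Apply the pattern one replacement at a time until the path is stable."""
    while True:
        reduced = ACCEPTED.sub("", path, count=1)
        if reduced == path:
            return path
        path = reduced


for sample in ("/part1/part2/../../xx/../../..", "/a/b/c/..", "/../a"):
    print(repr(sample), "->", repr(collapse(sample)))
# '/part1/part2/../../xx/../../..' -> '/'
# '/a/b/c/..' -> '/'        (one might expect '/a/b')
# '/../a' -> '/../a'        (the ".." survives)
```

So the pattern gives the wanted result for the test path in this thread, but it may be safer not to rely on it alone as the security check.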


Question has a verified solution.
