Link to home
Start Free TrialLog in
Avatar of dmontgom
dmontgom

asked on

A regular Expression for fnding a relative path

Hi.

I need a regular expression for finding relatives paths e.g. ../img/xxx.gif or ./img/xxx.gif or img/xxx.gif  in general ../path/to/file/xxx.gif



I am using python2.5.

Please provide a clear solution...I am not an expert in regex...

Thanks



Avatar of HonorGod
HonorGod
Flag of United States of America image

Something like this perhaps?
>>> import re
>>>
>>> def isRelative( path ) :
...   RE = re.compile( r'\.\.?[\\/]' )
...   return ( RE.search( path ) != None )
...
>>> isRelative( r'../img/xxx.gif' )
True
>>>

Open in new window

better...
>>> def isRelative( path ) :
...   RE = re.compile( r'(^|[\\/])\.\.?[\\/]' )
...   return ( RE.search( path ) != None )
...
>>> isRelative( 'a./b' )
False
>>> isRelative( './b' )
True
>>> isRelative( 'a../b' )
False
>>> isRelative( 'img/xxx.gif' )
False
>>> isRelative( '../path/to/file/xxx.gif' )
True
>>>

Open in new window

ASKER CERTIFIED SOLUTION
Avatar of HonorGod
HonorGod
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
ah.  I was thinking of an object method.

Nevermind.  Go with the 2nd... (i.e., better)
Avatar of dmontgom
dmontgom

ASKER

Hi,

My bad...I was not clear....

I have a html page and there are relative tags in java scripts. I need to search the string and find patters that look like a relative url so I can do a find and replace....  I will be looking for files that have extensions if e.g. css,js,jpg etc....

So...if given this string "adadfadf adafd ../test/test.gif adfadfaf"  how can I do this for any arbitrary pattern?
Ah, so you want to locate the relative address substring(s) within a string?
Something like this?
>>> def Relative( str ) :
...   RE = re.compile( r'(^|[\\/])\.\.?[\\/]' )
...   result = []
...   for data in str.split() :
...     if RE.search( data ) != None :
...       result.append( data )
...   return result
...
>>> Relative( "adadfadf adafd ../test/test.gif adfadfaf" )
['../test/test.gif']
>>>

Open in new window


The answer HonorGod gave is quite good.   It doesn't match the img/xxx.gif case you gave, but that's easily fixed.

The problem, however, isn't particularly well defined.   For example, is  "foo.gif" a relative path?   Strictly speaking it is, but that may not be what you have in mind.  

Are you looking for all paths that have at least one / (or \) and don't start at the root (don't begin with a /).   What's your platform (Windows, Linux, ...)?

Again, using HG's script above, where it splits into strings first, you can use this:

r'(^[^\\/].*[\\/].*)'

which basically says: 1) don't start w/ a  / or \, then there must be at least one / or \ somewhere else in the string.
cool...almost there...here a real example...

because of the path between the ' ' it did not find it.....


tt = """ <body onload="MM_preloadImages('../images_home/home1.gif ','../images_home/started1.gif','../images_home/pricing1.gif','../images_home/success1.gif','../images_home/how1.gif','../images_home/about1.gif','../images_home/faqs1.gif','../images_home/home_step01.gif','../images_home/home_step02.gif','../images_home/home_step03.gif')">
"""
change the split above to:

for data in str.split(''") :

that's double quote, single quote, double quote... I don't think you have to escape the ' quote... but if you do, its

"\'"

basically, instead of splitting on whitespace, we're splitting on the single quotes
no need for regular expression


tt = """ <body onload="MM_preloadImages('../images_home/home1.gif ','../images_home/started1.gif','../images_home/pricing1.gif','../images_home/success1.gif','../images_home/how1.gif','../images_home/about1.gif','../images_home/faqs1.gif','../images_home/home_step01.gif','../images_home/home_step02.gif','../images_home/home_step03.gif')">
"""
for item in tt.split(","):
    item = item[ item.index("../"):]
    print item[: item.index("'")]

Open in new window

what is need is regex...there is a larger scope here...

I use beautiful soup is parse html tags....

I need regex it handle arbitrary complexity....

PS  jiust imagine a big html file.....,is there a regex way to find all relative links?  e.g. / ./ ../ ../../ etc....

HonorGod's code worked great for all all cases...just not if there is a quote e.g.  '' or a " "..  Also it wont work if the rel path is e.g. images/test.gif.

Thanks for the grade & points.

Good luck and have a great day.