Link to home
Start Free TrialLog in
Avatar of jonberg
jonberg

asked on

In Python, how do I .find the '#'

I am trying to write code that can handle html anchor tags.  I need to test a string to see if it contains the '#'.  The code is returning URLs without '#' in them.

Thanks in advance
currTestIndex = currURL.find("#")
    if(currTestIndex == -1 or currTestIndex >= len(currURL)):
       print currURL
       return currURL

Open in new window

Avatar of Superdave
Superdave
Flag of United States of America image

Your code works for me.  Can you post a larger section of your program, or show an example that doesn't work?
ASKER CERTIFIED SOLUTION
Avatar of jonberg
jonberg

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of pepr
pepr

You may be interested in the standard module urlparse (http://docs.python.org/release/2.4.3/lib/module-urlparse.html; it was renamed/moved to urllib.parse in Python 3). Try the snippet below. It writes:

C:\tmp\___python\jonberg\Q_25855993>a.py
http
docs.python.org
/release/2.4.3/lib/module-urlparse.html
xyz
ParseResult(scheme='http', netloc='docs.python.org', path='/release/2.4.3/lib/module-urlparse.html', params='', query='', fragment='xyz')
('http', 'docs.python.org', '/release/2.4.3/lib/module-urlparse.html', '', '', 'xyz')
http://docs.python.org/release/2.4.3/lib/module-urlparse.html


import urlparse

url = 'http://docs.python.org/release/2.4.3/lib/module-urlparse.html#xyz'

x = urlparse.urlparse(url)

print x.scheme
print x.netloc
print x.path
print x.fragment
print 
print x

# The ParseResult class is actually derived from the tuple class.
print tuple(x)

# Construct the new URL from the parts.
nu = urlparse.urlunparse((x.scheme, x.netloc, x.path, '', '', ''))
print nu

Open in new window