Solved

python log-on script

Posted on 2011-09-23
Last Modified: 2012-05-12
I am trying to limit access to an apache web directory by requiring users to type a password. As you can see from the attached, I don't understand how the value from the html form is passed to the python login script.

I understand that Python has a getpass module. But any script that uses it would have to launch its own login (HTML) form for Apache to serve to the user, and I'm not sure how that works either. Finally, the code that redirects the user from the login page to the destination page is untested (I've done some research but I'm getting confused).

# the html page ...
<form method="POST" action="logon.py">
<p>Type in your password: <input type="text" name="j_password"> </p>
<p>Go: <input type="submit" name="login" value="Login"> </p>
</form>

# logon.py ...
import os, sys
import hashlib
import urllib, urllib2, cookielib

m = hashlib.sha1()
pass_file = os.path.abspath("/path/to/password.txt")
read_me = open(pass_file, 'r')
for line in read_me:
    line = line.strip()
    if line.isalnum():
        password = line
        #print password

m.update(password)
pass_key = m.hexdigest()
#print 'key value:', pass_key

def checkPassword():

    for key in range(1):
    
        p = hashlib.sha1()
        #q = raw_input("Enter the password >>")
        q = 'value passed from the html form'
        p.update(q)
        user_pass = p.hexdigest()
        #print 'test:', user_pass # 
        if user_pass == pass_key:
            #print 'match', user_pass, pass_key
            cj = cookielib.CookieJar()
            opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
            login_data = urllib.urlencode({'j_password' : password})
            opener.open('http://mysite.com/login.html', login_data)
            resp = opener.open('http://mysite.com/secure_dir/main.html')
            print resp.read()
            return True
        else:
            #print 'No match: \n', user_pass, '\nkey value: \n', pass_key
            print 'Wrong password, try again'

    print '<p>Access denied<p>'
    return False
 
def main():
    
    checkPassword()  

if __name__ == "__main__":
    main()


0
Question by:sara_bellum
39 Comments
 
LVL 82

Expert Comment

by:Dave Baldwin
There are many scripts to do this in PHP.  I would see if I could get a Python page to return anything to the browser before trying to code a login with it.  And where will you store the username and password info?  Not, I hope, in a cookie.
0
 

Author Comment

by:sara_bellum
Understood, I think. I have the .py extension enabled in Apache mime.conf and in the /cgi-bin handlers for the site, so I should be able to do the same thing in a python script that I've seen with index.cgi : just put an html print statement in the script.

In checking this I found an unrelated error in my apache config which I need to repair, will get back to you.
0
 
LVL 16

Assisted Solution

by:gelonida
gelonida earned 500 total points
sara_bellum,
There are multiple ways of running Python scripts in a web server.

- the simplest one (but the least efficient for high traffic) is running Python as a CGI script
- another option is mod_python, but it is rather old and is being phased out
- the next one is mod_wsgi

For the rest of the discussion I assume that you run Python as a CGI script.

Getting the password is rather simple.
You just create an HTML form which, amongst other things, asks for a password.

The password will be sent like any other form content and can thus be obtained with the
'normal' Python cgi functions.

import cgi
form = cgi.FieldStorage()

This should return all form values.
If your password field was named 'pass_word' in your form, then
you can get the password with

password = form['pass_word'].value

You can also look at the documentation of the cgi module or at
http://gnosis.cx/publish/programming/feature_5min_python.html
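
To make the mechanism more visible, here is a minimal sketch of roughly what cgi.FieldStorage does for a simple form. It is written for Python 3 (the code elsewhere in this thread is Python 2), and read_form is a hypothetical helper for illustration, not part of the cgi module. It assumes a plain urlencoded form with ASCII values, like the one in the question.

```python
import io
from urllib.parse import parse_qs


def read_form(environ, stdin):
    """Return the submitted fields as a dict of name -> first value."""
    if environ.get("REQUEST_METHOD", "GET") == "POST":
        # For a POST, the browser sends the fields in the request body;
        # the web server passes that body to the script on standard input.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)
    else:
        # For a GET, the fields arrive in the query string instead.
        raw = environ.get("QUERY_STRING", "")
    fields = parse_qs(raw, keep_blank_values=True)
    return {name: values[0] for name, values in fields.items()}


# Simulate what the server would hand to the script for the question's form.
body = "j_password=topsecret&login=Login"
fake_environ = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": str(len(body))}
form = read_form(fake_environ, io.StringIO(body))
print(form["j_password"])
```

In a real CGI script you would pass os.environ and sys.stdin instead of the simulated values.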


Please note that an HTTP server is stateless.
So if you make another request, it will have forgotten that you already entered a password.

Therefore one normally uses a web framework with session management
(for example Django, but there are many others).







0
 

Author Comment

by:sara_bellum
Thanks Gelonida that was very helpful!!

I was looking for the cgi solution you describe so this is it. Can't implement at the moment because I just finished debugging my apache config and I'm too tired. Tomorrow is another day.
0
 

Author Comment

by:sara_bellum
I spent the day trying to figure this out and I must be missing something!! I used template substitution because I thought it would be the simplest way to read the form and prompt the user to type in a different password without having to click on a link to return to the original form.

The code below doesn't work - form input is not read, although at least the original form appears at http://www.mysite.com/cgi-bin/logon_ee.cgi. Earlier when I was testing the script at the command line, the password specified in the script did print to tty and to the destination html page.
There is nothing unusual in my apache config, so I have no idea why this is so difficult :(
import os, sys
import hashlib
import cgi
import config as rc
from string import Template

log_form = rc.log_form
pass_file = rc.pass_file
template_file = rc.template_file
contact_address = rc.contact_address
destination = open(log_form, 'w')

fields = {}
errors = []
print "Content-type: text/html \n"
form = cgi.FieldStorage() 

m = hashlib.sha1()
read_me = open(pass_file, 'r')
for line in read_me:
    line = line.strip()
    if line.isalnum():
        password = line
        #print password
m.update(password)
pass_key = m.hexdigest()
#print 'key value:', pass_key

def checkPassword(**kw):
    
    #read form values into the fields dict - 
    #this may be overkill but it has worked in the past
    fields = kw.get('fields', {}) 
    
    if form.getvalue('j_password') == None:
        #form.['j_password'].value = ' ' # syntax error
        fields['j_password'] = ' ' 
               
        if fields['j_password'] == ' ':
            errors.append('') # any string prints 
            pass
        
    # Evaluate form inputs
    for field_name in fields.keys():
        if field_name == 'j_password' and fields[field_name] != ' ':
            p = hashlib.sha1()
            #q = 'some_password'
            p.update(fields['j_password'])
            user_pass = p.hexdigest()
            if user_pass == pass_key: 
                errors.append('match found!')               
            else:     
                errors.append('Wrong password: ' + q) 
                    
    #if len(errors) > 3:
        #return False

    try:
        s = Template(open(template_file).read())
        destination.write(s.substitute(fill_me=' '.join(errors))) # page is empty 
        print s.substitute(fill_me=' '.join(errors)) # page is blank                           

    except IOError:
        print 'Could not open html template!  Please contact %s. ' % contact_address
        sys.exit(1)
 
def main():
    
    checkPassword(fields=fields)  

if __name__ == "__main__":
    main()

# here's the template form
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<title>Log-on page</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<style>...</style>
</head>
<body> 
<form method="POST" action="/cgi-bin/logon_ee.cgi">
<p>${fill_me}</p>
<p>Password:<input type="text" name="j_password" value=""></p>
<p><input type="submit" name="login"> <input type="reset" name="reset"> 
</p>
</form>
</body>
</html>


0
 

Author Comment

by:sara_bellum
The remarks in the script showing page is empty or blank should say the (blank) form is printed instead.
0
 

Author Comment

by:sara_bellum
I ran this script on another server where there's no doubt that apache is correctly configured and logon_ee.cgi behaves in the same way (a blank form reloads on submit).  

I had to disable this: destination = open(log_form, 'w') and a related line, which makes sense, since writing/changing server files with a cgi script is probably not a good idea. It was useful as a control measure but in retrospect it's not needed.
0
 
LVL 82

Expert Comment

by:Dave Baldwin
Here is the famous "Hello World" in Python:  http://webpython.codepoint.net/cgi_hello_world  Try this basic program to see what it takes to make it work.  Both of the versions below run on IIS5.1 on Windows XP.
#!/usr/bin/env python
print "Content-Type: text/html"
print
print """\
<html>
<body>
<h2>Hello World!</h2>
</body>
</html>
"""


#!/usr/bin/env python
# -*- coding: UTF-8 -*-

# enable debugging
import cgitb
cgitb.enable()

print "Content-Type: text/plain;charset=utf-8"
print

print "Hello World!"


0
 
LVL 16

Expert Comment

by:gelonida
I'm not sure what is going wrong. What I suggest is that, for debugging, you work with TWO CGI scripts.

The first one, /cgi-bin/logon_ee.cgi, displays the form and lets you enter the password.
In its form, just change
<form method="POST" action="/cgi-bin/logon_ee.cgi">
to
<form method="POST" action="/cgi-bin/show_posted_data.cgi">

The second one, /cgi-bin/show_posted_data.cgi, just shows what has been posted by your form.

The second script should be as simple as possible, 'just' showing whether the data has been posted properly. I would also suggest that show_posted_data.cgi outputs text/plain instead of text/html
(easier for debugging).

As a very first shot, /cgi-bin/show_posted_data.cgi
could just be a shell script:

#!/bin/sh
echo Content-type: text/plain
echo ""
echo "GET"
echo "$QUERY_STRING" # GET parameters arrive in QUERY_STRING, not as script arguments
echo POST
cat # should display POST data
echo ENV
env # Display all the environment variables

Do you see the form data posted????

You can also try a simple python script

import cgi
print "Content-type: text/plain\n"
form = cgi.FieldStorage()
print "%r" % form

Perhaps you can see something which helps you to find the error.
0
 
LVL 82

Expert Comment

by:Dave Baldwin
Right now I can't get python to run thru Apache on Ubuntu but I'm too tired to look at it again tonight.
0
 
LVL 16

Expert Comment

by:gelonida
This is a simplified version of your script.
It does not access any external files and compares the passwords directly instead of
comparing hashes.

Could you try out this simplified version?
The attached example prompts for a password and detects whether it is correct or not.
(I also removed the action from the form, so that any user can test it without having to choose the same URLs and script names as you.)

#!/usr/bin/python
import cgi
from string import Template

fields = {}
errors = []
print "Content-type: text/html\n"
form = cgi.FieldStorage() 

def checkPassword(fields):
    passwd =  form.getvalue('j_password')
    if passwd is None:
        errors.append('No password specified:') 
        passwd_match =  False
    else:
        passwd_match =  passwd == 'topsecret'
        errors.append('password match %s' % passwd_match)               
    s = Template(template)
    print s.substitute(fill_me=' '.join(errors))
 
def main():
    checkPassword(fields)  

template = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<title>Log-on page</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<style>...</style>
</head>
<body> 
<form method="POST">
<p>${fill_me}</p>
<p>Password:<input type="text" name="j_password" value=""></p>
<p><input type="submit" name="login"> <input type="reset" name="reset"> 
</p>
</form>
</body>
</html>
"""

if __name__ == "__main__":
    main()


0
 
LVL 16

Assisted Solution

by:gelonida
gelonida earned 500 total points
If you want to remember your form values, then you have to explicitly fill them into your template.

Just change the above code with the following snippets:
def checkPassword(fields):
    passwd =  form.getvalue('j_password')
    if passwd is None:
        errors.append('No password specified:') 
        passwd_match =  False
        passwd=""
    else:
        passwd_match =  passwd == 'topsecret'
        errors.append('password match %s' % passwd_match)               
    s = Template(template)
    print s.substitute(fill_me=' '.join(errors), passwd=passwd)
.
.
.

template="""
. . .
<form method="POST">
<p>${fill_me}</p>
<p>Password:<input type="text" name="j_password" value="${passwd}"></p>
<p><input type="submit" name="login"> <input type="reset" name="reset"> 
</p>
</form>
. . .
"""

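
One thing to watch when echoing a submitted value back into the form: the text ends up inside a value="..." attribute, so quotes or angle brackets in the input could break out of it. The following is a minimal sketch, assuming Python 3 and html.escape (in Python 2 the comparable call is cgi.escape(value, quote=True)). The one-line template is a shortened stand-in for the full page.

```python
from html import escape
from string import Template

template = Template(
    '<p>Password:<input type="text" name="j_password" value="${passwd}"></p>'
)

# A submitted value that tries to break out of the value="..." attribute.
submitted = '"><script>alert(1)</script>'

# escape(..., quote=True) turns <, >, & and quotes into HTML entities,
# so the browser shows the text instead of interpreting it as markup.
safe_html = template.substitute(passwd=escape(submitted, quote=True))
print(safe_html)
```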

0
 

Author Comment

by:sara_bellum
For reasons that will be obvious to some I've requested that the previous post be deleted so I copy my comments here:

This has been very helpful!! By simplifying the code I was finally able to print the user input that is appended to the errors list.

I post my latest below, a work-around that I've been futzing with.

The problem remains that if the user presses the Submit button without password input, an error for an empty password string will not appear -- if I add that to the script, the error will show up when the page loads.

I thought I could get around this problem with this line:
if form.getvalue('Submit'): -- and then add a "Submit" value to the correct location on the html form, but there's no response. In fact, much of what I thought I knew about how form values are passed to a cgi script doesn't work any more (which may be  outside the scope of this effort).

Finally, I have no idea how to implement a python redirect - I think that I'm supposed to use an httplib module, but the examples I found merely print data to the top of the page.

TIA!

import os, sys
import hashlib
import cgi
import config as rc
from string import Template
import cgitb; cgitb.enable()
import httplib2

httplib2.debuglevel = 1 
h = httplib2.Http('.cache') 

log_form = rc.log_form
pass_file = rc.pass_file
template_file = rc.template_file
contact_address = rc.contact_address

fields = {}
errors = []
print "Content-type: text/html \n"
form = cgi.FieldStorage() 

m = hashlib.sha1()
read_me = open(pass_file, 'r')
for line in read_me:
    line = line.strip()
    if line.isalnum():
        password = line
        #print password
m.update(password)
pass_key = m.hexdigest()

def checkPassword(fields):
    
    passwd = form.getvalue('j_password')
    
    if passwd is None:
        passwd = ' '
    
    elif len(passwd) != 8:
        errors.append('Password must be 8 characters, "' + passwd + '" is ' + str(len(passwd)) + ' characters ')
        
    elif len(passwd) == 8:
        p = hashlib.sha1()
        q = form.getvalue('j_password')
        p.update(q)
        user_pass = p.hexdigest()
        if user_pass == pass_key: 
            response, content = h.request("http://www.mysite.com/path/to/file.html", "GET")
        else:
            errors.append('Incorrect password: ' + q )  

    try:
        s = Template(open(template_file).read())
        print s.substitute(fill_me=' '.join(errors)) #                

    except IOError:
        print 'Could not open html template!  Please contact %s. ' % contact_address
        sys.exit(1)
        
    #print "%r" % form
    
 
def main():
    
    checkPassword(fields)  

if __name__ == "__main__":
    main()


0
 
LVL 16

Expert Comment

by:gelonida
I'm currently not on a PC with a web server, so I can't try out any code.

I must admit I don't understand exactly what the problem with the empty password is.
Could you try to explain exactly what is happening?

I would expect that pressing submit results in passwd == '', so you should end up in line 40 of your most recently posted script.

Minor suggestions:
In line 38 I would add
errors.append('no password field was submitted ' )
Line 42 can be changed to else: # len(passwd) is necessarily 8, so no need for an 'elif'
You can get rid of line 44 of your script and change
line 45 to p.update(passwd)



Could you try to modify my simplified script and re-explain the problem?
Your current problem is probably not related to the config module or to calculating hashes, and the smaller the code that reproduces a problem, the simpler it is for everybody who wants to help you.

Good luck
0
 

Author Comment

by:sara_bellum
Gelonida,

The code in your post # 36708489 loads the html page with a "No password specified:" error. I want to load the page without errors! If the user presses the "Submit" button without typing in a password, that should trigger the error.

There are a number of free-ware web servers you can install on a PC for web development testing. I had one a few years ago but have forgotten the name of it.
0
 
LVL 16

Assisted Solution

by:gelonida
gelonida earned 500 total points
What was missing is distinguishing between the first request, where you just want
to display the form, and the second request, where you pressed the submit button.

There are two different ways to do this:

Either you check whether the request method is GET or POST,
or alternatively whether the submit button was pressed.

You can determine the request method with

request_method = os.environ['REQUEST_METHOD']
The value will be 'GET' or 'POST'.

You can find out whether the submit button was pressed by
checking

form.getvalue('login')

which will be either None or the value of the submit input field.

In the attached example the button is named 'login' and its value is 'SUBMIT' (the value is also the text displayed on the button).

I also added some logging (which you can remove) to display the request method, the form values and the value of the submit button (if pressed).


import os
import cgi
from string import Template

fields = {}
errors = []
print "Content-type: text/html\n"
form = cgi.FieldStorage()

def checkPassword(fields):
    request_method = os.environ.get('REQUEST_METHOD', 'GET')
    errors.append('request_method = %s<br/>' % request_method)
    submit_value  =  form.getvalue('login')
    errors.append('submit value %r<br/>' % submit_value)
    errors.append('form values %r<br/>' % form)

    if submit_value == 'SUBMIT': # Was the submit button pressed
        passwd =  form.getvalue('j_password')
        if passwd is None:
            # either no password specified or empty password
            passwd = ""
        passwd_match =  passwd == 'topsecret'
        if passwd_match:
            errors.append('password match')
        else:
            errors.append('password "%s" doesn\'t match' % passwd)
    s = Template(template)
    print s.substitute(fill_me=' '.join(errors))

def main():
    checkPassword(fields)

template = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<title>Log-on page</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<style>...</style>
</head>
<body>
<form method="POST">
<p>${fill_me}</p>
<p>Password:<input type="text" name="j_password" value=""></p>
<p><input type="submit" name="login" value="SUBMIT"> <input type="reset" name="reset">
</p>
</form>
</body>
</html>
"""

if __name__ == "__main__":
    main()


0
 

Author Comment

by:sara_bellum
Great, thanks very much!! Now to my final question: how to redirect the user to another html page on submitting the correct password?
0
 
LVL 16

Expert Comment

by:gelonida
Apologies for the rather long answer, but it might be necessary.

Well, now you're touching the really difficult part.

Perhaps you could explain in a little more detail what exactly you are trying to achieve.
Perhaps the best solution would be something completely different.

Please remember what I wrote before:

Please note that an HTTP server is stateless.
So if you make another request, it will have forgotten that you already entered a password.
Therefore one normally uses a web framework with session management
(for example Django, but there are many others).


What's your exact goal?

Protect one and only one HTML document:
If it is just one HTML page that you want to protect, and this page does not have any links to other HTML pages that should be protected, then the answer is very simple:
- you can either just open the HTML file (as a template) and display it
- or you just call another script that generates the HTML for you.

You don't really want to protect the final URL, but just hide it from 'first time users':
Theoretically your CGI script could redirect to another URL.

However, every person who has once entered the password correctly could then bookmark the URL you redirected to, and this link would not be protected against anyone who knows the URL.
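
As a rough illustration of the redirect itself (with the caveat just described), a CGI script redirects by printing a Location header instead of a page. This is a minimal sketch assuming Python 3. redirect_response is a hypothetical helper name, and the target URL is the placeholder from the question.

```python
def redirect_response(location):
    """Build the CGI output that asks the browser to load another URL."""
    # A CGI script sets the HTTP status through a "Status" header.
    # The blank line ends the headers; no page body is needed.
    return "Status: 302 Found\nLocation: %s\n\n" % location


# The target URL is a placeholder taken from the question.
response = redirect_response("http://mysite.com/secure_dir/main.html")
print(response, end="")
```

Note that the scripts posted so far print "Content-type: text/html" followed by a blank line as soon as they start, which ends the header section. A redirect has to be printed before that blank line, so the script must decide whether to redirect before it prints anything else.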


You want to protect a whole set of URLs, but all of them are CGI scripts,
and you don't mind sending the password with every request:

If all the content that you want to protect is also created via CGI scripts, then you could save the
password in a cookie, and it would thus automatically be sent to all subsequent CGI scripts.
If your server is an HTTP server and not an HTTPS server, then your password would be sent over the network in clear text again and again and could be sniffed.
(Definitely not something I would do for data that needs serious protection. Not such a big issue for fun projects, though.)
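
A minimal sketch of setting and reading such a cookie, assuming Python 3's http.cookies module (in Python 2 the module is called Cookie). The helper names and the cookie path are illustrative assumptions, not something prescribed by the module.

```python
from http.cookies import SimpleCookie


def make_cookie_header(name, value):
    """Return a Set-Cookie header line to print before the blank line."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["path"] = "/cgi-bin/"  # browser sends it only to URLs below this path
    return cookie.output()  # a single "Set-Cookie: ..." header line


def read_cookie(environ, name):
    """Return the cookie value the browser sent back, or None."""
    cookie = SimpleCookie(environ.get("HTTP_COOKIE", ""))
    return cookie[name].value if name in cookie else None


header = make_cookie_header("j_password", "topsecret")
print(header)

# On the next request the server exposes the cookie in HTTP_COOKIE.
print(read_cookie({"HTTP_COOKIE": "j_password=topsecret"}, "j_password"))
```

Like the Location header, the Set-Cookie line belongs in the header section, before the blank line that precedes the page.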

However, mostly one sends the password to the server only once.
The server then creates a random session cookie (with a limited lifetime) and stores the cookie and the related user on the server's file system or in a database.
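
The session idea can be sketched as follows, assuming Python 3. The in-memory dict, the 30-minute lifetime and the function names are assumptions made for illustration only. A CGI process exits after every request, so a real script would have to keep the tokens in a file or a database, as described above.

```python
import secrets
import time

SESSION_LIFETIME = 30 * 60  # seconds; an arbitrary choice for this sketch
sessions = {}  # token -> expiry time; stands in for a file or database


def create_session(now=None):
    """Create a random token and remember when it expires."""
    now = time.time() if now is None else now
    token = secrets.token_hex(16)  # 32 hex characters, hard to guess
    sessions[token] = now + SESSION_LIFETIME
    return token


def is_valid_session(token, now=None):
    """True if the token was issued by us and has not expired yet."""
    now = time.time() if now is None else now
    expiry = sessions.get(token)
    return expiry is not None and now < expiry


token = create_session(now=1000.0)
print(is_valid_session(token, now=1000.0 + 60))        # still fresh
print(is_valid_session(token, now=1000.0 + 31 * 60))   # past its lifetime
print(is_valid_session("made-up-token", now=1000.0))   # never issued
```

The token, not the password, is what would be stored in the cookie, so the password crosses the network only once.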


You want to protect a whole set of HTML (and other) files:
Then I strongly recommend using one of Apache's default authentication modules or a web framework with session management. Rolling your own will probably result in a lot of work and in potential security holes. If it is for learning purposes, then this is of course no problem.

If you want to use Apache's basic authentication, then be aware that the password is sent unencrypted over the network and anybody could intercept it (this would also be the case with your form-based authentication, by the way).
So ideally you should use it only with an HTTPS server (where the entire transfer is encrypted and the password cannot be intercepted).


For an overview of some of Apache's standard authentication modules, look for example at:
http://httpd.apache.org/docs/1.3/howto/auth.html

There's even a module to implement custom authentication; I personally never used it, though:
http://code.google.com/p/mod-auth-external/



Authentication schemes which I have personally used (depending on the project):
- Apache's basic authentication over HTTPS (can be used for static HTML, CGI scripts, ...)
- some custom PHP code in combination with PHP session cookies
- under Python, the default Django authentication module over HTTPS




You
Could you please exactly explain what you want to protect with a password and why you didn't want to use the .htaccess authentification of

In order to protect data on a server there's many ways. Using a custom authentification scheme via cgi is probably the hardest possible option, except you just want t

0
 
LVL 16

Expert Comment

by:gelonida
Ooops!!!

Apologies for the last two paragraphs. It's just some clutter I forgot to delete before posting.
If any admin is present, I'd appreciate it if this message (and the last two paragraphs of my previous post) could be deleted.
0

 
 

Author Comment

by:sara_bellum
I realize that authentication is a can of worms, thanks for the links. As far as deleting the parts of your post that you don't want, you have to ask Community Support to do it for you. I'm clueless as to why we can't delete our own posts - we can do it on Facebook, so this is not obscure technology...

I chose not to use .htaccess because the password is tied to a user name. All I want is a password. I want to protect a whole directory of html files, and you're pointing me to apache or other modules to do that. I'm not aware of any apache module that authenticates independent of a user name, but I haven't read all the docs on the subject.

Where https makes sense for webmail, I think it's overdoing it for these pages.
I've seen a number of similar sites to mine on the web which simply request a password to log in, but use straight http urls. I expect most people to access this website from their home computer, and although there's no way to control that, network packet sniffing is not my primary concern.

Since password authentication typically directs users to some place other than the login page, I thought there might be a ready-made solution that would not take a whole lot of work to implement. I found web2py, but haven't tried it yet.
0
 
LVL 82

Expert Comment

by:Dave Baldwin
HTTPS does Not secure a web page, it encrypts the transmission between the two computers so others can not tap into it.

Apache's .htpasswd docs are here: http://httpd.apache.org/docs/2.0/programs/htpasswd.html  It is commonly used to restrict access to a directory.
0
 
LVL 16

Expert Comment

by:gelonida
Thanks for clarifying your requirements.

I think you're right that the Apache modules I suggested all require a user name.
What I have seen on some pages is that they just changed the login text (the realm) so that it suggests entering 'guest' as the user name.
You could also try (I can't try it myself, as I currently don't know any web site with basic authentication)
http://guest@my.site/my/path/index.html
Depending on the browser, it might prepopulate the user name with guest.
I am not sure, though.

- web2py has a good reputation; I never used it, though.
- Django is probably overkill, but would also easily handle the issue of protecting one directory
- werkzeug would also be an option http://werkzeug.pocoo.org/docs/contrib/sessions/

All three should probably be added to Apache via mod_wsgi.

You can go on with a CGI-only solution if you want to. I'm not sure it's really less work than learning one of the above, but if I understand your security expectations well enough, it should be 'safe' enough.

The way you could do it is:

Be sure that the directory that you want to protect cannot be accessed directly via Apache.
Either it is located outside of the document root, or it is protected with an .htaccess file that denies access to the directory.

http://your.site/cgi-bin/mycgi.py
would be your login script.
It would have to store the password as a cookie and redirect to
http://your.site/cgi-bin/mycgi.py/index.html (This looks weird, but is perfectly legal.)
The part '/index.html' would be available in the environment variable $PATH_INFO

If your CGI script receives a $PATH_INFO which is not empty, and if the cookie contains the password, then it could just read the HTML files from your directory and write them to stdout.
You would just have to provide the correct HTTP header with the correct MIME type if some files were, for example, .jpg, .png or .txt files and not .html files.
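
Picking the MIME type can be sketched with the standard mimetypes module. This assumes Python 3 (the module also exists in Python 2). content_type_header is a hypothetical helper, and the fallback type for unknown extensions is a choice made for this sketch.

```python
import mimetypes


def content_type_header(filename):
    """Guess the Content-Type header for a file from its extension."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    if mime_type is None:
        # Unknown extension: fall back to a generic binary type.
        mime_type = "application/octet-stream"
    return "Content-Type: %s" % mime_type


for name in ("index.html", "photo.jpg", "logo.png", "notes.txt"):
    print(name, "->", content_type_header(name))
```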

Alternatively you could read about Apache's sendfile header.
In this case your CGI script would not serve the file itself, but just tell Apache to serve it (if the authentication succeeded).
X-Sendfile headers mostly make sense for high-bandwidth web servers.
For low-bandwidth servers, just opening the file, reading it and sending it out via CGI print()/write()
is probably sufficient.



0
 
LVL 16

Expert Comment

by:gelonida
I forgot to attach the link explaining Apache's X-Sendfile header. However, it's probably not really needed for your current issue.

https://tn123.org/mod_xsendfile/
0
 
LVL 16

Expert Comment

by:gelonida
Hi Dave,

Perhaps I didn't express myself too well.
HTTPS prevents password sniffing (man-in-the-middle attacks).
Apache's basic authentication is not considered safe over HTTP, but can be considered safe in an HTTPS context.


In many cases, however, depending on the content to be protected, even basic authentication via HTTP and saving a password in a session cookie may be good enough.

Only the owner of the content (after having understood the implications of each protocol) can decide when the content is protected well enough.

I personally serve my own content only over HTTPS,
and for more important content I additionally use client certificates.
It's not that the content is really that important, but it's a way of learning by playing.

0
 

Author Comment

by:sara_bellum
gelonida,

Your post #36818717 has me thoroughly confused. I think you understand that I'm trying to use a CGI-only solution (which I assume stores the password as a cookie and redirects), but even when I go to your links I can't find a way to do that.
0
 

Author Comment

by:sara_bellum
Also, when using a cgi solution, if the url you are re-directing to doesn't match the path of the directory you are trying to protect, you can simply by-pass it. You merely need to know the http path of the target directory, for example
http://www.mysite.com/album/index.html.
It doesn't matter what cgi-bin path you are trying to use for your redirect, as I understand it.
0
 
LVL 16

Expert Comment

by:gelonida
I'll send you an example a little later on.

The problem is that if you redirect from a CGI script to another path (after successful password verification), then the user will finally see in the browser the path he/she was redirected to.

He can now just bookmark this path and will never need to enter the password again.
He can even send the bookmarked URL to anybody else.

Therefore this is not really such a good solution.

What I tried to explain in #36818717 is that the only way to protect files with a CGI script (and without an .htaccess or any other server magic) is that EVERY request always has to go through the CGI script, and the CGI script verifies the password for EVERY request.

this can be done the following way:

imagine your webserver is accessible under the url
http://myserver

and the cgi-script is accessible via
http://myserver/cgi-bin/login.py
( and for example stored on /www/cgi-bin/login.py)

all your other unprotected html files would be located under:
/www/html/

In order to avoid, that your protected files can be accessed directly via a url you have to place them outside of the document root or in a directory, which has a .htaccess file, which will not give access.

You could place them for example at
/www/protected_files

let's assume, that you have
/www/protected_files/index.html
/www/protected_files/another_html_file.html

They are now NOT directly accessible by apache or any known url.

Now go to
http://myserver/cgi-bin/login.py
It will display a login form.
if you enter a password it will store it as cookie and it will also display whether the password is ok or not.

If the password is ok, then it will redirect you to following url:

http://myserver/cgi-bin/login.py/index.html

This URL means that the CGI script will be called again, but that the environment variable
PATH_INFO will now contain the value '/index.html' (basically the path following the CGI script name, without any GET parameters).

Now your CGI script can (if the password cookie is correct) transform
PATH_INFO
into '/www/protected_files' + PATH_INFO,
read the file and serve it.
(In fact it would also have to determine the file type and send the correct Content-Type in case you have non-HTML files located in /www/protected_files.)
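
The PATH_INFO translation described above could be sketched roughly like this. It is only an illustration under assumptions: PROTECTED_ROOT and requested_file() are made-up names, the directory is the example location used above, and the password check is left out entirely.

```python
import posixpath

# Assumption: the protected directory from the example above.
PROTECTED_ROOT = '/www/protected_files'

def requested_file(environ, root=PROTECTED_ROOT):
    """Translate PATH_INFO (e.g. '/index.html') into a path below root.

    Returns None if there is no PATH_INFO or the path would leave root.
    """
    path_info = environ.get('PATH_INFO', '')
    if not path_info:
        return None
    # Join the request path onto the protected directory and collapse '..'
    candidate = posixpath.normpath(posixpath.join(root, path_info.lstrip('/')))
    if candidate.startswith(root + '/'):
        return candidate
    return None
```

In a real CGI script you would pass os.environ as the environ argument; a plain dict is used here so that the function can be tried out without a web server.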


I hope this clarifies a little.
0
 

Author Comment

by:sara_bellum
I had no idea that apache could serve web pages outside of its own doc root!!

I have another problem then: I'm running virtual hosts, with two separate urls (for example http://www.mysite.com/ and http://www.yoursite.org)

Each of these sites has their own cgi-bin directory, and each of these sites is located under apache's doc root (/var/www)

So assuming you're right, I have no idea how apache can serve pages outside of a site doc root, since they need to be associated with a url which is in turn tied to a DNS entry.
0
 
LVL 16

Expert Comment

by:gelonida
Please look at attached code.

It should give you the idea.

You have to adapt protected_path to point to the directory containing the files you'd like to protect.


You also have to add code for file type detection in case you want to serve more than plain HTML data.

This snippet shows, for a Python CGI script:
- how to get cookies
- how to set cookies
- how to redirect to another url
- how to present a form
- how to parse a form input
- how to use PATH_INFO
- how to distinguish between POST and GET methods
- how to protect a set of HTML files below a certain directory from access without a password, which is
transmitted in plain text via a form and subsequently via a cookie


#!/usr/bin/python
import os, sys
import cgi, Cookie
from string import Template

good_password = 'topsecret'
my_path = os.path.abspath(os.path.dirname(__file__))
protected_path = os.path.join(my_path, 'protected')
request_method = os.environ.get('REQUEST_METHOD', 'GET')
cgi_script = os.environ.get('SCRIPT_NAME','')
path_info = os.environ.get('PATH_INFO', '')
form = cgi.FieldStorage() 
errors = []
http_head = []

def print_http_head():
    print '\n'.join(http_head)+'\n'


def show_passwd_form(): # displays the login form
    s = Template(template)
    http_head.append('Content-type: text/html')
    print_http_head()
    print s.substitute(fill_me=' '.join(errors), cgi_script

def check_form_passwd(): # 
    # if password given passwd cookie will be set.
    # return value: True if password is good
    submit_value  =  form.getvalue('login')
    if submit_value != 'SUBMIT': # abort submit wasn't pressed
        return False 
    passwd =  form.getvalue('j_password')
    if passwd is not None:
        cookie = Cookie.SimpleCookie()
        cookie['password'] = passwd
        http_head.append(str(cookie))
    passwd_match =  passwd == good_password
    if not passwd_match:
        errors.append('wrong password')
    return passwd_match

def check_passwd_cookie():
    cookies = Cookie.SimpleCookie(os.environ.get('HTTP_COOKIE',
    passwd = cookies['password'].value
    errors.append('cookiepasswd is &lt;%s&gt;' % passwd)
    return passwd == good_password

def is_subdir_of(subpath, parentpath):
    """ make sure to not serve any file locate
        protected directory
    """
    subpath = os.path.normpath(os.path.abspath
    parentpath = os.path.normpath(os.path.absp
    parent_len = len(parentpath)
    if len(subpath) <= parent_len:
        return False # can't be below parent p
    if subpath[:parent_len] != parentpath:
        return False # doesn't even start the 
    if subpath[parent_len] == '/':
        return True # seems to be a sub dir
    return False

def serve_file(path_info):
    file_to_serve = os.path.join(protected_pat
    if not is_subdir_of(file_to_serve, protect
        return # could display error message
    ## should determine correct Content type d
    http_head.append('Content-type: text/html'
    print_http_head()
    fin = open(file_to_serve)
    content = fin.read()
    fin.close()
    sys.stdout.write(content)


ef main():
   if path_info == '':
       if request_method == 'GET':
           show_passwd_form()
           sys.exit(0)
       elif request_method == 'POST':
           passwd_ok = check_form_passwd()
           if passwd_ok:
               index_url = cgi_script + '/index.html'
               http_head.append('Location: ' + index_
               print_http_head()
               sys.exit(0)
       else:
           errors.append('unimplemented HTTP method')
           passwd_ok = False
   else:
       passwd_ok = check_passwd_cookie()
           
   if not passwd_ok:
       show_passwd_form()
       env_vals = [ '%s : %s' % v for v in os.environ
       print '\n'.join(errors) + '<br/>'
       print '<br/>\n'.join(sorted(env_vals))
       sys.exit(0)
   
   serve_file(path_info)

template = """<!DOCTYPE html PUBLIC "-//W3C
ttp://www.w3.org/TR/xhtml1/DTD/xhtml1-trans
<html xmlns="http://www.w3.org/1999/xhtml" 
<title>Log-on page</title>
<meta http-equiv="Content-Type" content="te
<style>...</style>
</head>
<body> 
<form method="POST" action="${cgi_script}">
<p>${fill_me}</p>
<p>Password:<input type="text" name="j_pass
<p><input type="submit" name="login" value=
reset"> 
</p>
</form>
</body>
</html>
"""

if __name__ == "__main__":
    main()


0
 
LVL 16

Expert Comment

by:gelonida
I propose asking further questions that are not 100% related in a new thread.

Otherwise it will be very difficult for other EE users to find questions / answers via the search function.


Concerning your most recent reply:

It's not Apache that is serving data outside of the doc root of a virtual server (see my previous post).
It's the CGI script running on the server, from the virtual host's cgi directory, which just reads files from wherever you specify (and wherever it has read permission) and sends them via stdout (print) to Apache, which then just forwards them to your web browser.
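
To make that concrete, here is a minimal sketch of what "sending a file via stdout" means for a CGI script: one header line, an empty line, then the file content. send_file() and its parameters are hypothetical names chosen for this illustration, and the content type is simply assumed to be HTML.

```python
import sys

def send_file(path, out=None, content_type='text/html'):
    """Write a minimal CGI response: header, blank line, then the file body."""
    if out is None:
        out = sys.stdout  # under CGI, the web server reads the script's stdout
    with open(path) as fin:
        body = fin.read()
    out.write('Content-Type: %s\n\n' % content_type)
    out.write(body)
```

Apache never opens the file itself; it only receives whatever the script writes, which is why the file can live anywhere the script has read permission.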






0
 

Author Comment

by:sara_bellum
Thanks for the code!! Unfortunately your copy cuts off the right-hand margin. I'm trying to fill in the missing parts but it's a guess...
0
 
LVL 16

Accepted Solution

by:
gelonida earned 500 total points
Arghhh!

Here again

#!/usr/bin/python
import os, sys
import cgi, Cookie
from string import Template

good_password = 'topsecret'
my_path = os.path.abspath(os.path.dirname(__file__))
protected_path = os.path.join(my_path, 'protected')
request_method = os.environ.get('REQUEST_METHOD', 'GET')
cgi_script = os.environ.get('SCRIPT_NAME','')
path_info = os.environ.get('PATH_INFO', '')
form = cgi.FieldStorage() 
errors = []
http_head = []

def print_http_head():
    print '\n'.join(http_head)+'\n'


def show_passwd_form(): # displays the login form
    s = Template(template)
    http_head.append('Content-type: text/html')
    print_http_head()
    print s.substitute(fill_me=' '.join(errors), cgi_script=cgi_script)

def check_form_passwd(): # 
    # if password given passwd cookie will be set.
    # return value: True if password is good
    submit_value  =  form.getvalue('login')
    if submit_value != 'SUBMIT': # abort submit wasn't pressed
        return False 
    passwd =  form.getvalue('j_password')
    if passwd is not None:
        cookie = Cookie.SimpleCookie()
        cookie['password'] = passwd
        http_head.append(str(cookie))
    passwd_match =  passwd == good_password
    if not passwd_match:
        errors.append('wrong password')
    return passwd_match

def check_passwd_cookie():
    cookies = Cookie.SimpleCookie(os.environ.get('HTTP_COOKIE', ''))
    passwd = cookies['password'].value if 'password' in cookies else None  # no cookie yet: avoid KeyError
    errors.append('cookiepasswd is &lt;%s&gt;' % passwd)
    return passwd == good_password

def is_subdir_of(subpath, parentpath):
    """ make sure to not serve any file located outside of the
        protected directory
    """
    subpath = os.path.normpath(os.path.abspath(subpath))
    parentpath = os.path.normpath(os.path.abspath(parentpath))
    parent_len = len(parentpath)
    if len(subpath) <= parent_len: 
        return False # can't be below parent path is too short
    if subpath[:parent_len] != parentpath:
        return False # doesn't even start the same way
    if subpath[parent_len] == '/':
        return True # seems to be a sub dir
    return False

def serve_file(path_info):
    file_to_serve = os.path.join(protected_path, path_info[1:])
    if not is_subdir_of(file_to_serve, protected_path):
        return # could display error message
    ## should determine correct Content type depending on suffix
    http_head.append('Content-type: text/html') 
    print_http_head()
    fin = open(file_to_serve)
    content = fin.read()
    fin.close()
    sys.stdout.write(content)
    
def main():
    if path_info == '':  
        if request_method == 'GET':
            show_passwd_form()
            sys.exit(0)
        elif request_method == 'POST':
            passwd_ok = check_form_passwd()  
            if passwd_ok:
                index_url = cgi_script + '/index.html'
                http_head.append('Location: ' + index_url)
                print_http_head()
                sys.exit(0)
        else:
            errors.append('unimplemented HTTP method')
            passwd_ok = False
    else:
        passwd_ok = check_passwd_cookie()
            
    if not passwd_ok:
        show_passwd_form()
        env_vals = [ '%s : %s' % v for v in os.environ.items()]
        print '\n'.join(errors) + '<br/>'
        print '<br/>\n'.join(sorted(env_vals))
        sys.exit(0)

    serve_file(path_info)

template = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head><title>Log-on page</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<style>...</style>
</head>
<body> 
<form method="POST" action="${cgi_script}">
<p>${fill_me}</p>
<p>Password:<input type="text" name="j_password" value=""></p>
<p><input type="submit" name="login" value="SUBMIT"> <input type="reset" name="reset"> 
</p>
</form>
</body>
</html>
"""

if __name__ == "__main__":
    main()


0
 

Author Comment

by:sara_bellum
It works, thanks very much. I'm not clear about how it works but I was able to make small changes. The rest will take time I expect.

I wanted to see what would happen if I linked the protected index.html page to another directory. I was allowed in, in spite of the is_subdir_of(subpath, parentpath) method, but then realized that this function is designed to protect subdirectories of the protected directory from access. So far, so good.

I'll ponder this for a bit and then close the question.
0
 
LVL 16

Expert Comment

by:gelonida
Glad it is working!
I'm also glad you could adapt the script!

Some more explanations / links:

Lines 7/8 (the assignments to my_path and protected_path) are just one way to create a directory path relative to your CGI path.
Replace this path with any path that suits your setup.
For production, the path 'protected' should probably NOT be a subdirectory of your CGI path, and probably not a subdirectory of the document root of your virtual server (unless you block access with a restrictive .htaccess rule or an alias).

To read a little more about PATH_INFO:
http://www.perlfect.com/articles/cgi_env.shtml
http://www.peterbenjamin.com/seminars/cgi/cgi4programmers.html

A little more about handling Cookies:
http://docs.python.org/release/2.6.7/library/cookie.html?highlight=cookie#Cookie.BaseCookie
http://www.jayconrod.com/cgi/view_post.py?17 (however, they do not use the Cookie module to set cookies, but just print the HTTP cookie headers manually)
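
As a small stand-alone illustration of the cookie handling used in the script, the sketch below builds a Set-Cookie header and reads a value back from an HTTP_COOKIE string. The helper names set_cookie_header() and read_cookie() are made up for this example; the import falls back to the Python 2 Cookie module used in the script, since newer Python versions provide the same class as http.cookies.

```python
try:
    from http.cookies import SimpleCookie   # Python 3
except ImportError:
    from Cookie import SimpleCookie          # Python 2, as in the script above

def set_cookie_header(name, value):
    """Return the header line that asks the browser to store a cookie."""
    cookie = SimpleCookie()
    cookie[name] = value
    return cookie.output()

def read_cookie(header_value, name):
    """Return the value of one cookie from an HTTP_COOKIE string, or None."""
    cookies = SimpleCookie(header_value)
    if name in cookies:
        return cookies[name].value
    return None

print(set_cookie_header('password', 'topsecret'))
# prints: Set-Cookie: password=topsecret
```

Checking with "name in cookies" before reading the value avoids a KeyError on the first request, when the browser has not sent any cookie yet.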


A little more about HTTP redirection:
Lines 84/85 (the 'Location:' header in main()) use an HTTP 'Location:' header to redirect you to another URL (index.html of your protected directory).
This is probably not 100% clean, but should work.
I think that to be 100% compliant with the standard you should use a different redirect technique:
I only set the Location header, but not the status code, which should probably be 302.
Furthermore, I don't provide a complete URL, just a relative one.
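
A cleaner redirect could look roughly like the sketch below, which sends an explicit status line and a complete URL. This is an assumption-laden illustration, not a tested drop-in replacement: redirect_headers() is a made-up helper, and it assumes the server provides HTTP_HOST (or SERVER_NAME) and sets HTTPS to 'on' for SSL requests, as Apache with mod_ssl normally does. As far as I understand the CGI specification (RFC 3875), a Location header containing only a path may be handled by the server internally, without the browser ever seeing the new URL, which is one reason to send a complete URL instead.

```python
def redirect_headers(environ, path):
    """Return CGI header lines for a 302 redirect to an absolute URL."""
    # Assumption: HTTPS is 'on' for SSL requests, otherwise plain http
    scheme = 'https' if environ.get('HTTPS', '').lower() == 'on' else 'http'
    host = environ.get('HTTP_HOST') or environ.get('SERVER_NAME', 'localhost')
    return ['Status: 302 Found',
            'Location: %s://%s%s' % (scheme, host, path)]

headers = redirect_headers({'HTTP_HOST': 'myserver'},
                           '/cgi-bin/login.py/index.html')
print('\n'.join(headers) + '\n')
```

In the script, these two lines would be appended to http_head in place of the bare Location header, with os.environ passed as the first argument.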

More reading:
http://en.wikipedia.org/wiki/URL_redirection
http://www.instant-web-site-tools.com/html-redirect.html
http://johnbokma.com/perl/redirectioncgiscript.html



Concerning the function is_subdir_of():

The function is_subdir_of() is intended to prohibit users from trying to access files outside
 of your protected directory.

It tries to protect from tricks like:

# please note the double slash '//' after yourcgiscript.py
http://yourserver/yourcgiscript.py//etc/passwd
PATH_INFO would thus be /etc/passwd

or tricks like
http://yourserver/yourcgiscript.py/../../etc/passwd
(assuming your protected path were /www/protected)
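
The following sketch shows what the script's path handling turns such requests into. It rebuilds the path the same way serve_file() does (drop the first character of PATH_INFO, join, normalise). The directory /www/protected is the example location from above, candidate_path() is a made-up helper, and the second call assumes the server passes the double slash through in PATH_INFO.

```python
import posixpath

protected = '/www/protected'  # assumption: example location from the text

def candidate_path(path_info):
    """Build the path the same way serve_file() does, then normalise it."""
    return posixpath.normpath(posixpath.join(protected, path_info[1:]))

print(candidate_path('/index.html'))        # /www/protected/index.html
print(candidate_path('//etc/passwd'))       # /etc/passwd
print(candidate_path('/../../etc/passwd'))  # /etc/passwd
```

Only the first result starts with /www/protected/, so is_subdir_of() would reject the other two.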


However, the CGI script currently follows symbolic links.
(Perhaps you intentionally want to link some files into the protected directory.)

If for whatever reason you do not want to follow symbolic links leading outside of your protected path, you could do the following:
in lines 53/54 (the two normpath calls in is_subdir_of()), replace 'normpath' with 'realpath'.
This should compare the real paths (with all symlinks resolved) and then check whether
subpath is really located below parentpath.
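
A compact sketch of that realpath variant is shown below. is_below() is a hypothetical stand-in for is_subdir_of(), written separately so it can be tried on its own; it is meant as an illustration of the idea rather than a hardened check.

```python
import os

def is_below(subpath, parentpath):
    """True if subpath, with all symlinks resolved, lies below parentpath."""
    sub = os.path.realpath(subpath)
    parent = os.path.realpath(parentpath)
    # The trailing separator keeps '/www/protected_old' from matching
    # '/www/protected', and excludes the parent directory itself.
    return sub.startswith(parent + os.sep)
```

With this check, a symlink inside the protected directory that points to a file elsewhere is refused, because the comparison is made on the link's target rather than on the link's own location.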




0
 

Author Comment

by:sara_bellum
That's a lot to absorb, thanks for the run-down!!

I just tried implementing an alias, which, according to apache mod_alias, is used to hide the true path, where the true path is typically outside the site's doc root, as you pointed out.  

It all seems to work, although the process is still quite mysterious to me.  The log-in page takes me to the index page with the same url as before; without the alias this path would return an error because it no longer contains any files, since all of the protected html pages are now located in a directory outside the site's doc root.

That's probably the intended behavior, but the index page links to the photo pages. If I look at the source code on these photo pages, the paths to the images should all be broken. (I copied the photo directory to the new path-outside-site-doc-root but haven't yet added a directory definition for it in apache's site config; the location of the new photo directory relative to the protected html pages is now different anyway.)

My firefox client may  simply be holding the images in cache; I'll look into that tomorrow.
0
 
LVL 16

Assisted Solution

by:gelonida
gelonida earned 500 total points
Sara,

You should NOT add an alias to get access to your photo directory.
This would defeat the whole idea of the access script.
The idea is that Apache itself cannot DIRECTLY access your protected directory.

Only the CGI script can parse PATH_INFO and access the files in the directory below.


I noticed that my suggested redirect does not work exactly as I expected, which complicates understanding how the script is supposed to work.

So I propose to change the piece of code currently intended for the redirection.
You need one more click to access the album, but perhaps it is easier to understand.

        elif request_method == 'POST':
            passwd_ok = check_form_passwd()  
            if passwd_ok:
                index_url = cgi_script + '/index.html'
                http_head.append('Content-type: text/html')
                s = Template(template_logged_in)
                print_http_head()
                print s.substitute(index_url=index_url)
                sys.exit(0)



Additionally, you have to declare the template for the successful login; just add it immediately after the assignment of template:

template_logged_in = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head><title>Log-on page</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<style>...</style>
</head>
<body>
Congratulations! You are now logged in:<br/>
You can view the photo album by clicking on <br/>
<a href="${index_url}"> View Photo Album</a>
</body>
</html>
"""

Please also note that you have to adapt the content type (line 67, in serve_file()) for files which are not HTML files.

You can use the function mimetypes.guess_type() from the module mimetypes.
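
A minimal sketch of how mimetypes.guess_type() could be used for this is shown below. content_type_for() and the fallback type are assumptions made for the illustration; they are not part of the script above.

```python
import mimetypes

def content_type_for(path, default='application/octet-stream'):
    """Guess a Content-Type header value from the file name suffix."""
    guessed, _encoding = mimetypes.guess_type(path)
    return guessed or default

print(content_type_for('index.html'))  # text/html
print(content_type_for('photo.jpg'))   # image/jpeg
```

In serve_file() the result would replace the hard-coded 'text/html'. For images and other binary files, the file should also be opened in binary mode ('rb') before its content is written out.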


0
 

Author Comment

by:sara_bellum
I removed the duplicate copy of the photo directory from the path outside the site doc root, based on your latest comment. The cgi script reads the path to the protected directory outside the doc root, but the path to the photo directory remains unchanged (inside doc root).

I also made sure that apache doesn't list the photo directory contents. I thought I had already fixed that problem, but it turns out that my earlier solution was incorrect. Fyi the fix was to change the apache default "Options Indexes FollowSymLinks MultiViews" to "Options -Indexes FollowSymLinks MultiViews" (with a minus sign in front of Indexes)

Finally, I checked apache docs regarding the cgi-bin path for virtual hosts, and moved it outside of the site doc root. I don't like seeing cgi-bin in the url either, so I changed the alias from /cgi-bin/ to another name. It still works :)

It's genius to learn how a cgi script can parse PATH info to access files, thanks very much for this!! I didn't use your second solution (which is probably more bullet-proof than the first) because I want to minimize the number of clicks required for the user to get to the destination directory.

It's late now so I'll close and award points tomorrow.
0
 

Author Closing Comment

by:sara_bellum
This was lots of fun, in spite of how long it took to get to the final solution :)
0
