unix_admin777
asked on
Match a Location tag in Apache httpd.conf with python
Hi,
I need some help parsing an httpd.conf file using Python. For some reason I am unable to match a multiline block of text. I am trying to open the file, match the <Location /testing> ... </location> block, and print only the matching text. Can someone please provide some Python code to do this? Here is the sample text:
<VirtualHost *:80>
ServerAdmin helpdesk@test.com
# DocumentRoot "/var/www/html"
ServerName gq-svn-01.test.com
ServerAlias gqsvntest.test.com
LogLevel debug
<Location />
AuthBasicProvider ldap
AuthzLDAPAuthoritative off
AuthType Basic
AuthName
</Location>
<Location /testing>
DAV svn
SVNPath /home/repos/testing
</Limit>
</location>
</VirtualHost>
Firstly, the content of httpd.conf is broken (from the XML point of view). The <Location /> at line 7 of the sample denotes an empty XML element (a single tag after which no closing </Location> is expected). The "/" would have to be given as an attribute of the element. The same holds for line 13. Line 16 is not paired with any opening <Limit> tag. Line 17 must be </Location>, as XML is case sensitive.
I recommend using the standard xml.etree.ElementTree module for parsing XML files instead of regular expressions. Try the following as a starting point (docs.python.org/library/xml.etree.elementtree.html):
a.py
import xml.etree.ElementTree as ET
tree = ET.parse('httpd.conf')
ET.dump(tree)
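Note that the Apache configuration syntax only resembles XML. The following is a minimal sketch for checking whether ElementTree accepts such text. It assumes an abbreviated copy of the sample from the question, and the helper name try_parse_as_xml is made up for illustration. A line such as <VirtualHost *:80> is not a well-formed XML start tag, so the parser should be expected to reject the sample:

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the sample configuration from the question.
SAMPLE = """<VirtualHost *:80>
ServerName gq-svn-01.test.com
<Location /testing>
DAV svn
</location>
</VirtualHost>
"""


def try_parse_as_xml(text):
    """Return None if text parses as XML, else the parser's error message."""
    try:
        ET.fromstring(text)
    except ET.ParseError as exc:
        return str(exc)
    return None


if __name__ == '__main__':
    # Prints the parse error reported for the sample, if any.
    print(try_parse_as_xml(SAMPLE))
```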
Sorry. Back to the trees :) I did not notice that httpd.conf is not an XML file. Try the following:
b.py
import re

def getLocationLines(fname, loc_id):
    rexLocationOpen = re.compile(r'<Location\s+(\S+)?\s*>')
    rexLocationClose = re.compile(r'</location>')  # should be </Location> with capital L

    with open(fname) as f:      # the file is closed when the generator finishes
        status = 0
        for line in f:
            if status == 0:     # waiting for the opening line
                m = rexLocationOpen.search(line)
                if m:           # found?
                    if m.group(1) == loc_id:
                        status = 1   # let's generate the content between
            elif status == 1:   # generating lines between the tags
                m = rexLocationClose.search(line)
                if m:
                    status = 2
                else:
                    yield line  # yield the next interesting line
            elif status == 2:   # just pass the rest of the lines
                pass

if __name__ == '__main__':
    for line in getLocationLines('httpd.conf', '/testing'):
        print(line.rstrip())    # parentheses work in both Python 2 and 3
It prints on my console (Windows):
c:\tmp\___python\unix_admin777\Q_27673761>python b.py
DAV svn
SVNPath /home/repos/testing
</Limit>
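Since the original question was about matching a multiline block, the same extraction can also be sketched with a single regular expression applied to the whole text. This is only an alternative under stated assumptions, not the code posted above: the function name get_location_body and the SAMPLE string are made up for illustration, and re.IGNORECASE is used so that both </Location> and </location> end the block (it also makes the opening tag case-insensitive).

```python
import re

# Abbreviated copy of the sample configuration from the question.
SAMPLE = """<Location />
AuthType Basic
</Location>
<Location /testing>
DAV svn
SVNPath /home/repos/testing
</Limit>
</location>
"""


def get_location_body(text, loc_id):
    """Return the text between <Location loc_id> and the next closing tag.

    Returns None when no such block is found.
    """
    pattern = re.compile(
        r'<Location\s+' + re.escape(loc_id) + r'\s*>\s*\n'  # opening tag line
        r'(.*?)'                                            # body, non-greedy
        r'</Location>',                                     # first closing tag
        re.DOTALL | re.IGNORECASE)   # DOTALL lets "." match newlines too
    m = pattern.search(text)
    return m.group(1) if m else None


if __name__ == '__main__':
    print(get_location_body(SAMPLE, '/testing'))
```

The non-greedy (.*?) matters here: a greedy .* would run on to the last closing tag in the file instead of the first one after the opening tag.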
ASKER
Thank you for the help. It seems to work, but I also want to match the <Location /testing> and </location> tags themselves. Also, even after looking at this code for a while, I still don't understand what it is doing. Could you summarize the logic? It seems like you have somehow tagged the matched block, but I don't understand how.
This is what confuses me:
for line in f:
    if status == 0:     # waiting for the opening line
        m = rexLocationOpen.search(line)
        if m:           # found?
            if m.group(1) == loc_id:
                status = 1   # let's generate the content between
    elif status == 1:   # generating lines between the tags
        m = rexLocationClose.search(line)
        if m:
            status = 2
        else:
            yield line  # yield the next interesting line
    elif status == 2:   # just pass the rest of the lines
        pass
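One way to read this loop is as a small state machine. The variable status remembers where we are in the file: 0 means the wanted opening tag has not been seen yet, 1 means we are inside the block, and 2 means the block has ended. Nothing is tagged; each line is simply handled according to the current state, and yield hands a line back to the caller's for loop. The following sketch is not the accepted solution, only an illustration of the same idea that also yields the two tag lines. The names location_block_lines, WAITING, INSIDE and DONE are made up here, and the function is assumed to take any iterable of lines (an open file works) rather than a file name.

```python
import re

WAITING, INSIDE, DONE = 0, 1, 2   # the three values of `status`


def location_block_lines(lines, loc_id):
    """Yield the <Location loc_id> block from lines, tag lines included."""
    rex_open = re.compile(r'<Location\s+(\S+)?\s*>')
    # IGNORECASE so that both </Location> and </location> close the block.
    rex_close = re.compile(r'</Location>', re.IGNORECASE)
    status = WAITING
    for line in lines:
        if status == WAITING:
            m = rex_open.search(line)
            if m and m.group(1) == loc_id:
                status = INSIDE
                yield line        # the opening tag line itself
        elif status == INSIDE:
            yield line            # body lines, and finally the closing tag
            if rex_close.search(line):
                status = DONE
        else:
            break                 # block finished, skip the rest of the input


if __name__ == '__main__':
    with open('httpd.conf') as f:
        for line in location_block_lines(f, '/testing'):
            print(line.rstrip())
```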