Solved

issue with 'path' on python script

Posted on 2012-04-02
Last Modified: 2012-05-29
So I have a Python script that parses through a folder on disk, opens every TXT file, strips out the email addresses within the file, then appends the email address to the master file... This used to work well, but now I keep getting an error with the path argument that is passed to the script...

#!/usr/bin/python

#Import the necessary modules for use later:
import sys, os, fileinput, glob, datetime

def emailparser(directory, master):
	#Create a list of the locations of all of the bounceback e-mails:
	dirlist=glob.glob(os.path.join(str(directory),'*.txt'))

	#Create a list of the target e-mail addresses in each bounceback
	#from the line in the bounceback that begins "To:":
	bouncers=[email[3:].strip() for email in fileinput.input(dirlist) if email[0:3]=="To:"]

	#Open the master list of e-mails and strip each e-mail
	#of newline characters:
	masterlist=[email.strip() for email in open(str(master)).readlines()]

	#Create a new master list of e-mails by removing those that are in
	#the bouncer list and adding newline characters for later file writing:
	newmasterlist=[email+"\r\n" for email in masterlist if email not in bouncers]

	#Get today's date:
	today=str(datetime.date.today())
	
	#Create a new master e-mail file name with today's date appended:
	newmasterfilename=os.path.join(os.path.dirname(master),'master_list_emails.{0}.txt'.format(today))
	
	#Open the new master file for writing:
	newmasterfile=open(newmasterfilename,'w')
	
	#Write the new master list to the file:
	newmasterfile.writelines(newmasterlist)
	
	#Close the new master file:
	newmasterfile.close()

	#Some dialogue to confirm successful execution:
	print "Old e-mail list size: ", len(masterlist)
	print "New e-mail list size: ", len(newmasterlist)
	print "E-mails pruned: ", len(masterlist)-len(newmasterlist)
	print "Location of new master list:", newmasterfilename
	

if __name__=="__main__":
	import sys, os
	try:
		emailparser(directory=str(sys.argv[1]), master=str(sys.argv[2]))
	except:
		print "You did not enter the absolute path to the directory of the bounced e-mail files and/or the path to the master e-mail list."
		print "Please run the command in the following form (in the path formatting of your platform):"
		print "python emailparser.py '/Path/to/directory/' '/Path/to/file/master_list_emails.txt'"
	


Then, to run it, you pass the directory and the master file as parameters:

 python emailparser.py '/home/email_parser/bounced/' '/home/email_parser/bounced/master_list_emails.txt'



When I run it, I get the "path not found" error.
Question by:Timothy Golden
15 Comments
 
LVL 3

Author Comment

by:Timothy Golden
ID: 37796749
What I really want to do is parse through 6000+ bounced emails from my mail daemon and remove them from my Excel data file... So this Python script used to work well, but for some dumb reason it won't anymore...

So if there is a better way to parse through 6000+ .EML files and extract the email addresses into another file, please indulge me... Either way, making this script work, or making a new script that does the same thing, is what I'm looking to get fixed.

The underlying issue here is that I have 6000+ bounced .EML emails in my Outlook, and I want an easy way to parse through them and generate a list of the 6000+ addresses so I can remove them from my database.
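
For readers with the same goal, here is a minimal sketch (not from the thread) of pulling the bare address out of each "To:" line, the same lines the script above looks for. The function name extract_to_addresses is made up for illustration, and the sketch assumes a Python version where the email.utils module is available (2.5 or later, as far as I know):

```python
from email.utils import parseaddr


def extract_to_addresses(lines):
    """Collect the e-mail address from every line that starts with 'To:'."""
    addresses = []
    for line in lines:
        if line[:3] == "To:":
            # parseaddr splits 'Name <user@host>' into (name, address).
            name, address = parseaddr(line[3:].strip())
            if address:
                addresses.append(address)
    return addresses
```

Compared with slicing off the first three characters, this also drops any display name, so "To: Jane <jane@example.com>" yields just the address. Whether the "To:" line of a bounce really holds the failed recipient depends on the mail server, so check a few files by hand first.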
 
LVL 9

Expert Comment

by:Hamed Zaghaghi
ID: 37797035
Hi,

Does the directory path contain spaces?
What error do you get, exactly?
 
LVL 3

Author Comment

by:Timothy Golden
ID: 37797066
The path has no spaces... The error comes from the try/except block.
path = /home/email_parser/bounced/



error
You did not enter the absolute path to the directory of the bounced e-mail files and/or the path to the master e-mail list.
Please run the command in the following form (in the path formatting of your platform):
python emailparser.py '/Path/to/directory/' '/Path/to/file/master_list_emails.txt'
	

 
LVL 41

Expert Comment

by:HonorGod
ID: 37797102
You could add some code to verify the user input.

For example,
  if os.path.exists( directory ) and os.path.isdir( directory ) :
    dirlist=glob.glob( os.path.join( directory, '*.txt' ) )
  else :
    print "Input error - specified location doesn't exist, or isn't a directory:", directory



Note: You don't need to use str(sys.argv[1]) since sys.argv is a list/array of strings.
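
A fuller version of that check could look like the sketch below. It assumes the script takes exactly two arguments (the directory and the master file), as in the question; the helper name check_args is made up for illustration.

```python
import os


def check_args(argv):
    """Return a list of problems found with the command-line arguments."""
    problems = []
    if len(argv) != 3:
        # argv[0] is the script name, so two user arguments means length 3.
        problems.append("expected 2 arguments, got %d" % (len(argv) - 1))
        return problems
    directory, master = argv[1], argv[2]
    if not os.path.isdir(directory):
        problems.append("not a directory: %s" % directory)
    if not os.path.isfile(master):
        problems.append("master list not found: %s" % master)
    return problems
```

Calling check_args(sys.argv) before emailparser() and printing whatever it returns would tell you which of the two paths is actually at fault, rather than a single catch-all message.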
 
LVL 3

Author Comment

by:Timothy Golden
ID: 37797112
I will try that code... BUT I am putting the full path in correctly.
 
LVL 9

Expert Comment

by:Hamed Zaghaghi
ID: 37797390
Please remove the try/except block so that we can see the actual error message!
 
LVL 41

Expert Comment

by:HonorGod
ID: 37797438
Or, better yet, in the except clause, immediately after the "except", add the following statements:

  Type, value = sys.exc_info()[ : 2 ]
  print 'Exception type :', str( Type )
  print 'Exception value:', str( value )
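
If you would rather see the whole traceback than just the type and value, the standard traceback module can print it from inside the except clause. A minimal sketch; run_and_report is a hypothetical helper name, not part of the original script:

```python
import traceback


def run_and_report(func, *args):
    """Call func(*args); on failure print the full traceback and return False."""
    try:
        func(*args)
        return True
    except Exception:
        # Show the real error (file, line, message) instead of a
        # generic "you did not enter the path" message.
        traceback.print_exc()
        return False
```

Used as run_and_report(emailparser, sys.argv[1], sys.argv[2]), this would have pointed straight at the failing line instead of blaming the path.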

 
LVL 3

Author Comment

by:Timothy Golden
ID: 37798112
now i have this:
AttributeError: 'str' object has no attribute 'format'
 
LVL 41

Expert Comment

by:HonorGod
ID: 37798288
what?!?

str() is a built in function that converts the parameter to a string.

What, exactly, do you have in the except clause?
 
LVL 9

Expert Comment

by:Hamed Zaghaghi
ID: 37799227
Hi,

If you want to use format, you must run your script with Python 2.6, 2.7, or 3.x. I think you have Python 2.5.
If you don't want to change your Python version, use the % operator instead of format.

http://docs.python.org/release/3.0.1/whatsnew/2.6.html#pep-3101-advanced-string-formatting

change this line:
newmasterfilename=os.path.join(os.path.dirname(master),'master_list_emails.{0}.txt'.format(today))

to
newmasterfilename=os.path.join(os.path.dirname(master),'master_list_emails.%s.txt' % (today))
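
As a self-contained sketch of that change: the % operator builds the same file name and also works on Python versions that predate str.format. The function name dated_master_name and the optional today parameter are assumptions added here so the result can be checked with a fixed date.

```python
import datetime
import os


def dated_master_name(master, today=None):
    """Build the dated master-list path next to the existing master file."""
    if today is None:
        today = datetime.date.today()
    # '%s' calls str() on the date, giving e.g. '2012-04-02'.
    filename = 'master_list_emails.%s.txt' % today
    return os.path.join(os.path.dirname(master), filename)
```

For example, dated_master_name('/home/email_parser/bounced/master_list_emails.txt', datetime.date(2012, 4, 2)) gives a path ending in master_list_emails.2012-04-02.txt in the same directory.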
 
LVL 3

Author Comment

by:Timothy Golden
ID: 37800703
thanks @zaghaghiP i changed the line and now get the following:
<code>

  File "bounceparser.py", line 30
    newmasterfile=open(newmasterfilename,'w')
    ^
SyntaxError: invalid syntax
[root@lamp bounced]#

</code>
Line 30 =
<code>
      newmasterfile=open(newmasterfilename,'w')
</code>

py version
<code>
Python 2.4!
</code>

whole script:
<code>
#!/usr/bin/python

#Import the necessary modules for use later:
import sys, os, fileinput, glob, datetime

def emailparser(directory, master):
      #Create a list of the locations of all of the bounceback e-mails:
      dirlist=glob.glob(os.path.join(str(directory),'*.txt'))

      #Create a list of the target e-mail addresses in each bounceback
      #from the line in the bounceback that begins "To:":
      bouncers=[email[3:].strip() for email in fileinput.input(dirlist) if email[0:3]=="To:"]

      #Open the master list of e-mails and strip each e-mail
      #of newline characters:
      masterlist=[email.strip() for email in open(str(master)).readlines()]

      #Create a new master list of e-mails by removing those that are in
      #the bouncer list and adding newline characters for later file writing:
      newmasterlist=[email+"\r\n" for email in masterlist if email not in bouncers]

      #Get today's date:
      today=str(datetime.date.today())
      
      #Create a new master e-mail file name with today's date appended:
      #newmasterfilename=os.path.join(os.path.dirname(master),'master_list_emails.{0}.txt'.format(today))
newmasterfilename=os.path.join(os.path.dirname(master),'master_list_emails.%s.txt' % (today))
      
      #Open the new master file for writing:
      newmasterfile=open(newmasterfilename,'w')
      
      #Write the new master list to the file:
      newmasterfile.writelines(newmasterlist)
      
      #Close the new master file:
      newmasterfile.close()

      #Some dialogue to confirm successful execution:
      print "Old e-mail list size: ", len(masterlist)
      print "New e-mail list size: ", len(newmasterlist)
      print "E-mails pruned: ", len(masterlist)-len(newmasterlist)
      print "Location of new master list:", newmasterfilename
      
if __name__=="__main__":
      import sys, os
      emailparser(directory=str(sys.argv[1]), master=str(sys.argv[2]))
</code>
 
LVL 9

Expert Comment

by:Hamed Zaghaghi
ID: 37800783
Hi,
correct the indentation of this line:
newmasterfilename=os.path.join(os.path.dirname(master),'master_list_emails.%s.txt' % (today))
 
LVL 41

Expert Comment

by:HonorGod
ID: 37801125
The comment by zaghaghi means that this error:

SyntaxError: invalid syntax

generally indicates that you have inconsistent indentation.
Remember that Python uses indentation to indicate whether statements are nested or not.

For example:

if myString.startswith( 'something' ) :
  print "something found."



The print statement is indented with 2 blank spaces to indicate that it is the "then" clause of the "if" statement.  All subsequent statements within this same clause must be indented by the same amount of whitespace.

Special note: Some text editors will, by default, use tab characters and spaces interchangeably. You should be careful about this.  I would strongly encourage you to replace any and all tab characters with the appropriate number of blank spaces.
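
To find out whether a script has tabs hiding in its indentation, a few lines of Python will do. This is only a sketch, and the function name lines_with_tab_indent is made up for illustration:

```python
def lines_with_tab_indent(source):
    """Return 1-based line numbers whose leading whitespace contains a tab."""
    flagged = []
    for index, line in enumerate(source.splitlines()):
        # Everything before the first non-whitespace character.
        indent = line[:len(line) - len(line.lstrip())]
        if "\t" in indent:
            flagged.append(index + 1)
    return flagged
```

Passing it the text of the script, e.g. lines_with_tab_indent(open('bounceparser.py').read()), lists the lines to fix. The standard library's tabnanny module (python -m tabnanny bounceparser.py) does a more thorough check for ambiguous indentation.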
 
LVL 3

Accepted Solution

by:
Timothy Golden earned 0 total points
ID: 38009950
None of these solutions worked, and I gave up trying to make it work.
 
LVL 3

Author Closing Comment

by:Timothy Golden
ID: 38021362
Because I could not get anything to work.