Solved

Issue with 'path' in a Python script

Posted on 2012-04-02
Last Modified: 2012-05-29
So I have a Python script that parses through a folder on disk, opens every TXT file, strips out the email addresses within the file, then appends the email addresses to the master file. This used to work well, but now I keep getting an error with the path argument that is passed to the script:

#!/usr/bin/python

#Import the necessary modules for use later:
import sys, os, fileinput, glob, datetime

def emailparser(directory, master):
	#Create a list of the locations of all of the bounceback e-mails:
	dirlist=glob.glob(os.path.join(str(directory),'*.txt'))

	#Create a list of the target e-mail addresses in each bounceback
	#from the line in the bounceback that begins "To:":
	bouncers=[email[3:].strip() for email in fileinput.input(dirlist) if email[0:3]=="To:"]

	#Open the master list of e-mails and strip each e-mail
	#of newline characters:
	masterlist=[email.strip() for email in open(str(master)).readlines()]

	#Create a new master list of e-mails by removing those that are in
	#the bouncer list and adding newline characters for later file writing:
	newmasterlist=[email+"\r\n" for email in masterlist if email not in bouncers]

	#Get today's date:
	today=str(datetime.date.today())
	
	#Create a new master e-mail file name with today's date appended:
	newmasterfilename=os.path.join(os.path.dirname(master),'master_list_emails.{0}.txt'.format(today))
	
	#Open the new master file for writing:
	newmasterfile=open(newmasterfilename,'w')
	
	#Write the new master list to the file:
	newmasterfile.writelines(newmasterlist)
	
	#Close the new master file:
	newmasterfile.close()

	#Some dialogue to confirm successful execution:
	print "Old e-mail list size: ", len(masterlist)
	print "New e-mail list size: ", len(newmasterlist)
	print "E-mails pruned: ", len(masterlist)-len(newmasterlist)
	print "Location of new master list:", newmasterfilename
	

if __name__=="__main__":
	import sys, os
	try:
		emailparser(directory=str(sys.argv[1]), master=str(sys.argv[2]))
	except:
		print "You did not enter the absolute path to the directory of the bounced e-mail files and/or the path to the master e-mail list."
		print "Please run the command in the following form (in the path formatting of your platform):"
		print "python emailparser.py '/Path/to/directory/' '/Path/to/file/master_list_emails.txt'"
	


To run it, you pass the directory and the master file as parameters:

 python emailparser.py '/home/email_parser/bounced/' '/home/email_parser/bounced/master_list_emails.txt'



When I run it, I get the "path not found" message.
Question by:Timothy Golden
15 Comments
 
LVL 3

Author Comment

by:Timothy Golden
ID: 37796749
What I really want to do is parse through 6000+ bounced emails from my mail daemon and remove them from my Excel data file. This Python script used to work well, but for some reason it won't any more.

So if there is a better way to parse through 6000+ .EML files and extract the email addresses into another file, please tell me. Either making this script work or writing a new script that does the same thing would solve my problem.

The underlying issue is that I have 6000+ bounced .EML emails in Outlook, and I want an easy way to parse through them and generate a list of the 6000+ email addresses so I can remove them from my database.
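
One possible shape for such a script is sketched below. It is not the script from the question: it assumes a current Python 3 interpreter, assumes the bounces are saved as `*.eml` files in one folder, and assumes each bounce names the failed address on a `Final-Recipient:` line (the delivery-status field many mail daemons emit). The function names and the loose address pattern are made up for illustration; pass `prefixes=("To:",)` to mimic the original script's behaviour instead.

```python
import glob
import os
import re

# Loose address pattern: an illustrative assumption, not a full RFC 5322 parser.
ADDRESS_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def bounced_addresses(directory, pattern="*.eml", prefixes=("Final-Recipient:",)):
    """Collect lower-cased addresses from lines that start with one of `prefixes`."""
    wanted = tuple(prefix.lower() for prefix in prefixes)
    found = set()
    for path in sorted(glob.glob(os.path.join(directory, pattern))):
        # errors="replace" keeps an oddly encoded file from aborting the whole run.
        with open(path, "r", encoding="utf-8", errors="replace") as handle:
            for line in handle:
                if line.lower().startswith(wanted):
                    found.update(a.lower() for a in ADDRESS_RE.findall(line))
    return found


def write_address_list(addresses, output_path):
    """Write one address per line, sorted, so the result is easy to diff."""
    with open(output_path, "w") as handle:
        for address in sorted(addresses):
            handle.write(address + "\n")
```

Collecting into a set removes duplicates when the same address bounced more than once, and the resulting text file can then be compared against the database export.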
 
LVL 9

Expert Comment

by:zaghaghi
ID: 37797035
Hi,

Does the directory path contain spaces?
What error do you get, exactly?
 
LVL 3

Author Comment

by:Timothy Golden
ID: 37797066
The path has no spaces. The error message comes from the try/except block.
path = /home/email_parser/bounced/



error
You did not enter the absolute path to the directory of the bounced e-mail files and/or the path to the master e-mail list.
Please run the command in the following form (in the path formatting of your platform):
python emailparser.py '/Path/to/directory/' '/Path/to/file/master_list_emails.txt'
	

 
LVL 41

Expert Comment

by:HonorGod
ID: 37797102
You could add some code to verify the user input.

For example,
  if os.path.exists( directory ) and os.path.isdir( directory ) :
    dirlist=glob.glob( os.path.join( directory, '*.txt' ) )
  else :
    print "Input error - specified location doesn't exist, or isn't a directory:", directory



Note: You don't need to use str(sys.argv[1]) since sys.argv is a list/array of strings.
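
As a rough sketch of the same idea applied to both arguments: the helper below checks the directory and the master file and reports each problem separately, so a wrong path is not confused with a missing argument. The name `check_inputs` and the message wording are made up for illustration; `%` formatting is used so the helper does not depend on `str.format()`.

```python
import os


def check_inputs(directory, master):
    """Return a list of problems; an empty list means both paths look usable."""
    problems = []
    if not os.path.isdir(directory):
        problems.append("not a directory: %s" % directory)
    if not os.path.isfile(master):
        problems.append("master list not found: %s" % master)
    return problems
```

Calling this before `emailparser()` and printing whatever it returns would show whether either path is really the cause.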
 
LVL 3

Author Comment

by:Timothy Golden
ID: 37797112
I will try that code, but I am entering the full path correctly.
 
LVL 9

Expert Comment

by:zaghaghi
ID: 37797390
Please remove the try/except block so that we can see the actual error message!
 
LVL 41

Expert Comment

by:HonorGod
ID: 37797438
Or, better yet, in the except clause, immediately after the "except", add the following statements:

  Type, value = sys.exc_info()[ : 2 ]
  print 'Exception type :', str( Type )
  print 'Exception value:', str( value )
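
A slightly fuller sketch of the same approach, assuming a current Python interpreter: it treats "too few arguments" as its own case and prints the full traceback for everything else, so the usage hint only appears when arguments really are missing. The `run` wrapper and its `worker` parameter (a stand-in for `emailparser`) are hypothetical names, not part of the original script.

```python
import sys
import traceback


def run(argv, worker):
    """Call worker(directory, master) and report failures without hiding their cause."""
    if len(argv) < 3:
        # Only this case deserves the "you did not enter the paths" hint.
        sys.stderr.write("usage: python emailparser.py DIRECTORY MASTER_FILE\n")
        return 2
    try:
        worker(argv[1], argv[2])
    except Exception:
        # Shows the exception type, the message, and the line that raised it.
        traceback.print_exc()
        return 1
    return 0
```

It could be wired in as `sys.exit(run(sys.argv, emailparser))` at the bottom of the script.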


 
LVL 3

Author Comment

by:Timothy Golden
ID: 37798112
now i have this:
AttributeError: 'str' object has no attribute 'format'
 
LVL 41

Expert Comment

by:HonorGod
ID: 37798288
what?!?

str() is a built in function that converts the parameter to a string.

What, exactly, do you have in the except clause?
 
LVL 9

Expert Comment

by:zaghaghi
ID: 37799227
Hi,

If you want to use format(), you must run your script with Python 2.6, 2.7, or 3.x. I think you have Python 2.5.
If you don't want to change your Python version, use the % operator instead of format().

http://docs.python.org/release/3.0.1/whatsnew/2.6.html#pep-3101-advanced-string-formatting

change this line:
newmasterfilename=os.path.join(os.path.dirname(master),'master_list_emails.{0}.txt'.format(today))

to
newmasterfilename=os.path.join(os.path.dirname(master),'master_list_emails.%s.txt' % (today))
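
The same change as a small standalone sketch. The function name `dated_master_name` and its optional `today` parameter are made up here so the result can be checked with a fixed date; the `%` formatting itself is what the line above uses.

```python
import datetime
import os


def dated_master_name(master, today=None):
    """Build 'master_list_emails.<YYYY-MM-DD>.txt' next to the existing master file."""
    if today is None:
        today = datetime.date.today()
    # The % operator predates str.format() (added in Python 2.6),
    # so this line does not depend on the newer method.
    return os.path.join(os.path.dirname(master), "master_list_emails.%s.txt" % today)
```

With the master path from the question and the date 2012-04-02, this returns a path ending in `master_list_emails.2012-04-02.txt` in the same directory as the master file.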
 
LVL 3

Author Comment

by:Timothy Golden
ID: 37800703
Thanks @zaghaghi, I changed the line and now get the following:
<code>

  File "bounceparser.py", line 30
    newmasterfile=open(newmasterfilename,'w')
    ^
SyntaxError: invalid syntax
[root@lamp bounced]#

</code>
Line 30 =
<code>
      newmasterfile=open(newmasterfilename,'w')
</code>

py version
<code>
Python 2.4!
</code>

whole script:
<code>
#!/usr/bin/python

#Import the necessary modules for use later:
import sys, os, fileinput, glob, datetime

def emailparser(directory, master):
      #Create a list of the locations of all of the bounceback e-mails:
      dirlist=glob.glob(os.path.join(str(directory),'*.txt'))

      #Create a list of the target e-mail addresses in each bounceback
      #from the line in the bounceback that begins "To:":
      bouncers=[email[3:].strip() for email in fileinput.input(dirlist) if email[0:3]=="To:"]

      #Open the master list of e-mails and strip each e-mail
      #of newline characters:
      masterlist=[email.strip() for email in open(str(master)).readlines()]

      #Create a new master list of e-mails by removing those that are in
      #the bouncer list and adding newline characters for later file writing:
      newmasterlist=[email+"\r\n" for email in masterlist if email not in bouncers]

      #Get today's date:
      today=str(datetime.date.today())
      
      #Create a new master e-mail file name with today's date appended:
      #newmasterfilename=os.path.join(os.path.dirname(master),'master_list_emails.{0}.txt'.format(today))
newmasterfilename=os.path.join(os.path.dirname(master),'master_list_emails.%s.txt' % (today))
      
      #Open the new master file for writing:
      newmasterfile=open(newmasterfilename,'w')
      
      #Write the new master list to the file:
      newmasterfile.writelines(newmasterlist)
      
      #Close the new master file:
      newmasterfile.close()

      #Some dialogue to confirm successful execution:
      print "Old e-mail list size: ", len(masterlist)
      print "New e-mail list size: ", len(newmasterlist)
      print "E-mails pruned: ", len(masterlist)-len(newmasterlist)
      print "Location of new master list:", newmasterfilename
      
if __name__=="__main__":
      import sys, os
      emailparser(directory=str(sys.argv[1]), master=str(sys.argv[2]))
</code>
 
LVL 9

Expert Comment

by:zaghaghi
ID: 37800783
Hi,
correct the indentation of this line:
newmasterfilename=os.path.join(os.path.dirname(master),'master_list_emails.%s.txt' % (today))
 
LVL 41

Expert Comment

by:HonorGod
ID: 37801125
The comment by zaghaghi means that this error:

SyntaxError: invalid syntax

comes, in this case, from inconsistent indentation.
Remember that Python uses indentation to indicate whether statements are nested or not.

For example:

if myString.startswith( 'something' ) :
  print "something found."



The print statement is indented with 2 blank spaces to indicate that it is the "then" clause of the "if" statement. All subsequent statements within this same clause must be indented by the same amount of whitespace.

Special note: Some text editors will, by default, use tab characters and spaces interchangeably. You should be careful about this. I would strongly encourage you to replace any and all tab characters with the appropriate number of blank spaces.
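
To find the offending lines before replacing anything, a small check along these lines may help. It is only a sketch: the function name is made up, and it flags any line whose leading whitespace contains a tab, not just lines that are actually inconsistent.

```python
def lines_with_tab_indent(source):
    """Return 1-based line numbers whose leading whitespace contains a tab."""
    flagged = []
    for number, line in enumerate(source.splitlines(), 1):
        # Everything before the first non-space, non-tab character is the indent.
        indent = line[:len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            flagged.append(number)
    return flagged
```

Reading the script with `open(path).read()` and passing the text to this function lists the lines to fix. The standard library's `tabnanny` module (`python -m tabnanny script.py`) performs a related check for ambiguous indentation.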
 
LVL 3

Accepted Solution

by:
Timothy Golden earned 0 total points
ID: 38009950
none of these solutions worked and i gave up trying to make it work
 
LVL 3

Author Closing Comment

by:Timothy Golden
ID: 38021362
because i could not get anything to work
