Solved

Issue with 'path' in Python script

Posted on 2012-04-02
Last Modified: 2012-05-29
So I have a Python script that parses through a folder on disk, opens every TXT file, pulls out the email addresses within each file, then removes those addresses from the master list and writes a new, dated master file... This used to work well, but now I keep getting an error with the path argument that is passed to the script...

#!/usr/bin/python

#Import the necessary modules for use later:
import sys, os, fileinput, glob, datetime

def emailparser(directory, master):
	#Create a list of the locations of all of the bounceback e-mails:
	dirlist=glob.glob(os.path.join(str(directory),'*.txt'))

	#Create a list of the target e-mail addresses in each bounceback
	#from the line in the bounceback that begins "To:":
	bouncers=[email[3:].strip() for email in fileinput.input(dirlist) if email[0:3]=="To:"]

	#Open the master list of e-mails and strip each e-mail
	#of newline characters:
	masterlist=[email.strip() for email in open(str(master)).readlines()]

	#Create a new master list of e-mails by removing those that are in
	#the bouncer list and adding newline characters for later file writing:
	newmasterlist=[email+"\r\n" for email in masterlist if email not in bouncers]

	#Get today's date:
	today=str(datetime.date.today())
	
	#Create a new master e-mail file name with today's date appended:
	newmasterfilename=os.path.join(os.path.dirname(master),'master_list_emails.{0}.txt'.format(today))
	
	#Open the new master file for writing:
	newmasterfile=open(newmasterfilename,'w')
	
	#Write the new master list to the file:
	newmasterfile.writelines(newmasterlist)
	
	#Close the new master file:
	newmasterfile.close()

	#Some dialogue to confirm successful execution:
	print "Old e-mail list size: ", len(masterlist)
	print "New e-mail list size: ", len(newmasterlist)
	print "E-mails pruned: ", len(masterlist)-len(newmasterlist)
	print "Location of new master list:", newmasterfilename
	

if __name__=="__main__":
	import sys, os
	try:
		emailparser(directory=str(sys.argv[1]), master=str(sys.argv[2]))
	except:
		print "You did not enter the absolute path to the directory of the bounced e-mail files and/or the path to the master e-mail list."
		print "Please run the command in the following form (in the path formatting of your platform):"
		print "python emailparser.py '/Path/to/directory/' '/Path/to/file/master_list_emails.txt'"
	


Then, to run it, you pass the directory and the master file as parameters:

 python emailparser.py '/home/email_parser/bounced/' '/home/email_parser/bounced/master_list_emails.txt'



When I run it, I get the "path not found" message.
Question by:luckynh
 
LVL 3

Author Comment

by:luckynh
ID: 37796749
What I really want to do is parse through 6000+ bounced emails from my mail daemon and remove them from my Excel data file... This Python script used to work well, but for some reason it won't any more...

So if there is a better way to parse through 6000+ .EML files and extract the email addresses into another file, I'm open to it. Either making this script work, or making a new script that does the same thing, is what I'm looking for.

The underlying issue is that I have 6000+ bounced .EML emails in my Outlook, and I want an easy way to parse through them and generate a list of those addresses so I can remove them from my database.
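
One way the request above could be approached is sketched below. It is only an illustration under assumptions not stated in the thread: it targets Python 3, it assumes each bounce file carries the failed recipient on a line starting with "To:" (the same rule the original script uses), and the function name `collect_bounced_addresses` and the `*.eml` pattern are made up for the example.

```python
import glob
import os
from email.utils import parseaddr


def collect_bounced_addresses(directory, pattern="*.eml"):
    """Return the sorted, de-duplicated addresses found on 'To:' lines.

    Hypothetical helper: the name and the default pattern are assumptions.
    """
    addresses = set()
    for path in glob.glob(os.path.join(directory, pattern)):
        # errors="replace" keeps an oddly encoded file from aborting the run.
        with open(path, "r", errors="replace") as handle:
            for line in handle:
                if line.startswith("To:"):
                    # parseaddr accepts both 'a@b.c' and 'Name <a@b.c>'.
                    _name, address = parseaddr(line[3:].strip())
                    if "@" in address:
                        addresses.add(address.lower())
    return sorted(addresses)
```

The returned list could then be written one address per line to a text file and compared against the database export.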
 
LVL 9

Expert Comment

by:zaghaghi
ID: 37797035
Hi,

Does the directory path contain spaces?
What error do you get exactly?
 
LVL 3

Author Comment

by:luckynh
ID: 37797066
The path has no spaces... the error comes from the try/except block.
path = /home/email_parser/bounced/



error
You did not enter the absolute path to the directory of the bounced e-mail files and/or the path to the master e-mail list.
Please run the command in the following form (in the path formatting of your platform):
python emailparser.py '/Path/to/directory/' '/Path/to/file/master_list_emails.txt'
	

 
LVL 41

Expert Comment

by:HonorGod
ID: 37797102
You could add some code to verify the user input.

For example,
  if os.path.exists( directory ) and os.path.isdir( directory ) :
    dirlist=glob.glob( os.path.join( directory, '*.txt' ) )
  else :
    print "Input error - specified location doesn't exist, or isn't a directory:", directory



Note: You don't need to use str(sys.argv[1]) since sys.argv is a list/array of strings.
 
LVL 3

Author Comment

by:luckynh
ID: 37797112
I will try that code... BUT I am putting the full path in correctly.
 
LVL 9

Expert Comment

by:zaghaghi
ID: 37797390
Please remove the try/except block so that we can see the actual error message!
 
LVL 41

Expert Comment

by:HonorGod
ID: 37797438
Or, better yet, in the except clause, immediately after the "except", add the following statements:

  Type, value = sys.exc_info()[ : 2 ]
  print 'Exception type :', str( Type )
  print 'Exception value:', str( value )
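
A related idea, as a rough sketch only: show the usage hint just for missing arguments, and print the full traceback for anything else so the real error is not hidden. The names `main`, `worker`, and `USAGE` are made up for this example and are not part of the original script.

```python
import sys
import traceback

USAGE = "usage: python emailparser.py /path/to/directory /path/to/master_list_emails.txt"


def main(argv, worker):
    """Run worker(directory, master) and return a process exit code."""
    try:
        directory, master = argv[1], argv[2]
    except IndexError:
        # Only a missing argument should trigger the usage message.
        print(USAGE)
        return 2
    try:
        worker(directory, master)
    except Exception:
        # Writes the exception type, message, and failing line to stderr.
        traceback.print_exc()
        return 1
    return 0
```

With the script from the question, this could be called as `sys.exit(main(sys.argv, emailparser))`.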

 
LVL 3

Author Comment

by:luckynh
ID: 37798112
now i have this:
AttributeError: 'str' object has no attribute 'format'
 
LVL 41

Expert Comment

by:HonorGod
ID: 37798288
what?!?

str() is a built in function that converts the parameter to a string.

What, exactly, do you have in the except clause?
 
LVL 9

Expert Comment

by:zaghaghi
ID: 37799227
Hi,

If you want to use format() you must run your script with Python 2.6, 2.7, or 3.x. I think you have Python 2.5.
If you don't want to change your Python version, use the % operator instead of format().

http://docs.python.org/release/3.0.1/whatsnew/2.6.html#pep-3101-advanced-string-formatting

change this line:
newmasterfilename=os.path.join(os.path.dirname(master),'master_list_emails.{0}.txt'.format(today))

to
newmasterfilename=os.path.join(os.path.dirname(master),'master_list_emails.%s.txt' % (today))
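
As a small self-contained sketch of the same change: the helper name `dated_master_path` and its `today` parameter are made up for illustration. It writes the value as a one-element tuple, `(today,)`, which is the safer spelling of the `%` form when the value might itself be a tuple.

```python
import datetime
import os


def dated_master_path(master, today=None):
    """Build '<directory of master>/master_list_emails.<YYYY-MM-DD>.txt'."""
    if today is None:
        today = datetime.date.today()
    # %s calls str() on the date, which gives the ISO form YYYY-MM-DD.
    filename = "master_list_emails.%s.txt" % (today,)
    return os.path.join(os.path.dirname(master), filename)
```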
 
LVL 3

Author Comment

by:luckynh
ID: 37800703
Thanks @zaghaghi, I changed the line and now get the following:
<code>

  File "bounceparser.py", line 30
    newmasterfile=open(newmasterfilename,'w')
    ^
SyntaxError: invalid syntax
[root@lamp bounced]#

</code>
Line 30 =
<code>
      newmasterfile=open(newmasterfilename,'w')
</code>

py version
<code>
Python 2.4!
</code>

whole script:
<code>
#!/usr/bin/python

#Import the necessary modules for use later:
import sys, os, fileinput, glob, datetime

def emailparser(directory, master):
      #Create a list of the locations of all of the bounceback e-mails:
      dirlist=glob.glob(os.path.join(str(directory),'*.txt'))

      #Create a list of the target e-mail addresses in each bounceback
      #from the line in the bounceback that begins "To:":
      bouncers=[email[3:].strip() for email in fileinput.input(dirlist) if email[0:3]=="To:"]

      #Open the master list of e-mails and strip each e-mail
      #of newline characters:
      masterlist=[email.strip() for email in open(str(master)).readlines()]

      #Create a new master list of e-mails by removing those that are in
      #the bouncer list and adding newline characters for later file writing:
      newmasterlist=[email+"\r\n" for email in masterlist if email not in bouncers]

      #Get today's date:
      today=str(datetime.date.today())
      
      #Create a new master e-mail file name with today's date appended:
      #newmasterfilename=os.path.join(os.path.dirname(master),'master_list_emails.{0}.txt'.format(today))
newmasterfilename=os.path.join(os.path.dirname(master),'master_list_emails.%s.txt' % (today))
      
      #Open the new master file for writing:
      newmasterfile=open(newmasterfilename,'w')
      
      #Write the new master list to the file:
      newmasterfile.writelines(newmasterlist)
      
      #Close the new master file:
      newmasterfile.close()

      #Some dialogue to confirm successful execution:
      print "Old e-mail list size: ", len(masterlist)
      print "New e-mail list size: ", len(newmasterlist)
      print "E-mails pruned: ", len(masterlist)-len(newmasterlist)
      print "Location of new master list:", newmasterfilename
      
if __name__=="__main__":
      import sys, os
      emailparser(directory=str(sys.argv[1]), master=str(sys.argv[2]))
</code>
 
LVL 9

Expert Comment

by:zaghaghi
ID: 37800783
Hi,
Correct the indentation of this line so that it matches the other statements in the function body:
newmasterfilename=os.path.join(os.path.dirname(master),'master_list_emails.%s.txt' % (today))
 
LVL 41

Expert Comment

by:HonorGod
ID: 37801125
zaghaghi's comment means that this error:

SyntaxError: invalid syntax

is caused here by inconsistent indentation.
Remember that Python uses indentation to indicate whether statements are nested or not.

For example:

if myString.startswith( 'something' ) :
  print "something found."



The print statement is indented with 2 spaces to indicate that it is the "then" clause of the "if" statement.  All subsequent statements within this same clause must be indented by the same amount of whitespace.

Special note: Some text editors will, by default, use tab characters and spaces interchangeably. You should be careful about this.  I would strongly encourage you to replace any and all tab characters with the appropriate number of spaces.
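
To check a script for this, something along these lines might help. It is only a sketch: the function names are made up, and the tab width of 4 is an assumption that should match whatever the editor used. Python 2 also has a `-tt` command-line option that turns inconsistent tab usage into an error.

```python
def lines_with_tab_indent(source):
    """Return the 1-based numbers of lines whose indentation contains a tab."""
    flagged = []
    for number, line in enumerate(source.splitlines(), 1):
        # Leading whitespace is whatever lstrip() would remove.
        indent = line[:len(line) - len(line.lstrip())]
        if "\t" in indent:
            flagged.append(number)
    return flagged


def tabs_to_spaces(source, width=4):
    """Expand every tab to spaces; note this also touches tabs inside strings."""
    return source.expandtabs(width)
```

Running `lines_with_tab_indent(open("bounceparser.py").read())` would list the lines worth looking at before converting anything.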
 
LVL 3

Accepted Solution

by:
luckynh earned 0 total points
ID: 38009950
None of these solutions worked, and I gave up trying to make it work.
 
LVL 3

Author Closing Comment

by:luckynh
ID: 38021362
because i could not get anything to work